# DS106 Modeling : Lesson Five Companion Notebook

### Table of Contents <a class="anchor" id="DS106L5_toc"></a>

* [Table of Contents](#DS106L5_toc)
    * [Page 1 - Introduction](#DS106L5_page_1)
    * [Page 2 - Can You Pick a Number Randomly?](#DS106L5_page_2)
    * [Page 3 - Generating Random Numbers - Uniform Distribution](#DS106L5_page_3)
    * [Page 4 - Generating Random Numbers - Normal Distribution](#DS106L5_page_4)
    * [Page 5 - Generating Discrete Distributions](#DS106L5_page_5)
    * [Page 6 - Simulation](#DS106L5_page_6)
    * [Page 7 - Key Terms](#DS106L5_page_7)
    * [Page 8 - Lesson 5 Hands-On](#DS106L5_page_8)
    

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 1 - Overview of this Module<a class="anchor" id="DS106L5_page_1"></a>

[Back to Top](#DS106L5_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Introduction

In many domains, a model can be created to explain various natural phenomena. Once the model is created, there is usually a desire to determine if the model is any good or not. Sometimes this can be a tedious process, especially if data are collected very slowly. However, some models lend themselves to being tested by simulating what will happen using random numbers. In these situations, testing a model becomes a statistical exercise instead of a painstaking waiting game.

In order to run a good simulation, you need to generate data that fit the expected distribution. In this lesson, you will explore different distributions and the process of simulating them.

By the end of this lesson, you should be able to:

* Recognize that humans are incapable of random number generation
* Generate random numbers in spreadsheet applications, R, and Python
* Generate normal distributions in spreadsheet applications, R, and Python
* Generate discrete distributions in spreadsheet applications
* Simulate data using spreadsheet applications

This lesson will culminate in a hands-on in which you will simulate profit for a retail company.

<div class="panel panel-success">
    <div class="panel-heading">
        <h3 class="panel-title">Additional Info!</h3>
    </div>
    <div class="panel-body">
        <p>You may want to watch this <a href="https://vimeo.com/465628018"> recorded live workshop </a> that goes over the concepts in this lesson.</p>
    </div>
</div>

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 2 - Can You Pick a Number Randomly?<a class="anchor" id="DS106L5_page_2"></a>

[Back to Top](#DS106L5_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Can You Pick a Number Randomly?

Throughout this lesson, you will use functions to generate random numbers. The science of generating random numbers is actually a lot more complicated than you might suspect, and humans are particularly bad at thinking of random numbers off the top of their heads. For the kind of randomness discussed on this page, the key is that every possible output has an equal probability of occurring. If you were to select 10,000 of your closest friends and ask each of them to come up with a random number between 1 and 10, and then you made a histogram of their responses, you might be surprised at the results.

If humans could truly generate random numbers off the top of their head, you would expect the histogram of 10,000 random numbers from 1 to 10 to look something like this:

![A histogram of ten thousand random numbers from one to ten. The height of vertical bars of the histogram are all very close to each other.](Media/L05-01.png)
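As a rough illustration, the sketch below lets Python's standard `random` module play the role of an ideal random picker: it draws 10,000 integers from 1 to 10 and tallies them. The names `draws` and `counts` are illustrative choices, and the exact tallies will differ on every run.

```python
import random
from collections import Counter

# Draw 10,000 integers from 1 to 10 (both ends included).
draws = [random.randint(1, 10) for _ in range(10_000)]

# Tally how often each number came up.
counts = Counter(draws)

for number in range(1, 11):
    print(number, counts[number])
```

Each number is expected to come up about 1,000 times, with some random variation from bar to bar.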

There have been many studies done to observe what happens when humans try to come up with random numbers off the top of their head. The results of one such study can be found in this bar graph:

![A bar graph with an x axis ranging from one to twenty. The y axis ranges from zero percent to twenty percent. A key shows that dark blue bars represent human. Light blue bars represent computer. A horizontal dotted line crosses the graph at five percent. The graph shows that when generating random numbers, the number seventeen gets picked a lot more often by humans than by computers.](Media/L05-02.png)

In this example, numbers from 1 to 20 were generated both by a computer (light blue bars), and by humans (dark blue bars). The graph indicates that when generating "random" numbers, the number 17 gets picked a lot more often than if the numbers were randomly selected - about 3x more likely! That's a pretty good indication that humans don't pick random numbers! Most studies indicate that humans are much more likely to choose numbers in the middle of the possible values than on the ends of the possible values. Either way, it is a mistake to depend on humans to create random variables, which is why you'll explore computer methods in this lesson!

---

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 3 - Generating Random Numbers - Uniform Distribution<a class="anchor" id="DS106L5_page_3"></a>

[Back to Top](#DS106L5_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Generating Random Numbers - Uniform Distribution

As a quick reminder, a variable has a uniform distribution when any value between the minimum and maximum value is equally likely. Commonly, these numbers are distributed between zero and one.

There are many ways to generate random numbers using software. All four of the software tools you have been using in this program are capable of creating random numbers that are uniformly distributed.

---

## Spreadsheet Programs 

Creating a random number in a spreadsheet program, whether MS Excel, Google Sheets, or LibreOffice, just requires a single function inside a cell. 

---

### Generating Numbers Between 0 and 1

If you put the command:

`=RAND()`

in a cell, a random number (uniformly distributed) between 0 and 1 will appear after you hit return.

---

### Generating User-Specified Ranges

What if you wanted to generate a number between 8 and 22? Or any other set of numbers that are not 0 and 1? This requires a bit of modification to the command, in two steps. First of all, you need to calculate the _range_ of the desired output. In this case, the range is 22 - 8 = 14. Next, you need to find the min value, which is 8 in this case.

Now, you will make a slight modification to the function to achieve the desired outcome:

`=RAND()*14 + 8`


You will multiply the ```RAND()``` function by the range of your output and then specify the minimum value with ```+ 8```. 
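The same scale-and-shift arithmetic can be sketched in Python, assuming Python's `random.random()` as a stand-in for `RAND()`. The variable names are illustrative only.

```python
import random

low, high = 8, 22
value_range = high - low  # 22 - 8 = 14

# Stretch a 0-to-1 random number across the range,
# then shift it up so it starts at the minimum.
value = random.random() * value_range + low
print(value)
```

Because the 0-to-1 number is multiplied by 14 and then shifted up by 8, the result always lands between 8 and 22.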

---

## R

In R, the ```runif()``` function will generate a random number that is uniformly distributed. The function and arguments are as follows:

```{r}
runif(n, min, max)
```

where ```n``` is the quantity of numbers to be generated, ```min``` is the minimum value, and ```max``` is the maximum value. The ```min``` and ```max``` arguments default to 0 and 1, so if you just want a uniform random variable between 0 and 1 you can omit the other two arguments.

---

### Generating Numbers Between 0 and 1

For example, the function below will create 15 randomly generated numbers between 0 and 1:

```{r}
runif(15)
```

It might look something like this (but remember, these are randomly generated, so yours won't be the same numbers!)

```text
 [1] 0.1169464 0.6411236 0.6279605 0.9703532 0.7562719 0.7834032 0.9531575 0.6610444 0.9100471 0.3565478
[11] 0.2172311 0.4392505 0.2889421 0.7529348 0.8727064
```

---

### Generating User-Specified Values

This function will create 8 randomly generated numbers between 12 and 20:

```{r}
runif(8, 12, 20)
```

It might look something like this:

```text
[1] 13.18007 12.68679 17.86285 14.93277 18.02345 12.30586 12.49944 16.10525
```

---

## Python

There are several different ways to generate random numbers using Python. For this example, you will use the ```random``` package, so import it: 

```python
import random
```

---

### Generating Numbers Between 0 and 1

The function to generate a single value from a uniform random variable between 0 and 1 is ```random.random()```. You can save the number to a variable if you like (this one is called ```value```) and then use the ```print()``` function on it:

```python
value = random.random()
print(value)
```

Running the above in Python will result in a single number between 0 and 1, with about 15 digits past the decimal.

---

### Generating User-Specified Values

In order to get a random value between two other points - say 30 to 50, for example - you need to change the code just a bit. Now, instead of using the ```random``` function within the ```random``` package, you will use the ```random.uniform()``` function, as follows:

```python
value = random.uniform(30, 50)
print(value)
```

All you need to do is specify the starting point (30) and the ending point (50). 

Note that the examples above all generated positive numbers. The method is exactly the same for a range that is entirely negative, say -30 to -12, or for a range that spans both negative and positive values, say -5 to 5: you simply supply the new minimum and maximum.
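A quick Python sketch of that point, reusing ```random.uniform()``` with the two example ranges (the variable names are illustrative):

```python
import random

# A range that is entirely negative: -30 to -12.
negative_value = random.uniform(-30, -12)

# A range that straddles zero: -5 to 5.
mixed_value = random.uniform(-5, 5)

print(negative_value)
print(mixed_value)
```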

---

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 4 - Generating Random Numbers - Normal Distribution<a class="anchor" id="DS106L5_page_4"></a>

[Back to Top](#DS106L5_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Generating Random Numbers - Normal Distribution

Now take a look at how to generate random numbers that are normally distributed. For a visual, you have been generating random numbers whose distribution looks like this:

![A chart titled uniform distribution from twenty to eighty. The y axis runs from zero to forty five. Vertical bars run across the chart. The lowest is twenty five, and the highest is forty one.](Media/L05-06.png)

and now want to generate random numbers whose distribution looks like this:

![A chart titled normal distribution with mean equals fifty and standard deviation equals ten. The y axis runs from zero to ninety in increments of ten. Vertical bars run left to right across the chart, forming a typical bell curve look of a normal distribution.](Media/L05-07.png)

<div class="panel panel-info">
    <div class="panel-heading">
        <h3 class="panel-title">Tip!</h3>
    </div>
    <div class="panel-body">
        <p>Did you notice that neither of these distributions is perfectly smooth, or perfectly symmetric? If they were, you should be suspicious that the numbers had not been randomly generated. The fact that these numbers are random is the reason they don't have perfect symmetry or perfect smoothness.</p>
    </div>
</div>

---

## Spreadsheet Programs

Both Google Sheets and MS Excel can turn a uniformly distributed random number into a normally distributed one with the ```NORMINV()``` function. The ```NORMINV()``` function has three arguments:

* Probability
* Mean
* Standard deviation

So the function below takes a cumulative probability of .3 and returns the matching value from a normal distribution with a mean of 25 and a standard deviation of 8, that is, the value that 30% of the distribution falls below.

`=NORMINV(0.3, 25, 8)`

But what if you didn't want to include a specific probability, and instead wanted to make everything random? Then you can include the ```rand()``` function within your ```norminv()``` function! So, taking the above command and modifying it slightly, you have:

`=NORMINV(rand(), 25, 8)`
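For readers who want to see the same idea outside a spreadsheet, here is a minimal Python sketch. It assumes that `NormalDist.inv_cdf()` from Python's standard `statistics` module (Python 3.8 or later) plays the role of `NORMINV()`; the variable names are illustrative.

```python
import random
from statistics import NormalDist

dist = NormalDist(mu=25, sigma=8)

# Counterpart of =NORMINV(0.3, 25, 8):
# the value that 30% of the distribution falls below.
fixed = dist.inv_cdf(0.3)
print(fixed)

# Counterpart of =NORMINV(RAND(), 25, 8): feed in a uniform random probability.
# inv_cdf() needs a probability strictly between 0 and 1,
# so redraw in the rare case of an exact 0.0.
p = random.random()
while p == 0.0:
    p = random.random()
random_value = dist.inv_cdf(p)
print(random_value)
```

Feeding a uniform random probability into the inverse of the normal curve is what turns `RAND()` output into normally distributed values.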

---

## R

In R, there are several functions that will generate random normally distributed variables. One simple function is ```rnorm()```, which has the arguments of sample size, mean, and standard deviation.  Generically, it would look like this:

```{r}
rnorm(n, mean, sd)
```

Taking it for a spin, the function below will generate 20 normally distributed values from a distribution with a mean of 35 and a standard deviation of 7:

```{r}
x = rnorm(20, 35, 7)
print(x)
```

And you might receive something like this: 

```text
[1] 36.49904 18.71189 33.19152 28.49317 39.98666 32.00083 47.50260 30.27756 36.29465 42.88910 32.84797
[12] 43.53574 43.03957 30.84006 36.63394 28.59698 42.38224 38.43967 39.42955 45.83769
```

---

## Python

There are also several ways to do this in Python. 

---

### Generating the Standard Normal Distribution

As shown here, you can generate 5 random numbers according to the standard normal distribution (mean of zero, standard deviation of 1) with the package ```numpy```. So start by importing it: 

```python
import numpy as np
```

Then you can create a vector of random numbers using the function ```np.random.normal()``` with the argument ```size=``` to specify how many random numbers you'd like:

```python
x = np.random.normal(size = 5)
```

and then print it: 

```python
print(x)
```

Your results should look something like this:

```text
[ 2.22430932 -0.81752269 -1.96034014 -0.44019808  2.08944964]
```

---

### Generating a Normal Distribution Centered Elsewhere

In order to have the normal distribution centered somewhere other than 0, with a standard deviation other than 1, you need to do a little manipulation. Suppose you wanted a normal distribution centered at 50, with a standard deviation of 10? You could simply multiply the function by the standard deviation (10), and then add the mean (50):

```python
x = (np.random.normal(size=5))*10 + 50
print(x)
```

The code above should result in something like this:

```text
[58.60412529 41.24657778 66.94803538 62.92204703 40.13522851]
```
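As an alternative to scaling by hand, ```np.random.normal()``` also accepts the mean and standard deviation directly through its ```loc``` and ```scale``` arguments. The sketch below fixes a seed (an arbitrary value, used only so the two approaches can be compared) and checks that both give the same numbers.

```python
import numpy as np

# Approach 1: scale and shift standard normal draws by hand.
np.random.seed(42)
manual = np.random.normal(size=5) * 10 + 50

# Approach 2: pass the mean (loc) and standard deviation (scale) directly.
np.random.seed(42)
direct = np.random.normal(loc=50, scale=10, size=5)

print(manual)
print(direct)
print(np.allclose(manual, direct))
```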

---

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 5 - Generating Discrete Distributions<a class="anchor" id="DS106L5_page_5"></a>

[Back to Top](#DS106L5_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Generating Discrete Distributions

So far, you have learned about simulating continuous distributions, where a random number can take on any value. Another common type of randomly generated variable is the discrete type. For instance, you might want to simulate a coin toss.  There are only two options there - heads or tails.  You can't get anything else, like feet or stomachs. Other examples of discrete distributions are the roll of a 6-sided die, or the drawing of a card from a standard deck of playing cards. In each of these cases, it is common to simply take the uniform distribution and break it into the number of possible outcomes, and then map the outcome for an individual trial to the discrete variable. 

Suppose you want to do some sort of project where you need to flip a coin. If you only need to do it once to make a decision, that is no big deal - you just grab a coin from your pocket or purse, or from that one dish you have in the kitchen and flip it, then record the results. Well, what if the project calls for 5 coin flips? Still no big deal, right? Just flip the coin 5 times and record the results. But suppose you need to flip the coin 10,000 times! Even if you were inclined to do that by hand, it would take a long time. Even if you could get it down to 5 flips per minute (including the time waiting for the coin to stop bouncing or spinning, and then recording the result), that is 2000 minutes worth of work, which is 33 hours and 20 minutes, and that doesn't account for bathroom breaks, sleeping, or eating.

![A hand using its thumb to flip a coin.](Media/L05-14.png)

There is a much easier way. In a matter of just a couple of minutes, you can generate all those coin flips! 

---

## Generating Coin Flips

Open a new spreadsheet, and in the first cell, A1, create a random number using the command:

`=rand()`

Now, move to cell B1, and create a function that interprets the random number as an "H" or a "T" for 'heads' or 'tails.'  You can use the ```if()``` function for this: if the value in your first cell is less than 1/2, or .5, it will generate an "H." Otherwise, it will generate a "T". 

`=IF(A1 < (1/2), "H", "T")`

Something like this should result: 

![A portion of a spreadsheet. Cell A one shows zero point seven one nine eight eight two nine nine three. Cell B one shows T.](Media/L05-15.png)

The cell B1 will be populated based on your formula, which is based on the value that is generated in A1. 

Now select both cells, and note the autofill block at the bottom right of the selected cells:

![A portion of a spreadsheet. The function field reads equals sign rand open parentheses close parentheses. Cell A one shows zero point seven one nine eight eight two nine nine three. Cell B one shows T. Both cells are selected and highlighted. At the bottom right of the selected area is a dark blue square. A red arrow points to the square and text reads, this is the auto fill block.](Media/L05-16.png)

If you take the cursor and hover over the autofill block, it will change from an arrow to a cross. Clicking on the autofill block and then dragging down will copy those cells down, no matter how far you drag it, whether it is 50 rows, 3000 rows, or 20,000 rows. For now, create 100 coin flips by clicking and dragging.

![A portion of a spreadsheet with data showing the probability of heads or tails when flipping a coin. The function field reads equals sign rand open parentheses close parentheses. Each row in column A contains a random number between zero and one. Each row in column B contains either H for heads or T for tails.](Media/L05-17.png)

Now, do a sanity check: note that whenever column A is less than 1/2, column B says "H." Whenever column A is greater than 1/2, column B says "T." 
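The same mapping can be sketched in Python. This is an illustrative equivalent of the two spreadsheet columns, not part of the spreadsheet workflow; `flip` is a hypothetical helper name.

```python
import random

def flip(u):
    """Map a uniform random number between 0 and 1 to "H" or "T"."""
    return "H" if u < 1 / 2 else "T"

# Column A: 100 uniform random numbers. Column B: the matching flips.
uniforms = [random.random() for _ in range(100)]
flips = [flip(u) for u in uniforms]

print(flips[:10])
print("Heads:", flips.count("H"), "Tails:", flips.count("T"))
```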

---

## Rolling a Six-Sided Die

Now you can simulate rolling a die. This is the same as simulating a coin flip, except there are six possible outcomes instead of two.

Setting up a spreadsheet to simulate the roll of a die starts the same way as above. You need to set up a column of random numbers using the ```rand()``` function:

`=rand()`

Just to keep the table simple, try simulating 15 rolls of a die, so drag the formula down 15 rows:

![Fifteen rows of column A in a spreadsheet. Each cell contains a number between zero and one, to many decimal places.](Media/L05-19.png)

The next step, creating the formula in Column B, is a bit more complex. This time, if the random number is less than 1/6, you want to report a "1" in column B. If the random number is greater than 1/6 but less than 2/6, you want to report a "2" in column B, and so forth.

---

### Nesting If() Functions

This will require a nested approach to the ```if()``` function. The ```if()``` function has three arguments. The first is the conditional statement, the second is what to do if the conditional statement is true, and the third is what to do if the conditional statement is false.

If you want to nest two ```if()``` functions, you would put the second ```if()``` function in either the second or third argument. Consider this nested function:

`=IF(A1 < (1/6), 1, IF(A1 < (2/6), 2, "other"))`

Notice how the second ```if()``` function is contained within the third argument of the first ```if()``` function. The way the command is executed is as follows:

* Evaluate the first ```if()``` function
* If the first ```if()``` function is true, then print a '1' in that cell, and DO NOT EVALUATE ANY OTHER ```IF()``` FUNCTIONS (This is important)
* If the first ```if()``` function is false, this triggers the second ```if()``` function
* Evaluate the second ```if()``` function
* If the second ```if()``` function is true, then print a '2' in that cell
* If the second ```if()``` function is false, then print 'other' in that cell

Note that a nested ```if()``` function is similar to an 'else if' scenario in many programming languages.

Okay, now expand the 'if' statement to cover all 6 cases. This will require a lengthy nesting of 'if' statements:

`=IF(A1 < (1/6), 1, IF(A1 < (2/6), 2, IF(A1 < (3/6), 3, IF(A1 < (4/6), 4, IF(A1 < (5/6), 5, 6)))))`

This is what it might look like:

![Fifteen rows of column A and column B in a spreadsheet. Each row for column A contains a number between zero and one, to many decimal places. Each row for column B contains a whole number from one to six.](Media/L05-20.png)
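The nested ```IF()``` formula maps directly onto an `if`/`elif` chain. Here is a minimal Python sketch of the same six-way split; `roll_from_uniform` is a hypothetical helper name, and Python stands in for the spreadsheet.

```python
import random

def roll_from_uniform(u):
    """Map a uniform random number between 0 and 1 to a die face from 1 to 6."""
    if u < 1 / 6:
        return 1
    elif u < 2 / 6:
        return 2
    elif u < 3 / 6:
        return 3
    elif u < 4 / 6:
        return 4
    elif u < 5 / 6:
        return 5
    else:
        return 6

# Simulate 15 rolls, one per uniform random number.
rolls = [roll_from_uniform(random.random()) for _ in range(15)]
print(rolls)
```

As with the nested formula, the first condition that is true decides the result, and the later conditions are never checked.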

---

### Generating Random Values and Modifying Them

This is not the only way to create discrete random variables. Another way to simulate rolling a die would be to take the random variable, multiply it by six, add one, and then chop off the number at the decimal point. This could be accomplished using the following function:

`=INT(rand()*6 + 1)`

Obviously, this is a much more compact approach. The nested ```if()``` functions did the trick, but could become very cumbersome when there are several different options for the output.
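The multiply, add, and truncate approach looks like this as a Python sketch, with `int()` playing the role of `INT()`. The function name `roll` is illustrative.

```python
import random

def roll():
    """Scale a 0-to-1 random number to the range 1 to 7, then drop the decimal part."""
    return int(random.random() * 6 + 1)

rolls = [roll() for _ in range(15)]
print(rolls)
```

Because ```random.random()``` never returns exactly 1, the scaled value stays below 7 and the truncated result is always a whole number from 1 to 6. Python's standard library also offers ```random.randint(1, 6)```, which returns such a whole number directly.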

---

## Other Discrete Distributions

Sometimes the discrete distributions are unbalanced. For example, suppose you had a box containing nine balls - four red, three blue, and two yellow. If you wanted to simulate drawing a single ball from the box, and you wanted to use the nested ```if``` statements approach, your function would probably look something like this:

`=IF(A1 < (4/9), "red", IF(A1 < (7/9), "blue", "yellow"))`
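For comparison, a Python sketch of the same unbalanced draw is shown below. The cut-offs 4/9 and 7/9 are the running totals of the ball counts (4 red, then 4 + 3 = 7 red or blue). `draw_ball` is a hypothetical helper, and ```random.choices()``` is a standard-library shortcut that accepts the weights directly.

```python
import random

def draw_ball(u):
    """Map a uniform random number between 0 and 1 to a ball color."""
    if u < 4 / 9:
        return "red"
    elif u < 7 / 9:
        return "blue"
    else:
        return "yellow"

# Threshold approach, mirroring the nested IF() formula.
draws = [draw_ball(random.random()) for _ in range(9_000)]
print({color: draws.count(color) for color in ("red", "blue", "yellow")})

# Standard-library shortcut: weight each color by its number of balls.
shortcut = random.choices(["red", "blue", "yellow"], weights=[4, 3, 2], k=5)
print(shortcut)
```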

---

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 6 - Simulation<a class="anchor" id="DS106L5_page_6"></a>

[Back to Top](#DS106L5_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Simulation

Now you are at the "so what?" point of this discussion on generating random numbers. What is the benefit of being able to generate random numbers? The answer is "simulation." Simulation is practically free, whereas generating actual data can be costly in terms of both time and resources. It also allows a company to try out a scenario by simulation without the risk attached to changing a business or marketing strategy.

<div class="panel panel-success">
    <div class="panel-heading">
        <h3 class="panel-title">Additional Info!</h3>
    </div>
    <div class="panel-body">
        <p>Check out some of these interesting data simulation scenarios on the <a href="https://towardsdatascience.com/every-data-scientist-needs-to-read-these-simulation-stories-7be0531e782f"> Towards Data Science blog.</a></p>
    </div>
</div>

---

## Shipping Costs

A large manufacturing company wants to estimate the cost of shipping next quarter. Specifically, they want to assess the risk of spending more than $1.48M on shipping next quarter. The company could just wait around and see how much they spend next quarter, or they could take figures that are pretty well known, and go from there. For example, suppose the company knows the following:

* The amount of goods that will be shipped next quarter has a normal distribution with mean 300,000 kg, and a standard deviation of 20,000 kg.
* The total cost of shipping one kg of material ranges from $4.10 per kg to $4.75 per kg, and is distributed uniformly.
* Shipping by ocean is much cheaper than shipping by air. Most of the goods are shipped by ocean, and shipping by air is only done when an unavoidable rush shipment is required. There is a multiplier that accounts for the ocean/air ratio, and it ranges from 1.08 to 1.12, and is uniformly distributed.
* The equation for total shipping cost is this: `Cost = OAM * ((total kg) * ($ per kg))`, where OAM is the Ocean-Air Multiplier.

You can simulate all three variables, and calculate a total cost for each simulation, and then look at the distribution of the simulated costs. You will need two uniform distributions, and one normal distribution. For this simulation, you will use a spreadsheet program. 

---

### Generating Shipping Volume

For column A, which is the shipping volume, you need a normally distributed variable with a mean of 300,000 and a standard deviation of 20,000.

`=NORMINV(RAND(), 300000, 20000)`

---

### Generating Cost per Kilogram

For column B, which is the cost per kg, you need a uniformly distributed variable with a min of $4.10 and a max of $4.75.

`=RAND()*0.65 + 4.1`

---

### Generating the Ocean-Air Multiplier (OAM)

For column C, which is the OAM, you need a uniformly distributed variable with min of 1.08, and max of 1.12.

`=RAND()*0.04 + 1.08`

---

### Generating Total Shipping Cost

Finally, in column D, you have the total simulated shipping cost for the quarter, which is simply columns A, B, and C multiplied together.

`=A2 * B2 * C2`

The first simulation looks something like this:

![A spreadsheet with four columns. A one, volume, A two, two seven two seven eight one point seven eight two seven. B one, cost per kilogram, B two, four dollars forty six cents. C one, O A M, C two, one point one zero five two seven three four five nine. D one, total cost. D two, one million three hundred forty five thousand seven hundred and seventy seven dollars and fifty one cents.](Media/L05-28.png)

---

### Generating Many Rows of Data

The first trial indicates you have budgeted enough for shipping at $1.48M for next quarter. But is one simulation enough? Of course not! You are just simulating, so you can do a whole bunch. You are going to simulate this expense 100 times. This requires use of the autofill block - either that, or a whole lot of typing:

![A spreadsheet with four columns with headings. A one, volume. B one, cost per kilogram. C one, O A M. D one, total cost. The eighteen rows beneath show data pertaining to each heading.](Media/L05-29.png)

A quick scan down column D shows that there are plenty of values greater than $1.48M, as well as plenty that are less.

Now you can make a histogram of the total cost in column D:

![A histogram titled simulated shipping cost for next quarter. The y axis runs from zero to eight hundred in increments of one hundred. Vertical bars are distributed in a bell curve shape. A red vertical line extends much higher than the other vertical bars and represents one point four eight million dollars.](Media/L05-30.png)

The red vertical line represents $1.48M. It looks like you had better tell your manager that, based on the simulation, you will probably spend less than $1.48M (because more of the histogram is to the left of the vertical red line). However, you could end up spending as much as about $1.7M, so you had better budget accordingly to be on the safe side!
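The whole shipping simulation can also be sketched in a few lines of Python. This is an illustrative alternative to the spreadsheet, not the method used above: it assumes ```random.gauss()``` and ```random.uniform()``` from the standard library as the random number sources, and the names `simulate_cost`, `BUDGET`, and `N_SIMULATIONS` (as well as the choice of 10,000 runs) are assumptions made for this example.

```python
import random

BUDGET = 1_480_000       # the $1.48M threshold from the scenario
N_SIMULATIONS = 10_000   # arbitrary choice; the spreadsheet above uses 100 rows

def simulate_cost():
    """Simulate one quarter's total shipping cost."""
    volume = random.gauss(300_000, 20_000)    # kg shipped: normal
    cost_per_kg = random.uniform(4.10, 4.75)  # $ per kg: uniform
    oam = random.uniform(1.08, 1.12)          # Ocean-Air Multiplier: uniform
    return oam * volume * cost_per_kg

costs = [simulate_cost() for _ in range(N_SIMULATIONS)]

mean_cost = sum(costs) / N_SIMULATIONS
share_over = sum(c > BUDGET for c in costs) / N_SIMULATIONS

print(f"Mean simulated cost: ${mean_cost:,.0f}")
print(f"Share of runs over budget: {share_over:.1%}")
```

The share of runs above the budget gives a direct estimate of the risk the company asked about, in place of judging it by eye from the histogram.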

---

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 7 - Key Terms<a class="anchor" id="DS106L5_page_7"></a>

[Back to Top](#DS106L5_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Key Terms

Below is a list and short description of the important keywords learned in this lesson. Please read through and go back and review any concepts you do not fully understand. Great Work!

---

## Key Spreadsheet Code

<table class="table table-striped">
    <tr>
        <th>Keyword</th>
        <th>Description</th>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>=rand()</td>
        <td>Generates a random number between 0 and 1.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>=norminv()</td>
        <td>Returns the value from a normal distribution with a given mean and standard deviation at a given cumulative probability.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>=if()</td>
        <td>To provide conditional logic.</td>
    </tr>
</table>

---

## Key R Code

<table class="table table-striped">
    <tr>
        <th>Keyword</th>
        <th>Description</th>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>runif()</td>
        <td>Generates a set number of random numbers between a minimum and maximum value.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>rnorm()</td>
        <td>Generates a normal distribution of a certain size with a certain mean and standard deviation.</td>
    </tr>
</table>

---

## Key Python Packages

<table class="table table-striped">
    <tr>
        <th>Keyword</th>
        <th>Description</th>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>random</td>
        <td>Used for generating random numbers.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>numpy</td>
        <td>Used for generating random numbers and working with arrays.</td>
    </tr>
</table>



---

## Key Python Code

<table class="table table-striped">
    <tr>
        <th>Keyword</th>
        <th>Description</th>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>random.random()</td>
        <td>Generates a random number between 0 and 1.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>random.uniform()</td>
        <td>Generates a random number between a minimum and maximum value.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>np.random.normal()</td>
        <td>Generates a given number of random values from a normal distribution.</td>
    </tr>
</table>

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 8 - Lesson 5 Hands-On<a class="anchor" id="DS106L5_page_8"></a>

[Back to Top](#DS106L5_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">



This Hands-On **will** be graded, so make sure you complete each part. When you are done, please submit one document with all of your findings for grading.

<div class="panel panel-danger">
    <div class="panel-heading">
        <h3 class="panel-title">Caution!</h3>
    </div>
    <div class="panel-body">
        <p>Do not submit your project until you have completed all requirements, as you will not be able to resubmit.</p>
    </div>
</div>

---

## Simulation Hands-On

A retailer wants to create a simulation to predict the profit on the sales of a certain tool she carries.  She knows the profit is a function of several factors, for which she has historical data: 

* **Units Sold:** Normal distribution, with a mean of 26 units and a standard deviation of 5.7 units.
* **Price:** Discrete distribution. 55% of the time the price is $38, 30% of the time the price is $41.50, and 15% of the time the price is $36.25.
* **Cost:** Uniform distribution, with a max of $33.72 and a min of $26.88.
* **Resource Factor:** Normal distribution, with a mean of 3 and a standard deviation of 1.2.

The function for profit is as follows:

```Profit = (RF * (Units sold) * (Price)) - ((0.2) * (RF) * (Units sold) * (Cost)) + $320```

Create a simulation that has 100 rows of monthly profits. Once you have completed this simulation exercise, prepare a report stating what you did, what you learned, and your results. You have the option to complete your simulation in either R or Excel.  Then submit it for grading.

<div class="panel panel-danger">
    <div class="panel-heading">
        <h3 class="panel-title">Caution!</h3>
    </div>
    <div class="panel-body">
        <p>Be sure to zip and submit your entire directory when finished!</p>
    </div>
</div>