### Codio Activity 2.2: Gaussian Distributions

**Expected Time**: 45 Minutes

**Total Points**: 10

This activity focuses on generating and examining Gaussian distributions using `scipy.stats`. The main idea is to use the distribution to generate a sample and compare the sample statistics to the known parameters of the distribution. Note that you are to generate samples with the `.rvs` method of the distribution object rather than by calling NumPy functions directly.

#### Index:

- [Problem 1](#Problem-1:-Creating-a-Gaussian-Distribution-Object)
- [Problem 2](#Problem-2:-Random-Samples-from-Distribution)
- [Problem 3](#Problem-3:-Statistics-of-Sample)
- [Problem 4](#Problem-4:-Plotting-the-Distribution-and-Sample)


In [1]:
from scipy.stats import uniform
from scipy.stats import norm
import matplotlib.pyplot as plt
import numpy as np

[Back to top](#Index:) 

### Problem 1: Creating a Gaussian Distribution Object

**2 Points**

Note that the normal distribution function has been imported as `norm` from the `scipy.stats` library above.  Use this distribution to create a normal distribution centered at 5 with standard deviation 2.  Assign your solution as a distribution object to `gauss1` below.
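
As a hedged illustration of what a "distribution object" is, the sketch below freezes a standard normal, which is deliberately not the distribution requested in this problem. The name `std_normal` is hypothetical. Calling `norm` with `loc` and `scale` returns an object that remembers those parameters, so later method calls do not need them repeated.

```python
from scipy.stats import norm

# Hypothetical example: a standard normal, not the distribution asked for above.
std_normal = norm(loc=0, scale=1)

# The frozen object stores its parameters and exposes them through methods.
center = std_normal.mean()   # theoretical mean
spread = std_normal.std()    # theoretical standard deviation
peak = std_normal.pdf(0)     # density at the mean, equal to 1 / sqrt(2 * pi)

print(center, spread, peak)
```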
    

    

In [2]:
### GRADED
# Create a Gaussian distribution with mean 5 and standard deviation 2

gauss1 = None

### BEGIN SOLUTION
gauss1 = norm(loc = 5, scale = 2)
### END SOLUTION

# ANSWER CHECK
print(type(gauss1))

<class 'scipy.stats._distn_infrastructure.rv_frozen'>


In [3]:
### BEGIN HIDDEN TESTS 
gauss1_ = norm(loc = 5, scale = 2)
#
#
#
assert type(gauss1) == type(gauss1_), 'Make sure this is a distribution object'
### END HIDDEN TESTS

[Back to top](#Index:) 

### Problem 2: Random Samples from Distribution

**3 Points**

Use the `.rvs` method of `gauss1` to generate 100 random samples from the distribution. Be sure to pass `random_state = 12` to the `.rvs` method so that the samples are reproducible. Assign your response as an array to `samples` below.
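
To show why `random_state` matters, here is a minimal sketch using a hypothetical standard normal (`demo_dist`) and arbitrary seeds, neither of which is the one used for grading. Passing the same integer seed reproduces the same draws, while a different seed gives a different sample.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical standard normal, used only to illustrate `random_state`.
demo_dist = norm(loc=0, scale=1)

# The same integer seed reproduces the same draws...
draws_a = demo_dist.rvs(5, random_state=0)
draws_b = demo_dist.rvs(5, random_state=0)

# ...while a different seed produces a different sample.
draws_c = demo_dist.rvs(5, random_state=1)

print(np.array_equal(draws_a, draws_b))
print(np.array_equal(draws_a, draws_c))
```

This is what allows a grader to compare a submitted array against a reference array element by element.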

In [4]:
### GRADED

samples = ''

### BEGIN SOLUTION
samples = gauss1.rvs(100, random_state = 12)
### END SOLUTION

# ANSWER CHECK
print(type(samples))
print(len(samples))

<class 'numpy.ndarray'>
100


In [5]:
### BEGIN HIDDEN TESTS 
samples_ = gauss1_.rvs(100, random_state = 12)
#
#
#
assert type(samples) == type(samples_), 'Make sure your solution is an array'
np.testing.assert_array_equal(samples, samples_)
### END HIDDEN TESTS

[Back to top](#Index:) 

### Problem 3: Statistics of Sample

**2 Points**

Using your `samples` array from above, determine the mean and standard deviation of the sample values. Use `np.mean` and `np.std` to compute these, and assign your solutions as type `numpy.float64` to `sample_mean` and `sample_std` below. Compare these to the distribution's actual mean (5) and standard deviation (2). Are they close?
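
The comparison can be sketched as follows, assuming a hypothetical distribution (`demo_dist`, centered at 10 with standard deviation 3) and an arbitrary seed rather than the graded ones. Note that `np.std` divides by `n` by default (`ddof=0`); passing `ddof=1` divides by `n - 1` instead, which gives a slightly larger value.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical distribution and seed, chosen only for illustration.
demo_dist = norm(loc=10, scale=3)
demo_samples = demo_dist.rvs(1000, random_state=0)

# Sample statistics; np.std defaults to ddof=0 (divide by n).
demo_mean = np.mean(demo_samples)
demo_std = np.std(demo_samples)
demo_std_ddof1 = np.std(demo_samples, ddof=1)  # divide by n - 1 instead

# Theoretical values are available from the frozen distribution itself.
print(demo_mean, demo_dist.mean())
print(demo_std, demo_std_ddof1, demo_dist.std())
```

With a finite sample the statistics will not match the parameters exactly, but they should land close to them, and typically closer as the sample size grows.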

In [6]:
### GRADED

sample_mean = ''
sample_std = ''

### BEGIN SOLUTION
sample_mean = np.mean(samples)
sample_std = np.std(samples)
### END SOLUTION

# ANSWER CHECK
print(sample_mean)
print(type(sample_mean))

4.711385004722623
<class 'numpy.float64'>


In [7]:
### BEGIN HIDDEN TESTS 
sample_mean_ = np.mean(samples_)
sample_std_ = np.std(samples_)
#
#
#
assert sample_mean == sample_mean_
assert sample_std == sample_std_
### END HIDDEN TESTS

[Back to top](#Index:) 

### Problem 4: Plotting the Distribution and Sample

**3 Points**

The distribution created is centered at 5 with a standard deviation of 2. By properties of the normal distribution, about 99.7% of the data lies within $\pm 3\sigma$ of the mean. Accordingly, create an array `x` using `np.linspace` with 1000 evenly spaced values spanning $\mu \pm 3\sigma$ (from three standard deviations below the mean to three above). Use this array to plot the `.pdf` of the `gauss1` distribution. On the same plot, add a histogram of the sample data.

Once `x` has been defined, the code below will produce the accompanying plot.

```python
plt.plot(x, gauss1.pdf(x), color = 'black', linewidth = 4, label = 'distribution')
plt.hist(samples, density=True, alpha = 0.2, bins = 10, edgecolor = 'black', label = 'sample')
plt.legend();
```

![](images/distandsamples.png)
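
The share of a normal distribution that falls within three standard deviations of the mean can be checked numerically with the `.cdf` method. The sketch below assumes a hypothetical standard normal (`mu = 0`, `sigma = 1`) rather than `gauss1`, and also builds an evenly spaced grid over the same interval.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical parameters, not those of `gauss1`.
mu, sigma = 0, 1
demo_dist = norm(loc=mu, scale=sigma)

# Probability mass between mu - 3*sigma and mu + 3*sigma.
mass_within_3_sigma = demo_dist.cdf(mu + 3 * sigma) - demo_dist.cdf(mu - 3 * sigma)

# Evenly spaced grid covering that interval, endpoints included.
grid = np.linspace(mu - 3 * sigma, mu + 3 * sigma, 1000)

print(round(mass_within_3_sigma, 4))  # 0.9973
print(grid[0], grid[-1], len(grid))
```

The same fraction holds for any normal distribution, since it depends only on the number of standard deviations from the mean.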

In [8]:
### GRADED

x = ''


### BEGIN SOLUTION
x = np.linspace(-1, 11, 1000)
### END SOLUTION

### Uncomment the code below to make the plot
# plt.plot(x, gauss1.pdf(x), color = 'black', linewidth = 4, label = 'distribution')
# plt.hist(samples, density=True, alpha = 0.2, bins = 10, edgecolor = 'black', label = 'sample')
# plt.legend();

# ANSWER CHECK
print(type(x))
print(len(x))

<class 'numpy.ndarray'>
1000


In [9]:
### BEGIN HIDDEN TESTS 
x_ = np.linspace(-1, 11, 1000)
#
#
#
assert len(x) == len(x_)
np.testing.assert_allclose(x, x_, err_msg='Array should have 1000 points from -1 to 11')
### END HIDDEN TESTS