### Day 6 - Central Limit Theorem
________________________________________________

  <br/>

- [Background](#Background)
- [Task 1](#Task)
- [Task 2](#task2)
- [Task 3](#task3)

  <br/>
  
#### Background 
The central limit theorem (CLT) states that, for a sufficiently large sample size $ n $, the distribution of the sample mean approaches a normal distribution. This holds for independent, identically distributed random variables drawn from any distribution with a finite standard deviation.
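The statement can be checked empirically. The following is a minimal simulation sketch, not part of the original challenge: the helper `sample_means`, the choice of an exponential distribution, and the values of `n`, `trials`, and `seed` are all assumptions made for illustration.

```python
import random
import statistics

def sample_means(draw, n, trials, seed=0):
    """Return `trials` sample means, each over `n` independent draws from `draw(rng)`."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(trials)]

# Exponential distribution with rate 1: mean 1, standard deviation 1, strongly skewed.
means = sample_means(lambda rng: rng.expovariate(1.0), n=50, trials=5000)

# The sample means should centre on the population mean (1) ...
print(f"mean of sample means: {statistics.fmean(means):.3f}")
# ... with a spread close to sigma / sqrt(n) = 1 / sqrt(50), roughly 0.141.
print(f"std of sample means:  {statistics.stdev(means):.3f}")

# Fraction of sample means within one standard error of the population mean;
# a normal distribution would put roughly 68% of them there.
within = sum(abs(m - 1.0) <= 1 / 50 ** 0.5 for m in means) / len(means)
print(f"within one standard error: {within:.3f}")
```

Even though individual exponential draws are far from normal, the sample means cluster symmetrically around the population mean, which is the behaviour the theorem describes.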


Let $ x_1, x_2, \ldots, x_n $ be a random sample of size $ n $, that is, a sequence of independent and identically distributed random variables drawn from a distribution with expected value $ \mu $ and finite variance $ \sigma^2 $. The sample sum is:

$$
S_n = \sum_{i=1}^n x_i
$$

and the sample mean is $ \bar{x} = S_n / n $. For large $ n $, the distribution of the sample sum $ S_n $ is close to the normal distribution $ N(\mu', \sigma') $, where:

$$
\begin{eqnarray}
\mu' & = & n \times \mu \\
\sigma' & = & \sqrt{n} \times \sigma
\end{eqnarray}
$$

Equivalently, the sample mean $ \bar{x} $ is approximately normal with mean $ \mu $ and standard deviation $ \sigma / \sqrt{n} $.
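
To turn this into a probability, the sum is standardized and the standard normal CDF $ \Phi $ is evaluated. The second formula below is the standard identity relating $ \Phi $ to the error function; it is added here as background and is not taken from the tutorial text:

$$
P(S_n \le x) \approx \Phi\left(\frac{x - n\mu}{\sigma\sqrt{n}}\right), \qquad \Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{z}{\sqrt{2}}\right)\right]
$$

This is the form that `math.erf` in Python's standard library can evaluate directly.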

[Full tutorial link on HackerRank](https://www.hackerrank.com/challenges/s10-the-central-limit-theorem-1/tutorial)


#### Task

A large elevator can transport a maximum of __9800__ pounds. Suppose a load of cargo containing __49__ boxes must be transported via the elevator. The box weight of this type of cargo follows a distribution with a mean of __205__ pounds and a standard deviation of __15__ pounds. Based on this information, what is the probability that all __49__ boxes can be safely loaded into the freight elevator and transported?

##### Input Format

There are 4 lines of input (shown below):

    9800
    49
    205
    15

The first line contains the maximum weight the elevator can transport. The second line contains the number of boxes in the cargo. The third line contains the mean weight of a cargo box, and the fourth line contains its standard deviation.

##### Output Format

Print the probability that the elevator can successfully transport all 49 boxes, rounded to a scale of 4 decimal places.

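```python
import math

# Read the four inputs; float() also accepts the integer values shown above.
x = float(input())     # maximum weight the elevator can carry
n = int(input())       # number of boxes
mean = float(input())  # mean weight of one box
std = float(input())   # standard deviation of the weight of one box

# To skip stdin, hard-code the values instead:
# x, n, mean, std = 9800, 49, 205, 15

# By the central limit theorem, for large n the total weight of the n boxes
# (the sample sum) is approximately normal with
#   mean1 = n * mean
#   std1  = sqrt(n) * std
mean1 = n * mean
std1 = (n ** 0.5) * std

def normal_distr(x, mu, sigma):
    """Normal CDF: P(X <= x) for X ~ N(mu, sigma), via the error function."""
    z = (x - mu) / (sigma * 2 ** 0.5)
    return (1 + math.erf(z)) / 2

print(f"{normal_distr(x, mean1, std1):0.4f}")
```

Output (the first four lines echo the input):

    9800
    49
    205
    15
    0.0098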
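
As a hand check of this result, the numbers from the task statement can be plugged into the normal approximation for the total weight. The value of $ \Phi $ is rounded to four decimal places:

$$
\begin{eqnarray}
\mu' & = & 49 \times 205 = 10045 \\
\sigma' & = & \sqrt{49} \times 15 = 105 \\
z & = & \frac{9800 - 10045}{105} \approx -2.333 \\
P(S_n \le 9800) & \approx & \Phi(-2.333) \approx 0.0098
\end{eqnarray}
$$

The expected total weight exceeds the elevator limit by more than two standard deviations, which is why the probability is so small.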


#### Task<a name="task2" />

The number of tickets purchased by each student for the University X vs. University Y football game follows a distribution that has a mean of __&mu; = 2.4__ and a standard deviation of __&sigma; = 2.0__.

A few hours before the game starts, __100__ eager students line up to purchase last-minute tickets. If there are only __250__ tickets left, what is the probability that all __100__ students will be able to purchase tickets?


##### Input Format

There are 4 lines of input (shown below):

    250
    100
    2.4
    2.0

The first line contains the number of last-minute tickets available at the box office. The second line contains the number of students waiting to buy tickets. The third line contains the mean number of purchased tickets, and the fourth line contains the standard deviation.

##### Output Format

Print the probability that __100__ students can successfully purchase the remaining __250__ tickets, rounded to a scale of __4__ decimal places.

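```python
import math

x = int(input())       # number of tickets left
n = int(input())       # number of students in line
mean = float(input())  # mean number of tickets bought per student
std = float(input())   # standard deviation of tickets bought per student

# To skip stdin, hard-code the values instead:
# x, n, mean, std = 250, 100, 2.4, 2.0

# The total number of tickets requested by n students is approximately
# normal with mean n * mean and standard deviation sqrt(n) * std.
mean1 = n * mean
std1 = (n ** 0.5) * std

def normal_distr(x, mu, sigma):
    """Normal CDF: P(X <= x) for X ~ N(mu, sigma), via the error function."""
    z = (x - mu) / (sigma * 2 ** 0.5)
    return (1 + math.erf(z)) / 2

print(f"{normal_distr(x, mean1, std1):0.4f}")
```

Output (the first four lines echo the input):

    250
    100
    2.4
    2.0
    0.6915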
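
The same number can be cross-checked without a hand-written CDF. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the variable names `tickets`, `total`, and `p` are chosen for this illustration only.

```python
from statistics import NormalDist

# Values from the task statement.
n, mu, sigma = 100, 2.4, 2.0
tickets = 250

# Normal approximation for the total number of tickets requested by n students.
total = NormalDist(mu=n * mu, sigma=n ** 0.5 * sigma)

z = (tickets - total.mean) / total.stdev  # standardized distance from the mean
p = total.cdf(tickets)                    # P(total <= tickets)

print(f"z = {z:.2f}, P = {p:.4f}")
```

With a total mean of 240 and a standard deviation of 20, the 250 available tickets sit half a standard deviation above the mean, so the probability is $ \Phi(0.5) \approx 0.6915 $.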


#### Task<a name="task3" />

You have a sample of __100__ values from a population with mean __&mu; = 500__ and with standard deviation __&sigma; = 80__. Compute the interval that covers the middle __95%__ of the distribution of the sample mean; in other words, compute __A__ and __B__ such that __P(A < x < B) = 95%__. Use the value of __z = 1.96__. Note that __z__ is the [z-score](https://en.wikipedia.org/wiki/Standard_score).

##### Input Format

There are five lines of input (shown below):

    100
    500
    80
    .95
    1.96

The first line contains the sample size. The second and third lines contain the respective mean (__&mu;__) and standard deviation (__&sigma;__). The fourth line contains the distribution percentage we want to cover (as a decimal), and the fifth line contains the value of __z__.

If you do not wish to read this information from stdin, you can hard-code it into your program.

##### Output Format

Print the following two lines of output, rounded to a scale of __2__ decimal places:

- On the first line, print the value of __A__.
- On the second line, print the value of __B__.

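```python
n = int(input())        # sample size
mean = float(input())   # population mean
std = float(input())    # population standard deviation
distr = float(input())  # coverage level; read but unused, since z is supplied directly
z = float(input())      # z-score for the requested coverage

# To skip stdin, hard-code the values instead:
# n, mean, std, distr, z = 100, 500, 80, 0.95, 1.96

# std is the population standard deviation; the sample mean has
# standard deviation (standard error) std / sqrt(n).
std1 = std / n ** 0.5

A = mean - z * std1
B = mean + z * std1

print(f"{A:0.2f}")
print(f"{B:0.2f}")
```

Output (the first five lines echo the input):

    100
    500
    80
    0.95
    1.96
    484.32
    515.68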
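
The task supplies __z = 1.96__ directly, but the value follows from the coverage level. As a sketch under that reading, the hypothetical helper `mean_interval` below derives the z-score from the coverage with `statistics.NormalDist().inv_cdf` (Python 3.8+) and then builds the interval from the standard error.

```python
from statistics import NormalDist

def mean_interval(n, mu, sigma, coverage):
    """Central interval for the sample mean under the normal approximation.

    Returns (A, B, z), where z is the two-sided critical value for `coverage`.
    """
    # A central 95% interval leaves 2.5% in each tail, so z is the 97.5% quantile.
    z = NormalDist().inv_cdf((1 + coverage) / 2)
    se = sigma / n ** 0.5  # standard error of the sample mean
    return mu - z * se, mu + z * se, z

a, b, z = mean_interval(100, 500, 80, 0.95)
print(f"z = {z:.4f}, A = {a:.2f}, B = {b:.2f}")
```

Here the standard error is $ 80 / \sqrt{100} = 8 $, so the interval is $ 500 \pm 1.96 \times 8 $, which agrees with the values printed by the solution.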
