# Binomial Distribution

### Exercise: Checking If a Random Variable Follows a Binomial Distribution

In this exercise, we will practice how to verify if a random variable follows a binomial distribution. We will also create a random variable using scipy.stats and plot the distribution. This will be a mostly conceptual exercise.

Here, we will check whether the random variable Z, the number of defective auto parts in a box of 12 parts, follows a binomial distribution (recall that we assume 4% of the auto parts are defective). Follow these steps to complete this exercise:

1. Import NumPy, Matplotlib, and scipy.stats following the usual conventions:

In [4]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats

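2. Try to conceptually check if Z fulfills the properties given for a binomial random variable:

📌 Discrete probability distribution: the variable can take only a limited set of values. The binomial distribution is also a discrete distribution.

📌 To decide whether a distribution is binomial, we need to check the following 4 properties:

    ✔️ Repeated trials
    ✔️ Two possible outcomes
        📎 success
        📎 failure
    ✔️ Constant probability of success
    ✔️ Independent trials

For Z, each of the 12 parts is one trial, a part is either defective ("success") or not ("failure"), the probability of a defect is 0.04 for every part, and the parts are assumed to be defective independently of one another.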
      

![image.png](attachment:image.png)

3. Determine the p and n parameters for the distribution of this variable: p = 0.04 (the probability that a part is defective) and n = 12 (the number of parts in a box).
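The four properties can also be checked empirically. The following is a minimal sketch, not part of the original exercise: it simulates many boxes as 12 independent defective/not-defective trials and compares the observed frequencies of Z with the binomial pmf. The names `n_packs`, `defective`, `z_samples`, `empirical`, and `theoretical`, as well as the seed and sample size, are illustrative assumptions.

```python
import numpy as np
from scipy import stats

n, p = 12, 0.04          # parts per box and defect probability, from the exercise
n_packs = 200_000        # number of simulated boxes (an arbitrary choice)

rng = np.random.default_rng(seed=0)  # fixed seed so the run is reproducible

# Each row is one box: 12 independent trials, each defective with probability p.
defective = rng.random((n_packs, n)) < p

# Z for each simulated box is the number of defective parts in that row.
z_samples = defective.sum(axis=1)

# Relative frequency of each value 0..n versus the binomial pmf.
empirical = np.bincount(z_samples, minlength=n + 1) / n_packs
theoretical = stats.binom.pmf(np.arange(n + 1), n, p)

for k in range(4):
    print(f"k={k}: simulated {empirical[k]:.4f}, binomial {theoretical[k]:.4f}")
```

If the simulated and theoretical columns agree closely, that is consistent with Z being binomial under the stated assumptions (constant p and independent parts).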

4. Use the theoretical formula with these parameters to get the exact theoretical probability of getting exactly one defective piece per box (using x = 1):

In [15]:
import math
n, p = 12, 0.04  # parameters from step 3
# Binomial formula: C(n, x) * p**x * (1 - p)**(n - x), with x = 1
print('P(Z = 1) = ', (math.factorial(n)/(math.factorial(1)*math.factorial(n-1)))*p**1*(1-p)**(n-1))

P(Z = 1) =  0.3063548786648836
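For reference, the calculation above can be written out by hand. This is a worked version of the standard binomial formula with n = 12, p = 0.04, and x = 1, rounded to four decimal places:

$$
P(Z = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}
\quad\Longrightarrow\quad
P(Z = 1) = \binom{12}{1} (0.04)^{1} (0.96)^{11} = 12 \times 0.04 \times 0.96^{11} \approx 0.3064
$$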


5. Use the scipy.stats module to produce an instance of the Z random variable. Name it Z_rv:

![image.png](attachment:image.png)

In [6]:
Z_rv = stats.binom(12, 0.04)  # frozen binomial random variable with n = 12, p = 0.04
Z_rv.pmf(1)  # P(Z = 1)

0.30635487866488303
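A frozen distribution object created with `stats.binom(n, p)` exposes more than the pmf. The sketch below, with illustrative variable names that are not in the original notebook, shows a few other quantities that can be read off the same object.

```python
from scipy import stats

n, p = 12, 0.04
Z_rv = stats.binom(n, p)   # frozen binomial distribution with fixed n and p

prob_one = Z_rv.pmf(1)            # P(Z = 1)
prob_at_most_one = Z_rv.cdf(1)    # P(Z <= 1) = P(Z = 0) + P(Z = 1)
prob_at_least_one = Z_rv.sf(0)    # P(Z >= 1) = 1 - P(Z = 0)

print(f"P(Z = 1)  = {prob_one:.4f}")
print(f"P(Z <= 1) = {prob_at_most_one:.4f}")
print(f"P(Z >= 1) = {prob_at_least_one:.4f}")
print(f"E[Z] = {Z_rv.mean():.4f}, Var[Z] = {Z_rv.var():.4f}")
```

The mean and variance should match the closed forms n·p and n·p·(1 − p).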

6. Plot the probability mass function of Z:

In [8]:
n=12
p=0.04

dist=[]

# Evaluate the pmf once for each possible value of Z (0 to n)
for i in range(n+1):
    prob=stats.binom.pmf(i,n,p)
    dist.append(prob)
    print(str(i) + "\t" + str(prob))

0	0.6127097573297674
1	0.30635487866488303
2	0.0702063263607024
3	0.00975087866120869
4	0.0009141448744883129
5	6.094299163255422e-05
6	2.9625065376936106e-06
7	1.0580380491762873e-07
8	2.755307419729918e-09
9	5.1024211476480043e-11
10	6.378026434559986e-13
11	4.831838207999973e-15
12	1.6777216000000067e-17


In [10]:
dist

[0.6127097573297674,
 0.30635487866488303,
 0.0702063263607024,
 0.00975087866120869,
 0.0009141448744883129,
 6.094299163255422e-05,
 2.9625065376936106e-06,
 1.0580380491762873e-07,
 2.755307419729918e-09,
 5.1024211476480043e-11,
 6.378026434559986e-13,
 4.831838207999973e-15,
 1.6777216000000067e-17]

In [17]:
import plotly.express as px

fig = px.bar(x=list(range(n+1)), y=dist)
fig.show()
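As an alternative to the loop, `stats.binom.pmf` accepts an array of values, so the whole pmf can be computed in one call. The sketch below is an optional variation rather than the notebook's own approach; it also checks that the probabilities sum to 1, a quick sanity check before plotting.

```python
import numpy as np
from scipy import stats

n, p = 12, 0.04
k = np.arange(n + 1)              # all possible values of Z: 0, 1, ..., 12
dist = stats.binom.pmf(k, n, p)   # pmf evaluated at every k in one call

print(f"sum of probabilities: {dist.sum():.6f}")
print(f"most likely value of Z: {k[dist.argmax()]}")
```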

# Normal Distribution

### Exercise: Using the Normal Distribution in Education

In this exercise, we'll use a normal distribution object from scipy.stats and the cdf and its inverse, ppf, to answer questions about education.

In psychometrics and education, it is a well-known fact that many variables relevant to education policy are normally distributed. For instance, scores in standardized mathematics tests follow a normal distribution. In this exercise, we'll explore this phenomenon: in a certain country, high school students take a standardized mathematics test whose scores follow a normal distribution with the following parameters: mean = 100, standard deviation = 15. Follow these steps to complete this exercise:

1. Import NumPy, Matplotlib, and scipy.stats following the usual conventions:

📌 The normal distribution is symmetric and bell-shaped.
![image.png](attachment:image.png)

![image.png](attachment:image.png)

2. Use the scipy.stats module to produce an instance of a normally distributed random variable, named X_rv, with mean = 100 and standard deviation = 15:

In [21]:
mean=100
sd=15

X_rv = stats.norm(loc=mean, scale=sd)  # frozen normal random variable
X_rv.pdf(1)  # density at x = 1, far out in the left tail

9.244533294435448e-12
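The density at a single point is hard to interpret on its own. As a rough illustration, with variable names that are assumptions rather than part of the original notebook, a frozen normal object can be used to check the familiar share of scores within one and two standard deviations of the mean.

```python
from scipy import stats

mean, sd = 100, 15
X_rv = stats.norm(loc=mean, scale=sd)  # frozen normal distribution

# Probability that a score falls within one / two standard deviations of the mean.
within_one_sd = X_rv.cdf(mean + sd) - X_rv.cdf(mean - sd)
within_two_sd = X_rv.cdf(mean + 2 * sd) - X_rv.cdf(mean - 2 * sd)

# The density is highest at the mean, where it equals 1 / (sd * sqrt(2 * pi)).
peak_density = X_rv.pdf(mean)

print(f"P(85 < X < 115) = {within_one_sd:.4f}")
print(f"P(70 < X < 130) = {within_two_sd:.4f}")
print(f"pdf at the mean = {peak_density:.6f}")
```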

3. Plot the probability density function of X:

In [50]:
a=np.linspace(40,160,1000)
a

array([ 40.        ,  40.12012012,  40.24024024,  40.36036036,
        40.48048048,  40.6006006 ,  40.72072072,  40.84084084,
        40.96096096,  41.08108108,  41.2012012 ,  41.32132132,
        41.44144144,  41.56156156,  41.68168168,  41.8018018 ,
        41.92192192,  42.04204204,  42.16216216,  42.28228228,
        42.4024024 ,  42.52252252,  42.64264264,  42.76276276,
        42.88288288,  43.003003  ,  43.12312312,  43.24324324,
        43.36336336,  43.48348348,  43.6036036 ,  43.72372372,
        43.84384384,  43.96396396,  44.08408408,  44.2042042 ,
        44.32432432,  44.44444444,  44.56456456,  44.68468468,
        44.8048048 ,  44.92492492,  45.04504505,  45.16516517,
        45.28528529,  45.40540541,  45.52552553,  45.64564565,
        45.76576577,  45.88588589,  46.00600601,  46.12612613,
        46.24624625,  46.36636637,  46.48648649,  46.60660661,
        46.72672673,  46.84684685,  46.96696697,  47.08708709,
        47.20720721,  47.32732733,  47.44744745,  47.56

In [51]:
stats.norm.pdf(a,mean,sd)

array([8.92201505e-06, 9.21213643e-06, 9.51108187e-06, 9.81909877e-06,
       1.01364408e-05, 1.04633679e-05, 1.08001466e-05, 1.11470502e-05,
       1.15043587e-05, 1.18723589e-05, 1.22513451e-05, 1.26416184e-05,
       1.30434876e-05, 1.34572690e-05, 1.38832865e-05, 1.43218720e-05,
       1.47733654e-05, 1.52381148e-05, 1.57164767e-05, 1.62088160e-05,
       1.67155066e-05, 1.72369309e-05, 1.77734808e-05, 1.83255572e-05,
       1.88935704e-05, 1.94779404e-05, 2.00790971e-05, 2.06974802e-05,
       2.13335397e-05, 2.19877361e-05, 2.26605403e-05, 2.33524340e-05,
       2.40639102e-05, 2.47954726e-05, 2.55476368e-05, 2.63209297e-05,
       2.71158901e-05, 2.79330692e-05, 2.87730299e-05, 2.96363481e-05,
       3.05236121e-05, 3.14354234e-05, 3.23723966e-05, 3.33351597e-05,
       3.43243544e-05, 3.53406363e-05, 3.63846750e-05, 3.74571548e-05,
       3.85587744e-05, 3.96902473e-05, 4.08523025e-05, 4.20456841e-05,
       4.32711518e-05, 4.45294815e-05, 4.58214650e-05, 4.71479106e-05,
      

In [54]:
import plotly.express as px


fig = px.line(x=a, y=stats.norm.pdf(a,mean,sd))
fig.show()

4. The Ministry of Education has decided that the minimum score for someone to be considered competent in mathematics is 80. Use the cdf method to calculate the proportion of students who will score above this threshold:

In [29]:
1- stats.norm.cdf(80,100,15)

0.9087887802741321

In [40]:
print(f'P(X>80): {1- stats.norm.cdf(80,100,15):0.5f}')
print(f'P(X>80): {100*(1- stats.norm.cdf(80,100,15)):0.2f}%')

P(X>80): 0.90879
P(X>80): 90.88%
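The same proportion can be obtained in other equivalent ways. The sketch below is an optional cross-check, with illustrative variable names: it uses the survival function `sf`, which returns P(X > x) directly, and the z-score of the cut-off on the standard normal distribution.

```python
from scipy import stats

mean, sd, cutoff = 100, 15, 80

z = (cutoff - mean) / sd                         # z-score of the cut-off, about -1.33

via_cdf = 1 - stats.norm.cdf(cutoff, mean, sd)   # complement of the cdf
via_sf = stats.norm.sf(cutoff, mean, sd)         # survival function: P(X > cutoff)
via_z = stats.norm.sf(z)                         # same probability on the standard normal

print(f"z = {z:.4f}")
print(f"1 - cdf: {via_cdf:.5f}, sf: {via_sf:.5f}, standard normal: {via_z:.5f}")
```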


5. A very selective university wants to set very high standards for the high school students it admits to its programs. The policy of the university is to only admit students with mathematics scores in the top 2% of the population. Use the ppf method (which is essentially the inverse function of the cdf method) with an argument of 1 - 0.02 = 0.98 to get the cut-off score for admission:

In [41]:
stats.norm.ppf(0.98, mean, sd)

130.80623365947733
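To confirm that the cut-off behaves as intended, one can feed it back into the distribution. This is a small verification sketch with illustrative variable names; `isf` is the inverse survival function, which takes the upper-tail probability directly.

```python
from scipy import stats

mean, sd = 100, 15

cutoff = stats.norm.ppf(0.98, mean, sd)         # score with 98% of students below it
share_above = stats.norm.sf(cutoff, mean, sd)   # share of students above the cut-off
same_cutoff = stats.norm.isf(0.02, mean, sd)    # same score from the upper-tail probability
z_cutoff = (cutoff - mean) / sd                 # cut-off in standard deviations above the mean

print(f"cut-off score: {cutoff:.2f} (z = {z_cutoff:.3f})")
print(f"share of students above the cut-off: {share_above:.4f}")
print(f"cut-off via isf(0.02): {same_cutoff:.2f}")
```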

In this exercise, we used a normal distribution and the cdf and ppf methods to answer real-world questions about education policy.