#### Run the below code to import all libraries required to run sample code within this notebook

In [14]:
# Just run the below code

import numpy as np 
from ipywidgets import interact, interactive, fixed, interact_manual
import ipywidgets as widgets
 
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
from plotly import graph_objs as go
init_notebook_mode(connected=True)

import random

from IPython.display import display, Math

from bokeh.io import show, output_notebook
from bokeh.plotting import figure, show
from scipy.stats import norm 
from bokeh import plotting as pl
from bokeh.models import HoverTool, Arrow, OpenHead, NormalHead, VeeHead


output_notebook()

### Solution code

```python
# Just run above code
```

Now that we have established an understanding of ideas such as statistical populations, samples, and elementary estimates, we are going to use all of them to actually start doing some statistics. The first thing we are going to tackle is called hypothesis testing. There is a bit of vocabulary that you will need for this. Fear not, we will cover it in detail with lots of examples. 

But first we need to motivate why we need to learn hypothesis testing. Suppose I give you a problem where 


To deal with problems like this we need to use hypothesis testing. This notebook is going to be broken down into the following sections- 

1) Basic idea behind hypothesis testing <br>
2) Types of error <br>
3) Examples of hypothesis testing <br>
4) One sided and two sided tests <br>
5) p values <br>


# Basic idea behind hypothesis testing 


Suppose you are a web designer. You are rebuilding a really sweet website for a new company that you are working with. You have finished everything except the opening page: you can't decide what color to choose for it, so you ask your boss. Your boss says we have had a red opening page for the longest time and that it works. You say, nah! If we switch the color to green, more people will stay on the site, since green is pleasing to the eye. 

How would you resolve this situation by using data ? 

This is where we can utilize hypothesis testing. The various steps of hypothesis testing are- 

* Define the statistic you want to use for the hypothesis test. We are going to be mostly using the mean 
* Define the null and the alternate hypothesis in terms of the statistic from step 1 
* Define a significance level for your problem   
* Calculate the test statistic  
* Compare the test statistic to the z value for the significance level and, based on the comparison, either reject or accept (more precisely, fail to reject) the null hypothesis

1) Defining the statistic that you are going to be using 

Let's take the problem that we defined earlier. One statistic we can use is the mean time spent on the website. Suppose your boss gives you the data on how much time people spend on the current opening page, which is red. You compute the mean time spent there; say it's 5 mins. Hence, our null and alternate hypotheses will have to be stated in terms of this parameter, the mean amount of time that people spend on the page. 

2) Define the null and alternate hypothesis 

In step 1, we mentioned that the average time spent on the website opening page is currently 5 mins. So your alternate hypothesis will be that the average time spent on the opening page will be greater than 5 mins after the change in the page color. 
Your null hypothesis is that even after the change of color to green, the average time spent on the opening page will be less than or equal to 5 mins. So let's write this down in a more formal way- 

$H_o : \mu_o \leq 5$  mins  <br>

$H_a : \mu_a > 5$ mins 


3) Defining the significance level 

The significance level (also known as the alpha level) is a probability value. It is the threshold that decides when you reject the null hypothesis: it is the probability of rejecting the null hypothesis when it is in fact true. With $\alpha = 0.05$, if the null hypothesis were true and we repeated the experiment many times, we would wrongly reject it in about 5 out of 100 samples. This will become clear as we go along. Typically, when a significance level is not given, you can assume $\alpha= 0.05$.

4) Calculate the test statistic

This is the main calculation that we have to do. In our example, the data our boss gave us essentially defines the population of times spent on the opening page, and we can get a mean from that. Then we generate our own sample: we change the opening page to green and record how long people spend on it. Suppose that, after the color change, we took a sample of 10 people and found that they spent on average about 6 mins on the page, so our sample mean is $\bar x= 6$ mins. Suppose also that the population standard deviation is known to be 1 min. Once we have this information we plug it into - 


$\begin{align}z_c = \dfrac{\bar x - \mu_p}{\bar \sigma} \qquad (1) \end{align}$

where

${\bar \sigma} = {\sigma_p}/{\sqrt{n}}$

$z_c$ is the test statistic (this notebook also refers to it as the critical value) <br>
$\bar x$ is the sample mean  <br>
$\mu_p$ is the population mean  <br>
$\bar \sigma $ is the standard error of the sample mean <br>
$\sigma_p $ is the population standard deviation <br>
$n$ is the sample size. In our case $n =10$

Note that we can only define the test statistic this way because we have chosen our statistic to be the mean. If we change that to something else, the expression above will also change. 

What we are actually looking at in equation (1) is nothing more than a standard normal distribution. The sample mean is approximately normally distributed when the population itself is normal or when the sample is large (a common rule of thumb is more than 30 units). Our sample of 10 is below that rule of thumb, so in this example we are assuming that the times themselves are roughly normally distributed. Once we accept that, calculating $z_c$ is easy: all we have to do is plug in the numbers. So let's actually do that. 

Question: calculate the value of $z_c$ from equation (1). Hint: the rest of the numbers are in the problem definition

In [15]:
# Just run the below code

# answer 
sigma = 1 
mean_pop = 5 
mean_sample =6 
sample_size = 10

z_c = (mean_sample-mean_pop)* (np.sqrt(sample_size)/sigma)
print("critical value is ", z_c )
 

critical value is  3.1622776601683795


### Solution code

```python
# Just run above code
```

The value of $z_c$ is shown above. So what's next? 

5) Compare the z value for the significance level with $z_c$ to decide whether to accept or reject the null hypothesis 

In order to do this we need the z value for the significance level. So how do we get it? We go back to our normal probability distribution and figure out the z value for a given probability. This is done using the percent point function (`norm.ppf`), which is the inverse of the cumulative distribution function. Below we have provided an interactive interface where you can change the value of the significance level and print out the corresponding z value. 
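As a quick non-interactive check first, here is a minimal sketch of what the percent point function does, assuming the standard normal distribution from `scipy.stats.norm`. The variable names `z_right` and `z_left` are chosen only for this illustration.

```python
from scipy.stats import norm

alpha = 0.05  # significance level written as a probability

# ppf is the inverse of the cdf: it maps a cumulative probability to a z value
z_right = norm.ppf(1 - alpha)  # cutoff with 5% of the area to its right
z_left = norm.ppf(alpha)       # cutoff with 5% of the area to its left

print("right tail cutoff:", round(z_right, 3))
print("left tail cutoff: ", round(z_left, 3))

# cdf undoes ppf and recovers the cumulative probability we started from
print("round trip:", round(norm.cdf(z_right), 3))
```

The two cutoffs differ only in sign, which is the symmetry the later sections rely on.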

In [27]:
# Just run the below code

conf_int =0 
xrange =np.linspace(-5,5,10000)
pdf = norm(0,1).pdf(xrange)
#answer
tools_to_show= 'box_zoom,pan,save,hover,reset,tap,wheel_zoom'        



def get_sig_lvl(significance_level): 
    
    fig = pl.figure(x_range=[-5,5], 
                    plot_height=400,
                    tools = tools_to_show,
                    title="Enter significance level (in %) - Figure 1",
                    x_axis_label= "z-values",
                    y_axis_label ="Count")
    
    fig.line(x=xrange, y= pdf, line_width = 4)

    fig.xgrid.grid_line_color = None
    fig.y_range.start = 0
    
    hover = fig.select(dict(type=HoverTool))
    hover.tooltips = [("xvalue", "@x"), ("yvalue", "@y")]
   
    # calculate the z value 
    z_value = norm.ppf((100-significance_level)/100)
   
    shade_x = np.arange(z_value,5.0,0.001)
    shade_region=norm.pdf(shade_x)
    
    shade_region[0] = 0 
    shade_region[-1] = 0
    
    fig.patch(shade_x, shade_region, color="red", alpha =0.4)
    
    fig.text(x=-0.8, y=0.2, text=["Acceptance \nregion"], )
    
    reject_text_x = 2.15
    reject_text_y = 0.15
    fig.text(x =reject_text_x ,y = reject_text_y, text=["Rejection region"])
    arrow_y_end = shade_region[int(shade_region.size/2)]
    arrow_x_end = shade_x[int(shade_x.size/5)]
    
    
    fig.add_layout(Arrow(end=NormalHead(fill_color="black"),
                   x_start=reject_text_x+0.5,
                   y_start=reject_text_y-0.001,
                   x_end=arrow_x_end,
                   y_end=arrow_y_end+0.01))

    
    show(fig)

    print("z value for the given significance level: ",z_value)
    return None 




# FloatText has no min/max traits (BoundedFloatText does), so only value and step are set
interact(get_sig_lvl, 
                 significance_level = widgets.FloatText(value = 5, 
                                                        step =0.001)
                );


interactive(children=(FloatText(value=5.0, description='significance_level', step=0.001), Output()), _dom_clas…

### Solution code

```python
# Just run above code
```

We call the red region in the above plot the rejection region, because all the z values in that region are above the threshold that we have set for ourselves. If the test statistic that we get from equation (1) falls in the rejection region, then we reject the null hypothesis. Otherwise we accept it (strictly speaking, we fail to reject it). So in our case - 

$z_c  =  3.16$<br> 
$z_\alpha  =  1.65$<br>

therefore <br>
$z_{\alpha} < z_c$

where

$z_{\alpha}$ is the z value at the significance level given by $\alpha =0.05$


Hence we can reject the null hypothesis. What does that mean for the actual problem? It means the change in the amount of time the user spends on the website is "statistically significant". In other words, a sample mean as large as $\bar x = 6$ would be very unlikely if the true mean were still 5 mins, so it is better thought of as coming from a new distribution with a higher mean than from the current one. 
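The five steps for the right tailed case can be collected into one small function. This is only a sketch under the assumptions of the example (known population standard deviation, normally distributed sample mean); the function name `right_tailed_z_test` and its parameters are made up for this illustration.

```python
import numpy as np
from scipy.stats import norm

def right_tailed_z_test(sample_mean, pop_mean, pop_sigma, n, alpha=0.05):
    """Hypothetical helper: return (z_c, z_alpha, reject) for H_a: mean > pop_mean."""
    standard_error = pop_sigma / np.sqrt(n)         # sigma_p / sqrt(n)
    z_c = (sample_mean - pop_mean) / standard_error  # equation (1)
    z_alpha = norm.ppf(1 - alpha)                    # cutoff for the right tail
    return z_c, z_alpha, bool(z_c > z_alpha)

# Numbers from the website example: x_bar = 6, mu_p = 5, sigma_p = 1, n = 10
z_c, z_alpha, reject = right_tailed_z_test(6, 5, 1, 10)
print("z_c =", round(z_c, 3))
print("z_alpha =", round(z_alpha, 3))
print("reject null hypothesis:", reject)
```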


# One tailed tests 

What we saw above was an example of a one tailed test or, more specifically, a right tailed test. It is a right tailed test because we were looking at the condition where the sample mean was greater than the population mean, so in the z-value distribution the rejection region lies to the right of the mean. Suppose instead you find that the average time the user spends on the page is actually 4.5 mins rather than the population mean of 5. Would you then say it's statistically different from what you had before? For this left tailed test the hypotheses become $H_o : \mu_o \geq 5$ mins and $H_a : \mu_a < 5$ mins. 


Question : Given the new sample mean $\bar x = 4.5$ min calculate the test statistic (equation (1)), assume a significance level of 5 % for this problem 


In [28]:
# Just run the below code

# answer 
sigma = 1 
mean_pop = 5 
mean_sample =4.5
sample_size = 10

z_c = (mean_sample-mean_pop)* (np.sqrt(sample_size)/sigma)
print("critical value is ", z_c )
 

critical value is  -1.5811388300841898


### Solution code

```python
# Just run above code
```

What is this... our z value is negative! Looking at Figure 1, that makes sense, since a negative value lies to the left of the mean. But to go to step 5 we need the $z_\alpha$ value.

Question:  Calculate the $z_\alpha$ value given $\alpha = 5$% and using the value of $z_c$  from the last question, should we accept or reject the null hypothesis? 

In [18]:
# Just run the below code

# answer 
alpha = 5
z_alpha = norm.ppf(alpha/100)
print("z_alpha for 5 % significance level is: ", z_alpha)

z_alpha for 5 % significance level is:  -1.6448536269514729


### Solution code

```python
# Just run above code
```

So we have- 


$z_c  =  -1.58$<br> 
$z_\alpha  =  -1.65$<br>

therefore <br>
$z_{\alpha} < z_c$

In this case we need $z_c < z_{\alpha}$ to reject the null hypothesis, hence we are accepting the null hypothesis. Remember that the larger the magnitude of a negative number, the smaller the number is, so -100 is smaller than -50. 

Similar to how we built a visualization for a right tailed test, here is one for a left tailed test. 

In [19]:
# Just run the below code

conf_int =0 
xrange =np.linspace(-5,5,10000)
pdf = norm(0,1).pdf(xrange)
#answer
tools_to_show= 'box_zoom,pan,save,hover,reset,tap,wheel_zoom'        



def get_sig_lvl(significance_level): 
    
    fig = pl.figure(x_range=[-5,5], 
                    plot_height=400,
                    tools = tools_to_show,
                    title="Enter significance level (in %) - Figure 2",
                    x_axis_label= "z-values",
                    y_axis_label ="Count")
    
    fig.line(x=xrange, y= pdf, line_width = 4)

    fig.xgrid.grid_line_color = None
    fig.y_range.start = 0
    
    hover = fig.select(dict(type=HoverTool))
    hover.tooltips = [("xvalue", "@x"), ("yvalue", "@y")]
   
    # calculate the z value 
    z_value = norm.ppf((significance_level)/100)
   
    shade_x = np.arange(-5.0,z_value, 0.001)
    shade_region=norm.pdf(shade_x)
    
    shade_region[0] = 0 
    shade_region[-1] = 0
    
    fig.patch(shade_x, shade_region, color="red", alpha =0.4)
    
    fig.text(x=-0.8, y=0.2, text=["Acceptance \nregion"], )
    
    reject_text_x = -4.15
    reject_text_y = 0.15
    
    fig.text(x =reject_text_x ,y = reject_text_y, text=["Rejection region"])
    
    arrow_y_end = shade_region[int(shade_region.size/2)]
    arrow_x_end = shade_x[int(shade_x.size*0.8)]
    
    
    fig.add_layout(Arrow(end=NormalHead(fill_color="black"),
                   x_start=reject_text_x+0.5,
                   y_start=reject_text_y-0.001,
                   x_end=arrow_x_end,
                   y_end=arrow_y_end+0.01))

    
    show(fig)

    print("z value for the given significance level: ",z_value)
    return None 




# FloatText has no min/max traits (BoundedFloatText does), so only value and step are set
interact(get_sig_lvl, 
                 significance_level = widgets.FloatText(value = 5, 
                                                        step =0.001)
                );



interactive(children=(FloatText(value=5.0, description='significance_level', step=0.001), Output()), _dom_clas…

### Solution code

```python
# Just run above code
```

So now we have a calculator for a left tailed test as well. So we have seen simple examples of both right tailed and left tailed tests. What about a two tailed test? That is next. 

# Two tailed test

To look at the two tailed test, let's redefine the problem your boss gave you. Suppose your boss said hey! as long as the variation in count is not more than 3%, it's alright. From this we take the standard deviation to be 1.5 mins. Since we now care about a change in either direction, the hypotheses are $H_o : \mu_o = 5$ mins and $H_a : \mu_a \neq 5$ mins. Given this information we can calculate the value of $z_c$. 

In [20]:
# Just run the below code

# answer 
sigma =1.5
mean_pop = 5 
mean_sample =4.5
sample_size = 10

z_c = (mean_sample-mean_pop)* (np.sqrt(sample_size)/sigma)
print("critical value is ", z_c )
 

critical value is  -1.0540925533894598


### Solution code

```python
# Just run above code
```

Our value of $z_c$ is different from last time since our standard deviation changed. Our significance level remains the same, BUT the 5% is now spread across both tails, so we have to divide it by two, hence- 

$\begin{align} \alpha =0.05 \end{align}$ <br>

$\begin{align} \alpha_{tt} = \alpha /2  = 0.025 \end{align}$

where $\alpha_{tt}$ is the alpha in each tail of the two tailed test. So we will in fact have two values for $z_\alpha$, one for each tail. Below is the calculation.


In [21]:
# Just run the below code

alpha_tt = 2.5
z_alpha = norm.ppf(alpha_tt/100)
print("z_alpha for 5 % significance level is:{} and {} " .format(z_alpha, -z_alpha))


z_alpha for 5 % significance level is:-1.9599639845400545 and 1.9599639845400545 


### Solution code

```python
# Just run above code
```

What we did here was calculate the z value for a significance level of 2.5% in the left tail and flip its sign to get the value for the right tail, since we are considering both ends of the distribution. Remember we can do this ONLY because the normal distribution is symmetric. If we had a non-symmetric distribution then we CANNOT do this. This is very important: if you naively apply this hypothesis test to a problem where the sample mean is not approximated by a normal distribution, then you are in trouble. 
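To see why the sign flip depends on symmetry, the sketch below compares the two cutoffs of the standard normal with those of a skewed distribution. The exponential distribution (`scipy.stats.expon`) is used here purely as a contrast; it is not part of the website example.

```python
from scipy.stats import norm, expon

alpha_tt = 0.025  # probability in each tail

# Standard normal: the two cutoffs are mirror images around the mean 0
print("normal cutoffs:", norm.ppf(alpha_tt), norm.ppf(1 - alpha_tt))

# Exponential with mean 1: the cutoffs sit at very different distances from the mean
lower, upper = expon.ppf(alpha_tt), expon.ppf(1 - alpha_tt)
print("distance below mean:", expon.mean() - lower)
print("distance above mean:", upper - expon.mean())
```

For the skewed case both cutoffs have to be computed separately with `ppf`; negating one does not give the other.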

So now that we have these values we can go to step 5) and compare the values of $z_c$ and $z_{\alpha_{tt}}$  

So we have- 


$z_c  =  -1.05$<br> 
$z_\alpha  =  \pm 1.96$<br>

therefore <br>
$-1.96 < z_c < 1.96$, so $z_c$ lies in neither rejection region 

So we cannot reject the null hypothesis.
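The three decision rules seen so far differ only in where the rejection region sits. A minimal sketch that puts them side by side follows; the function name `reject_null` and the `tail` labels are naming choices made for this illustration only.

```python
from scipy.stats import norm

def reject_null(z_c, alpha=0.05, tail="two"):
    """Hypothetical helper: True if z_c falls in the rejection region.

    tail is "right", "left" or "two".
    """
    if tail == "right":
        return bool(z_c > norm.ppf(1 - alpha))
    if tail == "left":
        return bool(z_c < norm.ppf(alpha))
    if tail == "two":
        # alpha is split evenly between the two tails
        return bool(abs(z_c) > norm.ppf(1 - alpha / 2))
    raise ValueError("tail must be 'right', 'left' or 'two'")

# The three worked examples from this notebook
print("right tailed, z_c = 3.16 :", reject_null(3.16, tail="right"))
print("left tailed,  z_c = -1.58:", reject_null(-1.58, tail="left"))
print("two tailed,   z_c = -1.05:", reject_null(-1.05, tail="two"))
```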


So let's now visualize a two tailed test figure.



In [22]:
# Just run the below code

conf_int =0 
xrange =np.linspace(-5,5,10000)
pdf = norm(0,1).pdf(xrange)
#answer
tools_to_show= 'box_zoom,pan,save,hover,reset,tap,wheel_zoom'        


def shade_reject_region(z_min,z_max ): 
    shade_x  = np.arange(z_min,z_max,0.001)
    shade_region  =norm.pdf(shade_x)
    
    shade_region[0] = 0 
    shade_region[-1] = 0
   
    return shade_x, shade_region

def get_sig_lvl(significance_level): 
    
    fig = pl.figure(x_range=[-5,5], 
                    plot_height=400,
                    tools = tools_to_show,
                    title="Enter significance level (in %) - Figure 3",
                    x_axis_label= "z-values",
                    y_axis_label ="Count")
    
    fig.line(x=xrange, y= pdf, line_width = 4)

    fig.xgrid.grid_line_color = None
    fig.y_range.start = 0
    
    hover = fig.select(dict(type=HoverTool))
    hover.tooltips = [("xvalue", "@x"), ("yvalue", "@y")]
   
    # calculate right 
    z_value = norm.ppf((100-significance_level/2)/100)
    right_shade, right_region= shade_reject_region(z_value,5)
    fig.patch(right_shade, right_region, color="red", alpha =0.4)
    
    # calculate left 
    z_value = norm.ppf((significance_level/2)/100)
    left_shade, left_region= shade_reject_region(-5,z_value)
    fig.patch(left_shade, left_region, color="red", alpha =0.4)
    
    fig.text(x=-0.8, y=0.2, text=["Acceptance \nregion"], )
    
    # right reject region title
    
    reject_text_x = 2.15
    reject_text_y = 0.15
    fig.text(x =reject_text_x ,y = reject_text_y, text=["Rejection region"])
    arrow_y_end = right_region[int(right_region.size/2)]
    arrow_x_end = right_shade[int(right_shade.size/5)]
    
    
    fig.add_layout(Arrow(end=NormalHead(fill_color="black"),
                   x_start=reject_text_x+0.5,
                   y_start=reject_text_y-0.001,
                   x_end=arrow_x_end,
                   y_end=arrow_y_end+0.01))

    # left reject region title 
    
    reject_text_x = -4.15
    reject_text_y = 0.15
    fig.text(x =reject_text_x ,y = reject_text_y, text=["Rejection region"])
    arrow_y_end = left_region[int(left_region.size/2)]
    arrow_x_end = left_shade[int(left_shade.size*0.8)]
    
    
    fig.add_layout(Arrow(end=NormalHead(fill_color="black"),
                   x_start=reject_text_x+0.5,
                   y_start=reject_text_y-0.001,
                   x_end=arrow_x_end,
                   y_end=arrow_y_end+0.01))

    
    show(fig)

    print("z value for the given significance level: {} and {}".format(z_value,-z_value))
    return None 




# FloatText has no min/max traits (BoundedFloatText does), so only value and step are set
interact(get_sig_lvl, 
                 significance_level = widgets.FloatText(value = 5, 
                                                        step =0.001)
                );


interactive(children=(FloatText(value=5.0, description='significance_level', step=0.001), Output()), _dom_clas…

### Solution code

```python
# Just run above code
```

So that covers the three kinds of z-test you can do: right tailed, left tailed and two tailed. Sounds great, but there are a couple more topics that we need to cover in order to get a better understanding. One important topic is understanding the kinds of errors that can happen while we do hypothesis testing. 


# Types of Errors 

When working with hypothesis testing there are two types of error that you need to be careful about. Both are, rather unimaginatively, labeled Type I error and Type II error. They are 

* Type I error - Rejecting the null hypothesis when it is actually true
* Type II error - Accepting the null hypothesis when it is actually false 

Depending on the situation, one type of error will be worse than the other. For instance, in the above example, suppose we made a Type I error and said that the average time went up after changing the page color when in reality it stayed the same; then it's okay. But in the second case, where the average time spent viewing the page was less than the population mean, we might actually lose revenue by making a Type II error, since we accept the null hypothesis when the truth is that we need to reject it and accept the alternate hypothesis. This is especially true in the field of medicine, where accepting the wrong hypothesis can be fatal to a patient! Either way, be careful about making these errors. 
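To make the Type I error concrete, the simulation below repeats the right tailed experiment many times in a world where the null hypothesis is true (the mean really is 5 mins). It is a sketch under assumed settings: normally distributed times, 20000 repetitions and a fixed random seed are all choices made for this illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

pop_mean, pop_sigma, n, alpha = 5, 1, 10, 0.05
n_experiments = 20000  # assumed number of repeated experiments
z_alpha = norm.ppf(1 - alpha)

# Draw every sample from a population where the null hypothesis is exactly true
samples = rng.normal(pop_mean, pop_sigma, size=(n_experiments, n))
z_c = (samples.mean(axis=1) - pop_mean) / (pop_sigma / np.sqrt(n))

# Each rejection is a Type I error, because the mean has not actually changed
type_1_rate = np.mean(z_c > z_alpha)
print("fraction of false rejections:", type_1_rate)
```

The fraction should come out close to the significance level, which is exactly what choosing $\alpha$ controls.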

So how can we get around these types of errors? Collect more data! The Type I error rate is fixed by the significance level you choose, but for a given significance level a larger sample makes a Type II error less likely. The larger your sample size is, the smaller the standard error of its mean will be. Remember from equation (1) that the standard error is proportional to the inverse of the square root of the sample size. So the larger the sample size is, the smaller the standard error is! Use this to your advantage. 
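The effect of sample size can be checked with a short calculation. The sketch below assumes a right tailed test and a hypothetical true mean of 5.5 mins after the color change (that value is not from the example, it is picked only to have something to detect), and computes the Type II error probability for a few sample sizes.

```python
import numpy as np
from scipy.stats import norm

pop_mean, pop_sigma, alpha = 5, 1, 0.05
true_mean = 5.5  # assumed true mean after the change, for illustration only
z_alpha = norm.ppf(1 - alpha)

def type_2_error(n):
    """P(fail to reject H_0) for a right tailed z-test when the mean is really true_mean."""
    standard_error = pop_sigma / np.sqrt(n)
    shift = (true_mean - pop_mean) / standard_error  # where z_c is centred under true_mean
    return norm.cdf(z_alpha - shift)

for n in (10, 30, 100):
    standard_error = pop_sigma / np.sqrt(n)
    print("n =", n,
          " standard error =", round(standard_error, 3),
          " Type II error =", round(type_2_error(n), 3))
```

As the sample grows the standard error shrinks, and the same 0.5 min difference becomes much harder to miss.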


# P-values

There is one more super important topic we need to cover before we close this section: the p-value. Whenever you read about hypothesis testing you will very often come across this term. In fact, many statistical software packages output p-values in addition to $z_c$ values. Sometimes studies or academic papers will just report p-values rather than any other information. 

The technical definition of p-value is-  " the probability for a given statistical model that, when the null hypothesis is true, the statistical summary (such as the sample mean difference between two compared groups) would be greater than or equal to the actual observed results." This is from wikipedia and honestly the first time I read it, it just did not make any sense. So we are going to go slowly and understand what it is. 

For this purpose we are going back to the left tailed condition- 

So we have-

$\begin{align*} n = 10  \\ \mu_p = 5 \\ \bar x = 4.5 \\ \sigma_p = 1 \\ \alpha = 5\% \end{align*}$

From which we acquired-

$\begin{align*} z_c  =  -1.58  \\ z_\alpha  =  -1.65 \end{align*}$

and we got the conclusion

$z_{\alpha} < z_c$

In this case we had accepted the null hypothesis, since for a left tailed test we reject only when $z_c < z_\alpha$.



There is another way to go about it. We used the significance level, which is a probability, to calculate $z_\alpha$ with the percent point function. We can do the reverse with $z_c$: turn it into a probability using the cdf, which is the inverse of the percent point function. 

Question: calculate the probability associated with $z_c$ and compare it to the significance level. What can you tell about accepting or rejecting the null hypothesis from this comparison? 


In [23]:
# Just run the below code

#answer 
z_c= -1.58 
probability_c = norm.cdf(z_c)
print("probability associated with critical z value is: ", round(probability_c,2))


probability associated with critical z value is:  0.06


### Solution code

```python
# Just run above code
```

So if we compare the probability for $z_c$ with the significance level, we find that probability $> \alpha$. This is the condition under which we accepted the null hypothesis. 

Now let us do the same for the right tailed test here we had -

$z_c  =  3.16$<br> 
$z_\alpha  =  1.65$<br>

Question: Again calculate the probability associated with $z_c$ and compare it to the significance level. What can you tell about accepting or rejecting the null hypothesis from this comparison? Compare this to the case when we had a left tailed test. 


In [24]:
# Just run the below code

#answer 
z_c= 3.16
probability_c = 1-norm.cdf(z_c)
print("probability associated with critical z value is: ",probability_c)


probability associated with critical z value is:  0.0007888456943755395


### Solution code

```python
# Just run above code
```

You see what we did there? We did not take the cdf value of $z_c$ directly; we took $1-$probability, since we want the probability to the right of $z_c$. We found it to be $0.0008$. Now $0.0008 < \alpha$, and under this condition we rejected the null hypothesis. So this probability value that we calculate from $z_c$ is quite useful, since we can just ask if it is greater or less than the significance level. It saves us from writing down a $z_\alpha$ value and lets us deal directly with probabilities. This probability value is called the p-value. So we can generalize the message and say-  

* $\alpha \gt  \text {p-value} \qquad  \text {then reject the null hypothesis}$
<br>
* $\alpha \lt  \text {p-value}  \qquad \text {then accept the null hypothesis}$

The above general rule applies to the probability calculated from a two tailed test as well. The difference comes in how we calculate the p-value. The p-value for a two tailed test is double that for a one tailed test with the same $z_c$ value, since an equally extreme sample could also turn up at the other end of the distribution. So now that we have presented the definition of the p-value, let's visualize it- 


In [25]:
# Just run the below code

conf_int =0 
xrange =np.linspace(-5,5,10000)
pdf = norm(0,1).pdf(xrange)
#answer
tools_to_show= 'box_zoom,pan,save,hover,reset,tap,wheel_zoom'        


def shade_reject_region(z_min,z_max ): 
    shade_x  = np.arange(z_min,z_max,0.001)
    shade_region  =norm.pdf(shade_x)
    
    shade_region[0] = 0 
    shade_region[-1] = 0
   
    return shade_x, shade_region

def get_sig_lvl(significance_level, z_c): 
    
    fig = pl.figure(x_range=[-5,5], 
                    plot_height=400,
                    tools = tools_to_show,
                    title="Enter significance level (in %) - Figure 4",
                    x_axis_label= "z-values",
                    y_axis_label ="Count")
    
    fig.line(x=xrange, y= pdf, line_width = 4)

    fig.xgrid.grid_line_color = None
    fig.y_range.start = 0
    
    hover = fig.select(dict(type=HoverTool))
    hover.tooltips = [("xvalue", "@x"), ("yvalue", "@y")]
   
    # rejection region in the left tail for the chosen significance level
    z_value = norm.ppf((significance_level)/100)
    right_shade, right_region= shade_reject_region(-5, z_value)
    fig.patch(right_shade, right_region, color="red", alpha =0.4)
    
    # calculate pvalue 
    p_value = norm.cdf(z_c)
    left_shade, left_region= shade_reject_region(-5, z_c,)
    fig.patch(left_shade, left_region, color="blue", alpha =0.4)
    
    fig.text(x=-0.8, y=0.2, text=["Acceptance \nregion"], )
    
#     # right reject region title
    
    reject_text_x = -4.15
    reject_text_y = 0.15
    fig.text(x =reject_text_x ,y = reject_text_y, text=["Significance level "])
    arrow_y_end = right_region[int(right_region.size/2)]+0.05
    arrow_x_end = right_shade[int(right_shade.size*0.99)]
    
    
    fig.add_layout(Arrow(end=NormalHead(fill_color="black"),
                   x_start=reject_text_x+1.75,
                   y_start=reject_text_y-0.001,
                   x_end=arrow_x_end,
                   y_end=arrow_y_end+0.01))

    # left reject region title 
    
    reject_text_x = -4.15
    reject_text_y = 0.1
    fig.text(x =reject_text_x ,y = reject_text_y, text=["Area = P-value"])
    arrow_y_end = left_region[int(left_region.size/2)]
    arrow_x_end = left_shade[int(left_shade.size*0.8)]
    
    
    fig.add_layout(Arrow(end=NormalHead(fill_color="black"),
                   x_start=reject_text_x+0.5,
                   y_start=reject_text_y-0.001,
                   x_end=arrow_x_end,
                   y_end=arrow_y_end+0.01))

    
    show(fig)

    print("z value for the given significance level: {} ".format(z_value))
    print("p value  :  {}".format(p_value))
    return None 




# FloatText has no min/max traits (BoundedFloatText does), so only value and step are set
interact(get_sig_lvl, 
                 significance_level = widgets.FloatText(value = 5, 
                                                        step =0.001), 
                   z_c = widgets.FloatText(value = -2, 
                                                        step =0.001)
                );


interactive(children=(FloatText(value=5.0, description='significance_level', step=0.001), FloatText(value=-2.0…

### Solution code

```python
# Just run above code
```

In the above illustration, area in the blue region is known as the p value.
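The p-value rule for all three kinds of test can be sketched in a few lines. The helper name `p_value` and the `tail` labels are made up for this illustration; `norm.sf` is SciPy's survival function, equal to `1 - norm.cdf`.

```python
from scipy.stats import norm

def p_value(z_c, tail):
    """Hypothetical helper: p-value of a z statistic; tail is "left", "right" or "two"."""
    if tail == "left":
        return norm.cdf(z_c)          # area to the left of z_c
    if tail == "right":
        return norm.sf(z_c)           # area to the right of z_c
    if tail == "two":
        return 2 * norm.sf(abs(z_c))  # both extremes, hence the doubling
    raise ValueError("tail must be 'left', 'right' or 'two'")

alpha = 0.05
# The three z_c values worked out earlier in this notebook
for z_c, tail in [(-1.58, "left"), (3.16, "right"), (-1.05, "two")]:
    p = p_value(z_c, tail)
    decision = "reject" if p < alpha else "fail to reject"
    print(tail, "tailed: p-value =", round(p, 4), "->", decision, "the null hypothesis")
```

The decisions match the ones reached earlier by comparing $z_c$ with $z_\alpha$, as they should, since both approaches describe the same rejection region.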

So with this we close this notebook. In this notebook we have looked at a basic overview of hypothesis testing. In the next part we will look at another type of hypothesis testing called the chi square test. 



In [26]:
# No exercise

### Solution code

```python
# Just run above code
```