# Inference (Ch 5)

You will be performing the required computations directly as presented in the videos. 

The following constants, objects, and methods are new in this chapter.

New constant in numpy: nan

New methods in numpy: sqrt

New objects in scipy.stats: f, t, chi2

New methods in pandas: dropna and notna

## New Constant in numpy

numpy includes several constant values, of which nan ("not a number") is used to represent a numeric value that is not available.

### nan
Used to designate missing numbers in a dataset

In [54]:
import numpy as np
print(3,np.nan,5)

3 nan 5
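Because nan behaves differently from ordinary numbers, a short sketch of how it is typically detected and skipped may help. The array below is hypothetical illustration data, and `np.isnan` and `np.nanmean` are numpy functions that are not otherwise introduced in this chapter.

```python
import numpy as np

values = np.array([3.0, np.nan, 5.0])  # hypothetical data with one missing entry

# nan never compares equal to anything, including itself,
# so np.isnan is used to detect it instead of ==
missing_mask = np.isnan(values)

# Ordinary aggregates propagate nan; the nan-aware variants skip missing entries
plain_mean = np.mean(values)
nan_aware_mean = np.nanmean(values)

print(missing_mask)
print(plain_mean, nan_aware_mean)
```

The plain mean comes out as nan, which is why missing values usually have to be removed or skipped before computing statistics.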


## New methods in numpy

### sqrt
Calculates the square root of a number

In [55]:
np.sqrt(10)

3.1622776601683795
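In inference, sqrt typically shows up when computing a standard error. The following is a minimal sketch under that assumption; the sample values are hypothetical, and the standard error formula (sample standard deviation divided by the square root of n) is the usual textbook one rather than something taken from this notebook.

```python
import numpy as np

sample = np.array([4.0, 9.0, 16.0, 25.0])  # hypothetical values

# np.sqrt also works element-wise on arrays
roots = np.sqrt(sample)

# Standard error of the mean: sample standard deviation (ddof=1) over sqrt(n)
n = sample.size
std_error = np.std(sample, ddof=1) / np.sqrt(n)

print(roots)
print(std_error)
```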

## New continuous distribution objects in scipy.stats: t, f, chi2

### t

The t distribution with parameter df (degrees of freedom). It is widely used in place of the normal distribution when the sample size is small.

The following is a sample application of the cdf method (cumulative probability Pr( x <= a )) and the ppf method (percent point function, the inverse of cdf).

In [56]:
import scipy.stats as stat
print('Pr( 10 <= x <= 15 | df=5) =',stat.t.cdf(15, 5) - stat.t.cdf(10, 5))

Pr( 10 <= x <= 15 | df=5) = 7.355159421096324e-05


In [57]:
print('The value of a so that Pr( x <= a | df=5 ) = 0.997 is ',stat.t.ppf( 0.997, 5))

The value of a so that Pr( x <= a | df=5 ) = 0.997 is  4.570347443402741
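As an illustration of how the ppf method is typically used in inference, here is a hedged sketch of a two-sided confidence interval for a mean based on a small sample. The measurements and the 95% confidence level are assumptions chosen for the example; they do not come from the chapter.

```python
import numpy as np
import scipy.stats as stat

sample = np.array([9.8, 10.2, 10.4, 9.9, 10.1, 10.3])  # hypothetical measurements
n = sample.size
mean = sample.mean()
std_error = sample.std(ddof=1) / np.sqrt(n)

confidence = 0.95  # assumed confidence level
# Two-sided critical value with n - 1 degrees of freedom
t_crit = stat.t.ppf(1 - (1 - confidence) / 2, n - 1)

lower = mean - t_crit * std_error
upper = mean + t_crit * std_error
print('95% confidence interval for the mean:', lower, upper)
```

The same pattern applies to other confidence levels by changing `confidence`.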


### f

The F distribution with parameters df1 and df2 (numerator and denominator degrees of freedom), used, for example, to test whether the variances of two samples are the same.

Following is a sample application of the cdf and ppf methods. 

In [58]:
print('Pr( 10 <= x <= 15 | df1=5, df2=8) =',stat.f.cdf(15, 5, 8) - stat.f.cdf(10, 5, 8))

Pr( 10 <= x <= 15 | df1=5, df2=8) = 0.002050753112142867


In [59]:
print('The value of a so that Pr( x <= a | df1=5, df2=8) = 0.997 is ',stat.f.ppf( 0.997, 5, 8))

The value of a so that Pr( x <= a | df1=5, df2=8) = 0.997 is  9.72880064885152
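To connect the F distribution to the variance comparison mentioned above, the following is a minimal sketch of a two-sided F-test for equal variances. Both samples are hypothetical, and doubling the smaller tail probability is one common convention for the two-sided p-value, not necessarily the one used in the videos.

```python
import numpy as np
import scipy.stats as stat

# Hypothetical samples from two groups
a = np.array([20.1, 19.8, 20.5, 20.3, 19.6, 20.0])
b = np.array([20.2, 18.9, 21.4, 19.5, 20.8, 19.1, 21.0])

var_a = a.var(ddof=1)  # sample variances
var_b = b.var(ddof=1)

f_stat = var_a / var_b
df1, df2 = a.size - 1, b.size - 1

# Two-sided p-value: twice the smaller of the two tail probabilities
cdf_value = stat.f.cdf(f_stat, df1, df2)
p_value = 2 * min(cdf_value, 1 - cdf_value)

print('F =', f_stat, 'p-value =', p_value)
```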


### chi2

The chi-square distribution with parameter df (degrees of freedom), used, for example, in tests on the variance of a single sample.

Following is a sample application of the cdf and ppf methods. 

In [60]:
print('Pr( 10 <= x <= 15 | df=5) =',stat.chi2.cdf(15, 5) - stat.chi2.cdf(10, 5))

Pr( 10 <= x <= 15 | df=5) = 0.06487290823072578


In [61]:
print('The value of a so that Pr( x <= a | df=5) = 0.997 is ',stat.chi2.ppf( 0.997, 5))

The value of a so that Pr( x <= a | df=5) = 0.997 is  17.95761226739146
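As a sketch of how chi2 relates to a single-sample variance, the code below builds a confidence interval for a population variance using the standard result that (n - 1)s²/σ² follows a chi-square distribution with n - 1 degrees of freedom when the data are normal. The sample and the 95% level are assumptions made for illustration.

```python
import numpy as np
import scipy.stats as stat

sample = np.array([9.8, 10.2, 10.4, 9.9, 10.1, 10.3])  # hypothetical measurements
n = sample.size
s2 = sample.var(ddof=1)  # sample variance

alpha = 0.05  # assumed significance level (95% confidence)
chi2_upper = stat.chi2.ppf(1 - alpha / 2, n - 1)
chi2_lower = stat.chi2.ppf(alpha / 2, n - 1)

# Dividing by the larger quantile gives the lower bound, and vice versa
var_low = (n - 1) * s2 / chi2_upper
var_high = (n - 1) * s2 / chi2_lower
print('95% confidence interval for the variance:', var_low, var_high)
```

Note that the interval is not symmetric around s2 because the chi-square distribution is skewed.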


## New methods in pandas: dropna and notna

### dropna
Removes rows (or columns) that contain missing values from a DataFrame. By default, any row with at least one missing value is dropped.

In [62]:
import pandas as pd
df = pd.DataFrame({'age': [5, np.nan, 6 ]})
print(df)
print(df.dropna())

   age
0  5.0
1  NaN
2  6.0
   age
0  5.0
2  6.0
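With more than one column, it often matters which missing values cause a row to be dropped. The following sketch uses a hypothetical three-column DataFrame to show the default behavior alongside the `subset` and `axis` parameters of dropna.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'age' and 'height' each have one missing value
people = pd.DataFrame({
    'id': [1, 2, 3],
    'age': [5, np.nan, 6],
    'height': [110, 120, np.nan],
})

complete_rows = people.dropna()                  # drop rows with any missing value
rows_with_age = people.dropna(subset=['age'])    # only consider the 'age' column
complete_cols = people.dropna(axis='columns')    # drop columns with any missing value

print(complete_rows)
print(rows_with_age)
print(complete_cols)
```

dropna returns a new DataFrame and leaves the original unchanged, so the result has to be assigned to a name to be kept.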


### notna
Returns a boolean DataFrame of the same shape, with True where a value is present and False where it is missing (NaN).

In [63]:
print(df.notna())

     age
0   True
1  False
2   True
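The boolean result of notna is mostly useful as a mask. A minimal sketch, reusing the same small example data, counts the available values and keeps only the rows where age is present.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'age': [5, np.nan, 6]})

mask = df['age'].notna()      # boolean Series: True where age is present
n_valid = int(mask.sum())     # True counts as 1, so the sum is the number of valid values
valid_rows = df[mask]         # boolean indexing keeps only the True rows

print(n_valid)
print(valid_rows)
```

For a single column, filtering with this mask gives the same rows as dropna on that column.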
