### A Python program that calculates the following statistics for a z-distribution or t-distribution:
- 1. Interval Estimate
- 2. Point Estimate
- 3. Hypothesis Testing
- 4. Hypothesis testing using samples from a csv file
-     Here, hypothesis testing is carried out at a 95% confidence level ($\alpha$ = 0.05).
##### Note: exception handling is used
 

In [1]:
alpha_dict={'alpha':0.05, 'alpha/2':0.025} #alpha value for 95% confidence level 
z_dist={0.025:(-1.96, 1.96), 0.05:(-1.645, 1.645)} # Z-critical values of two and one tailed test, resp.

# t-critical values keyed by degrees of freedom (sample size - 1): 0.025 for two-tailed and 0.05 for one-tailed tests.
# Degrees of freedom 14 is included because a sample size of 15 looks up dof = 14.
t_dist = { 0.025:{14:2.145, 15:2.131, 16:2.120, 17:2.110, 18:2.101, 19:2.093, 20:2.086, 21:2.080, 22:2.074, 23:2.069, 24:2.064, 25:2.060, 26:2.056, 27:2.052, 28:2.048, 29:2.045, 30:2.042}, 
           0.05:{14:1.761, 15:1.753, 16:1.746, 17:1.740 , 18:1.734, 19:1.729, 20:1.725, 21:1.721, 22:1.717, 23:1.714, 24:1.711, 25:1.708, 26:1.706, 27:1.703, 28:1.701, 29:1.699, 30:1.697} }


# user defined function  
def stats_calc(population_mean, sample_mean=0, sample_size=0, std_deviation=0, alpha=0.05, flag=0, flag_tail=0, csv_file=None):
    'Calculate the point estimate, interval estimate, or hypothesis test for a z-distribution or t-distribution.'

    if csv_file is not None:
        return stats_from_csv(population_mean, csv_file, flag_tail, std_deviation)
    else:
        while flag != 5:   # flag is an int, so compare against the int exit code
            print("Enter 1 for Point estimate")
            print("Enter 2 for Interval estimate")
            print("Enter 3 for hypothesis testing")
            print("Enter 4 to do all three above steps")
            print("Enter 5 to exit")
            try:
                flag = int(input("Enter your selection: "))
                if (flag < 1) or (flag > 5):
                    raise ValueError(flag)
            except ValueError:
                # Also reached when the input is not a whole number
                print("Invalid selection! Enter a whole number from 1 to 5.")
                break
             
            while flag_tail != 9:   # flag_tail is an int, so compare against the int exit code
                print("Enter 6 for two-tailed test")
                print("Enter 7 for right-tailed test")
                print("Enter 8 for left-tailed test")
                print("Enter 9 to exit")
                try:
                    flag_tail = int(input("Enter your selection: "))
                    if (flag_tail < 6) or (flag_tail > 9):
                        raise ValueError(flag_tail)
                except ValueError:
                    # Also reached when the input is not a whole number
                    print("Invalid selection! Enter a whole number from 6 to 9.")
                    break
                
                
                try:
                    # Handle the exit choices first so the branches below cannot shadow them
                    if flag == 5 or flag_tail == 9:
                        return 'Thank You!'

                    elif flag == 1:
                        return 'The Point estimate is {}.'.format(sample_mean)  

                    elif flag == 2:
                        return interval_estimate(sample_mean, sample_size)

                    elif flag == 3:
                        return hypothesis_testing(population_mean, sample_mean, sample_size, flag_tail)

                    elif flag == 4:
                        return 'The Point estimate is {}.'.format(sample_mean), interval_estimate(sample_mean, sample_size), hypothesis_testing(population_mean, sample_mean, sample_size, flag_tail)

                    else:
                        print("Incorrect selection! Please try again!")
                
                except Exception as e:
                    print(e)
                    break

    

# user defined function to calculate the interval estimate of z and t distribution
def interval_estimate(sample_mean, sample_size): 
    import math
    # std_deviation is read from the module-level variable set by the user input below
    try:
        x = std_deviation/math.sqrt(sample_size)   # standard error of the mean
    except Exception as e:
        return "Error: {}".format(e)
    dof = sample_size-1    # dof - degrees of freedom
    if (sample_size>=15) and (sample_size<=30):
        t_dist_interval = ( (sample_mean - (t_dist.get(0.025)[dof]*x)), (sample_mean + (t_dist.get(0.025)[dof]*x)) )
        return "The interval estimate of the T-distribution is {}".format(t_dist_interval)
    elif (sample_size > 30) and (sample_size <= 1000):
        z_dist_interval = ( (sample_mean - (1.96*x)), (sample_mean + (1.96*x)) )
        return "The interval estimate of the Z-distribution is {}".format(z_dist_interval)
    else:
        return "Incorrect sample size entered!"
        
        
# user defined function for hypothesis testing of z and t distribution        
def hypothesis_testing(population_mean, sample_mean, sample_size, flag_tail):
    import math
    try:
        # The test statistic has the same form for both distributions;
        # only the critical value it is compared against differs.
        t = z = (sample_mean-population_mean)/(std_deviation/math.sqrt(sample_size))
    except Exception as e:
        return "Error: {}\nPlease try again!".format(e)
    
    result_z={0:"Hypothesis Test: This is a Z-distribution and Reject the Null Hypothesis!", 
              1:"Hypothesis Test: This is a Z-distribution and Retain the Null Hypothesis!"}
    
    result_t={0:"Hypothesis Test: This is a T-distribution and Reject the Null Hypothesis!", 
              1:"Hypothesis Test: This is a T-distribution and Retain the Null Hypothesis!"}
    
    if (sample_size>=15) and (sample_size<=30):
        dof= sample_size-1    # dof- degrees of freedom                                                                                                                       
        if flag_tail==6:  #two-tailed test
            if abs(t) >= abs(t_dist.get(0.025)[dof]):
                return result_t.get(0)
            else:
                return result_t.get(1)

        elif flag_tail==7: #right tailed test
            if t >= t_dist.get(0.05)[dof]:
                return result_t.get(0)
            else:
                return result_t.get(1)

        elif flag_tail==8: #left tailed test
            if t <= (-t_dist.get(0.05)[dof]):
                return result_t.get(0)
            else:
                return result_t.get(1)
        else:
            return "Incorrect input to select one or two tailed test!"

    
    elif ((sample_size > 30) and (sample_size <= 1000)):

        if flag_tail==6: #two-tailed test
            if abs(z) >= abs(z_dist.get(0.025)[1]):
                return result_z.get(0)
            else:
                return result_z.get(1)

        elif flag_tail==7: #right-tailed test
            if z >= z_dist.get(0.05)[1]:
                return result_z.get(0)
            else:
                return result_z.get(1)

        elif flag_tail==8: #left-tailed test
            if z <= z_dist.get(0.05)[0]:
                return result_z.get(0)
            else:
                return result_z.get(1)
        else:
            return "Incorrect input to select one or two tailed test!"

    else:
        return "Incorrect sample size entered!"
    
    
# user defined function to read the csv file and do hypothesis test
def stats_from_csv(population_mean, csv_file, flag_tail, std_deviation=0):
    "Run a single-sample hypothesis test on the data in a csv file."
    import pandas as pd
    import math
    try:
        data1 = pd.read_csv(csv_file)
    except FileNotFoundError as e:
        print("Error!", e)
        return   # nothing to test without the data
    sample_mean = data1.mean()
    sample_std = data1.std()
    n,m = data1.shape # n is the number of rows (sample size) and m the number of columns
    
    z_result={0:"Hypothesis Test \nThis is a Z-distribution and Reject the Null Hypothesis!", 
              1:"Hypothesis Test \nThis is a Z-distribution and Retain the Null Hypothesis!"}
    
    t_result={0:"Hypothesis Test \nThis is a T-distribution and Reject the Null Hypothesis!", 
              1:"Hypothesis Test \nThis is a T-distribution and Retain the Null Hypothesis!"}
    
    if ((n>=15) and (n<=30)):
        try:
            t_stat=(sample_mean-population_mean)/(sample_std/(math.sqrt(n)))
        except Exception as e:
            print(e)
            return
        # .item() extracts the single boolean, so the csv file must hold exactly one numeric column
        if flag_tail==6: #two tailed test
            if (abs(t_stat) >= abs(t_dist.get(0.025)[n-1])).item():
                print(t_result.get(0))
            else:
                print(t_result.get(1))
        elif flag_tail==7:  #right tailed test
            if (t_stat >= t_dist.get(0.05)[n-1]).item():
                print(t_result.get(0))
            else:
                print(t_result.get(1))
        elif flag_tail==8:  #left tailed test
            if (t_stat <= (-t_dist.get(0.05)[n-1])).item():
                print(t_result.get(0))
            else:
                print(t_result.get(1))

        else:
            print("Incorrect input to select one or two tailed test!")
            
    elif ((n > 30) and (n <= 1000)):
        # Use the supplied population standard deviation; if none was given (0), fall back to the sample value
        sigma = std_deviation if std_deviation > 0 else sample_std
        z_stat=(sample_mean-population_mean)/(sigma/(math.sqrt(n)))
        # .item() extracts the single boolean, so the csv file must hold exactly one numeric column
        if flag_tail==6: #two tailed test
            if (abs(z_stat) >= abs(z_dist.get(0.025)[1])).item():
                print(z_result.get(0))
            else:
                print(z_result.get(1))
        elif flag_tail==7:  #right tailed test
            if (z_stat >= z_dist.get(0.05)[1]).item():
                print(z_result.get(0))
            else:
                print(z_result.get(1))
        elif flag_tail==8:  #left tailed test
            if (z_stat <= z_dist.get(0.05)[0]).item():
                print(z_result.get(0))
            else:
                print(z_result.get(1))
        else:
            print("Incorrect input to select one or two tailed test!")
    
    else:
        print("Sample size is out of limit!")
        

# user input arguments
try:
    population_mean = float(input('Enter the population mean: '))
    sample_mean = float(input('Enter the sample mean: '))
    
    try:
        sample_size = int(input('Enter a sample size ranging from 15 to 1000: '))
        if (sample_size < 15) or (sample_size > 1000):
            raise ValueError(sample_size)
    except ValueError:
        print(sample_size,"is out of allowed range!")
        
        
    std_deviation = float(input('Enter the standard deviation: '))
    flag_tail = int(input("If csv file chosen, enter 6 for two tailed test, 7 for right tailed test, 8 for left tailed test. Else, enter 0: "))
    # Read the path as plain text instead of eval(), which would execute whatever is typed
    csv_path = input("If csv file chosen, enter the file path. Else, enter <None>: ").strip().strip("'\"")
    csv_file = None if csv_path in ('', 'None', '<None>') else csv_path
    
except Exception as e:
    print("Error occurred!\n{}".format(e))
    print("Exit and try again")
    
alpha=0.05
flag=0

#function call
stats_calc(population_mean, sample_mean, sample_size, std_deviation, alpha, flag, flag_tail, csv_file)


Enter the population mean: 500
Enter the sample mean: 0
Enter a sample size ranging from 15 to 1000: 0
0 is out of allowed range!
Enter the standard deviation: 0
If csv file chosen, enter 6 for two tailed test, 7 for right tailed test, 8 for left tailed test. Else, enter 0: 7
If csv file chosen, enter the file path. Else, enter <None>: 'production_cost.csv'
Hypothesis Test 
This is a T-distribution and Retain the Null Hypothesis!


- The output above is the result of a hypothesis test on samples read from a csv file.
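
For a csv run, the program derives the sample size, sample mean, and sample standard deviation from the file with pandas (`DataFrame.mean()` and `DataFrame.std()`, which uses the sample formula with n - 1 in the denominator). The same quantities can be sketched with only the standard library. The helper name `sample_stats` and the five values below are made up for illustration; they are not the contents of 'production_cost.csv'.

```python
import csv
import io
import math
import statistics

# Hypothetical stand-in for a one-column csv file; the values are made up.
csv_text = "cost\n510\n495\n530\n505\n520\n"

def sample_stats(handle):
    """Return (n, mean, sample standard deviation) of the first csv column."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    values = [float(row[0]) for row in reader if row]
    # statistics.stdev divides by n - 1, matching the pandas default
    return len(values), statistics.mean(values), statistics.stdev(values)

n, mean, std = sample_stats(io.StringIO(csv_text))
std_error = std / math.sqrt(n)  # denominator of the t or z statistic
print(n, mean, round(std_error, 3))
```

Passing an open file object instead of `io.StringIO` would apply the same calculation to a real file.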

### Note 1: The csv file 'production_cost.csv' has 30 samples of movie production cost.

The industry believes that a new movie production house will require at least INR 500 million on average (i.e., the population mean $\mu$ is 500). The Bollywood movie production cost is assumed to follow a normal distribution. The production costs of 30 sample movies, in millions of rupees, are given in the csv file. A hypothesis test at $\alpha$ = 0.05 is conducted to check whether this average production cost is correct.

- Null Hypothesis: $\mu$ $\leq$ 500
  (The average production cost is at most 500 million)

- Alternative Hypothesis: $\mu$ > 500
  (The average production cost is more than 500 million)
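
With these hypotheses the test is right-tailed: the statistic $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$ is compared with the one-tailed critical value, and the null hypothesis is rejected only when $t$ is at least that value. A minimal sketch of this decision rule follows. The function name `right_tailed_t_test` and the summary statistics are hypothetical, not taken from the csv file; 1.699 is the `t_dist` entry for $\alpha$ = 0.05 and 29 degrees of freedom.

```python
import math

def right_tailed_t_test(sample_mean, sample_std, n, mu0, t_critical):
    """Return the t statistic and whether H0: mu <= mu0 is rejected."""
    t_stat = (sample_mean - mu0) / (sample_std / math.sqrt(n))
    return t_stat, t_stat >= t_critical

# Hypothetical summary statistics for n = 30 samples (29 degrees of freedom)
t_stat, reject = right_tailed_t_test(
    sample_mean=520.0, sample_std=80.0, n=30, mu0=500.0, t_critical=1.699
)
print(round(t_stat, 3), "Reject H0" if reject else "Retain H0")
```

A sample mean above 500 is therefore not enough on its own to reject the null hypothesis; the difference also has to be large relative to the standard error.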

### Note 2: 
- Sample Size(n)
    - t-dist: 15 to 30
    - z-dist: 31 to 1000. 
  
  These sample sizes (n) are chosen based on the following criteria:
  - In most cases, n = 30 is used.
  - If the population is nearly symmetric, n = 15 can be considered.
  - If the population distribution is highly skewed or has outliers, n = 50 (or more) is used.
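
The sample-size ranges above amount to a simple selection rule, which the program repeats inline in each function. As a sketch, the rule could be isolated in one helper; `choose_distribution` is a hypothetical name and is not part of the program above.

```python
def choose_distribution(n):
    """Return 't' or 'z' for a sample size n, following the ranges in Note 2."""
    if 15 <= n <= 30:
        return "t"
    if 30 < n <= 1000:
        return "z"
    raise ValueError("sample size must be between 15 and 1000, got {}".format(n))

for size in (15, 30, 31, 1000):
    print(size, choose_distribution(size))
```

Raising an error for out-of-range sizes, rather than printing a message, lets the caller decide how to report the problem.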


                                       ***********************************