#**LAB 1: Measures of Center**

The `mtcars` dataset contains data from the 1974 *Motor Trend* magazine and includes 10 features of performance and design from a sample of 32 cars.

- Import the CSV file `mtcars.csv` as a data frame using a function from the `pandas` module.

- Find the mean, median, and mode of the column `wt`.

- Print the mean and median.

Ex: for the column `qsec`, the output would be:
```
mean = 17.84875, median = 17.710
```

In [3]:
import pandas as pd

# Read in the file mtcars.csv
cars = pd.read_csv("mtcars.csv")

# Find the mean of the column wt
mean = cars["wt"].mean()

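# Find the median of the column wt
median = cars["wt"].median()

# Find the mode of the column wt (mode() returns a Series because ties are possible)
mode = cars["wt"].mode()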

# Print mean and median in the form of mean = ..., median = ...
print(f"mean = {mean:.5f}, median = {median:.3f}")

mean = 3.21725, median = 3.325
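
As a cross-check on what the three measures mean, the following is a minimal sketch that uses Python's standard-library `statistics` module instead of `pandas`. The `weights` list is hypothetical and made up for illustration; it is not taken from `mtcars.csv`. `statistics.multimode` returns every most-frequent value, much as the `pandas` `mode()` method returns a Series rather than a single number.

```python
import statistics

# Hypothetical weights, made up for illustration (not from mtcars.csv)
weights = [2.5, 3.0, 3.5, 3.5, 4.0]

mean = statistics.mean(weights)        # arithmetic average
median = statistics.median(weights)    # middle value of the sorted data
modes = statistics.multimode(weights)  # list of all most-frequent values

# Same output format as the lab, plus the mode(s)
print(f"mean = {mean:.5f}, median = {median:.3f}")
print(f"mode(s) = {modes}")
```

Because ties are possible, the mode is reported as a list here; a dataset in which every value is unique would list every value as a mode.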


#**LAB 2: Calculating probabilities using a normal distribution**

The intelligence quotient (IQ) of a randomly selected person follows a normal distribution with a mean of 100 and a standard deviation of 15. Use the `norm` distribution object from `scipy.stats` and user-input values for `IQ1` and `IQ2` to perform the following tasks:

- Calculate the probability that a randomly selected person will have an IQ less than or equal to `IQ1`.
- Calculate the probability that a randomly selected person will have an IQ between `IQ1` and `IQ2`.

For example, if the input is:
```
105
110
```

the output is:
```
The probability that a randomly selected person
 has an IQ less than or equal to 105.0 is 0.631.
The probability that a randomly selected person
 has an IQ between 105.0 and 110.0 is 0.117.
```

In [4]:
# Import norm from scipy.stats
from scipy.stats import norm

# Input two IQs, making sure that IQ1 is less than IQ2
IQ1 = float(input())
IQ2 = float(input())

while IQ1 > IQ2:
    print("IQ1 should be less than IQ2. Enter numbers again.")
    IQ1 = float(input())
    IQ2 = float(input())

mean = 100
std_dev = 15

# Calculate the probability that a randomly selected person has an IQ less than or equal to IQ1.
probLT = norm.cdf(IQ1, mean, std_dev)

# Calculate the probability that a randomly selected person has an IQ between IQ1 and IQ2
probBetw = norm.cdf(IQ2, mean, std_dev) - probLT

# Print both results rounded to three decimals, matching the expected output format
print(f"The probability that a randomly selected person\n has an IQ less than or equal to {IQ1} is {probLT:.3f}.")
print(f"The probability that a randomly selected person\n has an IQ between {IQ1} and {IQ2} is {probBetw:.3f}.")

105
110
The probability that a randomly selected person 
 has an IQ less than or equal to 105.0 is 0.631.
The probability that a randomly selected person 
 has an IQ between 105.0 and 110.0 is 0.117.
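
The two probabilities can also be checked without `scipy`. The sketch below swaps in `statistics.NormalDist` from the Python standard library (3.8+), which is not what the lab asks for, and hard-codes the sample inputs 105 and 110 instead of reading them from the user. The names `iq`, `prob_le`, and `prob_between` are illustrative.

```python
from statistics import NormalDist

# IQ model from the lab: normal with mean 100 and standard deviation 15
iq = NormalDist(mu=100, sigma=15)

# Sample inputs from the lab example (hard-coded here rather than read via input())
iq1, iq2 = 105.0, 110.0

# P(X <= iq1) is the cumulative distribution function evaluated at iq1
prob_le = iq.cdf(iq1)

# P(iq1 < X <= iq2) is the difference of two CDF values
prob_between = iq.cdf(iq2) - iq.cdf(iq1)

print(f"P(X <= {iq1}) = {prob_le:.3f}")
print(f"P({iq1} < X <= {iq2}) = {prob_between:.3f}")
```

The between-probability is a difference of CDF values because the CDF accumulates all probability to the left of its argument.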


#**LAB 3: One-sample hypothesis test for population proportion**

The `gpa` dataset is a toy dataset containing the features `height` and `gpa` for 35 students. Use the `statsmodels` function `proportions_ztest` and the user-defined values for the null-hypothesis proportion `value` and the GPA cutoff `cutoff` to perform the following tasks:

- Load the `gpa.csv` dataset.
- Find the number of students with a GPA greater than `cutoff`.
- Find the total number of students.
- Perform a z-test for the user-input expected proportion.
- Determine whether the null hypothesis that the actual proportion equals the expected proportion should be rejected at the α = 0.01 significance level.

Ex: When the input is:
```
0.5
2.0
```
the output is:
```
(4.902, 0.000)
The two-tailed p-value, 0.000, is less than α. Thus, sufficient evidence exists to support the hypothesis that the proportion is different from 0.5
```

In [10]:
import pandas as pd
from statsmodels.stats.proportion import proportions_ztest

# Read in gpa.csv
gpa = pd.read_csv("gpa.csv")

# Get the value of the proportion for the null hypothesis
value = float(input())
# Get the gpa cutoff
cutoff = float(input())

# Determine the number of students with a gpa higher than cutoff
counts = (gpa["gpa"] > cutoff).sum()

# Determine the total number of students
nobs = len(gpa)

# Perform a two-sided z-test for counts, nobs, and value.
# prop_var=value bases the standard error on the null-hypothesis proportion.
alpha = 0.01
z_stat, p_value = proportions_ztest(counts, nobs, value, alternative="two-sided", prop_var=value)
print(f"({z_stat:.3f}, {p_value:.3f})")

# Compare the p-value with the significance level and report the conclusion
if p_value < alpha:
    print(f"The two-tailed p-value, {p_value:.3f}, is less than \u03B1. Thus, sufficient evidence exists to support the hypothesis that the proportion is different from {value}")
else:
    print(f"The two-tailed p-value, {p_value:.3f}, is greater than \u03B1. Thus, insufficient evidence exists to support the hypothesis that the proportion is different from {value}")

0.5
2.0
(4.902, 0.000)
The two-tailed p-value, 0.000, is less than α. Thus, sufficient evidence exists to support the hypothesis that the proportion is different from 0.5
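
To make the test statistic less of a black box, the following sketch computes a one-sample proportion z-test by hand with the standard library only. It is not the `statsmodels` implementation; it assumes the standard error is based on the null-hypothesis proportion (the behavior requested by `prop_var=value`), and the counts `32` out of `35` are hypothetical values chosen for illustration, not read from `gpa.csv`.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical inputs for illustration (not read from gpa.csv)
count, nobs = 32, 35   # students above the cutoff, total students
value = 0.5            # proportion under the null hypothesis
alpha = 0.01           # significance level

# Sample proportion
p_hat = count / nobs

# Standard error computed from the null-hypothesis proportion
std_err = sqrt(value * (1 - value) / nobs)

# z statistic: distance of p_hat from value, in standard errors
z_stat = (p_hat - value) / std_err

# Two-tailed p-value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

# Reject the null hypothesis when the p-value falls below alpha
reject_null = p_value < alpha

print(f"({z_stat:.3f}, {p_value:.3f})")
print("reject null hypothesis:", reject_null)
```

A small p-value leads to rejecting the null hypothesis that the proportion equals `value`, which is the same as finding evidence that the proportion differs from it.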
