# Pystats Tutorial

Welcome to the pystats package documentation! This package offers a suite of functions to simulate, analyze, and interpret data that follow a normal distribution. It provides functionality similar to R's `rnorm`, `pnorm`, `dnorm`, and `qnorm` functions.

We will illustrate the four functions in pystats with a practical example featuring a high school teacher named Mr. Gittu. Mr. Gittu would like to streamline the process of predicting and evaluating test scores, and our package can help him with this!

## Gittu's Motivation
Mr. Gittu is an experienced computer science teacher at Hogwarts School of Data Science. He will be administering a standardized test and he knows that the test scores for the class he teaches usually follow a normal distribution with:
- A class average of 70% 
- A standard deviation of 10%

He wants to use the pystats package to streamline the process of predicting and evaluating test scores. 

### 1) Simulating a Test Score Distribution with rnorm
Before administering the test, Mr. Gittu would like to simulate 40 scores so he can understand what results to expect. Using `rnorm`, he generates a sample of 40 test scores from a normal distribution with a mean of 70 and a standard deviation of 10.

In [None]:
from pystats.rnorm import rnorm
# Simulate 40 test scores with mean=70 and sd=10
simulated_scores = rnorm(n=40, mean=70, sd=10)

print(simulated_scores)

The `rnorm` function gave Mr. Gittu a simulated dataset, which shows him the range of scores he can plausibly expect on the next test he gives his class of 40 students.
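To make the simulation step concrete, here is a minimal sketch of the same idea using only Python's standard library. It is an independent illustration, not the pystats implementation: the seed value and the names `rng`, `sketch_scores`, `sample_mean`, and `sample_sd` are assumptions introduced for this example.

```python
import random
import statistics

# Assumption: a fixed, arbitrarily chosen seed so the sketch is repeatable.
rng = random.Random(524)

# Draw 40 scores from a normal distribution with mean 70 and sd 10.
sketch_scores = [rng.gauss(70, 10) for _ in range(40)]

# Summarize the sample; with only 40 draws these will be close to,
# but not exactly, 70 and 10.
sample_mean = statistics.mean(sketch_scores)
sample_sd = statistics.stdev(sketch_scores)
print(round(sample_mean, 1), round(sample_sd, 1))
```

Comparing the sample mean and standard deviation with the values Mr. Gittu specified is a quick sanity check that a simulated dataset behaves as intended.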

### 2) Getting Fail Percentages with pnorm

Mr. Gittu wants all of his students to pass. Unfortunately, there is always a risk of students failing his tests, even when he does his best teaching. He wants to assess the risk that a randomly selected student fails (i.e., scores below 50%), assuming scores follow the normal distribution defined above. Using `pnorm`, he can easily get the expected proportion of students scoring less than 50% on his test.

In [None]:
from pystats.pnorm import pnorm
failure_rate = pnorm(q=50, mean=70, sd=10)
print(round(failure_rate, 3))

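Mr. Gittu also wants the best for his students, so he wants to see the expected proportion of students earning an A on his test, that is, scoring over 80%. Passing `lower_tail=False` asks `pnorm` for the upper-tail proportion instead of the lower-tail one.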

In [None]:
A_rate = pnorm(q=80, mean=70, sd=10, lower_tail=False)
print(round(A_rate, 3))

Now Mr. Gittu knows what share of his students to expect at both the failing and the top end of the class.
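As a cross-check on the two proportions above, the same quantities can be computed with `statistics.NormalDist` from Python's standard library. This is a sketch for comparison only and says nothing about how pystats computes its results; the names `scores`, `fail_check`, and `a_check` are assumptions introduced here.

```python
from statistics import NormalDist

# Test scores modeled as Normal(mean=70, sd=10), as in the tutorial.
scores = NormalDist(mu=70, sigma=10)

# Lower tail: proportion of students expected to score below 50.
fail_check = scores.cdf(50)

# Upper tail: proportion expected to score above 80 (1 minus the lower tail).
a_check = 1 - scores.cdf(80)

print(round(fail_check, 3), round(a_check, 3))
```

Under this model roughly 2.3% of students are expected to fail and roughly 15.9% to earn an A, which is what the `pnorm` calls should report if they follow the same convention as R.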

### 3) Calculating quantiles with qnorm

Mr. Gittu also wants to know what exam score corresponds to the 90th percentile. Using `qnorm`, he can easily find out what test score students need to be in the top 10% of the class. Using the parameter `lower_tail=False`, Mr. Gittu can also identify the test score that 90% of the class will score above.

In [None]:
from pystats.qnorm import qnorm

quantile1 = qnorm(p=0.9, mean=70, sd=10)
print(round(quantile1, 2))

To be in the top 10% of the class, Mr. Gittu's students will need to score at least 82.82% on the test.

In [None]:
quantile2 = qnorm(p=0.9, mean=70, sd=10, lower_tail=False)
print(round(quantile2, 2))

Conversely, 90% of Mr. Gittu's students should score above 57.18% on the test.
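The two quantiles above can be reproduced with the standard library's `NormalDist.inv_cdf`, which may help clarify what `lower_tail=False` means here. This is a sketch under the assumption that `lower_tail=False` with `p=0.9` is equivalent to asking for the lower-tail quantile at `1 - 0.9`, as in R; the names `top_10_cutoff` and `bottom_10_cutoff` are introduced for illustration.

```python
from statistics import NormalDist

scores = NormalDist(mu=70, sigma=10)

# Score with 90% of the distribution below it (the 90th percentile).
top_10_cutoff = scores.inv_cdf(0.9)

# Score with 90% of the distribution above it, i.e. the 10th percentile.
bottom_10_cutoff = scores.inv_cdf(1 - 0.9)

print(round(top_10_cutoff, 2), round(bottom_10_cutoff, 2))
```

Because the normal distribution is symmetric, the two cutoffs sit the same distance (about 12.82 points) on either side of the mean of 70.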

### 4) Measuring how unusual a score is with dnorm

After the test, Mr. Gittu finds out that the actual mean was 68% with a standard deviation of 11%. Not bad! He uses `dnorm` to gauge how far into the tails certain scores fall. If he wants to know how unusual a score of 50% is:

In [None]:
from pystats.dnorm import dnorm
result = dnorm(50, mean=68, sd=11)
print(result)

This output is the value of the normal probability density function (PDF) at a score of 50, given a mean of 68 and a standard deviation of 11: approximately 0.009507. A density value is not itself a probability, but a low density relative to the peak of the curve indicates that 50 lies in the tail of the distribution, about 1.6 standard deviations below the mean.
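To see where a value like this comes from, the sketch below evaluates the normal PDF formula directly with the `math` module and compares the density at 50 with the density at the mean. The helper `normal_pdf` and the variable names are hypothetical, written for this illustration; they are not part of pystats.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


density_at_50 = normal_pdf(50, mean=68, sd=11)
density_at_mean = normal_pdf(68, mean=68, sd=11)  # the peak of the curve
z_score = (50 - 68) / 11  # distance from the mean in standard deviations

print(round(density_at_50, 6))
print(round(density_at_50 / density_at_mean, 2))
print(round(z_score, 2))
```

The density at 50 is about 0.26 times the density at the mean, and the z-score of about -1.64 gives Mr. Gittu a scale-free way to compare how unusual different scores are.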

## Final Remarks

The `pystats` team hopes you find these examples helpful. If Mr. Gittu's test scores didn't answer all your questions, we suggest looking through the [function documentation](https://github.com/UBC-MDS/Group24-pystats/blob/main/README.md).