## `statistics` Module

In Python, `statistics` is a built-in module that provides functions for calculating mathematical statistics of numeric data. It covers measures of central location (mean, median, mode), measures of spread (variance, standard deviation), and relations between two inputs such as correlation. The module is part of the Python Standard Library, so no additional installation is required.


Here is a list of some useful functions and objects:

1. Functions:

   - `mean(data)`: Computes the arithmetic mean (average) of a sequence of numbers.
   - `median(data)`: Calculates the median (middle value) of a sequence of numbers.
   - `mode(data)`: Determines the mode (most common value) of a sequence of numbers.
   - `variance(data)`: Computes the sample variance of a sequence of numbers.
   - `stdev(data)`: Calculates the sample standard deviation of a sequence of numbers.
   - `pvariance(data)`: Computes the population variance of a sequence of numbers.
   - `pstdev(data)`: Calculates the population standard deviation of a sequence of numbers.
   - `correlation(x, y)`: Computes Pearson's correlation coefficient between two inputs of equal length (available since Python 3.10).
   - `quantiles(data, *, n=4)`: Divides the data into `n` continuous intervals with equal probability and returns a list of the `n - 1` cut points separating them (`n=4` gives quartiles, `n=100` gives percentiles).

2. Objects:

   - `StatisticsError`: An exception class raised for errors encountered in the statistics module.
   - `NormalDist(mu, sigma)`: Represents a normal (Gaussian) distribution with mean `mu` and standard deviation `sigma`.
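
As a small illustration of `StatisticsError`, the sketch below wraps `statistics.mean()` so that empty input yields `None` instead of an exception. The helper name `safe_mean` is hypothetical and not part of the module; it is only one possible way to handle the error.

```python
import statistics


def safe_mean(values):
    """Return the mean of values, or None when the input is empty.

    Hypothetical helper for illustration; not part of the statistics module.
    """
    try:
        return statistics.mean(values)
    except statistics.StatisticsError:
        # mean() raises StatisticsError when given no data points
        return None


print(safe_mean([2, 4, 6]))
print(safe_mean([]))
```

The same exception is raised by other functions when the input is too small, for example `variance()` with fewer than two data points.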

### Useful statistical functions

The following cell contains examples of applying some useful functions from the `statistics` module to a randomly generated data set.

In [12]:
import statistics
import random

data = [random.randint(1, 100) for _ in range(100)]
print(f'data: {data} \n')

# Example using mean
mean = statistics.mean(data)
print("Mean:", mean)

# Example using median()
median = statistics.median(data)
print("Median:", median)

# Example using mode()
mode = statistics.mode(data)
print("Mode:", mode)

# Example using variance()
variance = statistics.variance(data)
print("Variance:", variance)

# Example using standard deviation
stdev = statistics.stdev(data)
print("Standard Deviation:", stdev)

# Example using population variance
pvariance = statistics.pvariance(data)
print("Population Variance:", pvariance)

# Example using population standard deviation
pstdev = statistics.pstdev(data)
print("Population Standard Deviation:", pstdev)

# Example using correlation()
x = [1, 2, 3, 4, 5]
y = [2, 4, 3, 8, 10]
correlation = statistics.correlation(x, y)
print("Correlation:", correlation)

# Example using quantiles()
quantiles = statistics.quantiles(data, n=4)
print("Quantiles:", quantiles)

data: [74, 82, 32, 19, 80, 44, 49, 97, 30, 93, 25, 43, 56, 22, 32, 33, 65, 79, 15, 73, 70, 41, 68, 4, 43, 59, 65, 39, 85, 50, 64, 51, 6, 39, 97, 91, 69, 2, 92, 66, 27, 78, 22, 75, 87, 4, 21, 52, 28, 37, 11, 59, 13, 83, 16, 50, 1, 14, 61, 15, 70, 24, 33, 13, 39, 1, 85, 15, 3, 82, 63, 77, 62, 33, 45, 90, 64, 87, 42, 46, 25, 92, 33, 78, 33, 77, 11, 58, 40, 56, 57, 22, 50, 42, 44, 53, 34, 22, 31, 88] 

Mean: 48.18
Median: 45.5
Mode: 33
Variance: 725.4420202020202
Standard Deviation: 26.93403089405706
Population Variance: 718.1876
Population Standard Deviation: 26.799022370228357
Correlation: 0.9205746178983234
Quantiles: [25.5, 45.5, 70.0]
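
The sample and population results above differ slightly because `variance()` divides by `n - 1` while `pvariance()` divides by `n`. The following sketch checks that relationship on a small, hand-picked list (an assumption for illustration, not the random data from the cell above).

```python
import math
import statistics

# Small illustrative dataset (assumed values, not the random data above)
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)

sample_var = statistics.variance(sample)       # sum of squared deviations / (n - 1)
population_var = statistics.pvariance(sample)  # sum of squared deviations / n

# The two estimates differ only by the factor (n - 1) / n.
assert math.isclose(population_var, sample_var * (n - 1) / n)

# Each standard deviation is the square root of the matching variance.
assert math.isclose(statistics.stdev(sample), math.sqrt(sample_var))
assert math.isclose(statistics.pstdev(sample), math.sqrt(population_var))

print("Sample variance:", sample_var)
print("Population variance:", population_var)
```

For large `n` the factor `(n - 1) / n` approaches 1, which is why the two values are close for the 100-element data set above.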


### `NormalDist(mu, sigma)`

Starting with Python 3.8, the standard library provides the `NormalDist` class as part of the `statistics` module.

```python
import statistics

mu, sigma = 0, 1  # example values: mean and standard deviation
N = statistics.NormalDist(mu, sigma)
```

Here are some methods that can be used with a `NormalDist` object:
- `pdf(x)`: Returns the value of the probability density function (pdf) at `x`.
- `cdf(x)`: Returns the value of the cumulative distribution function (cdf) at `x`.
- `inv_cdf(p)`: Returns the value of the inverse cumulative distribution function for probability `p`.
- `quantiles(n=4)`: Divides the normal distribution into `n` continuous intervals with equal probability. Returns a list of (n - 1) cut points separating the intervals.
- `overlap(other)`: Measures the agreement between two normal probability distributions. Returns a value between 0.0 and 1.0 giving the [overlapping area](https://www.rasch.org/rmt/rmt101r.htm) for the two probability density functions.


In [6]:
import statistics

N_1 = statistics.NormalDist(0, 1)

In [7]:
N_1.pdf(10)

7.69459862670642e-23

In [8]:
N_1.cdf(1)

0.8413447460685429

In [11]:
N_1.inv_cdf(0.9)

1.2815515655446008

In [14]:
N_1.quantiles(n=4)

[-0.6744897501960817, 0.0, 0.6744897501960817]

In [9]:
N_2 = statistics.NormalDist(2, 10)

In [10]:
print(N_2.overlap(N_1))

0.19841439067425348
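
The methods above are related to one another. As a minimal sketch (the variable names are chosen here for illustration), the following checks that `inv_cdf()` undoes `cdf()`, that `quantiles(n=4)` matches `inv_cdf()` evaluated at 0.25, 0.5, and 0.75, and uses `cdf()` to compute the probability of an interval.

```python
import math
import statistics

std_normal = statistics.NormalDist(0, 1)

# inv_cdf() undoes cdf(): mapping a point to its probability and back returns the point.
p = std_normal.cdf(1)
assert math.isclose(std_normal.inv_cdf(p), 1.0)

# quantiles(n=4) gives the same cut points as inv_cdf() at 0.25, 0.5, and 0.75.
cuts = std_normal.quantiles(n=4)
expected = [std_normal.inv_cdf(k / 4) for k in range(1, 4)]
assert all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(cuts, expected))

# Probability that a value falls within one standard deviation of the mean.
within_one_sigma = std_normal.cdf(1) - std_normal.cdf(-1)
print(round(within_one_sigma, 4))  # 0.6827
```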
