# Stats

This section showcases the utility functions found in the [datachart.utils.stats](/references/utils/stats) module.

Let us start by importing the standard-library `random` module, which is used to generate the sample data:

In [1]:
import random

## Statistics Submodule

The [datachart.utils.stats](/references/utils/stats) submodule contains functions for calculating statistics. To showcase
its use, let us create a list of 10 distinct random integers between 1 and 99:


In [2]:
random_values = random.sample(range(1, 100), 10)
random_values

[73, 64, 10, 42, 1, 55, 75, 66, 8, 77]

Let us now showcase the functions in the `stats` module.

### Count

The `count` function returns the number of elements in the list.

In [3]:
from datachart.utils.stats import count

In [4]:
count(random_values)

10

### Sum

The `sum_values` function returns the sum of all values in the list.


In [5]:
from datachart.utils.stats import sum_values

In [6]:
sum_values(random_values)

471.0

### Mean

The `mean` function returns the arithmetic mean of the values, that is, their sum divided by their count.

In [7]:
from datachart.utils.stats import mean

In [8]:
mean(random_values)

47.1
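As a quick sanity check, the count, sum, and mean shown so far can be reproduced with Python built-ins. This is only a sketch that assumes the `random_values` list printed earlier; it does not call `datachart` and makes no claim about how the library computes these values internally.

```python
# Values copied from the example output earlier on this page.
random_values = [73, 64, 10, 42, 1, 55, 75, 66, 8, 77]

n = len(random_values)      # number of elements
total = sum(random_values)  # built-in sum keeps the int type (471, not 471.0)
average = total / n         # arithmetic mean = sum / count

print(n, total, average)    # 10 471 47.1
```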

### Median

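The `median` function returns the median of the values. For a list with an even number of elements, as here, the median is the average of the two middle values of the sorted list.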

In [9]:
from datachart.utils.stats import median

In [10]:
median(random_values)

59.5
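The result above can be reproduced by hand: sort the ten values and average the two in the middle. The sketch below does this and compares it against the standard library's `statistics.median`; it assumes the `random_values` list printed earlier and is a cross-check, not the library's implementation.

```python
import statistics

# Values copied from the example output earlier on this page.
random_values = [73, 64, 10, 42, 1, 55, 75, 66, 8, 77]

ordered = sorted(random_values)  # [1, 8, 10, 42, 55, 64, 66, 73, 75, 77]
middle = len(ordered) // 2

# The list has an even length, so average the two central values (55 and 64).
manual_median = (ordered[middle - 1] + ordered[middle]) / 2

print(manual_median)                     # 59.5
print(statistics.median(random_values))  # 59.5
```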

### Standard Deviation

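The `stdev` function returns the standard deviation of the values. The result below matches the population standard deviation, which divides by the number of values rather than by one less.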

In [11]:
from datachart.utils.stats import stdev

In [12]:
stdev(random_values)

28.469106062537335

### Variance

The `variance` function returns the variance of the values. Variance is the square of the standard deviation.


In [13]:
from datachart.utils.stats import variance

In [14]:
variance(random_values)

810.49
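The standard deviation and variance shown above are consistent with the population formulas, which divide by the number of values. The sketch below uses the standard-library `statistics` module to contrast the population and sample variants. It assumes the `random_values` list printed earlier and is a cross-check only, not a description of how the library is implemented.

```python
import math
import statistics

# Values copied from the example output earlier on this page.
random_values = [73, 64, 10, 42, 1, 55, 75, 66, 8, 77]

# Population variance divides the squared deviations by n.
population_variance = statistics.pvariance(random_values)

# Sample variance divides by n - 1, so it is slightly larger.
sample_variance = statistics.variance(random_values)

# The standard deviation is the square root of the variance.
population_stdev = math.sqrt(population_variance)

print(population_variance)  # approximately 810.49
print(population_stdev)     # approximately 28.469
print(sample_variance)      # approximately 900.54
```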

### Quantile

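The `quantile` function returns the value below which a given percentage of the data falls. In the examples below, the second argument is passed as a percentage (`25` and `75`).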

In [15]:
from datachart.utils.stats import quantile

Show the 25th percentile (first quartile):

In [16]:
quantile(random_values, 25)

18.0

Show the 75th percentile (third quartile):

In [17]:
quantile(random_values, 75)

71.25
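Both outputs above are consistent with linear interpolation between neighboring sorted values. The sketch below reproduces them in plain Python. The helper name `percentile_linear` is hypothetical, and the interpolation rule is an assumption inferred from the printed results rather than something this page states.

```python
def percentile_linear(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted list.
    position = (len(ordered) - 1) * q / 100
    lower = int(position)                     # index just below the position
    upper = min(lower + 1, len(ordered) - 1)  # index just above, clamped
    fraction = position - lower
    # Move `fraction` of the way from the lower value to the upper value.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Values copied from the example output earlier on this page.
random_values = [73, 64, 10, 42, 1, 55, 75, 66, 8, 77]

print(percentile_linear(random_values, 25))  # 18.0
print(percentile_linear(random_values, 75))  # 71.25
```

For the 25th percentile, the position is 9 × 0.25 = 2.25, so the result lies a quarter of the way between the sorted values 10 and 42.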

### Interquartile Range (IQR)

The `iqr` function returns the interquartile range, which is the difference between the 75th percentile (Q3) and 25th percentile (Q1). It is useful for identifying outliers and understanding the spread of the middle 50% of the data.


In [18]:
from datachart.utils.stats import iqr


In [19]:
iqr(random_values)


53.25
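To illustrate the outlier use case, the sketch below applies the common 1.5 × IQR rule (Tukey's fences). This rule is a widely used convention brought in here for illustration; the page itself does not prescribe it. The quartiles are computed with the standard library's `statistics.quantiles` using `method="inclusive"`, which reproduces the quartile values shown earlier for this data.

```python
import statistics

# Values copied from the example output earlier on this page.
random_values = [73, 64, 10, 42, 1, 55, 75, 66, 8, 77]

# Quartiles via the standard library (Python 3.8+).
q1, _, q3 = statistics.quantiles(random_values, n=4, method="inclusive")
spread = q3 - q1  # interquartile range

# Conventional fences: values outside them are flagged as outliers.
lower_fence = q1 - 1.5 * spread
upper_fence = q3 + 1.5 * spread
outliers = [v for v in random_values if v < lower_fence or v > upper_fence]

print(spread)    # 53.25
print(outliers)  # []
```

With fences at -61.875 and 151.125, none of the ten values is flagged.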

### Minimum

The `minimum` function returns the minimum of the values.

In [20]:
from datachart.utils.stats import minimum

In [21]:
minimum(random_values)

1.0

### Maximum

The `maximum` function returns the maximum of the values.

In [22]:
from datachart.utils.stats import maximum

In [23]:
maximum(random_values)

77.0

### Correlation

The `correlation` function calculates the Pearson correlation coefficient between two lists of values. It measures the linear relationship between the datasets, ranging from -1 (perfect negative correlation) to 1 (perfect positive correlation).


In [24]:
from datachart.utils.stats import correlation

Create a second list of random values to compare:


In [25]:
random_values_2 = random.sample(range(1, 100), 10)
random_values_2

[77, 4, 28, 12, 13, 75, 48, 37, 7, 41]

In [26]:
correlation(random_values, random_values_2)

0.548995564913195
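For reference, the Pearson coefficient is the sum of the products of the paired deviations from the means, divided by the square root of the product of the two sums of squared deviations. The sketch below computes it in plain Python for the two lists printed above. The helper name `pearson` is hypothetical, and the code is a cross-check, not the library's implementation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys):
        raise ValueError("both sequences must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each value from its own mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    co_deviation = sum(a * b for a, b in zip(dx, dy))
    scale = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return co_deviation / scale


# Values copied from the example outputs above.
random_values = [73, 64, 10, 42, 1, 55, 75, 66, 8, 77]
random_values_2 = [77, 4, 28, 12, 13, 75, 48, 37, 7, 41]

print(pearson(random_values, random_values_2))  # approximately 0.549
```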

<div class="admonition note">
    <p class="admonition-title">Under development</p>
    <p style="margin-top: .6rem; margin-bottom: .6rem">
        This module is still under development. If you are interested in improving it, please let us know.
    </p>
</div>