# Introduction to Basic Statistical Measures with Pandas
This notebook will guide you through some basic statistical measures using the `pandas` library. We will cover:
1. Mean
2. Median
3. Mode
4. Standard Deviation
5. Variance
6. Quartiles and Percentiles

## Importing Necessary Libraries
First, we will import the necessary libraries.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

## Creating the DataFrame
Let's create a DataFrame with the data we will use in the examples.

In [None]:
# Creating the DataFrame
data = pd.DataFrame({'values': [10, 20, 30, 40, 50]})
print(data)

## 1. Mean
The mean is the sum of all values divided by the number of values.

In [None]:
# Example of calculating the mean
mean = data['values'].mean()
print("Mean:", mean)
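
To connect the definition to the code, here is a minimal sketch that computes the mean by hand and compares it with `mean()`. The variable names `values` and `manual_mean` are illustrative and are not part of the notebook's own cells.

```python
import pandas as pd

# Same numbers as the notebook's DataFrame, held in a standalone Series
values = pd.Series([10, 20, 30, 40, 50])

# Sum of all values divided by the number of values
manual_mean = values.sum() / values.count()

print("Manual mean:", manual_mean)
print("pandas mean:", values.mean())
```

Both lines should print the same number, since `mean()` performs this calculation (skipping missing values by default).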

## 2. Median
The median is the middle value of the ordered data. With an even number of values, it is the mean of the two middle values.

In [None]:
# Example of calculating the median
median = data['values'].median()
print("Median:", median)
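
The sketch below illustrates two properties of the median: how it is computed for an even number of values, and how it resists extreme values better than the mean. The extra Series (`even_values`, `with_outlier`) are made-up illustrations, not data from the notebook.

```python
import pandas as pd

odd_values = pd.Series([10, 20, 30, 40, 50])
even_values = pd.Series([10, 20, 30, 40])
with_outlier = pd.Series([10, 20, 30, 40, 500])  # hypothetical extreme value

# Odd count: the single middle value
print("Median (odd count):", odd_values.median())    # 30.0

# Even count: the mean of the two middle values, (20 + 30) / 2
print("Median (even count):", even_values.median())  # 25.0

# One extreme value pulls the mean up but leaves the median unchanged
print("Mean with outlier:", with_outlier.mean())      # 120.0
print("Median with outlier:", with_outlier.median())  # 30.0
```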

## 3. Mode
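The mode is the value that appears most frequently in the data. A dataset can have more than one mode; when every value is unique, as in our DataFrame, every value counts as a mode.

In [None]:
# Example of calculating the mode
# mode() returns a Series because there can be several modes
mode = data['values'].mode()
print("Mode(s):", mode.tolist())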
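
Because the notebook's data has no repeated values, a few small made-up Series show more clearly how `mode()` behaves. This is only a sketch; the Series names are illustrative.

```python
import pandas as pd

unimodal = pd.Series([10, 20, 20, 30, 40])     # 20 appears twice
bimodal = pd.Series([10, 10, 20, 30, 30])      # 10 and 30 both appear twice
all_unique = pd.Series([10, 20, 30, 40, 50])   # no value repeats

# mode() always returns a Series, sorted in ascending order
print("One mode:", unimodal.mode().tolist())      # [20]
print("Two modes:", bimodal.mode().tolist())      # [10, 30]
print("All unique:", all_unique.mode().tolist())  # [10, 20, 30, 40, 50]
```

Indexing the result with `[0]` returns only the smallest mode, which can hide ties.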

## 4. Standard Deviation
The standard deviation measures how far the values typically lie from the mean. By default, pandas computes the sample standard deviation, which divides by n - 1 rather than n.

In [None]:
# Example of calculating the standard deviation
std_dev = data['values'].std()
print("Standard Deviation:", std_dev)
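
The `ddof` parameter of `std()` controls the divisor. The sketch below compares the default sample version (`ddof=1`) with the population version (`ddof=0`); the variable names are illustrative.

```python
import pandas as pd

values = pd.Series([10, 20, 30, 40, 50])

# Default: sample standard deviation, divides by n - 1
sample_std = values.std()

# Population standard deviation, divides by n
population_std = values.std(ddof=0)

print("Sample std (ddof=1):", sample_std)          # about 15.81
print("Population std (ddof=0):", population_std)  # about 14.14
```

Use the sample version when the data is a sample from a larger population, and the population version when the data is the entire population of interest.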

## 5. Variance
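The variance is the average of the squared deviations from the mean, which makes it the square of the standard deviation. Like `std()`, the pandas `var()` method divides by n - 1 by default.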

In [None]:
# Example of calculating the variance
variance = data['values'].var()
print("Variance:", variance)
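
As a worked example, the sample variance can be computed step by step and checked against `var()` and against the square of `std()`. This is a sketch with illustrative variable names.

```python
import pandas as pd

values = pd.Series([10, 20, 30, 40, 50])

# Step 1: deviation of each value from the mean
deviations = values - values.mean()

# Step 2: square the deviations and sum them
sum_of_squares = (deviations ** 2).sum()

# Step 3: divide by n - 1 (sample variance, the pandas default)
manual_variance = sum_of_squares / (len(values) - 1)

print("Manual variance:", manual_variance)   # 250.0
print("pandas variance:", values.var())
print("std squared:", values.std() ** 2)     # equal up to floating-point rounding
```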

## 6. Quartiles and Percentiles
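Quartiles are the three cut points (Q1, Q2, Q3) that divide the ordered data into four equal parts. Percentiles work the same way but divide the data into 100 equal parts. In pandas, `quantile()` takes the fraction as a number between 0 and 1, so the 90th percentile is `quantile(0.90)`.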

In [None]:
# Example of calculating quartiles
Q1 = data['values'].quantile(0.25)
Q2 = data['values'].quantile(0.50)
Q3 = data['values'].quantile(0.75)
print("1st Quartile (Q1):", Q1)
print("2nd Quartile (Q2):", Q2)  # This is the same as the median
print("3rd Quartile (Q3):", Q3)

# Example of calculating percentiles
percentile_90 = data['values'].quantile(0.90)
print("90th Percentile:", percentile_90)
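
When a quantile falls between two data points, `quantile()` interpolates linearly by default. The sketch below reproduces the 90th percentile by hand under that default; the helper variables are illustrative, and the manual formula assumes a fraction strictly below 1.

```python
import pandas as pd

values = pd.Series([10, 20, 30, 40, 50])
q = 0.90

# Sort the data and use positional access
sorted_values = values.sort_values().reset_index(drop=True)

# Fractional position of the quantile in the sorted data: (n - 1) * q
position = (len(sorted_values) - 1) * q   # roughly 3.6
lower = int(position)                     # index of the value just below
fraction = position - lower               # how far towards the next value

# Linear interpolation between the two neighbouring values (40 and 50 here)
low_value = sorted_values.iloc[lower]
high_value = sorted_values.iloc[lower + 1]
manual_percentile = low_value + fraction * (high_value - low_value)

print("Manual 90th percentile:", manual_percentile)
print("pandas 90th percentile:", values.quantile(q))
```

Both results should be approximately 46, i.e. 60% of the way from 40 to 50.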

## Data Visualisation
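Let's use a boxplot to visualise the distribution of the data. The box spans Q1 to Q3, the line inside it marks the median, and the whiskers extend to the most extreme points that lie within 1.5 times the interquartile range of the box.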

In [None]:
# Example of visualisation with boxplot
plt.boxplot(data['values'])
plt.title("Boxplot of the Data")
plt.show()
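
The numbers behind the boxplot can also be computed directly. This sketch calculates the interquartile range (IQR) and the 1.5 × IQR fences that matplotlib uses by default to decide which points to draw as outliers; the variable names are illustrative.

```python
import pandas as pd

values = pd.Series([10, 20, 30, 40, 50])

q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1  # height of the box

# Points beyond these fences would be drawn as individual outlier markers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = values[(values < lower_fence) | (values > upper_fence)]

print("IQR:", iqr)                           # 20.0
print("Fences:", lower_fence, upper_fence)   # -10.0 70.0
print("Outliers:", outliers.tolist())        # []
```

Since every value lies inside the fences, the boxplot of this data shows no outlier markers, and the whiskers end at the minimum (10) and maximum (50).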

## Conclusion
In this notebook, we covered some basic statistical measures, including mean, median, mode, standard deviation, variance, quartiles, and percentiles. Additionally, we visualised the distribution of the data using a boxplot. We hope this notebook helps you better understand these measures and how to apply them to your data.