# Quantiles and Percentiles

Quantiles and percentiles are core statistical tools for summarizing and interpreting data. This notebook explains both concepts and shows how they can be calculated.

## Definition of Quantiles

A quantile is a cut point that divides a sorted data set into groups of equal size. For example, the median is a quantile because it divides the data into two equal groups: half of the data points are less than the median, and half are greater. For this reason, the median is also called the 0.5 quantile or the 50% quantile.

We can also define other quantiles. For example, a data set can be divided into four equal groups at the 0.25, 0.5, and 0.75 quantiles (also known as the 25%, 50%, and 75% quantiles, or the quartiles). The 0.25 quantile is the point below which 25% of the data points fall, with the remaining 75% above it. Similarly, the 0.75 quantile is the point below which 75% of the data points fall, with the remaining 25% above it.
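As a small illustration, the three quartiles of a sample can be computed with `statistics.quantiles` from Python's standard library. The sample values are made up for this example, and the function's default `exclusive` method is only one of several conventions for placing the cut points.

```python
import statistics

# Hypothetical sample of 11 values (illustrative only).
data = [2, 4, 4, 5, 7, 9, 10, 12, 15, 18, 21]

# n=4 requests the three cut points that split the data into four groups.
q1, q2, q3 = statistics.quantiles(data, n=4)

print(q1, q2, q3)  # 4.0 9.0 15.0

# The middle cut point (0.5 quantile) coincides with the median.
print(q2 == statistics.median(data))  # True
```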

Mathematically, for a data set $X$ with $n$ data points sorted in increasing order, one common convention takes the $p$-th quantile ($0 < p < 1$) to be the data point at position $p(n+1)$, counting positions from 1. If $p(n+1)$ is not an integer, the position is typically either rounded to the nearest integer, or the quantile is interpolated between the data points at the two neighboring positions.
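The position rule can be sketched directly in code. The function below is a minimal illustration, not a reference implementation: the name `quantile_position` and the sample values are hypothetical, linear interpolation is chosen over rounding, and positions that fall outside the data are clamped to the first or last point (an assumption the text does not specify).

```python
import math

def quantile_position(sorted_data, p):
    """p-th quantile at 1-based position p*(n+1), with linear interpolation."""
    n = len(sorted_data)
    pos = p * (n + 1)            # 1-based position, possibly fractional
    pos = min(max(pos, 1), n)    # assumption: clamp to the valid range 1..n
    lower = math.floor(pos)      # position of the data point just below
    frac = pos - lower           # how far we are toward the next data point
    if frac == 0:
        return sorted_data[lower - 1]
    below = sorted_data[lower - 1]
    above = sorted_data[lower]
    return below + frac * (above - below)

# Hypothetical sorted sample with n = 10.
data = [3, 6, 7, 8, 8, 10, 13, 15, 16, 20]

print(quantile_position(data, 0.25))  # position 2.75 -> 6.75
print(quantile_position(data, 0.5))   # position 5.5  -> 9.0
print(quantile_position(data, 0.75))  # position 8.25 -> 15.25
```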

## Definition of Percentiles

Percentiles are a specific type of quantile: they divide a data set into 100 equal groups. The $p$-th percentile ($0 < p < 100$) is the point below which $p\%$ of the data points fall, so it corresponds to the $p/100$ quantile. For example, the 25th percentile is the point below which 25% of the data points fall, and the 75th percentile is the point below which 75% of the data points fall.
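To make the correspondence concrete, the sketch below computes all 99 percentile cut points with `statistics.quantiles(data, n=100)` and compares a few of them with the quartiles. The sample is invented for illustration, and the results depend on the function's default `exclusive` method.

```python
import statistics

# Hypothetical sample of 11 values (illustrative only).
data = [2, 4, 4, 5, 7, 9, 10, 12, 15, 18, 21]

percentiles = statistics.quantiles(data, n=100)  # 99 cut points
quartiles = statistics.quantiles(data, n=4)      # 3 cut points

# The k-th percentile sits at index k - 1 of the list.
p25, p50, p75 = percentiles[24], percentiles[49], percentiles[74]

print(len(percentiles))               # 99
print([p25, p50, p75] == quartiles)   # True
```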

In practice, the terms quantile and percentile are often used interchangeably, and percentiles are often used even when the data set is not large enough to be divided into 100 groups.

## Calculation of Quantiles and Percentiles

Calculating a quantile or percentile comes down to finding the value below which a given fraction of the sorted data falls. For example, the 20th percentile of a data set is the point below which 20% of the data points lie.
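A quick way to check this interpretation is to compute a percentile and then count the fraction of data points below it. The sketch assumes a hypothetical sample consisting of the integers 1 to 100 and uses the default method of `statistics.quantiles`; other methods may return a slightly different cut point.

```python
import statistics

# Hypothetical sample: the integers 1..100 (no repeated values).
data = list(range(1, 101))

percentiles = statistics.quantiles(data, n=100)
p20 = percentiles[19]  # 20th percentile

# Fraction of data points strictly below the 20th percentile.
fraction_below = sum(x < p20 for x in data) / len(data)

print(p20, fraction_below)  # 20.2 0.2
```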

There are, however, several methods for calculating quantiles and percentiles, and they can give slightly different results. The methods differ in how they round or interpolate when the desired quantile or percentile position is not an integer. The R programming language, for example, provides nine different methods for calculating quantiles.
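The effect of the method can be seen even with Python's standard library, which offers two conventions (`exclusive` and `inclusive`) rather than R's nine. The following sketch uses an invented ten-point sample; it illustrates that methods can disagree and is not a comparison of all available definitions.

```python
import statistics

# Hypothetical sample of 10 values (illustrative only).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Same data, same quartiles requested, two different conventions.
exclusive = statistics.quantiles(data, n=4, method="exclusive")
inclusive = statistics.quantiles(data, n=4, method="inclusive")

print(exclusive)  # [2.75, 5.5, 8.25]
print(inclusive)  # [3.25, 5.5, 7.75]
```

Here the two methods agree on the median but place the first and third quartiles at different points.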

When the data set is small, the calculated quantiles or percentiles can be sensitive to both the specific sample and the calculation method. When the data set is large, all methods give fairly similar results.
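The following sketch compares the gap between two methods on a small and a large sample. Everything here is an assumption made for illustration: the data are pseudo-random draws from a standard normal distribution with a fixed seed, the helper name `method_gap` is hypothetical, and only the first quartile under the two methods of `statistics.quantiles` is compared.

```python
import random
import statistics

def method_gap(sample):
    """Absolute difference in the first quartile between the two stdlib methods."""
    q1_exclusive = statistics.quantiles(sample, n=4, method="exclusive")[0]
    q1_inclusive = statistics.quantiles(sample, n=4, method="inclusive")[0]
    return abs(q1_exclusive - q1_inclusive)

rng = random.Random(0)  # fixed seed so the run is repeatable

small_sample = [rng.gauss(0, 1) for _ in range(10)]
large_sample = [rng.gauss(0, 1) for _ in range(10_000)]

small_gap = method_gap(small_sample)
large_gap = method_gap(large_sample)

print(f"gap with n=10:     {small_gap:.5f}")
print(f"gap with n=10000:  {large_gap:.5f}")
```

With the large sample, neighboring data points lie much closer together, so the choice of interpolation rule moves the result far less.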

## Conclusion

To summarize, quantiles and percentiles are cut points that divide a data set into equal-sized groups: quantiles can divide the data into any number of groups, while percentiles divide it into 100. Calculating them means finding the value below which a given fraction of the data falls, but the specific calculation method can vary and can affect the results, especially for small data sets.