# Quartiles, quantiles, and quintiles
Quantiles are a great way of summarizing numerical data: they can be used to measure center and spread, and they show where a data point stands in relation to the rest of the dataset. For example, you might want to give a discount to the 10% most active users on a website.
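
As a minimal sketch of the discount example, the 0.9 quantile gives the activity level that separates the most active 10% of users from the rest. The sessions array and its values are made up for illustration and are not part of the exercise data.

```python
import numpy as np

# Hypothetical session counts for ten users (illustrative data only)
sessions = np.array([3, 8, 1, 15, 22, 5, 40, 9, 12, 7])

# The 0.9 quantile: about 90% of users fall at or below this activity level
threshold = np.quantile(sessions, 0.9)

# Users strictly above the threshold are roughly the most active 10%
top_users = sessions[sessions > threshold]
print(threshold, top_users)
```

With only ten users, a single user ends up above the threshold; on real data the cut would be less coarse.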

In this exercise, you'll calculate quartiles, quintiles, and deciles, which split up a dataset into 4, 5, and 10 pieces, respectively.
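
Splitting data into n pieces takes n + 1 cut points, because the minimum (0) and maximum (1) are included. The short sketch below uses np.linspace to generate those evenly spaced probabilities; the loop and variable names are illustrative only.

```python
import numpy as np

# n pieces need n + 1 evenly spaced probabilities between 0 and 1
for name, pieces in [("quartiles", 4), ("quintiles", 5), ("deciles", 10)]:
    probs = np.linspace(0, 1, pieces + 1)
    print(name, probs)
```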

pandas is loaded as pd, numpy is loaded as np, and the food_consumption DataFrame is available.

In [7]:
import pandas as pd
import numpy as np

food_consumption = pd.read_csv(r'datasets/food_consumption.csv', index_col=0)
food_consumption.head()

Unnamed: 0,country,food_category,consumption,co2_emission
1,Argentina,pork,10.51,37.2
2,Argentina,poultry,38.66,41.53
3,Argentina,beef,55.48,1712.0
4,Argentina,lamb_goat,1.56,54.63
5,Argentina,fish,4.36,6.96


- Calculate the five quantiles of co2_emission that split up the data into four pieces (quartiles).
- Calculate the six quantiles of co2_emission that split up the data into five pieces (quintiles).
- Calculate the eleven quantiles of co2_emission that split up the data into ten pieces (deciles).

In [8]:
# Calculate the quartiles of co2_emission
print(np.quantile(food_consumption['co2_emission'], np.linspace(0, 1, 5)))

# Calculate the quintiles of co2_emission
print(np.quantile(food_consumption['co2_emission'], np.linspace(0, 1, 6)))

# Calculate the deciles of co2_emission
print(np.quantile(food_consumption['co2_emission'], np.linspace(0, 1, 11)))

[   0.        5.21     16.53     62.5975 1712.    ]
[   0.       3.54    11.026   25.59    99.978 1712.   ]
[0.00000e+00 6.68000e-01 3.54000e+00 7.04000e+00 1.10260e+01 1.65300e+01
 2.55900e+01 4.42710e+01 9.99780e+01 2.03629e+02 1.71200e+03]


Those are some high-quality quantiles! While calculating more quantiles gives you a more detailed look at the data, it also produces more numbers, making the summary more difficult to quickly understand.
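
For a more compact summary, quartiles alone already give a measure of center (the median) and spread (the interquartile range). The sketch below uses the pandas Series.quantile method, which labels each result with its probability. It assumes only the five co2_emission values shown in the head() output above, not the full dataset, so the numbers differ from the exercise output.

```python
import numpy as np
import pandas as pd

# The five co2_emission values from the head() output (not the full dataset)
co2 = pd.Series([37.2, 41.53, 1712.0, 54.63, 6.96], name="co2_emission")

# Series.quantile returns a Series indexed by the requested probabilities
quartiles = co2.quantile(np.linspace(0, 1, 5))
print(quartiles)

# Median measures center; interquartile range (Q3 - Q1) measures spread
median = quartiles.loc[0.5]
iqr = quartiles.loc[0.75] - quartiles.loc[0.25]
print(median, iqr)
```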