# Python | grouping pandas | groupby

The pandas library is commonly used to load datasets and perform various types of data manipulation and analysis in Python.
In this tutorial, we discuss how to group a pandas dataframe, or **grouping pandas** for short.

We often work with a dataset that has one or multiple columns of categorical data.
In such situations, the pandas _groupby_ method lets us group the dataframe by one or multiple columns. Let's look at an example:

# Python example for grouping pandas | groupby | COVID-19 dataset

In this example, we work with a practical dataset from [Our World in Data](https://ourworldindata.org/). The dataset contains historical COVID-19 data for different locations. Please note that this dataset was still being updated at the time of writing, so the output you obtain from running the following Python code might differ from what is shown here.

Let's read the dataset from the URL into a pandas dataframe. We have prepared another [short tutorial](https://soardeepsci.com/python-pandas-read_csv-from-url/) that provides more details on how to read a CSV file from a URL.

In [1]:
import pandas as pd

In [2]:
#link to the csv file
csv_url = 'https://covid.ourworldindata.org/data/owid-covid-data.csv'

#reading the csv file into pandas dataframe from the URL
df = pd.read_csv(csv_url)

#displaying the dataframe
df

Unnamed: 0,iso_code,continent,location,date,total_cases,new_cases,new_cases_smoothed,total_deaths,new_deaths,new_deaths_smoothed,...,female_smokers,male_smokers,handwashing_facilities,hospital_beds_per_thousand,life_expectancy,human_development_index,excess_mortality_cumulative_absolute,excess_mortality_cumulative,excess_mortality,excess_mortality_cumulative_per_million
0,AFG,Asia,Afghanistan,2020-02-24,5.0,5.0,,,,,...,,,37.746,0.5,64.83,0.511,,,,
1,AFG,Asia,Afghanistan,2020-02-25,5.0,0.0,,,,,...,,,37.746,0.5,64.83,0.511,,,,
2,AFG,Asia,Afghanistan,2020-02-26,5.0,0.0,,,,,...,,,37.746,0.5,64.83,0.511,,,,
3,AFG,Asia,Afghanistan,2020-02-27,5.0,0.0,,,,,...,,,37.746,0.5,64.83,0.511,,,,
4,AFG,Asia,Afghanistan,2020-02-28,5.0,0.0,,,,,...,,,37.746,0.5,64.83,0.511,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
136988,ZWE,Africa,Zimbabwe,2021-11-26,133836.0,62.0,34.714,4704.0,0.0,0.714,...,1.6,30.7,36.791,1.7,61.49,0.571,,,,
136989,ZWE,Africa,Zimbabwe,2021-11-27,133836.0,0.0,31.571,4704.0,0.0,0.714,...,1.6,30.7,36.791,1.7,61.49,0.571,,,,
136990,ZWE,Africa,Zimbabwe,2021-11-28,133951.0,115.0,43.429,4705.0,1.0,0.857,...,1.6,30.7,36.791,1.7,61.49,0.571,,,,
136991,ZWE,Africa,Zimbabwe,2021-11-29,134226.0,275.0,78.857,4706.0,1.0,1.000,...,1.6,30.7,36.791,1.7,61.49,0.571,,,,


This COVID-19 dataset is a collection of COVID-19 data, updated daily throughout the pandemic. We have prepared a separate set of tutorials that study this dataset extensively, the first part of which can be accessed [here](https://soardeepsci.com/a-practical-tutorial-on-data-science-part-1-python-pandas-dataframe/). We don't delve into the dataset here. For the purpose of this tutorial, we only want to see how to group the dataset based on the labels of one or multiple columns.

Let's use pandas _groupby_ to group the dataframe based on _location_:

In [3]:
#grouping pandas dataframe by column location using pandas groupby
df_grouped = df.groupby('location')

Grouping a pandas dataframe returns an object of type **DataFrameGroupBy**, as shown below:

In [4]:
type(df_grouped)

pandas.core.groupby.generic.DataFrameGroupBy
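To get a feel for what this object holds, it can help to iterate over it. The following is a minimal sketch on a small hypothetical dataframe (the name `toy` and its values are made up for illustration and are not part of the COVID-19 dataset). Iterating over a grouped object yields pairs of a group key and the sub-dataframe belonging to that key:

```python
import pandas as pd

# Hypothetical toy data standing in for the COVID-19 dataset
toy = pd.DataFrame({
    'location': ['A', 'A', 'B', 'B', 'B'],
    'new_cases': [5.0, 0.0, 2.0, 3.0, 1.0],
})

toy_grouped = toy.groupby('location')

# Each iteration yields (group key, sub-dataframe of that group)
group_shapes = {}
for key, sub_df in toy_grouped:
    group_shapes[key] = sub_df.shape
    print(key, sub_df.shape)
```

Nothing is computed per group until we iterate or call a method such as first, size, or describe on the grouped object.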

## (Optional reading) What attributes does the DataFrameGroupBy object have?

The following Python code snippet shows how to list the callable attributes of an object in Python. It uses a list comprehension to examine each attribute of the object, and adds the attribute name to a list if the attribute is callable and its name does not start with '_':

In [5]:
object_callable_att = [att_name for att_name in dir(df_grouped)
                  if callable(getattr(df_grouped, att_name)) and not att_name.startswith('_')]
print(object_callable_att)

['agg', 'aggregate', 'all', 'any', 'apply', 'backfill', 'bfill', 'boxplot', 'corr', 'corrwith', 'count', 'cov', 'cumcount', 'cummax', 'cummin', 'cumprod', 'cumsum', 'describe', 'diff', 'expanding', 'ffill', 'fillna', 'filter', 'first', 'get_group', 'head', 'hist', 'idxmax', 'idxmin', 'last', 'mad', 'max', 'mean', 'median', 'min', 'ngroup', 'nth', 'nunique', 'ohlc', 'pad', 'pct_change', 'pipe', 'plot', 'prod', 'quantile', 'rank', 'resample', 'rolling', 'sample', 'sem', 'shift', 'size', 'skew', 'std', 'sum', 'tail', 'take', 'transform', 'tshift', 'var']


And here is the Python code snippet to list only those non-callable attributes of the object that are not column labels of the dataframe and do not start with '_':

In [6]:
object_noncallable_att = [att_name for att_name in dir(df_grouped)
                  if (not callable(getattr(df_grouped, att_name))) and
                              (not att_name.startswith('_')) and (att_name not in df.columns)]
print(object_noncallable_att)

['dtypes', 'groups', 'indices', 'ndim', 'ngroups']


As you can see, there are many methods and variables one can use based on their needs and application. In the following, we discuss some of the commonly used attributes of DataFrameGroupBy object.

## Python grouping pandas | attributes of DataFrameGroupBy object generated by pandas groupby

Let's look at some of the useful attributes of DataFrameGroupBy object generated by pandas groupby. 

We can use _ngroups_ to get the number of groups generated by pandas groupby:

In [7]:
df_grouped.ngroups

237
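As a quick sanity check, the number of groups should match the number of distinct non-missing values in the grouping column, and the _groups_ attribute maps each group key to the row labels of that group. Here is a minimal sketch on a hypothetical toy dataframe (the name `toy` and its values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical toy data, not the COVID-19 dataset
toy = pd.DataFrame({
    'location': ['A', 'A', 'B', 'B', 'B'],
    'new_cases': [5.0, 0.0, 2.0, 3.0, 1.0],
})
toy_grouped = toy.groupby('location')

# Number of groups vs. number of distinct values in the grouping column
n_groups = toy_grouped.ngroups
n_unique = toy['location'].nunique()

# Mapping from group key to the row labels that belong to the group
group_rows = {key: list(idx) for key, idx in toy_grouped.groups.items()}
print(n_groups, n_unique, group_rows)
```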

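We can use the method _first_ to obtain a snapshot of the dataframe with one row per group generated by pandas groupby. Note that _first_ returns the first non-null entry of each column within a group, so a row of the result is not necessarily a single record of the original dataframe. In the case of our COVID-19 dataset, the **date** is that of the first daily data point recorded for each location, so it may differ from one location to another, while other columns may come from later dates. For example, total_deaths for Afghanistan is 1.0 below, although it was missing on 2020-02-24.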

In [8]:
df_grouped.first()

location,iso_code,continent,date,total_cases,new_cases,new_cases_smoothed,total_deaths,new_deaths,new_deaths_smoothed,total_cases_per_million,...,female_smokers,male_smokers,handwashing_facilities,hospital_beds_per_thousand,life_expectancy,human_development_index,excess_mortality_cumulative_absolute,excess_mortality_cumulative,excess_mortality,excess_mortality_cumulative_per_million
Afghanistan,AFG,Asia,2020-02-24,5.0,5.0,0.714,1.0,1.0,0.000,0.126,...,,,37.746,0.500,64.83,0.511,,,,
Africa,OWID_AFR,,2020-02-13,1.0,0.0,0.143,1.0,0.0,0.000,0.001,...,,,,,,,,,,
Albania,ALB,Europe,2020-02-25,2.0,2.0,5.429,1.0,1.0,0.143,0.696,...,7.100,51.200,,2.890,78.57,0.795,-190.8,-4.34,2.88,-66.412942
Algeria,DZA,Africa,2020-02-25,1.0,1.0,0.143,1.0,1.0,0.000,0.022,...,0.700,30.400,83.741,1.900,76.88,0.748,,,,
Andorra,AND,Europe,2020-03-02,1.0,1.0,0.143,1.0,1.0,0.000,12.928,...,29.000,37.800,,,83.73,0.868,16.0,16.84,23.46,206.841275
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Wallis and Futuna,WLF,Oceania,2021-03-23,,,,,,,,...,,,,,79.94,,,,,
World,OWID_WRL,,2020-01-22,557.0,0.0,717.286,17.0,0.0,16.286,0.071,...,6.434,34.635,60.130,2.705,72.58,0.737,,,,
Yemen,YEM,Asia,2020-04-10,1.0,1.0,0.143,2.0,2.0,0.000,0.033,...,7.600,29.200,49.542,0.700,66.12,0.470,,,,
Zambia,ZMB,Africa,2020-03-18,2.0,2.0,0.429,1.0,1.0,0.000,0.106,...,3.100,24.700,13.938,2.000,63.89,0.584,,,,


Similarly, we can use the method _last_ to obtain the last non-null entry of each column within each group generated by pandas groupby:

In [9]:
df_grouped.last()

location,iso_code,continent,date,total_cases,new_cases,new_cases_smoothed,total_deaths,new_deaths,new_deaths_smoothed,total_cases_per_million,...,female_smokers,male_smokers,handwashing_facilities,hospital_beds_per_thousand,life_expectancy,human_development_index,excess_mortality_cumulative_absolute,excess_mortality_cumulative,excess_mortality,excess_mortality_cumulative_per_million
Afghanistan,AFG,Asia,2021-11-30,157289.0,29.0,39.143,7308.0,0.0,0.429,3948.470,...,,,37.746,0.500,64.83,0.511,,,,
Africa,OWID_AFR,,2021-11-30,8652562.0,8181.0,6245.857,222881.0,143.0,140.000,6299.707,...,,,,,,,,,,
Albania,ALB,Europe,2021-11-30,199945.0,195.0,396.857,3096.0,4.0,6.143,69596.099,...,7.100,51.200,,2.890,78.57,0.795,10923.4,28.63,43.06,3802.175755
Algeria,DZA,Africa,2021-11-30,210531.0,187.0,178.286,6071.0,7.0,5.857,4718.667,...,0.700,30.400,83.741,1.900,76.88,0.748,,,,
Andorra,AND,Europe,2021-11-30,17115.0,403.0,110.429,131.0,0.0,0.143,221255.527,...,29.000,37.800,,,83.73,0.868,89.6,27.20,31.41,1158.311141
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Wallis and Futuna,WLF,Oceania,2021-11-30,,,,,,,,...,,,,,79.94,,,,,
World,OWID_WRL,,2021-11-30,262806448.0,624741.0,573083.714,5215558.0,7971.0,7019.571,33372.393,...,6.434,34.635,60.130,2.705,72.58,0.737,,,,
Yemen,YEM,Asia,2021-11-30,10004.0,9.0,5.286,1950.0,1.0,1.143,328.101,...,7.600,29.200,49.542,0.700,66.12,0.470,,,,
Zambia,ZMB,Africa,2021-11-30,210169.0,19.0,11.286,3667.0,0.0,0.000,11107.912,...,3.100,24.700,13.938,2.000,63.89,0.584,,,,


We can use the method **nth** to obtain the _nth_ row (counting from zero) from each group generated by pandas groupby. As a quick tip, nth(0) and nth(-1) return the actual first and last rows of each group, missing values included, whereas the first and last methods skip missing values column by column.

In [10]:
#row at position 5 (the 6th row) from each group, since positions start at 0
df_grouped.nth(5)

location,iso_code,continent,date,total_cases,new_cases,new_cases_smoothed,total_deaths,new_deaths,new_deaths_smoothed,total_cases_per_million,...,female_smokers,male_smokers,handwashing_facilities,hospital_beds_per_thousand,life_expectancy,human_development_index,excess_mortality_cumulative_absolute,excess_mortality_cumulative,excess_mortality,excess_mortality_cumulative_per_million
Afghanistan,AFG,Asia,2020-02-29,5.0,0.0,0.714,,,0.000,0.126,...,,,37.746,0.500,64.83,0.511,,,,
Africa,OWID_AFR,,2020-02-18,1.0,0.0,,,0.0,,0.001,...,,,,,,,,,,
Albania,ALB,Europe,2020-03-01,,,,,,,,...,7.100,51.200,,2.890,78.57,0.795,,,,
Algeria,DZA,Africa,2020-03-01,1.0,0.0,0.143,,,0.000,0.022,...,0.700,30.400,83.741,1.900,76.88,0.748,,,,
Andorra,AND,Europe,2020-03-07,1.0,0.0,0.143,,,0.000,12.928,...,29.000,37.800,,,83.73,0.868,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Wallis and Futuna,WLF,Oceania,2021-03-28,,,,,,,,...,,,,,79.94,,,,,
World,OWID_WRL,,2020-01-27,2927.0,809.0,,82.0,26.0,,0.372,...,6.434,34.635,60.130,2.705,72.58,0.737,,,,
Yemen,YEM,Asia,2020-04-15,1.0,0.0,0.143,,,0.000,0.033,...,7.600,29.200,49.542,0.700,66.12,0.470,,,,
Zambia,ZMB,Africa,2020-03-23,3.0,0.0,0.429,,,0.000,0.159,...,3.100,24.700,13.938,2.000,63.89,0.584,,,,
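To make the difference between _first_ and position-based selection concrete, here is a minimal sketch on a hypothetical toy dataframe (the name `toy` and its values are assumptions for illustration). It uses head(1) for the position-based selection because the return shape of nth has changed between pandas versions, whereas head(1) consistently returns the first row of each group with its original index:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: group 'A' has missing values in its first two rows
toy = pd.DataFrame({
    'location': ['A', 'A', 'A', 'B', 'B'],
    'date': ['2020-01-01', '2020-01-02', '2020-01-03',
             '2020-01-01', '2020-01-02'],
    'total_deaths': [np.nan, np.nan, 1.0, 4.0, 5.0],
})
toy_grouped = toy.groupby('location')

# first(): first non-null value of each column, per group
first_values = toy_grouped.first()

# head(1): the actual first row of each group, NaN included
first_rows = toy_grouped.head(1)

print(first_values)
print(first_rows)
```

For group 'A', first() pairs the date of the first row with the total_deaths value of the third row, while head(1) keeps the missing value.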


We can find the size of each group generated by pandas groupby using the method size:

In [11]:
df_grouped.size()

location
Afghanistan          646
Africa               657
Albania              645
Algeria              645
Andorra              639
                    ... 
Wallis and Futuna    253
World                679
Yemen                600
Zambia               623
Zimbabwe             621
Length: 237, dtype: int64
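Note that size counts every row in a group, including rows with missing values. If you need the number of non-missing values in a column instead, the method count can be used. The sketch below contrasts the two on a hypothetical toy dataframe (the name `toy` and its values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with some missing new_cases values
toy = pd.DataFrame({
    'location': ['A', 'A', 'A', 'B', 'B'],
    'new_cases': [5.0, np.nan, 2.0, np.nan, np.nan],
})
toy_grouped = toy.groupby('location')

# size(): number of rows per group, missing values included
rows_per_group = toy_grouped.size()

# count(): number of non-null values per group in the selected column
values_per_group = toy_grouped['new_cases'].count()

print(rows_per_group)
print(values_per_group)
```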

Another useful method of DataFrameGroupBy object generated by pandas groupby is get_group, which returns all the records of a given group. For example, to obtain all the rows related to the **World** group, we can use the following Python code snippet. As a side note, the group **World** has the COVID-19 historical data across the whole world.

In [12]:
df_grouped.get_group('World')

Unnamed: 0,iso_code,continent,date,total_cases,new_cases,new_cases_smoothed,total_deaths,new_deaths,new_deaths_smoothed,total_cases_per_million,...,female_smokers,male_smokers,handwashing_facilities,hospital_beds_per_thousand,life_expectancy,human_development_index,excess_mortality_cumulative_absolute,excess_mortality_cumulative,excess_mortality,excess_mortality_cumulative_per_million
134470,OWID_WRL,,2020-01-22,557.0,0.0,,17.0,0.0,,0.071,...,6.434,34.635,60.13,2.705,72.58,0.737,,,,
134471,OWID_WRL,,2020-01-23,655.0,98.0,,18.0,1.0,,0.083,...,6.434,34.635,60.13,2.705,72.58,0.737,,,,
134472,OWID_WRL,,2020-01-24,941.0,286.0,,26.0,8.0,,0.119,...,6.434,34.635,60.13,2.705,72.58,0.737,,,,
134473,OWID_WRL,,2020-01-25,1434.0,493.0,,42.0,16.0,,0.182,...,6.434,34.635,60.13,2.705,72.58,0.737,,,,
134474,OWID_WRL,,2020-01-26,2118.0,684.0,,56.0,14.0,,0.269,...,6.434,34.635,60.13,2.705,72.58,0.737,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
135144,OWID_WRL,,2021-11-26,260660212.0,596892.0,564384.571,5189414.0,6672.0,6937.714,33099.854,...,6.434,34.635,60.13,2.705,72.58,0.737,,,,
135145,OWID_WRL,,2021-11-27,261110719.0,450507.0,560647.143,5195170.0,5756.0,6908.714,33157.061,...,6.434,34.635,60.13,2.705,72.58,0.737,,,,
135146,OWID_WRL,,2021-11-28,261503794.0,393075.0,560210.286,5199817.0,4647.0,6942.714,33206.975,...,6.434,34.635,60.13,2.705,72.58,0.737,,,,
135147,OWID_WRL,,2021-11-29,262181707.0,677913.0,567821.143,5207587.0,7770.0,7018.857,33293.060,...,6.434,34.635,60.13,2.705,72.58,0.737,,,,
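For a single group, get_group selects the same rows as a boolean mask on the grouping column. The following minimal sketch checks this on a hypothetical toy dataframe (the name `toy` and its values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical toy data, not the COVID-19 dataset
toy = pd.DataFrame({
    'location': ['World', 'A', 'World', 'B'],
    'new_cases': [10.0, 1.0, 20.0, 2.0],
})

# Rows of the 'World' group, selected in two different ways
via_get_group = toy.groupby('location').get_group('World')
via_mask = toy[toy['location'] == 'World']

# Both selections keep the original row labels and values
same_rows = via_get_group['new_cases'].equals(via_mask['new_cases'])
print(same_rows)
```

The mask is handy for a one-off lookup, whereas get_group is convenient when the grouped object already exists and several groups are inspected in turn.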


You can use the method describe to get some useful statistical data about each column for each group generated by pandas groupby. For example, to get statistics of new daily cases for each group:

In [13]:
df_grouped['new_cases'].describe()

location,count,mean,std,min,25%,50%,75%,max
Afghanistan,646.0,243.481424,411.031909,-2.0,28.0,78.0,234.75,3243.0
Africa,657.0,13141.515982,10823.804758,0.0,6146.0,10273.0,16922.00,49021.0
Albania,632.0,316.368671,326.462369,0.0,28.0,155.0,567.00,1239.0
Algeria,645.0,326.404651,299.389439,0.0,135.0,219.0,413.00,1927.0
Andorra,639.0,26.784038,42.224029,0.0,0.0,12.0,40.00,403.0
...,...,...,...,...,...,...,...,...
Wallis and Futuna,0.0,,,,,,,
World,679.0,386168.020619,234312.096834,0.0,206580.0,406888.0,560810.00,908289.0
Yemen,600.0,16.673333,23.823431,-1.0,1.0,6.5,24.00,174.0
Zambia,623.0,337.349920,594.645537,0.0,20.0,78.0,324.00,3594.0
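When only a few statistics are needed, the method agg (listed among the callable attributes above) accepts a list of aggregation names. Here is a minimal sketch on a hypothetical toy dataframe (the name `toy` and its values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical toy data, not the COVID-19 dataset
toy = pd.DataFrame({
    'location': ['A', 'A', 'B', 'B', 'B'],
    'new_cases': [5.0, 1.0, 2.0, 3.0, 7.0],
})

# One row per group, one column per requested statistic
summary = toy.groupby('location')['new_cases'].agg(['mean', 'max'])
print(summary)
```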


# Python grouping pandas | groupby multiple columns 

Sometimes, it is insightful to group a pandas dataframe by multiple columns. In the case of our dataset, let's say we want to group the dataframe by two columns: 1. _continent_, and 2. _location_. Here is how we accomplish the task using pandas groupby, followed by displaying the first non-null entries of each group using the method first:

In [14]:
df_multi_grouped = df.groupby(['continent', 'location'])
df_multi_grouped.first()

continent,location,iso_code,date,total_cases,new_cases,new_cases_smoothed,total_deaths,new_deaths,new_deaths_smoothed,total_cases_per_million,new_cases_per_million,...,female_smokers,male_smokers,handwashing_facilities,hospital_beds_per_thousand,life_expectancy,human_development_index,excess_mortality_cumulative_absolute,excess_mortality_cumulative,excess_mortality,excess_mortality_cumulative_per_million
Africa,Algeria,DZA,2020-02-25,1.0,1.0,0.143,1.0,1.0,0.000,0.022,0.022,...,0.7,30.4,83.741,1.9,76.88,0.748,,,,
Africa,Angola,AGO,2020-03-20,1.0,1.0,0.429,2.0,2.0,0.000,0.029,0.029,...,,,26.664,,61.15,0.581,,,,
Africa,Benin,BEN,2020-03-16,1.0,1.0,0.286,1.0,1.0,0.000,0.080,0.080,...,0.6,12.3,11.035,0.5,61.77,0.545,,,,
Africa,Botswana,BWA,2020-03-30,3.0,3.0,0.571,1.0,1.0,0.143,1.251,1.251,...,5.7,34.4,,1.8,69.59,0.735,,,,
Africa,Burkina Faso,BFA,2020-03-10,1.0,1.0,0.429,1.0,1.0,0.000,0.047,0.047,...,1.6,23.9,11.877,0.4,61.58,0.452,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
South America,Paraguay,PRY,2020-03-07,1.0,1.0,1.000,1.0,1.0,0.000,0.139,0.139,...,5.0,21.6,79.602,1.3,74.25,0.728,704.7,8.78,10.63,97.608732
South America,Peru,PER,2020-01-01,1.0,1.0,1.571,1.0,1.0,0.286,0.030,0.030,...,4.8,,,1.6,76.74,0.777,165.1,7.36,7.36,4.949128
South America,Suriname,SUR,2020-03-14,1.0,1.0,0.143,1.0,1.0,0.000,1.690,1.690,...,7.4,42.9,67.779,3.1,71.68,0.738,,,,
South America,Uruguay,URY,2020-03-13,4.0,4.0,11.286,1.0,1.0,0.000,1.148,1.148,...,14.0,19.9,,2.8,77.91,0.817,-121.4,-1.58,-1.90,-34.833488


A couple of important notes:

<font color='green'>When grouping pandas by multiple columns, each group is identified by a tuple of values, one per grouping column, and results such as the output of first are indexed by a MultiIndex. For the example shown here, the two index levels are 'continent' and 'location'.</font>

<font color='blue'>('Africa', 'Algeria'), ('Africa', 'Angola'), etc. refer to distinct groups, although they share the continent value 'Africa', which pandas displays only once in its rendered output for clarity.</font>
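The notes above can be sketched on a small hypothetical dataframe (the name `toy` and its values are assumptions for illustration). The sketch also shows a side effect worth knowing: by default, groupby leaves out rows whose grouping key is missing, which is why aggregate entries with an empty continent, such as World, do not show up when grouping by continent:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the 'World' row has no continent, as in the dataset
toy = pd.DataFrame({
    'continent': ['Africa', 'Africa', 'Africa', 'Asia', np.nan],
    'location': ['Algeria', 'Algeria', 'Angola', 'Yemen', 'World'],
    'new_cases': [1.0, 0.0, 1.0, 1.0, 100.0],
})

# Sum of new_cases per (continent, location) group
totals = toy.groupby(['continent', 'location'])['new_cases'].sum()

# The result is indexed by a MultiIndex with levels continent and location
is_multi = isinstance(totals.index, pd.MultiIndex)

# A full tuple selects one group; a partial key selects a whole continent
algeria_total = totals.loc[('Africa', 'Algeria')]
africa_totals = totals.loc['Africa']

# Locations that made it into the result (rows with a missing key are dropped)
grouped_locations = list(totals.index.get_level_values('location'))
print(totals)
```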

Now, you can obtain the snapshot of the dataframe for a given tuple of index values, _e.g._ ('Africa', 'Algeria'):

In [15]:
df_multi_grouped.get_group(('Africa', 'Algeria'))

Unnamed: 0,iso_code,date,total_cases,new_cases,new_cases_smoothed,total_deaths,new_deaths,new_deaths_smoothed,total_cases_per_million,new_cases_per_million,...,female_smokers,male_smokers,handwashing_facilities,hospital_beds_per_thousand,life_expectancy,human_development_index,excess_mortality_cumulative_absolute,excess_mortality_cumulative,excess_mortality,excess_mortality_cumulative_per_million
1948,DZA,2020-02-25,1.0,1.0,,,,,0.022,0.022,...,0.7,30.4,83.741,1.9,76.88,0.748,,,,
1949,DZA,2020-02-26,1.0,0.0,,,,,0.022,0.000,...,0.7,30.4,83.741,1.9,76.88,0.748,,,,
1950,DZA,2020-02-27,1.0,0.0,,,,,0.022,0.000,...,0.7,30.4,83.741,1.9,76.88,0.748,,,,
1951,DZA,2020-02-28,1.0,0.0,,,,,0.022,0.000,...,0.7,30.4,83.741,1.9,76.88,0.748,,,,
1952,DZA,2020-02-29,1.0,0.0,,,,,0.022,0.000,...,0.7,30.4,83.741,1.9,76.88,0.748,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2588,DZA,2021-11-26,209817.0,193.0,160.286,6046.0,5.0,4.429,4702.664,4.326,...,0.7,30.4,83.741,1.9,76.88,0.748,,,,
2589,DZA,2021-11-27,209980.0,163.0,163.000,6052.0,6.0,5.000,4706.317,3.653,...,0.7,30.4,83.741,1.9,76.88,0.748,,,,
2590,DZA,2021-11-28,210152.0,172.0,171.429,6058.0,6.0,5.286,4710.172,3.855,...,0.7,30.4,83.741,1.9,76.88,0.748,,,,
2591,DZA,2021-11-29,210344.0,192.0,176.143,6064.0,6.0,5.429,4714.476,4.303,...,0.7,30.4,83.741,1.9,76.88,0.748,,,,


# Python grouping pandas | groupby time and place using pandas groupby

As another example, let's try pandas grouping by two columns: 1. _date_ and 2. _continent_. We can use the pandas groupby as shown below:

In [16]:
df_grouped_cont_date = df.groupby(['date', 'continent'])

Now, you can obtain all the data for the group ('2021-01-01', 'Asia'):

In [17]:
df_grouped_cont_date.get_group(('2021-01-01', 'Asia'))

Unnamed: 0,iso_code,continent,location,date,total_cases,new_cases,new_cases_smoothed,total_deaths,new_deaths,new_deaths_smoothed,...,female_smokers,male_smokers,handwashing_facilities,hospital_beds_per_thousand,life_expectancy,human_development_index,excess_mortality_cumulative_absolute,excess_mortality_cumulative,excess_mortality,excess_mortality_cumulative_per_million
312,AFG,Asia,Afghanistan,2021-01-01,52513.0,183.0,131.143,2201.0,12.0,9.429,...,,,37.746,0.5,64.83,0.511,,,,
5786,ARM,Asia,Armenia,2021-01-01,159738.0,329.0,425.0,2828.0,5.0,13.571,...,1.5,52.1,94.043,4.2,75.09,0.776,,,,
8673,AZE,Asia,Azerbaijan,2021-01-01,219041.0,341.0,1039.571,2670.0,29.0,36.286,...,0.3,42.5,83.241,4.7,73.0,0.756,,,,
9945,BHR,Asia,Bahrain,2021-01-01,92913.0,238.0,229.857,352.0,0.0,0.143,...,5.8,37.6,,2.0,77.29,0.852,,,,
10584,BGD,Asia,Bangladesh,2021-01-01,514500.0,990.0,1033.571,7576.0,17.0,25.429,...,1.0,44.7,34.808,0.8,72.59,0.632,,,,
14694,BTN,Asia,Bhutan,2021-01-01,689.0,19.0,16.143,,,0.0,...,,,79.807,1.7,71.78,0.654,,,,
18186,BRN,Asia,Brunei,2021-01-01,157.0,0.0,0.714,3.0,0.0,0.0,...,2.0,30.9,,2.7,75.86,0.838,,,,
20734,KHM,Asia,Cambodia,2021-01-01,379.0,1.0,2.286,,,0.0,...,2.0,33.7,66.229,0.8,69.82,0.594,,,,
25580,CHN,Asia,China,2021-01-01,87135.0,18.0,22.286,4634.0,0.0,0.0,...,1.9,48.4,,4.34,76.91,0.761,,,,
45844,GEO,Asia,Georgia,2021-01-01,228410.0,990.0,1383.714,2528.0,23.0,30.714,...,5.3,55.5,,2.6,73.77,0.812,,,,
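Grouping by time and place becomes more useful once an aggregation is applied. As a minimal sketch on a hypothetical toy dataframe (the name `toy` and its values are assumptions for illustration), the following sums new_cases per date and continent, then uses unstack to pivot the continent level into columns:

```python
import pandas as pd

# Hypothetical toy data; dates are kept as strings, as read from the CSV file
toy = pd.DataFrame({
    'date': ['2021-01-01', '2021-01-01', '2021-01-01',
             '2021-01-02', '2021-01-02'],
    'continent': ['Asia', 'Asia', 'Europe', 'Asia', 'Europe'],
    'new_cases': [10.0, 5.0, 7.0, 3.0, 4.0],
})

# Total new cases for each (date, continent) group
daily = toy.groupby(['date', 'continent'])['new_cases'].sum()

# One row per date, one column per continent
wide = daily.unstack('continent')
print(wide)
```

The wide layout is one possible choice for plotting or comparing continents side by side. The long, MultiIndex form in `daily` holds the same numbers.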


# Final remarks

In this article, we covered the concept of grouping pandas and its helpful features. 
Hopefully, this tutorial was able to help you with the Python basics, 
particularly pandas groupby and pandas groupby multiple columns. 
Feel free to check out the rest of our articles from [https://soardeepsci.com/blog/](https://soardeepsci.com/blog/).