# PANDAS - GROUP THE DATA

In this lesson, we will learn how to group data in a DataFrame and perform operations on the groups. First, we will split the data into groups, then iterate through the groups, then view them, and finally perform aggregation operations on them. Let us see what we will cover:

Example 1: Split the object and combine the result

Example 2: Iterate the Group

Example 3: View the Group

Example 4: Perform Aggregation Operations on Groups

# 1. PANDAS - SPLIT THE OBJECT AND COMBINE THE RESULT

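The groupby() method is used in Pandas to split an object such as a DataFrame into groups. Rows that share the same value in the grouping column (or columns) are placed in the same group, and an operation can then be applied to each group and the results combined.

In the example below, we group by the players column and take the first entry of each group: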

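In [None]:
import pandas as pd

data = {
    'players': ['Dhoni', 'devi', 'sunrikumar', 'Jda'],
    'Years': [2014, 1997, 2020, 2001],
    'Rank': [1, 3, 2, 4]
}
df = pd.DataFrame(data)

print('DataFrame print', df)

# Split the rows into groups keyed by the players column
dff = df.groupby('players')

# Combine: first() returns the first entry of each group
print(dff.first())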


In [None]:
import pandas as pd

data = {
    'players': ['Dhoni', 'devi', 'sunrikumar', 'Jda'],
    'Years': [2014, 1997, 2020, 2001],
    'Rank': [1, 3, 2, 4]
}

df = pd.DataFrame(data)
print('DataFrame print', df)

dff = df.groupby('players').first()
print(dff)
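
Because every player appears only once above, each group holds a single row. The following sketch makes the split and combine steps easier to see by adding a Team column, which is a hypothetical column introduced only for this illustration and not part of the lesson's data.

```python
import pandas as pd

df = pd.DataFrame({
    'players': ['Dhoni', 'devi', 'sunrikumar', 'Jda'],
    'Team': ['A', 'B', 'A', 'B'],  # hypothetical column, added for illustration
    'Rank': [1, 3, 2, 4],
})

# Split: rows are assigned to groups by their Team value.
grouped = df.groupby('Team')

# Apply and combine: first() takes the first non-null value of each
# column within each group and returns one row per group key.
first_rows = grouped.first()
print(first_rows)
```

Here the result has one row for team A and one for team B, indexed by the group key.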


# 2. PANDAS – ITERATE THE GROUP

Use a for-in loop over the result of groupby() to iterate through the groups. Each iteration yields the group name and the rows that belong to that group. In the example below, we iterate through the groups of the players column one by one:

In [None]:
import pandas as pd

data = {
    'players': ['Dhoni', 'devi', 'sunrikumar', 'Jda'],
    'Years': [2014, 1997, 2020, 2001],
    'Rank': [1, 3, 2, 4]
}

df = pd.DataFrame(data)
print('DataFrame print', df)

dff = df.groupby('players')

# Each iteration yields the group key and a DataFrame with that group's rows
for name, group in dff:
    print(name)
    print(group)

In [None]:
import pandas as pd

data = {
    'players': ['Dhoni', 'devi', 'sunrikumar', 'Jda'],
    'Years': [2014, 1997, 2020, 2001],
    'Rank': [1, 3, 2, 4]
}

df = pd.DataFrame(data)
print('DataFrame print', df)

dff = df.groupby('players').first()
print(dff)

for name, group in df.groupby('players'):
    print(name)
    print(group)
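
As a small variation on the loop above, the sketch below iterates over groups that contain more than one row and records how many rows each group has. The Team column and the group_sizes dictionary are hypothetical names used only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'players': ['Dhoni', 'devi', 'sunrikumar', 'Jda'],
    'Team': ['A', 'B', 'A', 'B'],  # hypothetical column, added for illustration
    'Rank': [1, 3, 2, 4],
})

group_sizes = {}
for name, group in df.groupby('Team'):
    # name is the group key; group is a DataFrame holding that key's rows
    print(name)
    print(group)
    group_sizes[name] = len(group)

print(group_sizes)
```

Each group arrives as an ordinary DataFrame, so any DataFrame operation can be used inside the loop body.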
    

# 3. PANDAS - VIEW THE GROUP

Use the groups property of the groupby() result to view the groups. It maps each group name to the index labels of the rows in that group.

In [33]:
import pandas as pd

data = {
    'players': ['Dhoni', 'devi','sunrikumar', 'Jda'],
    'Years': [2014, 1997, 2020, 2001],
    'Rank': [1, 3, 2, 4]
}

df = pd.DataFrame(data)
print('DataFrame print', df)

g = df.groupby('players').groups
print(g)

DataFrame print       players  Years  Rank
0       Dhoni   2014     1
1        devi   1997     3
2  sunrikumar   2020     2
3         Jda   2001     4
{'Dhoni': [0], 'Jda': [3], 'devi': [1], 'sunrikumar': [2]}
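
When a group key is shared by several rows, the groups mapping lists all of their index labels. The sketch below shows this with a Team column, which is a hypothetical column added for illustration; row_labels is likewise an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'players': ['Dhoni', 'devi', 'sunrikumar', 'Jda'],
    'Team': ['A', 'B', 'A', 'B'],  # hypothetical column, added for illustration
    'Rank': [1, 3, 2, 4],
})

# groups maps each group key to the index labels of its rows
groups = df.groupby('Team').groups

# Convert each set of index labels to a plain list for easier reading
row_labels = {key: list(labels) for key, labels in groups.items()}
print(row_labels)
```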


# PANDAS - PERFORM AGGREGATION OPERATIONS ON GROUPS

After grouping, we can perform operations on the grouped data using the agg() method. With this method, we can get the mean of each group, the size of each group, and so on.



Example 4: Get the mean of the grouped data

Example 5: Get the size of each group

# 4. PANDAS – GET THE MEAN OF GROUPED DATA

To get the mean of the grouped data, first group the data and then pass numpy.mean to the agg() method.

Let us see an example:

In [38]:
import pandas as pd
import numpy as np

data = {
    'players': ['Dhoni', 'devi','sunrikumar', 'Jda'],
    'Years': [2014, 1997, 2020, 2001],
    'Rank': [1, 3, 2, 4],
    'points': [101, 104, 103, 102]
}

df = pd.DataFrame(data)
print('DataFrame print', df)

dff = df.groupby('Years')

print(dff['points'].agg(np.mean))





    

DataFrame print       players  Years  Rank  points
0       Dhoni   2014     1     101
1        devi   1997     3     104
2  sunrikumar   2020     2     103
3         Jda   2001     4     102
Years
1997    104.0
2001    102.0
2014    101.0
2020    103.0
Name: points, dtype: float64
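
Since every year occurs only once in the data above, each group mean equals the single value in that group. The sketch below averages over groups with two rows each, using a hypothetical Team column added for illustration. It also passes the string 'mean' to agg() instead of np.mean, on the assumption that you are running a recent pandas release, where passing NumPy functions to agg() may produce a FutureWarning.

```python
import pandas as pd

df = pd.DataFrame({
    'players': ['Dhoni', 'devi', 'sunrikumar', 'Jda'],
    'Team': ['A', 'B', 'A', 'B'],  # hypothetical column, added for illustration
    'points': [101, 104, 103, 102],
})

# Mean of points per team, using the string alias for the aggregation
mean_points = df.groupby('Team')['points'].agg('mean')
print(mean_points)

# Several aggregations at once: one result column per aggregation name
summary = df.groupby('Team')['points'].agg(['mean', 'max'])
print(summary)
```

Team A averages 101 and 103, and team B averages 104 and 102.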




# 5. PANDAS - GET THE SIZE OF EACH GROUP (AGGREGATION)

To get the size of each group, group the data with groupby() and then pass NumPy's size function to the agg() method. In this example, we group by the players column.
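
A minimal sketch of counting rows per group is shown below. It uses the built-in size() method of the groupby() result as an alternative to passing a size function to agg(), and it groups by a hypothetical Team column (added for illustration) so that the groups have different sizes.

```python
import pandas as pd

df = pd.DataFrame({
    'players': ['Dhoni', 'devi', 'sunrikumar', 'Jda'],
    'Team': ['A', 'B', 'A', 'A'],  # hypothetical column, added for illustration
    'points': [101, 104, 103, 102],
})

# size() returns the number of rows in each group as a Series
sizes = df.groupby('Team').size()
print(sizes)
```

Grouping by the players column instead would give a size of 1 for every group, because each player appears once in this data.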