# **Article 116 : GroupBy Object** [![Static Badge](https://img.shields.io/badge/Open%20in%20Colab%20-%20orange?style=plastic&logo=googlecolab&labelColor=grey)](https://colab.research.google.com/github/sshrizvi/DS-Python/blob/main/Pandas/Notebooks/116_groupby_object.ipynb)

|🔴 **NOTE** 🔴|
|:-----------:|
|This notebook contains the practical implementations of the concepts discussed in the following article.|
| Here is Article 116 - [GroupBy Object](../Articles/116_groupby_object.md) |

### 📦 **Importing Relevant Libraries**

In [1]:
import numpy as np
import pandas as pd

### ⚠️ **Data Warning**
The dataset used in this notebook, `imdb-top-1000.csv`, is in the [Resources](../Resources/) folder.

#### **Reading Data into DataFrames**

In [2]:
movies_df = pd.read_csv(
    filepath_or_buffer = '../Resources/Data/imdb-top-1000.csv'
)

In [3]:
movies_df

,Series_Title,Released_Year,Runtime,Genre,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
0,The Shawshank Redemption,1994,142,Drama,9.3,Frank Darabont,Tim Robbins,2343110,28341469.0,80.0
1,The Godfather,1972,175,Crime,9.2,Francis Ford Coppola,Marlon Brando,1620367,134966411.0,100.0
2,The Dark Knight,2008,152,Action,9.0,Christopher Nolan,Christian Bale,2303232,534858444.0,84.0
3,The Godfather: Part II,1974,202,Crime,9.0,Francis Ford Coppola,Al Pacino,1129952,57300000.0,90.0
4,12 Angry Men,1957,96,Crime,9.0,Sidney Lumet,Henry Fonda,689845,4360000.0,96.0
...,...,...,...,...,...,...,...,...,...,...
995,Breakfast at Tiffany's,1961,115,Comedy,7.6,Blake Edwards,Audrey Hepburn,166544,679874270.0,76.0
996,Giant,1956,201,Drama,7.6,George Stevens,Elizabeth Taylor,34075,195217415.0,84.0
997,From Here to Eternity,1953,118,Drama,7.6,Fred Zinnemann,Burt Lancaster,43374,30500000.0,85.0
998,Lifeboat,1944,97,Drama,7.6,Alfred Hitchcock,Tallulah Bankhead,26471,852142728.0,78.0


### 🚀 **Creating a GroupBy Object**

In [13]:
groupby_genre = movies_df.groupby(
    by = 'Genre'
)
groupby_genre

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x000000000FEF2C90>

In [12]:
groupby_genre.sum(
    numeric_only = True
)

Genre,Runtime,IMDB_Rating,No_of_Votes,Gross,Metascore
Action,22196,1367.3,72282412,32632260000.0,10499.0
Adventure,9656,571.5,22576163,9496922000.0,5020.0
Animation,8166,650.3,21978630,14631470000.0,6082.0
Biography,11970,698.6,24006844,8276358000.0,6023.0
Comedy,17380,1224.7,27620327,15663870000.0,9840.0
Crime,13524,857.8,33533615,8452632000.0,6706.0
Drama,36049,2299.7,61367304,35409970000.0,19208.0
Family,215,15.6,551221,439110600.0,158.0
Fantasy,170,16.0,146222,782726700.0,0.0
Film-Noir,312,23.9,367215,125910500.0,287.0
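
The GroupBy object created above only describes how the rows are split; the per-group work happens when an aggregation such as `sum()` is called. A minimal sketch of that behaviour, assuming a tiny made-up frame (`toy_df` and its values are illustrative, not taken from the IMDB data):

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the behaviour.
toy_df = pd.DataFrame({
    'Genre': ['Drama', 'Crime', 'Drama', 'Action'],
    'Runtime': [142, 175, 120, 152],
})

toy_groups = toy_df.groupby(by='Genre')

# At this point we only hold a grouping description, not a result.
print(type(toy_groups).__name__)
print(toy_groups.ngroups)  # number of distinct genres

# The aggregation is computed when a method such as sum() is called.
runtime_totals = toy_groups['Runtime'].sum()
print(runtime_totals)
```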


In [24]:
groupby_genre.groups

{'Action': [2, 5, 8, 10, 13, 14, 16, 29, 30, 31, 39, 42, 44, 55, 57, 59, 60, 63, 68, 72, 106, 109, 129, 130, 134, 140, 142, 144, 152, 155, 160, 161, 166, 168, 171, 172, 177, 181, 194, 201, 202, 216, 217, 223, 224, 236, 241, 262, 275, 294, 308, 320, 325, 326, 331, 337, 339, 340, 343, 345, 348, 351, 353, 356, 357, 362, 368, 369, 375, 376, 390, 410, 431, 436, 473, 477, 479, 482, 488, 493, 496, 502, 507, 511, 532, 535, 540, 543, 564, 569, 570, 573, 577, 582, 583, 602, 605, 608, 615, 623, ...], 'Adventure': [21, 47, 93, 110, 114, 116, 118, 137, 178, 179, 191, 193, 209, 226, 231, 247, 267, 273, 281, 300, 301, 304, 306, 323, 329, 361, 366, 377, 402, 406, 415, 426, 458, 470, 497, 498, 506, 513, 514, 537, 549, 552, 553, 566, 576, 604, 609, 618, 638, 647, 675, 681, 686, 692, 711, 713, 739, 755, 781, 797, 798, 851, 873, 884, 912, 919, 947, 957, 964, 966, 984, 991], 'Animation': [23, 43, 46, 56, 58, 61, 66, 70, 101, 135, 146, 151, 158, 170, 197, 205, 211, 213, 219, 229, 230, 242, 245, 246, 270, 33

In [21]:
groupby_genre.get_group(
    name = 'Action'
)

,Series_Title,Released_Year,Runtime,Genre,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
2,The Dark Knight,2008,152,Action,9.0,Christopher Nolan,Christian Bale,2303232,534858444.0,84.0
5,The Lord of the Rings: The Return of the King,2003,201,Action,8.9,Peter Jackson,Elijah Wood,1642758,377845905.0,94.0
8,Inception,2010,148,Action,8.8,Christopher Nolan,Leonardo DiCaprio,2067042,292576195.0,74.0
10,The Lord of the Rings: The Fellowship of the Ring,2001,178,Action,8.8,Peter Jackson,Elijah Wood,1661481,315544750.0,92.0
13,The Lord of the Rings: The Two Towers,2002,179,Action,8.7,Peter Jackson,Elijah Wood,1485555,342551365.0,87.0
...,...,...,...,...,...,...,...,...,...,...
968,Falling Down,1993,113,Action,7.6,Joel Schumacher,Michael Douglas,171640,40903593.0,56.0
979,Lethal Weapon,1987,109,Action,7.6,Richard Donner,Mel Gibson,236894,65207127.0,68.0
982,Mad Max 2,1981,96,Action,7.6,George Miller,Mel Gibson,166588,12465371.0,77.0
983,The Warriors,1979,92,Action,7.6,Walter Hill,Michael Beck,93878,22490039.0,65.0
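
The two cells above are related: `groups` maps each key to the index labels of its rows, and `get_group()` returns those rows. A small sketch on made-up data (the titles and genres below are placeholders, not rows from the dataset):

```python
import pandas as pd

# Hypothetical toy data.
toy_df = pd.DataFrame({
    'Series_Title': ['Movie A', 'Movie B', 'Movie C', 'Movie D'],
    'Genre': ['Drama', 'Crime', 'Drama', 'Crime'],
})
toy_groups = toy_df.groupby('Genre')

# Index labels of the rows that belong to the 'Drama' group.
drama_labels = list(toy_groups.groups['Drama'])
print(drama_labels)

# get_group() should return the same rows as a boolean filter on the key column.
via_get_group = toy_groups.get_group('Drama')
via_filter = toy_df[toy_df['Genre'] == 'Drama']
print(via_get_group.equals(via_filter))
```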


### 🚀 **Questions on GroupBy Object**

#### 🎯 **Q1. Find the TOP 3 Genres based on total earnings.**

In [26]:
movies_df.groupby(
    by = 'Genre'
)['Gross'].sum().sort_values(
    ascending = False
).head(3)

Genre
Drama     3.540997e+10
Action    3.263226e+10
Comedy    1.566387e+10
Name: Gross, dtype: float64
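
As an alternative to `sort_values(...).head(3)`, `Series.nlargest(3)` expresses the same "top 3" intent in one call. A sketch on made-up numbers (the `Gross` values are invented for illustration):

```python
import pandas as pd

# Hypothetical toy data with distinct totals per genre.
toy_df = pd.DataFrame({
    'Genre': ['Drama', 'Crime', 'Drama', 'Action', 'Comedy'],
    'Gross': [10.0, 40.0, 25.0, 30.0, 5.0],
})

gross_by_genre = toy_df.groupby('Genre')['Gross'].sum()

# Approach used in the notebook: sort, then take the head.
top3_sorted = gross_by_genre.sort_values(ascending=False).head(3)

# Equivalent shortcut when there are no ties at the cut-off.
top3_nlargest = gross_by_genre.nlargest(3)
print(top3_nlargest)
```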

#### 🎯 **Q2. Find the Genre with the highest average IMDB Rating.**

In [28]:
movies_df.groupby(
    by = 'Genre'
)['IMDB_Rating'].mean().sort_values(
    ascending = False
).head(3)

Genre
Western    8.350000
Crime      8.016822
Fantasy    8.000000
Name: IMDB_Rating, dtype: float64
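
The cell above lists the top three genres; if only the single best genre is needed, `idxmax()` on the per-genre means returns its label directly. A sketch with invented ratings:

```python
import pandas as pd

# Hypothetical toy data.
toy_df = pd.DataFrame({
    'Genre': ['Drama', 'Crime', 'Drama', 'Crime', 'Action'],
    'IMDB_Rating': [9.3, 9.2, 8.0, 9.0, 9.0],
})

mean_rating = toy_df.groupby('Genre')['IMDB_Rating'].mean()

# Label of the genre whose mean rating is the largest.
best_genre = mean_rating.idxmax()
print(best_genre)
```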

#### 🎯 **Q3. Find the most popular Director.**
**Note :** Consider `No_of_Votes` the measure of popularity for a director.

In [39]:
movies_df.groupby(
    by = 'Director'
)['No_of_Votes'].sum().sort_values(
    ascending = False
).head(1)

Director
Christopher Nolan    11578345
Name: No_of_Votes, dtype: int64

#### 🎯 **Q4. Find the highest-rated movie of each Genre.**

In [51]:
def top_rated_movie(group):
    return group.sort_values(
        by = 'IMDB_Rating',
        ascending = False
    ).head(1)['Series_Title']

movies_df.groupby(
    by = 'Genre'
)[['Series_Title', 'IMDB_Rating']].apply(
    func = top_rated_movie
)

Genre         
Action     2                      The Dark Knight
Adventure  21                        Interstellar
Animation  23       Sen to Chihiro no kamikakushi
Biography  7                     Schindler's List
Comedy     19                        Gisaengchung
Crime      1                        The Godfather
Drama      0             The Shawshank Redemption
Family     688         E.T. the Extra-Terrestrial
Fantasy    321       Das Cabinet des Dr. Caligari
Film-Noir  309                      The Third Man
Horror     49                              Psycho
Mystery    69                             Memento
Thriller   700                    Wait Until Dark
Western    12     Il buono, il brutto, il cattivo
Name: Series_Title, dtype: object
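
The same question can be answered without `apply()`: `idxmax()` gives the row label of the highest rating in each group, and `loc` then fetches those rows. This is a sketch on placeholder data; when ratings tie, `idxmax()` keeps the first occurrence, which may differ from what an unstable sort returns.

```python
import pandas as pd

# Hypothetical toy data.
toy_df = pd.DataFrame({
    'Series_Title': ['Movie A', 'Movie B', 'Movie C', 'Movie D'],
    'Genre': ['Drama', 'Crime', 'Drama', 'Crime'],
    'IMDB_Rating': [9.3, 9.2, 8.8, 9.0],
})

# Row label of the top-rated movie in each genre.
top_idx = toy_df.groupby('Genre')['IMDB_Rating'].idxmax()

# Look those rows up and index the result by genre.
top_movies = toy_df.loc[top_idx].set_index('Genre')[['Series_Title', 'IMDB_Rating']]
print(top_movies)
```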

#### 🎯 **Q5. Find the number of movies done by each actor.**

In [54]:
movies_df.groupby(
    by = 'Star1'
)['Star1'].value_counts()

Star1
Aamir Khan              7
Aaron Taylor-Johnson    1
Abhay Deol              1
Abraham Attah           1
Adam Driver             1
                       ..
Zbigniew Zamachowski    1
Zooey Deschanel         1
Çetin Tekindor          1
Éric Toledano           1
Ömer Faruk Sorak        1
Name: count, Length: 660, dtype: int64
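
Grouping by `Star1` and then counting values of `Star1` works, but the same counts can be obtained more directly. A sketch comparing two shorter spellings on invented actor names:

```python
import pandas as pd

# Hypothetical toy data.
toy_df = pd.DataFrame({
    'Star1': ['Actor X', 'Actor Y', 'Actor X', 'Actor Z', 'Actor X'],
})

# Rows per actor, sorted by actor name.
counts_size = toy_df.groupby('Star1').size()

# Rows per actor, sorted by count (most frequent first).
counts_vc = toy_df['Star1'].value_counts()

print(counts_size)
print(counts_vc)
```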

### 🚀 **Attributes and Methods of DataFrameGroupBy**

#### 📦 **1. len(DataFrameGroupBy)**

In [59]:
len(groupby_genre)

14

#### 📦 **2. DataFrameGroupBy.size()**

In [60]:
groupby_genre.size()

Genre
Action       172
Adventure     72
Animation     82
Biography     88
Comedy       155
Crime        107
Drama        289
Family         2
Fantasy        2
Film-Noir      3
Horror        11
Mystery       12
Thriller       1
Western        4
dtype: int64
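
`size()` counts rows, including rows with missing values, whereas `count()` counts only non-null entries per column. The sketch below uses a made-up frame in which one group has no `Metascore` values, similar to the Fantasy genre in this dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the Fantasy group has no Metascore at all.
toy_df = pd.DataFrame({
    'Genre': ['Fantasy', 'Fantasy', 'Drama'],
    'Metascore': [np.nan, np.nan, 80.0],
})
toy_groups = toy_df.groupby('Genre')

row_counts = toy_groups.size()                 # rows per group
non_null = toy_groups['Metascore'].count()     # non-null values per group

print(row_counts)
print(non_null)
```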

#### 📦 **3. DataFrameGroupBy.first()**

In [61]:
groupby_genre.first()

Genre,Series_Title,Released_Year,Runtime,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
Action,The Dark Knight,2008,152,9.0,Christopher Nolan,Christian Bale,2303232,534858444.0,84.0
Adventure,Interstellar,2014,169,8.6,Christopher Nolan,Matthew McConaughey,1512360,188020017.0,74.0
Animation,Sen to Chihiro no kamikakushi,2001,125,8.6,Hayao Miyazaki,Daveigh Chase,651376,10055859.0,96.0
Biography,Schindler's List,1993,195,8.9,Steven Spielberg,Liam Neeson,1213505,96898818.0,94.0
Comedy,Gisaengchung,2019,132,8.6,Bong Joon Ho,Kang-ho Song,552778,53367844.0,96.0
Crime,The Godfather,1972,175,9.2,Francis Ford Coppola,Marlon Brando,1620367,134966411.0,100.0
Drama,The Shawshank Redemption,1994,142,9.3,Frank Darabont,Tim Robbins,2343110,28341469.0,80.0
Family,E.T. the Extra-Terrestrial,1982,115,7.8,Steven Spielberg,Henry Thomas,372490,435110554.0,91.0
Fantasy,Das Cabinet des Dr. Caligari,1920,76,8.1,Robert Wiene,Werner Krauss,57428,337574718.0,
Film-Noir,The Third Man,1949,104,8.1,Carol Reed,Orson Welles,158731,449191.0,97.0


In [62]:
groupby_genre.first(
    numeric_only = True
)

Genre,Runtime,IMDB_Rating,No_of_Votes,Gross,Metascore
Action,152,9.0,2303232,534858444.0,84.0
Adventure,169,8.6,1512360,188020017.0,74.0
Animation,125,8.6,651376,10055859.0,96.0
Biography,195,8.9,1213505,96898818.0,94.0
Comedy,132,8.6,552778,53367844.0,96.0
Crime,175,9.2,1620367,134966411.0,100.0
Drama,142,9.3,2343110,28341469.0,80.0
Family,115,7.8,372490,435110554.0,91.0
Fantasy,76,8.1,57428,337574718.0,
Film-Noir,104,8.1,158731,449191.0,97.0


#### 📦 **4. DataFrameGroupBy.last()**

In [63]:
groupby_genre.last()

Genre,Series_Title,Released_Year,Runtime,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
Action,Escape from Alcatraz,1979,112,7.6,Don Siegel,Clint Eastwood,121731,43000000.0,76.0
Adventure,Kelly's Heroes,1970,144,7.6,Brian G. Hutton,Clint Eastwood,45338,1378435.0,50.0
Animation,The Jungle Book,1967,78,7.6,Wolfgang Reitherman,Phil Harris,166409,141843612.0,65.0
Biography,Midnight Express,1978,121,7.6,Alan Parker,Brad Davis,73662,35000000.0,59.0
Comedy,Breakfast at Tiffany's,1961,115,7.6,Blake Edwards,Audrey Hepburn,166544,679874270.0,76.0
Crime,The 39 Steps,1935,86,7.6,Alfred Hitchcock,Robert Donat,51853,302787539.0,93.0
Drama,Lifeboat,1944,97,7.6,Alfred Hitchcock,Tallulah Bankhead,26471,852142728.0,78.0
Family,Willy Wonka & the Chocolate Factory,1971,100,7.8,Mel Stuart,Gene Wilder,178731,4000000.0,67.0
Fantasy,Nosferatu,1922,94,7.9,F.W. Murnau,Max Schreck,88794,445151978.0,
Film-Noir,Shadow of a Doubt,1943,108,7.8,Alfred Hitchcock,Teresa Wright,59556,123353292.0,94.0


In [64]:
groupby_genre.last(
    numeric_only = True
)

Genre,Runtime,IMDB_Rating,No_of_Votes,Gross,Metascore
Action,112,7.6,121731,43000000.0,76.0
Adventure,144,7.6,45338,1378435.0,50.0
Animation,78,7.6,166409,141843612.0,65.0
Biography,121,7.6,73662,35000000.0,59.0
Comedy,115,7.6,166544,679874270.0,76.0
Crime,86,7.6,51853,302787539.0,93.0
Drama,97,7.6,26471,852142728.0,78.0
Family,100,7.8,178731,4000000.0,67.0
Fantasy,94,7.9,88794,445151978.0,
Film-Noir,108,7.8,59556,123353292.0,94.0


#### 📦 **5. DataFrameGroupBy.nth**

In [65]:
groupby_genre.nth(
    n = 4
)

,Series_Title,Released_Year,Runtime,Genre,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
13,The Lord of the Rings: The Two Towers,2002,179,Action,8.7,Peter Jackson,Elijah Wood,1485555,342551365.0,87.0
20,Soorarai Pottru,2020,153,Drama,8.6,Sudha Kongara,Suriya,54995,556832648.0,
22,Cidade de Deus,2002,130,Crime,8.6,Fernando Meirelles,Kátia Lund,699256,7563397.0,79.0
38,The Pianist,2002,150,Biography,8.5,Roman Polanski,Adrien Brody,729603,32572577.0,85.0
58,Spider-Man: Into the Spider-Verse,2018,117,Animation,8.4,Bob Persichetti,Peter Ramsey,375110,190241310.0,87.0
64,3 Idiots,2009,170,Comedy,8.4,Rajkumar Hirani,Aamir Khan,344445,6532908.0,67.0
114,2001: A Space Odyssey,1968,149,Adventure,8.3,Stanley Kubrick,Keir Dullea,603517,56954992.0,84.0
220,Kahaani,2012,122,Mystery,8.1,Sujoy Ghosh,Vidya Balan,57806,1035953.0,
544,Night of the Living Dead,1968,96,Horror,7.9,George A. Romero,Duane Jones,116557,89029.0,89.0


In [68]:
groupby_genre.nth[0:5]

,Series_Title,Released_Year,Runtime,Genre,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
0,The Shawshank Redemption,1994,142,Drama,9.3,Frank Darabont,Tim Robbins,2343110,28341469.0,80.0
1,The Godfather,1972,175,Crime,9.2,Francis Ford Coppola,Marlon Brando,1620367,134966411.0,100.0
2,The Dark Knight,2008,152,Action,9.0,Christopher Nolan,Christian Bale,2303232,534858444.0,84.0
3,The Godfather: Part II,1974,202,Crime,9.0,Francis Ford Coppola,Al Pacino,1129952,57300000.0,90.0
4,12 Angry Men,1957,96,Crime,9.0,Sidney Lumet,Henry Fonda,689845,4360000.0,96.0
5,The Lord of the Rings: The Return of the King,2003,201,Action,8.9,Peter Jackson,Elijah Wood,1642758,377845905.0,94.0
6,Pulp Fiction,1994,154,Crime,8.9,Quentin Tarantino,John Travolta,1826188,107928762.0,94.0
7,Schindler's List,1993,195,Biography,8.9,Steven Spielberg,Liam Neeson,1213505,96898818.0,94.0
8,Inception,2010,148,Action,8.8,Christopher Nolan,Leonardo DiCaprio,2067042,292576195.0,74.0
9,Fight Club,1999,139,Drama,8.8,David Fincher,Brad Pitt,1854740,37030102.0,66.0
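
One difference between `first()` and `nth(0)` is worth keeping in mind: `first()` returns the first non-null value of each column, while `nth(0)` returns the row at position 0 of each group, missing values included. A sketch on invented data; the exact shape of the `nth` result depends on the pandas version, so only the values are compared here:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the first Drama row has no Metascore.
toy_df = pd.DataFrame({
    'Genre': ['Drama', 'Drama', 'Crime'],
    'Metascore': [np.nan, 80.0, 100.0],
})
toy_groups = toy_df.groupby('Genre')

# first() skips the NaN and reports 80.0 for Drama.
first_non_null = toy_groups['Metascore'].first()

# nth(0) keeps the positional first row, so Drama still shows NaN.
nth_rows = toy_groups.nth(0)

print(first_non_null)
print(nth_rows)
```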


#### 📦 **6. DataFrameGroupBy.get_group()**

In [69]:
groupby_genre.get_group(
    name = 'Horror'
)

,Series_Title,Released_Year,Runtime,Genre,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
49,Psycho,1960,109,Horror,8.5,Alfred Hitchcock,Anthony Perkins,604211,32000000.0,97.0
75,Alien,1979,117,Horror,8.4,Ridley Scott,Sigourney Weaver,787806,78900000.0,89.0
271,The Thing,1982,109,Horror,8.1,John Carpenter,Kurt Russell,371271,13782838.0,57.0
419,The Exorcist,1973,122,Horror,8.0,William Friedkin,Ellen Burstyn,362393,232906145.0,81.0
544,Night of the Living Dead,1968,96,Horror,7.9,George A. Romero,Duane Jones,116557,89029.0,89.0
707,The Innocents,1961,100,Horror,7.8,Jack Clayton,Deborah Kerr,27007,2616000.0,88.0
724,Get Out,2017,104,Horror,7.7,Jordan Peele,Daniel Kaluuya,492851,176040665.0,85.0
844,Halloween,1978,91,Horror,7.7,John Carpenter,Donald Pleasence,233106,47000000.0,87.0
876,The Invisible Man,1933,71,Horror,7.7,James Whale,Claude Rains,30683,298791505.0,87.0
932,Saw,2004,103,Horror,7.6,James Wan,Cary Elwes,379020,56000369.0,46.0


#### 📦 **7. DataFrameGroupBy.groups**

In [70]:
groupby_genre.groups

{'Action': [2, 5, 8, 10, 13, 14, 16, 29, 30, 31, 39, 42, 44, 55, 57, 59, 60, 63, 68, 72, 106, 109, 129, 130, 134, 140, 142, 144, 152, 155, 160, 161, 166, 168, 171, 172, 177, 181, 194, 201, 202, 216, 217, 223, 224, 236, 241, 262, 275, 294, 308, 320, 325, 326, 331, 337, 339, 340, 343, 345, 348, 351, 353, 356, 357, 362, 368, 369, 375, 376, 390, 410, 431, 436, 473, 477, 479, 482, 488, 493, 496, 502, 507, 511, 532, 535, 540, 543, 564, 569, 570, 573, 577, 582, 583, 602, 605, 608, 615, 623, ...], 'Adventure': [21, 47, 93, 110, 114, 116, 118, 137, 178, 179, 191, 193, 209, 226, 231, 247, 267, 273, 281, 300, 301, 304, 306, 323, 329, 361, 366, 377, 402, 406, 415, 426, 458, 470, 497, 498, 506, 513, 514, 537, 549, 552, 553, 566, 576, 604, 609, 618, 638, 647, 675, 681, 686, 692, 711, 713, 739, 755, 781, 797, 798, 851, 873, 884, 912, 919, 947, 957, 964, 966, 984, 991], 'Animation': [23, 43, 46, 56, 58, 61, 66, 70, 101, 135, 146, 151, 158, 170, 197, 205, 211, 213, 219, 229, 230, 242, 245, 246, 270, 33

#### 📦 **8. DataFrameGroupBy.describe()**

In [71]:
groupby_genre.describe()

,Runtime,Runtime,Runtime,Runtime,Runtime,Runtime,Runtime,Runtime,IMDB_Rating,IMDB_Rating,...,Gross,Gross,Metascore,Metascore,Metascore,Metascore,Metascore,Metascore,Metascore,Metascore
Genre,count,mean,std,min,25%,50%,75%,max,count,mean,...,75%,max,count,mean,std,min,25%,50%,75%,max
Action,172.0,129.046512,28.500706,45.0,110.75,127.5,143.25,321.0,172.0,7.949419,...,267443700.0,936662225.0,143.0,73.41958,12.421252,33.0,65.0,74.0,82.0,98.0
Adventure,72.0,134.111111,33.31732,88.0,109.0,127.0,149.0,228.0,72.0,7.9375,...,199807000.0,874211619.0,64.0,78.4375,12.345393,41.0,69.75,80.5,87.25,100.0
Animation,82.0,99.585366,14.530471,71.0,90.0,99.5,106.75,137.0,82.0,7.930488,...,252061200.0,873839108.0,75.0,81.093333,8.813646,61.0,75.0,82.0,87.5,96.0
Biography,88.0,136.022727,25.514466,93.0,120.0,129.0,146.25,209.0,88.0,7.938636,...,98299240.0,753585104.0,79.0,76.240506,11.028187,48.0,70.5,76.0,84.5,97.0
Comedy,155.0,112.129032,22.946213,68.0,96.0,106.0,124.5,188.0,155.0,7.90129,...,81078090.0,886752933.0,125.0,78.72,11.82916,45.0,72.0,79.0,88.0,99.0
Crime,107.0,126.392523,27.689231,80.0,106.5,122.0,141.5,229.0,107.0,8.016822,...,71021630.0,790482117.0,87.0,77.08046,13.099102,47.0,69.5,77.0,87.0,100.0
Drama,289.0,124.737024,27.74049,64.0,105.0,121.0,137.0,242.0,289.0,7.957439,...,116446100.0,924558264.0,241.0,79.701245,12.744687,28.0,72.0,82.0,89.0,100.0
Family,2.0,107.5,10.606602,100.0,103.75,107.5,111.25,115.0,2.0,7.8,...,327332900.0,435110554.0,2.0,79.0,16.970563,67.0,73.0,79.0,85.0,91.0
Fantasy,2.0,85.0,12.727922,76.0,80.5,85.0,89.5,94.0,2.0,8.0,...,418257700.0,445151978.0,0.0,,,,,,,
Film-Noir,3.0,104.0,4.0,100.0,102.0,104.0,106.0,108.0,3.0,7.966667,...,62730680.0,123353292.0,3.0,95.666667,1.527525,94.0,95.0,96.0,96.5,97.0
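
The result of `describe()` has two-level column labels (column name, then statistic), so a single column or a single statistic can be selected from it. A sketch on made-up data:

```python
import pandas as pd

# Hypothetical toy data.
toy_df = pd.DataFrame({
    'Genre': ['Drama', 'Drama', 'Crime'],
    'Runtime': [142, 120, 175],
    'IMDB_Rating': [9.3, 8.8, 9.2],
})

stats = toy_df.groupby('Genre').describe()

# All statistics for one column.
runtime_stats = stats['Runtime']

# One statistic for one column, selected with a (column, statistic) tuple.
runtime_mean = stats[('Runtime', 'mean')]

print(runtime_stats)
print(runtime_mean)
```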


#### 📦 **9. DataFrameGroupBy.sample()**

In [75]:
groupby_genre.sample(
    n = 2,
    replace = True
)

,Series_Title,Released_Year,Runtime,Genre,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
641,Gongdong gyeongbi guyeok JSA,2000,110,Action,7.8,Chan-wook Park,Lee Yeong-ae,26518,147161249.0,58.0
308,White Heat,1949,114,Action,8.1,Raoul Walsh,James Cagney,29807,323145090.0,
781,Harry Potter and the Goblet of Fire,2005,157,Adventure,7.7,Mike Newell,Daniel Radcliffe,548619,290013036.0,81.0
304,The Bridge on the River Kwai,1957,161,Adventure,8.1,David Lean,William Holden,203463,44908000.0,87.0
23,Sen to Chihiro no kamikakushi,2001,125,Animation,8.6,Hayao Miyazaki,Daveigh Chase,651376,10055859.0,96.0
170,Tonari no Totoro,1988,86,Animation,8.2,Hayao Miyazaki,Hitoshi Takagi,291180,1105564.0,86.0
411,Gandhi,1982,191,Biography,8.0,Richard Attenborough,Ben Kingsley,217664,52767889.0,79.0
484,The Irishman,2019,209,Biography,7.9,Martin Scorsese,Robert De Niro,324720,7000000.0,94.0
937,The Station Agent,2003,89,Comedy,7.6,Tom McCarthy,Peter Dinklage,67370,5739376.0,81.0
806,As Good as It Gets,1997,139,Comedy,7.7,James L. Brooks,Jack Nicholson,275755,148478011.0,67.0
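
`replace = True` matters here because some genres (Thriller, for example) contain fewer than two movies. The sketch below, on invented data, shows that sampling without replacement fails for such a group and that `random_state` makes the draw repeatable:

```python
import pandas as pd

# Hypothetical toy data: the Thriller group has a single row.
toy_df = pd.DataFrame({
    'Genre': ['Drama', 'Drama', 'Drama', 'Thriller'],
    'Series_Title': ['Movie A', 'Movie B', 'Movie C', 'Movie D'],
})
toy_groups = toy_df.groupby('Genre')

# With replacement, every group can supply two rows.
sample_1 = toy_groups.sample(n=2, replace=True, random_state=0)
sample_2 = toy_groups.sample(n=2, replace=True, random_state=0)
print(sample_1.equals(sample_2))  # same seed, same draw

# Without replacement, a one-row group cannot supply two rows.
try:
    toy_groups.sample(n=2)
    raised = False
except ValueError:
    raised = True
print(raised)
```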


#### 📦 **10. DataFrameGroupBy.nunique()**

In [76]:
groupby_genre.nunique()

Genre,Series_Title,Released_Year,Runtime,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
Action,172,61,78,15,123,121,172,172,50
Adventure,72,49,58,10,59,59,72,72,33
Animation,82,35,41,11,51,77,82,82,29
Biography,88,44,56,13,76,72,88,88,40
Comedy,155,72,70,11,113,133,155,155,44
Crime,106,56,65,14,86,85,107,107,39
Drama,289,83,95,14,211,250,288,287,52
Family,2,2,2,1,2,2,2,2,2
Fantasy,2,2,2,2,2,2,2,2,0
Film-Noir,3,3,3,3,3,3,3,3,3


### 🚀 **The agg() Method of DataFrameGroupBy**

In [78]:
groupby_genre.agg(
    func = 'min'
)

Genre,Series_Title,Released_Year,Runtime,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
Action,300,1924,45,7.6,Abhishek Chaubey,Aamir Khan,25312,3296.0,33.0
Adventure,2001: A Space Odyssey,1925,88,7.6,Akira Kurosawa,Aamir Khan,29999,61001.0,41.0
Animation,Akira,1940,71,7.6,Adam Elliot,Adrian Molina,25229,128985.0,61.0
Biography,12 Years a Slave,1928,93,7.6,Adam McKay,Adrien Brody,27254,21877.0,48.0
Comedy,(500) Days of Summer,1921,68,7.6,Alejandro G. Iñárritu,Aamir Khan,26337,1305.0,45.0
Crime,12 Angry Men,1931,80,7.6,Akira Kurosawa,Ajay Devgn,27712,6013.0,47.0
Drama,1917,1925,64,7.6,Aamir Khan,Abhay Deol,25088,3600.0,28.0
Family,E.T. the Extra-Terrestrial,1971,100,7.8,Mel Stuart,Gene Wilder,178731,4000000.0,67.0
Fantasy,Das Cabinet des Dr. Caligari,1920,76,7.9,F.W. Murnau,Max Schreck,57428,337574718.0,
Film-Noir,Shadow of a Doubt,1941,100,7.8,Alfred Hitchcock,Humphrey Bogart,59556,449191.0,94.0


In [80]:
groupby_genre.agg(
    func = ['min', 'max']
)

,Series_Title,Series_Title,Released_Year,Released_Year,Runtime,Runtime,IMDB_Rating,IMDB_Rating,Director,Director,Star1,Star1,No_of_Votes,No_of_Votes,Gross,Gross,Metascore,Metascore
Genre,min,max,min,max,min,max,min,max,min,max,min,max,min,max,min,max,min,max
Action,300,Yôjinbô,1924,2019,45,321,7.6,9.0,Abhishek Chaubey,Zack Snyder,Aamir Khan,Yun-Fat Chow,25312,2303232,3296.0,936662225.0,33.0,98.0
Adventure,2001: A Space Odyssey,Zombieland,1925,PG,88,228,7.6,8.6,Akira Kurosawa,Ömer Faruk Sorak,Aamir Khan,Yves Montand,29999,1512360,61001.0,874211619.0,41.0,100.0
Animation,Akira,Ôkami kodomo no Ame to Yuki,1940,2020,71,137,7.6,8.6,Adam Elliot,Yoshifumi Kondô,Adrian Molina,Yôji Matsuda,25229,999790,128985.0,873839108.0,61.0,96.0
Biography,12 Years a Slave,Zerkalo,1928,2020,93,209,7.6,8.9,Adam McKay,Tom McCarthy,Adrien Brody,Éric Toledano,27254,1213505,21877.0,753585104.0,48.0,97.0
Comedy,(500) Days of Summer,Zindagi Na Milegi Dobara,1921,2020,68,188,7.6,8.6,Alejandro G. Iñárritu,Zoya Akhtar,Aamir Khan,Ömer Faruk Sorak,26337,939631,1305.0,886752933.0,45.0,99.0
Crime,12 Angry Men,À bout de souffle,1931,2019,80,229,7.6,9.2,Akira Kurosawa,Yavuz Turgul,Ajay Devgn,Vincent Cassel,27712,1826188,6013.0,790482117.0,47.0,100.0
Drama,1917,Zwartboek,1925,2020,64,242,7.6,9.3,Aamir Khan,Çagan Irmak,Abhay Deol,Çetin Tekindor,25088,2343110,3600.0,924558264.0,28.0,100.0
Family,E.T. the Extra-Terrestrial,Willy Wonka & the Chocolate Factory,1971,1982,100,115,7.8,7.8,Mel Stuart,Steven Spielberg,Gene Wilder,Henry Thomas,178731,372490,4000000.0,435110554.0,67.0,91.0
Fantasy,Das Cabinet des Dr. Caligari,Nosferatu,1920,1922,76,94,7.9,8.1,F.W. Murnau,Robert Wiene,Max Schreck,Werner Krauss,57428,88794,337574718.0,445151978.0,,
Film-Noir,Shadow of a Doubt,The Third Man,1941,1949,100,108,7.8,8.1,Alfred Hitchcock,John Huston,Humphrey Bogart,Teresa Wright,59556,158731,449191.0,123353292.0,94.0,97.0
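
In the output above, the maximum `Released_Year` for Adventure is `PG`, which suggests the column is stored as text and contains at least one non-year value, so `min` and `max` compare strings rather than numbers. One possible clean-up, sketched on a made-up Series, is `pd.to_numeric` with `errors='coerce'`; whether to coerce, fix, or drop such rows is a judgment call for the real data.

```python
import pandas as pd

# Hypothetical values mimicking a text column with one bad entry.
years = pd.Series(['1994', '1972', 'PG', '2008'], name='Released_Year')

# String comparison: letters sort after digits, so 'PG' wins.
text_max = years.max()

# Convert to numbers; entries that cannot be parsed become NaN.
numeric_years = pd.to_numeric(years, errors='coerce')
numeric_max = numeric_years.max()

print(text_max, numeric_max)
```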


In [84]:
groupby_genre.agg(
    func = {
        'Released_Year' : 'min',
        'Runtime' : 'mean',
        'IMDB_Rating' : ['min', 'max', 'mean'],
        'No_of_Votes' : 'sum',
        'Gross' : ['min', 'max', 'sum'],
        'Metascore' : 'mean'
    }
)

,Released_Year,Runtime,IMDB_Rating,IMDB_Rating,IMDB_Rating,No_of_Votes,Gross,Gross,Gross,Metascore
Genre,min,mean,min,max,mean,sum,min,max,sum,mean
Action,1924,129.046512,7.6,9.0,7.949419,72282412,3296.0,936662225.0,32632260000.0,73.41958
Adventure,1925,134.111111,7.6,8.6,7.9375,22576163,61001.0,874211619.0,9496922000.0,78.4375
Animation,1940,99.585366,7.6,8.6,7.930488,21978630,128985.0,873839108.0,14631470000.0,81.093333
Biography,1928,136.022727,7.6,8.9,7.938636,24006844,21877.0,753585104.0,8276358000.0,76.240506
Comedy,1921,112.129032,7.6,8.6,7.90129,27620327,1305.0,886752933.0,15663870000.0,78.72
Crime,1931,126.392523,7.6,9.2,8.016822,33533615,6013.0,790482117.0,8452632000.0,77.08046
Drama,1925,124.737024,7.6,9.3,7.957439,61367304,3600.0,924558264.0,35409970000.0,79.701245
Family,1971,107.5,7.8,7.8,7.8,551221,4000000.0,435110554.0,439110600.0,79.0
Fantasy,1920,85.0,7.9,8.1,8.0,146222,337574718.0,445151978.0,782726700.0,
Film-Noir,1941,104.0,7.8,8.1,7.966667,367215,449191.0,123353292.0,125910500.0,95.666667


In [86]:
groupby_genre.agg(
    func = ['min', 'max', 'sum']
)

,Series_Title,Series_Title,Series_Title,Released_Year,Released_Year,Released_Year,Runtime,Runtime,Runtime,IMDB_Rating,...,Star1,No_of_Votes,No_of_Votes,No_of_Votes,Gross,Gross,Gross,Metascore,Metascore,Metascore
Genre,min,max,sum,min,max,sum,min,max,sum,min,...,sum,min,max,sum,min,max,sum,min,max,sum
Action,300,Yôjinbô,The Dark KnightThe Lord of the Rings: The Retu...,1924,2019,2008200320102001200219991980197719621954200019...,45,321,22196,7.6,...,Christian BaleElijah WoodLeonardo DiCaprioElij...,25312,2303232,72282412,3296.0,936662225.0,32632260000.0,33.0,98.0,10499.0
Adventure,2001: A Space Odyssey,Zombieland,InterstellarBack to the FutureInglourious Bast...,1925,PG,2014198520091981196819621959201319751963194819...,88,228,9656,7.6,...,Matthew McConaugheyMichael J. FoxBrad PittJürg...,29999,1512360,22576163,61001.0,874211619.0,9496922000.0,41.0,100.0,5020.0
Animation,Akira,Ôkami kodomo no Ame to Yuki,Sen to Chihiro no kamikakushiThe Lion KingHota...,1940,2020,2001199419882016201820172008199719952019200920...,71,137,8166,7.6,...,Daveigh ChaseRob MinkoffTsutomu TatsumiRyûnosu...,25229,999790,21978630,128985.0,873839108.0,14631470000.0,61.0,96.0,6082.0
Biography,12 Years a Slave,Zerkalo,Schindler's ListGoodfellasHamiltonThe Intoucha...,1928,2020,1993199020202011200220171995198420182013201320...,93,209,11970,7.6,...,Liam NeesonRobert De NiroLin-Manuel MirandaÉri...,27254,1213505,24006844,21877.0,753585104.0,8276358000.0,48.0,97.0,6023.0
Comedy,(500) Days of Summer,Zindagi Na Milegi Dobara,GisaengchungLa vita è bellaModern TimesCity Li...,1921,2020,2019199719361931200919641940200120001973196019...,68,188,17380,7.6,...,Kang-ho SongRoberto BenigniCharles ChaplinChar...,26337,939631,27620327,1305.0,886752933.0,15663870000.0,45.0,99.0,9840.0
Crime,12 Angry Men,À bout de souffle,The GodfatherThe Godfather: Part II12 Angry Me...,1931,2019,1972197419571994200219991995199120192006199519...,80,229,13524,7.6,...,Marlon BrandoAl PacinoHenry FondaJohn Travolta...,27712,1826188,33533615,6013.0,790482117.0,8452632000.0,47.0,100.0,6706.0
Drama,1917,Zwartboek,The Shawshank RedemptionFight ClubForrest Gump...,1925,2020,1994199919941975202019981946201420061998198819...,64,242,36049,7.6,...,Tim RobbinsBrad PittTom HanksJack NicholsonSur...,25088,2343110,61367304,3600.0,924558264.0,35409970000.0,28.0,100.0,19208.0
Family,E.T. the Extra-Terrestrial,Willy Wonka & the Chocolate Factory,E.T. the Extra-TerrestrialWilly Wonka & the Ch...,1971,1982,19821971,100,115,215,7.8,...,Henry ThomasGene Wilder,178731,372490,551221,4000000.0,435110554.0,439110600.0,67.0,91.0,158.0
Fantasy,Das Cabinet des Dr. Caligari,Nosferatu,Das Cabinet des Dr. CaligariNosferatu,1920,1922,19201922,76,94,170,7.9,...,Werner KraussMax Schreck,57428,88794,146222,337574718.0,445151978.0,782726700.0,,,0.0
Film-Noir,Shadow of a Doubt,The Third Man,The Third ManThe Maltese FalconShadow of a Doubt,1941,1949,194919411943,100,108,312,7.8,...,Orson WellesHumphrey BogartTeresa Wright,59556,158731,367215,449191.0,123353292.0,125910500.0,94.0,97.0,287.0
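
Applying `sum` to every column concatenates the text columns, as the output above shows, and list or dict inputs produce two-level column labels. Named aggregation is one way to aggregate only chosen columns and get flat, readable column names. A sketch on invented data (the output names `min_rating`, `max_rating`, and `total_gross` are arbitrary choices):

```python
import pandas as pd

# Hypothetical toy data.
toy_df = pd.DataFrame({
    'Genre': ['Drama', 'Drama', 'Crime'],
    'IMDB_Rating': [9.3, 8.0, 9.2],
    'Gross': [10.0, 20.0, 5.0],
})

# Each keyword is an output column: (input column, aggregation).
summary = toy_df.groupby('Genre').agg(
    min_rating=('IMDB_Rating', 'min'),
    max_rating=('IMDB_Rating', 'max'),
    total_gross=('Gross', 'sum'),
)
print(summary)
```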


### 🚀 **Looping over a DataFrameGroupBy**

In [98]:
for name, data in groupby_genre:
    print(name, data)

Action                                           Series_Title Released_Year  Runtime  \
2                                      The Dark Knight          2008      152   
5        The Lord of the Rings: The Return of the King          2003      201   
8                                            Inception          2010      148   
10   The Lord of the Rings: The Fellowship of the Ring          2001      178   
13               The Lord of the Rings: The Two Towers          2002      179   
..                                                 ...           ...      ...   
968                                       Falling Down          1993      113   
979                                      Lethal Weapon          1987      109   
982                                          Mad Max 2          1981       96   
983                                       The Warriors          1979       92   
985                               Escape from Alcatraz          1979      112   

      Genre  IMDB_Ra
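
Each iteration yields a pair: the group key and the sub-DataFrame holding that group's rows. A compact sketch on made-up data that collects one value per group instead of printing whole frames:

```python
import pandas as pd

# Hypothetical toy data.
toy_df = pd.DataFrame({
    'Genre': ['Drama', 'Crime', 'Drama'],
    'Series_Title': ['Movie A', 'Movie B', 'Movie C'],
})

# genre is the group key, frame is the DataFrame for that group.
rows_per_genre = {
    genre: len(frame)
    for genre, frame in toy_df.groupby('Genre')
}
print(rows_per_genre)
```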

#### 🎯 **Q1. Find the highest-rated movie of each Genre.** *(Using Loops)*

In [97]:
data_list = []
for name, data in groupby_genre:
    # First row that holds the group's maximum IMDB rating
    top_row = data[data['IMDB_Rating'] == data['IMDB_Rating'].max()].iloc[0]
    data_list.append(
        [
            top_row['Series_Title'],
            top_row['IMDB_Rating'],
            top_row['Genre'],
        ]
    )
df = pd.DataFrame(
    data = data_list,
    columns = [
        'Series_Title',
        'IMDB_Rating',
        'Genre'
    ]
).set_index('Genre')
df

Genre,Series_Title,IMDB_Rating
Action,The Dark Knight,9.0
Adventure,Interstellar,8.6
Animation,Sen to Chihiro no kamikakushi,8.6
Biography,Schindler's List,8.9
Comedy,Gisaengchung,8.6
Crime,The Godfather,9.2
Drama,The Shawshank Redemption,9.3
Family,E.T. the Extra-Terrestrial,7.8
Fantasy,Das Cabinet des Dr. Caligari,8.1
Film-Noir,The Third Man,8.1


### 🚀 **The apply() Method**

#### 🎯 **Q1. Find the number of movies starting with 'A' for each group.**

In [107]:
def movies_starting_with_a(group):
    return group['Series_Title'].str.startswith('A').sum()

groupby_genre.apply(
    func = movies_starting_with_a,
    include_groups = False
)

Genre
Action       10
Adventure     2
Animation     2
Biography     9
Comedy       14
Crime         4
Drama        21
Family        0
Fantasy       0
Film-Noir     0
Horror        1
Mystery       0
Thriller      0
Western       0
dtype: int64
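
For a simple count like this one, `apply()` is not strictly required: the boolean mask can be built once and then grouped by the genre column. A sketch on placeholder titles:

```python
import pandas as pd

# Hypothetical toy data.
toy_df = pd.DataFrame({
    'Series_Title': ['Alpha', 'Avalanche', 'Beta', 'Aurora'],
    'Genre': ['Drama', 'Drama', 'Crime', 'Crime'],
})

# True where the title starts with 'A'.
starts_with_a = toy_df['Series_Title'].str.startswith('A')

# Summing booleans per genre counts the True values.
counts = starts_with_a.groupby(toy_df['Genre']).sum()
print(counts)
```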

#### 🎯 **Q2. Find the ranking of each movie within its group according to IMDB Rating.**

In [108]:
def rank_movies(group):
    group['Genre_Ranking'] = group['IMDB_Rating'].rank(
        method = 'first',
        ascending = False
    )
    return group

groupby_genre.apply(
    func = rank_movies,
    include_groups = False
)

Genre,,Series_Title,Released_Year,Runtime,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore,Genre_Ranking
Action,2,The Dark Knight,2008,152,9.0,Christopher Nolan,Christian Bale,2303232,534858444.0,84.0,1.0
Action,5,The Lord of the Rings: The Return of the King,2003,201,8.9,Peter Jackson,Elijah Wood,1642758,377845905.0,94.0,2.0
Action,8,Inception,2010,148,8.8,Christopher Nolan,Leonardo DiCaprio,2067042,292576195.0,74.0,3.0
Action,10,The Lord of the Rings: The Fellowship of the Ring,2001,178,8.8,Peter Jackson,Elijah Wood,1661481,315544750.0,92.0,4.0
Action,13,The Lord of the Rings: The Two Towers,2002,179,8.7,Peter Jackson,Elijah Wood,1485555,342551365.0,87.0,5.0
...,...,...,...,...,...,...,...,...,...,...,...
Thriller,700,Wait Until Dark,1967,108,7.8,Terence Young,Audrey Hepburn,27733,17550741.0,81.0,1.0
Western,12,"Il buono, il brutto, il cattivo",1966,161,8.8,Sergio Leone,Clint Eastwood,688390,6100000.0,90.0,1.0
Western,48,Once Upon a Time in the West,1968,165,8.5,Sergio Leone,Henry Fonda,302844,5321508.0,80.0,2.0
Western,115,Per qualche dollaro in più,1965,132,8.3,Sergio Leone,Clint Eastwood,232772,15000000.0,74.0,3.0
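
When only the rank column is needed, calling `rank()` directly on the grouped column keeps the original row order and index, so the result can be attached straight back to the frame. A minimal sketch on hypothetical toy rows (titles and ratings are made up):

```python
import pandas as pd

# Hypothetical stand-in for movies_df
toy = pd.DataFrame({
    'Series_Title': ['Movie A', 'Movie B', 'Movie C', 'Movie D'],
    'Genre': ['Action', 'Action', 'Drama', 'Drama'],
    'IMDB_Rating': [8.8, 9.0, 9.3, 8.5],
})

# rank() on a grouped column returns a Series aligned with toy's index
toy['Genre_Ranking'] = toy.groupby('Genre')['IMDB_Rating'].rank(
    method='first',
    ascending=False,
)
print(toy)
```

Unlike the `apply()` version, this does not add `Genre` as an extra index level.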


#### 🎯 **Q3. Find the Min-Max Normalized IMDB Rating within each group.**

In [110]:
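def normalize(group):
    # Use descriptive names so the built-ins max() and min() are not shadowed.
    rating_max = group['IMDB_Rating'].max()
    rating_min = group['IMDB_Rating'].min()
    # If every rating in the group is equal (e.g. a single-movie group),
    # this is 0 / 0 and the result is NaN.
    return group.assign(
        Normalize_Rating = (group['IMDB_Rating'] - rating_min) / (rating_max - rating_min)
    )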

groupby_genre.apply(
    func = normalize,
    include_groups = False
)

Genre,Unnamed: 1,Series_Title,Released_Year,Runtime,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore,Normalize_Rating
Action,2,The Dark Knight,2008,152,9.0,Christopher Nolan,Christian Bale,2303232,534858444.0,84.0,1.000000
Action,5,The Lord of the Rings: The Return of the King,2003,201,8.9,Peter Jackson,Elijah Wood,1642758,377845905.0,94.0,0.928571
Action,8,Inception,2010,148,8.8,Christopher Nolan,Leonardo DiCaprio,2067042,292576195.0,74.0,0.857143
Action,10,The Lord of the Rings: The Fellowship of the Ring,2001,178,8.8,Peter Jackson,Elijah Wood,1661481,315544750.0,92.0,0.857143
Action,13,The Lord of the Rings: The Two Towers,2002,179,8.7,Peter Jackson,Elijah Wood,1485555,342551365.0,87.0,0.785714
...,...,...,...,...,...,...,...,...,...,...,...
Thriller,700,Wait Until Dark,1967,108,7.8,Terence Young,Audrey Hepburn,27733,17550741.0,81.0,
Western,12,"Il buono, il brutto, il cattivo",1966,161,8.8,Sergio Leone,Clint Eastwood,688390,6100000.0,90.0,1.000000
Western,48,Once Upon a Time in the West,1968,165,8.5,Sergio Leone,Henry Fonda,302844,5321508.0,80.0,0.700000
Western,115,Per qualche dollaro in più,1965,132,8.3,Sergio Leone,Clint Eastwood,232772,15000000.0,74.0,0.500000
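
The empty `Normalize_Rating` for the single Thriller row above is a NaN: with one movie in the group the maximum equals the minimum, so the formula becomes 0 / 0. The sketch below reproduces this on hypothetical toy rows using `transform()`, which broadcasts each group's min and max back to the original rows. Leaving the NaN in place is an assumption made here; replacing it with a fixed value would be an equally valid choice.

```python
import pandas as pd

# Hypothetical stand-in for movies_df; 'Thriller' has a single row on purpose
toy = pd.DataFrame({
    'Genre': ['Action', 'Action', 'Action', 'Thriller'],
    'IMDB_Rating': [9.0, 8.0, 7.0, 7.8],
})

ratings = toy.groupby('Genre')['IMDB_Rating']

# transform() returns a Series with the same index as toy
group_min = ratings.transform('min')
group_max = ratings.transform('max')

# Min-max scaling per genre; a zero range gives 0 / 0 = NaN
toy['Normalize_Rating'] = (toy['IMDB_Rating'] - group_min) / (group_max - group_min)
print(toy)
```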


### 🚀 **GroupBy on Multiple Columns**

In [111]:
director_actor_group = movies_df.groupby(
    by = [
        'Director',
        'Star1'
    ]
)

In [115]:
director_actor_group.size()

Director             Star1         
Aamir Khan           Amole Gupte       1
Aaron Sorkin         Eddie Redmayne    1
Abdellatif Kechiche  Léa Seydoux       1
Abhishek Chaubey     Shahid Kapoor     1
Abhishek Kapoor      Amit Sadh         1
                                      ..
Zaza Urushadze       Lembit Ulfsak     1
Zoya Akhtar          Hrithik Roshan    1
                     Vijay Varma       1
Çagan Irmak          Çetin Tekindor    1
Ömer Faruk Sorak     Cem Yilmaz        1
Length: 898, dtype: int64

In [117]:
director_actor_group.get_group(
    name = ('Aamir Khan', 'Amole Gupte')
)

Unnamed: 0,Series_Title,Released_Year,Runtime,Genre,IMDB_Rating,Director,Star1,No_of_Votes,Gross,Metascore
65,Taare Zameen Par,2007,165,Drama,8.4,Aamir Khan,Amole Gupte,168895,1223869.0,


#### 🎯 **Q1. Find the highest-grossing Actor-Director combo.**

In [126]:
director_actor_group['Gross'].sum().sort_values(
    ascending = False
).head(1)

Director        Star1         
Akira Kurosawa  Toshirô Mifune    2.999877e+09
Name: Gross, dtype: float64
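
Grouping on two columns produces a Series with a two-level `MultiIndex`, so the winning pair can also be read off with `idxmax()`, which returns the index label as a tuple. A minimal sketch on hypothetical data (the director and star names are placeholders):

```python
import pandas as pd

# Hypothetical stand-in for movies_df
toy = pd.DataFrame({
    'Director': ['D1', 'D1', 'D2', 'D2'],
    'Star1': ['S1', 'S1', 'S1', 'S2'],
    'Gross': [10.0, 20.0, 5.0, 50.0],
})

# One total per (Director, Star1) pair
totals = toy.groupby(['Director', 'Star1'])['Gross'].sum()

best_pair = totals.idxmax()   # tuple label of the largest total
best_gross = totals.max()     # the total itself
print(best_pair, best_gross)
```

This avoids sorting the whole Series when only the top entry is needed.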

#### 🎯 **Q2. Find the best Actor-Genre combo (in terms of average Metascore).**

In [127]:
actor_genre_group = movies_df.groupby(
    by = [
        'Star1',
        'Genre'
    ]
)

actor_genre_group.Metascore.mean().sort_values(
    ascending = False
).head(1)

Star1           Genre
Ellar Coltrane  Drama    100.0
Name: Metascore, dtype: float64

### ⚠️ **Data Warning**
The data is in the [Resources](../Resources/) folder.

#### **Reading Data into DataFrames**

In [128]:
deliveries = pd.read_csv(
    filepath_or_buffer = '../Resources/Data/deliveries.csv'
)
deliveries

Unnamed: 0,match_id,inning,batting_team,bowling_team,over,ball,batsman,non_striker,bowler,is_super_over,...,bye_runs,legbye_runs,noball_runs,penalty_runs,batsman_runs,extra_runs,total_runs,player_dismissed,dismissal_kind,fielder
0,1,1,Sunrisers Hyderabad,Royal Challengers Bangalore,1,1,DA Warner,S Dhawan,TS Mills,0,...,0,0,0,0,0,0,0,,,
1,1,1,Sunrisers Hyderabad,Royal Challengers Bangalore,1,2,DA Warner,S Dhawan,TS Mills,0,...,0,0,0,0,0,0,0,,,
2,1,1,Sunrisers Hyderabad,Royal Challengers Bangalore,1,3,DA Warner,S Dhawan,TS Mills,0,...,0,0,0,0,4,0,4,,,
3,1,1,Sunrisers Hyderabad,Royal Challengers Bangalore,1,4,DA Warner,S Dhawan,TS Mills,0,...,0,0,0,0,0,0,0,,,
4,1,1,Sunrisers Hyderabad,Royal Challengers Bangalore,1,5,DA Warner,S Dhawan,TS Mills,0,...,0,0,0,0,0,2,2,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
179073,11415,2,Chennai Super Kings,Mumbai Indians,20,2,RA Jadeja,SR Watson,SL Malinga,0,...,0,0,0,0,1,0,1,,,
179074,11415,2,Chennai Super Kings,Mumbai Indians,20,3,SR Watson,RA Jadeja,SL Malinga,0,...,0,0,0,0,2,0,2,,,
179075,11415,2,Chennai Super Kings,Mumbai Indians,20,4,SR Watson,RA Jadeja,SL Malinga,0,...,0,0,0,0,1,0,1,SR Watson,run out,KH Pandya
179076,11415,2,Chennai Super Kings,Mumbai Indians,20,5,SN Thakur,RA Jadeja,SL Malinga,0,...,0,0,0,0,2,0,2,,,


In [129]:
deliveries.columns

Index(['match_id', 'inning', 'batting_team', 'bowling_team', 'over', 'ball',
       'batsman', 'non_striker', 'bowler', 'is_super_over', 'wide_runs',
       'bye_runs', 'legbye_runs', 'noball_runs', 'penalty_runs',
       'batsman_runs', 'extra_runs', 'total_runs', 'player_dismissed',
       'dismissal_kind', 'fielder'],
      dtype='object')

### 🚀 **Exercise on IPL Deliveries DataSet**

#### 🎯 **Q1. Find the TOP 10 Batsmen in terms of RUNS.**

In [132]:
deliveries.groupby(
    by = 'batsman'
)['batsman_runs'].sum().sort_values(
    ascending = False
).head(10)

batsman
V Kohli           5434
SK Raina          5415
RG Sharma         4914
DA Warner         4741
S Dhawan          4632
CH Gayle          4560
MS Dhoni          4477
RV Uthappa        4446
AB de Villiers    4428
G Gambhir         4223
Name: batsman_runs, dtype: int64

#### 🎯 **Q2. Find the Batsman with the maximum number of sixes.**

In [139]:
def count_sixes(batsman_run_group):
    return (batsman_run_group == 6).sum()

deliveries.groupby(
    by = 'batsman'
)['batsman_runs'].apply(
    func = count_sixes
).sort_values(
    ascending = False
).head(1)

batsman
CH Gayle    327
Name: batsman_runs, dtype: int64

In [140]:
six = deliveries[deliveries.batsman_runs == 6]

six.groupby(
    by = 'batsman'
)['batsman_runs'].count().sort_values(
    ascending = False
).head(1)

batsman
CH Gayle    327
Name: batsman_runs, dtype: int64

#### 🎯 **Q3. Find the Batsman with the most 4's and 6's in the last 5 overs.**

In [152]:
last_5_over_mask = deliveries.over > 15
last_5_over = deliveries[last_5_over_mask]

fours_and_sixes = last_5_over[(last_5_over.batsman_runs == 4) | (last_5_over.batsman_runs == 6)]

fours_and_sixes.groupby(
    by = 'batsman'
)['batsman_runs'].count().sort_values(
    ascending = False
).head(1)

batsman
MS Dhoni    340
Name: batsman_runs, dtype: int64
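
The two filters can be combined into a single mask, with `isin()` standing in for the pair of equality checks. The sketch below uses hypothetical toy deliveries and assumes, as in the dataset shown above, that overs are numbered from 1 to 20, so `over > 15` selects overs 16 to 20.

```python
import pandas as pd

# Hypothetical stand-in for the deliveries DataFrame
toy = pd.DataFrame({
    'over': [15, 16, 16, 20, 20, 18],
    'batsman': ['X', 'X', 'Y', 'Y', 'Y', 'X'],
    'batsman_runs': [6, 4, 1, 6, 4, 2],
})

# Last five overs AND the ball went for a four or a six
mask = (toy['over'] > 15) & toy['batsman_runs'].isin([4, 6])

# size() counts the rows left in each batsman's group
boundaries = toy[mask].groupby('batsman').size().sort_values(ascending=False)
print(boundaries)
```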

#### 🎯 **Q4. Find Virat Kohli's total runs against each team.**

In [154]:
vk = deliveries[deliveries.batsman == 'V Kohli']

vk.groupby(
    by = 'bowling_team'
)['batsman_runs'].sum().sort_values(
    ascending = False
)

bowling_team
Delhi Daredevils           763
Chennai Super Kings        749
Kolkata Knight Riders      675
Kings XI Punjab            636
Mumbai Indians             628
Sunrisers Hyderabad        509
Rajasthan Royals           370
Deccan Chargers            306
Gujarat Lions              283
Rising Pune Supergiants    188
Pune Warriors              128
Rising Pune Supergiant      83
Delhi Capitals              66
Kochi Tuskers Kerala        50
Name: batsman_runs, dtype: int64

#### 🎯 **Q5. Create a function that returns the highest score of a given batsman.**

In [156]:
def highest_score(batsman_name):
    # Sum the batsman's runs per match, then take the best single-match total.
    batsman_df = deliveries[deliveries.batsman == batsman_name]
    return batsman_df.groupby(
        by = 'match_id'
    )['batsman_runs'].sum().max()

highest_score('CH Gayle')

175
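
A possible variant takes the DataFrame as an argument instead of relying on the global `deliveries`, and returns `None` when the batsman has no deliveries. The function name `highest_score_safe` and the toy rows are hypothetical; only the `match_id`, `batsman` and `batsman_runs` column names come from the dataset above.

```python
import pandas as pd

def highest_score_safe(deliveries_df, batsman_name):
    """Return the batsman's best single-match total, or None if never seen."""
    batsman_df = deliveries_df[deliveries_df['batsman'] == batsman_name]
    if batsman_df.empty:
        return None
    # Total runs per match, then the largest of those totals
    per_match = batsman_df.groupby('match_id')['batsman_runs'].sum()
    return int(per_match.max())

# Hypothetical stand-in for the deliveries DataFrame
toy = pd.DataFrame({
    'match_id': [1, 1, 2, 2, 2],
    'batsman': ['X', 'X', 'X', 'Y', 'X'],
    'batsman_runs': [4, 6, 1, 6, 2],
})

print(highest_score_safe(toy, 'X'))
print(highest_score_safe(toy, 'Nobody'))
```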