In [198]:
pip install pandas


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49m/usr/local/Cellar/jupyterlab/4.3.5/libexec/bin/python -m pip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


# Introduction to Pandas

Pandas is a powerful Python library for data manipulation and analysis. In this notebook, we will explore some basic concepts including **Series**, **DataFrames**, indexing, filtering, grouping, reading and writing CSV files, and handling missing values.

For a quick start and more details, refer to the [10 Minute Intro to Pandas](https://pandas.pydata.org/docs/user_guide/10min.html).


In [199]:
import pandas as pd

# Check the pandas version
print("Pandas version:", pd.__version__)


Pandas version: 2.2.3


## Creating a Series

A **Series** is a one-dimensional array-like object that can hold data of any type (integers, strings, floating point numbers, Python objects, etc.). Each element is paired with a label, and together these labels form the Series' index.


In [200]:
# Create a simple Series from a list, with explicit string labels as the index
data = [1, 3, 5, 7, 9]
series = pd.Series(data, index=['a', 'b', 'c', 'd', 'e'])

print(series)


a    1
b    3
c    5
d    7
e    9
dtype: int64
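
As a small sketch of what the labels provide, the same Series can be read by label or by position, and arithmetic is applied element-wise. The variable names below (`by_label`, `by_position`, and so on) are illustrative only and are not used elsewhere in this notebook.

```python
import pandas as pd

series = pd.Series([1, 3, 5, 7, 9], index=['a', 'b', 'c', 'd', 'e'])

by_label = series['b']          # label-based lookup
by_position = series.iloc[1]    # position-based lookup of the same element
label_slice = series['b':'d']   # label slices include both endpoints
doubled = series * 2            # arithmetic is applied to every element

print(by_label, by_position)
print(label_slice.tolist())
print(doubled['e'])
```

Note that the label slice `'b':'d'` includes `'d'`, unlike an ordinary Python list slice.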


## Creating a DataFrame

A **DataFrame** is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns). It is similar to a spreadsheet or SQL table.


In [201]:
# Create a DataFrame from a dictionary of lists
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'chicago', 'Houston']
}
df = pd.DataFrame(data, index=[1, 4, 2, 3])  # custom, deliberately unordered row labels
print(df)
print("\n", df.dtypes)

      Name  Age         City
1    Alice   25     New York
4      Bob   30  Los Angeles
2  Charlie   35      chicago
3    David   40      Houston

 Name    object
Age      int64
City    object
dtype: object
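
A dictionary of lists is only one way to build a DataFrame. The sketch below, using made-up records and the hypothetical names `records` and `df_records`, builds one from a list of dictionaries and then adds a column, which illustrates the "size-mutable" part of the description above.

```python
import pandas as pd

# Each dictionary becomes one row; the keys become column names
records = [
    {'Name': 'Alice', 'Age': 25},
    {'Name': 'Bob', 'Age': 30},
]
df_records = pd.DataFrame(records)

# Columns can be added after construction
df_records['AgeNextYear'] = df_records['Age'] + 1

print(df_records.shape)
print(df_records['AgeNextYear'].tolist())
```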


## Basic DataFrame Operations

Let's explore some common operations on a DataFrame:
- Viewing the first few rows with `.head()`
- Generating summary statistics with `.describe()`
- Selecting a specific column


In [202]:
# Viewing the first few rows
print("First 3 rows of the DataFrame:")
print(df.head(3))

# Generating summary statistics (numeric columns only by default;
# pass include='all' to also summarize object columns)
print("\nSummary statistics:")
print(df.describe())

# Selecting a column (returns a Series)
print("\nNames column:")
print(df['Name'])


First 3 rows of the DataFrame:
      Name  Age         City
1    Alice   25     New York
4      Bob   30  Los Angeles
2  Charlie   35      chicago

Summary statistics:
             Age
count   4.000000
mean   32.500000
std     6.454972
min    25.000000
25%    28.750000
50%    32.500000
75%    36.250000
max    40.000000

Names column:
1      Alice
4        Bob
2    Charlie
3      David
Name: Name, dtype: object
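
By default `.describe()` reports only numeric columns, which is why the summary above contains just `Age`. A minimal sketch of the difference `include='all'` makes, using a reduced copy of the example data (`df_demo` is a hypothetical name):

```python
import pandas as pd

df_demo = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
})

numeric_only = df_demo.describe()              # numeric columns only
everything = df_demo.describe(include='all')   # adds unique/top/freq rows for text columns

print(list(numeric_only.columns))
print(list(everything.columns))
print(everything.loc['unique', 'Name'])
```

Statistics that do not apply to a column (for example `mean` for `Name`) are reported as `NaN`.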


In [203]:
df['Age'].dtypes

dtype('int64')

In [204]:
df.tail(3)


      Name  Age         City
4      Bob   30  Los Angeles
2  Charlie   35      chicago
3    David   40      Houston


In [205]:
df.columns

Index(['Name', 'Age', 'City'], dtype='object')

In [206]:
df.to_numpy()

array([['Alice', 25, 'New York'],
       ['Bob', 30, 'Los Angeles'],
       ['Charlie', 35, 'chicago'],
       ['David', 40, 'Houston']], dtype=object)

In [207]:
df.sort_index(axis=0)


      Name  Age         City
1    Alice   25     New York
2  Charlie   35      chicago
3    David   40      Houston
4      Bob   30  Los Angeles


In [208]:
df.sort_values(by='City')

      Name  Age         City
3    David   40      Houston
4      Bob   30  Los Angeles
1    Alice   25     New York
2  Charlie   35      chicago


## Selecting Data from DataFrames

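There are several ways to select data from a DataFrame: positional row slicing (`df[0:3]`), label-based selection with `.loc`, position-based selection with `.iloc`, and boolean filtering, which keeps only the rows that satisfy a condition. In the last example of this section, we filter the DataFrame to include only rows where `Age` is greater than 30.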


In [209]:
df

      Name  Age         City
1    Alice   25     New York
4      Bob   30  Los Angeles
2  Charlie   35      chicago
3    David   40      Houston


In [210]:
df[0:3]

      Name  Age         City
1    Alice   25     New York
4      Bob   30  Los Angeles
2  Charlie   35      chicago


In [211]:
# Label-based; both endpoints are inclusive, so with index [1, 4, 2, 3] the slice 1:3 spans all four rows.
df.loc[1:3, ['Name', 'Age']]


      Name  Age
1    Alice   25
4      Bob   30
2  Charlie   35
3    David   40


In [212]:
df.iloc[0:2, 0:3]  # Position-based; the end of each slice is exclusive.


    Name  Age         City
1  Alice   25     New York
4    Bob   30  Los Angeles


In [213]:
df.iloc[:, lambda x: [0, 2]]


      Name         City
1    Alice     New York
4      Bob  Los Angeles
2  Charlie      chicago
3    David      Houston


In [214]:
# Filter rows where Age is greater than 30
filtered_df = df[df['Age'] > 30]
print(filtered_df)


      Name  Age     City
2  Charlie   35  chicago
3    David   40  Houston
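
Filters can also be combined. The following sketch rebuilds the example data under the hypothetical name `df_demo` and shows two conditions joined with `&`, a membership test with `.isin()`, and the difference between a label lookup and a positional lookup.

```python
import pandas as pd

df_demo = pd.DataFrame(
    {
        'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'chicago', 'Houston'],
    },
    index=[1, 4, 2, 3],
)

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
over_30_in_houston = df_demo[(df_demo['Age'] > 30) & (df_demo['City'] == 'Houston')]

# isin() keeps rows whose value appears in the given list
coastal = df_demo[df_demo['City'].isin(['New York', 'Los Angeles'])]

by_label = df_demo.loc[4, 'Name']    # the row labelled 4
by_position = df_demo.iloc[1, 0]     # the second row, first column

print(over_30_in_houston['Name'].tolist())
print(coastal['Name'].tolist())
print(by_label, by_position)
```

Because the row labels are `[1, 4, 2, 3]`, label `4` and position `1` refer to the same row here.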


## Adding a New Column

You can easily create a new column in a DataFrame. Here we add a column `Senior` that marks `True` if the person's age is over 30, and `False` otherwise.


In [215]:
# Add a new column 'Senior'
df['Senior'] = df['Age'] > 30
print(df)


      Name  Age         City  Senior
1    Alice   25     New York   False
4      Bob   30  Los Angeles   False
2  Charlie   35      chicago    True
3    David   40      Houston    True
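
Direct assignment modifies `df` in place. As an alternative sketch, `.assign()` returns a new DataFrame and leaves the original unchanged, and `pd.cut` can turn the same threshold into labelled bands. The names `ages`, `with_flag`, `with_band` and the band labels are assumptions made for this illustration.

```python
import pandas as pd

ages = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
})

# assign() returns a new DataFrame; `ages` itself is not modified
with_flag = ages.assign(Senior=ages['Age'] > 30)

# pd.cut bins a numeric column; intervals are right-inclusive by default,
# so an age of exactly 30 falls into the first band
with_band = ages.assign(
    AgeBand=pd.cut(ages['Age'], bins=[0, 30, 100], labels=['30 or under', 'over 30'])
)

print(with_flag['Senior'].tolist())
print(with_band['AgeBand'].astype(str).tolist())
```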


## Setting an Index

Sometimes, it is useful to set one of the columns as the DataFrame’s index. This can make data lookups more intuitive.


In [216]:
# Set the 'Name' column as the index
df_indexed = df.set_index('Name')
print(df_indexed)


         Age         City  Senior
Name                             
Alice     25     New York   False
Bob       30  Los Angeles   False
Charlie   35      chicago    True
David     40      Houston    True
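
With names as the index, rows can be looked up directly by name. A minimal sketch on a reduced copy of the data (`people` is a hypothetical name):

```python
import pandas as pd

people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
}).set_index('Name')

bobs_age = people.loc['Bob', 'Age']        # look up a single value by row label
subset = people.loc[['Alice', 'David']]    # select several rows by label
restored = people.reset_index()            # turn the index back into a regular column

print(bobs_age)
print(subset['Age'].tolist())
print(list(restored.columns))
```

`set_index` returns a new DataFrame by default, which is why the notebook stores the result in `df_indexed` rather than overwriting `df`.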


## Writing Data to a CSV File

You can also save a DataFrame to a CSV file easily using the `.to_csv()` method.


In [217]:
# Write the DataFrame 'df' to a CSV file (this will create 'output.csv' in your working directory)
df.to_csv('output.csv')
print("DataFrame written to 'output.csv'")


DataFrame written to 'output.csv'


## Reading Data from a CSV File

Pandas can read data from many file formats. Here we read back the `output.csv` file written in the previous cell. Because `.to_csv()` also saved the row index, it comes back as an extra `Unnamed: 0` column.


In [218]:
df_csv = pd.read_csv('output.csv')
print(df_csv.head())


   Unnamed: 0     Name  Age         City  Senior
0           1    Alice   25     New York   False
1           4      Bob   30  Los Angeles   False
2           2  Charlie   35      chicago    True
3           3    David   40      Houston    True
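
The `Unnamed: 0` column above is the saved row index. Two common ways to avoid it are sketched below, using an in-memory `io.StringIO` buffer in place of a real file so the example does not touch the disk; the names `frame`, `with_index` and `without_index` are illustrative.

```python
import io
import pandas as pd

frame = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]}, index=[1, 4])

# Option 1: write the index, then tell read_csv which column holds it
buffer = io.StringIO()
frame.to_csv(buffer)
buffer.seek(0)
with_index = pd.read_csv(buffer, index_col=0)

# Option 2: leave the index out when writing
buffer = io.StringIO()
frame.to_csv(buffer, index=False)
buffer.seek(0)
without_index = pd.read_csv(buffer)

print(with_index.index.tolist())
print(list(without_index.columns))
```

Option 1 preserves the original labels; option 2 discards them and the reloaded frame gets a fresh default index.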


## Grouping Data

The `groupby()` method is used to split the data into groups, apply a function, and combine the results. This is especially useful for aggregation.


In [219]:
# Create a new DataFrame for grouping demonstration
data_group = {
    'Team': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Points': [10, 20, 15, 25, 10, 30]
}
df_group = pd.DataFrame(data_group)
print("Original DataFrame:")
print(df_group)

# Group by 'Team' and calculate the sum of points for each group
grouped = df_group.groupby('Team').sum()
print("\nGrouped DataFrame (sum of points by team):")
print(grouped)


Original DataFrame:
  Team  Points
0    A      10
1    B      20
2    A      15
3    B      25
4    A      10
5    B      30

Grouped DataFrame (sum of points by team):
      Points
Team        
A         35
B         75
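
The "apply" step is not limited to a single function. As a sketch, `.agg()` accepts a list of aggregation names and returns one column per function:

```python
import pandas as pd

df_group = pd.DataFrame({
    'Team': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Points': [10, 20, 15, 25, 10, 30],
})

# One row per team, one column per aggregation
summary = df_group.groupby('Team')['Points'].agg(['sum', 'mean', 'count'])
print(summary)
```

Selecting `['Points']` before aggregating keeps the result to the column of interest, which matters once a DataFrame has non-numeric columns.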


In [220]:
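# Load the IMDB movie dataset with 'Title' as the index (this rebinds df, replacing the small example above)
df = pd.read_csv("IMDB-Movie-Data.csv", index_col=['Title'])
df.head(10)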

Title,Rank,Genre,Description,Director,Actors,Year,Runtime (Minutes),Rating,Votes,Revenue (Millions),Metascore
Guardians of the Galaxy,1,"Action,Adventure,Sci-Fi",A group of intergalactic criminals are forced ...,James Gunn,"Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S...",2014,121,8.1,757074,333.13,76.0
Prometheus,2,"Adventure,Mystery,Sci-Fi","Following clues to the origin of mankind, a te...",Ridley Scott,"Noomi Rapace, Logan Marshall-Green, Michael Fa...",2012,124,7.0,485820,126.46,65.0
Split,3,"Horror,Thriller",Three girls are kidnapped by a man with a diag...,M. Night Shyamalan,"James McAvoy, Anya Taylor-Joy, Haley Lu Richar...",2016,117,7.3,157606,138.12,62.0
Sing,4,"Animation,Comedy,Family","In a city of humanoid animals, a hustling thea...",Christophe Lourdelet,"Matthew McConaughey,Reese Witherspoon, Seth Ma...",2016,108,7.2,60545,270.32,59.0
Suicide Squad,5,"Action,Adventure,Fantasy",A secret government agency recruits some of th...,David Ayer,"Will Smith, Jared Leto, Margot Robbie, Viola D...",2016,123,6.2,393727,325.02,40.0
The Great Wall,6,"Action,Adventure,Fantasy",European mercenaries searching for black powde...,Yimou Zhang,"Matt Damon, Tian Jing, Willem Dafoe, Andy Lau",2016,103,6.1,56036,45.13,42.0
La La Land,7,"Comedy,Drama,Music",A jazz pianist falls for an aspiring actress i...,Damien Chazelle,"Ryan Gosling, Emma Stone, Rosemarie DeWitt, J....",2016,128,8.3,258682,151.06,93.0
Mindhorn,8,Comedy,A has-been actor best known for playing the ti...,Sean Foley,"Essie Davis, Andrea Riseborough, Julian Barrat...",2016,89,6.4,2490,,71.0
The Lost City of Z,9,"Action,Adventure,Biography","A true-life drama, centering on British explor...",James Gray,"Charlie Hunnam, Robert Pattinson, Sienna Mille...",2016,141,7.1,7188,8.01,78.0
Passengers,10,"Adventure,Drama,Romance",A spacecraft traveling to a distant colony pla...,Morten Tyldum,"Jennifer Lawrence, Chris Pratt, Michael Sheen,...",2016,116,7.0,192177,100.01,41.0


In [221]:
df.loc['Sing']

Rank                                                                  4
Genre                                           Animation,Comedy,Family
Description           In a city of humanoid animals, a hustling thea...
Director                                           Christophe Lourdelet
Actors                Matthew McConaughey,Reese Witherspoon, Seth Ma...
Year                                                               2016
Runtime (Minutes)                                                   108
Rating                                                              7.2
Votes                                                             60545
Revenue (Millions)                                               270.32
Metascore                                                          59.0
Name: Sing, dtype: object

In [222]:
df.iloc[2]

Rank                                                                  3
Genre                                                   Horror,Thriller
Description           Three girls are kidnapped by a man with a diag...
Director                                             M. Night Shyamalan
Actors                James McAvoy, Anya Taylor-Joy, Haley Lu Richar...
Year                                                               2016
Runtime (Minutes)                                                   117
Rating                                                              7.3
Votes                                                            157606
Revenue (Millions)                                               138.12
Metascore                                                          62.0
Name: Split, dtype: object

In [223]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1000 entries, Guardians of the Galaxy to Nine Lives
Data columns (total 11 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   Rank                1000 non-null   int64  
 1   Genre               1000 non-null   object 
 2   Description         1000 non-null   object 
 3   Director            1000 non-null   object 
 4   Actors              1000 non-null   object 
 5   Year                1000 non-null   int64  
 6   Runtime (Minutes)   1000 non-null   int64  
 7   Rating              1000 non-null   float64
 8   Votes               1000 non-null   int64  
 9   Revenue (Millions)  872 non-null    float64
 10  Metascore           936 non-null    float64
dtypes: float64(3), int64(4), object(4)
memory usage: 126.0+ KB


In [224]:
df.shape

(1000, 11)

In [225]:
temp_df = pd.concat([df])  # concatenating a one-element list simply yields a copy of df
temp_df.shape

(1000, 11)

In [226]:
temp_df

Title,Rank,Genre,Description,Director,Actors,Year,Runtime (Minutes),Rating,Votes,Revenue (Millions),Metascore
Guardians of the Galaxy,1,"Action,Adventure,Sci-Fi",A group of intergalactic criminals are forced ...,James Gunn,"Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S...",2014,121,8.1,757074,333.13,76.0
Prometheus,2,"Adventure,Mystery,Sci-Fi","Following clues to the origin of mankind, a te...",Ridley Scott,"Noomi Rapace, Logan Marshall-Green, Michael Fa...",2012,124,7.0,485820,126.46,65.0
Split,3,"Horror,Thriller",Three girls are kidnapped by a man with a diag...,M. Night Shyamalan,"James McAvoy, Anya Taylor-Joy, Haley Lu Richar...",2016,117,7.3,157606,138.12,62.0
Sing,4,"Animation,Comedy,Family","In a city of humanoid animals, a hustling thea...",Christophe Lourdelet,"Matthew McConaughey,Reese Witherspoon, Seth Ma...",2016,108,7.2,60545,270.32,59.0
Suicide Squad,5,"Action,Adventure,Fantasy",A secret government agency recruits some of th...,David Ayer,"Will Smith, Jared Leto, Margot Robbie, Viola D...",2016,123,6.2,393727,325.02,40.0
...,...,...,...,...,...,...,...,...,...,...,...
Secret in Their Eyes,996,"Crime,Drama,Mystery","A tight-knit team of rising investigators, alo...",Billy Ray,"Chiwetel Ejiofor, Nicole Kidman, Julia Roberts...",2015,111,6.2,27585,,45.0
Hostel: Part II,997,Horror,Three American college students studying abroa...,Eli Roth,"Lauren German, Heather Matarazzo, Bijou Philli...",2007,94,5.5,73152,17.54,46.0
Step Up 2: The Streets,998,"Drama,Music,Romance",Romantic sparks occur between two dance studen...,Jon M. Chu,"Robert Hoffman, Briana Evigan, Cassie Ventura,...",2008,98,6.2,70699,58.01,50.0
Search Party,999,"Adventure,Comedy",A pair of friends embark on a mission to reuni...,Scot Armstrong,"Adam Pally, T.J. Miller, Thomas Middleditch,Sh...",2014,93,5.6,4881,,22.0


In [227]:
df.columns

Index(['Rank', 'Genre', 'Description', 'Director', 'Actors', 'Year',
       'Runtime (Minutes)', 'Rating', 'Votes', 'Revenue (Millions)',
       'Metascore'],
      dtype='object')

In [228]:
df.rename(columns={
    'Runtime (Minutes)':'Runtime',
    'Revenue (Millions)':'Revenue'
    },inplace = True)
df.columns


Index(['Rank', 'Genre', 'Description', 'Director', 'Actors', 'Year', 'Runtime',
       'Rating', 'Votes', 'Revenue', 'Metascore'],
      dtype='object')

In [229]:
# Iterating over a DataFrame yields its column names
for col in df:
    print(col.lower())

rank
genre
description
director
actors
year
runtime
rating
votes
revenue
metascore


In [230]:
df.columns = [col.lower() for col in df.columns]
df.columns

Index(['rank', 'genre', 'description', 'director', 'actors', 'year', 'runtime',
       'rating', 'votes', 'revenue', 'metascore'],
      dtype='object')

In [231]:
df.isnull().sum()

rank             0
genre            0
description      0
director         0
actors           0
year             0
runtime          0
rating           0
votes            0
revenue        128
metascore       64
dtype: int64

In [232]:
df.dropna()

Title,rank,genre,description,director,actors,year,runtime,rating,votes,revenue,metascore
Guardians of the Galaxy,1,"Action,Adventure,Sci-Fi",A group of intergalactic criminals are forced ...,James Gunn,"Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S...",2014,121,8.1,757074,333.13,76.0
Prometheus,2,"Adventure,Mystery,Sci-Fi","Following clues to the origin of mankind, a te...",Ridley Scott,"Noomi Rapace, Logan Marshall-Green, Michael Fa...",2012,124,7.0,485820,126.46,65.0
Split,3,"Horror,Thriller",Three girls are kidnapped by a man with a diag...,M. Night Shyamalan,"James McAvoy, Anya Taylor-Joy, Haley Lu Richar...",2016,117,7.3,157606,138.12,62.0
Sing,4,"Animation,Comedy,Family","In a city of humanoid animals, a hustling thea...",Christophe Lourdelet,"Matthew McConaughey,Reese Witherspoon, Seth Ma...",2016,108,7.2,60545,270.32,59.0
Suicide Squad,5,"Action,Adventure,Fantasy",A secret government agency recruits some of th...,David Ayer,"Will Smith, Jared Leto, Margot Robbie, Viola D...",2016,123,6.2,393727,325.02,40.0
...,...,...,...,...,...,...,...,...,...,...,...
Resident Evil: Afterlife,994,"Action,Adventure,Horror",While still out to destroy the evil Umbrella C...,Paul W.S. Anderson,"Milla Jovovich, Ali Larter, Wentworth Miller,K...",2010,97,5.9,140900,60.13,37.0
Project X,995,Comedy,3 high school seniors throw a birthday party t...,Nima Nourizadeh,"Thomas Mann, Oliver Cooper, Jonathan Daniel Br...",2012,88,6.7,164088,54.72,48.0
Hostel: Part II,997,Horror,Three American college students studying abroa...,Eli Roth,"Lauren German, Heather Matarazzo, Bijou Philli...",2007,94,5.5,73152,17.54,46.0
Step Up 2: The Streets,998,"Drama,Music,Romance",Romantic sparks occur between two dance studen...,Jon M. Chu,"Robert Hoffman, Briana Evigan, Cassie Ventura,...",2008,98,6.2,70699,58.01,50.0


In [233]:
df.dropna(axis=1)

Title,rank,genre,description,director,actors,year,runtime,rating,votes
Guardians of the Galaxy,1,"Action,Adventure,Sci-Fi",A group of intergalactic criminals are forced ...,James Gunn,"Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S...",2014,121,8.1,757074
Prometheus,2,"Adventure,Mystery,Sci-Fi","Following clues to the origin of mankind, a te...",Ridley Scott,"Noomi Rapace, Logan Marshall-Green, Michael Fa...",2012,124,7.0,485820
Split,3,"Horror,Thriller",Three girls are kidnapped by a man with a diag...,M. Night Shyamalan,"James McAvoy, Anya Taylor-Joy, Haley Lu Richar...",2016,117,7.3,157606
Sing,4,"Animation,Comedy,Family","In a city of humanoid animals, a hustling thea...",Christophe Lourdelet,"Matthew McConaughey,Reese Witherspoon, Seth Ma...",2016,108,7.2,60545
Suicide Squad,5,"Action,Adventure,Fantasy",A secret government agency recruits some of th...,David Ayer,"Will Smith, Jared Leto, Margot Robbie, Viola D...",2016,123,6.2,393727
...,...,...,...,...,...,...,...,...,...
Secret in Their Eyes,996,"Crime,Drama,Mystery","A tight-knit team of rising investigators, alo...",Billy Ray,"Chiwetel Ejiofor, Nicole Kidman, Julia Roberts...",2015,111,6.2,27585
Hostel: Part II,997,Horror,Three American college students studying abroa...,Eli Roth,"Lauren German, Heather Matarazzo, Bijou Philli...",2007,94,5.5,73152
Step Up 2: The Streets,998,"Drama,Music,Romance",Romantic sparks occur between two dance studen...,Jon M. Chu,"Robert Hoffman, Briana Evigan, Cassie Ventura,...",2008,98,6.2,70699
Search Party,999,"Adventure,Comedy",A pair of friends embark on a mission to reuni...,Scot Armstrong,"Adam Pally, T.J. Miller, Thomas Middleditch,Sh...",2014,93,5.6,4881


In [234]:
re=df['revenue']

In [235]:
re

Title
Guardians of the Galaxy    333.13
Prometheus                 126.46
Split                      138.12
Sing                       270.32
Suicide Squad              325.02
                            ...  
Secret in Their Eyes          NaN
Hostel: Part II             17.54
Step Up 2: The Streets      58.01
Search Party                  NaN
Nine Lives                  19.64
Name: revenue, Length: 1000, dtype: float64

In [236]:
df.isnull().sum()

rank             0
genre            0
description      0
director         0
actors           0
year             0
runtime          0
rating           0
votes            0
revenue        128
metascore       64
dtype: int64

In [237]:
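m = re.mean()
print(m)
# Assign the filled column back instead of calling fillna(..., inplace=True) on
# df['revenue']: in pandas 2.2 the inplace form on a selected column raises a
# FutureWarning and will stop modifying df in pandas 3.0.
df['revenue'] = df['revenue'].fillna(m)
df['revenue']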


82.95637614678898


Title
Guardians of the Galaxy    333.130000
Prometheus                 126.460000
Split                      138.120000
Sing                       270.320000
Suicide Squad              325.020000
                              ...    
Secret in Their Eyes        82.956376
Hostel: Part II             17.540000
Step Up 2: The Streets      58.010000
Search Party                82.956376
Nine Lives                  19.640000
Name: revenue, Length: 1000, dtype: float64

In [238]:
df['revenue'] = df['revenue'].fillna(m)

In [239]:
df.isnull().sum()

rank            0
genre           0
description     0
director        0
actors          0
year            0
runtime         0
rating          0
votes           0
revenue         0
metascore      64
dtype: int64
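
The same mean-imputation idea can be applied to several columns at once. The sketch below uses a small made-up DataFrame (`movies`, with invented values) rather than the IMDB data: passing the Series returned by `.mean()` to `.fillna()` fills each column with its own mean.

```python
import numpy as np
import pandas as pd

# Toy data with one missing value per column
movies = pd.DataFrame({
    'revenue': [100.0, np.nan, 50.0],
    'metascore': [80.0, 60.0, np.nan],
})

# movies.mean() is a Series indexed by column name; fillna matches on those names
filled = movies.fillna(movies.mean())

print(filled)
print(filled.isnull().sum().sum())
```

`fillna` returns a new DataFrame here, so `movies` still contains its missing values afterwards.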

In [240]:
df['newFeature'] = df['revenue'].apply(lambda x: 'High' if x > 150 else 'Low')
df

Title,rank,genre,description,director,actors,year,runtime,rating,votes,revenue,metascore,newFeature
Guardians of the Galaxy,1,"Action,Adventure,Sci-Fi",A group of intergalactic criminals are forced ...,James Gunn,"Chris Pratt, Vin Diesel, Bradley Cooper, Zoe S...",2014,121,8.1,757074,333.130000,76.0,High
Prometheus,2,"Adventure,Mystery,Sci-Fi","Following clues to the origin of mankind, a te...",Ridley Scott,"Noomi Rapace, Logan Marshall-Green, Michael Fa...",2012,124,7.0,485820,126.460000,65.0,Low
Split,3,"Horror,Thriller",Three girls are kidnapped by a man with a diag...,M. Night Shyamalan,"James McAvoy, Anya Taylor-Joy, Haley Lu Richar...",2016,117,7.3,157606,138.120000,62.0,Low
Sing,4,"Animation,Comedy,Family","In a city of humanoid animals, a hustling thea...",Christophe Lourdelet,"Matthew McConaughey,Reese Witherspoon, Seth Ma...",2016,108,7.2,60545,270.320000,59.0,High
Suicide Squad,5,"Action,Adventure,Fantasy",A secret government agency recruits some of th...,David Ayer,"Will Smith, Jared Leto, Margot Robbie, Viola D...",2016,123,6.2,393727,325.020000,40.0,High
...,...,...,...,...,...,...,...,...,...,...,...,...
Secret in Their Eyes,996,"Crime,Drama,Mystery","A tight-knit team of rising investigators, alo...",Billy Ray,"Chiwetel Ejiofor, Nicole Kidman, Julia Roberts...",2015,111,6.2,27585,82.956376,45.0,Low
Hostel: Part II,997,Horror,Three American college students studying abroa...,Eli Roth,"Lauren German, Heather Matarazzo, Bijou Philli...",2007,94,5.5,73152,17.540000,46.0,Low
Step Up 2: The Streets,998,"Drama,Music,Romance",Romantic sparks occur between two dance studen...,Jon M. Chu,"Robert Hoffman, Briana Evigan, Cassie Ventura,...",2008,98,6.2,70699,58.010000,50.0,Low
Search Party,999,"Adventure,Comedy",A pair of friends embark on a mission to reuni...,Scot Armstrong,"Adam Pally, T.J. Miller, Thomas Middleditch,Sh...",2014,93,5.6,4881,82.956376,22.0,Low
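
For a simple threshold like this, a vectorized alternative to `.apply()` is `numpy.where`, which evaluates the condition on the whole column at once. A sketch on a few sample revenue values (the names `via_apply` and `via_where` are illustrative):

```python
import numpy as np
import pandas as pd

revenue = pd.Series([333.13, 126.46, 150.0, 270.32], name='revenue')

# Row-by-row with a Python lambda, as in the cell above
via_apply = revenue.apply(lambda x: 'High' if x > 150 else 'Low')

# Vectorized: the comparison and the selection run on the whole column
via_where = pd.Series(np.where(revenue > 150, 'High', 'Low'), index=revenue.index)

print(via_apply.tolist())
print(via_where.tolist())
```

Both approaches give the same labels; a value of exactly 150 counts as `'Low'` because the comparison is strict.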


In [241]:
director_group = df.groupby('director')[['revenue', 'rating', 'metascore']].mean()
director_group

director,revenue,rating,metascore
Aamir Khan,1.200000,8.50,42.0
Abdellatif Kechiche,2.200000,7.80,88.0
Adam Leon,82.956376,6.50,77.0
Adam McKay,109.535000,7.00,65.5
Adam Shankman,78.665000,6.30,64.0
...,...,...,...
Xavier Dolan,43.223188,7.55,61.0
Yimou Zhang,45.130000,6.10,42.0
Yorgos Lanthimos,4.405000,7.20,77.5
Zack Snyder,195.148000,7.04,48.0


In [242]:
df.to_csv('output.csv')