In this video we will learn how to modify a DataFrame using the inplace parameter. We will first read a real dataset into pandas. We will then introduce pandas' inplace parameter and see how it affects the result of a method call. We will also run methods with and without inplace to demonstrate its effect.

In [13]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [14]:
%cd /content/drive/My Drive/Colab Notebooks

/content/drive/My Drive/Colab Notebooks


In [15]:
import pandas as pd
# Read the dataset into a DataFrame.
top_movies = pd.read_csv('data-movies-top-grossing.csv')

Since it is a CSV file, we are using pandas' read_csv function for this. Now that we have read our dataset into a DataFrame, let's take a look at a few of the records.

In [16]:
top_movies

,Rank,Title,Worldwide gross,Year
0,1,Avatar,"$2,787,965,087",2009
1,2,Titanic,"$2,186,772,302",1997
2,3,Star Wars: The Force Awakens,"$2,068,223,624",2015
3,4,Jurassic World,"$1,671,713,208",2015
4,5,The Avengers,"$1,518,812,988",2012
5,6,Furious 7,"$1,516,045,911",2015
6,7,Avengers: Age of Ultron,"$1,405,403,694",2015
7,8,Harry Potter and the Deathly Hallows – Part 2,"$1,341,511,219",2011
8,9,Frozen,"$1,287,000,000",2013
9,10,Beauty and the Beast,"$1,257,024,611",2017
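
For anyone following along without the file on Drive, the first few rows shown above can be rebuilt in memory. This is only a sketch: the name sample_movies and the three-row subset are assumptions made for illustration, not part of the original notebook.

```python
import io
import pandas as pd

# First three rows copied from the table above; no file on Drive is needed.
csv_text = '''Rank,Title,Worldwide gross,Year
1,Avatar,"$2,787,965,087",2009
2,Titanic,"$2,186,772,302",1997
3,Star Wars: The Force Awakens,"$2,068,223,624",2015
'''

# read_csv accepts any file-like object, so a StringIO buffer stands in for the file.
sample_movies = pd.read_csv(io.StringIO(csv_text))

print(sample_movies.shape)          # (3, 4)
print(list(sample_movies.columns))  # ['Rank', 'Title', 'Worldwide gross', 'Year']
```

The quoted gross values contain commas, which read_csv keeps inside a single field because of the surrounding double quotes.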


The data we are using comes from Wikipedia; it lists the worldwide gross of the top-grossing movies. Most pandas DataFrame methods return a new DataFrame. However, you might want a method to modify the original DataFrame itself. This is where the inplace parameter is useful. First, let's call a method on a DataFrame without the inplace parameter to see how it behaves.

In [17]:
top_movies.set_index('Rank').head()

Rank,Title,Worldwide gross,Year
1,Avatar,"$2,787,965,087",2009
2,Titanic,"$2,186,772,302",1997
3,Star Wars: The Force Awakens,"$2,068,223,624",2015
4,Jurassic World,"$1,671,713,208",2015
5,The Avengers,"$1,518,812,988",2012


Here we are setting one of the columns, Rank, as the index for our DataFrame. The output shows Rank as the index, but that output is the DataFrame returned by set_index. Let's check now to see if it has modified the original DataFrame or not.

In [18]:
top_movies.head()

,Rank,Title,Worldwide gross,Year
0,1,Avatar,"$2,787,965,087",2009
1,2,Titanic,"$2,186,772,302",1997
2,3,Star Wars: The Force Awakens,"$2,068,223,624",2015
3,4,Jurassic World,"$1,671,713,208",2015
4,5,The Avengers,"$1,518,812,988",2012
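
The same check can be done in code rather than by eye. The sketch below uses a small hand-built DataFrame so that it runs on its own; sample_movies and ranked_movies are hypothetical names chosen for illustration.

```python
import pandas as pd

# A tiny stand-in for the movie data, built by hand for this example.
sample_movies = pd.DataFrame({
    'Rank': [1, 2, 3],
    'Title': ['Avatar', 'Titanic', 'Star Wars: The Force Awakens'],
    'Year': [2009, 1997, 2015],
})

# Without inplace, set_index returns a new DataFrame; assign it to keep it.
ranked_movies = sample_movies.set_index('Rank')

print(ranked_movies.index.name)         # Rank
print('Rank' in sample_movies.columns)  # True: the original still has Rank as a column
```

Assigning the result to a new name keeps both versions available, which is handy when the original layout is still needed later.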


We can see that the original DataFrame has not changed. The set_index method applied the change to a completely new DataFrame in memory, which we could have kept by assigning it to a variable. Now let's see how it works if we pass the inplace parameter.

In [20]:
top_movies.set_index('Rank', inplace=True)

In [21]:
top_movies.head()

Rank,Title,Worldwide gross,Year
1,Avatar,"$2,787,965,087",2009
2,Titanic,"$2,186,772,302",1997
3,Star Wars: The Force Awakens,"$2,068,223,624",2015
4,Jurassic World,"$1,671,713,208",2015
5,The Avengers,"$1,518,812,988",2012
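
As a rough sketch of what this means in practice, passing inplace=True and assigning the returned DataFrame back to the same name end up with the same data. The helper make_sample and the names in_place and reassigned are assumptions introduced only for this example.

```python
import pandas as pd


def make_sample():
    """Build a small stand-in for the movie data (hypothetical helper)."""
    return pd.DataFrame({
        'Rank': [1, 2, 3],
        'Title': ['Avatar', 'Titanic', 'Star Wars: The Force Awakens'],
        'Year': [2009, 1997, 2015],
    })


in_place = make_sample()
reassigned = make_sample()

# Option 1: modify the existing DataFrame object.
in_place.set_index('Rank', inplace=True)

# Option 2: take the returned DataFrame and rebind the name to it.
reassigned = reassigned.set_index('Rank')

print(in_place.index.name)          # Rank
print(in_place.equals(reassigned))  # True
```

Either style works here; the difference is whether the change happens on the existing object or on a new one that replaces it.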


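We can see that passing inplace=True did modify the original DataFrame. Not every change to a DataFrame goes through the inplace parameter; assigning directly to top_movies.columns, for example, alters the original. The rename method, however, behaves like set_index: it returns a new DataFrame by default and leaves the original untouched unless we pass inplace=True or assign the result back. The next cell shows the renamed column only because head() is called on the DataFrame that rename returns.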

In [22]:
top_movies.rename(columns={'Year':'Release Year'}).head()
# Renames the 'Year' column to 'Release Year' in the returned DataFrame

Rank,Title,Worldwide gross,Release Year
1,Avatar,"$2,787,965,087",2009
2,Titanic,"$2,186,772,302",1997
3,Star Wars: The Force Awakens,"$2,068,223,624",2015
4,Jurassic World,"$1,671,713,208",2015
5,The Avengers,"$1,518,812,988",2012
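
One way to confirm what rename did to the original is to compare the column names before and after the call. This is a minimal sketch on a hand-built DataFrame; sample_movies and renamed are hypothetical names.

```python
import pandas as pd

sample_movies = pd.DataFrame({
    'Title': ['Avatar', 'Titanic'],
    'Year': [2009, 1997],
})

# rename returns a new DataFrame by default.
renamed = sample_movies.rename(columns={'Year': 'Release Year'})

print(list(renamed.columns))        # ['Title', 'Release Year']
print(list(sample_movies.columns))  # ['Title', 'Year']

# To keep the new name, assign the result back (or pass inplace=True).
sample_movies = sample_movies.rename(columns={'Year': 'Release Year'})
print(list(sample_movies.columns))  # ['Title', 'Release Year']
```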


It is a good idea to get familiar with which methods accept inplace and what they do by default. In this video, we learned how to modify a DataFrame using the inplace parameter. We introduced pandas' inplace parameter and saw how it affects the result of a method call. We also ran methods with and without the inplace parameter to demonstrate the difference in the result.