## **Pandas DataFrame**

**A DataFrame in Pandas is a two-dimensional labeled data structure, similar to a table in a database or an Excel spreadsheet.**

**Key Points :**

- It consists of rows and columns

- Each column has its own data type, so different columns can hold different kinds of data (integers, floats, strings, etc.)

- Think of it as a dictionary of Series, where each key is a column name
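
The "dictionary of Series" idea can be sketched with a tiny example. The column names and values here are made up for illustration only:

```python
import pandas as pd

# Two Series that share the same default index labels (0, 1, 2)
iq = pd.Series([100, 80, 120])
grade = pd.Series(['A', 'C', 'A'])

# The dictionary keys become the column names
df = pd.DataFrame({'iq': iq, 'grade': grade})

# Pulling a single column back out returns a Series
iq_column = df['iq']
print(type(iq_column).__name__)  # Series

# Each column keeps its own dtype
print(df.dtypes)
```
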

**Features :**

- Easy to filter rows and columns

- Allows reading from and writing to sources such as CSV files, Excel workbooks, SQL databases, etc.

- Great for data cleaning, analysis, and manipulation

In [1]:
import numpy as np
import pandas as pd

#### **Creating DataFrame**

**Using Lists**

In [2]:
students_data = [
    [100,90,12],
    [80,60,5],
    [120,98,20],
    [200,100,50]
]

In [3]:
# Create dataframe using list

pd.DataFrame(students_data,columns=['iq','marks','package'])

Unnamed: 0,iq,marks,package
0,100,90,12
1,80,60,5
2,120,98,20
3,200,100,50


**Using Dictionary**

In [4]:
stud_dict = {
    'iq' : [100,80,120,95],
    'marks' : [95,70,99,85] ,
    'package' : [20,10,30,15] 
}

In [5]:
# Creating dataframe using dictionary

students = pd.DataFrame(stud_dict)
students

Unnamed: 0,iq,marks,package
0,100,95,20
1,80,70,10
2,120,99,30
3,95,85,15


**Using read_csv**

In [6]:
# ipl runs data

runs = pd.read_csv('Datasets/batsman_runs_ipl.csv') 

runs.sample(5)

Unnamed: 0,batter,batsman_run
62,Abdul Samad,228
543,T Natarajan,3
571,V Sehwag,2728
115,CK Langeveldt,8
109,CH Morris,618
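
If the dataset files are not at hand, `read_csv` can be tried on in-memory text instead. This is a minimal sketch: the CSV text is a made-up stand-in for a real file, and `random_state` is passed only so that `sample` returns the same rows on every run.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = "batter,batsman_run\nA,10\nB,25\nC,7\n"

# read_csv accepts a file path or any file-like object
demo = pd.read_csv(io.StringIO(csv_text))

# A fixed random_state makes the random sample reproducible
picked = demo.sample(2, random_state=0)
print(picked)
```
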


In [7]:
# ipl matches data

ipl = pd.read_csv('Datasets/ipl-matches.csv')
ipl.sample(5)

Unnamed: 0,ID,City,Date,Season,MatchNumber,Team1,Team2,Venue,TossWinner,TossDecision,SuperOver,WinningTeam,WonBy,Margin,method,Player_of_Match,Team1Players,Team2Players,Umpire1,Umpire2
578,598047,Jaipur,2013-05-05,2013,50,Rajasthan Royals,Pune Warriors,Sawai Mansingh Stadium,Pune Warriors,bat,N,Rajasthan Royals,Wickets,5.0,,AM Rahane,"['R Dravid', 'AM Rahane', 'SR Watson', 'BJ Hod...","['RV Uthappa', 'AJ Finch', 'Yuvraj Singh', 'MR...",C Shamshuddin,RJ Tucker
480,829729,Mumbai,2015-04-17,2015,12,Mumbai Indians,Chennai Super Kings,Wankhede Stadium,Mumbai Indians,bat,N,Chennai Super Kings,Wickets,6.0,,A Nehra,"['LMP Simmons', 'PA Patel', 'CJ Anderson', 'RG...","['DR Smith', 'BB McCullum', 'SK Raina', 'F du ...",AK Chaudhary,M Erasmus
155,1216494,Abu Dhabi,2020-10-21,2020/21,39,Kolkata Knight Riders,Royal Challengers Bangalore,Sheikh Zayed Stadium,Kolkata Knight Riders,bat,N,Royal Challengers Bangalore,Wickets,8.0,,Mohammed Siraj,"['Shubman Gill', 'RA Tripathi', 'N Rana', 'T B...","['D Padikkal', 'AJ Finch', 'Gurkeerat Singh', ...",VK Sharma,S Ravi
592,598032,Jaipur,2013-04-27,2013,36,Rajasthan Royals,Sunrisers Hyderabad,Sawai Mansingh Stadium,Sunrisers Hyderabad,bat,N,Rajasthan Royals,Wickets,8.0,,JP Faulkner,"['R Dravid', 'AM Rahane', 'SR Watson', 'STR Bi...","['PA Reddy', 'S Dhawan', 'KC Sangakkara', 'GH ...",VA Kulkarni,K Srinath
583,598041,Chennai,2013-05-02,2013,45,Chennai Super Kings,Kings XI Punjab,"MA Chidambaram Stadium, Chepauk",Chennai Super Kings,bat,N,Chennai Super Kings,Runs,15.0,,SK Raina,"['WP Saha', 'MEK Hussey', 'SK Raina', 'MS Dhon...","['LA Pomersbach', 'Mandeep Singh', 'SE Marsh',...",M Erasmus,VA Kulkarni


In [8]:
# Movies Data
movies = pd.read_csv('Datasets/movies.csv')
movies.sample(5)

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date
452,Ranbanka,tt5151622,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Ranbanka,Ranbanka,Ranbanka,0,2015,98,Action,3.7,118,Rahul an engineer tries his best to protect ...,A non-violent engineer resorts to violence aft...,A Film By Aryeman,Manish Paul|Ravi Kishan|Pooja Thakur|Rudra Kau...,1 win,6 November 2015 (India)
1402,Border (1997 film),tt0347416,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Border_(1997_film),LOC: Kargil,LOC: Kargil,0,2003,255,Drama|History|War,5.1,2284,LOC KARGIL is the story of Indian soldiers fig...,Based on the real story during Kargil war foug...,Heroes do not choose their destiny| destiny ch...,Sanjay Dutt|Ajay Devgn|Saif Ali Khan|Sunil She...,2 wins & 10 nominations,12 December 2003 (India)
1482,Talaash: The Hunt Begins...,tt0338477,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Talaash:_The_Hun...,Talaash: The Hunt Begins...,Talaash: The Hunt Begins...,0,2003,153,Action|Drama|Mystery,4.7,1292,After receiving pardon from his jail term Babu...,#NAME?,The hunt begins...,Rakhee Gulzar|Akshay Kumar|Kareena Kapoor|Pooj...,,3 January 2003 (India)
813,Desi Boyz,tt1985981,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Desi_Boyz,Desi Boyz,Desi Boyz,0,2011,122,Comedy|Drama,5.7,12089,Based in England Nikhil Mathur is employed as...,Two friends lose their jobs then part bitterl...,,Akshay Kumar|John Abraham|Deepika Padukone|Chi...,2 wins & 6 nominations,25 November 2011 (India)
631,John Day (film),tt2699840,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/John_Day_(film),John Day,John Day,0,2013,138,Drama,5.7,735,John Day is a bank manager and lives with his ...,John Day is a bank manager and lives with his ...,,Naseeruddin Shah|Randeep Hooda|Elena Kazan|She...,,13 September 2013 (India)


In [9]:
# Diabetes

diabetes = pd.read_csv('Datasets/diabetes.csv')
diabetes.sample(5)

Unnamed: 0,Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
103,1,81,72,18,40,26.6,0.283,24,0
55,1,73,50,10,0,23.0,0.248,21,0
713,0,134,58,20,291,26.4,0.352,21,0
579,2,197,70,99,0,34.7,0.575,62,1
311,0,106,70,37,148,39.4,0.605,22,0


#### **DataFrame Attributes and Methods**

##### **Attributes**

**1 ) Shape**

In [10]:
movies.shape

(1629, 18)

In [11]:
runs.shape

(605, 2)

**2 ) Dtypes**

In [12]:
ipl.dtypes

ID                   int64
City                object
Date                object
Season              object
MatchNumber         object
Team1               object
Team2               object
Venue               object
TossWinner          object
TossDecision        object
SuperOver           object
WinningTeam         object
WonBy               object
Margin             float64
method              object
Player_of_Match     object
Team1Players        object
Team2Players        object
Umpire1             object
Umpire2             object
dtype: object
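
In the output above, `Date` is read as plain text. One possible follow-up, sketched here on a small made-up frame rather than the real `ipl` data, is to convert such a column with `pd.to_datetime` so that date parts become available:

```python
import pandas as pd

# Hypothetical frame with dates stored as text, similar to ipl['Date']
matches = pd.DataFrame({
    'Date': ['2022-05-29', '2022-05-27'],
    'Margin': [7.0, 7.0],
})

# Convert the text column to a datetime column
matches['Date'] = pd.to_datetime(matches['Date'])
print(matches.dtypes)

# Datetime columns expose parts of the date through the .dt accessor
years = matches['Date'].dt.year
print(years.tolist())
```
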

**3 ) Index**

In [13]:
ipl.index

RangeIndex(start=0, stop=950, step=1)

In [14]:
movies.index

RangeIndex(start=0, stop=1629, step=1)

**4 ) Columns**

In [15]:
movies.columns

Index(['title_x', 'imdb_id', 'poster_path', 'wiki_link', 'title_y',
       'original_title', 'is_adult', 'year_of_release', 'runtime', 'genres',
       'imdb_rating', 'imdb_votes', 'story', 'summary', 'tagline', 'actors',
       'wins_nominations', 'release_date'],
      dtype='object')

In [16]:
runs.columns

Index(['batter', 'batsman_run'], dtype='object')

**5 ) Values**

In [17]:
students.values

array([[100,  95,  20],
       [ 80,  70,  10],
       [120,  99,  30],
       [ 95,  85,  15]])

In [18]:
runs.values

array([['A Ashish Reddy', 280],
       ['A Badoni', 161],
       ['A Chandila', 4],
       ...,
       ['Younis Khan', 3],
       ['Yuvraj Singh', 2754],
       ['Z Khan', 117]], shape=(605, 2), dtype=object)
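
The pandas documentation recommends `to_numpy()` over `.values` when a NumPy array is needed. A small sketch on made-up data shows that both give the same array here:

```python
import pandas as pd

# Small illustrative frame (values are made up)
students_demo = pd.DataFrame({'iq': [100, 80], 'marks': [95, 70]})

# to_numpy() returns the data as a NumPy array
arr = students_demo.to_numpy()
print(arr.shape)

# For this all-integer frame, .values holds the same numbers
same = (arr == students_demo.values).all()
print(same)
```
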

##### **Methods**

**1 ) Head and Tail**

In [19]:
# head

ipl.head(10)  # first ten rows (head() with no argument returns five)

Unnamed: 0,ID,City,Date,Season,MatchNumber,Team1,Team2,Venue,TossWinner,TossDecision,SuperOver,WinningTeam,WonBy,Margin,method,Player_of_Match,Team1Players,Team2Players,Umpire1,Umpire2
0,1312200,Ahmedabad,2022-05-29,2022,Final,Rajasthan Royals,Gujarat Titans,"Narendra Modi Stadium, Ahmedabad",Rajasthan Royals,bat,N,Gujarat Titans,Wickets,7.0,,HH Pandya,"['YBK Jaiswal', 'JC Buttler', 'SV Samson', 'D ...","['WP Saha', 'Shubman Gill', 'MS Wade', 'HH Pan...",CB Gaffaney,Nitin Menon
1,1312199,Ahmedabad,2022-05-27,2022,Qualifier 2,Royal Challengers Bangalore,Rajasthan Royals,"Narendra Modi Stadium, Ahmedabad",Rajasthan Royals,field,N,Rajasthan Royals,Wickets,7.0,,JC Buttler,"['V Kohli', 'F du Plessis', 'RM Patidar', 'GJ ...","['YBK Jaiswal', 'JC Buttler', 'SV Samson', 'D ...",CB Gaffaney,Nitin Menon
2,1312198,Kolkata,2022-05-25,2022,Eliminator,Royal Challengers Bangalore,Lucknow Super Giants,"Eden Gardens, Kolkata",Lucknow Super Giants,field,N,Royal Challengers Bangalore,Runs,14.0,,RM Patidar,"['V Kohli', 'F du Plessis', 'RM Patidar', 'GJ ...","['Q de Kock', 'KL Rahul', 'M Vohra', 'DJ Hooda...",J Madanagopal,MA Gough
3,1312197,Kolkata,2022-05-24,2022,Qualifier 1,Rajasthan Royals,Gujarat Titans,"Eden Gardens, Kolkata",Gujarat Titans,field,N,Gujarat Titans,Wickets,7.0,,DA Miller,"['YBK Jaiswal', 'JC Buttler', 'SV Samson', 'D ...","['WP Saha', 'Shubman Gill', 'MS Wade', 'HH Pan...",BNJ Oxenford,VK Sharma
4,1304116,Mumbai,2022-05-22,2022,70,Sunrisers Hyderabad,Punjab Kings,"Wankhede Stadium, Mumbai",Sunrisers Hyderabad,bat,N,Punjab Kings,Wickets,5.0,,Harpreet Brar,"['PK Garg', 'Abhishek Sharma', 'RA Tripathi', ...","['JM Bairstow', 'S Dhawan', 'M Shahrukh Khan',...",AK Chaudhary,NA Patwardhan
5,1304115,Mumbai,2022-05-21,2022,69,Delhi Capitals,Mumbai Indians,"Wankhede Stadium, Mumbai",Mumbai Indians,field,N,Mumbai Indians,Wickets,5.0,,JJ Bumrah,"['PP Shaw', 'DA Warner', 'MR Marsh', 'RR Pant'...","['Ishan Kishan', 'RG Sharma', 'D Brevis', 'Til...",Nitin Menon,Tapan Sharma
6,1304114,Mumbai,2022-05-20,2022,68,Chennai Super Kings,Rajasthan Royals,"Brabourne Stadium, Mumbai",Chennai Super Kings,bat,N,Rajasthan Royals,Wickets,5.0,,R Ashwin,"['RD Gaikwad', 'DP Conway', 'MM Ali', 'N Jagad...","['YBK Jaiswal', 'JC Buttler', 'SV Samson', 'D ...",CB Gaffaney,NA Patwardhan
7,1304113,Mumbai,2022-05-19,2022,67,Gujarat Titans,Royal Challengers Bangalore,"Wankhede Stadium, Mumbai",Gujarat Titans,bat,N,Royal Challengers Bangalore,Wickets,8.0,,V Kohli,"['WP Saha', 'Shubman Gill', 'MS Wade', 'HH Pan...","['V Kohli', 'F du Plessis', 'GJ Maxwell', 'KD ...",KN Ananthapadmanabhan,GR Sadashiv Iyer
8,1304112,Navi Mumbai,2022-05-18,2022,66,Lucknow Super Giants,Kolkata Knight Riders,"Dr DY Patil Sports Academy, Mumbai",Lucknow Super Giants,bat,N,Lucknow Super Giants,Runs,2.0,,Q de Kock,"['Q de Kock', 'KL Rahul', 'E Lewis', 'DJ Hooda...","['VR Iyer', 'A Tomar', 'N Rana', 'SS Iyer', 'S...",R Pandit,YC Barde
9,1304111,Mumbai,2022-05-17,2022,65,Sunrisers Hyderabad,Mumbai Indians,"Wankhede Stadium, Mumbai",Mumbai Indians,field,N,Sunrisers Hyderabad,Runs,3.0,,RA Tripathi,"['Abhishek Sharma', 'PK Garg', 'RA Tripathi', ...","['RG Sharma', 'Ishan Kishan', 'DR Sams', 'Tila...",CB Gaffaney,N Pandit


In [20]:
runs.tail(5)  # last five 

Unnamed: 0,batter,batsman_run
600,Yash Dayal,0
601,Yashpal Singh,47
602,Younis Khan,3
603,Yuvraj Singh,2754
604,Z Khan,117


**2 ) Sample**

In [21]:
movies.sample(2)  # random 2 rows 

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date
1310,Vidyaarthi,tt1483815,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Vidyaarthi,Vidhyaarthi: The Power of Students,Vidhyaarthi: The Power of Students,0,2006,250,Action,5.0,8,This story about underworld and crime. How can...,This story about underworld and crime. How can...,The power of students,Vikram Aditya|Akash Ajmera|Ishrat Ali|Rajesh B...,,18 June 2006 (India)
1590,Deewaanapan,tt0301179,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Deewaanapan,Deewaanapan,Deewaanapan,0,2001,153,Action|Romance,4.6,378,Suraj Saxena (Arjun Rampal) lives in a remote ...,A rebel crosses swords with an influential ric...,,Arjun Rampal|Dia Mirza|Vinod Khanna|Om Puri|Sm...,1 win & 2 nominations,16 November 2001 (India)


**3 ) Info**

In [22]:
ipl.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 950 entries, 0 to 949
Data columns (total 20 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   ID               950 non-null    int64  
 1   City             899 non-null    object 
 2   Date             950 non-null    object 
 3   Season           950 non-null    object 
 4   MatchNumber      950 non-null    object 
 5   Team1            950 non-null    object 
 6   Team2            950 non-null    object 
 7   Venue            950 non-null    object 
 8   TossWinner       950 non-null    object 
 9   TossDecision     950 non-null    object 
 10  SuperOver        946 non-null    object 
 11  WinningTeam      946 non-null    object 
 12  WonBy            950 non-null    object 
 13  Margin           932 non-null    float64
 14  method           19 non-null     object 
 15  Player_of_Match  946 non-null    object 
 16  Team1Players     950 non-null    object 
 17  Team2Players    

**4 ) Describe**

In [23]:
diabetes.describe()

Unnamed: 0,Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
count,768.0,768.0,768.0,768.0,768.0,768.0,768.0,768.0,768.0
mean,3.845052,120.894531,69.105469,20.536458,79.799479,31.992578,0.471876,33.240885,0.348958
std,3.369578,31.972618,19.355807,15.952218,115.244002,7.88416,0.331329,11.760232,0.476951
min,0.0,0.0,0.0,0.0,0.0,0.0,0.078,21.0,0.0
25%,1.0,99.0,62.0,0.0,0.0,27.3,0.24375,24.0,0.0
50%,3.0,117.0,72.0,23.0,30.5,32.0,0.3725,29.0,0.0
75%,6.0,140.25,80.0,32.0,127.25,36.6,0.62625,41.0,1.0
max,17.0,199.0,122.0,99.0,846.0,67.1,2.42,81.0,1.0


In [24]:
movies.describe()

Unnamed: 0,is_adult,year_of_release,imdb_rating,imdb_votes
count,1629.0,1629.0,1629.0,1629.0
mean,0.0,2010.263966,5.557459,5384.263352
std,0.0,5.381542,1.567609,14552.103231
min,0.0,2001.0,0.0,0.0
25%,0.0,2005.0,4.4,233.0
50%,0.0,2011.0,5.6,1000.0
75%,0.0,2015.0,6.8,4287.0
max,0.0,2019.0,9.4,310481.0
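
By default `describe()` summarises only the numeric columns, which is why the text columns of `movies` are missing above. Passing `include='all'` adds the other columns too. A minimal sketch on a made-up frame:

```python
import pandas as pd

# Hypothetical frame with one text column and one numeric column
films = pd.DataFrame({
    'title': ['A', 'B', 'C'],
    'rating': [5.0, 6.5, 8.0],
})

# Default: numeric columns only
numeric_only = films.describe()

# include='all': text columns get count, unique, top and freq
everything = films.describe(include='all')
print(everything)
```
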


**5 ) Isnull**

In [25]:
ipl.isnull().sum()   # missing values

ID                   0
City                51
Date                 0
Season               0
MatchNumber          0
Team1                0
Team2                0
Venue                0
TossWinner           0
TossDecision         0
SuperOver            4
WinningTeam          4
WonBy                0
Margin              18
method             931
Player_of_Match      4
Team1Players         0
Team2Players         0
Umpire1              0
Umpire2              0
dtype: int64

In [26]:
ipl.isnull().mean()*100  # percentage of missing values

ID                  0.000000
City                5.368421
Date                0.000000
Season              0.000000
MatchNumber         0.000000
Team1               0.000000
Team2               0.000000
Venue               0.000000
TossWinner          0.000000
TossDecision        0.000000
SuperOver           0.421053
WinningTeam         0.421053
WonBy               0.000000
Margin              1.894737
method             98.000000
Player_of_Match     0.421053
Team1Players        0.000000
Team2Players        0.000000
Umpire1             0.000000
Umpire2             0.000000
dtype: float64
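
The percentage trick works because `isnull()` returns a table of True/False values, and True counts as 1 when summed or averaged. A small sketch with made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a few missing entries
df = pd.DataFrame({
    'city': ['Pune', np.nan, 'Delhi', np.nan],
    'margin': [5.0, 7.0, np.nan, 2.0],
})

mask = df.isnull()               # True where a value is missing
missing_count = mask.sum()       # number of True values per column
missing_pct = mask.mean() * 100  # share of True values per column, in percent

print(missing_count)
print(missing_pct)
```
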

**6 ) Duplicated**

In [27]:
print(ipl.duplicated().sum())

0
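
Since `ipl` has no duplicate rows, here is a sketch on a made-up frame that does contain one. It shows what `duplicated()` marks and how `drop_duplicates()` removes the repeats:

```python
import pandas as pd

# Hypothetical frame where the third row repeats the first
df = pd.DataFrame({
    'batter': ['A', 'B', 'A'],
    'runs': [10, 20, 10],
})

# True for every row that repeats an earlier row
dup_mask = df.duplicated()
n_dup = dup_mask.sum()

# Keep only the first occurrence of each row
deduped = df.drop_duplicates()
print(n_dup, len(deduped))
```
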


**7 ) Rename**

In [28]:
students.rename(columns={'package':'lpa'},inplace=True)
students

Unnamed: 0,iq,marks,lpa
0,100,95,20
1,80,70,10
2,120,99,30
3,95,85,15
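
Without `inplace=True`, `rename` leaves the original DataFrame untouched and returns a renamed copy. A minimal sketch on a made-up frame:

```python
import pandas as pd

# Small illustrative frame (values are made up)
original = pd.DataFrame({'iq': [100, 80], 'package': [20, 10]})

# rename returns a new DataFrame; the original keeps its column names
renamed = original.rename(columns={'package': 'lpa'})

print(list(original.columns))
print(list(renamed.columns))
```
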


#### **Maths Methods**

**1 ) Sum**

In [29]:
students

Unnamed: 0,iq,marks,lpa
0,100,95,20
1,80,70,10
2,120,99,30
3,95,85,15


In [30]:
students.sum()  # Default (axis=0): one total per column

iq       395
marks    349
lpa       75
dtype: int64

In [31]:
students.sum(axis=1)  # axis=1: one total per row

0    215
1    160
2    249
3    195
dtype: int64
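
The `axis` argument can also be written as a name, which some readers find easier to remember. This sketch rebuilds the same `students` values and compares the two spellings:

```python
import pandas as pd

# Same values as the students DataFrame above
students_demo = pd.DataFrame({
    'iq': [100, 80, 120, 95],
    'marks': [95, 70, 99, 85],
    'lpa': [20, 10, 30, 15],
})

# axis=0 (or 'index') collapses the rows: one result per column
col_totals = students_demo.sum(axis=0)

# axis=1 (or 'columns') collapses the columns: one result per row
row_totals = students_demo.sum(axis=1)
row_totals_named = students_demo.sum(axis='columns')

print(col_totals)
print(row_totals)
```
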

**2 ) Mean**

In [32]:
students.mean()  # Column wise 

iq       98.75
marks    87.25
lpa      18.75
dtype: float64

In [33]:
students.mean(axis=1)

0    71.666667
1    53.333333
2    83.000000
3    65.000000
dtype: float64

**3 ) Min**

In [34]:
students.min()

iq       80
marks    70
lpa      10
dtype: int64

In [35]:
students.min(axis=1)

0    20
1    10
2    30
3    15
dtype: int64

**4 ) Max**

In [36]:
students.max()

iq       120
marks     99
lpa       30
dtype: int64

In [37]:
students.max(axis=1)

0    100
1     80
2    120
3     95
dtype: int64

**5 ) Median**

In [38]:
students.median()

iq       97.5
marks    90.0
lpa      17.5
dtype: float64

In [39]:
students.median(axis=1)

0    95.0
1    70.0
2    99.0
3    85.0
dtype: float64

**6 ) Variance**

In [40]:
students.var()

iq       272.916667
marks    166.916667
lpa       72.916667
dtype: float64

In [41]:
students.var(axis=1)

0    2008.333333
1    1433.333333
2    2217.000000
3    1900.000000
dtype: float64

**7 ) Standard Deviation**

In [42]:
students.std()

iq       16.520190
marks    12.919623
lpa       8.539126
dtype: float64

In [43]:
students.std(axis=1)

0    44.814432
1    37.859389
2    47.085029
3    43.588989
dtype: float64
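
Both `var()` and `std()` in pandas use the sample formula by default (`ddof=1`, dividing by n - 1). The sketch below recomputes the `iq` variance by hand to show this, and uses `ddof=0` for the population version:

```python
import numpy as np
import pandas as pd

# The iq column from the students DataFrame
iq = pd.Series([100, 80, 120, 95])

sample_var = iq.var()            # default ddof=1: divide by n - 1
population_var = iq.var(ddof=0)  # divide by n

# Manual sample variance for comparison
manual = ((iq - iq.mean()) ** 2).sum() / (len(iq) - 1)

# Standard deviation is the square root of the variance
sample_std = iq.std()

print(sample_var, population_var, sample_std)
```
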

#### **Selecting columns from a DataFrame**

**Single column**

In [44]:
movies.columns

Index(['title_x', 'imdb_id', 'poster_path', 'wiki_link', 'title_y',
       'original_title', 'is_adult', 'year_of_release', 'runtime', 'genres',
       'imdb_rating', 'imdb_votes', 'story', 'summary', 'tagline', 'actors',
       'wins_nominations', 'release_date'],
      dtype='object')

In [45]:
movies['title_x'].sample(5)

436            Talvar (film)
291          Dear Dad (film)
973     Raat Gayi Baat Gayi?
403          Kaagaz Ke Fools
1401              LOC Kargil
Name: title_x, dtype: object

In [46]:
type(movies['title_x'])  # It is a series

pandas.core.series.Series

**Multiple Columns**

In [47]:
movies[['title_x','imdb_rating','genres']] # (follows the column order you provide)

Unnamed: 0,title_x,imdb_rating,genres
0,Uri: The Surgical Strike,8.4,Action|Drama|War
1,Battalion 609,4.1,War
2,The Accidental Prime Minister (film),6.1,Biography|Drama
3,Why Cheat India,6.0,Crime|Drama
4,Evening Shadows,7.3,Drama
...,...,...,...
1624,Tera Mera Saath Rahen,4.9,Drama
1625,Yeh Zindagi Ka Safar,3.0,Drama
1626,Sabse Bada Sukh,6.1,Comedy|Drama
1627,Daaka,7.4,Action


In [48]:
type(movies[['title_x','imdb_rating','genres']])  # it is a DataFrame

pandas.core.frame.DataFrame

#### **Selecting rows from DataFrame**

- **iloc : selects rows by integer position (0, 1, 2, ...)**

- **loc : selects rows by index label**

In [49]:
stud_data = {
    'name' : ['Kisan','Kisna','Shubham','Fake'],
    'iq' : [100,80,120,95],
    'marks' : [95,70,99,85] ,
    'package' : [20,10,30,15] 
}

In [50]:
stud_data = pd.DataFrame(stud_data)
stud_data.set_index('name',inplace=True)

In [51]:
stud_data

name,iq,marks,package
Kisan,100,95,20
Kisna,80,70,10
Shubham,120,99,30
Fake,95,85,15


##### **iloc**

**Single row**

In [52]:
movies.head(3)

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date
0,Uri: The Surgical Strike,tt8291224,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Uri:_The_Surgica...,Uri: The Surgical Strike,Uri: The Surgical Strike,0,2019,138,Action|Drama|War,8.4,35112,Divided over five chapters the film chronicle...,Indian army special forces execute a covert op...,,Vicky Kaushal|Paresh Rawal|Mohit Raina|Yami Ga...,4 wins,11 January 2019 (USA)
1,Battalion 609,tt9472208,,https://en.wikipedia.org/wiki/Battalion_609,Battalion 609,Battalion 609,0,2019,131,War,4.1,73,The story revolves around a cricket match betw...,The story of Battalion 609 revolves around a c...,,Vicky Ahuja|Shoaib Ibrahim|Shrikant Kamat|Elen...,,11 January 2019 (India)
2,The Accidental Prime Minister (film),tt6986710,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/The_Accidental_P...,The Accidental Prime Minister,The Accidental Prime Minister,0,2019,112,Biography|Drama,6.1,5549,Based on the memoir by Indian policy analyst S...,Explores Manmohan Singh's tenure as the Prime ...,,Anupam Kher|Akshaye Khanna|Aahana Kumra|Atul S...,,11 January 2019 (USA)


In [53]:
movies.iloc[1]

title_x                                                 Battalion 609
imdb_id                                                     tt9472208
poster_path                                                       NaN
wiki_link                 https://en.wikipedia.org/wiki/Battalion_609
title_y                                                 Battalion 609
original_title                                          Battalion 609
is_adult                                                            0
year_of_release                                                  2019
runtime                                                           131
genres                                                            War
imdb_rating                                                       4.1
imdb_votes                                                         73
story               The story revolves around a cricket match betw...
summary             The story of Battalion 609 revolves around a c...
tagline             

**Multiple Rows**

In [54]:
movies.iloc[0:16:2]

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date
0,Uri: The Surgical Strike,tt8291224,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Uri:_The_Surgica...,Uri: The Surgical Strike,Uri: The Surgical Strike,0,2019,138,Action|Drama|War,8.4,35112,Divided over five chapters the film chronicle...,Indian army special forces execute a covert op...,,Vicky Kaushal|Paresh Rawal|Mohit Raina|Yami Ga...,4 wins,11 January 2019 (USA)
2,The Accidental Prime Minister (film),tt6986710,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/The_Accidental_P...,The Accidental Prime Minister,The Accidental Prime Minister,0,2019,112,Biography|Drama,6.1,5549,Based on the memoir by Indian policy analyst S...,Explores Manmohan Singh's tenure as the Prime ...,,Anupam Kher|Akshaye Khanna|Aahana Kumra|Atul S...,,11 January 2019 (USA)
4,Evening Shadows,tt6028796,,https://en.wikipedia.org/wiki/Evening_Shadows,Evening Shadows,Evening Shadows,0,2018,102,Drama,7.3,280,While gay rights and marriage equality has bee...,Under the 'Evening Shadows' truth often plays...,,Mona Ambegaonkar|Ananth Narayan Mahadevan|Deva...,17 wins & 1 nomination,11 January 2019 (India)
6,Fraud Saiyaan,tt5013008,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Fraud_Saiyaan,Fraud Saiyaan,Fraud Saiyyan,0,2019,109,Comedy|Drama,4.2,504,Fraud Saiyyan is the story of a con artist in ...,Fraud Saiyyan is the story of a con artist in ...,,Arshad Warsi|Saurabh Shukla|Flora Saini|Sara L...,,18 January 2019 (India)
8,Manikarnika: The Queen of Jhansi,tt6903440,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Manikarnika:_The...,Manikarnika: The Queen of Jhansi,Manikarnika: The Queen of Jhansi,0,2019,148,Action|Biography|Drama,6.5,7361,Manikarnika born in Varanasi when Dixt a minis...,Story of Rani Lakshmibai one of the leading f...,,Kangana Ranaut|Rimi Sen|Atul Kulkarni|Nalneesh...,,25 January 2019 (USA)
10,Amavas,tt8396186,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Amavas,Amavas,Amavas,0,2019,134,Horror|Thriller,2.8,235,Far away from the bustle of the city a young ...,The lives of a couple turn into a nightmare a...,,Ali Asgar|Vivan Bhatena|Nargis Fakhri|Sachiin ...,,8 February 2019 (India)
12,Hum Chaar,tt9319812,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Hum_Chaar,Hum chaar,Hum chaar,0,2019,143,Drama,5.6,183,The story of the film revolves around four col...,The story of the film revolves around four col...,,Prit Kamani|Simran Sharma|Anshuman Malhotra|Tu...,,15 February 2019 (India)
14,Sonchiriya,tt8108200,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Sonchiriya,Sonchiriya,Sonchiriya,0,2019,143,Action|Crime|Drama,7.5,2322,This is the story of a group of rebels in Cham...,Set in the Chambal valley the film follows th...,,Sushant Singh Rajput|Bhumi Pednekar|Manoj Bajp...,,1 March 2019 (India)


**Fancy Indexing**

In [55]:
movies.iloc[[1,10,15]]

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date
1,Battalion 609,tt9472208,,https://en.wikipedia.org/wiki/Battalion_609,Battalion 609,Battalion 609,0,2019,131,War,4.1,73,The story revolves around a cricket match betw...,The story of Battalion 609 revolves around a c...,,Vicky Ahuja|Shoaib Ibrahim|Shrikant Kamat|Elen...,,11 January 2019 (India)
10,Amavas,tt8396186,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Amavas,Amavas,Amavas,0,2019,134,Horror|Thriller,2.8,235,Far away from the bustle of the city a young ...,The lives of a couple turn into a nightmare a...,,Ali Asgar|Vivan Bhatena|Nargis Fakhri|Sachiin ...,,8 February 2019 (India)
15,Badla (2019 film),tt8130968,https://upload.wikimedia.org/wikipedia/en/0/0c...,https://en.wikipedia.org/wiki/Badla_(2019_film),Badla,Badla,0,2019,118,Crime|Drama|Mystery,7.9,15499,Naina Sethi a successful entrepreneur finds he...,A dynamic young entrepreneur finds herself loc...,,Amitabh Bachchan|Taapsee Pannu|Amrita Singh|An...,1 win,8 March 2019 (India)


##### **loc**

In [56]:
stud_data

name,iq,marks,package
Kisan,100,95,20
Kisna,80,70,10
Shubham,120,99,30
Fake,95,85,15


In [57]:
stud_data.loc['Kisan']  # It is a series

iq         100
marks       95
package     20
Name: Kisan, dtype: int64

In [58]:
stud_data.loc['Kisan':'Shubham']  # last one included

name,iq,marks,package
Kisan,100,95,20
Kisna,80,70,10
Shubham,120,99,30


In [59]:
stud_data.loc['Kisan':'Shubham':2]  # Alternate rows

name,iq,marks,package
Kisan,100,95,20
Shubham,120,99,30


In [60]:
stud_data.loc[['Kisan','Fake']]

name,iq,marks,package
Kisan,100,95,20
Fake,95,85,15


In [61]:
# stud_data uses the names as its index labels, but every row still has an integer position (0 to 3 here)
# So iloc works on this DataFrame as well

stud_data.iloc[0]

iq         100
marks       95
package     20
Name: Kisan, dtype: int64

In [62]:
stud_data.iloc[0:3]

name,iq,marks,package
Kisan,100,95,20
Kisna,80,70,10
Shubham,120,99,30


In [63]:
stud_data.iloc[[1,3,2]]

name,iq,marks,package
Kisna,80,70,10
Fake,95,85,15
Shubham,120,99,30
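
One difference between the two is easy to miss: a `loc` slice includes its end label, while an `iloc` slice excludes its stop position, like ordinary Python slicing. A minimal sketch using the same names as `stud_data` (only the `iq` column is rebuilt here):

```python
import pandas as pd

# Same index labels as stud_data, with a single column for brevity
stud = pd.DataFrame(
    {'iq': [100, 80, 120, 95]},
    index=['Kisan', 'Kisna', 'Shubham', 'Fake'],
)

# Label slice: the end label 'Shubham' is included
by_label = stud.loc['Kisan':'Shubham']

# Position slice: the stop position 2 is excluded
by_position = stud.iloc[0:2]

print(len(by_label), len(by_position))
```
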


#### **Selecting both Rows and Columns**

In [64]:
movies.iloc[0:3,0:3]  # Using iloc

Unnamed: 0,title_x,imdb_id,poster_path
0,Uri: The Surgical Strike,tt8291224,https://upload.wikimedia.org/wikipedia/en/thum...
1,Battalion 609,tt9472208,
2,The Accidental Prime Minister (film),tt6986710,https://upload.wikimedia.org/wikipedia/en/thum...


In [65]:
movies.loc[0:2,'title_x':'poster_path']   # Using loc

Unnamed: 0,title_x,imdb_id,poster_path
0,Uri: The Surgical Strike,tt8291224,https://upload.wikimedia.org/wikipedia/en/thum...
1,Battalion 609,tt9472208,
2,The Accidental Prime Minister (film),tt6986710,https://upload.wikimedia.org/wikipedia/en/thum...
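
Rows and columns do not have to be contiguous slices. Lists work in both positions as well. A sketch on a made-up frame with the default 0, 1, 2 index:

```python
import pandas as pd

# Hypothetical frame (titles and numbers are made up)
df = pd.DataFrame({
    'title': ['A', 'B', 'C'],
    'year': [2019, 2018, 2017],
    'rating': [8.4, 4.1, 6.1],
})

# loc: lists of row labels and column names
picked = df.loc[[0, 2], ['title', 'rating']]

# iloc: lists of row positions and column positions
same = df.iloc[[0, 2], [0, 2]]

print(picked)
```

With the default index, labels and positions coincide, so both calls return the same rows here.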


#### **Filtering a DataFrame**

In [66]:
ipl.head(2)

Unnamed: 0,ID,City,Date,Season,MatchNumber,Team1,Team2,Venue,TossWinner,TossDecision,SuperOver,WinningTeam,WonBy,Margin,method,Player_of_Match,Team1Players,Team2Players,Umpire1,Umpire2
0,1312200,Ahmedabad,2022-05-29,2022,Final,Rajasthan Royals,Gujarat Titans,"Narendra Modi Stadium, Ahmedabad",Rajasthan Royals,bat,N,Gujarat Titans,Wickets,7.0,,HH Pandya,"['YBK Jaiswal', 'JC Buttler', 'SV Samson', 'D ...","['WP Saha', 'Shubman Gill', 'MS Wade', 'HH Pan...",CB Gaffaney,Nitin Menon
1,1312199,Ahmedabad,2022-05-27,2022,Qualifier 2,Royal Challengers Bangalore,Rajasthan Royals,"Narendra Modi Stadium, Ahmedabad",Rajasthan Royals,field,N,Rajasthan Royals,Wickets,7.0,,JC Buttler,"['V Kohli', 'F du Plessis', 'RM Patidar', 'GJ ...","['YBK Jaiswal', 'JC Buttler', 'SV Samson', 'D ...",CB Gaffaney,Nitin Menon


In [67]:
# Example : Find all the final winners

new_df = ipl[ipl['MatchNumber'] == 'Final'] 

new_df[['Season','WinningTeam']]

Unnamed: 0,Season,WinningTeam
0,2022,Gujarat Titans
74,2021,Chennai Super Kings
134,2020/21,Mumbai Indians
194,2019,Mumbai Indians
254,2018,Chennai Super Kings
314,2017,Mumbai Indians
373,2016,Sunrisers Hyderabad
433,2015,Mumbai Indians
492,2014,Kolkata Knight Riders
552,2013,Mumbai Indians


In [68]:
# in one line 

ipl[ipl['MatchNumber'] == 'Final'][['Season','WinningTeam']]

Unnamed: 0,Season,WinningTeam
0,2022,Gujarat Titans
74,2021,Chennai Super Kings
134,2020/21,Mumbai Indians
194,2019,Mumbai Indians
254,2018,Chennai Super Kings
314,2017,Mumbai Indians
373,2016,Sunrisers Hyderabad
433,2015,Mumbai Indians
492,2014,Kolkata Knight Riders
552,2013,Mumbai Indians


In [69]:
# Example : How many super over finishes have occurred
ipl[ipl['SuperOver'] == 'Y'].shape[0]

14

In [70]:
# or 

ipl['SuperOver'].value_counts()

SuperOver
N    932
Y     14
Name: count, dtype: int64

In [71]:
# Example : How many matches has CSK won in Kolkata

ipl[(ipl['City'] == 'Kolkata') & (ipl['WinningTeam'] == 'Chennai Super Kings')].shape[0]

5
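
It can help to build such a filter step by step. Each comparison gives a True/False Series, `&` combines them element by element, and each condition needs its own parentheses when written inline because `&` binds tighter than `==`. A sketch on made-up match data:

```python
import pandas as pd

# Hypothetical matches (not the real ipl data)
ipl_demo = pd.DataFrame({
    'City': ['Kolkata', 'Mumbai', 'Kolkata', 'Kolkata'],
    'WinningTeam': ['Chennai Super Kings', 'Chennai Super Kings',
                    'Kolkata Knight Riders', 'Chennai Super Kings'],
    'Season': ['2013', '2014', '2015', '2016'],
})

in_kolkata = ipl_demo['City'] == 'Kolkata'
csk_won = ipl_demo['WinningTeam'] == 'Chennai Super Kings'

# Element-wise AND of the two boolean Series
both = in_kolkata & csk_won

# True counts as 1, so sum() gives the number of matching rows
count = both.sum()

# loc takes the row mask and the wanted columns in one step
result = ipl_demo.loc[both, ['Season', 'WinningTeam']]
print(count)
print(result)
```
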

In [72]:
# Example : Percentage of matches in which the toss winner also won the match

(ipl[ipl['TossWinner'] == ipl['WinningTeam']]['WinningTeam'].shape[0] / ipl.shape[0]) * 100

51.473684210526315
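
The same percentage can be written more compactly by taking the mean of the boolean comparison. A sketch on made-up data showing that both forms agree:

```python
import pandas as pd

# Hypothetical matches (team names are placeholders)
demo = pd.DataFrame({
    'TossWinner': ['A', 'B', 'A', 'C'],
    'WinningTeam': ['A', 'A', 'B', 'C'],
})

# True where the toss winner also won the match
same_team = demo['TossWinner'] == demo['WinningTeam']

# Mean of a boolean Series is the fraction of True values
pct = same_team.mean() * 100

# Longer form: filter, count rows, divide by the total
long_form = demo[same_team].shape[0] / demo.shape[0] * 100

print(pct, long_form)
```

In both forms a match with no recorded winner compares as not equal, so it counts as a match the toss winner did not win.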

In [73]:
movies.sample(3)

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date
1379,Rudraksh (film),tt0366985,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Rudraksh_(film),Rudraksh,Rudraksh,0,2004,143,Action|Fantasy|Sci-Fi,2.6,1020,Healing powers that science cannot explain and...,Healing powers that science cannot explain and...,The power to possess,Sanjay Dutt|Bipasha Basu|Sunil Shetty|Isha Kop...,,13 February 2004 (India)
1508,Chor Machaaye Shor,tt0331216,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Chor_Machaaye_Shor,Chor Machaaye Shor,Chor Machaaye Shor,0,2002,145,Action|Comedy|Crime,4.2,1003,Fortunately before his arrest Shyam manages t...,Fortunately before his arrest Shyam manages t...,,Bobby Deol|Om Puri|Paresh Rawal|Shekhar Suman|...,,23 August 2002 (India)
888,Knock Out (2010 film),tt1558578,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Knock_Out_(2010_...,Knock Out,Knock Out,0,2010,117,Action|Crime|Drama,5.9,1857,While making a phone call from a phone booth i...,Hidden with a mysterious agenda a sniper hold...,,Sanjay Dutt|Irrfan Khan|Kangana Ranaut|Gulshan...,,15 October 2010 (India)


In [74]:
# Example : Number of movies with rating > 8 and votes > 10000

movies[(movies['imdb_rating'] > 8 ) & (movies['imdb_votes'] > 10000)].shape[0]

43

In [75]:
movies['genres'].value_counts()

genres
Drama                     162
Comedy|Drama|Romance      101
Comedy|Drama               88
Drama|Romance              86
Action|Crime|Drama         86
                         ... 
Action|Musical|Romance      1
Documentary|War             1
Action|Crime|Horror         1
Comedy|Fantasy              1
Comedy|Musical|Mystery      1
Name: count, Length: 205, dtype: int64

In [76]:
# Example : Action movies with rating > 7.5

mask1 = movies['genres'].str.split('|').apply(lambda x : 'Action' in x)
mask2 = movies['imdb_rating'] > 7.5

movies[mask1 & mask2].sample(5)

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date
112,Bhavesh Joshi Superhero,tt6129302,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Bhavesh_Joshi_Su...,Bhavesh Joshi Superhero,Bhavesh Joshi Superhero,0,2018,154,Action|Drama,7.6,4928,Bhavesh Joshi Superhero is an action film abou...,The origin story of Bhavesh Joshi an Indian s...,This year| justice will have a new name.,Harshvardhan Kapoor|Priyanshu Painyuli|Ashish ...,2 nominations,1 June 2018 (USA)
982,Jodhaa Akbar,tt0449994,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Jodhaa_Akbar,Jodhaa Akbar,Jodhaa Akbar,0,2008,213,Action|Drama|History,7.6,27541,Jodhaa Akbar is a sixteenth century love story...,A sixteenth century love story about a marriag...,,Hrithik Roshan|Aishwarya Rai Bachchan|Sonu Soo...,32 wins & 21 nominations,15 February 2008 (USA)
362,Bajrangi Bhaijaan,tt3863552,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Bajrangi_Bhaijaan,Bajrangi Bhaijaan,Bajrangi Bhaijaan,0,2015,163,Action|Comedy|Drama,8.0,65877,A little mute girl from a Pakistani village ge...,An Indian man with a magnanimous heart takes a...,,Salman Khan|Harshaali Malhotra|Nawazuddin Sidd...,25 wins & 13 nominations,17 July 2015 (USA)
1607,Nayak (2001 Hindi film),tt0291376,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Nayak_(2001_Hind...,Nayak: The Real Hero,Nayak: The Real Hero,0,2001,187,Action|Drama|Thriller,7.8,12522,Employed as a camera-man at a popular televisi...,A man accepts a challenge by the chief ministe...,Fight the power,Anil Kapoor|Rani Mukerji|Amrish Puri|Johnny Le...,2 nominations,7 September 2001 (India)
106,Raazi,tt7098658,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Raazi,Raazi,Raazi,0,2018,138,Action|Drama|Thriller,7.8,20289,Hidayat Khan is the son of an Indian freedom f...,A Kashmiri woman agrees to marry a Pakistani a...,An incredible true story,Alia Bhatt|Vicky Kaushal|Rajit Kapoor|Shishir ...,21 wins & 26 nominations,11 May 2018 (USA)


In [77]:
# Second way: check the genres string directly with str.contains (substring match)

mask1 = movies['genres'].str.contains('Action')
mask2 = movies['imdb_rating'] > 7.5

movies[mask1 & mask2].sample(5)

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date
362,Bajrangi Bhaijaan,tt3863552,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Bajrangi_Bhaijaan,Bajrangi Bhaijaan,Bajrangi Bhaijaan,0,2015,163,Action|Comedy|Drama,8.0,65877,A little mute girl from a Pakistani village ge...,An Indian man with a magnanimous heart takes a...,,Salman Khan|Harshaali Malhotra|Nawazuddin Sidd...,25 wins & 13 nominations,17 July 2015 (USA)
219,Raag Desh (film),tt6080746,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Raagdesh,Raag Desh,Raag Desh,0,2017,135,Action|Drama|History,8.3,341,A period film based on the historic 1945 India...,A period film based on the historic 1945 India...,,Kunal Kapoor|Amit Sadh|Mohit Marwah|Kenneth De...,,28 July 2017 (India)
1607,Nayak (2001 Hindi film),tt0291376,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Nayak_(2001_Hind...,Nayak: The Real Hero,Nayak: The Real Hero,0,2001,187,Action|Drama|Thriller,7.8,12522,Employed as a camera-man at a popular televisi...,A man accepts a challenge by the chief ministe...,Fight the power,Anil Kapoor|Rani Mukerji|Amrish Puri|Johnny Le...,2 nominations,7 September 2001 (India)
41,Family of Thakurganj,tt8897986,https://upload.wikimedia.org/wikipedia/en/9/99...,https://en.wikipedia.org/wiki/Family_of_Thakur...,Family of Thakurganj,Family of Thakurganj,0,2019,127,Action|Drama,9.4,895,The film is based on small town of North India...,The film is based on small town of North India...,,Jimmy Sheirgill|Mahie Gill|Nandish Singh|Prana...,,19 July 2019 (India)
112,Bhavesh Joshi Superhero,tt6129302,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Bhavesh_Joshi_Su...,Bhavesh Joshi Superhero,Bhavesh Joshi Superhero,0,2018,154,Action|Drama,7.6,4928,Bhavesh Joshi Superhero is an action film abou...,The origin story of Bhavesh Joshi an Indian s...,This year| justice will have a new name.,Harshvardhan Kapoor|Priyanshu Painyuli|Ashish ...,2 nominations,1 June 2018 (USA)
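
The two approaches are not strictly equivalent: splitting and testing membership matches the exact genre token, while `str.contains` matches any substring and, by default, treats the pattern as a regular expression. The sketch below illustrates the difference on made-up data (the `'Live-Action|Family'` value and the missing entry are assumptions chosen to show the edge cases).

```python
import pandas as pd

# Hypothetical toy data, including a look-alike genre and a missing value
toy = pd.DataFrame({
    'genres': ['Action|Drama', 'Live-Action|Family', 'Comedy', None],
})

# Exact token match: split on '|' and test membership in the resulting list.
# fillna('') keeps the lambda from failing on the missing value.
exact = toy['genres'].fillna('').str.split('|').apply(lambda parts: 'Action' in parts)

# Substring match: also True for 'Live-Action'.
# regex=False treats the pattern literally, na=False maps missing values to False.
substring = toy['genres'].str.contains('Action', regex=False, na=False)

print(exact.tolist())
print(substring.tolist())
```

For a genre list like the one in this dataset both masks will likely agree, but the exact-token version is the safer choice when one label can appear inside another.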


#### **Adding new columns**

**Completely new column**

In [78]:
# Add a new column 'Country' with the same value in every row

movies['Country'] = 'India'

In [79]:
movies.head()  # the new 'Country' column appears as the last column

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date,Country
0,Uri: The Surgical Strike,tt8291224,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Uri:_The_Surgica...,Uri: The Surgical Strike,Uri: The Surgical Strike,0,2019,138,Action|Drama|War,8.4,35112,Divided over five chapters the film chronicle...,Indian army special forces execute a covert op...,,Vicky Kaushal|Paresh Rawal|Mohit Raina|Yami Ga...,4 wins,11 January 2019 (USA),India
1,Battalion 609,tt9472208,,https://en.wikipedia.org/wiki/Battalion_609,Battalion 609,Battalion 609,0,2019,131,War,4.1,73,The story revolves around a cricket match betw...,The story of Battalion 609 revolves around a c...,,Vicky Ahuja|Shoaib Ibrahim|Shrikant Kamat|Elen...,,11 January 2019 (India),India
2,The Accidental Prime Minister (film),tt6986710,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/The_Accidental_P...,The Accidental Prime Minister,The Accidental Prime Minister,0,2019,112,Biography|Drama,6.1,5549,Based on the memoir by Indian policy analyst S...,Explores Manmohan Singh's tenure as the Prime ...,,Anupam Kher|Akshaye Khanna|Aahana Kumra|Atul S...,,11 January 2019 (USA),India
3,Why Cheat India,tt8108208,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Why_Cheat_India,Why Cheat India,Why Cheat India,0,2019,121,Crime|Drama,6.0,1891,The movie focuses on existing malpractices in ...,The movie focuses on existing malpractices in ...,,Emraan Hashmi|Shreya Dhanwanthary|Snighdadeep ...,,18 January 2019 (USA),India
4,Evening Shadows,tt6028796,,https://en.wikipedia.org/wiki/Evening_Shadows,Evening Shadows,Evening Shadows,0,2018,102,Drama,7.3,280,While gay rights and marriage equality has bee...,Under the 'Evening Shadows' truth often plays...,,Mona Ambegaonkar|Ananth Narayan Mahadevan|Deva...,17 wins & 1 nomination,11 January 2019 (India),India
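
Assigning a single value to a new column name broadcasts that value to every row and modifies the DataFrame in place. As a rough sketch on made-up data (`toy`, `runtime_min`, and `runtime_hr` are hypothetical names), the snippet below contrasts this with `assign`, which returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

# Hypothetical toy data standing in for the movies DataFrame
toy = pd.DataFrame({'title': ['Raazi', 'Raid'], 'runtime_min': [138, 122]})

# Scalar assignment: the value is repeated for every row, toy is modified in place
toy['Country'] = 'India'

# assign() returns a new DataFrame with the extra column; toy itself is untouched
with_hours = toy.assign(runtime_hr=toy['runtime_min'] / 60)

print(toy.columns.tolist())
print(with_hours.columns.tolist())
```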


**New column from existing one**

In [80]:
# Drop every row that has a missing value in any column, so the string operations below do not hit NaN
movies.dropna(inplace=True)
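
Calling `dropna()` without arguments removes a row if any column is missing, which can discard far more rows than the next step needs (here only 298 rows remain). If only one column matters, passing `subset` may keep more data. A minimal sketch on made-up data, where `toy` and its values are assumptions:

```python
import pandas as pd

# Hypothetical toy data with missing values in different columns
toy = pd.DataFrame({
    'actors': ['A|B', None, 'C|D'],
    'tagline': [None, 'x', 'y'],
})

all_cols = toy.dropna()                      # a NaN in any column drops the row
only_actors = toy.dropna(subset=['actors'])  # only rows missing 'actors' are dropped

print(len(toy), len(all_cols), len(only_actors))
```

Both calls return a new DataFrame; `inplace=True`, as used above, modifies the original instead.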

In [81]:
movies.info()

<class 'pandas.core.frame.DataFrame'>
Index: 298 entries, 11 to 1623
Data columns (total 19 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   title_x           298 non-null    object 
 1   imdb_id           298 non-null    object 
 2   poster_path       298 non-null    object 
 3   wiki_link         298 non-null    object 
 4   title_y           298 non-null    object 
 5   original_title    298 non-null    object 
 6   is_adult          298 non-null    int64  
 7   year_of_release   298 non-null    int64  
 8   runtime           298 non-null    object 
 9   genres            298 non-null    object 
 10  imdb_rating       298 non-null    float64
 11  imdb_votes        298 non-null    int64  
 12  story             298 non-null    object 
 13  summary           298 non-null    object 
 14  tagline           298 non-null    object 
 15  actors            298 non-null    object 
 16  wins_nominations  298 non-null    object 
 17  

In [82]:
# Add a 'Lead_actor' column: the first name in the pipe-separated 'actors' string

movies['Lead_actor'] = movies['actors'].str.split('|').apply(lambda x: x[0])

movies.head()

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date,Country,Lead_actor
11,Gully Boy,tt2395469,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Gully_Boy,Gully Boy,Gully Boy,0,2019,153,Drama|Music,8.2,22440,"Gully Boy is a film about a 22-year-old boy ""M...",A coming-of-age story based on the lives of st...,Apna Time Aayega!,Ranveer Singh|Alia Bhatt|Siddhant Chaturvedi|V...,6 wins & 3 nominations,14 February 2019 (USA),India,Ranveer Singh
34,Yeh Hai India,tt5525846,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Yeh_Hai_India,Yeh Hai India,Yeh Hai India,0,2017,128,Action|Adventure|Drama,5.7,169,Yeh Hai India follows the story of a 25 years...,Yeh Hai India follows the story of a 25 years...,A Film for Every Indian,Gavie Chahal|Mohan Agashe|Mohan Joshi|Lom Harsh|,2 wins & 1 nomination,24 May 2019 (India),India,Gavie Chahal
37,Article 15 (film),tt10324144,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Article_15_(film),Article 15,Article 15,0,2019,130,Crime|Drama,8.3,13417,In the rural heartlands of India an upright p...,In the rural heartlands of India an upright p...,Farq Bahut Kar Liya| Ab Farq Laayenge.,Ayushmann Khurrana|Nassar|Manoj Pahwa|Kumud Mi...,1 win,28 June 2019 (USA),India,Ayushmann Khurrana
87,Aiyaary,tt6774212,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Aiyaary,Aiyaary,Aiyaary,0,2018,157,Action|Thriller,5.2,3538,General Gurinder Singh comes with a proposal t...,After finding out about an illegal arms deal ...,The Ultimate Trickery,Sidharth Malhotra|Manoj Bajpayee|Rakul Preet S...,1 nomination,16 February 2018 (USA),India,Sidharth Malhotra
96,Raid (2018 film),tt7363076,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Raid_(2018_film),Raid,Raid,0,2018,122,Action|Crime|Drama,7.4,13159,Set in the 80s in Uttar Pradesh India Raid i...,A fearless income tax officer raids the mansio...,Heroes don't always come in uniform,Ajay Devgn|Saurabh Shukla|Ileana D'Cruz|Amit S...,2 wins & 3 nominations,16 March 2018 (India),India,Ajay Devgn
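
An alternative to `apply(lambda x: x[0])` is the `.str[0]` accessor, which indexes into each list and returns NaN for missing entries instead of raising an error. The sketch below compares the two on made-up data (`toy` and the `'Solo Actor'` entry are hypothetical).

```python
import pandas as pd

# Hypothetical toy data with pipe-separated actor names
toy = pd.DataFrame({
    'actors': ['Ranveer Singh|Alia Bhatt', 'Ajay Devgn|Saurabh Shukla', 'Solo Actor'],
})

via_apply = toy['actors'].str.split('|').apply(lambda x: x[0])  # approach used above
via_str = toy['actors'].str.split('|').str[0]                   # accessor-based alternative

print(via_str.tolist())
```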
