### **What is Pandas?**

Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool,
built on top of the Python programming language.

https://pandas.pydata.org/about/index.html

In [None]:
#pip install pandas

In [1]:
import pandas as pd

### **Pandas Series**

A Pandas Series is like a single column in a table: a one-dimensional, labeled array that can hold data of any type.

In [2]:
# strings
country = ['Bangladesh','Pakistan','USA','Nepal','Sri Lanka']

pd.Series(country)

0    Bangladesh
1      Pakistan
2           USA
3         Nepal
4     Sri Lanka
dtype: object

In [3]:
# integers
runs = [13,24,56,78,100]

runs_ser = pd.Series(runs)

In [4]:
runs_ser

0     13
1     24
2     56
3     78
4    100
dtype: int64

In [5]:
# custom index
marks = [67,57,89]
subjects = ['maths','english','science']

pd.Series(marks,index=subjects)

maths      67
english    57
science    89
dtype: int64

In [29]:
# from a dictionary: the keys become the index labels
marks = {
    'maths':67,
    'english':57,
    'science':89,
}

marks_series = pd.Series(marks,name='marksheet')
marks_series

maths      67
english    57
science    89
Name: marksheet, dtype: int64
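
Once a Series has an index, its values can be read by label or by position. The sketch below rebuilds `marks_series` from the dictionary example so that it runs on its own; the lookups are ordinary pandas accessors added for illustration and are not part of the original notebook.

```python
import pandas as pd

# Same data as the dictionary example above
marks_series = pd.Series({'maths': 67, 'english': 57, 'science': 89}, name='marksheet')

by_label = marks_series['english']     # look up by index label
by_position = marks_series.iloc[0]     # look up by integer position
labels = list(marks_series.index)      # the index holds the dictionary keys

print(by_label, by_position, labels)
```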

**🧠 What is a DataFrame?**

A DataFrame is a table-like structure in pandas used to store and work with data, much like an Excel sheet or a SQL table.

![image.png](attachment:image.png)

![image.png](attachment:image.png)

**Rows =** records (each person)

**Columns =** fields (name, age, city)
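
To make the rows-as-records and columns-as-fields picture concrete, here is a minimal sketch that builds a small DataFrame from a dictionary. The names, ages and cities are invented for illustration only.

```python
import pandas as pd

# Hypothetical records, invented for illustration only
people = pd.DataFrame({
    'name': ['Asha', 'Ben', 'Chen'],
    'age': [29, 34, 41],
    'city': ['Dhaka', 'Austin', 'Taipei'],
})

n_rows, n_cols = people.shape   # (rows, columns)
first_record = people.iloc[0]   # one row = one record, returned as a Series
age_column = people['age']      # one column = one field, also a Series

print(people)
```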

In [10]:
df = pd.read_csv(r"X:\DSML -03\Data/movies.csv")

In [11]:
df

Unnamed: 0,title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
6,Interstellar,Hollywood,2014,8.6,Warner Bros. Pictures,165.0,701.8,Millions,USD,English
7,The Pursuit of Happyness,Hollywood,2006,8.0,Columbia Pictures,55.0,307.1,Millions,USD,English
8,Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
9,Titanic,Hollywood,1997,7.9,Paramount Pictures,200.0,2202.0,Millions,USD,English


In [15]:
# head(): first 5 rows by default
df.head()

Unnamed: 0,title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English


In [17]:
# tail(3): last 3 rows
df.tail(3)

Unnamed: 0,title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
34,Pushpa: The Rise - Part 1,Bollywood,2021,7.6,Mythri Movie Makers,2.0,3.6,Billions,INR,Telugu
35,RRR,Bollywood,2022,8.0,DVV Entertainment,5.5,12.0,Billions,INR,Telugu
36,Baahubali: The Beginning,Bollywood,2015,8.0,Arka Media Works,1.8,6.5,Billions,INR,Telugu


In [18]:
#shape
df.shape

(37, 10)

In [19]:
# dtypes
df.dtypes

title            object
industry         object
release_year      int64
imdb_rating     float64
studio           object
budget          float64
revenue         float64
unit             object
currency         object
language         object
dtype: object

In [None]:
#columns
df.columns


Index(['title', 'industry', 'release_year', 'imdb_rating', 'studio', 'budget',
       'revenue', 'unit', 'currency', 'language'],
      dtype='object')

In [20]:
# Strings must match exactly: a trailing space makes a different column name
'title' == 'title '

False
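
The comparison above is a reminder that a stray space produces a different string, so a lookup such as `df['title']` fails if the column is actually named `'title '`. One possible guard, sketched on a made-up two-column frame (the column name with the trailing space is hypothetical, chosen to show the problem):

```python
import pandas as pd

# Hypothetical frame whose first column name carries a trailing space
raw = pd.DataFrame({'title ': ['Gladiator', 'Titanic'], 'budget': [103.0, 200.0]})

has_title_before = 'title' in raw.columns   # 'title ' is not the same label as 'title'

# Strip leading/trailing whitespace from every column name
raw.columns = raw.columns.str.strip()

has_title_after = 'title' in raw.columns
print(has_title_before, has_title_after, list(raw.columns))
```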

In [21]:
df

Unnamed: 0,title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
6,Interstellar,Hollywood,2014,8.6,Warner Bros. Pictures,165.0,701.8,Millions,USD,English
7,The Pursuit of Happyness,Hollywood,2006,8.0,Columbia Pictures,55.0,307.1,Millions,USD,English
8,Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
9,Titanic,Hollywood,1997,7.9,Paramount Pictures,200.0,2202.0,Millions,USD,English


In [34]:
# sample(5): 5 randomly chosen rows
df.sample(5)

Unnamed: 0,title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
36,Baahubali: The Beginning,Bollywood,2015,8.0,Arka Media Works,1.8,6.5,Billions,INR,Telugu
26,Munna Bhai M.B.B.S.,Bollywood,2003,8.1,Vinod Chopra Productions,100.0,410.0,Millions,INR,Hindi
32,Shershaah,Bollywood,2021,8.4,Dharma Productions,500.0,950.0,Millions,INR,Hindi
8,Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
34,Pushpa: The Rise - Part 1,Bollywood,2021,7.6,Mythri Movie Makers,2.0,3.6,Billions,INR,Telugu


In [35]:
#info()
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 37 entries, 0 to 36
Data columns (total 10 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   title         37 non-null     object 
 1   industry      37 non-null     object 
 2   release_year  37 non-null     int64  
 3   imdb_rating   36 non-null     float64
 4   studio        34 non-null     object 
 5   budget        37 non-null     float64
 6   revenue       37 non-null     float64
 7   unit          37 non-null     object 
 8   currency      37 non-null     object 
 9   language      37 non-null     object 
dtypes: float64(3), int64(1), object(6)
memory usage: 3.0+ KB


In [24]:
df.describe()

Unnamed: 0,release_year,imdb_rating,budget,revenue
count,37.0,36.0,37.0,37.0
mean,2007.027027,7.919444,2084.975135,4117.135135
std,17.657995,1.204947,11477.487145,16372.462682
min,1946.0,1.9,1.0,3.1
25%,2001.0,7.8,15.5,263.1
50%,2014.0,8.1,165.0,701.8
75%,2018.0,8.4,250.0,2000.0
max,2022.0,9.3,70000.0,100000.0


In [37]:
df.isnull()

Unnamed: 0,title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False
5,False,False,False,False,False,False,False,False,False,False
6,False,False,False,False,False,False,False,False,False,False
7,False,False,False,False,False,False,False,False,False,False
8,False,False,False,False,False,False,False,False,False,False
9,False,False,False,False,False,False,False,False,False,False


In [38]:
df.isnull().sum()

title           0
industry        0
release_year    0
imdb_rating     1
studio          3
budget          0
revenue         0
unit            0
currency        0
language        0
dtype: int64
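
`isnull()` returns a frame of booleans, and `sum()` counts the `True` values per column because `True` counts as 1. A small sketch on invented data (the titles and studios below are placeholders, not rows from `movies.csv`):

```python
import numpy as np
import pandas as pd

# Invented sample with one missing rating and one missing studio
sample = pd.DataFrame({
    'title': ['A', 'B', 'C'],
    'imdb_rating': [8.0, np.nan, 7.5],
    'studio': [None, 'Studio X', 'Studio Y'],
})

missing_per_column = sample.isnull().sum()                 # missing values per column
rows_with_missing = sample[sample.isnull().any(axis=1)]    # rows with at least one missing value

print(missing_per_column)
print(rows_with_missing)
```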

In [29]:
# duplicated(): flag rows that repeat an earlier row, then count them
df.duplicated().sum()

0

In [30]:
# If duplicates are present, drop_duplicates() returns a new DataFrame without them

df.drop_duplicates()

Unnamed: 0,title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
6,Interstellar,Hollywood,2014,8.6,Warner Bros. Pictures,165.0,701.8,Millions,USD,English
7,The Pursuit of Happyness,Hollywood,2006,8.0,Columbia Pictures,55.0,307.1,Millions,USD,English
8,Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
9,Titanic,Hollywood,1997,7.9,Paramount Pictures,200.0,2202.0,Millions,USD,English
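
Because `movies.csv` has no duplicate rows, the effect of `drop_duplicates()` is not visible above. A minimal sketch on an invented frame that does contain a repeated row:

```python
import pandas as pd

# Invented rows; the second and third are exact duplicates
films = pd.DataFrame({
    'title': ['Gladiator', 'Titanic', 'Titanic'],
    'release_year': [2000, 1997, 1997],
})

n_duplicates = films.duplicated().sum()   # repeats after the first occurrence are flagged
deduped = films.drop_duplicates()         # returns a new DataFrame; films is unchanged

print(n_duplicates, len(films), len(deduped))
```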


In [31]:
df.head(3)

Unnamed: 0,title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English


In [None]:
# rename() returns a renamed copy; df itself keeps the old column name
df.rename(columns={"title":"Title"})

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
6,Interstellar,Hollywood,2014,8.6,Warner Bros. Pictures,165.0,701.8,Millions,USD,English
7,The Pursuit of Happyness,Hollywood,2006,8.0,Columbia Pictures,55.0,307.1,Millions,USD,English
8,Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
9,Titanic,Hollywood,1997,7.9,Paramount Pictures,200.0,2202.0,Millions,USD,English


In [40]:
df

Unnamed: 0,title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
6,Interstellar,Hollywood,2014,8.6,Warner Bros. Pictures,165.0,701.8,Millions,USD,English
7,The Pursuit of Happyness,Hollywood,2006,8.0,Columbia Pictures,55.0,307.1,Millions,USD,English
8,Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
9,Titanic,Hollywood,1997,7.9,Paramount Pictures,200.0,2202.0,Millions,USD,English


In [41]:
# inplace=True renames the column on df itself and returns None
df.rename(columns={"title":"Title"}, inplace=True)

In [42]:
df

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
6,Interstellar,Hollywood,2014,8.6,Warner Bros. Pictures,165.0,701.8,Millions,USD,English
7,The Pursuit of Happyness,Hollywood,2006,8.0,Columbia Pictures,55.0,307.1,Millions,USD,English
8,Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
9,Titanic,Hollywood,1997,7.9,Paramount Pictures,200.0,2202.0,Millions,USD,English
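
The difference between a plain `rename()` call and `inplace=True` can be checked on a one-row frame. This is a sketch with made-up variable names, not the notebook's own `df`:

```python
import pandas as pd

frame = pd.DataFrame({'title': ['Gladiator'], 'budget': [103.0]})

renamed = frame.rename(columns={'title': 'Title'})      # new object; frame keeps 'title'
columns_after_plain_call = list(frame.columns)

frame.rename(columns={'title': 'Title'}, inplace=True)  # changes frame itself
columns_after_inplace = list(frame.columns)

print(columns_after_plain_call, columns_after_inplace)
```

Assigning the result back (`frame = frame.rename(...)`) has the same end effect as `inplace=True`.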


### **Selecting columns from a DataFrame**

In [43]:
df.head(3)

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English


In [44]:
df['Title']

0                                 Pather Panchali
1     Doctor Strange in the Multiverse of Madness
2                           Thor: The Dark World 
3                                 Thor: Ragnarok 
4                         Thor: Love and Thunder 
5                        The Shawshank Redemption
6                                    Interstellar
7                        The Pursuit of Happyness
8                                       Gladiator
9                                         Titanic
10                          It's a Wonderful Life
11                                         Avatar
12                                  The Godfather
13                                The Dark Knight
14                               Schindler's List
15                                  Jurassic Park
16                                       Parasite
17                              Avengers: Endgame
18                         Avengers: Infinity War
19             Captain America: The First Avenger


In [45]:
df2 = df[['Title', 'budget']]

In [46]:
df2

Unnamed: 0,Title,budget
0,Pather Panchali,70000.0
1,Doctor Strange in the Multiverse of Madness,200.0
2,Thor: The Dark World,165.0
3,Thor: Ragnarok,180.0
4,Thor: Love and Thunder,250.0
5,The Shawshank Redemption,25.0
6,Interstellar,165.0
7,The Pursuit of Happyness,55.0
8,Gladiator,103.0
9,Titanic,200.0
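
The two selections above return different types: a single column name gives a Series, while a list of names gives a DataFrame. A short sketch using a few values copied from the rows shown earlier:

```python
import pandas as pd

films = pd.DataFrame({
    'Title': ['Gladiator', 'Titanic'],
    'budget': [103.0, 200.0],
    'revenue': [460.5, 2202.0],
})

one_column = films['Title']              # a single name -> Series
sub_frame = films[['Title', 'budget']]   # a list of names -> DataFrame
still_a_frame = films[['Title']]         # a one-item list still gives a DataFrame

print(type(one_column).__name__, type(sub_frame).__name__, type(still_a_frame).__name__)
```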


### **Math**

In [47]:
df.head(3)

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English


In [48]:
a=df[["budget","revenue"]]

In [49]:
a

Unnamed: 0,budget,revenue
0,70000.0,100000.0
1,200.0,954.8
2,165.0,644.8
3,180.0,854.0
4,250.0,670.0
5,25.0,73.3
6,165.0,701.8
7,55.0,307.1
8,103.0,460.5
9,200.0,2202.0


In [50]:
a.max()

budget      70000.0
revenue    100000.0
dtype: float64

In [51]:
a.min()

budget     1.0
revenue    3.1
dtype: float64

In [52]:
a.sum()

budget      77144.08
revenue    152334.00
dtype: float64

In [53]:
a.mean()

budget     2084.975135
revenue    4117.135135
dtype: float64

In [54]:
a.std()

budget     11477.487145
revenue    16372.462682
dtype: float64
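
Several statistics can also be collected in one call with `agg`, and columns can be combined arithmetically. The sketch below uses the budget and revenue figures of the three Thor films shown above. Note that in `movies.csv` the `unit` and `currency` columns differ between rows, so totals over the raw `budget` and `revenue` columns mix scales and should be read with care.

```python
import pandas as pd

money = pd.DataFrame({
    'budget': [165.0, 180.0, 250.0],
    'revenue': [644.8, 854.0, 670.0],
})

summary = money.agg(['min', 'max', 'mean'])   # one row per statistic, one column per field
profit = money['revenue'] - money['budget']   # element-wise arithmetic between two columns

print(summary)
print(profit)
```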

Create two tables: one with the Bollywood data and another with the Hollywood data.

In [42]:
df

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
6,Interstellar,Hollywood,2014,8.6,Warner Bros. Pictures,165.0,701.8,Millions,USD,English
7,The Pursuit of Happyness,Hollywood,2006,8.0,Columbia Pictures,55.0,307.1,Millions,USD,English
8,Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
9,Titanic,Hollywood,1997,7.9,Paramount Pictures,200.0,2202.0,Millions,USD,English


In [55]:
df[ df['industry'] == "Bollywood" ]

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
21,Dilwale Dulhania Le Jayenge,Bollywood,1995,8.0,Yash Raj Films,400.0,2000.0,Millions,INR,Hindi
22,3 Idiots,Bollywood,2009,8.4,Vinod Chopra Films,550.0,4000.0,Millions,INR,Hindi
23,Kabhi Khushi Kabhie Gham,Bollywood,2001,7.4,Dharma Productions,390.0,1360.0,Millions,INR,Hindi
24,Bajirao Mastani,Bollywood,2015,7.2,,1.4,3.5,Billions,INR,Hindi
25,Taare Zameen Par,Bollywood,2007,8.3,,120.0,1350.0,Millions,INR,Hindi
26,Munna Bhai M.B.B.S.,Bollywood,2003,8.1,Vinod Chopra Productions,100.0,410.0,Millions,INR,Hindi
27,PK,Bollywood,2014,8.1,Vinod Chopra Films,850.0,8540.0,Millions,INR,Hindi
28,Sanju,Bollywood,2018,,Vinod Chopra Films,1.0,5.9,Billions,INR,Hindi
29,The Kashmir Files,Bollywood,2022,8.3,Zee Studios,250.0,3409.0,Millions,INR,Hindi


In [57]:
df_bollywood = df[ df['industry'] == "Bollywood" ]
df_bollywood.head()

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
21,Dilwale Dulhania Le Jayenge,Bollywood,1995,8.0,Yash Raj Films,400.0,2000.0,Millions,INR,Hindi
22,3 Idiots,Bollywood,2009,8.4,Vinod Chopra Films,550.0,4000.0,Millions,INR,Hindi
23,Kabhi Khushi Kabhie Gham,Bollywood,2001,7.4,Dharma Productions,390.0,1360.0,Millions,INR,Hindi
24,Bajirao Mastani,Bollywood,2015,7.2,,1.4,3.5,Billions,INR,Hindi


In [58]:
df_hollywood = df[ df['industry'] == "Hollywood" ]
df_hollywood.head()

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
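
The filter works in two steps: the comparison builds a boolean Series, and indexing with that Series keeps only the rows marked `True`. A minimal sketch with four rows copied from the table above; the combined condition at the end is an extra illustration, not part of the original notebook.

```python
import pandas as pd

films = pd.DataFrame({
    'Title': ['Pather Panchali', 'Interstellar', 'RRR', 'Gladiator'],
    'industry': ['Bollywood', 'Hollywood', 'Bollywood', 'Hollywood'],
    'imdb_rating': [8.3, 8.6, 8.0, 8.5],
})

mask = films['industry'] == 'Bollywood'   # boolean Series, one True/False per row
bollywood = films[mask]                   # keeps only rows where mask is True

# Combine conditions with & (and) or | (or); each condition needs its own parentheses
top_hollywood = films[(films['industry'] == 'Hollywood') & (films['imdb_rating'] > 8.5)]

print(mask.tolist())
print(top_hollywood)
```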


### **How to export the CSV data?**

In [59]:
# Default: the row index is written as an extra first column
df_bollywood.to_csv("bollywood data.csv")

In [60]:
# index=False leaves the row index out of the file
df_bollywood.to_csv("bollywood data.csv", index=False)
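
The effect of `index=False` shows up when the file is read back. The sketch below writes to in-memory buffers (`io.StringIO`) instead of files so that it leaves nothing on disk; the two-row frame is a stand-in for `df_bollywood`.

```python
import io
import pandas as pd

films = pd.DataFrame({'Title': ['RRR', 'PK'], 'budget': [5.5, 850.0]})

with_index = io.StringIO()
films.to_csv(with_index)                   # index written as an extra, unnamed first column
with_index.seek(0)
cols_with_index = list(pd.read_csv(with_index).columns)

without_index = io.StringIO()
films.to_csv(without_index, index=False)   # only the real columns are written
without_index.seek(0)
cols_without_index = list(pd.read_csv(without_index).columns)

print(cols_with_index)
print(cols_without_index)
```

When the index is saved, `read_csv` has no header for that column and labels it `Unnamed: 0`.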

**How to keep track of the variables?** 

-> Click on Jupyter Variables!

In [49]:
df.head()

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English


In [50]:
# value_counts

df['studio'].value_counts()

studio
Marvel Studios               8
Vinod Chopra Films           3
Universal Pictures           2
Salman Khan Films            2
Paramount Pictures           2
Dharma Productions           2
DVV Entertainment            1
Mythri Movie Makers          1
Hombale Films                1
Zee Studios                  1
Vinod Chopra Productions     1
Government of West Bengal    1
Yash Raj Films               1
Syncopy                      1
20th Century Fox             1
Liberty Films                1
Universal Pictures           1
Columbia Pictures            1
Warner Bros. Pictures        1
Castle Rock Entertainment    1
Arka Media Works             1
Name: count, dtype: int64

In [None]:
# value_counts

df['industry'].value_counts()

industry
Hollywood    20
Bollywood    17
Name: count, dtype: int64

In [51]:
# value_counts

df[['industry', 'studio']].value_counts()

industry   studio                   
Hollywood  Marvel Studios               8
Bollywood  Vinod Chopra Films           3
           Dharma Productions           2
Hollywood  Universal Pictures           2
Bollywood  Salman Khan Films            2
Hollywood  Paramount Pictures           2
Bollywood  Arka Media Works             1
Hollywood  Castle Rock Entertainment    1
           Universal Pictures           1
           Syncopy                      1
           Liberty Films                1
           Columbia Pictures            1
Bollywood  Zee Studios                  1
Hollywood  20th Century Fox             1
Bollywood  DVV Entertainment            1
           Yash Raj Films               1
           Vinod Chopra Productions     1
           Mythri Movie Makers          1
           Hombale Films                1
           Government of West Bengal    1
Hollywood  Warner Bros. Pictures        1
Name: count, dtype: int64
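
By default `value_counts()` skips missing values, which is why the studio counts above add up to 34 rather than 37 (three studios are null). A small sketch of two optional parameters, `normalize` and `dropna`, on an invented Series:

```python
import pandas as pd

# Invented values; the last entry is missing
industry = pd.Series(['Hollywood', 'Bollywood', 'Hollywood', 'Hollywood', None])

counts = industry.value_counts()                            # missing values are skipped
shares = industry.value_counts(normalize=True)              # proportions instead of counts
counts_with_missing = industry.value_counts(dropna=False)   # missing values get their own group

print(counts)
print(shares)
print(counts_with_missing)
```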

### **Selecting Rows**

- **iloc** - selects rows by integer position (0, 1, 2, ...)
- **loc** - selects rows by index label
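
The two accessors also slice differently: `iloc` excludes the stop position, as Python lists do, while `loc` includes the stop label. A minimal sketch with hypothetical single-letter labels:

```python
import pandas as pd

ratings = pd.DataFrame(
    {'imdb_rating': [8.3, 7.0, 6.8, 7.9]},
    index=['a', 'b', 'c', 'd'],   # hypothetical labels
)

by_position = ratings.iloc[0:2]   # positions 0 and 1; the stop position is excluded
by_label = ratings.loc['a':'c']   # labels 'a' through 'c'; the stop label is included

print(list(by_position.index), list(by_label.index))
```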

In [None]:
df.head()

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English


In [None]:
#single row

df.iloc[0]

Title                     Pather Panchali
industry                        Bollywood
release_year                         1955
imdb_rating                           8.3
studio          Government of West Bengal
budget                            70000.0
revenue                          100000.0
unit                            Thousands
currency                              INR
language                          Bengali
Name: 0, dtype: object

In [None]:
#multiple rows

df.iloc[0:5]

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English


In [54]:
df.iloc[5:20]

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
6,Interstellar,Hollywood,2014,8.6,Warner Bros. Pictures,165.0,701.8,Millions,USD,English
7,The Pursuit of Happyness,Hollywood,2006,8.0,Columbia Pictures,55.0,307.1,Millions,USD,English
8,Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
9,Titanic,Hollywood,1997,7.9,Paramount Pictures,200.0,2202.0,Millions,USD,English
10,It's a Wonderful Life,Hollywood,1946,8.6,Liberty Films,3.18,3.3,Millions,USD,English
11,Avatar,Hollywood,2009,7.8,20th Century Fox,237.0,2847.0,Millions,USD,English
12,The Godfather,Hollywood,1972,9.2,Paramount Pictures,7.2,291.0,Millions,USD,English
13,The Dark Knight,Hollywood,2008,9.0,Syncopy,185.0,1006.0,Millions,USD,English
14,Schindler's List,Hollywood,1993,9.0,Universal Pictures,22.0,322.2,Millions,USD,English


In [55]:
df.iloc[5:20:3]

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
8,Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
11,Avatar,Hollywood,2009,7.8,20th Century Fox,237.0,2847.0,Millions,USD,English
14,Schindler's List,Hollywood,1993,9.0,Universal Pictures,22.0,322.2,Millions,USD,English
17,Avengers: Endgame,Hollywood,2019,8.4,Marvel Studios,400.0,2798.0,Millions,USD,English


In [None]:
## indexing

df.iloc[[0,5,2]]

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
5,The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English


In [None]:
### loc

df.head()


Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English


In [65]:
df3=df.set_index('Title')

In [58]:
df3

Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English
The Shawshank Redemption,Hollywood,1994,9.3,Castle Rock Entertainment,25.0,73.3,Millions,USD,English
Interstellar,Hollywood,2014,8.6,Warner Bros. Pictures,165.0,701.8,Millions,USD,English
The Pursuit of Happyness,Hollywood,2006,8.0,Columbia Pictures,55.0,307.1,Millions,USD,English
Gladiator,Hollywood,2000,8.5,Universal Pictures,103.0,460.5,Millions,USD,English
Titanic,Hollywood,1997,7.9,Paramount Pictures,200.0,2202.0,Millions,USD,English


In [66]:
# This title is stored with a trailing space, so the label must include it
df3.loc['Thor: Ragnarok ']

industry             Hollywood
release_year              2017
imdb_rating                7.9
studio          Marvel Studios
budget                   180.0
revenue                  854.0
unit                  Millions
currency                   USD
language               English
Name: Thor: Ragnarok , dtype: object

In [67]:
df3.loc['Pather Panchali':"Thor: Love and Thunder "]

Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English


In [68]:
df3.loc['Pather Panchali':"Thor: Love and Thunder ":2]

Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English


In [62]:
df3.reset_index(inplace= True)

In [63]:
df3.head()

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English


### **Selecting both rows and columns**

In [69]:
df.head()

Unnamed: 0,Title,industry,release_year,imdb_rating,studio,budget,revenue,unit,currency,language
0,Pather Panchali,Bollywood,1955,8.3,Government of West Bengal,70000.0,100000.0,Thousands,INR,Bengali
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022,7.0,Marvel Studios,200.0,954.8,Millions,USD,English
2,Thor: The Dark World,Hollywood,2013,6.8,Marvel Studios,165.0,644.8,Millions,USD,English
3,Thor: Ragnarok,Hollywood,2017,7.9,Marvel Studios,180.0,854.0,Millions,USD,English
4,Thor: Love and Thunder,Hollywood,2022,6.8,Marvel Studios,250.0,670.0,Millions,USD,English


In [None]:
df.iloc[0:5,0:3]

Unnamed: 0,Title,industry,release_year
0,Pather Panchali,Bollywood,1955
1,Doctor Strange in the Multiverse of Madness,Hollywood,2022
2,Thor: The Dark World,Hollywood,2013
3,Thor: Ragnarok,Hollywood,2017
4,Thor: Love and Thunder,Hollywood,2022
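
The same row-and-column selection can be written with `loc`, using column names instead of positions. A sketch on three rows copied from the table above; the `loc` calls are an added illustration and do not appear in the original notebook.

```python
import pandas as pd

films = pd.DataFrame({
    'Title': ['Pather Panchali', 'Interstellar', 'Gladiator'],
    'industry': ['Bollywood', 'Hollywood', 'Hollywood'],
    'release_year': [1955, 2014, 2000],
    'imdb_rating': [8.3, 8.6, 8.5],
})

by_position = films.iloc[0:2, 0:3]                     # first 2 rows, first 3 columns
by_label = films.loc[0:1, ['Title', 'release_year']]   # row labels 0..1 (inclusive), named columns
single_value = films.loc[1, 'imdb_rating']             # one row label + one column name -> scalar

print(by_position)
print(by_label)
print(single_value)
```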
