<a href="https://colab.research.google.com/github/desaivishwas/D590_Project/blob/main/DV_project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Indian Cinema through Data

The goal of this visualization project is to analyze the factors that contribute to a film's success. A commercial film can both entertain a mass audience and earn a substantial return for its creators. The director, the cast, the production house, technicians such as editors and cinematographers, and the timing of the release all help determine whether a film makes money. Indian cinema, one of the world's oldest, is a broad term for the many film industries in India, which are mostly divided by language and region. Our primary focus is the Hindi film industry, popularly known as Bollywood. With this project we hope to explore visually what makes a Bollywood film successful and to provide a brief overview of Indian cinema.



In [1]:
import pandas as pd
import plotly.express as px
import seaborn as sns

In [33]:
movies = pd.read_csv('/content/bollywood_full.csv')
movies.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4330 entries, 0 to 4329
Data columns (total 18 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   title_x           4330 non-null   object 
 1   imdb_id           4330 non-null   object 
 2   poster_path       3580 non-null   object 
 3   wiki_link         4330 non-null   object 
 4   title_y           4330 non-null   object 
 5   original_title    4330 non-null   object 
 6   is_adult          4330 non-null   int64  
 7   year_of_release   4330 non-null   object 
 8   runtime           4330 non-null   object 
 9   genres            4330 non-null   object 
 10  imdb_rating       4317 non-null   float64
 11  imdb_votes        4317 non-null   float64
 12  story             4065 non-null   object 
 13  summary           4329 non-null   object 
 14  tagline           685 non-null    object 
 15  actors            4320 non-null   object 
 16  wins_nominations  1344 non-null   object 
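
The `info()` output reports `year_of_release` and `runtime` as `object`, which usually means at least one entry is not a plain number. Below is a minimal sketch of one way to convert such columns. It assumes a non-numeric placeholder such as `\N` is present; the sample frame and the placeholder value are hypothetical and not taken from the dataset.

```python
import pandas as pd

# Hypothetical three-row sample; the "\\N" placeholder is an assumption.
sample = pd.DataFrame({
    "title_x": ["Uri: The Surgical Strike", "Battalion 609", "Placeholder Film"],
    "year_of_release": ["2019", "2019", "\\N"],
    "runtime": ["138", "131", "\\N"],
})

for col in ["year_of_release", "runtime"]:
    # errors="coerce" turns unparseable entries into NaN instead of raising
    sample[col] = pd.to_numeric(sample[col], errors="coerce")

print(sample.dtypes)
```

After conversion, the coerced entries show up as missing values, so they are counted by any later null check.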


In [34]:
movies.head()

Unnamed: 0,title_x,imdb_id,poster_path,wiki_link,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,summary,tagline,actors,wins_nominations,release_date
0,Uri: The Surgical Strike,tt8291224,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Uri:_The_Surgica...,Uri: The Surgical Strike,Uri: The Surgical Strike,0,2019,138,Action|Drama|War,8.4,35112.0,Divided over five chapters the film chronicle...,Indian army special forces execute a covert op...,,Vicky Kaushal|Paresh Rawal|Mohit Raina|Yami Ga...,4 wins,11 January 2019 (USA)
1,Battalion 609,tt9472208,,https://en.wikipedia.org/wiki/Battalion_609,Battalion 609,Battalion 609,0,2019,131,War,4.1,73.0,The story revolves around a cricket match betw...,The story of Battalion 609 revolves around a c...,,Vicky Ahuja|Shoaib Ibrahim|Shrikant Kamat|Elen...,,11 January 2019 (India)
2,The Accidental Prime Minister (film),tt6986710,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/The_Accidental_P...,The Accidental Prime Minister,The Accidental Prime Minister,0,2019,112,Biography|Drama,6.1,5549.0,Based on the memoir by Indian policy analyst S...,Explores Manmohan Singh's tenure as the Prime ...,,Anupam Kher|Akshaye Khanna|Aahana Kumra|Atul S...,,11 January 2019 (USA)
3,Why Cheat India,tt8108208,https://upload.wikimedia.org/wikipedia/en/thum...,https://en.wikipedia.org/wiki/Why_Cheat_India,Why Cheat India,Why Cheat India,0,2019,121,Crime|Drama,6.0,1891.0,The movie focuses on existing malpractices in ...,The movie focuses on existing malpractices in ...,,Emraan Hashmi|Shreya Dhanwanthary|Snighdadeep ...,,18 January 2019 (USA)
4,Evening Shadows,tt6028796,,https://en.wikipedia.org/wiki/Evening_Shadows,Evening Shadows,Evening Shadows,0,2018,102,Drama,7.3,280.0,While gay rights and marriage equality has bee...,Under the 'Evening Shadows' truth often plays...,,Mona Ambegaonkar|Ananth Narayan Mahadevan|Deva...,17 wins & 1 nomination,11 January 2019 (India)
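
The preview shows that `genres` (and `actors`) store several values in one pipe-delimited string, for example `Action|Drama|War`. Counting films per genre would need those strings split first. A small sketch of one way to do that, using a hypothetical sample built from three of the rows above:

```python
import pandas as pd

# Hypothetical sample taken from the rows shown in the preview
sample = pd.DataFrame({
    "title_x": ["Uri: The Surgical Strike", "Battalion 609", "Why Cheat India"],
    "genres": ["Action|Drama|War", "War", "Crime|Drama"],
})

genre_counts = (
    sample.assign(genre=sample["genres"].str.split("|"))  # string -> list of genres
    .explode("genre")["genre"]                             # one row per (film, genre)
    .value_counts()
)
print(genre_counts)
```

The same pattern should apply to `actors`, assuming it uses the same `|` separator throughout.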


In [35]:
movies = movies.drop(columns=['poster_path', 'wiki_link', 'summary', 'tagline'])

## Checking for null values in the dataset

In [36]:
movies.isna()

Unnamed: 0,title_x,imdb_id,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,actors,wins_nominations,release_date
0,False,False,False,False,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False,False,False,True,False
2,False,False,False,False,False,False,False,False,False,False,False,False,True,False
3,False,False,False,False,False,False,False,False,False,False,False,False,True,False
4,False,False,False,False,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4325,False,False,False,False,False,False,False,False,False,False,False,False,True,True
4326,False,False,False,False,False,False,False,False,False,False,False,False,True,True
4327,False,False,False,False,False,False,False,False,False,False,True,False,True,False
4328,False,False,False,False,False,False,False,False,False,False,False,False,True,True
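
A full boolean frame like the one above is hard to scan for 4,330 rows. As a sketch, `isna().sum()` collapses it into one count per column. The sample frame here is hypothetical and only mirrors the pattern of the first three rows.

```python
import pandas as pd

# Hypothetical sample mirroring the first three rows of the dataset
sample = pd.DataFrame({
    "title_x": ["Uri: The Surgical Strike", "Battalion 609",
                "The Accidental Prime Minister (film)"],
    "imdb_rating": [8.4, 4.1, 6.1],
    "wins_nominations": ["4 wins", None, None],
})

null_counts = sample.isna().sum()  # number of missing values per column
# Show only the columns that actually contain missing values
print(null_counts[null_counts > 0].sort_values(ascending=False))
```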


### Replacing the null values with 0

In [39]:
movies = movies.fillna(0)
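
Note that `fillna(0)` also writes an integer `0` into text columns such as `wins_nominations` and `release_date`. One possible alternative, shown only as a sketch and not what this notebook does, is to pass a dictionary so each column gets a fill value of its own type. The sample frame and the `"not recorded"` label are hypothetical.

```python
import pandas as pd

# Hypothetical sample with one numeric and one text column containing gaps
sample = pd.DataFrame({
    "imdb_rating": [8.4, None, 6.1],
    "wins_nominations": ["4 wins", None, None],
})

# Per-column fill values: a number for the numeric column, a label for text
filled = sample.fillna({"imdb_rating": 0, "wins_nominations": "not recorded"})
print(filled)
```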

In [40]:
# checking again for null values
movies.isna()

Unnamed: 0,title_x,imdb_id,title_y,original_title,is_adult,year_of_release,runtime,genres,imdb_rating,imdb_votes,story,actors,wins_nominations,release_date
0,False,False,False,False,False,False,False,False,False,False,False,False,False,False
1,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4325,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4326,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4327,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4328,False,False,False,False,False,False,False,False,False,False,False,False,False,False
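
The truncated table only displays the first and last rows, so it cannot confirm that every cell is filled. A sketch of a single-value check, run here on a hypothetical sample rather than on `movies`:

```python
import pandas as pd

# Hypothetical sample with missing values in two columns
sample = pd.DataFrame({
    "imdb_votes": [35112.0, None, 5549.0],
    "story": ["Divided over five chapters", "The story revolves", None],
})
sample = sample.fillna(0)

# First any() reduces each column, second any() reduces across columns
has_nulls = bool(sample.isna().any().any())
print(has_nulls)
```

Applied to the real frame, the same expression would be `movies.isna().any().any()`.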
