The purpose of this project is to develop a multiple linear regression model to analyze the factors that make a movie popular. The dataset contains information extracted from IMDB and Rotten Tomatoes for a random sample of movies. Popularity is measured by audience_score, which is the output variable. The predictors are the type of movie, genre, runtime, IMDB rating, number of IMDB votes, critics rating, critics score, audience rating, and Oscar awards obtained (actor, actress, director and picture). SPAP has been uploaded for the same.
The variables in the dataset are described below:
title_type: Type of movie (Documentary, Feature Film, TV Movie)
genre: Genre of movie (Action & Adventure, Comedy, Documentary, Drama, Horror, Mystery & Suspense, Other)
runtime: Runtime of movie (in minutes)
imdb_rating: Rating on IMDB
imdb_num_votes: Number of votes on IMDB
critics_rating: Categorical variable for critics rating on Rotten Tomatoes (Certified Fresh, Fresh, Rotten)
critics_score: Critics score on Rotten Tomatoes
audience_rating: Categorical variable for audience rating on Rotten Tomatoes (Spilled, Upright)
audience_score: Audience score on Rotten Tomatoes
best_pic_win: Whether or not the movie won a best picture Oscar (no, yes)
best_actor_win: Whether or not one of the main actors in the movie ever won an Oscar (no, yes)
best_actress_win: Whether or not one of the main actresses in the movie ever won an Oscar (no, yes)
best_dir_win: Whether or not the director of the movie ever won an Oscar (no, yes)
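As a rough illustration of the kind of model described above, the following Python sketch fits an ordinary least squares regression of audience_score on three of the listed predictors (imdb_rating, critics_score and best_pic_win). It is a minimal sketch under stated assumptions, not the analysis used in this project: the helper names (encode_yes_no, fit_ols, predict, toy_score) are hypothetical, and the six rows are made-up values generated from an arbitrary linear rule so that the fit can be checked. They are not observations from the dataset, and the coefficients are not results.

```python
# Minimal sketch of multiple linear regression via the normal equations.
# ASSUMPTION: all rows and the generating rule below are made up for
# illustration only; they are NOT taken from the movies dataset.


def encode_yes_no(value):
    """Map a yes/no factor such as best_pic_win to a 0/1 dummy variable."""
    return 1.0 if value == "yes" else 0.0


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Return [intercept, b1, b2, ...] minimizing the squared error."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    p = len(design[0])
    xtx = [[sum(d[j] * d[k] for d in design) for k in range(p)] for j in range(p)]
    xty = [sum(d[j] * y for d, y in zip(design, targets)) for j in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Predicted audience_score for one encoded row of predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def toy_score(imdb_rating, critics_score, best_pic_win):
    """Arbitrary rule used only to generate checkable toy targets."""
    return 5.0 + 8.0 * imdb_rating + 0.2 * critics_score + 3.0 * best_pic_win


# Made-up rows: (imdb_rating, critics_score, best_pic_win)
raw_rows = [
    (6.5, 40.0, "no"),
    (7.8, 90.0, "yes"),
    (5.2, 65.0, "no"),
    (8.1, 55.0, "no"),
    (6.9, 80.0, "yes"),
    (4.5, 20.0, "no"),
]
rows = [(r, c, encode_yes_no(w)) for r, c, w in raw_rows]
scores = [toy_score(*row) for row in rows]

coefs = fit_ols(rows, scores)
print("intercept and slopes:", [round(c, 4) for c in coefs])
```

Because the toy targets follow the generating rule exactly, the fitted coefficients should match that rule up to rounding error. With the real data the fit would not be exact. Multi-level factors such as genre or critics_rating would need one dummy column per level except a reference level, and a statistics package would normally be used to obtain standard errors and p-values alongside the coefficients.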
Questions: