Movie_Market_Analysis_Project

Authors:

MS Consulting Incorporated: Sam Videlock & Mayank Phanse

Provide actionable insights and recommendations to Microsoft based on data analytics.

Franchise movies significantly outperform their non-franchise counterparts. We recommend purchasing established franchises as well as creating original content.
Spending more typically leads to higher profits, with the highest profit potential at budgets between $200 and $250 million.
When you release your film has a major impact on total sales. Releasing between April and June gives you the best chance of success, with Q4 release dates in second place.

Pull data from: https://www.the-numbers.com/movie/budgets/all (saved as 'full_data'). This file contains roughly 600 of the highest-budgeted movies. We restricted our data to movies released from 2006 to mid 2019.
Read that CSV into a Jupyter Notebook; ours is called 'Movies_1.ipynb'.
The dataframe had good initial data on movie title, worldwide gross sales, and budget, but little else.
We then connected to The Movie Database API (https://developers.themoviedb.org/3/getting-started/introduction) to pull down additional details about each title, such as genre, rating, and release date.
This step involved some data manipulation; refer to that notebook to see the work performed. Two dataframes were saved as CSV files.
The next three notebooks, 'Combining DataFrames_2', 'GenreCorrection_3' and 'Month_data_4', perform different tasks to clean up and organize the data. Descriptions are provided in each notebook; refer to the files for specifics.
Our final file was saved as 'table_month_roi.csv'.
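
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code from the notebooks. The column names (`release_date`, `movie`, `production_budget`, `worldwide_gross`), the date format, the 30 June 2019 cutoff for "mid 2019", and the ROI definition (gross minus budget, divided by budget) are all assumptions.

```python
import csv
import io
from collections import defaultdict
from datetime import date, datetime

# Assumed window: the README says 2006 to mid 2019; the exact end date is a guess.
START, END = date(2006, 1, 1), date(2019, 6, 30)


def parse_money(text):
    """Turn a string such as '$425,000,000' into an integer number of dollars."""
    return int(text.replace("$", "").replace(",", "").strip())


def load_movies(csv_file):
    """Read rows, parse dates and dollar amounts, keep only the assumed window."""
    movies = []
    for row in csv.DictReader(csv_file):
        released = datetime.strptime(row["release_date"], "%b %d, %Y").date()
        if not START <= released <= END:
            continue
        budget = parse_money(row["production_budget"])
        gross = parse_money(row["worldwide_gross"])
        movies.append({
            "title": row["movie"],
            "month": released.month,
            "budget": budget,
            "gross": gross,
            "roi": (gross - budget) / budget,  # assumed ROI definition
        })
    return movies


def mean_roi_by_month(movies):
    """Average ROI for each release month (1 = January ... 12 = December)."""
    by_month = defaultdict(list)
    for m in movies:
        by_month[m["month"]].append(m["roi"])
    return {month: sum(v) / len(v) for month, v in sorted(by_month.items())}


# Tiny made-up sample, only to show the shape of the input.
sample = io.StringIO(
    "release_date,movie,production_budget,worldwide_gross\n"
    '"May 1, 2010",Example A,"$100,000,000","$300,000,000"\n'
    '"May 20, 2015",Example B,"$200,000,000","$400,000,000"\n'
    '"Dec 18, 2004",Example C,"$50,000,000","$60,000,000"\n'
)
movies = load_movies(sample)
print(mean_roi_by_month(movies))
```

The 2004 row falls outside the assumed window and is dropped, so only the two May releases contribute to the monthly average.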

Pull data from: https://www.the-numbers.com/box-office-records/worldwide/all-movies/cumulative/released-in-2018
We repeated this for 2014 - 2017, in addition to the link above, which covers 2018.
We had to classify movies ourselves as franchise or non-franchise for the top 7 grossing movies of each category in each year. The results can be found in the file 'franchise_movie_data_with_movies.csv'.
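
As a rough illustration of the top 7 selection described above, the sketch below groups rows by year and franchise flag and keeps the highest grossing entries in each group. The field names (`year`, `title`, `is_franchise`, `worldwide_gross`) and the sample rows are hypothetical; the actual layout of 'franchise_movie_data_with_movies.csv' may differ, and the franchise flag is assumed to be assigned by hand.

```python
from collections import defaultdict


def top_n_by_group(rows, n=7):
    """Return the n highest-grossing rows for each (year, is_franchise) pair."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["year"], row["is_franchise"])].append(row)
    return {
        key: sorted(items, key=lambda r: r["worldwide_gross"], reverse=True)[:n]
        for key, items in groups.items()
    }


def average_gross(top_rows):
    """Mean worldwide gross per group, for comparing franchise vs. non-franchise."""
    return {
        key: sum(r["worldwide_gross"] for r in items) / len(items)
        for key, items in top_rows.items()
    }


# Made-up rows; the franchise flag is assumed to be hand-assigned.
rows = [
    {"year": 2018, "title": "Sequel X", "is_franchise": True, "worldwide_gross": 900},
    {"year": 2018, "title": "Sequel Y", "is_franchise": True, "worldwide_gross": 700},
    {"year": 2018, "title": "Original Z", "is_franchise": False, "worldwide_gross": 400},
]
top = top_n_by_group(rows, n=1)
print(average_gross(top))
```

With the default `n=7`, the same call would reproduce the "top 7 of each category in each year" selection on a full dataset.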

The budget vs. profit boxplot was created in Python in the file 'visual_proj1'. Refer to that file for instructions and the code used to create the visual.
The other visuals were created in Power BI and can be found in the files 'monthly_visualization' & 'top_7_franchise'.
Canva was used for the presentation deck; all visualizations can be found at the link above.
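
For the budget vs. profit boxplot mentioned above, the underlying grouping might look something like the sketch below. It is not the code in 'visual_proj1'. The $50 million bin width, the profit definition (gross minus budget), and the sample figures are assumptions. It only computes the median profit per budget bin, which is the center line a boxplot would draw for each group.

```python
from collections import defaultdict
from statistics import median

# Assumed bin size; the README only mentions a $200 - $250 million band.
BIN_WIDTH = 50_000_000


def budget_bin(budget):
    """Label a budget with its bin, e.g. 220_000_000 -> '$200-250M'."""
    step = BIN_WIDTH // 1_000_000
    low = (budget // BIN_WIDTH) * step
    return f"${low}-{low + step}M"


def median_profit_by_bin(movies):
    """Group profit (gross minus budget) by budget bin and take the median."""
    profits = defaultdict(list)
    for m in movies:
        profits[budget_bin(m["budget"])].append(m["gross"] - m["budget"])
    return {label: median(values) for label, values in profits.items()}


# Made-up figures, only to show the shape of the calculation.
movies = [
    {"budget": 220_000_000, "gross": 700_000_000},
    {"budget": 240_000_000, "gross": 500_000_000},
    {"budget": 80_000_000, "gross": 100_000_000},
]
result = median_profit_by_bin(movies)
print(result)
```

A plotting library could then draw one box per bin from the same grouped profit lists.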
