ddcots24/Movie-Data-Analysis-Microsoft

Movie-Data-Analysis-Microsoft

Overview

Topic: Microsoft wants to enter the movie industry

  • As part of Microsoft's Data Science team, we were tasked with giving three recommendations for entering the industry
  • We used SQL together with the Python package pandas to load and clean the data sets, analyze them, create visualizations and make recommendations
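
As an illustration of the SQL plus pandas workflow described above, here is a minimal sketch. It assumes the IMDB data can be queried through SQLite; the table name `movie_basics`, its columns and the sample rows are hypothetical stand-ins, not the project's actual schema or data.

```python
import sqlite3

import pandas as pd

# Hypothetical stand-in for the IMDB database: a small in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie_basics ("
    "movie_id TEXT, primary_title TEXT, start_year INTEGER, "
    "runtime_minutes REAL, genres TEXT)"
)
conn.executemany(
    "INSERT INTO movie_basics VALUES (?, ?, ?, ?, ?)",
    [
        ("tt001", "Example A", 2012, 95.0, "Animation,Adventure"),
        ("tt002", "Example B", 2008, 120.0, "Action"),
        ("tt003", "Example C", 2018, None, "Sci-Fi"),
    ],
)
conn.commit()

# Use SQL to restrict the load to the 2010-2019 window.
query = """
SELECT movie_id, primary_title, start_year, runtime_minutes, genres
FROM movie_basics
WHERE start_year BETWEEN 2010 AND 2019
"""
movies = pd.read_sql_query(query, conn)
conn.close()

# Basic cleaning: drop rows with no run time, since run time is analyzed later.
movies_clean = movies.dropna(subset=["runtime_minutes"]).reset_index(drop=True)
```

Filtering by year in SQL keeps the DataFrame small before any pandas cleaning starts.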

Business Understanding

  • The key stakeholders are the company's senior leadership, such as the CEO and owners
  • Microsoft is entering a highly saturated industry with many large competitors
    • How can Microsoft best navigate starting a new movie studio?
  • Return on investment and profit are good measures of success in any industry
  • In our analysis we focused on recent movies (2010-2019) and their profitability
  • More specifically, we analyzed the profitability of the top movie genres, movie run times and actors/actresses

Data Understanding

  • We used two different data sets
    • IMDB website
      • A data set containing categorical information on movie start years, run times and the professions of people associated with each movie (actors, actresses, writers, directors, producers, etc.)
    • The Numbers website
      • A data set containing numerical information on movie production budgets and gross earnings, from which we calculated profit
  • We merged the two data sets to analyze movie profitability in our three areas of focus
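
The merge and profit calculation described above could look roughly like the sketch below. The column names, the dollar-string format, the join on movie title and the use of worldwide gross are all assumptions made for illustration; the sample rows are invented.

```python
import pandas as pd

# Invented sample rows standing in for the IMDB data.
imdb = pd.DataFrame({
    "primary_title": ["Example A", "Example B", "Example C"],
    "start_year": [2012, 2015, 2018],
    "genres": ["Animation", "Action", "Sci-Fi"],
})

# Invented sample rows standing in for The Numbers data (assumed dollar strings).
budgets = pd.DataFrame({
    "movie": ["Example A", "Example B", "Example D"],
    "production_budget": ["$100,000,000", "$50,000,000", "$10,000,000"],
    "worldwide_gross": ["$350,000,000", "$40,000,000", "$30,000,000"],
})


def dollars_to_int(series: pd.Series) -> pd.Series:
    """Strip '$' and ',' from dollar strings and convert to integers."""
    return series.str.replace(r"[$,]", "", regex=True).astype("int64")


for col in ["production_budget", "worldwide_gross"]:
    budgets[col] = dollars_to_int(budgets[col])

# Inner join keeps only movies present in both data sets.
merged = imdb.merge(budgets, left_on="primary_title", right_on="movie", how="inner")

# Profit is gross minus budget; ROI is profit relative to budget.
merged["profit"] = merged["worldwide_gross"] - merged["production_budget"]
merged["roi"] = merged["profit"] / merged["production_budget"]
```

Joining on title alone can mismatch remakes that share a name, so a real merge might also need to match on release year.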

Data Analysis

Most Profitable Genres

  • Based on our data, we ranked genres by median profit to find the most profitable ones (figure: Project 1 image 1)
  • Of the 13 genres with the highest median profit, we decided to focus on the top 4
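
A ranking of genres by median profit can be sketched as follows. It assumes each movie's genres are stored as a comma-separated string and that a movie counts toward every genre it lists; the column names and numbers are invented for illustration.

```python
import pandas as pd

# Invented sample data; profit values are arbitrary.
movies = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "genres": ["Animation", "Action", "Action,Sci-Fi", "Drama", "Adventure,Animation"],
    "profit": [300, 100, 200, 20, 250],
})

# One row per (movie, genre) pair, so multi-genre movies count in each genre.
by_genre = movies.assign(genre=movies["genres"].str.split(",")).explode("genre")

# Median profit per genre, highest first.
median_profit = (
    by_genre.groupby("genre")["profit"].median().sort_values(ascending=False)
)

# Keep the four genres with the highest median profit.
top_genres = median_profit.head(4).index.tolist()
```

The median is less sensitive than the mean to a handful of blockbuster outliers, which is one plausible reason to rank by it.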

Top 4 Genres' Profitability Based on Run Times

  • For each of these top 4 genres, we analyzed profit with respect to run time (figure: Project 1 image 2)
  • For animation movies, the most profitable run times fall in the 80-110 minute range
  • For the other three genres, the most profitable run times fall in the 140-155 minute interval
    • The 125-140 and 155-170 minute intervals are quite profitable as well
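
The run time intervals quoted above are consistent with 15-minute bins. Here is one way such a binned comparison could be sketched. The bin edges, the choice to include the lower edge of each bin, the use of the median and the sample rows are all assumptions.

```python
import pandas as pd

# Invented sample data; profit values are arbitrary.
movies = pd.DataFrame({
    "genre": ["Action", "Action", "Action", "Animation", "Animation"],
    "runtime_minutes": [145, 150, 100, 90, 130],
    "profit": [400, 300, 50, 250, 60],
})

# Assumed 15-minute bins: 65-80, 80-95, ..., 170-185.
edges = list(range(65, 186, 15))
labels = [f"{lo}-{hi}" for lo, hi in zip(edges[:-1], edges[1:])]

# right=False makes each bin include its lower edge, e.g. 140 <= runtime < 155.
movies["runtime_bin"] = pd.cut(
    movies["runtime_minutes"], bins=edges, labels=labels, right=False
)

# Median profit per (genre, run time bin), skipping empty combinations.
summary = (
    movies.groupby(["genre", "runtime_bin"], observed=True)["profit"]
    .median()
    .reset_index()
)

# For each genre, keep the row with the highest median profit.
best = summary.loc[summary.groupby("genre")["profit"].idxmax()]
```

With real data, bins holding only a few movies would produce noisy medians, so a minimum count per bin may be worth enforcing.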

Most Profitable Actors/Actresses in the Action Genre

  • Profitability of the top 10 actors in the action genre
    • We chose the action genre because, among the top 4 genres, it had the most actors and actresses in the data set
  • We filtered down to actors who had appeared in 4 or more different action movies, which we took as a sign of popularity and established success in the action genre (figure: Project 1 image 3)
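
The filter on actors with 4 or more action movies can be sketched as below. The table layout and all values are invented, and ranking by median profit is an assumption, since the ranking statistic for actors is not stated here.

```python
import pandas as pd

# Invented credits table: one row per (actor, action movie) pair.
credits = pd.DataFrame({
    "actor": ["Actor X"] * 4 + ["Actor Y"] * 3 + ["Actor Z"] * 5,
    "movie_id": [
        "m1", "m2", "m3", "m4",        # Actor X: 4 movies
        "m1", "m2", "m5",              # Actor Y: 3 movies
        "m1", "m2", "m3", "m4", "m6",  # Actor Z: 5 movies
    ],
    "profit": [
        100, 200, 300, 400,
        100, 200, 900,
        100, 200, 300, 400, 50,
    ],
})

# Count distinct movies and compute median profit per actor.
stats = credits.groupby("actor").agg(
    n_movies=("movie_id", "nunique"),
    median_profit=("profit", "median"),
)

# Keep actors with at least 4 different action movies, then take the top 10.
established = stats[stats["n_movies"] >= 4]
top_actors = established.sort_values("median_profit", ascending=False).head(10)
```

In this toy data Actor Y has one very profitable movie but is excluded by the 4-movie threshold, which shows how the filter favors a consistent track record.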

Conclusion

  • If Microsoft were to enter the movie industry, we would give 3 recommendations:
    1. Pursue one of the top 4 most profitable genres (Animation, Adventure, Sci-Fi and Action)
    2. For Adventure, Sci-Fi and Action, target run times of 140-155 minutes. For Animation, target run times of 80-110 minutes
    3. For Action movies, recruit one of the top 10 actors identified above

Repository Information

  • This repository contains:
    • 2 folders containing exploratory notebooks
    • A folder containing data
    • A .gitignore file
    • A presentation PDF
    • The final project Jupyter Notebook
    • A README file
