## Exploratory Data Analysis for Microsoft's New Movie Studio

## Overview

In this project, we perform exploratory data analysis (EDA) to help Microsoft decide 
what types of films to produce at its new movie studio. We analyze trends across several 
movie datasets and turn the findings into actionable recommendations.


## Business Problem

Microsoft sees the big companies creating original video content and wants to join them. 
It has decided to create a new movie studio but lacks experience in making movies. Our task is to explore 
which types of films currently perform best at the box office and translate those findings into actionable insights 
for Microsoft's new movie studio.

## Objectives

The objectives of this project are to:
1. Analyze current trends in the movie industry to identify high-performing genres and film characteristics.
2. Determine the key factors that contribute to a movie's success at the box office.
3. Provide actionable recommendations to Microsoft on the types of films they should consider producing.

## Loading Data

In [1]:
import pandas as pd  # tabular data loading and analysis
import sqlite3       # access to the SQLite database

In [14]:
# Load CSV and TSV files
bom_movie_gross = pd.read_csv('data/bom.movie_gross.csv')
rt_movie_info = pd.read_csv('data/rt.movie_info.tsv', sep='\t')
rt_reviews = pd.read_csv('data/rt.reviews.tsv', sep='\t', encoding='ISO-8859-1')
tmdb_movies = pd.read_csv('data/tmdb.movies.csv')
tn_movie_budgets = pd.read_csv('data/tn.movie_budgets.csv')

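# Load SQLite database tables into DataFrames
# Note: conn stays open here; call conn.close() once no further queries are needed
conn = sqlite3.connect('data/im.db')
imdb_movie_basics = pd.read_sql_query("SELECT * FROM movie_basics", conn)
imdb_movie_ratings = pd.read_sql_query("SELECT * FROM movie_ratings", conn)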
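
Before selecting from specific tables, it can help to confirm which tables the database actually contains. The following is a minimal sketch of one way to do that, not part of the original notebook. The helpers `list_tables` and `count_rows` are hypothetical names introduced here, and the demo runs against a small in-memory database with made-up columns rather than the real `data/im.db`.

```python
import sqlite3
from contextlib import closing


def list_tables(conn):
    """Return the names of all user tables in a SQLite database, sorted."""
    query = (
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    )
    with closing(conn.cursor()) as cur:
        return [row[0] for row in cur.execute(query)]


def count_rows(conn, table):
    """Return the row count of a table, after checking that the table exists."""
    # Table names cannot be bound as SQL parameters, so validate the name first.
    if table not in list_tables(conn):
        raise ValueError(f"unknown table: {table}")
    with closing(conn.cursor()) as cur:
        return cur.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


# Demo on an in-memory database (assumption: stand-in for data/im.db;
# the column names below are illustrative only).
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE movie_basics (movie_id TEXT, title TEXT)")
demo_conn.execute("CREATE TABLE movie_ratings (movie_id TEXT, rating REAL)")
demo_conn.execute("INSERT INTO movie_basics VALUES ('id-1', 'Example Film')")

demo_tables = list_tables(demo_conn)
demo_counts = {name: count_rows(demo_conn, name) for name in demo_tables}
demo_conn.close()

print(demo_tables)
print(demo_counts)
```

With the real database, the same helpers could be called on the existing `conn` object, for example `list_tables(conn)`, to check that `movie_basics` and `movie_ratings` are present before loading them.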