## Create and merge the DataFrames


This analysis uses the **movies_merge.xlsx** and **ott_merge.csv** data sets. Our objectives at this stage are to prepare for analysis by:

- Importing the Excel and CSV files into DataFrames.
- Viewing the DataFrames.
- Describing the DataFrames to understand the structures and data types. 
- Merging the two DataFrames into a single DataFrame.

The insights gained from the analysis will inform the campaign, promotional materials, slogans, and language the political party will use to reach potential voters.

## 1. Import Pandas

In [1]:
#import necessary packages
import pandas as pd

## 2. Import Excel file

In [2]:
# Load the Excel data using pd.read_excel (reading .xlsx files requires an engine such as openpyxl)
movies = pd.read_excel("movies_merge.xlsx")

print(movies.columns)

Index(['ID', 'Title', 'Year', 'Age', 'IMDb', 'Rotten Tomatoes', 'Directors',
       'Genres', 'Country', 'Language', 'Runtime'],
      dtype='object')


## 3. Import CSV file

In [3]:
# Load the CSV data using pd.read_csv
ott = pd.read_csv("ott_merge.csv")

print(ott.columns)

Index(['ID', 'Netflix', 'Hulu', 'Prime Video', 'Disney+'], dtype='object')
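
A quick way to confirm that an import produced the expected columns is to compare them against a list of required names. The sketch below is illustrative only: it reads a hypothetical three-row stand-in for `ott_merge.csv` from an in-memory string, and the names `sample_csv`, `ott_sample`, `expected_cols`, and `missing` are assumptions introduced for this example.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for ott_merge.csv (not the real file).
sample_csv = io.StringIO(
    "ID,Netflix,Hulu,Prime Video,Disney+\n"
    "1,0,0,1,0\n"
    "2,0,1,0,0\n"
    "3,0,0,1,0\n"
)

ott_sample = pd.read_csv(sample_csv)

# Column names taken from the ott.columns output above.
expected_cols = ["ID", "Netflix", "Hulu", "Prime Video", "Disney+"]

# Collect any expected column that did not arrive in the import.
missing = [col for col in expected_cols if col not in ott_sample.columns]
print("Missing columns:", missing)
```

On the real file, the same check would apply to `ott` directly after `pd.read_csv("ott_merge.csv")`.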


## 4. Validate the DataFrames

In [4]:
# Check that the movies data imported correctly: preview the first rows and the shape
print(movies.head())
print(movies.shape)

   ID                           Title  Year  Age  IMDb  Rotten Tomatoes  \
0   1                       Inception  2010  13+   8.8             0.87   
1   2                      The Matrix  1999  18+   8.7             0.87   
2   3          Avengers: Infinity War  2018  13+   8.5             0.84   
3   4              Back to the Future  1985   7+   8.5             0.96   
4   5  The Good, the Bad and the Ugly  1966  18+   8.8             0.97   

                        Directors                            Genres  \
0               Christopher Nolan  Action,Adventure,Sci-Fi,Thriller   
1  Lana Wachowski,Lilly Wachowski                     Action,Sci-Fi   
2         Anthony Russo,Joe Russo           Action,Adventure,Sci-Fi   
3                 Robert Zemeckis           Adventure,Comedy,Sci-Fi   
4                    Sergio Leone                           Western   

                        Country                 Language  Runtime  
0  United States,United Kingdom  English,Japanese,Fren

In [5]:
# Check that the OTT data imported correctly: preview the first rows and the shape
print(ott.head())
print(ott.shape)

   ID  Netflix  Hulu  Prime Video  Disney+
0   1        0     0            1        0
1   2        0     1            0        0
2   3        0     0            1        0
3   4        1     0            0        0
4   5        0     0            1        0
(16744, 5)
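
Beyond `head()` and `shape`, it can help to validate the join key before merging. The following is a minimal sketch on small hypothetical frames (`movies_sample`, `ott_sample`), with a helper `check_key` that is an assumption of this example and not part of pandas. It reports the row count, whether the key is unique, and how many keys are missing.

```python
import pandas as pd

# Small hypothetical frames standing in for movies and ott.
movies_sample = pd.DataFrame(
    {"ID": [1, 2, 3], "Title": ["Inception", "The Matrix", "Back to the Future"]}
)
ott_sample = pd.DataFrame({"ID": [1, 2, 3], "Netflix": [0, 0, 1]})


def check_key(df, key="ID"):
    """Summarise a key column: row count, uniqueness, and missing values."""
    return {
        "rows": len(df),
        "unique_keys": bool(df[key].is_unique),
        "missing_keys": int(df[key].isna().sum()),
    }


print(check_key(movies_sample))
print(check_key(ott_sample))

# Do both frames cover exactly the same set of IDs?
same_ids = set(movies_sample["ID"]) == set(ott_sample["ID"])
print("Same ID sets:", same_ids)
```

If the key is unique in both frames and the ID sets match, a merge on `ID` should return one row per movie.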


## 5. Describe the data types

In [6]:
# Inspect the data type of each column in both DataFrames
print(ott.dtypes)
print(movies.dtypes)

ID             int64
Netflix        int64
Hulu           int64
Prime Video    int64
Disney+        int64
dtype: object
ID                   int64
Title               object
Year                 int64
Age                 object
IMDb               float64
Rotten Tomatoes    float64
Directors           object
Genres              object
Country             object
Language            object
Runtime            float64
dtype: object
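
Since the merge in the next step joins on `ID`, one useful follow-up to the `dtypes` output is to confirm that the key has the same type in both frames and to list the numeric columns. This is a sketch on hypothetical sample frames; `same_key_dtype` and `numeric_cols` are names introduced for illustration.

```python
import pandas as pd

# Hypothetical two-row samples with a subset of the real columns.
movies_sample = pd.DataFrame(
    {
        "ID": [1, 2],
        "Title": ["Inception", "The Matrix"],
        "Year": [2010, 1999],
        "IMDb": [8.8, 8.7],
    }
)
ott_sample = pd.DataFrame({"ID": [1, 2], "Netflix": [0, 0]})

# A key with mismatched dtypes (e.g. int vs. string) can make a merge fail or match nothing.
same_key_dtype = movies_sample["ID"].dtype == ott_sample["ID"].dtype
print("Key dtypes match:", same_key_dtype)

# select_dtypes filters columns by type; "number" covers ints and floats.
numeric_cols = movies_sample.select_dtypes(include="number").columns.tolist()
print("Numeric columns:", numeric_cols)
```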


## 6. Combine the two DataFrames
### a) merge()

In [7]:
# Left-merge on the shared "ID" key: keep every row of movies and add the matching platform columns from ott
df_mov_ott = pd.merge(movies, ott, how="left", on="ID")

print(df_mov_ott.head())
print(df_mov_ott.shape)

   ID                           Title  Year  Age  IMDb  Rotten Tomatoes  \
0   1                       Inception  2010  13+   8.8             0.87   
1   2                      The Matrix  1999  18+   8.7             0.87   
2   3          Avengers: Infinity War  2018  13+   8.5             0.84   
3   4              Back to the Future  1985   7+   8.5             0.96   
4   5  The Good, the Bad and the Ugly  1966  18+   8.8             0.97   

                        Directors                            Genres  \
0               Christopher Nolan  Action,Adventure,Sci-Fi,Thriller   
1  Lana Wachowski,Lilly Wachowski                     Action,Sci-Fi   
2         Anthony Russo,Joe Russo           Action,Adventure,Sci-Fi   
3                 Robert Zemeckis           Adventure,Comedy,Sci-Fi   
4                    Sergio Leone                           Western   

                        Country                 Language  Runtime  Netflix  \
0  United States,United Kingdom  English,Jap
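
A left merge keeps every row of the left frame and fills the right-hand columns with `NaN` where no key matches. The sketch below uses hypothetical sample frames to show two optional `pd.merge` parameters that make this behaviour easier to audit: `validate="one_to_one"` raises an error if the key is duplicated on either side, and `indicator=True` adds a `_merge` column recording where each row came from.

```python
import pandas as pd

# Hypothetical samples: ID 3 deliberately has no match in ott_sample.
movies_sample = pd.DataFrame(
    {"ID": [1, 2, 3], "Title": ["Inception", "The Matrix", "Back to the Future"]}
)
ott_sample = pd.DataFrame({"ID": [1, 2], "Netflix": [0, 0], "Hulu": [0, 1]})

merged = pd.merge(
    movies_sample,
    ott_sample,
    how="left",
    on="ID",
    validate="one_to_one",  # fail loudly if ID is duplicated on either side
    indicator=True,         # adds a "_merge" column: both / left_only / right_only
)

# Rows present in movies_sample but absent from ott_sample.
unmatched = merged[merged["_merge"] == "left_only"]
print(merged)
print("Unmatched IDs:", unmatched["ID"].tolist())
```

Whether these checks are needed for the real data depends on how the two files were produced; they are inexpensive to run either way.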

### b) concat()

In [8]:
# Concatenate the DataFrames row-wise (axis=0); columns missing from one frame are filled with NaN
mov_ott_concat = pd.concat([movies, ott], axis=0)

print(mov_ott_concat.head())
print(mov_ott_concat.shape)

   ID                           Title    Year  Age  IMDb  Rotten Tomatoes  \
0   1                       Inception  2010.0  13+   8.8             0.87   
1   2                      The Matrix  1999.0  18+   8.7             0.87   
2   3          Avengers: Infinity War  2018.0  13+   8.5             0.84   
3   4              Back to the Future  1985.0   7+   8.5             0.96   
4   5  The Good, the Bad and the Ugly  1966.0  18+   8.8             0.97   

                        Directors                            Genres  \
0               Christopher Nolan  Action,Adventure,Sci-Fi,Thriller   
1  Lana Wachowski,Lilly Wachowski                     Action,Sci-Fi   
2         Anthony Russo,Joe Russo           Action,Adventure,Sci-Fi   
3                 Robert Zemeckis           Adventure,Comedy,Sci-Fi   
4                    Sergio Leone                           Western   

                        Country                 Language  Runtime  Netflix  \
0  United States,United Kingdom 
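
Note that `Year` now prints as `2010.0`: stacking with `axis=0` places the `ott` rows underneath the `movies` rows, so each frame's missing columns are filled with `NaN` and integer columns are upcast to float. The sketch below, on hypothetical two-row samples, contrasts `axis=0` with `axis=1` to illustrate how `concat` differs from a key-based merge.

```python
import pandas as pd

# Hypothetical two-row samples.
movies_sample = pd.DataFrame({"ID": [1, 2], "Year": [2010, 1999]})
ott_sample = pd.DataFrame({"ID": [1, 2], "Netflix": [0, 0]})

# axis=0 stacks rows: 2 + 2 rows, union of the columns, NaN where a column is absent.
stacked = pd.concat([movies_sample, ott_sample], axis=0)

# axis=1 aligns on the row index and places columns side by side (ID appears twice).
side_by_side = pd.concat([movies_sample, ott_sample], axis=1)

print(stacked.shape, stacked["Year"].dtype)
print(side_by_side.shape, list(side_by_side.columns))
```

Neither form matches rows on `ID`, which is why `merge()` is usually the better fit when two frames share a key.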

### c) append()

In [9]:
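# Append the rows of ott to movies
# Note: DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0;
# on newer versions, use pd.concat([movies, ott]) instead.
mov_ott_append = movies.append(ott)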

print(mov_ott_append.head())
print(mov_ott_append.shape)

   ID                           Title    Year  Age  IMDb  Rotten Tomatoes  \
0   1                       Inception  2010.0  13+   8.8             0.87   
1   2                      The Matrix  1999.0  18+   8.7             0.87   
2   3          Avengers: Infinity War  2018.0  13+   8.5             0.84   
3   4              Back to the Future  1985.0   7+   8.5             0.96   
4   5  The Good, the Bad and the Ugly  1966.0  18+   8.8             0.97   

                        Directors                            Genres  \
0               Christopher Nolan  Action,Adventure,Sci-Fi,Thriller   
1  Lana Wachowski,Lilly Wachowski                     Action,Sci-Fi   
2         Anthony Russo,Joe Russo           Action,Adventure,Sci-Fi   
3                 Robert Zemeckis           Adventure,Comedy,Sci-Fi   
4                    Sergio Leone                           Western   

                        Country                 Language  Runtime  Netflix  \
0  United States,United Kingdom 
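
Because `DataFrame.append()` is no longer available from pandas 2.0 onward, the same row-wise result can be obtained with `pd.concat`. The sketch below uses hypothetical two-row samples; `ignore_index=True` is an optional addition, not part of the original cell, that renumbers the rows instead of repeating each frame's original index.

```python
import pandas as pd

# Hypothetical two-row samples.
movies_sample = pd.DataFrame({"ID": [1, 2], "Title": ["Inception", "The Matrix"]})
ott_sample = pd.DataFrame({"ID": [1, 2], "Netflix": [0, 0]})

# Row-wise replacement for movies_sample.append(ott_sample).
appended = pd.concat([movies_sample, ott_sample], ignore_index=True)

print(appended)
print(appended.shape)
```

As with `concat(axis=0)`, the rows are stacked rather than matched on `ID`, so each movie appears on two separate rows with `NaN` in the columns that came from the other frame.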