### Loading the Datasets
We will load all of our packages and both datasets up front.

In [2]:
# Loading all packages
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


In [3]:
# Loading both datasets: listening history (baby) and additional track metadata (toddler)
baby = pd.read_csv("spotify_history.csv")
toddler = pd.read_csv("spotify_additional_metadata.csv")

### Merging the Datasets
Now, we will merge our two datasets into one.

In [5]:
# Merge datasets using the common key "spotify_track_uri"
merged_data = pd.merge(baby, toddler, on="spotify_track_uri", how="inner")

# Save the new merged dataset
merged_data.to_csv("merged_spotify_data.csv", index=False)

# Display a sample of merged data
merged_data.head()

Unnamed: 0,spotify_track_uri,ts,platform,ms_played,track_name,artist_name,album_name,reason_start,reason_end,shuffle,skipped,genre,popularity,release_year,user_rating
0,2J3n32GeLmMjwuAzyhcSNe,2013-07-08 02:44:34,web player,3185,"Say It, Just Say It",The Mowgli's,Waiting For The Dawn,autoplay,clickrow,False,False,Pop,70,2015,1.3
1,1oHxIPqJyvAYHy0PVrDU98,2013-07-08 02:45:37,web player,61865,Drinking from the Bottle (feat. Tinie Tempah),Calvin Harris,18 Months,clickrow,clickrow,False,False,Rock,90,1990,4.6
2,1oHxIPqJyvAYHy0PVrDU98,2013-07-08 02:45:37,web player,61865,Drinking from the Bottle (feat. Tinie Tempah),Calvin Harris,18 Months,clickrow,clickrow,False,False,Rock,57,2009,3.9
3,487OPlneJNni3NWC8SYqhW,2013-07-08 02:50:24,web player,285386,Born To Die,Lana Del Rey,Born To Die - The Paradise Edition,clickrow,unknown,False,False,Jazz,53,1998,3.0
4,487OPlneJNni3NWC8SYqhW,2013-07-08 02:50:24,web player,285386,Born To Die,Lana Del Rey,Born To Die - The Paradise Edition,clickrow,unknown,False,False,Rock,65,2010,4.8


### Merge Explanation
I chose an _inner join_ because it keeps only the records whose spotify_track_uri appears in both datasets. This:
- Drops unmatched records from either dataset, so no row ends up with only half of its columns filled in.
- Ensures that each spotify_track_uri in the final dataset has both playback data and additional metadata.

Since an inner join keeps only the keys found in both datasets, no NA values are created by missing matches.
However, if we wanted to keep tracks that appear in only one of the datasets, an outer join could have been used instead: it keeps all records from both datasets and fills the columns that have no match with NA.
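
To make the difference between the two join types concrete, here is a minimal sketch on toy data. The frames `history` and `metadata` and their values are made up for illustration and are not taken from the real files; only the key name `spotify_track_uri` comes from our datasets.

```python
import pandas as pd

# Toy stand-ins for the two datasets (hypothetical values, not the real files)
history = pd.DataFrame({
    "spotify_track_uri": ["a", "b", "c"],
    "ms_played": [1000, 2000, 3000],
})
metadata = pd.DataFrame({
    "spotify_track_uri": ["b", "c", "d"],
    "genre": ["Pop", "Rock", "Jazz"],
})

# Inner join: only keys present in both frames survive ("b" and "c")
inner = pd.merge(history, metadata, on="spotify_track_uri", how="inner")

# Outer join: every key from either frame survives; indicator=True adds a
# "_merge" column that records which frame(s) each row came from
outer = pd.merge(history, metadata, on="spotify_track_uri", how="outer", indicator=True)

print(inner)
print(outer)
```

In the outer result, the row for "a" has NA in `genre` and the row for "d" has NA in `ms_played`, which is exactly the kind of gap the inner join avoids.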

### Taking a Look at our Merged Dataset
Let's take a look at the final product of our merge.

In [8]:
from IPython.display import display  # Table display created using GPT (shows the merged data as a table)
display(merged_data)

Unnamed: 0,spotify_track_uri,ts,platform,ms_played,track_name,artist_name,album_name,reason_start,reason_end,shuffle,skipped,genre,popularity,release_year,user_rating
0,2J3n32GeLmMjwuAzyhcSNe,2013-07-08 02:44:34,web player,3185,"Say It, Just Say It",The Mowgli's,Waiting For The Dawn,autoplay,clickrow,False,False,Pop,70,2015,1.3
1,1oHxIPqJyvAYHy0PVrDU98,2013-07-08 02:45:37,web player,61865,Drinking from the Bottle (feat. Tinie Tempah),Calvin Harris,18 Months,clickrow,clickrow,False,False,Rock,90,1990,4.6
2,1oHxIPqJyvAYHy0PVrDU98,2013-07-08 02:45:37,web player,61865,Drinking from the Bottle (feat. Tinie Tempah),Calvin Harris,18 Months,clickrow,clickrow,False,False,Rock,57,2009,3.9
3,487OPlneJNni3NWC8SYqhW,2013-07-08 02:50:24,web player,285386,Born To Die,Lana Del Rey,Born To Die - The Paradise Edition,clickrow,unknown,False,False,Jazz,53,1998,3.0
4,487OPlneJNni3NWC8SYqhW,2013-07-08 02:50:24,web player,285386,Born To Die,Lana Del Rey,Born To Die - The Paradise Edition,clickrow,unknown,False,False,Rock,65,2010,4.8
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6043051,6iGU74CwXuT4XVepjc9Emf,2024-12-15 23:06:25,android,1893,God Only Knows - Mono,The Beach Boys,Pet Sounds,fwdbtn,fwdbtn,True,True,Pop,98,2007,1.4
6043052,6iGU74CwXuT4XVepjc9Emf,2024-12-15 23:06:25,android,1893,God Only Knows - Mono,The Beach Boys,Pet Sounds,fwdbtn,fwdbtn,True,True,Pop,60,2006,4.4
6043053,6iGU74CwXuT4XVepjc9Emf,2024-12-15 23:06:25,android,1893,God Only Knows - Mono,The Beach Boys,Pet Sounds,fwdbtn,fwdbtn,True,True,Jazz,88,2008,3.9
6043054,6iGU74CwXuT4XVepjc9Emf,2024-12-15 23:06:25,android,1893,God Only Knows - Mono,The Beach Boys,Pet Sounds,fwdbtn,fwdbtn,True,True,Rock,69,2021,2.4
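
In the output above, the same play event (identical ts and ms_played) shows up more than once, each time with a different genre, popularity, release_year and user_rating. That would be consistent with the metadata file holding several rows per spotify_track_uri, in which case the merge pairs every play with every matching metadata row. The sketch below shows one way this could be checked; it uses toy data (the frames `history` and `metadata` are hypothetical), so it illustrates the check and does not say anything about the real files.

```python
import pandas as pd

# Toy data (hypothetical): track "a" is played twice and has two metadata rows
history = pd.DataFrame({
    "spotify_track_uri": ["a", "a", "b"],
    "ms_played": [1000, 1500, 2000],
})
metadata = pd.DataFrame({
    "spotify_track_uri": ["a", "a", "b"],
    "genre": ["Pop", "Rock", "Jazz"],
})

# Count metadata rows whose key is shared with at least one other metadata row
dup_keys = metadata["spotify_track_uri"].duplicated(keep=False).sum()
print("metadata rows with a repeated key:", dup_keys)

# The merge pairs each play with each matching metadata row:
# 2 plays x 2 metadata rows for "a", plus 1 x 1 for "b"
merged = pd.merge(history, metadata, on="spotify_track_uri", how="inner")
print("rows after merge:", len(merged))

# validate="many_to_one" makes pandas raise a MergeError when the key
# is not unique in the right-hand frame
try:
    pd.merge(history, metadata, on="spotify_track_uri",
             how="inner", validate="many_to_one")
    key_is_unique = True
except pd.errors.MergeError:
    key_is_unique = False
print("metadata key is unique:", key_is_unique)
```

If the same check on the real metadata reported repeated keys, the row count of merged_data would reflect those pairings and not the number of plays.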
