<h1 style="color: #8b5e3c;">Merging Game Level & Seat Level</h1>
In this Jupyter Notebook, we merge the datasets from GameLevel.csv and SeatLevel.csv. Combining them lets us look for possible relationships between features that live in the two separate datasets.

<h3 style="color: #8b5e3c">Converting the CSV File to a Pandas Dataframe</h3>
In this section, we import pandas and use `pd.read_csv` to load each `.csv` file into a pandas dataframe. Finally, we display the dataframes that we have created.

In [1]:
# importing the pandas library
import pandas as pd

# importing ipython display
from IPython.display import display

# importing the .csv files as dataframes
game_df = pd.read_csv("C:/GitHub/BucksHackathon25/BucksBusinessObjectives/BucksDatasets/GameLevel.csv")
seat_df = pd.read_csv("C:/GitHub/BucksHackathon25/BucksBusinessObjectives/BucksDatasets/SeatLevel.csv")

# displaying the data frames
display(game_df)
display(seat_df)


Unnamed: 0,Game,Giveaway
0,2023-10-26 Philadelphia 76ers,
1,2023-10-29 Atlanta Hawks,Cap
2,2023-10-30 Miami Heat,
3,2023-11-03 New York Knicks,
4,2023-11-08 Detroit Pistons,Lunch Bag
...,...,...
77,2025-03-30 Atlanta Hawks,Bucket Hat
78,2025-04-01 Phoenix Suns,
79,2025-04-08 Minnesota Timberwolves,
80,2025-04-10 New Orleans Pelicans,


Unnamed: 0,Season,AccountNumber,Game,GameDate,GameTier
0,2023,1,2024-01-24 Cleveland Cavaliers,2024-01-24,D
1,2023,1,2024-01-24 Cleveland Cavaliers,2024-01-24,D
2,2023,1,2024-01-24 Cleveland Cavaliers,2024-01-24,D
3,2023,1,2024-01-24 Cleveland Cavaliers,2024-01-24,D
4,2023,1,2024-01-24 Cleveland Cavaliers,2024-01-24,D
...,...,...,...,...,...
493879,2024,15667,2025-04-10 New Orleans Pelicans,2025-04-10,D
493880,2024,15667,2025-04-08 Minnesota Timberwolves,2025-04-08,B
493881,2024,15667,2025-04-08 Minnesota Timberwolves,2025-04-08,B
493882,2024,15667,2025-04-08 Minnesota Timberwolves,2025-04-08,B
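
Before merging, it can help to confirm the shape of each frame and that `Game` is unique in the game-level table, since a repeated key there would multiply seat rows during a merge. The following is a minimal sketch on toy data; the rows and the `toy_*` names are made-up stand-ins, not the real files.

```python
import io
import pandas as pd

# Toy stand-ins for GameLevel.csv and SeatLevel.csv (illustrative rows only)
game_csv = io.StringIO(
    "Game,Giveaway\n"
    "2023-10-29 Atlanta Hawks,Cap\n"
    "2023-10-30 Miami Heat,\n"
)
seat_csv = io.StringIO(
    "Season,AccountNumber,Game,GameDate,GameTier\n"
    "2023,1,2023-10-29 Atlanta Hawks,2023-10-29,B\n"
    "2023,1,2023-10-29 Atlanta Hawks,2023-10-29,B\n"
    "2023,2,2023-10-30 Miami Heat,2023-10-30,C\n"
)
toy_game_df = pd.read_csv(game_csv)
toy_seat_df = pd.read_csv(seat_csv)

# One row per game is expected in the game-level table
game_key_is_unique = toy_game_df["Game"].is_unique

# Seat-level rows repeat the game key once per seat
seat_rows_per_game = toy_seat_df["Game"].value_counts()

print(toy_game_df.shape, toy_seat_df.shape)
print(game_key_is_unique)
print(seat_rows_per_game)
```

An empty `Giveaway` field in the CSV is read as `NaN`, which is why games without a giveaway show up as missing values later on.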


<h3 style="color: #8b5e3c">Accessing the Details of the Rows</h3>
To make sure the merge will succeed, we inspect a pair of rows from the two datasets that we expect to be joined. Below, we look up the game `2025-04-08 Minnesota Timberwolves` in both the Seat dataset and the Game dataset so that we can later confirm the two rows merge together.

In [2]:
# printing out the details of a specific row
row = seat_df.iloc[493883]
print(row)

Season                                        2024
AccountNumber                                15667
Game             2025-04-08 Minnesota Timberwolves
GameDate                                2025-04-08
GameTier                                         B
Name: 493883, dtype: object


In [None]:
# printing out the giveaway details for the same game by filtering on the Game column
result = game_df[game_df['Game'] == '2025-04-08 Minnesota Timberwolves']
print(result)


                                 Game Giveaway
79  2025-04-08 Minnesota Timberwolves      NaN
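
Checking one game by hand works for a spot check. To cover every key at once, one option is to list the seat-level games that have no counterpart in the game-level table. This is only a sketch on toy frames; the `G1`, `G2`, `G3` labels and the `toy_*` names are hypothetical.

```python
import pandas as pd

# Toy frames: G3 appears in the seat data but not in the game data
toy_game_df = pd.DataFrame({"Game": ["G1", "G2"], "Giveaway": ["Cap", None]})
toy_seat_df = pd.DataFrame({"AccountNumber": [1, 1, 2], "Game": ["G1", "G2", "G3"]})

# Seat rows whose Game has no counterpart in the game-level table
unmatched = toy_seat_df[~toy_seat_df["Game"].isin(toy_game_df["Game"])]
unmatched_games = sorted(unmatched["Game"].unique().tolist())

print(unmatched_games)
```

An empty list here would suggest that every seat row can find its game during the merge.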


<h3 style="color: #8b5e3c">Merging the Two Datasets</h3>
Now that we've imported the datasets as pandas dataframes, we create a new dataframe by merging the two tables. The feature common to both datasets is `Game`, so we use it as the merge key. We perform a left merge, which keeps every row of the seat dataset and attaches the matching `Giveaway` value from the game dataset.

In [4]:
game_seat_df = pd.merge(seat_df, game_df, on='Game', how='left')
display(game_seat_df)

Unnamed: 0,Season,AccountNumber,Game,GameDate,GameTier,Giveaway
0,2023,1,2024-01-24 Cleveland Cavaliers,2024-01-24,D,Bucket Cap
1,2023,1,2024-01-24 Cleveland Cavaliers,2024-01-24,D,Bucket Cap
2,2023,1,2024-01-24 Cleveland Cavaliers,2024-01-24,D,Bucket Cap
3,2023,1,2024-01-24 Cleveland Cavaliers,2024-01-24,D,Bucket Cap
4,2023,1,2024-01-24 Cleveland Cavaliers,2024-01-24,D,Bucket Cap
...,...,...,...,...,...,...
493879,2024,15667,2025-04-10 New Orleans Pelicans,2025-04-10,D,
493880,2024,15667,2025-04-08 Minnesota Timberwolves,2025-04-08,B,
493881,2024,15667,2025-04-08 Minnesota Timberwolves,2025-04-08,B,
493882,2024,15667,2025-04-08 Minnesota Timberwolves,2025-04-08,B,
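
As a hedged illustration of the same left merge, `pd.merge` also accepts `validate` and `indicator` arguments that can make the join's assumptions explicit. The sketch below uses toy frames with hypothetical `G1`..`G3` keys rather than the real data.

```python
import pandas as pd

toy_seat_df = pd.DataFrame({
    "AccountNumber": [1, 1, 2],
    "Game": ["G1", "G1", "G2"],
})
toy_game_df = pd.DataFrame({
    "Game": ["G1", "G2", "G3"],
    "Giveaway": ["Cap", None, "Bobblehead"],
})

merged = pd.merge(
    toy_seat_df,
    toy_game_df,
    on="Game",
    how="left",
    validate="many_to_one",  # raises MergeError if Game repeats in toy_game_df
    indicator=True,          # adds a _merge column: both / left_only / right_only
)

print(merged)
```

With a left merge, the game `G3` is dropped because no seat row refers to it, and the result has exactly as many rows as the seat-level frame.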


<h3 style="color: #8b5e3c">Validating the Merge</h3>
Next, we validate the merged table. We would like to check that the `2025-04-08 Minnesota Timberwolves` observation still has the same features that we looked at before, now with the `Giveaway` column attached. To do this, we print the number of missing giveaways and use `.iloc` to inspect the observation.

In [5]:
# checking number of missing values in giveaway column
missing_giveaway_1 = game_df['Giveaway'].isna().sum()
print("Missing Giveaways: ", missing_giveaway_1)

# checking number of missing values in new giveaway column
missing_giveaway_2 = game_df['Giveaway'].isna().sum()
print("Missing Giveaways: ", missing_giveaway_2)

# checking the new row that we've generated
new_row = game_seat_df.iloc[493883]
print(new_row)

Missing Giveaways:  63
Missing Giveaways:  63
Season                                        2024
AccountNumber                                15667
Game             2025-04-08 Minnesota Timberwolves
GameDate                                2025-04-08
GameTier                                         B
Giveaway                                       NaN
Name: 493883, dtype: object
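
Both counts in the cell above are read from `game_df`, so they agree by construction. A check that involves the merged frame could compare row counts and count missing giveaways per game rather than per seat. The following is a sketch under that assumption, on toy frames with hypothetical names.

```python
import pandas as pd

toy_game_df = pd.DataFrame({"Game": ["G1", "G2", "G3"], "Giveaway": ["Cap", None, None]})
toy_seat_df = pd.DataFrame({"AccountNumber": [1, 1, 2, 3], "Game": ["G1", "G2", "G2", "G3"]})
toy_merged = pd.merge(toy_seat_df, toy_game_df, on="Game", how="left")

# A left merge on a unique right-hand key should keep the seat row count
rows_preserved = len(toy_merged) == len(toy_seat_df)

# Count games (not seats) with a missing giveaway, before and after merging
missing_before = toy_game_df["Giveaway"].isna().sum()
missing_after = toy_merged.drop_duplicates(subset="Game")["Giveaway"].isna().sum()

print(rows_preserved, missing_before, missing_after)
```

The two per-game counts are only expected to match when every game in the game-level table also appears in the seat-level table.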


<h3 style="color: #8b5e3c">Comparing with Another Merge</h3>
To check that a left merge is the right choice, we also perform an outer merge and look at the same position with `.iloc`. Note that an outer merge sorts the result by the merge key, so the row at position 493883 is not necessarily the `2025-04-08 Minnesota Timberwolves` observation. A different row at that position reflects the reordering and is not, by itself, evidence of an error in our merge.

In [7]:
game_seat_df_1 = pd.merge(seat_df, game_df, on='Game', how='outer')
display(game_seat_df_1)

Unnamed: 0,Season,AccountNumber,Game,GameDate,GameTier,Giveaway
0,2023,18,2023-10-26 Philadelphia 76ers,2023-10-26,B,
1,2023,40,2023-10-26 Philadelphia 76ers,2023-10-26,B,
2,2023,40,2023-10-26 Philadelphia 76ers,2023-10-26,B,
3,2023,77,2023-10-26 Philadelphia 76ers,2023-10-26,B,
4,2023,77,2023-10-26 Philadelphia 76ers,2023-10-26,B,
...,...,...,...,...,...,...
493879,2024,42997,2025-04-13 Detroit Pistons,2025-04-13,D,Bobblehead
493880,2024,15667,2025-04-13 Detroit Pistons,2025-04-13,D,Bobblehead
493881,2024,15667,2025-04-13 Detroit Pistons,2025-04-13,D,Bobblehead
493882,2024,15667,2025-04-13 Detroit Pistons,2025-04-13,D,Bobblehead


In [8]:
# checking the new row that we've generated
new_row = game_seat_df_1.iloc[493883]
print(new_row)

Season                                 2024
AccountNumber                         15667
Game             2025-04-13 Detroit Pistons
GameDate                         2025-04-13
GameTier                                  D
Giveaway                         Bobblehead
Name: 493883, dtype: object
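
Because the outer merge reorders rows, comparing the two results by position can be misleading. One alternative is to look rows up by their `Game` value. The sketch below uses toy frames, and `giveaways_for` is a hypothetical helper introduced only for this illustration.

```python
import pandas as pd

toy_seat_df = pd.DataFrame({"AccountNumber": [1, 2, 3], "Game": ["G3", "G1", "G2"]})
toy_game_df = pd.DataFrame({
    "Game": ["G1", "G2", "G3"],
    "Giveaway": ["Cap", "Lunch Bag", "Bobblehead"],
})

left = pd.merge(toy_seat_df, toy_game_df, on="Game", how="left")
outer = pd.merge(toy_seat_df, toy_game_df, on="Game", how="outer")


def giveaways_for(df, game):
    """Return the Giveaway values of all rows for one game (hypothetical helper)."""
    return df.loc[df["Game"] == game, "Giveaway"].tolist()


# The left merge keeps the seat-level row order
first_game_left = left.iloc[0]["Game"]

# Looking rows up by key does not depend on row order
same_giveaway = giveaways_for(left, "G3") == giveaways_for(outer, "G3")

print(first_game_left, same_giveaway)
```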


<h3 style="color: #8b5e3c">Converting to a `.csv` file</h3>
Now that we've completed our merge and are confident in the results, we can write the merged dataframe to a `.csv` file and use it for data visualization.

In [13]:
# converting data frame to a .csv file
game_seat_df.to_csv('C:/GitHub/BucksHackathon25/BucksBusinessObjectives/BucksDatasets/GLSL.csv', index=True)
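
Writing the file with the index included means the row index is saved as an extra, unnamed first column, which pandas labels `Unnamed: 0` when the file is read back. The sketch below shows the difference on a toy frame, using in-memory buffers instead of real file paths.

```python
import io
import pandas as pd

toy_df = pd.DataFrame({"Game": ["G1", "G2"], "Giveaway": ["Cap", None]})

# index=True writes the row index as an unnamed first column
with_index = io.StringIO()
toy_df.to_csv(with_index, index=True)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# index=False writes only the data columns
without_index = io.StringIO()
toy_df.to_csv(without_index, index=False)
without_index.seek(0)
reloaded_without_index = pd.read_csv(without_index)

print(list(reloaded_with_index.columns))
print(list(reloaded_without_index.columns))
```

Whether to keep the index is a design choice: dropping it avoids the extra column, while keeping it preserves the original row positions.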