<h1 style="color: #8b5e3c;">Merging AccountLevel, GameLevel & SeatLevel</h1>
In this Jupyter Notebook, we merge the `AccountLevel`, `GameLevel` and `SeatLevel` datasets. The game-level and seat-level data are loaded together from the `GLSL` files, and that combined table is then joined to `AccountLevel`. Bringing everything into one table lets us look for relationships between features that are otherwise spread across the three datasets.

<h3 style="color: #8b5e3c">Converting the CSV File to a Pandas Dataframe</h3>
In this section, we import pandas and use `pd.read_csv` to load each `.csv` file into a pandas DataFrame. We then split the account-level data by season and display two of the resulting DataFrames.
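
As a minimal sketch of the loading step, the snippet below reads a small in-memory CSV (hypothetical data standing in for the real files) whose first column is a saved row index. It shows two ways of handling the resulting `Unnamed: 0` column: dropping it after reading, or asking `read_csv` to use it as the index.

```python
import io
import pandas as pd

# hypothetical CSV text standing in for a file that was saved with its row index
csv_text = ",AccountNumber,Season\n0,101,2023\n1,102,2024\n"

# option 1: read everything, then drop the leftover index column
dropped_df = pd.read_csv(io.StringIO(csv_text)).drop(columns="Unnamed: 0")

# option 2: treat the first column as the index while reading
indexed_df = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(dropped_df.columns))
print(list(indexed_df.columns))
```

Either way the data columns end up the same; the first option fails with a `KeyError` if the file has no `Unnamed: 0` column.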

In [None]:
# importing the pandas library
import pandas as pd

# importing ipython display
from IPython.display import display

# importing the .csv files as dataframes
game_seat_df_2024 = pd.read_csv("C:/Users/galvanm/python/BucksHackathon25/BucksDatasets/GLSL_2024.csv").drop(columns="Unnamed: 0")
game_seat_df_2023 = pd.read_csv("C:/Users/galvanm/python/BucksHackathon25/BucksDatasets/GLSL_2023.csv").drop(columns="Unnamed: 0")
game_seat_df = pd.read_csv("C:/Users/galvanm/python/BucksHackathon25/BucksDatasets/GLSL.csv").drop(columns="Unnamed: 0")
account_df = pd.read_csv("C:/Users/galvanm/python/BucksHackathon25/BucksDatasets/AccountLevel.csv")

# splitting the AccountLevel dataset by Season
account_df_2024 = account_df[account_df['Season'] == 2024].sort_values('AccountNumber', ascending=True)
account_df_2023 = account_df[account_df['Season'] == 2023].sort_values('AccountNumber', ascending=True)

# displaying the data frames
display(game_seat_df_2024)
display(account_df_2024)


<h3 style="color: #8b5e3c">Extracting a Row for Validation</h3>
Next, we extract an observation from each dataset that we expect the merge to match. Because we merge on `AccountNumber`, we pick rows that share an account number, as shown below.
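
A minimal sketch of pulling out the rows for one account, using toy data and a hypothetical helper `rows_for_account` (not part of this notebook). Filtering on the key column does not depend on row order, unlike a positional lookup with `.iloc`.

```python
import pandas as pd

def rows_for_account(df, account_number):
    """Return all rows whose AccountNumber equals account_number."""
    return df[df["AccountNumber"] == account_number]

# toy stand-ins for the account-level and game/seat-level tables
toy_account_df = pd.DataFrame({"AccountNumber": [1, 2, 3], "Season": [2023, 2023, 2024]})
toy_game_seat_df = pd.DataFrame({"AccountNumber": [2, 2, 3], "Game": ["A", "B", "A"]})

print(rows_for_account(toy_account_df, 2))
print(rows_for_account(toy_game_seat_df, 2))
```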

In [None]:
# the row for account number
account_row = account_df.iloc[44210]
print(account_row)

In [None]:
# the rows for game & seat level with AccountNumber 15667
# (compared as strings so the filter works whether the column is stored as int or str)
game_seat_row = game_seat_df[game_seat_df['AccountNumber'].astype(str) == '15667']
print(game_seat_row)

<h3 style="color: #8b5e3c">Merging the Two Datasets</h3>
Now that the datasets are loaded as pandas DataFrames, we create a new DataFrame by merging the two tables. The feature the two datasets have in common is `AccountNumber`, so we use it as the join key. Because the key is a single column, it is a simple key rather than a composite key. We use a left merge, which keeps every row of the game-and-seat table and attaches the matching account-level columns.
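
To illustrate what a left merge on `AccountNumber` does, here is a small sketch on toy data (the column names `Game` and `Tenure` are made up for the example). The `validate` and `indicator` arguments of `pd.merge` are optional extras that make the behavior easier to inspect.

```python
import pandas as pd

# toy stand-ins: many game/seat rows per account, one account row per account
toy_game_seat_df = pd.DataFrame({
    "AccountNumber": [1, 1, 2, 4],
    "Game": ["A", "B", "A", "A"],
})
toy_account_df = pd.DataFrame({
    "AccountNumber": [1, 2, 3],
    "Tenure": [5, 2, 7],
})

merged = pd.merge(
    toy_game_seat_df,
    toy_account_df,
    on="AccountNumber",
    how="left",
    validate="many_to_one",  # raises MergeError if the right side repeats a key
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)
print(merged)
```

Account 4 has no match, so it is kept with a missing `Tenure`, while account 3 appears only on the right and is left out. If the account table held one row per account per season, `AccountNumber` alone would not be unique on the right side and `Season` would need to be part of the key; whether that applies to the real data is something to verify.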

In [None]:
# merging the two datasets together
account_game_seat_df = pd.merge(game_seat_df, account_df, on='AccountNumber', how='left')
account_game_seat_df_2024 = pd.merge(game_seat_df_2024, account_df_2024, on='AccountNumber', how='left')
account_game_seat_df_2023 = pd.merge(game_seat_df_2023, account_df_2023, on='AccountNumber', how='left')

display(account_game_seat_df.head(3))

<h3 style="color: #8b5e3c">Validating the Merge</h3>
Next, we validate the merge by checking that the rows we inspected before merging carry the same values afterwards. To do so, we display the merged rows whose `AccountNumber` is 15667.
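
Beyond spot-checking one account, a left merge can be sanity-checked more generally. The sketch below uses toy data and a hypothetical helper `check_left_merge` (not part of this notebook) to confirm that the merge neither added nor dropped rows from the left table.

```python
import pandas as pd

def check_left_merge(left, merged, key):
    """Basic sanity checks for a left merge; returns a dict of results."""
    return {
        "same_row_count": len(left) == len(merged),
        "same_keys_in_order": left[key].tolist() == merged[key].tolist(),
    }

toy_left = pd.DataFrame({"AccountNumber": [1, 1, 2], "Game": ["A", "B", "A"]})
toy_right = pd.DataFrame({"AccountNumber": [1, 2], "Tenure": [5, 2]})
toy_merged = pd.merge(toy_left, toy_right, on="AccountNumber", how="left")

print(check_left_merge(toy_left, toy_merged, "AccountNumber"))

# spot check: the account-level value for account 1 should appear on each of its rows
print(toy_merged[toy_merged["AccountNumber"] == 1])
```

A row count that grows after the merge usually means the right-hand table repeats a key.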

In [None]:
# the merged rows with AccountNumber 15667
# (compared as strings so the filter works whether the column is stored as int or str)
account_game_seat_row = account_game_seat_df[account_game_seat_df['AccountNumber'].astype(str) == '15667']
print(account_game_seat_row)

In [None]:
# printing out the details of the new data frame
# (info() prints its summary directly and returns None, so no display() is needed)
account_game_seat_df.info()


<h3 style="color: #8b5e3c">Converting to a <code>.csv</code> File</h3>
Now that the merge is complete and validated, we write each merged DataFrame to a `.csv` file so that it can be used for data visualization.
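
The `index` argument of `to_csv` decides whether the row index is written as an extra first column. A small sketch on toy data, writing to in-memory buffers instead of files:

```python
import io
import pandas as pd

toy_df = pd.DataFrame({"AccountNumber": [1, 2], "Tenure": [5, 2]})

# index=False (the boolean) leaves the row index out of the file
no_index = io.StringIO()
toy_df.to_csv(no_index, index=False)

# the default index=True writes the row index as an unnamed first column
with_index = io.StringIO()
toy_df.to_csv(with_index)

print(no_index.getvalue().splitlines()[0])
print(with_index.getvalue().splitlines()[0])

# reading the second output back produces an "Unnamed: 0" column
round_trip = pd.read_csv(io.StringIO(with_index.getvalue()))
print(list(round_trip.columns))
```

Note that the argument should be the boolean `False`. A non-empty string such as `'False'` is truthy in Python, so it is likely to be treated like `True`.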

In [None]:
# converting data frame to a .csv file
# index=False (a boolean, not the string 'False') keeps the row index out of the file
account_game_seat_df.to_csv('C:/Users/galvanm/python/BucksHackathon25/BucksDatasets/ALGLSL.csv', index=False)
account_game_seat_df_2024.to_csv('C:/Users/galvanm/python/BucksHackathon25/BucksDatasets/ALGLSL_2024.csv', index=False)
account_game_seat_df_2023.to_csv('C:/Users/galvanm/python/BucksHackathon25/BucksDatasets/ALGLSL_2023.csv', index=False)

