# Set List Table

## Introduction

The purpose of this notebook is to process and upload set list data from MTGJSON into the PostgreSQL database mtg_db. This is done through the following steps:
- Download the JSON file from MTGJSON's file server
- Check the version and date of the JSON file
- Pre-process the dictionary and convert it into dataframes
- Push the resulting dataframes to the database "raw_data" schema

## Schemas

### Input table - Set List

| Column            | Renamed         | Datatype   | Description                                                                |
| ---               | ---             | ---        | ---                                                                        |
| code              | SET_CODE        | STRING     | The set code                                                               |
| name              | SET_NAME        | STRING     | Name of the set                                                            |
| baseSetSize       | BASE_SET_SIZE   | INTEGER    | The number of cards in the base set without promos or supplements          |
| mcmId             | CM_ID           | FLOAT      | Card Market set ID                                                         |
| mcmIdExtras       | CM_ID_ADD       | FLOAT      | If the set is split into two sets this is the additional Card Market ID    |
| mcmName           | CM_NAME         | STRING     | Name of the set on Card Market                                             |
| cardsphereSetId   | CS_SET_ID       | FLOAT      | ID for set in Cardsphere                                                   |
| isFoilOnly        | FOIL_FLAG       | BOOLEAN    | Flag whether the set is only available as foils                            |
| isForeignOnly     | FOREIGN_FLAG    | BOOLEAN    | Flag whether the set is only available outside the US                      |
| keyruneCode       | KEYRUNE_CODE    | STRING     | ID for the keyrune database of set icons                                   |
| languages         | LANGUAGES       | LIST       | List of languages the set was printed in                                   |
| mtgoCode          | MTGO_SET_CODE   | STRING     | Set code on Magic The Gathering Online                                     |
| isNonFoilOnly     | NON_FOIL_FLAG   | BOOLEAN    | Flag whether the set is only available as non-foils                        |
| isOnlineOnly      | ONLINE_FLAG     | BOOLEAN    | Flag whether the set is only available in online formats                   |
| isPartialPreview  | PREVIEW_FLAG    | BOOLEAN    | Flag whether the set is still in preview and not complete                  |
| sealedProduct     | PRODUCT_INFO    | LIST       | Information about the purchasable sealed product                           |
| releaseDate       | RELEASE_DATE    | STRING     | Date the set was released, in format YYYY-MM-DD                            |
| block             | SET_BLOCK_NAME  | STRING     | Block the set is in, e.g. Kaladesh                                         |
| decks             | SET_DECKS       | LIST       | All decks associated with the set                                          |
| parentCode        | SET_PARENT_CODE | STRING     | Code of the parent set for set variations, e.g. promotions, guild kits etc |
| tokenSetCode      | SET_TOKEN_CODE  | STRING     | Code for the set's tokens                                                  |
| type              | SET_TYPE        | STRING     | The type of set, e.g. alchemy, commander, funny                            |
| tcgplayerGroupId  | TCGPG_ID        | INTEGER    | ID for the set on TCGplayer                                                |
| totalSetSize      | TOTAL_SET_SIZE  | INTEGER    | The number of cards in the set with promos and supplements                 |
| translations      | TRANSLATIONS    | DICTIONARY | The translated name of the set                                             |

### Sets Info table

Schema for sets_info:

| Column          | Datatype   | Description                                                                |
| ---             | ---        | ---                                                                        |
| SET_CODE        | STRING     | The set code                                                               |
| SET_NAME        | STRING     | Name of the set                                                            |
| RELEASE_DATE    | STRING     | Date the set was released, in format YYYY-MM-DD                            |
| SET_TYPE        | STRING     | The type of set, e.g. alchemy, commander, funny                            |
| SET_BLOCK_NAME  | STRING     | Block the set is in, e.g. Kaladesh                                         |
| SET_PARENT_CODE | STRING     | Code of the parent set for set variations, e.g. promotions, guild kits etc |
| SET_TOKEN_CODE  | STRING     | Code for the set's tokens                                                  |
| BASE_SET_SIZE   | INTEGER    | The number of cards in the base set without promos or supplements          |
| TOTAL_SET_SIZE  | INTEGER    | The number of cards in the set with promos and supplements                 |
| DECK_COUNT      | INTEGER    | The number of decks released with the set                                  |
| FOIL_FLAG       | BOOLEAN    | Flag whether the set is only available as foils                            |
| NON_FOIL_FLAG   | BOOLEAN    | Flag whether the set is only available as non-foils                        |
| FOREIGN_FLAG    | BOOLEAN    | Flag whether the set is only available outside the US                      |
| ONLINE_FLAG     | BOOLEAN    | Flag whether the set is only available in online formats                   |
| PREVIEW_FLAG    | BOOLEAN    | Flag whether the set is still in preview and not complete                  |
| CM_ID           | INTEGER    | Card Market set ID                                                         |
| CM_ID_ADD       | INTEGER    | If the set is split into two sets this is the additional Card Market ID    |
| CM_NAME         | STRING     | Name of the set on Card Market                                             |
| CS_SET_ID       | INTEGER    | ID for set in Cardsphere                                                   |
| KEYRUNE_CODE    | STRING     | ID for the keyrune database of set icons                                   |
| MTGO_SET_CODE   | STRING     | Set code on Magic The Gathering Online                                     |
| TCGPG_ID        | INTEGER    | ID for the set on TCGplayer                                                |

### Card Languages table

Schema for languages:

| Column              | Datatype | Description                                                     |
| ---                 | ---      | ---                                                             |
| SET_CODE            | STRING   | Set code ID                                                     |
| SET_NAME            | STRING   | Name of the set                                                 |
| ANCIENT_GREEK       | BOOLEAN  | True/False whether the set is translated to Ancient Greek       |
| ARABIC              | BOOLEAN  | True/False whether the set is translated to Arabic              |
| CHINESE_SIMPLIFIED  | BOOLEAN  | True/False whether the set is translated to simplified Chinese  |
| CHINESE_TRADITIONAL | BOOLEAN  | True/False whether the set is translated to traditional Chinese |
| ENGLISH             | BOOLEAN  | True/False whether the set is translated to English             |
| FRENCH              | BOOLEAN  | True/False whether the set is translated to French              |
| GERMAN              | BOOLEAN  | True/False whether the set is translated to German              |
| HEBREW              | BOOLEAN  | True/False whether the set is translated to Hebrew              |
| ITALIAN             | BOOLEAN  | True/False whether the set is translated to Italian             |
| JAPANESE            | BOOLEAN  | True/False whether the set is translated to Japanese            |
| KOREAN              | BOOLEAN  | True/False whether the set is translated to Korean              |
| LATIN               | BOOLEAN  | True/False whether the set is translated to Latin               |
| PHYREXIAN           | BOOLEAN  | True/False whether the set is translated to Phyrexian           |
| BRAZILIAN_PORTUGUESE | BOOLEAN | True/False whether the set is translated to Portuguese          |
| RUSSIAN             | BOOLEAN  | True/False whether the set is translated to Russian             |
| SANSKRIT            | BOOLEAN  | True/False whether the set is translated to Sanskrit            |
| SPANISH             | BOOLEAN  | True/False whether the set is translated to Spanish             |

### Set Name Translations table

Schema for translations:

| Column               | Datatype | Description                                         |
| ---                  | ---      | ---                                                 |
| SET_CODE             | STRING   | Set code ID                                         |
| SET_NAME             | STRING   | Name of the set                                     |
| BRAZILIAN_PORTUGUESE | STRING   | Portuguese set name translation if exists           |
| FRENCH               | STRING   | French set name translation if exists               |
| GERMAN               | STRING   | German set name translation if exists               |
| ITALIAN              | STRING   | Italian set name translation if exists              |
| JAPANESE             | STRING   | Japanese set name translation if exists             |
| KOREAN               | STRING   | Korean set name translation if exists               |
| RUSSIAN              | STRING   | Russian set name translation if exists              |
| SIMPLIFIED_CHINESE   | STRING   | Simplified Chinese set name translation if exists   |
| SPANISH              | STRING   | Spanish set name translation if exists              |
| TRADITIONAL_CHINESE  | STRING   | Traditional Chinese set name translation if exists  |

### Set Decks

#### Set Decks Info table

Schema for set_decks_info:

| Column                    | Datatype  | Description                                                   |
| ---                       | ---       | ---                                                           |
| SET_CODE                  | STRING    | The set code                                                  |
| SET_NAME                  | STRING    | Name of the set                                               |
| DECK_NAME                 | STRING    | Name of the deck                                              |
| RELEASE_DATE              | DATE      | Date the deck was released                                    |
| DECK_TYPE                 | STRING    | The type of deck, e.g. Commander, Jumpstart etc               |
| COMMANDER                 | STRING    | If a Commander deck then the name of the Commander            |
| PARTNER                   | STRING    | If there are two Commanders then name of the 2nd Commander    |
| DECK_CARDS_COUNT          | INTEGER   | Number of cards in the deck                                   |
| SIDE_BOARD_CARDS_COUNT    | INTEGER   | Number of cards in the sideboard                              |
| DISPLAY_COMMANDER_COUNT   | INTEGER   | The number of display commanders associated with the deck     |
| PLANES_COUNT              | INTEGER   | The number of planes associated with a Planechase deck        |
| SCHEMES_COUNT             | INTEGER   | The number of schemes associated with an Archenemy deck       |
| SEALED_PRODUCT_IDS        | STRING    | ID of the deck sealed products                                |

#### Display Commanders table

Schema for display_commanders:

| Column            | Datatype  | Description                                               |
| ---               | ---       | ---                                                       |
| SET_CODE          | STRING    | The set code                                              |
| SET_NAME          | STRING    | Name of the set                                           |
| DECK_NAME         | STRING    | Name of the deck                                          |
| DISPLAY_COMMANDER | STRING    | UUID for the card used to represent the Commander deck    |

#### Set Deck Cards table

Schema for set_decks_cards:

| Column        | Datatype  | Description                           |
| ---           | ---       | ---                                   |
| SET_CODE      | STRING    | The set code                          |
| SET_NAME      | STRING    | Name of the set                       |
| DECK_NAME     | STRING    | Name of the deck                      |
| CARD_COUNT    | INTEGER   | Number of specific card in the deck   |
| CARD          | STRING    | UUID of the card in the deck          |

#### Set Deck Side Board Cards table

Schema for set_decks_side_board:

| Column            | Datatype  | Description                           |
| ---               | ---       | ---                                   |
| SET_CODE          | STRING    | The set code                          |
| SET_NAME          | STRING    | Name of the set                       |
| DECK_NAME         | STRING    | Name of the deck                      |
| CARD_COUNT        | INTEGER   | Number of specific card in the deck   |
| SIDE_BOARD_CARD   | STRING    | UUID of the card in the side board    |

#### Set Deck Planes table

Schema for set_decks_planes:

| Column        | Datatype  | Description               |
| ---           | ---       | ---                       |
| SET_CODE      | STRING    | The set code              |
| SET_NAME      | STRING    | Name of the set           |
| DECK_NAME     | STRING    | Name of the deck          |
| PLANE_COUNT   | INTEGER   | Number of specific plane  |
| PLANE         | STRING    | UUID of the plane         |

#### Set Deck Schemes table

Schema for set_decks_schemes:

| Column        | Datatype  | Description               |
| ---           | ---       | ---                       |
| SET_CODE      | STRING    | The set code              |
| SET_NAME      | STRING    | Name of the set           |
| DECK_NAME     | STRING    | Name of the deck          |
| SCHEME_COUNT  | INTEGER   | Number of specific scheme |
| SCHEME        | STRING    | UUID of the scheme        |

### Set Product Info table

Schema for set_product_info:

| Column                    | Datatype          | Description                                               |
| ---                       | ---               | ---                                                       |
| SET_CODE                  | STRING            | Set code ID                                               |
| SET_NAME                  | STRING            | Name of the set                                           |
| PRODUCT_NAME              | STRING            | Name of the product                                       |
| CATEGORY                  | STRING            | The category of the physical product package              |
| SUBTYPE                   | STRING            | The type of the product, e.g. prerelease_kit, collector   |
| PRODUCT_RELEASE_DATE      | DATE              | Date the product was released                             |
| PRODUCT_CARD_COUNT        | INTEGER           | Count of the cards in the product                         |
| PRODUCT_UUID              | STRING            | UUID of the product                                       |
| PRODUCT_LANGUAGE          | STRING            | Languages the product has been released in                |
| CONTENTS_NAME             | STRING            | Names of the individual contents of the product           |
| CONTENTS_TYPE             | STRING            | Type of individual content, e.g. sealed, pack, deck       |
| CONTENTS_CODE             | STRING            | Categorisation of the contents, e.g. prerelease-selesnya  |
| CONTENTS_COUNT            | INTEGER           | Count of the individual contents                          |
| CONTENTS_UUID             | STRING            | UUID of the individual contents                           |
| CONTENTS_CARD_NUMBER      | INTEGER/STRING    | Numerical ID associated with content, e.g. 4, 373, 240★  |
| PURCHASE_URL_CARD_KINGDOM | URL               | Card Kingdom URL for purchasing product                   |
| PURCHASE_URL_TCG_PLAYER   | URL               | TCG Player URL for purchasing product                     |
| ABU_GAMES_ID              | INTEGER           | Unique ID for Abu Games database                          |
| CARD_KINGDOM_ID           | INTEGER           | Unique ID for Card Kingdom database                       |
| CARD_MARKET_ID            | INTEGER           | Unique ID for Card Market database                        |
| CARD_TRADER_ID            | INTEGER           | Unique ID for Card Trader database                        |
| COOL_STUFF_INC_ID         | INTEGER           | Unique ID for Cool Stuff Inc database                     |
| MINIATURE_MARKET_ID       | INTEGER           | Unique ID for Miniature Market database                   |
| MVP_GAMES_ID              | STRING            | Unique ID for MVP Games database                          |
| STAR_CITY_GAMES_ID        | STRING            | Unique ID for Star City Games database                    |
| TCG_PLAYER_ID             | INTEGER           | Unique ID for TCG Player database                         |
| TOAD_AND_TROLL_ID         | INTEGER           | Unique ID for Toad and Troll database                     |

## Python Libraries

In [596]:
import json
import requests
import lzma
from   tqdm                           import tqdm
import numpy                          as     np
import pandas                         as     pd
from   sqlalchemy                     import create_engine, text, inspect

## Modular functions
# Setting the root path for finding the modules directory
import sys, os
sys.path.append(os.path.abspath(".."))
# Loading Modular functions
from   modules.utils_set_list import extract_purchase_urls
from   modules.data_recency   import data_recency_check, recency_check_upload

# Clean-Up
del sys, os

In [597]:
# Show all columns instead of truncating with "..."
pd.set_option("display.max_columns", None)

# (Optional) also show all rows
pd.set_option("display.max_rows", None)

# (Optional) widen the display area so columns don’t wrap badly
pd.set_option("display.width", None)

## Input

### Database Connection

In [598]:
## Setting up credentials for accessing postgresql "mtg_db" database

# Credentials for setting up connection to postgresql
user     = "postgres"
password = "as:123bpostgresql"
host     = "localhost"
port     = "5432"
database = "mtg_db"

# Engine connection to postgresql
engine = create_engine(f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}")

# Clean-Up
del user, password, host, port, database, create_engine
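
The connection string above embeds the password directly in the notebook. As a hedged sketch (not this notebook's own method), the credentials could instead be read from an environment variable and percent-encoded, since characters such as `@` or `/` in a password need URL-encoding before SQLAlchemy can parse the URL reliably. The helper name `build_postgres_url` and the variable name `MTG_DB_PASSWORD` are assumptions made for illustration.

```python
import os
from urllib.parse import quote_plus


def build_postgres_url(user, password, host="localhost", port=5432, database="mtg_db"):
    """Return a SQLAlchemy-style URL with user and password percent-encoded."""
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# MTG_DB_PASSWORD is a hypothetical environment variable name;
# the fallback value is a made-up password used only for this demo.
password = os.environ.get("MTG_DB_PASSWORD", "p@ss/word")
url = build_postgres_url("postgres", password)
```

The resulting string can be passed to `create_engine` in place of the f-string used in the cell.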

In [599]:
## Creating the empty data_recency table if not exists
query = """
        CREATE TABLE IF NOT EXISTS raw_data.data_recency (
         json_type      TEXT PRIMARY KEY
        ,latest_date    DATE
        ,latest_version TEXT);
        """
with engine.begin() as conn:
    conn.execute(text(query))

# Clean-Up
del query, conn, text
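
Because `json_type` is the primary key, the table holds at most one row per MTGJSON file type. How `recency_check_upload` writes to it is not shown in this notebook; one plausible pattern is an upsert. The sketch uses an in-memory SQLite database as a stand-in for PostgreSQL, so the `raw_data` schema prefix is dropped. The dates and version strings are invented sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS data_recency (
         json_type      TEXT PRIMARY KEY
        ,latest_date    DATE
        ,latest_version TEXT);
    """
)

# Insert a new row, or overwrite the existing row for the same json_type
UPSERT = """
    INSERT INTO data_recency (json_type, latest_date, latest_version)
    VALUES (?, ?, ?)
    ON CONFLICT (json_type) DO UPDATE
    SET latest_date    = excluded.latest_date,
        latest_version = excluded.latest_version;
"""

conn.execute(UPSERT, ("set list", "2024-01-01", "5.2.2+20240101"))
conn.execute(UPSERT, ("set list", "2024-02-01", "5.2.2+20240201"))

rows = conn.execute("SELECT * FROM data_recency").fetchall()
```

After both statements the table still contains a single `set list` row, carrying the later date and version.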

### Input Data

In [600]:
# URL for the MTGJSON set list file (SetList.json.xz)
url = "https://mtgjson.com/api/v5/SetList.json.xz"

# Request the compressed file; stream=True defers the body so it can be read in chunks
response = requests.get(url, stream=True)
response.raise_for_status()

# Prepare to track total size and read in chunks
total_size = int(response.headers.get('content-length', 0))  # total bytes, 0 if the header is missing
chunk_size = 1024 * 1024  # 1 MB per chunk
compressed_data = bytearray()  # store the downloaded bytes

# Iterate over response chunks, updating progress bar
with tqdm(total=total_size, unit='B', unit_scale=True, desc="Downloading") as pbar:
    for chunk in response.iter_content(chunk_size=chunk_size):
        if chunk:  # filter out keep-alive chunks
            compressed_data.extend(chunk)
            pbar.update(len(chunk))

# Decompress the .xz file
decompressed_bytes = lzma.decompress(compressed_data)

# Parse JSON into a dictionary
dict__set_list = json.loads(decompressed_bytes)

# Clean-Up
del url, response, total_size, chunk_size, compressed_data, decompressed_bytes
del pbar, chunk, tqdm, json, lzma, requests

Downloading: 100%|██████████| 1.46M/1.46M [00:00<00:00, 8.32GB/s]
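
The decompress-and-parse step can be tried without a network connection. The sketch below compresses a small invented payload with `lzma` and restores it the same way the cell does; the helper name `load_xz_json` and the sample payload are assumptions for illustration, not MTGJSON content.

```python
import json
import lzma


def load_xz_json(compressed: bytes) -> dict:
    """Decompress .xz bytes and parse the result as JSON."""
    return json.loads(lzma.decompress(compressed))


# Made-up payload with the same top-level keys as an MTGJSON file
sample = {
    "meta": {"date": "2024-01-01", "version": "0.0.0"},
    "data": [{"code": "AAA", "name": "Sample Set"}],
}

# lzma.compress writes the .xz container format by default
compressed = lzma.compress(json.dumps(sample).encode("utf-8"))
restored = load_xz_json(compressed)
```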


### Column Lists & Dictionaries

In [601]:
# Dictionary for renaming the original set list dataframe
columns__rename_set_list = {'baseSetSize'                      : 'BASE_SET_SIZE'
                           ,'code'                             : 'SET_CODE'
                           ,'isFoilOnly'                       : 'FOIL_FLAG'
                           ,'isOnlineOnly'                     : 'ONLINE_FLAG'
                           ,'keyruneCode'                      : 'KEYRUNE_CODE'
                           ,'languages'                        : 'LANGUAGES'
                           ,'name'                             : 'SET_NAME'
                           ,'releaseDate'                      : 'RELEASE_DATE'
                           ,'sealedProduct'                    : 'PRODUCT_INFO'
                           ,'tcgplayerGroupId'                 : 'TCGPG_ID'
                           ,'totalSetSize'                     : 'TOTAL_SET_SIZE'
                           ,'type'                             : 'SET_TYPE'
                           ,'block'                            : 'SET_BLOCK_NAME'
                           ,'isNonFoilOnly'                    : 'NON_FOIL_FLAG'
                           ,'parentCode'                       : 'SET_PARENT_CODE'
                           ,'mcmId'                            : 'CM_ID'
                           ,'mcmName'                          : 'CM_NAME'
                           ,'tokenSetCode'                     : 'SET_TOKEN_CODE'
                           ,'translations.Chinese Simplified'  : 'TRANSLATION_SIMPLIFIED_CHINESE'
                           ,'translations.Chinese Traditional' : 'TRANSLATION_TRADITIONAL_CHINESE'
                           ,'translations.French'              : 'TRANSLATION_FRENCH'
                           ,'translations.German'              : 'TRANSLATION_GERMAN'
                           ,'translations.Italian'             : 'TRANSLATION_ITALIAN'
                           ,'translations.Japanese'            : 'TRANSLATION_JAPANESE'
                           ,'translations.Korean'              : 'TRANSLATION_KOREAN'
                           ,'translations.Portuguese (Brazil)' : 'TRANSLATION_BRAZILIAN_PORTUGUESE'
                           ,'translations.Russian'             : 'TRANSLATION_RUSSIAN'
                           ,'translations.Spanish'             : 'TRANSLATION_SPANISH'
                           ,'cardsphereSetId'                  : 'CS_SET_ID'
                           ,'decks'                            : 'SET_DECKS'
                           ,'mcmIdExtras'                      : 'CM_ID_ADD'
                           ,'mtgoCode'                         : 'MTGO_SET_CODE'
                           ,'isPartialPreview'                 : 'PREVIEW_FLAG'
                           ,'isForeignOnly'                    : 'FOREIGN_FLAG'}
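
The dotted keys such as `translations.French` exist because `pd.json_normalize` flattens nested dictionaries into `parent.child` column names. A small sketch on invented records shows that flattening, the rename, and a check for source columns the dictionary does not cover, which may help if MTGJSON adds fields. The record contents, the shortened `rename_map`, and the `newField` key are made up for illustration.

```python
import pandas as pd

# Invented record with one nested dictionary and one unexpected key
records = [
    {
        "code": "AAA",
        "name": "Sample Set",
        "translations": {"French": "Exemple"},
        "newField": 1,
    }
]

rename_map = {
    "code": "SET_CODE",
    "name": "SET_NAME",
    "translations.French": "TRANSLATION_FRENCH",
}

# Nested keys become 'translations.French' style column names
df = pd.json_normalize(records)

# Source columns that the rename dictionary does not mention
unmapped = sorted(set(df.columns) - set(rename_map))

df = df.rename(columns=rename_map)
```

Columns missing from the dictionary keep their original names after `rename`, so listing them first makes silent schema drift easier to spot.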

In [602]:
# Columns for the main set table
columns__sets_info = ['SET_CODE'
                     ,'SET_NAME'
                     ,'RELEASE_DATE'
                     ,'SET_TYPE'
                     ,'SET_BLOCK_NAME'
                     ,'SET_PARENT_CODE'
                     ,'SET_TOKEN_CODE'
                     ,'BASE_SET_SIZE'
                     ,'TOTAL_SET_SIZE'
                     ,'DECK_COUNT'
                     ,'FOIL_FLAG'
                     ,'NON_FOIL_FLAG'
                     ,'FOREIGN_FLAG'
                     ,'ONLINE_FLAG'
                     ,'PREVIEW_FLAG'
                     ,'CM_ID'
                     ,'CM_ID_ADD'
                     ,'CM_NAME'
                     ,'CS_SET_ID'
                     ,'KEYRUNE_CODE'
                     ,'MTGO_SET_CODE'
                     ,'TCGPG_ID']

In [603]:
# Dictionary for renaming the deck table columns
columns__rename_set_decks = {'code'               : 'SET_CODE'
                            ,'commander'          : 'COMMANDER'
                            ,'displayCommander'   : 'DISPLAY_COMMANDER'
                            ,'mainBoard'          : 'DECK_CARDS'
                            ,'name'               : 'DECK_NAME'
                            ,'planes'             : 'PLANES'
                            ,'releaseDate'        : 'RELEASE_DATE'
                            ,'schemes'            : 'SCHEMES'
                            ,'sealedProductUuids' : 'SEALED_PRODUCT_IDS'
                            ,'sideBoard'          : 'SIDE_BOARD_CARDS'
                            ,'type'               : 'DECK_TYPE'}

In [604]:
# Columns included in the master table for set decks
columns__set_deck_info = ['SET_CODE'
                         ,'SET_NAME'
                         ,'DECK_NAME'
                         ,'RELEASE_DATE'
                         ,'DECK_TYPE'
                         ,'COMMANDER'
                         ,'SEALED_PRODUCT_IDS']

In [605]:
## Relational table list
columns__relational_columns = ['DECK_CARDS'
                              ,'SIDE_BOARD_CARDS'
                              ,'DISPLAY_COMMANDER'
                              ,'PLANES'
                              ,'SCHEMES']

In [606]:
# Columns to be in the display commanders lookup table
columns__display_commanders = ['SET_CODE'
                              ,'SET_NAME'
                              ,'DECK_NAME'
                              ,'DISPLAY_COMMANDER']

In [607]:
# Columns for building the deck cards
columns__set_decks_cards = ['SET_CODE'
                           ,'SET_NAME'
                           ,'DECK_NAME'
                           ,'DECK_CARDS']

In [608]:
# Columns for building the side deck dataframe
columns__set_decks_side_board = ['SET_CODE'
                                ,'SET_NAME'
                                ,'DECK_NAME'
                                ,'SIDE_BOARD_CARDS']

In [609]:
# Columns for building the planes dataframe
columns__set_decks_planes = ['SET_CODE'
                            ,'SET_NAME'
                            ,'DECK_NAME'
                            ,'PLANES']

In [610]:
# Columns for building the schemes table
columns__set_decks_schemes = ['SET_CODE'
                             ,'SET_NAME'
                             ,'DECK_NAME'
                             ,'SCHEMES']

In [611]:
# Product info table column renaming dictionary
columns__rename_product_info = {'category'           : 'CATEGORY'
                               ,'identifiers'        : 'IDENTIFIERS'
                               ,'name'               : 'PRODUCT_NAME'
                               ,'purchaseUrls'       : 'PURCHASE_URL'
                               ,'subtype'            : 'SUBTYPE'
                               ,'uuid'               : 'PRODUCT_UUID'
                               ,'cardCount'          : 'PRODUCT_CARD_COUNT'
                               ,'releaseDate'        : 'PRODUCT_RELEASE_DATE'
                               ,'language'           : 'PRODUCT_LANGUAGE'
                               ,'abuId'              : 'ABU_GAMES_ID'
                               ,'cardtraderId'       : 'CARD_TRADER_ID'
                               ,'mcmId'              : 'CARD_MARKET_ID'
                               ,'tcgplayerProductId' : 'TCG_PLAYER_ID'
                               ,'tntId'              : 'TOAD_AND_TROLL_ID'
                               ,'cardKingdomId'      : 'CARD_KINGDOM_ID'
                               ,'scgId'              : 'STAR_CITY_GAMES_ID'
                               ,'csiId'              : 'COOL_STUFF_INC_ID'
                               ,'miniaturemarketId'  : 'MINIATURE_MARKET_ID'
                               ,'mvpId'              : 'MVP_GAMES_ID'
                               ,'contents_count'     : 'CONTENTS_COUNT'
                               ,'contents_name'      : 'CONTENTS_NAME'
                               ,'contents_uuid'      : 'CONTENTS_UUID'
                               ,'contents_code'      : 'CONTENTS_CODE'
                               ,'contents_number'    : 'CONTENTS_CARD_NUMBER'} 

In [612]:
# List of columns for reorganising the product info table
columns__set_product_info = ['SET_CODE'
                            ,'SET_NAME'
                            ,'PRODUCT_NAME'
                            ,'CATEGORY'
                            ,'SUBTYPE'
                            ,'PRODUCT_RELEASE_DATE'
                            ,'PRODUCT_CARD_COUNT'
                            ,'PRODUCT_UUID'
                            ,'PRODUCT_LANGUAGE'
                            ,'CONTENTS_NAME'
                            ,'CONTENTS_TYPE'
                            ,'CONTENTS_CODE'
                            ,'CONTENTS_COUNT'
                            ,'CONTENTS_UUID'
                            ,'CONTENTS_CARD_NUMBER'
                            ,'PURCHASE_URL_CARD_KINGDOM'
                            ,'PURCHASE_URL_TCG_PLAYER'
                            ,'ABU_GAMES_ID'
                            ,'CARD_KINGDOM_ID'
                            ,'CARD_MARKET_ID'
                            ,'CARD_TRADER_ID'
                            ,'COOL_STUFF_INC_ID'
                            ,'MINIATURE_MARKET_ID'
                            ,'MVP_GAMES_ID'
                            ,'STAR_CITY_GAMES_ID'
                            ,'TCG_PLAYER_ID'
                            ,'TOAD_AND_TROLL_ID']

## Pre-processing

In [613]:
# Checking the latest version of the input data
df__data_recency = data_recency_check(dict__set_list, 'set list')

# Clean-Up
del data_recency_check
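
MTGJSON files carry a top-level `meta` object with `date` and `version` fields next to `data`. The implementation of `data_recency_check` lives in `modules.data_recency` and is not shown here; as an assumption about what such a check needs, the following sketch pulls those two fields into a record shaped like the `data_recency` table. The function name `read_meta` and the sample payload are hypothetical.

```python
from datetime import date


def read_meta(payload: dict, json_type: str) -> dict:
    """Build a data_recency-style record from a payload's 'meta' block."""
    meta = payload["meta"]
    return {
        "json_type": json_type,
        "latest_date": date.fromisoformat(meta["date"]),
        "latest_version": meta["version"],
    }


# Invented payload; real files contain the full set list under 'data'
payload = {"meta": {"date": "2024-01-01", "version": "5.2.2+20240101"}, "data": []}
record = read_meta(payload, "set list")
```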

In [614]:
## Creating the main dataframe
# Converting the dictionary to a flattened dataframe
df__set_list = pd.json_normalize(dict__set_list['data'])

# Renaming the columns
df__set_list = df__set_list.rename(columns = columns__rename_set_list)

# Sorting the set list by release date and set code
df__set_list = df__set_list.sort_values(by = ['RELEASE_DATE', 'SET_CODE']).reset_index(drop = True)

# Reordering the columns alphabetically with the set name and code first
first_cols = ["SET_CODE", "SET_NAME"]
other_cols = sorted([c for c in df__set_list.columns if c not in first_cols])
df__set_list = df__set_list[first_cols + other_cols]

# Counting the number of decks per set
df__set_list['DECK_COUNT'] = df__set_list['SET_DECKS'].apply(lambda x: len(x) if isinstance(x, list) else 0)

# Ensuring the empty values are consistent
df__set_list = df__set_list.where(pd.notnull(df__set_list), np.nan)

# Clean-Up
del columns__rename_set_list, dict__set_list, first_cols, other_cols

## Main Code

### Set Info Table

In [615]:
# Making a copy of the input dataframe
df__sets_info = df__set_list[columns__sets_info].copy()

# Converting the flag columns to booleans
for col in ['NON_FOIL_FLAG', 'PREVIEW_FLAG', 'FOREIGN_FLAG']:
    df__sets_info[col] = df__sets_info[col].where(df__sets_info[col].notna(), False).astype(bool)

# Converting ID columns to integers
for col in ['CM_ID', 'CM_ID_ADD', 'CS_SET_ID', 'TCGPG_ID']:
    df__sets_info[col] = df__sets_info[col].astype('Int64')

# Clean-Up
del columns__sets_info
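
The ID columns arrive as floats because a column with missing values cannot be held in a plain NumPy integer dtype. Casting to pandas' nullable `Int64` (capital I) keeps the gaps as `<NA>` while storing whole numbers. A minimal sketch with invented IDs:

```python
import numpy as np
import pandas as pd

# Float column with a missing value, as produced when some sets have no ID
ids = pd.Series([1234.0, np.nan, 56.0], name="CM_ID")

# Nullable integer dtype: whole numbers plus <NA> for the gap
as_int = ids.astype("Int64")
```

The cast raises an error if any value has a fractional part, which also acts as a cheap sanity check on the source data.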

### Set Name Translations

In [616]:
## Extracting the set name translations into a separate dataframe

# Creating new dataframe for the set name translations
columns__translations = [column for column in df__set_list.filter(like="TRANSLATION").columns]
df__translations      = df__set_list[['SET_CODE'] + ['SET_NAME'] + columns__translations].copy()

# Renaming the columns for the translation columns
df__translations.columns = df__translations.columns.str.replace("TRANSLATION_", "", regex=False)

# Clean-up
del columns__translations

### Set Languages

In [617]:
## Extracting the set language releases into a separate dataframe

# Making a copy of the columns into a new dataframe
df__languages = df__set_list[['SET_CODE','SET_NAME','RELEASE_DATE','LANGUAGES']].copy()
df__languages['LANGUAGES'] = df__languages['LANGUAGES'].apply(lambda x: x if isinstance(x, list) and x else ['__NONE__'])

# Converting the languages column into a dataframe
df__languages = df__languages.explode('LANGUAGES')
df__languages = (df__languages.assign(value=True).pivot_table(index      = ['SET_CODE','SET_NAME','RELEASE_DATE']
                                                             ,columns    = 'LANGUAGES'
                                                             ,values     = 'value'
                                                             ,fill_value = False).astype(bool).reset_index())

# Drop the placeholder column (ignored if every set lists at least one language)
df__languages = df__languages.drop(columns = ['__NONE__'], errors = 'ignore')

# Fixing the column names
df__languages.columns = df__languages.columns.str.upper()
df__languages.columns = df__languages.columns.str.replace(' ','_')
df__languages = df__languages.rename(columns = {'PORTUGUESE_(BRAZIL)' : 'BRAZILIAN_PORTUGUESE'})
df__languages.columns.name = None

# Sorting the dataframe by release date, then dropping the release date column
df__languages = df__languages.sort_values(by = 'RELEASE_DATE').drop(columns = ['RELEASE_DATE']).reset_index(drop = True)
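
The explode-and-pivot pattern used for the languages can be followed on a toy frame. This is a sketch with invented set codes; it only aims to show why the `__NONE__` placeholder is needed: a set with an empty language list would otherwise disappear from the pivot.

```python
import pandas as pd

# Invented sets; CCC has no languages listed
df_demo = pd.DataFrame({
    "SET_CODE": ["AAA", "BBB", "CCC"],
    "LANGUAGES": [["English", "French"], ["English"], []],
})

# The placeholder keeps sets with an empty list alive through explode/pivot
df_demo["LANGUAGES"] = df_demo["LANGUAGES"].apply(
    lambda x: x if isinstance(x, list) and x else ["__NONE__"])

wide = (df_demo.explode("LANGUAGES")
               .assign(value=True)
               .pivot_table(index="SET_CODE", columns="LANGUAGES",
                            values="value", fill_value=False)
               .astype(bool)
               .reset_index())

# Remove the placeholder column and the leftover columns-axis name
wide = wide.drop(columns=["__NONE__"], errors="ignore")
wide.columns.name = None

print(wide)
```

Set `CCC` stays in the result with every language flag set to `False`.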


### Set Decks

In [618]:
# Copying the ID and deck data from the input set table
df__set_decks = df__set_list[['SET_CODE','SET_NAME','SET_DECKS']].copy()

# Reordering the columns
df__set_decks = df__set_decks[['SET_CODE'
                              ,'SET_NAME'
                              ,'SET_DECKS']]

# Replacing the NaN rows in the deck column with empty lists so pd.explode works
df__set_decks['SET_DECKS'] = df__set_decks['SET_DECKS'].apply(lambda x: x if isinstance(x, list) else [])

# Exploding the deck lists into individual rows of dictionaries
df__set_decks = df__set_decks.explode('SET_DECKS', ignore_index=True)

# Expand the deck dictionary into separate columns
df__set_decks = pd.concat([df__set_decks.drop(columns='SET_DECKS')
                          ,pd.json_normalize(df__set_decks['SET_DECKS'])]
                         ,axis = 1)

# Dropping duplicate column
df__set_decks = df__set_decks.drop(columns = ['code'])

# Renaming the new columns
df__set_decks = df__set_decks.rename(columns__rename_set_decks
                                    ,axis = 1)

# Replacing the empty lists with NaN
df__set_decks = df__set_decks.map(lambda x: np.nan if isinstance(x, list) and len(x) == 0 else x)

# Counting the number of display commanders
df__set_decks['DISPLAY_COMMANDER_COUNT'] = df__set_decks['DISPLAY_COMMANDER'].apply(lambda x: len(x) if isinstance(x, list) else 0)

# Ensuring the empty values are consistent
df__set_decks = df__set_decks.where(pd.notnull(df__set_decks), np.nan)

# Clean-Up
del columns__rename_set_decks
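
The explode-then-expand step above is sketched below on invented data (the set codes, deck names and types are placeholders). One list of deck dictionaries per set becomes one row per deck, and the dictionary keys become columns. The sketch replaces missing entries with empty dictionaries before normalising, which is a defensive assumption so it behaves the same across pandas versions.

```python
import numpy as np
import pandas as pd

# Invented input: AAA has two decks, BBB has none
df_demo = pd.DataFrame({
    "SET_CODE": ["AAA", "BBB"],
    "SET_DECKS": [[{"name": "Deck 1", "type": "Box Set"},
                   {"name": "Deck 2", "type": "Theme Deck"}],
                  np.nan],
})

# NaN -> [] so explode yields one (empty) row for sets without decks
df_demo["SET_DECKS"] = df_demo["SET_DECKS"].apply(lambda x: x if isinstance(x, list) else [])
df_demo = df_demo.explode("SET_DECKS", ignore_index=True)

# Rows without a deck become {} so every element is a dictionary
deck_dicts = df_demo["SET_DECKS"].apply(lambda x: x if isinstance(x, dict) else {})
expanded = pd.json_normalize(deck_dicts.tolist())

result = pd.concat([df_demo.drop(columns="SET_DECKS"), expanded], axis=1)
print(result)
```

Because `ignore_index=True` resets the index after the explode, the two frames line up row by row in the `pd.concat` call.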

#### Set Decks Info

In [619]:
# Copying the decks source table and keeping key columns
df__set_decks_info = df__set_decks[columns__set_deck_info + columns__relational_columns].copy()

# Looping through the relational tables for total counts
for relational_col in columns__relational_columns:
    # Replacing the NaN rows in the column with empty lists so the total sizes can be counted
    df__set_decks_info[relational_col] = df__set_decks_info[relational_col].apply(lambda x: x if isinstance(x, list) else [])
    # Counting the totals
    df__set_decks_info[f'{relational_col}_COUNT'] = df__set_decks_info[relational_col].apply(lambda cards: sum(d.get('count', 0) for d in cards))

# Only keep rows where COMMANDER is not null
df__commanders = df__set_decks_info.loc[df__set_decks_info['COMMANDER'].notna(), 'COMMANDER']

# Separate the main commander and partner commander into separate columns
commander_uuids = pd.DataFrame(df__commanders.apply(lambda x: [c['uuid'] for c in x] + [None]*(2-len(x))).tolist()
                              ,columns = ['COMMANDER_1', 'COMMANDER_2']
                              ,index   = df__commanders.index)

# Merge commanders back into the main dataframe
df__set_decks_info[['COMMANDER', 'PARTNER']] = commander_uuids

# Create a flag whether there is a commander partner
df__set_decks_info['PARTNER_FLAG'] = df__set_decks_info['PARTNER'].notna()

# Keeping the first sealed product ID from each list
df__set_decks_info['SEALED_PRODUCT_IDS'] = df__set_decks_info['SEALED_PRODUCT_IDS'].apply(lambda x: x[0] if isinstance(x, list) else x)

# Reorganise the column order
df__set_decks_info = df__set_decks_info[columns__set_deck_info[:-1]
                                     + ['PARTNER']
                                     + [col + '_COUNT' for col in columns__relational_columns]
                                     + ['SEALED_PRODUCT_IDS']]

# Clean-up
del columns__set_deck_info, columns__relational_columns, df__commanders, commander_uuids, relational_col
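
The commander split can be sketched in isolation. The UUIDs below are placeholders and `pad_uuids` is a hypothetical helper, not part of the notebook. It pads each list to a fixed width of two and, as an extra assumption, truncates longer lists so the two-column frame can always be built.

```python
import pandas as pd

# Placeholder commander lists, indexed like the non-null rows of the deck table
commanders = pd.Series({0: [{"uuid": "u1"}],
                        2: [{"uuid": "u2"}, {"uuid": "u3"}]})

def pad_uuids(cards, width=2):
    """Return exactly `width` uuids, padding with None (hypothetical helper)."""
    uuids = [card["uuid"] for card in cards][:width]
    return uuids + [None] * (width - len(uuids))

split = pd.DataFrame(commanders.apply(pad_uuids).tolist(),
                     columns=["COMMANDER", "PARTNER"],
                     index=commanders.index)

print(split)
```

Keeping the original index on `split` is what lets the assignment back onto the deck table align on the right rows; decks without a commander simply stay empty.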

#### Display Commanders

In [620]:
# Copying the display commanders from the set decks
df__display_commanders = df__set_decks[columns__display_commanders].copy()

# Replacing NaN values with empty lists for pd.explode
df__display_commanders['DISPLAY_COMMANDER'] = df__display_commanders['DISPLAY_COMMANDER'].apply(lambda x: x if isinstance(x, list) else [])

# Exploding the display commander lists into individual rows of dictionaries
df__display_commanders = df__display_commanders.explode('DISPLAY_COMMANDER', ignore_index=True)

# Extracting the uuid from the display commander dictionary
df__display_commanders['DISPLAY_COMMANDER'] = df__display_commanders['DISPLAY_COMMANDER'].apply(lambda x: x.get('uuid') if isinstance(x, dict) else np.nan)

# Clean-up
del columns__display_commanders

#### Set Decks Cards

In [621]:
# Copying the decks source table and keeping key columns
df__set_decks_cards = df__set_decks[columns__set_decks_cards].copy()

# Replacing NaN values with empty lists for pd.explode
df__set_decks_cards['DECK_CARDS'] = df__set_decks_cards['DECK_CARDS'].apply(lambda x: x if isinstance(x, list) else [])

# Exploding the deck cards lists into individual rows of card dictionaries
df__set_decks_cards = df__set_decks_cards.explode('DECK_CARDS', ignore_index=True)

# Extracting the count & uuid from the card dictionary
df__set_decks_cards['CARD_COUNT'] = df__set_decks_cards['DECK_CARDS'].apply(lambda x: x.get('count') if isinstance(x, dict) else np.nan)
df__set_decks_cards['CARD']       = df__set_decks_cards['DECK_CARDS'].apply(lambda x: x.get('uuid') if isinstance(x, dict) else np.nan)

# Dropping the dictionary column
df__set_decks_cards.drop(columns = 'DECK_CARDS'
                        ,inplace = True)

# Converting the card count column to integer
df__set_decks_cards['CARD_COUNT'] = df__set_decks_cards['CARD_COUNT'].astype('Int64')

# Clean-up
del columns__set_decks_cards

#### Set Decks Side Boards

In [622]:
# Copying the decks source table and keeping key columns
df__set_decks_side_board = df__set_decks[columns__set_decks_side_board].copy()

# Replacing NaN values with empty lists for pd.explode
df__set_decks_side_board['SIDE_BOARD_CARDS'] = df__set_decks_side_board['SIDE_BOARD_CARDS'].apply(lambda x: x if isinstance(x, list) else [])

# Exploding the side board lists into individual rows of card dictionaries
df__set_decks_side_board = df__set_decks_side_board.explode('SIDE_BOARD_CARDS', ignore_index=True)

# Extracting the count & uuid from the card dictionary
df__set_decks_side_board['CARD_COUNT'] = df__set_decks_side_board['SIDE_BOARD_CARDS'].apply(lambda x: x.get('count') if isinstance(x, dict) else np.nan)
df__set_decks_side_board['SIDE_BOARD_CARD'] = df__set_decks_side_board['SIDE_BOARD_CARDS'].apply(lambda x: x.get('uuid') if isinstance(x, dict) else np.nan)

# Dropping the dictionary column
df__set_decks_side_board.drop(columns = 'SIDE_BOARD_CARDS'
                             ,inplace = True)

# Converting the card count column to integer
df__set_decks_side_board['CARD_COUNT'] = df__set_decks_side_board['CARD_COUNT'].astype('Int64')

# Clean-up
del columns__set_decks_side_board

#### Set Decks Planes

In [623]:
# Copying the decks source table and keeping key columns
df__set_decks_planes = df__set_decks[columns__set_decks_planes].copy()

# Replacing NaN values with empty lists for pd.explode
df__set_decks_planes['PLANES'] = df__set_decks_planes['PLANES'].apply(lambda x: x if isinstance(x, list) else [])

# Exploding the deck planes lists into individual rows of plane dictionaries
df__set_decks_planes = df__set_decks_planes.explode('PLANES', ignore_index=True)

# Extracting the count & uuid from the planes dictionary
df__set_decks_planes['PLANE_COUNT'] = df__set_decks_planes['PLANES'].apply(lambda x: x.get('count') if isinstance(x, dict) else np.nan)
df__set_decks_planes['PLANE']       = df__set_decks_planes['PLANES'].apply(lambda x: x.get('uuid') if isinstance(x, dict) else np.nan)

# Dropping the dictionary column
df__set_decks_planes.drop(columns = 'PLANES'
                        ,inplace = True)

# Converting the planes count column to integer
df__set_decks_planes['PLANE_COUNT'] = df__set_decks_planes['PLANE_COUNT'].astype('Int64')

# Clean-Up
del columns__set_decks_planes

#### Set Deck Schemes

In [624]:
# Copying the decks source table and keeping key columns
df__set_decks_schemes = df__set_decks[columns__set_decks_schemes].copy()

# Replacing NaN values with empty lists for pd.explode
df__set_decks_schemes['SCHEMES'] = df__set_decks_schemes['SCHEMES'].apply(lambda x: x if isinstance(x, list) else [])

# Exploding the deck schemes lists into individual rows of scheme dictionaries
df__set_decks_schemes = df__set_decks_schemes.explode('SCHEMES', ignore_index=True)

# Extracting the count & uuid from the schemes dictionary
df__set_decks_schemes['SCHEME_COUNT'] = df__set_decks_schemes['SCHEMES'].apply(lambda x: x.get('count') if isinstance(x, dict) else np.nan)
df__set_decks_schemes['SCHEME']       = df__set_decks_schemes['SCHEMES'].apply(lambda x: x.get('uuid') if isinstance(x, dict) else np.nan)

# Dropping the dictionary column
df__set_decks_schemes.drop(columns = 'SCHEMES'
                          ,inplace = True)

# Converting the scheme count column to integer
df__set_decks_schemes['SCHEME_COUNT'] = df__set_decks_schemes['SCHEME_COUNT'].astype('Int64')

# Clean-Up
del columns__set_decks_schemes, df__set_decks
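
The cards, side boards, planes and schemes cells all follow the same explode-and-extract pattern. The sketch below wraps that pattern in a single function. `explode_card_list` is a hypothetical helper and the input frame is invented; it is meant as one possible way to read the four cells, not as the notebook's own implementation.

```python
import numpy as np
import pandas as pd

def explode_card_list(df, list_col, uuid_col, count_col):
    """One row per dictionary in `list_col`, keeping only its count and uuid."""
    out = df.copy()
    # NaN -> [] so that explode works on every row
    out[list_col] = out[list_col].apply(lambda x: x if isinstance(x, list) else [])
    out = out.explode(list_col, ignore_index=True)
    # Rows that had no list hold NaN after the explode, hence the guards
    out[count_col] = out[list_col].apply(
        lambda x: x.get("count") if isinstance(x, dict) else np.nan).astype("Int64")
    out[uuid_col] = out[list_col].apply(
        lambda x: x.get("uuid") if isinstance(x, dict) else np.nan)
    return out.drop(columns=list_col)

# Invented deck table: D1 has two planes, D2 has none
decks_demo = pd.DataFrame({
    "DECK_NAME": ["D1", "D2"],
    "PLANES": [[{"count": 1, "uuid": "p1"}, {"count": 2, "uuid": "p2"}], np.nan],
})

planes_demo = explode_card_list(decks_demo, "PLANES", "PLANE", "PLANE_COUNT")
print(planes_demo)
```

Decks without any entries are kept as a single row with empty count and uuid, which matches the empty rows visible in the output checks.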

### Set Product Info

In [625]:
# Making a copy of the relevant columns from the set list table
df__set_product_info = df__set_list[['SET_CODE','SET_NAME','PRODUCT_INFO']].copy()

# Replacing all NaN values in product info with empty lists for df.explode to work
df__set_product_info['PRODUCT_INFO'] = df__set_product_info['PRODUCT_INFO'].apply(lambda x: x if isinstance(x, list) else [])
# Creating new rows for the listed dictionaries
df__set_product_info = df__set_product_info.explode('PRODUCT_INFO'
                                                   ,ignore_index = True)

# Replacing all NaN values in product info with empty dictionaries for json_normalize to work
df__set_product_info['PRODUCT_INFO'] = df__set_product_info['PRODUCT_INFO'].apply(lambda x: x if isinstance(x, dict) else {})
# Creating a dataframe from the dictionaries and joining back onto the main table
df__set_product_info = df__set_product_info.join(pd.json_normalize(df__set_product_info["PRODUCT_INFO"]
                                                                  ,max_level = 0))


# Replacing empty values with 0 and converting the column to integer
df__set_product_info['cardCount'] = df__set_product_info['cardCount'].fillna(0).astype('int64')

# Flattening the purchase urls into separate columns
df__set_product_info[['PURCHASE_URL_CARD_KINGDOM','PURCHASE_URL_TCG_PLAYER']] = df__set_product_info['purchaseUrls'].apply(extract_purchase_urls)

# Creating a dataframe from the identifiers dictionary
df__set_product_info = df__set_product_info.join(pd.json_normalize(df__set_product_info['identifiers']))

# Extracting the dictionary key as the type of product
df__set_product_info['CONTENTS_TYPE'] = df__set_product_info['contents'].apply(lambda x: list(x.keys())[0] if isinstance(x, dict) else {})
# Extracting the values as the product content
df__set_product_info['contents'] = df__set_product_info['contents'].apply(lambda x: list(x.values())[0] if isinstance(x, dict) else {})
# Creating new rows for the listed dictionaries
df__set_product_info = df__set_product_info.explode('contents'
                                                   ,ignore_index = True)
# Creating a dataframe from the contents dictionary and joining back onto the main table
df__set_product_info = df__set_product_info.join(pd.json_normalize(df__set_product_info['contents']).add_prefix('contents_'))

# Dropping the source dictionary columns and unneeded columns
df__set_product_info.drop(columns = ['PRODUCT_INFO'
                                    ,'contents'
                                    ,'purchaseUrls'
                                    ,'identifiers'
                                    ,'contents_configs'
                                    ,'contents_set'
                                    ,'contents_foil']
                         ,inplace = True)

# Replacing any empty dicts or lists with NaN
df__set_product_info = df__set_product_info.map(lambda x: np.nan if isinstance(x, (dict, list)) and len(x) == 0 else x)

# Setting the contents_count and relevant ID columns to integer
df__set_product_info['contents_count'] = df__set_product_info['contents_count'].fillna(0).astype('Int64')
for col in ['abuId','cardtraderId','mcmId','tcgplayerProductId','tntId','cardKingdomId','csiId','miniaturemarketId']:
    df__set_product_info[col] = df__set_product_info[col].replace({np.nan: pd.NA}).astype('Int64')

# Renaming the columns
df__set_product_info = df__set_product_info.rename(columns__rename_product_info
                                                  ,axis = 1)

# Reordering the columns
df__set_product_info = df__set_product_info[columns__set_product_info]

# Clean-Up
del df__set_list, columns__rename_product_info, columns__set_product_info, extract_purchase_urls, col, np
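
The `contents` handling above relies on each dictionary having a single key that names the product type, with the listed items as its value. The sketch below shows that idea on invented entries; the keys `sealed` and `deck` and the item fields are assumptions for illustration. It also guards against empty dictionaries, which the `[0]` indexing in the cell would not tolerate.

```python
import pandas as pd

# Invented contents entries: a single key naming the type, a list of items as value
contents = pd.Series([
    {"sealed": [{"count": 36, "name": "Booster Pack"}]},
    {"deck": [{"name": "Deck 1", "set": "aaa"}]},
    None,  # product without contents
])

# First key as the contents type, None when there is nothing to read
contents_type = contents.apply(
    lambda x: next(iter(x)) if isinstance(x, dict) and x else None)

# First value as the list of items, empty list when there is nothing to read
contents_items = contents.apply(
    lambda x: next(iter(x.values())) if isinstance(x, dict) and x else [])

print(contents_type.tolist())
print(contents_items.apply(len).tolist())
```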

## Output

In [626]:
# Appending/replacing the metadata of the json download in a central table
recency_check_upload(schema_name = "raw_data"
                    ,table_name  = "data_recency"
                    ,dataframe   = df__data_recency
                    ,engine      = engine)

# Clean-Up
del recency_check_upload, df__data_recency

In [627]:
# Uploading the Sets info dataframe to postgresql
df__sets_info.to_sql(name      = 'sets_info'
                    ,con       = engine
                    ,schema    = 'raw_data'
                    ,if_exists = 'replace'
                    ,index     = False)

# Clean-Up
del df__sets_info

In [628]:
# Uploading the translations dataframe to postgresql
df__translations.to_sql(name      = 'translations'
                       ,con       = engine
                       ,schema    = 'raw_data'
                       ,if_exists = 'replace'
                       ,index     = False)

# Clean-Up
del df__translations

In [629]:
# Uploading the languages dataframe to postgresql
df__languages.to_sql(name      = 'languages'
                    ,con       = engine
                    ,schema    = 'raw_data'
                    ,if_exists = 'replace'
                    ,index     = False)

# Clean-Up
del df__languages

In [630]:
# Uploading the set decks info dataframe to postgresql
df__set_decks_info.to_sql(name      = 'set_decks_info'
                         ,con       = engine
                         ,schema    = 'raw_data'
                         ,if_exists = 'replace'
                         ,index     = False)

# Clean-Up
del df__set_decks_info

In [631]:
# Uploading the display commanders dataframe to postgresql
df__display_commanders.to_sql(name      = 'set_decks_display_commanders'
                             ,con       = engine
                             ,schema    = 'raw_data'
                             ,if_exists = 'replace'
                             ,index     = False)

# Clean-Up
del df__display_commanders

In [632]:
# Uploading the set decks cards dataframe to postgresql
df__set_decks_cards.to_sql(name      = 'set_decks_cards'
                          ,con       = engine
                          ,schema    = 'raw_data'
                          ,if_exists = 'replace'
                          ,index     = False)

# Clean-Up
del df__set_decks_cards

In [633]:
# Uploading the set deck sideboards dataframe to postgresql
df__set_decks_side_board.to_sql(name      = 'set_decks_side_boards'
                               ,con       = engine
                               ,schema    = 'raw_data'
                               ,if_exists = 'replace'
                               ,index     = False)

# Clean-Up
del df__set_decks_side_board

In [634]:
# Uploading the set deck planes dataframe to postgresql
df__set_decks_planes.to_sql(name      = 'set_decks_planes'
                           ,con       = engine
                           ,schema    = 'raw_data'
                           ,if_exists = 'replace'
                           ,index     = False)

# Clean-Up
del df__set_decks_planes

In [635]:
# Uploading the set deck schemes dataframe to postgresql
df__set_decks_schemes.to_sql(name      = 'set_decks_schemes'
                            ,con       = engine
                            ,schema    = 'raw_data'
                            ,if_exists = 'replace'
                            ,index     = False)

# Clean-Up
del df__set_decks_schemes

In [636]:
# Uploading the set product info dataframe to postgresql
df__set_product_info.to_sql(name      = 'set_product_info'
                           ,con       = engine
                           ,schema    = 'raw_data'
                           ,if_exists = 'replace'
                           ,index     = False)

# Clean-Up
del df__set_product_info
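
Since every upload cell uses the same arguments, the uploads could also be driven from a dictionary of table names and dataframes. The sketch below is an illustration only: it writes two invented frames to an in-memory SQLite database as a stand-in for the PostgreSQL `engine`, so the `schema = 'raw_data'` argument is left out here.

```python
import sqlite3

import pandas as pd

# Invented frames standing in for the real outputs
tables_demo = {
    "sets_info": pd.DataFrame({"SET_CODE": ["AAA"], "SET_NAME": ["Set A"]}),
    "languages": pd.DataFrame({"SET_CODE": ["AAA"], "ENGLISH": [True]}),
}

# In-memory SQLite connection as a stand-in for the postgresql engine
demo_con = sqlite3.connect(":memory:")

for table_name, dataframe in tables_demo.items():
    dataframe.to_sql(name=table_name, con=demo_con, if_exists="replace", index=False)

# Listing what was written
uploaded = pd.read_sql_query(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name",
    con=demo_con)["name"].tolist()
print(uploaded)

demo_con.close()
```

With the real engine, the same loop would pass `con = engine` and `schema = 'raw_data'`, as in the cells above.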

## Checks

### Recency Check

In [637]:
# Check the json file date and version
query = """
        SELECT *
        FROM raw_data.data_recency
        """
pd.read_sql_query(query, con=engine)

Unnamed: 0,json_type,latest_date,latest_version
0,all printings,2025-09-08,5.2.2+20250908
1,keyword,2025-09-28,5.2.2+20250928
2,set list,2025-09-28,5.2.2+20250928
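
In the rows above, the part of `latest_version` after the `+` matches `latest_date`. Assuming that pattern holds, a version string can be split and compared against the stored date. `parse_mtgjson_version` is a hypothetical helper, not part of the notebook.

```python
from datetime import date, datetime

def parse_mtgjson_version(version):
    """Split 'X.Y.Z+YYYYMMDD' into the version and the build date (assumed format)."""
    semver, _, build = version.partition("+")
    build_date = datetime.strptime(build, "%Y%m%d").date()
    return semver, build_date

semver, build_date = parse_mtgjson_version("5.2.2+20250928")
print(semver, build_date)

# The set list row above reports 2025-09-28 as its latest date
dates_match = build_date == date(2025, 9, 28)
print(dates_match)
```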


### Raw Data Tables Check

In [638]:
## Check all the tables in the "raw_data" schema

# Instantiate the inspector object
inspector = inspect(engine)
# List tables in the raw_data schema
for table in inspector.get_table_names(schema="raw_data"):
    print(table)

# Clean-Up
del inspector, inspect, table

display_commanders
set_deck_side_boards
set_deck_planes
keywords
sets_info
translations
data_recency
languages
set_decks_info
set_decks_display_commanders
set_decks_cards
set_decks_side_boards
set_decks_planes
set_decks_schemes
set_product_info
boosters
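
The printed list can be compared against the tables this notebook writes. The sketch below hard-codes both lists from the cells and output above; the extra names may belong to other notebooks or to earlier runs, which this check alone cannot tell apart.

```python
# Tables written in the Output section of this notebook
expected_tables = {
    "data_recency", "sets_info", "translations", "languages",
    "set_decks_info", "set_decks_display_commanders", "set_decks_cards",
    "set_decks_side_boards", "set_decks_planes", "set_decks_schemes",
    "set_product_info",
}

# Tables reported by the inspector in the output above
found_tables = {
    "display_commanders", "set_deck_side_boards", "set_deck_planes", "keywords",
    "sets_info", "translations", "data_recency", "languages", "set_decks_info",
    "set_decks_display_commanders", "set_decks_cards", "set_decks_side_boards",
    "set_decks_planes", "set_decks_schemes", "set_product_info", "boosters",
}

missing_tables = sorted(expected_tables - found_tables)
other_tables = sorted(found_tables - expected_tables)

print("Missing:", missing_tables)
print("Not written by this notebook:", other_tables)
```

In a live session, `found_tables` would come from `inspector.get_table_names(schema="raw_data")` instead of the hard-coded set.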


### Output Check

In [639]:
# Check the sets_info table top 5 values
query = """
        SELECT *
        FROM raw_data.sets_info
        LIMIT 5
        """
pd.read_sql_query(query, con=engine)

Unnamed: 0,SET_CODE,SET_NAME,RELEASE_DATE,SET_TYPE,SET_BLOCK_NAME,SET_PARENT_CODE,SET_TOKEN_CODE,BASE_SET_SIZE,TOTAL_SET_SIZE,DECK_COUNT,FOIL_FLAG,NON_FOIL_FLAG,FOREIGN_FLAG,ONLINE_FLAG,PREVIEW_FLAG,CM_ID,CM_ID_ADD,CM_NAME,CS_SET_ID,KEYRUNE_CODE,MTGO_SET_CODE,TCGPG_ID
0,LEA,Limited Edition Alpha,1993-08-05,core,Core Set,,,295,295,0,False,True,False,False,False,,,,869,LEA,,7
1,LEB,Limited Edition Beta,1993-10-04,core,Core Set,,,302,302,0,False,True,False,False,False,,,,870,LEB,,17
2,2ED,Unlimited Edition,1993-12-01,core,Core Set,,,302,302,0,False,True,False,False,False,,,,938,2ED,,115
3,CED,Collectors' Edition,1993-12-10,memorabilia,,,,302,302,1,False,True,False,False,False,61.0,,Collectors' Edition,786,CED,,1526
4,CEI,Intl. Collectors' Edition,1993-12-10,memorabilia,,,,302,302,1,False,True,False,False,False,,,,856,CEI,,1527


In [640]:
# Check the translations table top 5 values
query = """
        SELECT *
        FROM raw_data.translations
        LIMIT 5
        """
pd.read_sql_query(query, con=engine)

Unnamed: 0,SET_CODE,SET_NAME,BRAZILIAN_PORTUGESE,FRENCH,GERMAN,ITALIAN,JAPANESE,KOREAN,RUSSIAN,SIMPLIFIED_CHINESE,SPANISH,TRADITIONAL_CHINESE
0,LEA,Limited Edition Alpha,,,,,,,,,,
1,LEB,Limited Edition Beta,,,,,,,,,,
2,2ED,Unlimited Edition,,,,,,,,,,
3,CED,Collectors' Edition,,,,,,,,,,
4,CEI,Intl. Collectors' Edition,,,,,,,,,,


In [641]:
# Check the languages table top 5 values
query = """
        SELECT *
        FROM raw_data.languages
        LIMIT 5
        """
pd.read_sql_query(query, con=engine)

Unnamed: 0,SET_CODE,SET_NAME,ANCIENT_GREEK,ARABIC,CHINESE_SIMPLIFIED,CHINESE_TRADITIONAL,ENGLISH,FRENCH,GERMAN,HEBREW,ITALIAN,JAPANESE,KOREAN,LATIN,PHYREXIAN,BRAZILIAN_PORTUGUESE,RUSSIAN,SANSKRIT,SPANISH
0,LEA,Limited Edition Alpha,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False
1,LEB,Limited Edition Beta,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False
2,2ED,Unlimited Edition,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False
3,CEI,Intl. Collectors' Edition,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False
4,CED,Collectors' Edition,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False


In [642]:
# Check the set decks info lookup table top 5 values
query = """
        SELECT *
        FROM raw_data.set_decks_info
        LIMIT 5
        """
pd.read_sql_query(query, con=engine)

Unnamed: 0,SET_CODE,SET_NAME,DECK_NAME,RELEASE_DATE,DECK_TYPE,COMMANDER,PARTNER,DECK_CARDS_COUNT,SIDE_BOARD_CARDS_COUNT,DISPLAY_COMMANDER_COUNT,PLANES_COUNT,SCHEMES_COUNT,SEALED_PRODUCT_IDS
0,LEA,Limited Edition Alpha,,,,,,0,0,0,0,0,
1,LEB,Limited Edition Beta,,,,,,0,0,0,0,0,
2,2ED,Unlimited Edition,,,,,,0,0,0,0,0,
3,CED,Collectors' Edition,Collectors' Edition,1993-12-10,Box Set,,,363,0,0,0,0,
4,CEI,Intl. Collectors' Edition,Intl. Collectors' Edition,1993-12-10,Box Set,,,363,0,0,0,0,


In [643]:
# Check the display commanders lookup table top 5 values
query = """
        SELECT *
        FROM raw_data.set_decks_display_commanders
        LIMIT 5
        """
pd.read_sql_query(query, con=engine)

Unnamed: 0,SET_CODE,SET_NAME,DECK_NAME,DISPLAY_COMMANDER
0,LEA,Limited Edition Alpha,,
1,LEB,Limited Edition Beta,,
2,2ED,Unlimited Edition,,
3,CED,Collectors' Edition,Collectors' Edition,
4,CEI,Intl. Collectors' Edition,Intl. Collectors' Edition,


In [644]:
# Check the deck cards lookup table top 5 values
query = """
        SELECT *
        FROM raw_data.set_decks_cards
        LIMIT 5
        """
pd.read_sql_query(query, con=engine)

Unnamed: 0,SET_CODE,SET_NAME,DECK_NAME,CARD_COUNT,CARD
0,LEA,Limited Edition Alpha,,,
1,LEB,Limited Edition Beta,,,
2,2ED,Unlimited Edition,,,
3,CED,Collectors' Edition,Collectors' Edition,1.0,c423398e-f1d7-571c-8c05-8d5d6b982244
4,CED,Collectors' Edition,Collectors' Edition,1.0,cef4942a-1db6-5961-af91-017afd367bbc


In [645]:
# Check the deck sideboard cards lookup table top 5 values
query = """
        SELECT *
        FROM raw_data.set_decks_side_boards
        LIMIT 5
        """
pd.read_sql_query(query, con=engine)

Unnamed: 0,SET_CODE,SET_NAME,DECK_NAME,CARD_COUNT,SIDE_BOARD_CARD
0,LEA,Limited Edition Alpha,,,
1,LEB,Limited Edition Beta,,,
2,2ED,Unlimited Edition,,,
3,CED,Collectors' Edition,Collectors' Edition,,
4,CEI,Intl. Collectors' Edition,Intl. Collectors' Edition,,


In [646]:
# Check the deck planes lookup table top 5 values
query = """
        SELECT *
        FROM raw_data.set_decks_planes
        LIMIT 5
        """
pd.read_sql_query(query, con=engine)

Unnamed: 0,SET_CODE,SET_NAME,DECK_NAME,PLANE_COUNT,PLANE
0,LEA,Limited Edition Alpha,,,
1,LEB,Limited Edition Beta,,,
2,2ED,Unlimited Edition,,,
3,CED,Collectors' Edition,Collectors' Edition,,
4,CEI,Intl. Collectors' Edition,Intl. Collectors' Edition,,


In [647]:
# Check the deck schemes lookup table top 5 values
query = """
        SELECT *
        FROM raw_data.set_decks_schemes
        LIMIT 5
        """
pd.read_sql_query(query, con=engine)

Unnamed: 0,SET_CODE,SET_NAME,DECK_NAME,SCHEME_COUNT,SCHEME
0,LEA,Limited Edition Alpha,,,
1,LEB,Limited Edition Beta,,,
2,2ED,Unlimited Edition,,,
3,CED,Collectors' Edition,Collectors' Edition,,
4,CEI,Intl. Collectors' Edition,Intl. Collectors' Edition,,


In [648]:
# Check the set product info lookup table top 5 values
query = """
        SELECT *
        FROM raw_data.set_product_info
        LIMIT 5
        """
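# Displaying explicitly (IPython's display), as the clean-up below would otherwise suppress the cell output
display(pd.read_sql_query(query, con=engine))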

# Clean-Up
del pd, query, engine