# Introduction

## Schema from Exports

Schema of the raw export, before slimming down the columns to reduce memory usage:

```
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12291 entries, 0 to 12290
Data columns (total 23 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Unnamed: 0              12291 non-null  int64  
 1   index                   12291 non-null  int64  
 2   Cube Name               12291 non-null  object 
 3   Cubecon Type            12291 non-null  object 
 4   cardID                  12291 non-null  object 
 5   addedTmsp               12291 non-null  object 
 6   details_name            12291 non-null  object 
 7   details_full_name       12291 non-null  object 
 8   details_artist          12291 non-null  object 
 9   details_rarity          12291 non-null  object 
 10  details_color_identity  12291 non-null  object 
 11  details_colors          12291 non-null  object 
 12  details_set             12291 non-null  object 
 13  details_released_at     12291 non-null  object 
 14  details_cmc             12291 non-null  int64  
 15  details_parsed_cost     12291 non-null  object 
 16  details_type            12291 non-null  object 
 17  details_elo             12291 non-null  float64
 18  details_popularity      12291 non-null  float64
 19  details_cubeCount       12291 non-null  int64  
 20  details_loyalty         268 non-null    object 
 21  details_power           5286 non-null   object 
 22  details_toughness       5286 non-null   object 
dtypes: float64(2), int64(4), object(17)
memory usage: 2.2+ MB
```
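
As a rough illustration of why slimming the columns helps, the sketch below compares memory use before and after downcasting two columns. The `sample` frame and its values are made up for illustration; only the column names come from the schema above.

```python
import pandas as pd

# Hypothetical miniature stand-in for the export; values are illustrative only.
sample = pd.DataFrame({
    "details_rarity": ["common", "uncommon", "rare", "common"] * 250,
    "details_cmc": [1, 3, 2, 4] * 250,
})

# deep=True counts the Python string objects, not just the pointers to them.
before = sample.memory_usage(deep=True).sum()

# Low-cardinality strings become categories; small integers fit in int8.
slim = sample.astype({"details_rarity": "category", "details_cmc": "int8"})
after = slim.memory_usage(deep=True).sum()

print(f"before: {before} bytes, after: {after} bytes")
```

The saving depends on how many distinct values a column holds, so `category` is likely to pay off for columns such as rarity and much less for near-unique columns such as `cardID`.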

Count of unique cubes per card:

```python
# Number of distinct cubes each card name appears in, most widely played first
cube_munity = cubes_main_event.groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False)
cube_munity.to_csv('../Data Files/cards_per_unique_cubes.csv')
```
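
To make the behaviour of `nunique()` concrete, here is a small sketch on made-up rows (the cube names `Cube A`, `Cube B`, `Cube C` are hypothetical). A card listed twice in the same cube is counted once for that cube.

```python
import pandas as pd

# Toy card list: Lightning Bolt appears twice in Cube A, once in B and C.
toy = pd.DataFrame({
    "Cube Name": ["Cube A", "Cube A", "Cube B", "Cube C", "Cube B"],
    "details_name": ["Lightning Bolt", "Lightning Bolt", "Lightning Bolt",
                     "Lightning Bolt", "Duress"],
})

# Distinct cubes per card name, largest count first.
cubes_per_card = (toy
    .groupby("details_name")["Cube Name"]
    .nunique()
    .sort_values(ascending=False)
)
print(cubes_per_card)
```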

Creating a function to clean up the data:

```python
import pandas as pd

# TODO: move this step so it runs right after the extraction
def tweak_cubes(cubes_main_event):
    """Return a cleaned copy of the raw export with smaller dtypes and helper columns."""
    return (cubes_main_event
     .assign(
             # Missing mana values are treated as 0 so the column fits in int8
             details_cmc=cubes_main_event.details_cmc.fillna(0).astype('int8'),
             # Boolean flags derived from the card's type line
             is_creature=cubes_main_event.details_type.str.contains('Creature'),
             is_land=cubes_main_event.details_type.str.contains('Land'),
             is_planeswalker=cubes_main_event.details_type.str.contains('Planeswalker'),
             # addedTmsp is parsed as epoch milliseconds; unparseable values become NaT
             added_to_cube_on=pd.to_datetime(cubes_main_event.addedTmsp, errors='coerce', unit='ms'),
             # Identifier combining the cube name and the card ID
             composite_id=cubes_main_event['Cube Name'] + '-' + cubes_main_event['cardID']
        )
     .astype({'Cubecon Type': 'category', 'details_cmc': 'int8', 'details_rarity': 'category'})
     .drop(columns=['Unnamed: 0', 'index'])
    )

cleaned_data = tweak_cubes(cubes_main_event)
```
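
The type-line flags can be tried out in isolation. The sketch below uses a made-up `details_type` Series; passing `na=False` is an addition not present in the function above, included on the assumption that a missing type line should count as "not a creature" rather than propagate as a missing flag.

```python
import pandas as pd

# Illustrative type lines, including one missing value.
types = pd.Series(
    ["Creature — Human Soldier", "Enchantment", "Land — Forest", None],
    name="details_type",
)

# str.contains does substring matching; na=False maps missing values to False.
flags = pd.DataFrame({
    "is_creature": types.str.contains("Creature", na=False),
    "is_land": types.str.contains("Land", na=False),
})
print(flags)
```

Without `na=False`, the last row would hold a missing value instead of `False`, which matters once the flags are averaged or summed.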

Checking for missing values in the parsed timestamp, then filling them from the raw `addedTmsp` strings:

```python
# Rows where the epoch-millisecond parse failed (NaT), shown next to the raw value
missing = cleaned_data['added_to_cube_on'].isnull()
cleaned_data.loc[missing, ['added_to_cube_on', 'addedTmsp']]

# Re-parse those rows as date strings, then make the whole column UTC-aware
cleaned_data.loc[missing, 'added_to_cube_on'] = pd.to_datetime(cleaned_data.loc[missing, 'addedTmsp'])
cleaned_data['added_to_cube_on'] = pd.to_datetime(cleaned_data['added_to_cube_on'], utc=True)

# utc=True works whether the release dates are tz-naive or already tz-aware
released = pd.to_datetime(cleaned_data['details_released_at'], utc=True)
cleaned_data['days_until_added_to_cube'] = (cleaned_data['added_to_cube_on'] - released).dt.days
```
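
The two-step parse can be sketched on a toy Series that mixes both timestamp styles. This is a variation rather than the exact calls above: it converts with `pd.to_numeric` first so that only numeric strings are treated as epoch milliseconds. The sample values and release dates are illustrative assumptions.

```python
import pandas as pd

# One epoch-millisecond string and one ISO 8601 string (illustrative values).
raw = pd.Series(["1572901810806", "2021-03-05T12:00:00.000Z"])

# Numeric strings become numbers; everything else becomes NaN.
as_ms = pd.to_numeric(raw, errors="coerce")
parsed = pd.to_datetime(as_ms, unit="ms", utc=True)

# Fall back to ordinary date-string parsing for the rows that are still NaT.
missing = parsed.isna()
parsed.loc[missing] = pd.to_datetime(raw[missing], utc=True)

# Days between a (made-up) release date and the date the card was added.
released = pd.to_datetime(pd.Series(["2013-09-27", "2021-03-01"]), utc=True)
days = (parsed - released).dt.days
print(days)
```

Keeping both sides of the subtraction in UTC avoids the error pandas raises when a tz-aware column is subtracted from a tz-naive one.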




In [1]:
import numpy as np
import pandas as pd

In [2]:
pre_accepted_cubes = pd.read_csv('../Data Files/pre_accepted_cubes_extract.csv')
poll_winner_cubes = pd.read_csv('../Data Files/poll_winner_cubes_extract.csv')

In [3]:
pre_accepted_cubes.info()
poll_winner_cubes.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12245 entries, 0 to 12244
Data columns (total 23 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Unnamed: 0              12245 non-null  int64  
 1   index                   12245 non-null  int64  
 2   Cube Name               12245 non-null  object 
 3   Cubecon Type            12245 non-null  object 
 4   cardID                  12245 non-null  object 
 5   addedTmsp               12245 non-null  object 
 6   details_name            12245 non-null  object 
 7   details_full_name       12245 non-null  object 
 8   details_artist          12245 non-null  object 
 9   details_rarity          12245 non-null  object 
 10  details_color_identity  12245 non-null  object 
 11  details_colors          12245 non-null  object 
 12  details_set             12245 non-null  object 
 13  details_released_at     12245 non-null  object 
 14  details_cmc             12245 non-null

In [4]:
cubes_main_event = pd.concat([pre_accepted_cubes, poll_winner_cubes], ignore_index=True)
cubes_main_event.head()

| | Unnamed: 0 | index | Cube Name | Cubecon Type | cardID | addedTmsp | details_name | details_full_name | details_artist | details_rarity | ... | details_released_at | details_cmc | details_parsed_cost | details_type | details_elo | details_popularity | details_cubeCount | details_loyalty | details_power | details_toughness |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0 | Regular Cube | Pre-Accepted | 251015ed-9408-4941-894a-158551ed2613 | 1572901810806 | Favored Hoplite | Favored Hoplite [ths-13] | Winona Nelson | uncommon | ... | 2013-09-27 | 1 | ['w'] | Creature — Human Soldier | 1231.6 | 1.605022 | 2388 | | 1.0 | 2.0 |
| 1 | 1 | 1 | Regular Cube | Pre-Accepted | 70e3a90c-1e5c-4646-b3d5-ff46d3fa7b35 | 1572901810808 | Trusted Pegasus | Trusted Pegasus [m20-314] | Chris Rahn | common | ... | 2019-07-12 | 3 | ['w', '2'] | Creature — Pegasus | 1188.6 | 1.501516 | 2234 | | 2.0 | 2.0 |
| 2 | 2 | 2 | Regular Cube | Pre-Accepted | 27394079-924a-4fdb-8be2-f853193eca80 | 1572901810808 | Whitemane Lion | Whitemane Lion [a25-39] | Zoltan Boros & Gabor Szikszai | common | ... | 2018-03-16 | 2 | ['w', '1'] | Creature — Cat | 1175.3 | 3.695987 | 5499 | | 2.0 | 2.0 |
| 3 | 3 | 3 | Regular Cube | Pre-Accepted | c47ba1fa-3ace-488b-97e6-d9f3b389c602 | 1572901810809 | Emeria Angel | Emeria Angel [ima-20] | Jim Murray | rare | ... | 2017-11-17 | 4 | ['w', 'w', '2'] | Creature — Angel | 1198.9 | 4.237043 | 6304 | | 3.0 | 3.0 |
| 4 | 4 | 4 | Regular Cube | Pre-Accepted | 2c7142a8-38fa-4e9d-9085-a26fb217a433 | 1572901810810 | Oblivion Ring | Oblivion Ring [ddg-34] | Chuck Lukacs | common | ... | 2011-04-01 | 3 | ['w', '2'] | Enchantment | 1274.7 | 17.211644 | 25608 | | | |
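
For readers unfamiliar with `ignore_index=True`, the sketch below stacks two tiny made-up frames. The cube names and the `Poll Winner` label are hypothetical placeholders, not values confirmed by the export.

```python
import pandas as pd

# Two toy extracts, each with its own 0-based index.
pre_accepted = pd.DataFrame({"Cube Name": ["Cube A", "Cube A"],
                             "Cubecon Type": "Pre-Accepted"})
poll_winner = pd.DataFrame({"Cube Name": ["Cube B"],
                            "Cubecon Type": "Poll Winner"})

# ignore_index=True discards the original row labels and renumbers from 0,
# so the combined frame has no duplicate index values.
combined = pd.concat([pre_accepted, poll_winner], ignore_index=True)
print(combined)
```

Without `ignore_index=True` the second frame would keep its own label `0`, and label-based lookups such as `combined.loc[0]` would return more than one row.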


In [5]:
# Update Date when Ran
cubes_main_event.to_csv('../Data Files/cubecon_card_list_2023_08_24.csv')

In [6]:
# Give me the # of Cubes each card (overall card name) is in
# Then we turn it into a function:
# Then you move this to after the extraction

def tweak_cubes(cubes_main_event):
    return (cubes_main_event
     .assign(details_cmc=cubes_main_event.details_cmc.fillna(0).astype('int8'),
             is_creature=cubes_main_event.details_type.str.contains('Creature'),
             is_land = cubes_main_event.details_type.str.contains('Land'),
             is_planeswalker = cubes_main_event.details_type.str.contains('Planeswalker'),
             is_gold = cubes_main_event.details_colors.str.contains(','),
             added_to_cube_on = pd.to_datetime(cubes_main_event.addedTmsp, errors='coerce', unit='ms'),
             composite_id = cubes_main_event['Cube Name'] + '-' + cubes_main_event['cardID']
        )
     .astype(
         {'Cubecon Type': 'category', 
          'details_cmc': 'int8',
          'details_rarity': 'category', 
          'details_released_at':'datetime64[ns, UTC]'}
     )
     .drop(columns=['Unnamed: 0', 'index'])
)

cleaned_data = tweak_cubes(cubes_main_event)

In [7]:
cleaned_data.loc[cleaned_data['added_to_cube_on'].isnull(),'added_to_cube_on'] = pd.to_datetime(cleaned_data.loc[cleaned_data['added_to_cube_on'].isnull(),:]['addedTmsp'])
cleaned_data['added_to_cube_on'] = pd.to_datetime(cleaned_data['added_to_cube_on'], utc=True)
cleaned_data['days_until_added_to_cube'] = ((cleaned_data['added_to_cube_on'])-pd.to_datetime(cleaned_data['details_released_at']).dt.tz_convert('UTC')).dt.days


In [8]:
cleaned_data.info()
cubes_main_event.to_csv('../Data Files/cubecon_card_list_2023_08_24.csv')

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20563 entries, 0 to 20562
Data columns (total 28 columns):
 #   Column                    Non-Null Count  Dtype              
---  ------                    --------------  -----              
 0   Cube Name                 20563 non-null  object             
 1   Cubecon Type              20563 non-null  category           
 2   cardID                    20563 non-null  object             
 3   addedTmsp                 20563 non-null  object             
 4   details_name              20563 non-null  object             
 5   details_full_name         20563 non-null  object             
 6   details_artist            20563 non-null  object             
 7   details_rarity            20563 non-null  category           
 8   details_color_identity    20563 non-null  object             
 9   details_colors            20563 non-null  object             
 10  details_set               20563 non-null  object             
 11  details_release

In [9]:
avg_cubing = (cleaned_data
                 .groupby('Cube Name')[['details_popularity', 'is_land', 'is_gold']]
                 .mean()
                 .sort_values('details_popularity', ascending=False)
)

avg_cubing

| Cube Name | details_popularity | is_land | is_gold |
| --- | --- | --- | --- |
| Data Generated Vintage Cube | 11.197034 | 0.15 | 0.083333 |
| The Museum of Modern | 9.962232 | 0.235417 | 0.2125 |
| The Bun Magic Cube | 9.786131 | 0.222222 | 0.1 |
| Dekkaru Cube | 9.205485 | 0.158607 | 0.088975 |
| The Modern Darlings Cube | 8.702585 | 0.208333 | 0.116667 |
| Eleusis | 8.483045 | 0.146667 | 0.057778 |
| Derek’s Cube | 7.832255 | 0.20625 | 0.172917 |
| The Chicago Cube | 7.79363 | 0.144444 | 0.105556 |
| The Creative Cube | 7.538415 | 0.2 | 0.09 |
| All-Foil Midrange Cube | 7.088198 | 0.138889 | 0.197222 |
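
The `is_land` and `is_gold` values above read as shares because the mean of a boolean column is the fraction of `True` rows. A minimal sketch on made-up data (cube names and numbers are hypothetical):

```python
import pandas as pd

# Cube A has 1 land out of 4 cards; Cube B has 2 lands out of 2 cards.
toy = pd.DataFrame({
    "Cube Name": ["Cube A"] * 4 + ["Cube B"] * 2,
    "is_land": [True, False, False, False, True, True],
    "details_popularity": [1.0, 2.0, 3.0, 4.0, 5.0, 7.0],
})

# Selecting the columns before .mean() keeps text columns out of the aggregation.
shares = (toy
    .groupby("Cube Name")[["details_popularity", "is_land"]]
    .mean()
    .sort_values("details_popularity", ascending=False)
)
print(shares)
```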


In [10]:
cleaned_data.groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False).head(40)

details_name
Temple Garden              31
Stomping Ground            31
Sacred Foundry             30
Overgrown Tomb             30
Blood Crypt                30
Godless Shrine             29
Steam Vents                29
Breeding Pool              29
Watery Grave               29
Hallowed Fountain          29
Lightning Bolt             27
Faithless Looting          26
Bloodstained Mire          26
Wooded Foothills           26
Path to Exile              26
Windswept Heath            26
Flooded Strand             25
Polluted Delta             25
Verdant Catacombs          25
Duress                     24
Marsh Flats                24
Scalding Tarn              24
Misty Rainforest           24
Arid Mesa                  24
Young Pyromancer           23
Thraben Inspector          23
Abrade                     23
Eternal Witness            23
Tireless Tracker           22
Preordain                  21
Grim Lavamancer            21
Carrion Feeder             21
Unearth                    

In [55]:
peasant = cleaned_data[cleaned_data['details_rarity'].isin(['uncommon', 'common'])]

In [56]:
peasant_groupby = peasant.groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False)
peasant_groupby.to_csv('../Data Files/peasant_cards_grouping_2023_06_22.csv')

In [11]:
cleaned_data[cleaned_data['details_set'] == 'ltr'].groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False).head(40)

details_name
Reprieve                        7
Andúril, Flame of the West      6
Wizard's Rockets                6
Delighted Halfling              6
Stern Scolding                  5
Rise of the Witch-king          5
Rosie Cotton of South Lane      5
Samwise Gamgee                  5
Merry, Esquire of Rohan         5
Generous Ent                    4
Troll of Khazad-dûm             4
Barrow-Blade                    4
Denethor, Ruling Steward        4
Flame of Anor                   4
Flowering of the White Tree     4
Oliphaunt                       4
Orcish Bowmasters               4
Palantír of Orthanc             4
Pippin, Guard of the Citadel    4
The One Ring                    4
Rally at the Hornburg           4
The Shire                       4
Lórien Revealed                 3
Éomer, Marshal of Rohan         3
Lembas                          3
Gollum's Bite                   3
Arwen, Mortal Queen             3
Minas Tirith                    3
Gandalf the White               3
E

In [12]:
cleaned_data[cleaned_data['details_set'] == 'ltr'].groupby('Cube Name')['details_name'].nunique().sort_values(ascending=False)

Cube Name
Amonkar Desert                      24
The Spikeless Cube                  19
Derek’s Cube                        15
The Live the Dream Cube             14
Creatureless Cube                   12
Sammich's Peasant Cube              11
The Tempo Cube                      11
The Creative Cube                   11
Commander extravaganza!             10
Tiny Leaders                         9
The Jund Cube                        9
A Study in Harmony                   8
The Devoid Cube                      6
The Penrose Cube                     6
May's Fae Cube                       6
The Buildaround Cube                 6
Changeling Cube                      5
Reading Rainbow                      5
The Bun Magic Cube                   4
All-Foil Midrange Cube               4
Loial's Micro Cube                   4
Emma Partlow's Peasant Cube          4
Counters of Monte Cristo             3
Spooky Black Halloween Graveyard     3
The Cascade Cube                     3
Uber Bear's Art

In [13]:
cleaned_data[cleaned_data['details_name'] == 'Karakas'].groupby('Cube Name')['details_name'].nunique().sort_values(ascending=False)

Cube Name
Tolsimir Cube                  1
Data Generated Vintage Cube    1
Name: details_name, dtype: int64

In [16]:
top_cards = cleaned_data.groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False)