# Introduction

## Schema from Exports

The export has 23 columns, 17 of them stored as `object` dtype. Slimming down the columns and tightening their dtypes reduces the memory footprint.

```text
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12291 entries, 0 to 12290
Data columns (total 23 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Unnamed: 0              12291 non-null  int64  
 1   index                   12291 non-null  int64  
 2   Cube Name               12291 non-null  object 
 3   Cubecon Type            12291 non-null  object 
 4   cardID                  12291 non-null  object 
 5   addedTmsp               12291 non-null  object 
 6   details_name            12291 non-null  object 
 7   details_full_name       12291 non-null  object 
 8   details_artist          12291 non-null  object 
 9   details_rarity          12291 non-null  object 
 10  details_color_identity  12291 non-null  object 
 11  details_colors          12291 non-null  object 
 12  details_set             12291 non-null  object 
 13  details_released_at     12291 non-null  object 
 14  details_cmc             12291 non-null  int64  
 15  details_parsed_cost     12291 non-null  object 
 16  details_type            12291 non-null  object 
 17  details_elo             12291 non-null  float64
 18  details_popularity      12291 non-null  float64
 19  details_cubeCount       12291 non-null  int64  
 20  details_loyalty         268 non-null    object 
 21  details_power           5286 non-null   object 
 22  details_toughness       5286 non-null   object 
dtypes: float64(2), int64(4), object(17)
memory usage: 2.2+ MB
```
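
The `+` in `2.2+ MB` means `info()` did not measure the Python strings behind the `object` columns, so the real footprint is larger. To see what a dtype change saves, `memory_usage(deep=True)` can be compared before and after. A minimal sketch on a made-up frame (the toy values and the `bytes_used` helper are illustrative, not part of the export):

```python
import pandas as pd

# Toy stand-in for the export: a low-cardinality text column and a small-integer column.
toy = pd.DataFrame({
    "details_rarity": ["common", "uncommon", "rare", "mythic"] * 2500,
    "details_cmc": [1, 2, 3, 4] * 2500,
})

def bytes_used(df: pd.DataFrame) -> int:
    """Total memory in bytes, including the strings behind text columns."""
    return int(df.memory_usage(deep=True).sum())

# Same conversions as the clean-up step: category for repeated labels, int8 for small ints.
slim = toy.astype({"details_rarity": "category", "details_cmc": "int8"})

before, after = bytes_used(toy), bytes_used(slim)
print(f"before: {before:,} bytes, after: {after:,} bytes")
```

`category` only pays off when a column has few distinct values relative to its length, and `int8` only holds values from -128 to 127, so both choices should be checked against the real data.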

Number of cubes each card appears in:

```python
# Count the distinct cubes each card name appears in (the result is a Series), then save it
cube_munity = cubes_main_event.groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False)
cube_munity.to_csv('../Data Files/cards_per_unique_cubes.csv')
```
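
To make the counting rule concrete, here is a small sketch on invented rows (the cube and card entries are hypothetical). `nunique()` counts each cube once per card, even when a cube lists the same card twice, whereas `size()` counts rows:

```python
import pandas as pd

# Hypothetical miniature of the card list: one row per (cube, card) entry.
toy = pd.DataFrame({
    "Cube Name": ["Cube A", "Cube A", "Cube A", "Cube B", "Cube B", "Cube C"],
    "details_name": ["Lightning Bolt", "Lightning Bolt", "Duress",
                     "Lightning Bolt", "Duress", "Lightning Bolt"],
})

# Distinct cubes per card: the duplicate Lightning Bolt in Cube A counts once.
cubes_per_card = (
    toy.groupby("details_name")["Cube Name"]
    .nunique()
    .sort_values(ascending=False)
)

# Rows per card, for contrast: duplicates within a cube are counted.
rows_per_card = toy.groupby("details_name").size()

print(cubes_per_card)
print(rows_per_card)
```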

Creating a function to clean up the data:

```python
import pandas as pd

# Clean-up steps collected into one function so they can be re-run
# right after each extraction.
def tweak_cubes(cubes_main_event):
    return (cubes_main_event
     .assign(details_cmc=cubes_main_event.details_cmc.fillna(0).astype('int8'),
             # Flag card types by substring match on the type line
             is_creature=cubes_main_event.details_type.str.contains('Creature'),
             is_land=cubes_main_event.details_type.str.contains('Land'),
             is_planeswalker=cubes_main_event.details_type.str.contains('Planeswalker'),
             # Epoch-millisecond timestamps; values that fail to parse become NaT
             added_to_cube_on=pd.to_datetime(cubes_main_event.addedTmsp, errors='coerce', unit='ms'),
             # One ID per (cube, card) pair
             composite_id=cubes_main_event['Cube Name'] + '-' + cubes_main_event['cardID']
        )
     .astype({'Cubecon Type': 'category', 'details_rarity': 'category'})
     .drop(columns=['Unnamed: 0', 'index'])
    )

cleaned_data = tweak_cubes(cubes_main_event)
```
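
The type flags are plain substring matches, so one card can set several flags at once, and a missing type line would give `NaN` instead of `False`. A short sketch of that behaviour, assuming a made-up `type_lines` Series; `na=False` is an optional extra that the function above does not use:

```python
import pandas as pd

# Hypothetical type lines, including one missing value.
type_lines = pd.Series(
    ["Artifact Creature", "Land Creature", "Instant", None],
    name="details_type",
)

flags = pd.DataFrame({
    # na=False turns missing type lines into False instead of NaN.
    "is_creature": type_lines.str.contains("Creature", na=False),
    "is_land": type_lines.str.contains("Land", na=False),
})

print(flags)
```

In the export shown above `details_type` has no nulls, so `na=False` would only matter if a later extract contains cards without a type line.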

Checking for timestamps that failed to parse:

```python
# Rows where the epoch-millisecond parse produced NaT
missing = cleaned_data['added_to_cube_on'].isnull()
cleaned_data.loc[missing, ['added_to_cube_on', 'addedTmsp']]

# Re-parse those rows as date strings, then normalise the whole column to UTC
cleaned_data.loc[missing, 'added_to_cube_on'] = pd.to_datetime(cleaned_data.loc[missing, 'addedTmsp'])
cleaned_data['added_to_cube_on'] = pd.to_datetime(cleaned_data['added_to_cube_on'], utc=True)

# details_released_at holds plain dates, so parse it as UTC before subtracting
# (tz_convert raises on tz-naive values)
released_at = pd.to_datetime(cleaned_data['details_released_at'], utc=True)
cleaned_data['days_until_added_to_cube'] = (cleaned_data['added_to_cube_on'] - released_at).dt.days
```
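
Because `addedTmsp` is an `object` column, it seems to mix epoch milliseconds stored as text with ordinary date strings. One way to handle both in a single pass, sketched on invented values (the `raw` Series and the variable names are assumptions, not part of the notebook), is to convert to numbers first and fall back to string parsing for whatever is left. Newer pandas releases also appear to deprecate passing numeric strings together with `unit=`, which this avoids:

```python
import pandas as pd

# Hypothetical sample: epoch milliseconds stored as text alongside an ISO 8601 string.
raw = pd.Series(["1572901810806", "2023-09-01T12:00:00Z", "1694649600000"])

# Numeric-looking entries become numbers; everything else becomes NaN.
as_ms = pd.to_numeric(raw, errors="coerce")
from_epoch = pd.to_datetime(as_ms, unit="ms", utc=True)

# Parse only the entries that were not numeric as date strings.
from_text = pd.to_datetime(raw.where(as_ms.isna()), utc=True, errors="coerce")

# Prefer the epoch parse and fill its gaps from the string parse.
parsed = from_epoch.fillna(from_text)

print(parsed)
```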




```python
import numpy as np
import pandas as pd
```

```python
pre_accepted_cubes = pd.read_csv('../Data Files/pre_accepted_cubes_extract.csv')
poll_winner_cubes = pd.read_csv('../Data Files/poll_winner_cubes_extract.csv')
```

```python
pre_accepted_cubes.info()
poll_winner_cubes.info()
```

```text
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12258 entries, 0 to 12257
Data columns (total 23 columns):
 #   Column                  Non-Null Count  Dtype
---  ------                  --------------  -----
 0   Unnamed: 0              12258 non-null  int64
 1   index                   12258 non-null  int64
 2   Cube Name               12258 non-null  object
 3   Cubecon Type            12258 non-null  object
 4   cardID                  12258 non-null  object
 5   addedTmsp               12258 non-null  object
 6   details_name            12258 non-null  object
 7   details_full_name       12258 non-null  object
 8   details_artist          12258 non-null  object
 9   details_rarity          12258 non-null  object
 10  details_color_identity  12258 non-null  object
 11  details_colors          12258 non-null  object
 12  details_set             12258 non-null  object
 13  details_released_at     12258 non-null  object
 14  details_cmc             12258 non-null
```

```python
cubes_main_event = pd.concat([pre_accepted_cubes, poll_winner_cubes], ignore_index=True)
cubes_main_event.head()
```

|  | Unnamed: 0 | index | Cube Name | Cubecon Type | cardID | addedTmsp | details_name | details_full_name | details_artist | details_rarity | ... | details_released_at | details_cmc | details_parsed_cost | details_type | details_elo | details_popularity | details_cubeCount | details_loyalty | details_power | details_toughness |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | Regular Cube | Pre-Accepted | 251015ed-9408-4941-894a-158551ed2613 | 1572901810806 | Favored Hoplite | Favored Hoplite [ths-13] | Winona Nelson | uncommon | ... | 2013-09-27 | 1 | ['w'] | Creature — Human Soldier | 1231.6 | 1.605022 | 2388 | NaN | 1.0 | 2.0 |
| 1 | 1 | 1 | Regular Cube | Pre-Accepted | 70e3a90c-1e5c-4646-b3d5-ff46d3fa7b35 | 1572901810808 | Trusted Pegasus | Trusted Pegasus [m20-314] | Chris Rahn | common | ... | 2019-07-12 | 3 | ['w', '2'] | Creature — Pegasus | 1188.6 | 1.501516 | 2234 | NaN | 2.0 | 2.0 |
| 2 | 2 | 2 | Regular Cube | Pre-Accepted | 27394079-924a-4fdb-8be2-f853193eca80 | 1572901810808 | Whitemane Lion | Whitemane Lion [a25-39] | Zoltan Boros & Gabor Szikszai | common | ... | 2018-03-16 | 2 | ['w', '1'] | Creature — Cat | 1175.3 | 3.695987 | 5499 | NaN | 2.0 | 2.0 |
| 3 | 3 | 3 | Regular Cube | Pre-Accepted | c47ba1fa-3ace-488b-97e6-d9f3b389c602 | 1572901810809 | Emeria Angel | Emeria Angel [ima-20] | Jim Murray | rare | ... | 2017-11-17 | 4 | ['w', 'w', '2'] | Creature — Angel | 1198.9 | 4.237043 | 6304 | NaN | 3.0 | 3.0 |
| 4 | 4 | 4 | Regular Cube | Pre-Accepted | 2c7142a8-38fa-4e9d-9085-a26fb217a433 | 1572901810810 | Oblivion Ring | Oblivion Ring [ddg-34] | Chuck Lukacs | common | ... | 2011-04-01 | 3 | ['w', '2'] | Enchantment | 1274.7 | 17.211644 | 25608 | NaN | NaN | NaN |


```python
# Update the date in the filename each time this is run
cubes_main_event.to_csv('../Data Files/cubecon_card_list_2023_09_14.csv')
```

```python
# Clean-up steps collected into one function so they can be re-run
# right after each extraction.
def tweak_cubes(cubes_main_event):
    return (cubes_main_event
     .assign(details_cmc=cubes_main_event.details_cmc.fillna(0).astype('int8'),
             is_creature=cubes_main_event.details_type.str.contains('Creature'),
             is_land=cubes_main_event.details_type.str.contains('Land'),
             is_planeswalker=cubes_main_event.details_type.str.contains('Planeswalker'),
             # A comma in details_colors means more than one color
             is_gold=cubes_main_event.details_colors.str.contains(','),
             added_to_cube_on=pd.to_datetime(cubes_main_event.addedTmsp, errors='coerce', unit='ms'),
             composite_id=cubes_main_event['Cube Name'] + '-' + cubes_main_event['cardID']
        )
     .astype(
         {'Cubecon Type': 'category',
          'details_cmc': 'int8',
          'details_rarity': 'category',
          'details_released_at': 'datetime64[ns, UTC]'}
     )
     .drop(columns=['Unnamed: 0', 'index'])
    )

cleaned_data = tweak_cubes(cubes_main_event)
```

```python
cleaned_data.loc[cleaned_data['added_to_cube_on'].isnull(),'added_to_cube_on'] = pd.to_datetime(cleaned_data.loc[cleaned_data['added_to_cube_on'].isnull(),:]['addedTmsp'])
cleaned_data['added_to_cube_on'] = pd.to_datetime(cleaned_data['added_to_cube_on'], utc=True)
cleaned_data['days_until_added_to_cube'] = ((cleaned_data['added_to_cube_on'])-pd.to_datetime(cleaned_data['details_released_at']).dt.tz_convert('UTC')).dt.days
```


```python
cleaned_data.info()
cubes_main_event.to_csv('../Data Files/cubecon_card_list_2023_09_14.csv')
```

```text
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20578 entries, 0 to 20577
Data columns (total 28 columns):
 #   Column                    Non-Null Count  Dtype
---  ------                    --------------  -----
 0   Cube Name                 20578 non-null  object
 1   Cubecon Type              20578 non-null  category
 2   cardID                    20578 non-null  object
 3   addedTmsp                 20578 non-null  object
 4   details_name              20578 non-null  object
 5   details_full_name         20578 non-null  object
 6   details_artist            20578 non-null  object
 7   details_rarity            20578 non-null  category
 8   details_color_identity    20578 non-null  object
 9   details_colors            20578 non-null  object
 10  details_set               20578 non-null  object
 11  details_release
```

```python
# Select the numeric and boolean columns before averaging:
# calling .mean() on every column fails on text columns in recent pandas versions
avg_cubing = (cleaned_data
                 .groupby('Cube Name')[['details_cmc', 'details_popularity', 'is_land', 'is_gold']]
                 .mean()
                 .sort_values('details_popularity', ascending=False)
)

avg_cubing
```

| Cube Name | details_cmc | details_popularity | is_land | is_gold |
|---|---|---|---|---|
| Data Generated Vintage Cube | 2.427778 | 10.981196 | 0.15 | 0.088889 |
| The Museum of Modern | 1.935417 | 10.002896 | 0.235417 | 0.210417 |
| The Bun Magic Cube | 1.697222 | 9.672429 | 0.225 | 0.097222 |
| Dekkaru Cube | 2.585185 | 8.866795 | 0.159259 | 0.092593 |
| The Modern Darlings Cube | 1.977778 | 8.805367 | 0.208333 | 0.122222 |
| Eleusis | 2.342222 | 8.483045 | 0.146667 | 0.057778 |
| The Chicago Cube | 2.297222 | 7.696703 | 0.144444 | 0.105556 |
| Derek’s Cube | 2.133333 | 7.6798 | 0.210417 | 0.164583 |
| The Creative Cube | 2.238961 | 7.446302 | 0.207792 | 0.085714 |
| All-Foil Midrange Cube | 2.927778 | 7.05099 | 0.138889 | 0.191667 |
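
The `is_land` and `is_gold` values above are means of boolean flags, so each one reads as the share of a cube's cards with that flag (0.15 means 15% of the cube is lands). A tiny sketch with invented rows, where the cube names and flags are hypothetical:

```python
import pandas as pd

# Hypothetical cubes: Cube A has 1 land out of 4 cards, Cube B has 2 out of 2.
toy = pd.DataFrame({
    "Cube Name": ["Cube A"] * 4 + ["Cube B"] * 2,
    "is_land": [True, False, False, False, True, True],
})

# True counts as 1 and False as 0, so the mean is the fraction of True rows.
land_share = toy.groupby("Cube Name")["is_land"].mean()

print(land_share)
```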


```python
x = cleaned_data.groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False)
x.to_csv('../Data Files/top_cards_2023_09_14.csv')
x.head(40)
```

```text
details_name
Temple Garden              31
Stomping Ground            31
Overgrown Tomb             30
Sacred Foundry             30
Blood Crypt                30
Breeding Pool              29
Godless Shrine             29
Steam Vents                29
Hallowed Fountain          29
Watery Grave               29
Lightning Bolt             26
Windswept Heath            26
Bloodstained Mire          26
Wooded Foothills           26
Path to Exile              26
Verdant Catacombs          25
Polluted Delta             25
Flooded Strand             25
Arid Mesa                  24
Misty Rainforest           24
Marsh Flats                24
Scalding Tarn              24
Eternal Witness            23
Abrade                     23
Faithless Looting          23
Duress                     23
Thraben Inspector          23
Young Pyromancer           22
Dismember                  21
Grim Lavamancer            21
Carrion Feeder             21
Tireless Tracker           21
Brainstorm
```

```python
peasant = cleaned_data[cleaned_data['details_rarity'].isin(['uncommon', 'common'])]
```

```python
peasant_groupby = peasant.groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False)
peasant_groupby.to_csv('../Data Files/peasant_cards_grouping_2023_09_08.csv')
```

```python
cleaned_data[cleaned_data['details_set'] == 'woe'].groupby('Cube Name')['details_name'].nunique().sort_values(ascending=False)
```

```text
Cube Name
May's Fae Cube                      39
The Creative Cube                   27
The Spikeless Cube                  22
Amonkar Desert                      17
Rainbow Synergy Cube                16
Creatureless Cube                   16
Tiny Leaders                        16
The Tempo Cube                      14
The Jund Cube                       13
The Buildaround Cube                13
Spooky Black Halloween Graveyard    13
A Study in Harmony                  13
StormTime                           12
The Live the Dream Cube              8
The Chicago Cube                     8
The Cascade Cube                     8
The Penrose Cube                     7
The Bun Magic Cube                   7
Sammich's Peasant Cube               6
Regular Cube                         5
Dragons of Winter's Night            5
Derek’s Cube                         5
Uber Bear's Artifact Cube            4
The Devoid Cube                      4
Counters of Monte Cristo             4
Changeling Cube
```

```python
cleaned_data[cleaned_data['details_name'] == 'Karakas'].groupby('Cube Name')['details_name'].nunique().sort_values(ascending=False)
```

```text
Cube Name
Tolsimir Cube                  1
Data Generated Vintage Cube    1
Name: details_name, dtype: int64
```

```python
cleaned_data[cleaned_data['details_set'] == 'woe'].groupby('Cube Name')['details_name'].nunique().sort_values(ascending=False)
```

```text
Cube Name
May's Fae Cube                      39
The Spikeless Cube                  21
Amonkar Desert                      17
Creatureless Cube                   15
Spooky Black Halloween Graveyard    13
StormTime                           12
A Study in Harmony                  12
The Jund Cube                       12
The Tempo Cube                      12
The Cascade Cube                     8
The Penrose Cube                     7
Sammich's Peasant Cube               6
The Chicago Cube                     5
Counters of Monte Cristo             5
Changeling Cube                      5
Dragons of Winter's Night            5
The Devoid Cube                      4
Derek’s Cube                         4
Uber Bear's Artifact Cube            3
Emma Partlow's Peasant Cube          2
Loial's Micro Cube                   2
Commander extravaganza!              1
Khans Expanded Cube                  1
Vehicle Cube: Eiganjo Drift          1
Name: details_name, dtype: int64
```

```python
top_cards = cleaned_data.groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False)
```

```python
cleaned_data[cleaned_data['details_set'] == 'woe'].groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False)
```

```text
details_name
Tough Cookie                  11
Syr Ginger, the Meal Ender     9
Questing Druid                 8
Restless Cottage               6
Mosswood Dreadknight           6
                              ..
The Princess Takes Flight      1
Feed the Cauldron              1
Witch's Mark                   1
Experimental Confectioner      1
Monstrous Rage                 1
Name: Cube Name, Length: 135, dtype: int64
```

```python
woe = cleaned_data[cleaned_data['details_set'] == 'woe'].groupby('details_name')['Cube Name'].nunique().sort_values(ascending=False)
```

```python
woe.to_clipboard(sep=',')
```