# **Set Analyzer**

The goal of this script is to statistically analyze the card sets of *Magic: The Gathering* in order to:


1.   Identify the most relevant cards in each set
2.   Compare card metadata across sets

The analysis aims to put an objective measure on the intrinsic value of a card *within* and *across* sets.

The intended application is to use this data to perform well in limited formats (*sealed*, *draft*).




# **Initialisation**

In [1]:
# Import libraries used in the code

import os
import json
import re
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# import seaborn as sns
from tqdm import tqdm

from set_analyzer import *

In [3]:
%load_ext autoreload
%autoreload 2

Load data from JSON (dataset from https://mtgjson.com/)

In [5]:
# Set path of the folder containing dataset
dataset_FolderPath = Path.cwd().parent / 'datasets' / 'MTG_datasets' # @dev TBC before each use

# Set path of the File
dataset_FileName = 'AllPrintings.json'
dataset_FilePath = dataset_FolderPath / dataset_FileName

In [7]:
# Load all datasets
data = pd.read_json(dataset_FilePath)
allSets = data.iloc[2:]['data'] # the first 2 rows of the JSON file are metadata
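
As a sketch of an alternative way to reach the same per-set table, the file can be read with the standard `json` module and only the `data` entry turned into a DataFrame. This assumes the top-level layout is a `meta` entry plus a `data` entry keyed by set code; the toy payload and the name `sets_df` below are made up for illustration.

```python
import json
import pandas as pd

# Tiny in-memory stand-in for AllPrintings.json (layout assumed, values made up)
raw_text = json.dumps({
    'meta': {'date': '2024-01-01', 'version': '0.0.0'},
    'data': {
        'AAA': {'code': 'AAA', 'name': 'Set A', 'baseSetSize': 2},
        'BBB': {'code': 'BBB', 'name': 'Set B', 'baseSetSize': 3},
    },
})

raw = json.loads(raw_text)

# One row per set code, one column per set-level field
sets_df = pd.DataFrame.from_dict(raw['data'], orient='index')
print(sets_df[['code', 'name', 'baseSetSize']])
```

With the real file, `raw` would come from `json.load` on an open handle to `dataset_FilePath`, which avoids having to skip metadata rows by position.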

In [9]:
setCompare = allSets.apply(pd.Series)[['baseSetSize', 'code', 'totalSetSize', 'type', 'name', 'releaseDate']]

## Load Limited dataset function

In [17]:
set_code = 'OTJ'
cards = loadLimitedSet(allSets, set_code)

191


Unnamed: 0,name,keywords,manaValue,manaCost,colorIdentity,power,toughness,rarity,types,text
228,Ruthless Lawbringer,,3.0,{1}{W}{B},"[B, W]",3.0,2.0,uncommon,[Creature],"When Ruthless Lawbringer enters, you may sacri..."
232,Slick Sequence,,2.0,{U}{R},"[R, U]",,,uncommon,[Instant],Slick Sequence deals 2 damage to any target. I...
234,"Vial Smasher, Gleeful Grenadier",,2.0,{B}{R},"[B, R]",3.0,2.0,uncommon,[Creature],"Whenever another outlaw you control enters, Vi..."
237,Wrangler of the Damned,[Flash],5.0,{3}{W}{U},"[U, W]",1.0,4.0,uncommon,[Creature],"Flash\nAt the beginning of your end step, if y..."
239,Bandit's Haul,,3.0,{3},[],,,uncommon,[Artifact],"Whenever you commit a crime, put a loot counte..."
240,Boom Box,,2.0,{2},[],,,uncommon,[Artifact],"{6}, {T}, Sacrifice Boom Box: Destroy up to on..."
241,Gold Pan,"[Equip, Treasure]",2.0,{2},[],,,common,[Artifact],"When Gold Pan enters, create a Treasure token...."
242,Lavaspur Boots,[Equip],1.0,{1},[],,,uncommon,[Artifact],Equipped creature gets +1/+0 and has haste and...
243,Luxurious Locomotive,"[Crew, Treasure]",5.0,{5},[],6.0,5.0,uncommon,[Artifact],"Whenever Luxurious Locomotive attacks, create ..."
244,Mobile Homestead,[Crew],2.0,{2},[],3.0,3.0,uncommon,[Artifact],Mobile Homestead has haste as long as you cont...


## 1) SPEED

Format speed can be characterized by:
- the ratio of creatures, graded as follows: 100-90% = S ; 90-70% = A ; 70-50% = B ; 50-30% = C ; 30-10% = D ; 10-0% = E
- the median creature `manaValue`
- the median `powerToManaValue`: above 1, creatures hit hard and fast
- the board state (see section 2)
- the number of interactions (see section 4)
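
The grading scale above can be sketched as a small lookup. The function name `gradeCreatureRatio` is hypothetical, and treating each lower bound as inclusive (so exactly 90% is an S) is an assumption, since the list does not say which grade a boundary value gets.

```python
def gradeCreatureRatio(ratio_percent):
    """Map a creature ratio (in percent) to a letter grade, S (highest) to E."""
    # Lower bound of each band, checked from highest to lowest.
    # Assumption: lower bounds are inclusive.
    bands = [(90, 'S'), (70, 'A'), (50, 'B'), (30, 'C'), (10, 'D')]
    for lower_bound, grade in bands:
        if ratio_percent >= lower_bound:
            return grade
    return 'E'

print(gradeCreatureRatio(52.7))  # B
print(gradeCreatureRatio(45.4))  # C
```

The result could then be stored next to `limitedCreatureRatio` in `setCompare`.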

In [46]:
# Filter for 'Creature' only
cardsCreatureFiltered = cards[cards['types'].apply(lambda x: 'Creature' in x)]
cardsCreatureFiltered = cardsCreatureFiltered.copy()

# Ratio of creatures
nTot = len(cards)
nCreature = len(cardsCreatureFiltered)
limitedCreatureRatio = (nCreature / nTot) * 100 # in percentage
# add a grade (@dev TBD)

# Creature Manavalue
meanCreatureMV = cardsCreatureFiltered['manaValue'].mean()
cardsCreatureFiltered['normalizedCreatureManaValue'] = cardsCreatureFiltered['manaValue'] - meanCreatureMV # normalized columns

# Creature Power to ManaValue
cardsCreatureFiltered['powerToManaValue'] = cardsCreatureFiltered['power'] / cardsCreatureFiltered['manaValue']
meanPowerToMV = cardsCreatureFiltered['powerToManaValue'].mean()
cardsCreatureFiltered['normalizedPowerToManaValue'] = cardsCreatureFiltered['powerToManaValue'] - meanPowerToMV # normalized columns
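
One caveat with the power-to-mana-value ratio: a creature with `manaValue` 0 makes the division return `inf`, which then makes the mean `inf` as well. A minimal sketch of one way around it, on made-up numbers, is to mask zero costs as `NaN` so that `mean()` skips them. Whether zero-cost creatures should be excluded or handled differently is a choice left open here.

```python
import numpy as np
import pandas as pd

# Toy values for illustration only
toy = pd.DataFrame({'power': [2.0, 3.0, 1.0], 'manaValue': [2.0, 0.0, 4.0]})

# Plain division: the zero-cost creature yields inf, so the mean is inf too
naive_ratio = toy['power'] / toy['manaValue']

# Masking zero costs as NaN lets mean() skip them
safe_ratio = toy['power'] / toy['manaValue'].replace(0, np.nan)

print(naive_ratio.mean())  # inf
print(safe_ratio.mean())   # 0.625
```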

In [48]:
# Add values to setCompare
setCompare = setCompare.copy()

setCompare.at[set_code, 'limited_CreatureRatio'] = limitedCreatureRatio
setCompare.at[set_code, 'limited_meanCreatureManaValue'] = meanCreatureMV
setCompare.at[set_code, 'limited_meanCreaturePowerToManaValue'] = meanPowerToMV

## 2) BOARD STATE

- the mean creature `power`
- the mean creature `toughness`
- the mean `powerToToughness` ratio: above 1, creatures tend to hit hard and defend poorly (and vice versa)
- the ratio of evasive creatures (i.e. 'Flying', 'Trample', 'Menace')
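
The evasive-creature ratio from the list above could be sketched as follows. The helper name `evasiveCreatureRatio` and the toy table are hypothetical. The sketch assumes `keywords` holds a list per card and `NaN` for cards without keywords, as in the sample output shown earlier.

```python
import numpy as np
import pandas as pd

# Toy creature table; the real input would be cardsCreatureFiltered
creatures = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'keywords': [['Flying'], np.nan, ['Flash', 'Menace'], ['Reach']],
})

def evasiveCreatureRatio(data, evasive_keywords=('Flying', 'Trample', 'Menace')):
    """Share of rows whose keyword list contains at least one evasive keyword."""
    def isEvasive(keywords):
        # Cards without keywords hold NaN instead of a list
        if not isinstance(keywords, list):
            return False
        return any(kw in evasive_keywords for kw in keywords)
    # The mean of a boolean Series is the fraction of True values
    return data['keywords'].apply(isEvasive).mean()

evasive_ratio = evasiveCreatureRatio(creatures)
print(evasive_ratio)  # 0.5
```

Counting creatures rather than keyword occurrences means a creature with both 'Flying' and 'Menace' is only counted once.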

In [75]:
# Creature Power
meanCreaturePower = cardsCreatureFiltered['power'].mean()
cardsCreatureFiltered['normalizedCreaturePower'] = cardsCreatureFiltered['power'] - meanCreaturePower # normalized columns

# Creature Toughness
meanCreatureToughness = cardsCreatureFiltered['toughness'].mean()
cardsCreatureFiltered['normalizedCreatureToughness'] = cardsCreatureFiltered['toughness'] - meanCreatureToughness # normalized columns

# Creature Power to Toughness
cardsCreatureFiltered['powerToToughness'] = cardsCreatureFiltered['power'] / cardsCreatureFiltered['toughness']
meanPowerToToughness = cardsCreatureFiltered['powerToToughness'].mean()
cardsCreatureFiltered['normalizedPowerToToughness'] = cardsCreatureFiltered['powerToToughness'] - meanPowerToToughness # normalized columns

# Evasion
def countKeywords(data):
    # Explode the per-card keyword lists into one row per keyword;
    # value_counts() ignores NaN, so cards without keywords are skipped
    exploded_data = data.explode('keywords')
    KW_count = exploded_data['keywords'].value_counts().to_dict()
    return KW_count

def countEvasiveKeywords(keyword_dict, keyword_list):
    # Use 0 for keywords that do not appear in the set (avoids a KeyError)
    evasiveCount = [keyword_dict.get(key, 0) for key in keyword_list]
    return evasiveCount

KWCount = countKeywords(cardsCreatureFiltered)

#evasiveKW = ['Flying', 'Trample', 'Menace']
#evasiveKWCount = countEvasiveKeywords(KWCount, evasiveKW)

In [77]:
# Add values to setCompare
setCompare = setCompare.copy()
setCompare.at[set_code, 'limited_meanCreaturePower'] = meanCreaturePower
setCompare.at[set_code, 'limited_meanCreatureToughness'] = meanCreatureToughness
setCompare.at[set_code, 'limited_meanCreaturePowerToToughness'] = meanPowerToToughness
# The column must already exist with object dtype before a list/dict is stored in a single cell
if 'limited_KWCount' not in setCompare.columns:
    setCompare['limited_KWCount'] = None
setCompare.at[set_code, 'limited_KWCount'] = [KWCount]
#setCompare.at[set_code, 'limited_evasiveKWCount'] = [evasiveKWCount]

## 3) FIXING

- monocolor-to-multicolor ratio (lands excluded), computed below as the share of multicolor cards among nonland cards
- multi-pip ratio: cards with more than one colored pip in their mana cost
- ratio of mana producers, by type (lands, mana dorks, mana rocks, treasures)
- type of mana produced (TBD)
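
As a sketch of the multi-pip idea, colored pips can be counted per mana symbol rather than per character, so that a hybrid symbol such as `{W/U}` counts once. The helper name `countColoredPips` is hypothetical, and counting a hybrid symbol as a single pip is an assumption about how such costs should be scored.

```python
import re

# Matches one mana symbol such as {W}, {2}, {X} or {W/U}
MANA_SYMBOL = re.compile(r'\{([^}]+)\}')
COLORS = set('WUBRG')

def countColoredPips(mana_cost):
    """Count mana symbols that contain at least one color letter."""
    if not isinstance(mana_cost, str):  # lands and other cards without a cost
        return 0
    symbols = MANA_SYMBOL.findall(mana_cost)
    # A symbol is colored if it shares at least one letter with WUBRG
    return sum(1 for symbol in symbols if COLORS & set(symbol))

print(countColoredPips('{1}{W}{B}'))   # 2
print(countColoredPips('{3}'))         # 0
print(countColoredPips('{W/U}{W/U}'))  # 2
```

A card would then be multi-pip when `countColoredPips(manaCost) > 1`.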

In [56]:
# Monocolor to multicolor ratio
non_land_cards_total = len(cards[(cards['types'].apply(lambda x: 'Land' not in x))])
multicolor_nonland_cards = len(
    cards[
        (cards['types'].apply(lambda x: 'Land' not in x)) 
        & (cards['colorIdentity'].apply(len) > 1)
    ])
monocolorToMulticolorRatio = multicolor_nonland_cards / non_land_cards_total

# Multi-pip ratio
def isMultiPip(s, letters_to_remove=None):
    if letters_to_remove is None:
        letters_to_remove = ['{', '}', 'C', 'X']  # Assign default list safely

    if not isinstance(s, str):  # Handle NaN or non-string values
        return False
    
    s = ''.join(c for c in s if c not in letters_to_remove and not c.isdigit())
    return len(s) > 1

multiPipRatio = len(cards[cards['manaCost'].apply(isMultiPip)]) / non_land_cards_total

# Mana producers
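def producesMana(s): #@dev TO BE TESTED FOR FETCH
    if not isinstance(s, str):  # Handle NaN or non-string values (cards without text)
        return False

    pattern1 = r'add (?:\d+|one|two|three|four|five) mana'
    match1 = re.search(pattern1, s, re.IGNORECASE)

    pattern2 = r'add \{'  # escape the brace so it is matched literally
    match2 = re.search(pattern2, s, re.IGNORECASE)
    
    return bool(match1 or match2)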

# Ratio of mana producers
n_manaProducer = len(cards[cards['text'].apply(producesMana)])
n_nonLand_manaProducer = len(
    cards[
        cards['text'].apply(producesMana) 
        & (cards['types'].apply(lambda x: 'Land' not in x))
    ])
manaProducerRatio = n_manaProducer / nTot
nonLand_manaProducerRatio = n_nonLand_manaProducer / nTot

# Type of producer
def countManaProducerTypes(data):
    df = data[data['text'].apply(producesMana)]

    # Non-basic Lands
    a = len(df[
            df['types'].apply(lambda x: 'Land' in str(x))
            ]) 
    # Dorks (creatures that do not produce treasures)
    b = len(df[
            df['types'].apply(lambda x: 'Creature' in str(x)) 
            & df['keywords'].apply(lambda x: 'Treasure' not in str(x))
            ]) 
    # Rocks (artifacts that are not creatures and do not produce treasures)
    c = len(df[
            df['types'].apply(lambda x: 'Artifact' in str(x)) 
            & df['types'].apply(lambda x: 'Creature' not in str(x)) 
            & df['keywords'].apply(lambda x: 'Treasure' not in str(x))
            ]) 
    # Treasures
    d = len(df[
            df['keywords'].apply(lambda x: 'Treasure' in str(x))
            ])

    # @dev, here does not account for any other type of mana production (ie. Dark Ritual)

    producer_type = {
        'Lands': a,
        'Dorks': b,
        'Rocks': c,
        'Treasures': d
    }
    return producer_type

manaProducerTypes = countManaProducerTypes(cards)

# Type of mana produced
# @dev, TBD in the future

In [57]:
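setCompare.at[set_code, 'limited_MonoToMulticolorRatio'] = monocolorToMulticolorRatio
setCompare.at[set_code, 'limited_MultiPipRatio'] = multiPipRatio
setCompare.at[set_code, 'limited_manaProducerRatio'] = manaProducerRatio
setCompare.at[set_code, 'limited_nonLand_manaProducerRatio'] = nonLand_manaProducerRatio
# The column must already exist with object dtype before a list/dict is stored in a single cell
if 'limited_manaProducerTypes' not in setCompare.columns:
    setCompare['limited_manaProducerTypes'] = None
setCompare.at[set_code, 'limited_manaProducerTypes'] = [manaProducerTypes]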

## 4) Interactions (TBD)

How interactive is the set?
- first, define what counts as an interaction
- ratio of permanents
- percentage of interactive cards
- the "speed" of interaction = mana value distribution of interactive spells
- type of interaction: single-target removal + combat trick
- color pie
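
The interaction percentage and the "speed" of interaction could be sketched as below. The working definition (a card is interactive if its text contains one of a few words) is an assumption borrowed from the draft word list further down. The toy texts are made up and are not real card texts.

```python
import re
import pandas as pd

# Toy data for illustration only
toy_cards = pd.DataFrame({
    'name': ['Removal', 'Trick', 'Vanilla', 'Counterspell'],
    'manaValue': [2.0, 1.0, 3.0, 2.0],
    'text': ['Destroy target creature.',
             'Target creature gets +2/+2 until end of turn.',
             None,
             'Counter target spell.'],
})

# Assumed working definition of an interactive card (whole-word, case-insensitive match)
INTERACTION_PATTERN = re.compile(r'\b(destroy|exile|counter|target)\b', re.IGNORECASE)

def isInteractive(text):
    # Cards without rules text are never interactive
    return isinstance(text, str) and bool(INTERACTION_PATTERN.search(text))

interactive = toy_cards[toy_cards['text'].apply(isInteractive)]
interaction_ratio = len(interactive) / len(toy_cards)

# Number of interactive cards at each mana value ("speed" of interaction)
interaction_speed = interactive['manaValue'].value_counts().sort_index()

print(interaction_ratio)            # 0.75
print(interaction_speed.to_dict())  # {1.0: 1, 2.0: 2}
```

A keyword match like this is crude. For instance, "counter" also matches text about +1/+1 counters, so the word list would need refining once the definition of an interaction is settled.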

In [60]:
"""
# ratio of permanents

def checklist(items_wanted, items_tbc):
  return any(item in items_wanted for item in items_tbc)

permanent_index = [item for item in type_index if (item !='Instant' and item!='Sorcery')]
cards['types'][cards['types'].apply(lambda x: checklist(x,permanent_index)==False)] #non-permanent
cards['types'][cards['types'].apply(lambda x: checklist(x,permanent_index)==True)]  #permanents

print('permanent ratio = ' + str(len(cards['types'][cards['types'].apply(lambda x: checklist(x,permanent_index)==True)])/len(cards)*100) + ' %')
"""


In [62]:
"""
# get interactive cards

interaction_list = [
    'destroy',
    'exile',
    'counter',
    'target'
]

def interactive_card(str):
  if any(word in str for word in interaction_list):
    return True
  else:
    return False

cards[cards['text'].apply(interactive_card)]
"""


# **Interset**

In [66]:
# all interset stats

# take as input the number and time range of the sets to compare (all of Modern, last 4 sets, etc.)
# write an input line

# call the previous functions
# store results in lists / DataFrames to run the statistics afterwards
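
The plan in the comments above could be sketched as a loop that computes one dict of statistics per set and stacks them into a DataFrame. The helper `computeCreatureStats` and the toy tables are hypothetical stand-ins. In the notebook, each table would come from `loadLimitedSet(allSets, code)`, and the helper would wrap the per-section computations.

```python
import pandas as pd

def computeCreatureStats(cards):
    """Hypothetical helper: summarize one set's card table as a dict of stats."""
    creatures = cards[cards['types'].apply(lambda types: 'Creature' in types)]
    return {
        'limited_CreatureRatio': len(creatures) / len(cards) * 100,
        'limited_meanCreatureManaValue': creatures['manaValue'].mean(),
    }

# Toy stand-ins for the per-set card tables (set codes are made up)
toy_sets = {
    'AAA': pd.DataFrame({'types': [['Creature'], ['Instant']], 'manaValue': [2.0, 1.0]}),
    'BBB': pd.DataFrame({'types': [['Creature'], ['Creature']], 'manaValue': [3.0, 5.0]}),
}

# One row per set code, one column per statistic
intersetStats = pd.DataFrame.from_dict(
    {code: computeCreatureStats(cards) for code, cards in toy_sets.items()},
    orient='index',
)
print(intersetStats)
```

Building the whole table in one pass avoids filling `setCompare` cell by cell and makes it straightforward to restrict the comparison to a chosen list of set codes.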

In [73]:
setCompare.loc[['OTJ', 'MH3']]

Unnamed: 0,baseSetSize,code,totalSetSize,type,name,releaseDate,limited_CreatureRatio,limited_medianCreatureManaValue,limited_medianCreaturePowerToManaValue,limited_medianCreaturePower,limited_medianCreatureToughness,limited_medianCreaturePowerToToughness,limited_KWCount,limited_MonoToMulticolorRatio,limited_MultiPipRatio,limited_manaProducerRatio,limited_nonLand_manaProducerRatio,limited_manaProducerTypes
OTJ,286,OTJ,374,expansion,Outlaws of Thunder Junction,2024-04-19,52.736318,3.103774,0.826774,2.509434,2.783019,1.03068,"[{'Plot': 17, 'Flying': 16, 'Saddle': 10, 'Rea...",0.113636,0.181818,0.18408,0.059701,"[{'Lands': 25, 'Dorks': 4, 'Rocks': 1, 'Treasu..."
MH3,303,MH3,560,draft_innovation,Modern Horizons 3,2024-06-14,45.410628,3.404255,0.758589,2.537634,2.677419,1.083492,"[{'Devoid': 17, 'Flying': 14, 'Adapt': 9, 'Bes...",0.182857,0.308571,0.256039,0.101449,"[{'Lands': 32, 'Dorks': 14, 'Rocks': 2, 'Treas..."
