# Recommending Cards for Magic: The Gathering Deckbuilding

*Goal: input a card name and find other cards that often appear together with that card, and use this system to generate a deck.*

This program uses a file derived from ~10,000 Modern decks made by users on deckstats.net. For every card 'x', a count was kept of every other card 'y' that occurred in the same deck as 'x'. Then, all counts were divided by the total number of decks containing card 'x': this gives the fraction of decks containing card 'x' that also contain card 'y'. The resulting JSON file is a dictionary containing an entry for every card; each card's entry is a dictionary of card co-occurrences.
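The counting step described above can be sketched roughly as follows. This is not the script that produced the dataset: `build_cooccurrence` and the toy decklists are hypothetical, and the sketch keeps plain floats rather than whatever formatting the real JSON file uses.

```python
from collections import Counter, defaultdict

def build_cooccurrence(decks):
    """Return {card_x: {card_y: fraction of decks with x that also contain y}}."""
    deck_counts = Counter()            # number of decks containing each card
    pair_counts = defaultdict(Counter) # pair_counts[x][y] = decks containing both x and y
    for deck in decks:
        unique_cards = set(deck)       # count each card once per deck
        for x in unique_cards:
            deck_counts[x] += 1
            for y in unique_cards:     # includes y == x, so every card co-occurs with itself
                pair_counts[x][y] += 1
    return {
        x: {y: count / deck_counts[x] for y, count in ys.items()}
        for x, ys in pair_counts.items()
    }

# Toy decklists (assumption: each deck is just a list of card names)
toy_decks = [
    ["Lightning Bolt", "Lava Spike", "Mountain"],
    ["Lightning Bolt", "Mountain"],
    ["Soul Warden", "Plains"],
]
toy_freqs = build_cooccurrence(toy_decks)
print(toy_freqs["Lightning Bolt"]["Lava Spike"])  # 0.5: one of the two Lightning Bolt decks
```

Note that the relation is not symmetric: in the toy data, every deck with "Lava Spike" contains "Lightning Bolt", but only half of the "Lightning Bolt" decks contain "Lava Spike".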

Note: this dataset was collected on December 27, 2021, so cards from sets released after this date are not recognized. Also, rarely used cards might register as invalid because they did not occur in the dataset. Card names are case-sensitive.
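Since names are case-sensitive and some cards are missing, a small lookup helper can save some frustration. The sketch below is an assumption about how one might do it, not part of the original notebook: `resolve_card_name` is a hypothetical helper, and `sample` stands in for the real frequency dictionary.

```python
def resolve_card_name(name, freq_dict):
    """Return the dataset's spelling of `name`, or None if the card is absent."""
    if name in freq_dict:
        return name
    # Build a case-insensitive index of the known card names
    lowered = {key.casefold(): key for key in freq_dict}
    return lowered.get(name.strip().casefold())

# Stand-in for the real frequency dictionary
sample = {"Lightning Bolt": {}, "Urza's Mine": {}}
print(resolve_card_name("lightning bolt", sample))  # Lightning Bolt
print(resolve_card_name("Black Lotus", sample))     # None
```

Checking for `None` before looking up a card avoids a `KeyError` on names that never appeared in the dataset.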

Thanks for checking out this project!

## Setup

First, load the JSON file of card co-occurrences into a dictionary. We will use the 'printFreqs' function to show card co-occurrences.

In [1]:
import json
import sys
stdout = sys.stdout

In [2]:
#load frequency dict for card co-occurrences
with open('fixedCardFreqs_deckstatsModern.json', 'r', encoding="utf-8") as freqsFile:
    freqDict = json.load(freqsFile)
    
#for filtering out lands when needed
with open('allLands.json', 'r', encoding='utf-8') as landFile:
    landFilter = json.load(landFile)

sys.stdout = stdout #tries to fix a weird printing bug: https://stackoverflow.com/a/65185107/14968857

In [3]:
#prints the n cards that most frequently occur in decks containing the given card
def printFreqs(str_cardName, n):
    #sort items by value in the entry for cardName and get the top n
    example_freqs1 = freqDict[str_cardName]
    example_closestCards1 = sorted(example_freqs1.items(), key=lambda x: x[1], reverse=True)
    example_results1 = example_closestCards1[:n]

    #print the results
    print("These cards occur most frequently with \"" + str_cardName + "\":")
    for card in example_results1:
        print(card[0] + " : " + card[1])

## Example 1: Urza's Mine

As an example, we will find the 20 cards that most commonly occur with the card "**Urza's Mine**." 

You will see that 99.7% of decks with "**Urza's Mine**" also contain "**Urza's Power Plant**" and "**Urza's Tower**", since the three are almost always played together. "**Expedition Map**" appears in 93% of these decks, "**Karn, the Great Creator**" in 54%, and so on.

In [4]:
#Example 1: find the cards that most frequently occur with "Urza's Mine"
printFreqs("Urza's Mine", 20)

These cards occur most frequently with "Urza's Mine":
Urza's Mine : 1.0000
Urza's Power Plant : 0.9969
Urza's Tower : 0.9969
Expedition Map : 0.9317
Karn, the Great Creator : 0.5435
Walking Ballista : 0.4348
Sylvan Scrying : 0.4161
Ancient Stirrings : 0.4068
Chromatic Star : 0.4068
Chromatic Sphere : 0.4006
Wurmcoil Engine : 0.3913
Ugin, the Ineffable : 0.3634
Ugin, the Spirit Dragon : 0.3571
Forest : 0.3509
Karn Liberated : 0.3385
Thought-Knot Seer : 0.3261
Ulamog, the Ceaseless Hunger : 0.2795
Eldrazi Temple : 0.2702
Oblivion Stone : 0.2640
Reality Smasher : 0.2640


## Example 2: Monastery Swiftspear

This shows the 20 cards that most commonly occur with "**Monastery Swiftspear**." Notice that a semi-coherent burn deck could be made just by sampling from the top results.

In [5]:
#Example 2: find the cards that most frequently occur with "Monastery Swiftspear" 
printFreqs("Monastery Swiftspear", 20)

These cards occur most frequently with "Monastery Swiftspear":
Monastery Swiftspear : 1.0000
Lightning Bolt : 0.9617
Mountain : 0.8445
Lava Spike : 0.4809
Soul-Scar Mage : 0.4474
Light Up the Stage : 0.4187
Lava Dart : 0.3828
Rift Bolt : 0.3684
Bloodstained Mire : 0.3660
Manamorphose : 0.3517
Sunbaked Canyon : 0.3445
Skewer the Critics : 0.3397
Wooded Foothills : 0.3014
Mutagenic Growth : 0.2775
Goblin Guide : 0.2727
Eidolon of the Great Revel : 0.2679
Fiery Islet : 0.2488
Bedlam Reveler : 0.2416
Seal of Fire : 0.2249
Sacred Foundry : 0.2249


## Limits of the Dataset

Here we try to find the 20 most commonly co-occurring cards with "**Elite Vanguard**," but since there was only one (mediocre) deck in the dataset that contained the input card, the output is weak and unhelpful.


In [6]:
#Example 3: find cards that most frequently occur with "Elite Vanguard". The dataset contains only one occurrence of this card. 
printFreqs("Elite Vanguard", 20)

These cards occur most frequently with "Elite Vanguard":
Angel's Feather : 1.0000
Angelic Wall : 1.0000
Elite Vanguard : 1.0000
Gideon's Avenger : 1.0000
Gideon's Lawkeeper : 1.0000
Plains : 1.0000
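
One way to guard against thin data like this would be to require a minimum number of decks before trusting a card's co-occurrences. The published JSON file stores only the fractions, so the sketch below assumes a hypothetical `deck_counts` mapping (card name to number of decks containing it) saved while building the dataset; `split_by_support` and the threshold of 5 are also illustrative choices.

```python
def split_by_support(cards, deck_counts, min_decks=5):
    """Split cards into (well_supported, thin) by how many decks contain them."""
    well_supported, thin = [], []
    for card in cards:
        if deck_counts.get(card, 0) >= min_decks:
            well_supported.append(card)
        else:
            thin.append(card)
    return well_supported, thin

# Made-up counts for illustration; only the single "Elite Vanguard" deck comes from the text
toy_counts = {"Elite Vanguard": 1, "Monastery Swiftspear": 400}
reliable, thin = split_by_support(["Elite Vanguard", "Monastery Swiftspear"], toy_counts)
print(thin)  # ['Elite Vanguard']
```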


## Simple Deck Generating Algorithm

Here is a very simplified algorithm to construct a coherent deck given some starting cards. First, we put all cards that co-occur with the starting cards in a sorted list and add 4x of the top card to the final deck. Then, we add all of the newly added card's co-occurring cards to the list, sort it again, and add 4x of the new top card to the final deck. Repeat until the deck holds 36 cards (nine different cards, four copies each), then add 24 lands.

More nuance could be added later, like distinguishing between lands and non-lands, matching colors, using randomness in the selection process for variety, implementing a mana curve, prioritizing cards that co-occur with multiple cards in the deck, etc., but for now we keep it simple.
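As an illustration of one refinement from that list, prioritizing cards that co-occur with several cards already in the deck, candidates could be ranked by their summed co-occurrence instead of their single highest value. This is only a sketch under assumptions: `rank_by_total_cooccurrence` is a hypothetical helper that is not used in the rest of this notebook, and the toy dictionary uses placeholder card names.

```python
def rank_by_total_cooccurrence(deck, freq_dict, exclude=()):
    """Rank candidate cards by the sum of their co-occurrence with every card in `deck`."""
    totals = {}
    for card in deck:
        for other, freq in freq_dict.get(card, {}).items():
            if other in deck or other in exclude:
                continue  # skip cards already in the deck and filtered cards (e.g. lands)
            totals[other] = totals.get(other, 0.0) + float(freq)
    # Highest total first; break ties alphabetically so the order is deterministic
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

# Toy data with placeholder card names
toy = {
    "A": {"A": "1.0000", "B": "0.9000", "C": "0.6000", "D": "0.5000"},
    "B": {"B": "1.0000", "A": "0.8000", "C": "0.1000", "D": "0.5000"},
}
ranking = rank_by_total_cooccurrence(["A", "B"], toy)
print(ranking[0][0])  # D
```

In the toy data, "C" has the highest single value (0.6 with "A") but "D" wins overall because it co-occurs reasonably often with both deck cards.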

Let's try to generate a lifegain-themed deck, starting with **"Ajani's Pridemate"**:

Before we generate the deck, let's see what cards co-occur with **"Ajani's Pridemate"**: some of these (but not all) will appear in the final deck.

In [7]:
printFreqs("Ajani's Pridemate", 20)

These cards occur most frequently with "Ajani's Pridemate":
Ajani's Pridemate : 1.0000
Plains : 0.9531
Soul Warden : 0.5469
Soul's Attendant : 0.3828
Path to Exile : 0.3281
Swamp : 0.2656
Serra Ascendant : 0.2500
Heliod, Sun-Crowned : 0.2109
Healer's Hawk : 0.1953
Ajani's Welcome : 0.1797
Speaker of the Heavens : 0.1719
Martyr of Sands : 0.1719
Bloodthirsty Aerialist : 0.1719
Ajani, Strength of the Pride : 0.1484
Regal Caracal : 0.1484
Scoured Barrens : 0.1406
Honor of the Pure : 0.1406
Spectral Procession : 0.1328
Linden, the Steadfast Queen : 0.1250
Daxos, Blessed by the Sun : 0.1172


Now let's generate the deck. 

Remember, the card pool changes each time we add a card to the deck. The output shows each card picked, along with its co-occurrence value and the "origin card" that produced the pick.

In [8]:
#Fills the deck with the best co-occurring nonland cards until it holds n different cards,
#starting with the given initial list of card names
def completeDeck(inputDeckList, n):
    #start off by putting the given cards in the decklist (copy it so the caller's list is not modified)
    deckList = list(inputDeckList)
    
    cardPoolDict = {}
    
    #keep adding cards until you reach the desired amount
    while len(deckList) < n:
        
        #Combine the co-occurrence dicts of each card in the current decklist, keeping highest values, then sort it.
        for card in deckList:
            cardPoolDict = mergeDictsKeepHighest(cardPoolDict, convertToDictOfTuples(freqDict[card], card))
        closestCards = sorted(cardPoolDict.items(), key=lambda x: x[1], reverse=True)    
        
        #Add the top card in the cardpool to our decklist (If it's already in the deck, just pick the next one)
        for pick in closestCards:
            if pick[0] not in deckList and pick[0] not in landFilter: #filter out lands; add additional/better filters here.
                deckList.append(pick[0])
                print("Picked " + pick[0] + " : " + str(pick[1]))
                #cycle back after we pick one card so that this card choice can influence the next one
                break
        else:
            #no eligible card is left in the pool, so stop instead of looping forever
            break
    
    #print the final deck            
    print("\nFINAL DECK: ")
    for cardName in deckList: 
        print("4 " + cardName)
    print("24 Lands") #too lazy to check colors    
    
    #return the final list
    return deckList


#Helper: to track the 'origin card' responsible for our picks, we need to add this card to each entry in our co-occ. freq dict
# The result is a dict of the format {str_cardName : (flt_freq, str_originCardName)}
def convertToDictOfTuples(fDict, str_originCard):
    return {key : (fDict[key], str_originCard) for key in fDict}


#Helper: merges our two dictionaries, keeping highest values when keys are the same
def mergeDictsKeepHighest(dict1, dict2):
    return {k: dict1[k] if float(dict1.get(k, (0.0, "NONE"))[0]) >= float(dict2.get(k, (0.0, "NONE"))[0]) else dict2[k] 
            for k in set(dict1) | set(dict2)}


#generate and print the final deck
lifegainDeck = completeDeck(["Ajani's Pridemate"], 9)

Picked Soul Warden : ('0.5469', "Ajani's Pridemate")
Picked Soul's Attendant : ('0.6250', 'Soul Warden')
Picked Path to Exile : ('0.5495', "Soul's Attendant")
Picked Serra Ascendant : ('0.3736', "Soul's Attendant")
Picked Martyr of Sands : ('0.6143', 'Serra Ascendant')
Picked Squadron Hawk : ('0.5769', 'Martyr of Sands')
Picked Speaker of the Heavens : ('0.4808', 'Martyr of Sands')
Picked Spectral Procession : ('0.3617', 'Squadron Hawk')

FINAL DECK: 
4 Ajani's Pridemate
4 Soul Warden
4 Soul's Attendant
4 Path to Exile
4 Serra Ascendant
4 Martyr of Sands
4 Squadron Hawk
4 Speaker of the Heavens
4 Spectral Procession
24 Lands


Not a bad deck at all!

Observe how each card we add influences the next picks: first, '**Soul Warden**' was chosen for its co-occurrence with '**Ajani's Pridemate**'; then '**Soul's Attendant**' was chosen for its co-occurrence with '**Soul Warden**', and so on.

Notice that the first 4 picks are all near the top of the original co-occurrences with '**Ajani's Pridemate**' (**Soul Warden**, **Soul's Attendant**, **Path to Exile**, and **Serra Ascendant**) but the subsequent picks are further down on this list, or not on it at all. We only picked '**Squadron Hawk**' and '**Speaker of the Heavens**' because of their high co-occurrence with '**Martyr of Sands**'.

Because every newly added card contributes its own co-occurrences to the pool, the deck picks up synergies among all of its cards, not just with the input cards.

#### Some More Decks (for fun):

In [9]:
#Burn
burnDeck = completeDeck(["Lava Spike"], 9)

Picked Lightning Bolt : ('0.9917', 'Lava Spike')
Picked Monastery Swiftspear : ('0.8340', 'Lava Spike')
Picked Rift Bolt : ('0.6556', 'Lava Spike')
Picked Skewer the Critics : ('0.7717', 'Rift Bolt')
Picked Eidolon of the Great Revel : ('0.6250', 'Rift Bolt')
Picked Goblin Guide : ('0.7222', 'Eidolon of the Great Revel')
Picked Boros Charm : ('0.5952', 'Eidolon of the Great Revel')
Picked Lightning Helix : ('0.6382', 'Boros Charm')

FINAL DECK: 
4 Lava Spike
4 Lightning Bolt
4 Monastery Swiftspear
4 Rift Bolt
4 Skewer the Critics
4 Eidolon of the Great Revel
4 Goblin Guide
4 Boros Charm
4 Lightning Helix
24 Lands


In [10]:
#Treefolk
treefolkDeck = completeDeck(["Doran, the Siege Tower"], 9)

Picked Treefolk Harbinger : ('0.7619', 'Doran, the Siege Tower')
Picked Timber Protector : ('0.7857', 'Treefolk Harbinger')
Picked Leaf-Crowned Elder : ('1.0000', 'Timber Protector')
Picked Dauntless Dourbark : ('0.9091', 'Timber Protector')
Picked Bosk Banneret : ('0.9091', 'Timber Protector')
Picked Dungrove Elder : ('0.7619', 'Dauntless Dourbark')
Picked Path to Exile : ('0.3810', 'Doran, the Siege Tower')
Picked Assault Formation : ('0.3810', 'Doran, the Siege Tower')

FINAL DECK: 
4 Doran, the Siege Tower
4 Treefolk Harbinger
4 Timber Protector
4 Leaf-Crowned Elder
4 Dauntless Dourbark
4 Bosk Banneret
4 Dungrove Elder
4 Path to Exile
4 Assault Formation
24 Lands


In [12]:
#Storm (not the best but still okay)
stormDeck = completeDeck(["Grapeshot"], 9)

Picked Manamorphose : ('0.8254', 'Grapeshot')
Picked Pyretic Ritual : ('0.7778', 'Grapeshot')
Picked Desperate Ritual : ('0.8257', 'Pyretic Ritual')
Picked Serum Visions : ('0.7460', 'Grapeshot')
Picked Past in Flames : ('0.7302', 'Grapeshot')
Picked Goblin Electromancer : ('0.7857', 'Past in Flames')
Picked Gifts Ungiven : ('0.7679', 'Past in Flames')
Picked Opt : ('0.7679', 'Past in Flames')

FINAL DECK: 
4 Grapeshot
4 Manamorphose
4 Pyretic Ritual
4 Desperate Ritual
4 Serum Visions
4 Past in Flames
4 Goblin Electromancer
4 Gifts Ungiven
4 Opt
24 Lands


## Thanks for Viewing!