## Generate Data (maximize win percentage)

#### lookup table

action|card|CMC|instasorcery|prowess_count|power|haste|damage|impulse|draw|scry|

Spectacle cards each map to two different actions (one for the normal cost and one for the spectacle cost).
All other cards map one-to-one to actions.
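
The card-to-action mapping can be sketched as below. This is a minimal illustration that assumes the `spec_` prefix and the two spectacle card names used in the code cell further down; the helper name `action_for` is hypothetical.

```python
# Cards whose action depends on whether spectacle is active this turn.
SPECTACLE_CARDS = {"light_up_the_stage", "skewer_the_critics"}


def action_for(card, spectacle_active):
    """Return the action name a card resolves to (hypothetical helper)."""
    if spectacle_active and card in SPECTACLE_CARDS:
        return "spec_" + card  # second action: the spectacle-cost version
    return card                # every other card is its own action


print(action_for("skewer_the_critics", True))   # spec_skewer_the_critics
print(action_for("skewer_the_critics", False))  # skewer_the_critics
print(action_for("lightning_bolt", True))       # lightning_bolt
```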

~~follow a scry strategy of:~~
* ~~if land, bottom if total lands in play and in hand >= 4~~
* ~~if non-land, bottom if total lands <= 3~~

Actually, bottoming a land when total lands >= 3, or a nonland when total lands <= 2, lowers the number of turns needed to win.
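
The revised rule can be written as a single predicate. This is only a sketch using the thresholds stated above; the function name `should_bottom` and its parameters are hypothetical, and "total lands" is taken to mean lands in play plus lands in hand, as in the struck-through version.

```python
def should_bottom(top_card, lands_in_hand, lands_in_play):
    """Return True if the scried card should go to the bottom of the library."""
    total_lands = lands_in_hand + lands_in_play
    if top_card == "land":
        return total_lands >= 3  # already have enough lands
    return total_lands <= 2      # still short on lands, so dig for one


print(should_bottom("land", 1, 2))            # True
print(should_bottom("lightning_bolt", 1, 2))  # False
print(should_bottom("lightning_bolt", 1, 1))  # True
```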

multi-armed bandit weights def
|action|weight|

contextual bandit weights def
|action|turn|weight|

Because the number of states (turns) is theoretically unbounded in the contextual version, setdefault() is needed when reading from the weights table so that unseen action-turn pairs get a default value. Contextual bandits are a middle ground between n-armed bandits and full Q-learning. There are "states" that indicate context, but there is no explicit transition between states contingent upon actions. In MTG, turns pass whether cards are played or not, but the turn number gives information about how useful a card is expected to be (assuming the game is otherwise played normally).
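
A minimal sketch of reading the contextual table, assuming weights are stored in a dict keyed by `(action, turn)` tuples. The function name `rank_actions` and the default weight of 0.0 are assumptions made for illustration.

```python
def rank_actions(actions, turn, weights, default=0.0):
    """Sort candidate actions by their learned weight for this turn, best first."""
    # setdefault() both reads the weight and registers unseen (action, turn) pairs
    return sorted(actions, key=lambda a: weights.setdefault((a, turn), default), reverse=True)


weights = {("lightning_bolt", 1): 5.0, ("monastery_swiftspear", 1): 12.0}
candidates = ["lightning_bolt", "monastery_swiftspear", "soulscar_mage"]

print(rank_actions(candidates, 1, weights))
# ['monastery_swiftspear', 'lightning_bolt', 'soulscar_mage']
print(("soulscar_mage", 1) in weights)  # True: the unseen pair was added with weight 0.0
```

Because the table grows on read, a turn number that has never been seen before needs no special handling.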

The current multi-armed bandit can be converted to a contextual bandit by also recording turn_count when appending to the action log (as "{},{}".format(selected_action, turn_count)). A bunch of setdefault() calls will be needed when retrieving utility estimates from action_weights, so that new state-action pairs can be added mid-execution. It will probably also need some modification so that the utility values of state-action pairs can be matched to the defined actions (in enumerate_actions()?).
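
The conversion can be sketched as follows. This version assumes tuple keys `(action, turn)` rather than a formatted string, and reuses the `100 * (1 - turn_count / 5)` update from the code cell below; the names `log_action` and `update_contextual_weights` are hypothetical.

```python
def log_action(action_log, selected_action, turn_count):
    """Record the chosen action together with the turn it was played on."""
    action_log.append((selected_action, turn_count))


def update_contextual_weights(action_weights, action_log, turn_count):
    """Shift every logged (action, turn) pair by a reward based on game length."""
    update_value = 100 * (1 - turn_count / 5)  # positive only for wins before turn 5
    for key in action_log:
        # setdefault() lets pairs first seen in this game enter the table
        action_weights[key] = action_weights.setdefault(key, 0.0) + update_value


action_weights = {}
action_log = []
log_action(action_log, "monastery_swiftspear", 1)
log_action(action_log, "lightning_bolt", 2)
update_contextual_weights(action_weights, action_log, turn_count=10)
print(action_weights)
# {('monastery_swiftspear', 1): -100.0, ('lightning_bolt', 2): -100.0}
```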


In [18]:
#TODO
#rewrite mulligan code
#add blocker simulation
#better learning rule

#The objectives of this project are two-fold. First, create a system that can estimate
#the efficacy of a Prowess decklist. Second, as a necessity of the first task, create an
#AI system that can learn how to play a Prowess decklist reasonably well. This also has
#benefit of informing strategy (bearing in mind the AI's limitations).

import random, os

if "weights.csv" not in os.listdir():
    fh = open("weights.csv", "w")
    fh.close()

class goldfish:
    def __init__(self, player_deck, action_defs, action_weights):
        self.action_defs = action_defs
        self.player_deck = player_deck[:]
        random.shuffle(self.player_deck)
        self.player_hand = self.player_deck[:7]

        #it's an okay mulligan strategy, I guess
        if self.player_hand.count("land") < 2 or self.player_hand.count("land") > 4:
            random.shuffle(self.player_deck)
            self.player_hand = self.player_deck[:6]
            self.player_deck = self.player_deck[6:]
        else:
            self.player_deck = self.player_deck[7:]
            
        self.opp_life = 20
        self.lands_in_play = 0
        self.lands_available = 0
        self.spectacle = False
        self.prowess_count = 0
        self.instasorcery_count = 0
        self.extra_hand = impulse_hand()
        self.action_log = []
        self.action_weights = action_weights
        self.sick_creatures = []
        self.nonsick_creatures = []
        self.prowess_creatures = ["bedlam_reveler", "monastery_swiftspear", "soulscar_mage"]
        self.wizard_creatures = ["soulscar_mage", "viashino_pyromancer", "ghitu_lavarunner", "young_pyromancer"]
        self.spectacle_cards = ["light_up_the_stage", "skewer_the_critics"]
        self.kfk_buffer = impulse_hand()
        self.kfk_counters = 0
        self.bauble_count = 0
        self.graveyard_types = set()

    def simulate_game(self, t=1):
        turn_count = 1

        while True:
            #player turn

            #untap
            self.lands_available = self.lands_in_play
            self.nonsick_creatures += self.sick_creatures
            self.sick_creatures = []
            
            #mishra's bauble processing
            for i in range(self.bauble_count):
                try:
                    self.draw_card()
                except:
                    return 10000, self.action_log
            self.bauble_count = 0
            
            #process kumano faces kakkazan stage change
            self.kfk_buffer.do_time()
            for kfk in self.kfk_buffer.cards:
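                if kfk[1] == 2:
                    #chapter II: the next creature cast this turn gets a +1/+1 counter
                    self.kfk_counters += 1
                elif kfk[1] == 1:
                    #chapter III: the saga becomes a creature
                    self.nonsick_creatures.append({"name":kfk[0], "base":self.action_defs[kfk[0]]["power"], "temp":0})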
            
            #draw
            if turn_count != 1:
                try:
                    self.draw_card()
                except:
                    return 10000, self.action_log
                #self.player_hand.append(self.player_deck[0])
                #del self.player_deck[0]
            
            #play land
            if "land" in self.extra_hand.get_cards():
                self.extra_hand.remove_card("land")
                self.lands_in_play += 1
                self.lands_available += 1
            elif "land" in self.player_hand:
                del self.player_hand[self.player_hand.index("land")]
                self.lands_in_play += 1
                self.lands_available += 1
            
            #play other cards (if applicable)   
            #enumerate options
            
            while True:
                sorted_actions = self.enumerate_actions(turn_count)
                if len(sorted_actions) < 1:
                    break

                #simulated annealing-esque approach to reinforcement learning
                #simple but slow
                if random.random() < (1 / t):
                    index = random.randrange(len(sorted_actions))
                    selected_action = sorted_actions[index] #pick random action
                else:
                    selected_action = sorted_actions[0]     #pick best action

                self.action_log.append((selected_action, turn_count))

                #process selected action; do_action only returns a result if the deck ran out
                result = self.do_action(selected_action)
                if result is not None:
                    return result
                if self.opp_life < 1:
                    return turn_count, self.action_log

            #attack, block, calculate damage
            #use only non-sick creatures

            #check if ghitu_lavarunner effect should be working after already being played
            sick_buffer = []
            for i in range(len(self.sick_creatures)):
                
                #move ghitu_lavarunner to nonsick_creatures
                if self.sick_creatures[i]["name"] == "ghitu_lavarunner" and self.instasorcery_count > 1:
                    self.nonsick_creatures.append(self.sick_creatures[i])
                    
                #move all other creatures to buffer
                else:
                    sick_buffer.append(self.sick_creatures[i])
            self.sick_creatures = sick_buffer
            
            #block one creature, assume opponent will trade using 3/2 blocker
            if turn_count >= 3:
                for i in range(len(self.nonsick_creatures)):
                    creature = self.nonsick_creatures[i]
                    creature_pow = creature["base"] + self.prowess_count if creature["name"] in self.prowess_creatures else creature["base"]
                    if creature_pow <= 2:
                        del self.nonsick_creatures[i]
                        break
            
            total_damage = 0
            
            for creature in self.nonsick_creatures:
                total_damage += creature["base"] + creature["temp"]
                total_damage += self.prowess_count if creature["name"] in self.prowess_creatures else 0
                total_damage += 1 if (creature["name"] == "ghitu_lavarunner" and self.instasorcery_count > 1) else 0
            
            self.opp_life -= total_damage
            
            if self.opp_life < 1:
                return turn_count, self.action_log

            #cleanup
            self.prowess_count = 0
            self.spectacle = False
            self.extra_hand.do_time()
            for creature in self.nonsick_creatures:
                creature["temp"] = 0
            for creature in self.sick_creatures:
                creature["temp"] = 0
                
            turn_count += 1
            self.kfk_counters = 0

    def enumerate_actions(self, turn_count):
        actions = []
        
        for card in self.player_hand + self.extra_hand.get_cards():
            if card == "land":
                continue
            
            if self.spectacle and card in self.spectacle_cards:
                card = "spec_" + card
                    
            if self.action_defs[card]["buff"] > 0 and len(self.nonsick_creatures) < 1:
                continue
                
            if card == "fb_lava_dart" and self.lands_in_play < 1:
                #add casting conditions logic here?
                continue
                
            wizard_in_play = len(set(self.wizard_creatures) & set(self.get_creatures_in_play())) > 0
            if card == "wizards_lightning" and wizard_in_play:
                #costs 1 with a wizard in play (the same cost do_action pays)
                castable = self.lands_available >= 1
            elif card == "bedlam_reveler":
                #costs 1 less per instant/sorcery in the graveyard, to a minimum of 2
                castable = max(self.action_defs[card]["CMC"] - self.instasorcery_count, 2) <= self.lands_available
            else:
                castable = self.action_defs[card]["CMC"] <= self.lands_available

            #append each castable card only once
            if castable:
                actions.append(card)
        
        return sorted(actions, key = lambda x: self.action_weights.setdefault((x, turn_count), 0), reverse=True)
    
    def do_action(self, action):
        "power|haste|buff"
        #do hardcodings mostly here
        
        #record card type if going to graveyard - needed for DRC's delirium....Ignoring?
        #if self.action_defs[action]["type"]
        
        #pay mana
        if action == "bedlam_reveler":
            mana_cost = max(self.action_defs[action]["CMC"] - self.instasorcery_count, 2)
        elif action == "wizards_lightning" and len(set(self.wizard_creatures) & set(self.get_creatures_in_play())) > 0:
            mana_cost = 1
        else:
            mana_cost = self.action_defs[action]["CMC"]
        assert(mana_cost <= self.lands_available)
        self.lands_available -= mana_cost
        
        #remove card from hands, check extra hand first
        card_name = self.action_defs[action]["card"]
        if card_name in self.extra_hand.get_cards():
            self.extra_hand.remove_card(card_name)
        else:
            del self.player_hand[self.player_hand.index(card_name)]
        
        if action == "bedlam_reveler":
            self.player_hand = []
            
        if action == "mishras_bauble":
            self.bauble_count += 1
            
        if action == "manamorphose":
            self.lands_available += 2
            
        #fb_reckless_charge will count as a card in hand (although it actually isn't)
        #shouldn't matter, unless a card that cares about hand size is added
        if action == "reckless_charge":
            self.player_hand.append("fb_reckless_charge")
            
        #same as reckless charge, the fb_dart will count as a card in hand
        if action == "lava_dart":
            self.player_hand.append("fb_lava_dart")
        
        if action == "fb_lava_dart":
            self.lands_in_play -= 1 #can just float the mana

        #handle rimrock knight adventure - am assuming that the adventure is always played first
        if action == "rimrock_knight_a":
            self.player_hand.append("rimrock_knight_b")
            
        if self.action_defs[action]["instasorcery"]:
            if not action.startswith("fb_"):
                self.instasorcery_count += 1
            else:
                self.instasorcery_count -= 1
            
            #young pyromancer trigger here
            #creature lists hold dicts, so count by name
            pyromancer_count = self.get_creatures_in_play().count("young_pyromancer")
            for i in range(pyromancer_count):
                self.sick_creatures.append({"name":"elemental", "base":1, "temp":0})
                
        #process DRC surveil
        for i in range(self.get_creatures_in_play().count("dragons_rage_channeler")):
            if self.action_defs[action]["type"] != "creature" and self.action_defs[action]["type"] != "land":
                try:
                    total_lands = self.player_hand.count("land") + self.lands_in_play + (self.extra_hand.get_cards().count("land") > 0)
                    if self.player_deck[0] == "land" and total_lands >= 2:
                        del self.player_deck[0]
                    elif self.player_deck[0] != "land" and total_lands <= 2:
                        del self.player_deck[0]
                except:
                    return 10000, self.action_log
                    
        if action == "kumano_faces_kakkazan":
            self.kfk_buffer.add_card("kumano_faces_kakkazan", time=3)
            
        if self.action_defs[action]["prowess_count"]:
            self.prowess_count += 1
            
        self.opp_life -= self.action_defs[action]["damage"]
        if self.action_defs[action]["damage"] > 0:
            self.spectacle = True
        
        for i in range(int(self.action_defs[action]["scry"])):
            try:
                total_lands = self.player_hand.count("land") + self.lands_in_play + (self.extra_hand.get_cards().count("land") > 0)
                if self.player_deck[0] == "land" and total_lands >= 2:
                    self.player_deck.append(self.player_deck[0])
                    del self.player_deck[0]
                elif self.player_deck[0] != "land" and total_lands <= 2:
                    self.player_deck.append(self.player_deck[0])
                    del self.player_deck[0]
            except:
                return 10000, self.action_log
                
        for i in range(int(self.action_defs[action]["draw"])):
            try:
                self.draw_card()
            except:
                return 10000, self.action_log
            #self.player_hand.append(self.player_deck[0])
            #del self.player_deck[0]
            
        for i in range(int(self.action_defs[action]["impulse"])):
            try:
                self.extra_hand.add_card(self.player_deck[0])
                del self.player_deck[0]
            except:
                return 10000, self.action_log
            
        if self.action_defs[action]["power"] > 0:
            if self.action_defs[action]["haste"]:
                self.nonsick_creatures.append({"name":action, "base":self.action_defs[action]["power"] + self.kfk_counters, "temp":0})
            else:
                self.sick_creatures.append({"name":action, "base":self.action_defs[action]["power"] + self.kfk_counters, "temp":0})
            self.kfk_counters = 0
                
        if len(self.nonsick_creatures) > 0:
            self.nonsick_creatures[0]["temp"] += self.action_defs[action]["buff"]
            
    def get_creatures_in_play(self):
        nonsick_creature_names = [creature["name"] for creature in self.nonsick_creatures]
        sick_creature_names = [creature["name"] for creature in self.sick_creatures]
        return nonsick_creature_names + sick_creature_names
    
    def draw_card(self):
        if len(self.player_deck) > 0:
            self.player_hand.append(self.player_deck[0])
            del self.player_deck[0]
        
        
        """if self.player_deck[0] == "rimrock_knight_a":
            self.player_hand.append("rimrock_knight_a")
            self.player_hand.append("rimrock_knight_b")
        else:
            self.player_hand.append(self.player_deck[0])
        del self.player_deck[0]"""   
        
            
class impulse_hand:
    def __init__(self):
        self.cards = []
        
    def add_card(self, card, time=2):
        self.cards.append([card, time])
        
    def remove_card(self, card_name):
        self.cards.sort(key = lambda x: x[1])
        
        #find index
        card_index = -1
        for i in range(len(self.cards)):
            if self.cards[i][0] == card_name:
                card_index = i
                break
        assert(card_index != -1)
        
        del self.cards[card_index]
        
    def get_cards(self):
        return [card[0] for card in self.cards]
    
    def do_time(self):
        untrimmed_cards = []
        
        for i in range(len(self.cards)):
            self.cards[i][1] -= 1
            if self.cards[i][1] > 0:
                untrimmed_cards.append(self.cards[i])

        self.cards = untrimmed_cards
                
#may need to modify this function to convert values once the data table is better defined
def load_defs(filepath, delimiter="\t"):
    defs = {}
    
    with open(filepath, "r") as fh:
        records = [line.strip().split(delimiter) for line in fh.readlines()]
        headers = records[0][1:]
        
        for record in records[1:]:
            defs[record[0]] = dict(zip(headers, record[1:]))
            for key in defs[record[0]].keys():
                if key != "card" and key != "type":
                    defs[record[0]][key] = float(defs[record[0]][key])
            
    return defs

def load_weights(filepath, delimiter="\t"):
    weights = {}
    
    with open(filepath, "r") as fh:
        for line in fh:
            line = line.strip().split(delimiter)
            weights[(line[0], int(line[1]))] = float(line[2])
            
    return weights

#saves contextual bandit weights, one action/turn/weight row per line
def save_weights(filepath, action_weights, delimiter="\t"):
    with open(filepath, "w") as fh:
        for k,v in action_weights.items():
            print(str(k[0]) + delimiter + str(k[1]) + delimiter + str(v), file=fh)

def update_weights(action_weights, action_log, turn_count):
    update_value = 100 * (1 - turn_count / 5)
    
    for action_record in action_log:
        action_weights[action_record] += update_value
        
def modify_decklist(decklist, cardpool):
    decklist = decklist.copy()
    cardpool = cardpool.copy()
    
    random.shuffle(decklist)
    
    #remove a card
    for i in range(len(decklist)):
        if decklist[i] != "land":
            removed_card_index = i
    del decklist[removed_card_index]
            
    #get difference between decklist and cardpool
    for i in decklist:
        if i in cardpool:
            cardpool.remove(i)
            
    #add random card from difference to decklist
    decklist.append(random.choice(cardpool))
    
    return decklist

def run_goldfish(decklist):
    match_results = []

    for i in range(game_count):
        goldfisher = goldfish(decklist, action_defs, action_weights)
        turn_count, action_log = goldfisher.simulate_game(i+1)
        update_weights(action_weights, action_log, turn_count)
        match_results.append(turn_count)

        #at the end of game, redistribute values in action-value dict based on game length
            #shorter game -> higher weighting to cards in log
            #played earlier -> higher weighting versus cards played later (doesn't make sense in contextual version)

    avg_turns = sum(match_results) / float(len(match_results))

    save_weights("weights.csv", action_weights)
    
    return avg_turns

#8.4 with contextual bandit
card_pool = ["lightning_bolt"] * 4 + ["play_with_fire"] * 4 + ["titans_strength"] * 4 + \
           ["reckless_impulse"] * 4 + ["skewer_the_critics"] * 4 + \
           ["light_up_the_stage"] * 4 + ["monastery_swiftspear"] * 4 + \
           ["soulscar_mage"] * 4 + ["ghitu_lavarunner"] * 4 + \
           ["viashino_pyromancer"] * 4 + ["crash_through"] * 4 + \
           ["bedlam_reveler"] * 4 + ["kumano_faces_kakkazan"] * 4 + \
           ["young_pyromancer"] * 4 + ["lightning_strike"] * 4 + \
           ["wizards_lightning"] * 4 + ["reckless_charge"] * 4 + \
           ["rimrock_knight_a"] * 4 + ["mishras_bauble"] + \
           ["lava_dart"] * 4 + ["manamorphose"] * 4 + ["mutagenic_growth"] * 4 + \
           ["dragons_rage_channeler"] * 4

game_count = 100
action_defs = load_defs("cards.csv", ",")
action_weights = load_weights("weights.csv")


#decklist = random.sample(card_pool, 40)
decklist = ["bedlam_reveler"] * 2 + \
["crash_through"] * 1 + \
["dragons_rage_channeler"] * 1 + \
["ghitu_lavarunner"] * 1 + \
["kumano_faces_kakkazan"] * 2 + \
["lava_dart"] * 2 + \
["light_up_the_stage"] * 2 + \
["lightning_bolt"] * 4 + \
["lightning_strike"] * 3 + \
["manamorphose"] * 1 + \
["mishras_bauble"] * 1 + \
["monastery_swiftspear"] * 4 + \
["mutagenic_growth"] * 1 + \
["play_with_fire"] * 1 + \
["reckless_charge"] * 1 + \
["reckless_impulse"] * 1 + \
["rimrock_knight_a"] * 4 + \
["skewer_the_critics"] * 1 + \
["soulscar_mage"] * 2 + \
["titans_strength"] * 1 + \
["viashino_pyromancer"] * 2 + \
["wizards_lightning"] * 1 + \
["young_pyromancer"] * 1
decklist += ["land"] * (len(decklist) // 2)
best_turns = 10000000
best_list = decklist

for i in range(1000):
    
    next_list = modify_decklist(best_list, card_pool)
    avg_turns = run_goldfish(next_list)
    
    if avg_turns < best_turns:
        best_turns = avg_turns
        best_list = next_list
        print("{}".format(best_turns))
        
        
print(best_turns)
for card in sorted(list(set(best_list))):
    print("{}: {}".format(card, best_list.count(card)))

7.08
7.01
6.92
6.63
6.55
6.52
6.29
6.24
6.24
bedlam_reveler: 1
crash_through: 1
dragons_rage_channeler: 2
kumano_faces_kakkazan: 2
land: 20
lava_dart: 2
light_up_the_stage: 1
lightning_bolt: 3
lightning_strike: 2
manamorphose: 1
mishras_bauble: 1
monastery_swiftspear: 4
mutagenic_growth: 2
play_with_fire: 2
reckless_charge: 2
reckless_impulse: 2
rimrock_knight_a: 4
skewer_the_critics: 1
soulscar_mage: 2
titans_strength: 1
viashino_pyromancer: 2
wizards_lightning: 1
young_pyromancer: 1
