Bluffing Game ML Project

v0.0

This code plays the following poker-like bluffing game and is intended to evolve strategies using reinforcement learning.  I am doing this to practice Python and to experiment with ML methods.

The game works as follows.  Two players, A and B, each ante $1.  Each is dealt a random card from a deck D (a set of numbers, initially discrete; it may later become continuous).  Player A then decides either to pass, in which case the holder of the higher card wins the $2 pot, or to raise, adding another dollar to the pot.  If A raises, B chooses either to fold, in which case A wins the pot, or to call by matching A's dollar, in which case the holder of the higher card wins the $4 pot.  If the cards are equal, the players split the pot.  The payouts are therefore as follows.

| A Choice | B Choice | Higher Card | Payout to A | Payout to B |
|----------|----------|-------------|-------------|-------------|
| Pass     | N/A      |  A          |      +1     |      -1     |
| Pass     | N/A      |  B          |      -1     |      +1     |
| Pass     | N/A      |  Tie        |       0     |       0     |
| Raise    | Fold     |  N/A        |      +1     |      -1     |
| Raise    | Call     |  A          |      +2     |      -2     |
| Raise    | Call     |  B          |      -2     |      +2     |
| Raise    | Call     |  Tie        |       0     |       0     |

The idea is from https://fivethirtyeight.com/features/dont-throw-out-that-calendar/ where the game is analyzed for the deck {1,2,3,4,5,6}.  There is an optimal mixed strategy for A that involves bluffing, and there are multiple optimal strategies, all mixed, for B.  I want to see whether, and how fast, ML can find these strategies.
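
Because the deck is small, the expected payout for any pair of strategies can be computed exactly by enumerating every card pair, which gives a noise-free baseline to compare simulated or learned strategies against. The following is a minimal sketch, assuming cards numbered 0 to n-1, independent uniform deals (so ties are possible), and strategies described by per-card probabilities. The names `expected_payout`, `raise_prob`, and `call_prob` are illustrative and not part of the notebook code.

```python
from fractions import Fraction


def expected_payout(raise_prob, call_prob):
    """Exact expected payout to A per game.

    raise_prob[c]: probability that A raises when holding card c.
    call_prob[c]:  probability that B calls a raise when holding card c.
    Assumes both cards are independent uniform draws from 0..n-1.
    """
    n = len(raise_prob)
    total = Fraction(0)
    for a in range(n):
        for b in range(n):
            # +1 if A holds the higher card, -1 if B does, 0 on a tie
            sign = (a > b) - (a < b)
            r = Fraction(raise_prob[a])
            c = Fraction(call_prob[b])
            pass_value = sign                     # showdown for the $2 pot
            raise_value = (1 - c) + c * 2 * sign  # B folds, or showdown for the $4 pot
            total += (1 - r) * pass_value + r * raise_value
    return total / (n * n)


# Sanity checks against the payout table
always, never = [1] * 6, [0] * 6
print(expected_payout(never, always))   # A always passes
print(expected_payout(always, never))   # A always raises, B always folds
```

Using `Fraction` keeps the result exact, so two strategies that differ by a tiny amount can still be told apart.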


Version 0
Use only a deck of 6.  Get basic gameplay and strategies to work.  Create tournament.  No ML yet.

Working.  Odd result: the optimal strategy for A does poorly against random B.  It does well against the others, especially optimal B.  This will be interesting for evolutionary strategies.

Still needs work on formatting the tournament output.
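
One way to tidy the tournament output is to pad every cell to a fixed width with format specifiers. The following is a minimal sketch, assuming the results are held in a nested dict keyed by strategy name. `format_table` and its arguments are hypothetical and not part of the notebook.

```python
def format_table(row_names, col_names, results, width=10):
    """Return an aligned text table.

    results[row][col] holds the winnings of row strategy A against column strategy B.
    """
    header = f"{'Strategy A':<{width}} | " + " | ".join(f"{c:>{width}}" for c in col_names)
    lines = [header, "-" * len(header)]
    for r in row_names:
        cells = " | ".join(f"{results[r][c]:>{width}}" for c in col_names)
        lines.append(f"{r:<{width}} | {cells}")
    return "\n".join(lines)


# Made-up numbers, for layout only
demo = {"Simple 1": {"Random": 12, "Optimal": -3},
        "Optimal": {"Random": 7, "Optimal": 0}}
print(format_table(["Simple 1", "Optimal"], ["Random", "Optimal"], demo))
```

Names longer than `width` would still push a column out of line, so the width should be at least the length of the longest strategy name.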

In [1]:
""" Playgame routine.  Plays one game between StratA and StratB.  Outputs return to A"""

def playgame(GameDeck, StratA, StratB, verbose = False):

    # Deal 
    cardA = GameDeck.deal()
    cardB = GameDeck.deal()
    if verbose: print("Card A: ", cardA, " Card B: ", cardB)
    
    # Player A decides
    playA = StratA.play(cardA,"A")
    if verbose: print("Player A: ", playA)

    # if Player A pass, showdown for $2    
    if playA == "Pass":
        if cardA > cardB:
            payout = 1
        elif cardB > cardA:
            payout = -1
        else:
            payout = 0
    # if Player A raises, player B decides
    else:
        playB = StratB.play(cardB,"B")
        if verbose: print("Player B: ", playB)
        
        #if player B calls, showdown for $4
        if playB == "Call":
            if cardA > cardB:
                payout = 2
            elif cardB > cardA:
                payout = -2
            else:
                assert (cardA == cardB)
                payout = 0
        # if player B folds, A gets the ante
        else:
            payout = 1
    if verbose: 
        print("Payout: ",payout)
        print("")
    return payout
        

In [2]:
""" Deck class  Defines the deck.  For now only a discrete set 0 - n-1"""

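import random  # imported once at module level instead of on every deal() call


class Deck:
    def __init__(self, decksize):
        self.decksize = decksize
        self.cards = range(self.decksize)

    def deal(self):
        # Each deal is an independent uniform draw (with replacement),
        # so A and B can be dealt equal cards.
        card_dealt = random.randint(0, self.decksize - 1)
        return card_dealt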
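
The planned move to a continuous deck could keep the same `deal()` interface. The following is a minimal sketch, assuming cards become uniform real numbers in [0, 1). The class name `ContinuousDeck` and the optional `seed` parameter are hypothetical.

```python
import random


class ContinuousDeck:
    """Hypothetical deck whose cards are uniform real numbers in [0, 1)."""

    def __init__(self, seed=None):
        # A private generator keeps runs reproducible when a seed is given.
        self._rng = random.Random(seed)

    def deal(self):
        return self._rng.random()


deck = ContinuousDeck(seed=0)
cardA, cardB = deck.deal(), deck.deal()
print(cardA, cardB)
```

With such a deck, ties become vanishingly rare, and threshold strategies would need thresholds on the same [0, 1) scale.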
    

In [3]:
"""  Strategy Class.  Sets standards for all strategies 
    Create a subclass for each strategy"""

class Strategy:
    def __init__(self):
        pass
        # self.gamedeck = GameDeck
        # self.decksize = GameDeck.decksize
        
    def play(self,mycard,player):
        """ determine strategy for player, having been dealt card mycard.  
        If player = "A" return either 'Pass' or 'Raise' 
        If player = 'B' return either 'Fold' or 'Call' """ 
        pass
    



In [15]:
""" This defines some basic strategies.  
    randomstrat, plays randomly  
    simplestrat(x)  raises/calls if card is greater than x, passes/folds if card is <= x
    bluffstrat(x,p) like simple(x) but bluffs with prob p  if card is <= x  """

class randomstrat(Strategy):

    def play(self,mycard,player):
        import random
        if random.random() < 0.5:
            if player == "A":
                return 'Pass'
            else:
                return 'Fold'
        else:
            if player == "A":
                return "Raise"
            else:
                return 'Call'
        
class simplestrat(Strategy):
    def __init__(self,threshold):
        self.threshold = threshold
        
    def play(self,mycard,player):
        if mycard <= self.threshold:
            if player == "A":
                return 'Pass'
            else:
                return 'Fold'
        else:
            if player == 'A':
                return 'Raise'
            else:
                return 'Call'
    
class bluffstrat(Strategy):

    def __init__(self,threshold,bluffprob):
        self.threshold = threshold
        self.bluffprob = bluffprob
        
    def play(self,mycard,player):
        import random
        if  (mycard <= self.threshold) and (random.random() > self.bluffprob) :
            if player == "A":
                return 'Pass'
            else:
                return 'Fold'
        else:
            if player == 'A':
                return 'Raise'
            else:
                return 'Call'
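
The three strategies above are all special cases of a table that gives, for each card, the probability of the aggressive action (raise for A, call for B). Such a table is also a convenient parameter vector for a learning or evolutionary method to adjust. The following is a minimal sketch. The class name `tablestrat` is hypothetical, and it is written standalone here, rather than as a subclass of `Strategy`, so that it runs on its own.

```python
import random


class tablestrat:
    """Hypothetical strategy driven by per-card probabilities."""

    def __init__(self, aggressive_prob):
        # aggressive_prob[card] = probability of raising (as A) or calling (as B)
        self.aggressive_prob = list(aggressive_prob)

    def play(self, mycard, player):
        aggressive = random.random() < self.aggressive_prob[mycard]
        if player == "A":
            return "Raise" if aggressive else "Pass"
        return "Call" if aggressive else "Fold"


# Same behavior as simplestrat(2) on a deck of 6: passive on cards 0-2, aggressive on 3-5
s = tablestrat([0, 0, 0, 1, 1, 1])
print(s.play(1, "A"), s.play(4, "A"), s.play(1, "B"), s.play(4, "B"))
```

In this representation `randomstrat` is the table `[0.5] * 6`, and `bluffstrat(x, p)` is a table with `p` on cards up to `x` and 1 above it.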
    
        



        
    

In [16]:
"""A few more strategies
    humanstrat allows for a human input
    optimalstrat is from game theory.  
    See above reference.  Note adjustments for cards numbered 0-5 , not 1-6
    """

class humanstrat(Strategy):

    def play(self,mycard,player):
        print("You are player ",player,". Your card is ",mycard,'.')
        validinput = False
        while not validinput:
            humanplay = input("Your play (A: Pass/P or Raise/R; B: Fold/F or Call/C): ")
            if (player == "A") and (humanplay == "Pass" or humanplay == "P"):
                humanplay = "Pass"
                validinput = True
            elif (player == "A") and (humanplay == "Raise" or humanplay == "R"):
                humanplay = "Raise"
                validinput = True
            elif (player == "B") and (humanplay == "Fold" or humanplay == "F"):
                humanplay = "Fold"
                validinput = True
            elif (player == "B") and (humanplay == "Call" or humanplay =="C"):
                humanplay = "Call"
                validinput = True
            else: print("Invalid Input")
        return humanplay
    

class optimalstrat(Strategy):
    
    def play(self,mycard,player):
        import random

        if player == "A":
            if mycard == 4 or mycard == 5:
                return "Raise"
            elif mycard == 0:
                if random.random() < 2.0/3.0:
                    return "Raise"
                else:
                    return "Pass"
            else:
                return "Pass"

        else:
            if mycard == 0:
                return "Fold"
            elif mycard == 1 or mycard == 2:
                if random.random() < 1.0 / 3.0:
                    return "Call"
                else:
                    return "Fold"
            else:
                return "Call"

        
            
            
            

In [17]:
"""Challenge Routine.
Plays n games between two strategies and returns the net result
"""

def challenge(num_games,strata,stratb,strataname = "",stratbname = "",verbose = False):

    decksize = 6
    d = Deck(decksize)

    a_net_wins = 0

    if verbose: print("Player A: ", strataname,"   Player B: ", stratbname )

    for i in range (num_games):
        a_net_wins += playgame(d,strata,stratb)
        if verbose and i % 1000 == 0: print(i," games played")

    if verbose: print(strataname, " won $", a_net_wins, "  $", a_net_wins / num_games, " per game.")
    
    return a_net_wins
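
Since `challenge` returns a sum of random payouts, small differences between strategies can be simulation noise. The following is a minimal sketch of attaching a standard error to the per-game average, assuming the individual payouts are collected in a list (which `challenge` does not currently do). `summarize` is a hypothetical helper.

```python
import math
import statistics


def summarize(payouts):
    """Return (mean payout per game, standard error of that mean)."""
    n = len(payouts)
    mean = statistics.fmean(payouts)
    # Sample standard deviation divided by sqrt(n) estimates the error of the mean.
    std_err = statistics.stdev(payouts) / math.sqrt(n)
    return mean, std_err


mean, std_err = summarize([1, -1, 2, -2, 1, 0, 1, -1])
print(f"{mean:.3f} +/- {std_err:.3f} per game")
```

Each payout lies between -2 and +2, so with 1,000,000 games the standard error of the per-game average is at most about 0.002.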

In [18]:
"""Tournament.  Runs a tournament among several A and B strategies."""

num_games = 1000000
verbose = True

Astrategies = []
r = randomstrat()
Astrategies += [{'strat':r,'name':"Random    "}]
s = simplestrat(1)
Astrategies += [{'strat':s,'name':"Simple 1  "}]
s = simplestrat(2)
Astrategies += [{'strat':s,'name':"Simple 2  "}]
s = simplestrat(3)
Astrategies += [{'strat':s,'name':"Simple 3  "}]
o = optimalstrat()
Astrategies += [{'strat':o,'name':"Optimal   "}]

Bstrategies = []
r = randomstrat()
Bstrategies += [{'strat':r,'name':"Random    "}]
s = simplestrat(1)
Bstrategies += [{'strat':s,'name':"Simple 1"}]
s = simplestrat(2)
Bstrategies += [{'strat':s,'name':"Simple 2"}]
s = simplestrat(3)
Bstrategies += [{'strat':s,'name':"Simple 3"}]
o = optimalstrat()
Bstrategies += [{'strat':o,'name':"Optimal   "}]

for bstrat in Bstrategies:
    bstrat['profit'] = 0
    
for astrat in Astrategies:
    astrat['profit'] = 0
    for bstrat in Bstrategies:
        if verbose: print (astrat['name']," playing ",bstrat['name'])
        awins = challenge(num_games,astrat['strat'],bstrat['strat'],verbose = False)
        astrat['profit'] += awins
        bstrat['profit'] -= awins
        astrat[bstrat['name']] = awins
    astrat['average'] = astrat['profit'] / (num_games * len(Bstrategies))
    
for bstrat in Bstrategies:
    bstrat['average'] = bstrat['profit'] / (num_games * len(Astrategies))
    

print()
print ("Tournament Results -- winnings for Strategy A")
print("                        Strategy B")
print ("Strategy A    | ", end = "")
for b in Bstrategies:
    print(b['name'], end =" | ")
print("Total")
print("_______________________________________________")
for a in Astrategies:
    print(a['name'], end = "    |")
    for b in Bstrategies:
        print(a[b['name']], end = "         |")
    print(a['profit'])
print("Total         |", end = "")
for b in Bstrategies:
    print(-b['profit'], end = "         |")
print(-sum([b['profit'] for b in Bstrategies]))
    
        
    
                            


Random      playing  Random    
Random      playing  Simple 1
Random      playing  Simple 2
Random      playing  Simple 3
Random      playing  Optimal   
Simple 1    playing  Random    
Simple 1    playing  Simple 1
Simple 1    playing  Simple 2
Simple 1    playing  Simple 3
Simple 1    playing  Optimal   
Simple 2    playing  Random    
Simple 2    playing  Simple 1
Simple 2    playing  Simple 2
Simple 2    playing  Simple 3
Simple 2    playing  Optimal   
Simple 3    playing  Random    
Simple 3    playing  Simple 1
Simple 3    playing  Simple 2
Simple 3    playing  Simple 3
Simple 3    playing  Optimal   
Optimal     playing  Random    
Optimal     playing  Simple 1
Optimal     playing  Simple 2
Optimal     playing  Simple 3
Optimal     playing  Optimal   

Tournament Results -- winnings for Strategy A
                        Strategy B
Strategy A    | Random     | Simple 1 | Simple 2 | Simple 3 | Optimal    | Total
_______________________________________________
Random        |2497