# RL Assignment 1: MiniPoker

**Deadline**:  Tue 14 Feb 2023, 23:59

**Game Description**

You are going to generate episodes of a simple game called *MiniPoker*. There are two players in the game. 

The rules are as follows.
- Both players add an *ante* of size 1 to the stakes to begin the game.
- Then the 'cards' are dealt. The players could only see their own hands.
- The hands are assumed to be real numbers between 0 and 1. Player 1's hand is the realization $u$ of a standard uniform variable $U$ on the interval $[0,1]$. Player 2's hand is the realization $v$ of a standard uniform variable $V$ on the interval $[0,1]$. The variables $U$ and $V$ are independent.
- After seeing her/his hand, player 1 can choose between *passing* and *betting*. If he passes, a *showdown* follows immediately. The players compare their hands in the showdown, and the player with the highest hand wins the pot. *Betting* means adding an extra amount 1 to the stakes.
- After a bet by player 1, player 2 can decide to *fold* or to *call*. If she/he *folds*, she/he loses her/his *ante* of 1 to player 1. To *call*, player 2 must put an extra amount 1 in the pot. In that case, a *showdown* follows, and the player with the highest hand win the pot.

**Example Episode**
- Both players 1 and 2 add an ante of size 1 to the stakes to begin the game.
- Player 1's hand is 0.7, and Player 2's is 0.4
- Player 1 choose betting, and adds an extra amount 1 to the stakes.
- Player 2 decides to call, and put an extra amount in the pot.
- A showndown follows. Player 1 wins the pot.

In [16]:
import numpy as np

## Part 1: Creating the Deck

Your first task is to read the showdown() method of our class *deck*, which sets the attribute *winner* to be the index of the winning player: 1 if player 1 wins, 2 for player 2 wins, and np.nan for a tie.

In [49]:
class deck:
    def __init__(self,stake=0):
        # Generate the hands
        self.P1=np.random.uniform()
        self.P2=np.random.uniform()
        self.winner = None
    def showdown(self):
        if self.winner is not None:
            return self.winner
        if self.P1>self.P2:
            self.winner=1
        elif self.P1<self.P2:
            self.winner=2
        else:
            self.winner=np.nan
        return self.winner

## Part 2: Creating Player 1

Your second task is to read the choose() method of our class *player1*, which sets the attribute *action* to be the decision made by player 1: string value "pass" if player 1 chooses to pass, string value "bet" if player 1 chooses to bet. When player 1 chooses to bet, make sure to use the stacking() method to add an extra amount 1 to the stakes.

Player 1 chooses to bet if $u>0.9$ or $u<0.2$. Otherwise she/he chooses to pass. Note that the value of $u$ is stored in _hand_ attribute.

In [50]:
class player1:
    def __init__(self,hand=None,stake=0):
        self.hand=hand
        self.stake = stake
        self.action= None
    def staking(self,add_amount=1):
        self.stake=self.stake+add_amount
    def reset(self):
        self.hand=None
        self.stake=1
        self.action=None
    def choose(self):
        if self.hand is None:
            self.action=None
        elif self.action is not None:
            return self.action
        else:
            if self.hand<0.2 or self.hand>0.9:
                self.action="bet"
                self.staking(add_amount=1)
            else:
                self.action="pass"          
            return self.action

## Part 3: Creating Player 2

Your third task is to complete the choose() method of our class *player2*. Set the attribute *action* to be the decision made by player 2: string value "fold" if player 2 chooses to fold, string value "call" if player 2 chooses to call. When player 1 passes, you do NOT need to change the attribute *action*; therefore, its values remain as the null value *None*. null value *None* if player 1 passes. When player 2 chooses to call, make sure to use the stacking() method to add an extra amount 1 to the stakes.

Player 2 chooses to call if $v> 0.85$. Otherwise she/he chooses to fold. Note that the value of $v$ is stored in _hand_ attribute.

In [51]:
class player2:
    def __init__(self,hand=None,stake=0):
        self.hand=hand
        self.stake = stake
        self.action= None
    def staking(self,add_amount=1):
        self.stake=self.stake+add_amount
    def reset(self):
        self.hand=None
        self.stake=1
        self.action=None
    def choose(self):
        if self.hand is None:
            self.action=None
        elif self.action is not None:
            return self.action
        else:
            # Start Coding Here
            
            if self.hand>0.85:
                self.action="call"
                self.staking(add_amount=1)
            else:
                self.action="fold"    
            
            # End Coding Here #
            return self.action

## Part 4: Generate Episodes

Generate 20 independent Episodes. We assume that players can never go broke.

Display the episode index, the winner and the players' hands, actions, and amounts of stake.

In [63]:
p1=player1()
p2=player2()
NumEps=20

results=[]

for i in range(NumEps):
    np.random.seed(i)
    d=deck()
    p1.reset()
    p2.reset()
    p1.hand=d.P1
    p2.hand=d.P2
    
    # Start Coding Here
    
    print('---------') # New episode divider
    
    #Display index 
    print('Episode index: {}'.format(i+1))
    
    print('Player 1 hand: {:.2f}'.format(p1.hand))
    print('Player 2 hand: {:.2f}'.format(p2.hand))
    
    # P1 action
    p1_action = p1.choose()    
    # P2 action
    p2_action = p2.choose()

    if p1_action == 'pass':
        # Showdown
        print('Amount at stake: {}'.format(p1.stake + p2.stake))
        print('The winner is player {}'.format(d.showdown()))
        print('Player 1 action: {}'.format(p1_action))
        print('Player 2 action is {}'.format(p2_action))

    else: # Player 1 bets
        
        if p2_action == 'fold': # Player 2 folds
            d.winner = 1
            print('Amount at stake: {}'.format(p1.stake + p2.stake))
            print('The winner is player 1')
            print('Player 1 action: {}'.format(p1_action))
            print('Player 2 action is {}'.format(p2_action))
        
        else: # Player 2 calls
            # Showdown
            print('Amount at stake: {}'.format(p1.stake + p2.stake))
            print('The winner is player {}'.format(d.showdown()))
            print('Player 1 action: {}'.format(p1_action))
            print('Player 2 action is {}'.format(p2_action))
    
    # End Coding Here -> Check
    
    if ( (d.winner in [1,2,np.nan])
        and (p1.hand is not None) and (p2.hand is not None)
        and (0<=p1.hand<=1) and (0<=p2.hand<=1) 
        and (p1.action in ["pass","bet"]) and (p2.action in ["call","fold",None])
        and (p1.stake in [1,2]) and (p2.stake in [1,2])):
        episode_result=[i,d.winner,p1.hand,p2.hand,p1.action,p2.action,p1.stake,p2.stake];
        print(episode_result)
        results.append(episode_result)
    else:
        raise Exception('The values are not correct.')    
    # Important: Check if you receive errors!
    

---------
Episode: 1
Player 1 hand: 0.55
Player 2 hand: 0.72
Amount at stake: 2
The winner is player 2
Player 1 action: pass
Player 2 action is fold
[0, 2, 0.5488135039273248, 0.7151893663724195, 'pass', 'fold', 1, 1]
---------
Episode: 2
Player 1 hand: 0.42
Player 2 hand: 0.72
Amount at stake: 2
The winner is player 2
Player 1 action: pass
Player 2 action is fold
[1, 2, 0.417022004702574, 0.7203244934421581, 'pass', 'fold', 1, 1]
---------
Episode: 3
Player 1 hand: 0.44
Player 2 hand: 0.03
Amount at stake: 2
The winner is player 1
Player 1 action: pass
Player 2 action is fold
[2, 1, 0.43599490214200376, 0.025926231827891333, 'pass', 'fold', 1, 1]
---------
Episode: 4
Player 1 hand: 0.55
Player 2 hand: 0.71
Amount at stake: 2
The winner is player 2
Player 1 action: pass
Player 2 action is fold
[3, 2, 0.5507979025745755, 0.7081478226181048, 'pass', 'fold', 1, 1]
---------
Episode: 5
Player 1 hand: 0.97
Player 2 hand: 0.55
Amount at stake: 3
The winner is player 1
Player 1 action: bet
Pl