In [1]:
from blackjack import BlackJack, BlackJackOffPolicy

In [2]:
print(BlackJack.__doc__)


    Approximately solves BlackJack using epsilon-greedy, constant-alpha Monte Carlo Control. Example use:

        bj = BlackJack(M=10000000, epsilon=0.1, alpha=1 / 5000, seed=2)
        bj.mc_control() # Run the MC control algorithm and save some data to the data/ folder.

        (Or call bj.load() to load the data if you've already run the algorithm with the same parameters before.)

        bj.plot("Q", width=300, height=150) # Will plot the Q-table as a heatmap with the value-maximizing action indicated.

        bj.plot_over_m(18, 3, True, width=600, strokeWidth=2, height=500) # Will plot the Hit and Stick state-action
        values of (18, 3, True) over the m episodes processed. (18, 3, True) means the agent's sum is 18, the dealer's card
        is 3 and there is a usable ace.

    Since processing can take a long time, results are saved in a data/ folder. If you've run a blackjack object with
    the same M, epsilon, alpha and seed before, then you can call .load() instead o
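
The docstring names epsilon-greedy, constant-alpha Monte Carlo control. As a rough illustration of those two ingredients, here is a minimal sketch. It is not the code of the `blackjack` module: the function names, the action labels, and the assumption that each episode is undiscounted with a single reward at the end are all made up for this example.

```python
import random
from collections import defaultdict

# Illustrative action labels; the blackjack module may encode actions differently.
ACTIONS = ("hit", "stick")


def epsilon_greedy(Q, state, epsilon, rng):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])


def constant_alpha_update(Q, episode, final_reward, alpha):
    """Move Q(s, a) a fraction alpha toward the episode return for each visited pair.

    Assumes no discounting and one reward at the end of the episode,
    so the return from every step equals final_reward.
    """
    for state, action in episode:
        Q[(state, action)] += alpha * (final_reward - Q[(state, action)])


# States follow the (agent sum, dealer card, usable ace) layout from the docstring.
Q = defaultdict(float)
episode = [((13, 3, False), "hit"), ((18, 3, False), "stick")]
constant_alpha_update(Q, episode, final_reward=1.0, alpha=0.5)
```

With a constant `alpha`, recent episodes weigh more than old ones, which is one reason a small value such as `1 / 5000` is paired with a large number of episodes.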

## Run On-Policy MC Control for BlackJack

In [3]:
bj = BlackJack(M=500000, epsilon=0.1, alpha=1 / 500, seed=0)
bj.mc_control()

100%|████████████████████████████████████████████████████████████████████████| 500001/500001 [01:53<00:00, 4420.30it/s]


Saving data\Q_M500000__epsilon0_1__alpha0_002__seed0
Saving data\C_M500000__epsilon0_1__alpha0_002__seed0
Saving data\Q_hist_M500000__epsilon0_1__alpha0_002__seed0
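
The saved file names appear to encode the run parameters, with decimal points replaced by underscores. The helper below is a hypothetical reconstruction of that pattern, inferred only from the printed output in this notebook; `run_tag` is not part of the `blackjack` module, and the module may build its names differently. It can help when checking whether a given parameter set has already been saved before calling `.load()`.

```python
def run_tag(M, alpha, seed, epsilon=None):
    """Rebuild a tag like 'M500000__epsilon0_1__alpha0_002__seed0' (assumed pattern).

    When epsilon is None, the 'OP' marker seen in the off-policy file names is used.
    """
    def fmt(x):
        # Assumption: floats are written with '.' replaced by '_'.
        return str(x).replace(".", "_")

    policy_part = "OP" if epsilon is None else f"epsilon{fmt(epsilon)}"
    return f"M{M}__{policy_part}__alpha{fmt(alpha)}__seed{seed}"


on_policy_tag = run_tag(M=500000, alpha=1 / 500, seed=0, epsilon=0.1)
off_policy_tag = run_tag(M=500000, alpha=1 / 500, seed=0)
```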


### Plot Action-Value Heatmap

In [9]:
bj.plot('Q', width=310, height=250)

### Plot State-Action Visit Counts

In [11]:
bj.plot('C', width=325, height=250)

### Plot Hit and Stick Action-Values for One State over Episodes

In [14]:
bj.plot_over_m(17, 9, False, width=400, height=300)

## Run Off-Policy MC Control for BlackJack

In [15]:
bj = BlackJackOffPolicy(M=500000, alpha=1 / 500, seed=0)
bj.mc_control()

100%|████████████████████████████████████████████████████████████████████████| 500001/500001 [01:42<00:00, 4868.53it/s]


Saving data\Q_M500000__OP__alpha0_002__seed0
Saving data\C_M500000__OP__alpha0_002__seed0
Saving data\Q_hist_M500000__OP__alpha0_002__seed0
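
This notebook does not show how `BlackJackOffPolicy` learns. A common building block of off-policy Monte Carlo methods is the importance-sampling ratio, which reweights returns collected under a behavior policy so that they estimate values for a different target policy. The sketch below computes that ratio under two assumptions that may not match the module: the target policy is deterministic (greedy), and the behavior policy picks each of the two actions with probability 0.5. All names are hypothetical.

```python
def importance_ratio(episode, greedy_action, behavior_prob=0.5):
    """Product of pi(a|s) / b(a|s) over an episode for a deterministic target policy.

    pi(a|s) is 1 for the greedy action and 0 otherwise, so the ratio is 0
    as soon as the behavior policy deviates from the greedy action.
    """
    ratio = 1.0
    for state, action in episode:
        if action != greedy_action[state]:
            return 0.0
        ratio *= 1.0 / behavior_prob
    return ratio


# Hypothetical greedy policy over (agent sum, dealer card, usable ace) states.
greedy_action = {(13, 3, False): "hit", (18, 3, False): "stick"}
matching = [((13, 3, False), "hit"), ((18, 3, False), "stick")]
deviating = [((13, 3, False), "stick")]
ratio_matching = importance_ratio(matching, greedy_action)
ratio_deviating = importance_ratio(deviating, greedy_action)
```

Episodes that agree with the greedy policy at every step get a weight that grows with their length, while any deviation zeroes the weight, so only part of the generated data contributes to each estimate.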


In [16]:
bj.plot('Q', width=310, height=250)