Training a recurrent neural network player
-----------------------------------------

The Python library Keras is used in this chapter to train two LSTM models. The models are referred to as:

- sequence to sequence, and
- sequence to probability.

The code to construct the two models is given below.

**sequence to sequence**

In [1]:
import warnings

warnings.filterwarnings("ignore")

In [2]:
from keras.models import Sequential

from keras.layers import (
    Dense,
    Dropout,
    CuDNNLSTM,
)

num_hidden_cells = 100
drop_out_rate = 0.2

model = Sequential()

model.add(CuDNNLSTM(num_hidden_cells, return_sequences=True, input_shape=(None, 1)))

model.add(Dropout(rate=drop_out_rate))

model.add(Dense(1, activation="sigmoid"))

model.summary()

Using TensorFlow backend.


Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
cu_dnnlstm_1 (CuDNNLSTM)     (None, None, 100)         41200     
_________________________________________________________________
dropout_1 (Dropout)          (None, None, 100)         0         
_________________________________________________________________
dense_1 (Dense)              (None, None, 1)           101       
Total params: 41,301
Trainable params: 41,301
Non-trainable params: 0
_________________________________________________________________


**sequence to probability**

In [3]:
from keras.models import Sequential
from keras.layers import (
    Dense,
    Dropout,
    CuDNNLSTM,
)

num_hidden_cells = 100
drop_out_rate = 0.2

model = Sequential()

model.add(
    CuDNNLSTM(num_hidden_cells, return_sequences=True, input_shape=(None, 1))
)

model.add(CuDNNLSTM(num_hidden_cells))
model.add(Dropout(rate=drop_out_rate))

model.add(Dense(1, activation="sigmoid"))
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
cu_dnnlstm_2 (CuDNNLSTM)     (None, None, 100)         41200     
_________________________________________________________________
cu_dnnlstm_3 (CuDNNLSTM)     (None, 100)               80800     
_________________________________________________________________
dropout_2 (Dropout)          (None, 100)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 101       
Total params: 122,101
Trainable params: 122,101
Non-trainable params: 0
_________________________________________________________________
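
The parameter counts reported in the two summaries can be checked by hand. The sketch below is not part of the original notebook. It assumes that a `CuDNNLSTM` layer stores four gate blocks of input weights and recurrent weights plus two bias vectors per gate (the cuDNN convention), and the helper names `cudnn_lstm_params` and `dense_params` are hypothetical, introduced only for this illustration.

```python
def cudnn_lstm_params(input_dim, units):
    """Parameter count of one CuDNNLSTM layer (assumes two bias vectors per gate)."""
    kernel = 4 * input_dim * units        # input-to-hidden weights, one block per gate
    recurrent_kernel = 4 * units * units  # hidden-to-hidden weights, one block per gate
    bias = 2 * 4 * units                  # cuDNN keeps separate input and recurrent biases
    return kernel + recurrent_kernel + bias


def dense_params(input_dim, units):
    """Parameter count of a fully connected layer: weights plus one bias per unit."""
    return input_dim * units + units


num_hidden_cells = 100

# Sequence to sequence: one LSTM layer followed by a single sigmoid unit.
seq_to_seq_total = cudnn_lstm_params(1, num_hidden_cells) + dense_params(
    num_hidden_cells, 1
)

# Sequence to probability: two stacked LSTM layers followed by a single sigmoid unit.
seq_to_prob_total = (
    cudnn_lstm_params(1, num_hidden_cells)
    + cudnn_lstm_params(num_hidden_cells, num_hidden_cells)
    + dense_params(num_hidden_cells, 1)
)

print(seq_to_seq_total)   # 41301
print(seq_to_prob_total)  # 122101
```

Both totals agree with the `Total params` lines above. The dropout layers contribute no parameters.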


The networks were trained on different subsets of the best response sequence data set of Chapter 6.

**Strategies and total best response sequences per data set**

In [14]:
from pathlib import Path

import pandas as pd
import axelrod as axl

In [24]:
dfs = []
for file in Path(
    "/Users/storm/rsc/Training-IPD-strategies-with-RNN/best_responses/"
).glob("*.csv"):
    dfs.append(pd.read_csv(file, index_col=0))

In [25]:
df = pd.concat(dfs).reset_index(drop=True)

In [36]:
across_strategies = [
    "Anti Tit For Tat",
    "Adaptive Pavlov 2006",
    "Cautious QLearner",
    "EasyGo",
    "Eatherley",
    "Evolved HMM 5",
    "Forgiver",
    "Gladstein",
    "GraaskampKatzen",
    "Hard Tit For Tat",
    "Pun1",
    "PSO Gambler 2_2_2 Noise 05",
    "Punisher",
    "Tricky Cooperator",
    "$e$",
]

In [17]:
with open("tex/strategies_across_total.tex", 'w') as out:
    out.write(f"{len(across_strategies)}")

In [18]:
top_strategies = [
    "PSO Gambler Mem1",
    "Evolved ANN 5",
    "DoubleCrosser",
    "Omega TFT",
    "EvolvedLookerUp2_2_2",
    "Fool Me Once",
    "Gradual",
    "PSO Gambler 2_2_2 Noise 05",
    "Evolved HMM 5",
    "PSO Gambler 1_1_1",
    "PSO Gambler 2_2_2",
    "Evolved ANN",
    "Evolved FSM 16",
    "Winner12",
    "BackStabber",
    "Evolved FSM 16 Noise 05",
    "Evolved ANN 5 Noise 05",
    "Evolved FSM 4",
]

In [19]:
with open("tex/top_strategies_total.tex", 'w') as out:
    out.write(f"{len(top_strategies)}")

In [40]:
basic_strategies = [s.name for s in axl.basic_strategies]

In [21]:
with open("tex/basic_strategies_total.tex", 'w') as out:
    out.write(f"{len(basic_strategies)}")

**Total number of best response sequences for the subsets**

In [32]:
len(df[df['opponent'].isin(top_strategies)]['opponent'].unique()) == len(top_strategies)

True

In [33]:
len(df[df['opponent'].isin(top_strategies)]) 

714

In [43]:
with open("tex/top_strategies_sequences.tex", 'w') as out:
    out.write(f"{len(df[df['opponent'].isin(top_strategies)])}")

In [37]:
len(df[df['opponent'].isin(across_strategies)]['opponent'].unique()) == len(across_strategies)

True

In [38]:
len(df[df['opponent'].isin(across_strategies)]) 

212

In [44]:
with open("tex/across_strategies_sequences.tex", 'w') as out:
    out.write(f"{len(df[df['opponent'].isin(across_strategies)])}")

In [41]:
len(df[df['opponent'].isin(basic_strategies)]['opponent'].unique()) == len(basic_strategies)

True

In [42]:
len(df[df['opponent'].isin(basic_strategies)]) 

84

In [45]:
with open("tex/basic_strategies_sequences.tex", 'w') as out:
    out.write(f"{len(df[df['opponent'].isin(basic_strategies)])}")
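
The same two checks are repeated above for each subset: that every listed strategy appears as an opponent, and how many sequences the subset contains. A minimal sketch of how they could be wrapped in one helper follows. It is not part of the original notebook: the name `subset_summary` is hypothetical, and `toy_df` is made-up data standing in for `df`, assuming only that the data frame has an `opponent` column with one row per best response sequence.

```python
import pandas as pd

# Made-up stand-in for the best response data frame (one row per sequence).
toy_df = pd.DataFrame(
    {"opponent": ["Cooperator", "Cooperator", "Defector", "Grudger", "Tit For Tat"]}
)


def subset_summary(df, strategy_names):
    """Return (all strategies present?, number of sequences) for a subset."""
    subset = df[df["opponent"].isin(strategy_names)]
    # Every requested strategy must occur at least once among the opponents.
    all_present = set(strategy_names) <= set(subset["opponent"])
    return all_present, len(subset)


print(subset_summary(toy_df, ["Cooperator", "Defector"]))    # (True, 3)
print(subset_summary(toy_df, ["Cooperator", "Alternator"]))  # (False, 2)
```

For a list without duplicate names, the set comparison is equivalent to the `len(...unique()) == len(...)` check used in the cells above.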