<h2 style="font-family: monospace; color: purple;"> Poker Player Patterns</h2>

The data comes from [this Kaggle dataset](https://www.kaggle.com/smeilz/poker-holdem-games#Export%20Holdem%20Manager%202.0%2012292016131233.txt); the specific file used is 
<b><span style="color: black; background-color: red;">Export Holdem Manager 2.0 12302016144830.txt</span></b>.

<h3 style="color: green; font-family: monospace;">Environment settings</h3>

In [57]:
from pathlib import Path
import pandas as pd

nrounds, nrowdisplay = 10, 100
pd.options.display.max_rows = nrowdisplay

### Loading the CSV files

In [80]:
load_path = Path.cwd() / 'tidy_data'

action_ids = pd.read_csv(load_path / 'action_ids.csv')
user_ids = pd.read_csv(load_path / 'user_ids.csv')
actions = pd.read_csv(load_path / 'actions.csv')
cardshow = pd.read_csv(load_path / 'cardshow.csv')
buyins = pd.read_csv(load_path / 'buyins.csv')
blinds = pd.read_csv(load_path / 'blinds.csv')

action_dict = dict(zip(action_ids.action_name, action_ids.action_id))
actions = actions[actions.round_id <= nrounds]
cardshow = cardshow[cardshow.round_id <= nrounds]
buyins = buyins[buyins.round_id <= nrounds]
blinds = blinds[blinds.round_id <= nrounds]

print(action_dict)

{'folds': 0, 'calls': 1, 'raises': 2, 'checks': 3, 'bets': 4, 'allin': 5}
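
When reading later outputs it can help to go the other way, from id back to name. A minimal sketch, assuming the dictionary printed above; `id_to_name` and the toy `sample` Series are illustrative names that do not appear in the notebook.

```python
import pandas as pd

# Mapping printed above, copied here so the snippet is self-contained
action_dict = {'folds': 0, 'calls': 1, 'raises': 2, 'checks': 3, 'bets': 4, 'allin': 5}

# Invert it: action_id -> action_name (hypothetical helper name)
id_to_name = {action_id: name for name, action_id in action_dict.items()}

# Toy column of action ids (made-up values, not taken from the dataset)
sample = pd.Series([0, 2, 5, 1])
sample_names = sample.map(id_to_name)
print(sample_names.tolist())
```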


### Basics
We'll gather some basic information about the players.
For each player we want:
- number of rounds played
- number of each action taken
- stake at each turn

In [61]:
# one buy-in row per round a player joined, so the group length is the round count
n_rounds = buyins.groupby('user_id').apply(
    lambda df: pd.DataFrame(
        data = {'rounds_played': [len(df.round_id)]}
    )
).reset_index().drop(columns = ['level_1'])
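
The same count can probably be obtained without `apply`. The sketch below uses `groupby(...).size()` on a made-up table; `buyins_toy` and `n_rounds_toy` are hypothetical names, and it assumes, as the cell above does, that `buyins` holds one row per player per round.

```python
import pandas as pd

# Toy stand-in for the buyins table (made-up values)
buyins_toy = pd.DataFrame({
    'round_id': [1, 1, 2, 2, 3],
    'user_id':  [10, 11, 10, 12, 10],
})

# Assumption: one row per (round, player), so the group size
# equals the number of rounds that player took part in
n_rounds_toy = (
    buyins_toy.groupby('user_id')
    .size()
    .reset_index(name='rounds_played')
)
print(n_rounds_toy)
```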

In [62]:
n_actions = actions.groupby('user_id').apply(
    lambda df: pd.DataFrame(
        data = {
            key: [len(df.action_id[df.action_id == i])]
            for key, i in action_dict.items()
        }
    )
).reset_index().drop(columns = ['level_1'])
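
A cross-tabulation is one possible alternative for the per-action counts. This is only a sketch on made-up rows: `actions_toy`, `id_to_name` and `counts` are illustrative names, and the `reindex` step is there so that actions absent from the sample still get a zero column.

```python
import pandas as pd

action_dict = {'folds': 0, 'calls': 1, 'raises': 2, 'checks': 3, 'bets': 4, 'allin': 5}
id_to_name = {action_id: name for name, action_id in action_dict.items()}

# Toy stand-in for the actions table (made-up values)
actions_toy = pd.DataFrame({
    'user_id':   [10, 10, 11, 11, 11, 12],
    'action_id': [0, 1, 1, 1, 2, 5],
})

# Rows: players, columns: action ids, cells: number of occurrences
counts = pd.crosstab(actions_toy.user_id, actions_toy.action_id)
# Make sure every known action has a column, even if it never occurs here
counts = counts.reindex(columns=list(action_dict.values()), fill_value=0)
# Replace the numeric ids with readable names and restore user_id as a column
counts = counts.rename(columns=id_to_name).reset_index()
print(counts)
```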

In [63]:
winnings = cardshow.groupby('user_id').apply(
    lambda df: pd.DataFrame(
        data = {
            'sum': [sum(df.amount)],
            'mean': [df.amount.mean()]
        }
    )
).reset_index().drop(columns = ['level_1'])
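
For a plain sum and mean, `groupby(...).agg` gives the same table shape in one step. A minimal sketch on made-up amounts; `cardshow_toy` and `winnings_toy` are hypothetical names.

```python
import pandas as pd

# Toy stand-in for the cardshow table (made-up values)
cardshow_toy = pd.DataFrame({
    'user_id': [10, 10, 11],
    'amount':  [4.0, 6.0, 3.0],
})

# One row per player with the total and the average amount
winnings_toy = (
    cardshow_toy.groupby('user_id')['amount']
    .agg(['sum', 'mean'])
    .reset_index()
)
print(winnings_toy)
```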

In [97]:
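# df contains the rows of one turn of one round.
# Action ids: 0 folds, 1 calls, 2 raises, 3 checks, 4 bets, 5 allin,
# plus 6 and 7 for the two blind types added in blinds_temp below.
def getStakes(df):
    pot_dict = {user: 0 for user in df.user_id}  # amount staked per player
    pot_so_far = 0  # stake a player currently has to match
    for _, row in df.sort_values('action_order').iterrows():
        if row.action_id == 5:
            # all-in: add the amount to the player's own stake; the stake to
            # match only rises if the new stake exceeds it
            pot_dict[row.user_id] += row.amount
            pot_so_far = max(pot_so_far, pot_dict[row.user_id])
        elif row.action_id in [2, 4, 7]:
            # raise, bet or blind type 7: the stake to match grows by the amount
            pot_so_far += row.amount
        if row.action_id not in [0, 5]:
            # anything except a fold or all-in brings the player up to that stake
            pot_dict[row.user_id] = pot_so_far
            if row.action_id == 6:
                # blind type 6 stakes only its own amount
                pot_dict[row.user_id] = row.amount
    return pd.DataFrame(
        data = {
            'user_id': list(pot_dict.keys()),
            'user_stake': list(pot_dict.values())
        }
    )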

blinds_temp = pd.DataFrame(
    data = {
        'round_id': blinds.round_id,
        'user_id': blinds.user_id,
        'amount': blinds.value,
        'turn': [0] * len(blinds.value),
        'action_id': blinds.blind_type_id.replace(
            [0, 1], [6, 7]
        ),
        'action_order': blinds.blind_type_id.replace(
            [0, 1], [-2, -1]
        )
    }
)

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
user_stakes = pd.concat(
    [actions, blinds_temp], sort=True
).groupby(
    ['round_id', 'turn']
).apply(
    getStakes
).reset_index().drop(columns = ['level_2'])
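
The recoding of blinds into pseudo-actions can be checked on a tiny example. The sketch below mirrors the `blinds_temp` construction on made-up rows; `blinds_toy`, `actions_toy`, `blinds_as_actions`, `combined` and `ordered` are hypothetical names. Reading `blind_type_id` 0 as the small blind and 1 as the big blind is an assumption, since the source does not say which is which.

```python
import pandas as pd

# Toy blinds: assumed 0 = small blind, 1 = big blind (not confirmed by the data)
blinds_toy = pd.DataFrame({
    'round_id': [1, 1],
    'user_id': [10, 11],
    'blind_type_id': [0, 1],
    'value': [0.5, 1.0],
})

# Toy actions for the same round and turn (made-up values)
actions_toy = pd.DataFrame({
    'round_id': [1, 1],
    'user_id': [12, 10],
    'turn': [0, 0],
    'action_id': [1, 0],       # a call, then a fold
    'action_order': [0, 1],
    'amount': [1.0, 0.0],
})

# Recode blinds as extra action ids 6 and 7, ordered before any real action
blinds_as_actions = pd.DataFrame({
    'round_id': blinds_toy.round_id,
    'user_id': blinds_toy.user_id,
    'amount': blinds_toy.value,
    'turn': 0,
    'action_id': blinds_toy.blind_type_id.replace([0, 1], [6, 7]),
    'action_order': blinds_toy.blind_type_id.replace([0, 1], [-2, -1]),
})

# Stack both tables and sort so that the blinds come first
combined = pd.concat([actions_toy, blinds_as_actions], sort=True, ignore_index=True)
ordered = combined.sort_values('action_order')
print(ordered[['user_id', 'action_id', 'action_order', 'amount']])
```

The negative `action_order` values are what place the blinds ahead of the first real action once the rows are sorted.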