### Centroid creation
The goal of this notebook is to create 3 CSV files (`c_10_mix.csv`, `c_25_mix.csv`, `c_50_mix.csv`) that contain the centroids of our sampled states. These centroids are computed from the following raw data:
- 10% (100K rows) of randomly sampled data.
- 90% (900K rows) of data sampled using the baseline policy.

In [17]:
from os.path import join
import pandas as pd
import numpy as np
from sklearn.cluster import MiniBatchKMeans


# Folder that holds the input data and receives the output centroid files
DATADIR = join('..', 'data')

We read the files `ss_bsln_900k.csv` and `ss_random_100k.csv`. They contain 900K observations sampled with the baseline policy and 100K observations sampled with a random policy, all taken from simulations of the agent playing against the opponent.

In [7]:
ss_bsln = pd.read_csv(join(DATADIR, 'ss_bsln_900k.csv'))
ss_random = pd.read_csv(join(DATADIR, 'ss_random_100k.csv'))

We concatenate the two dataframes (baseline and random samples), shuffle the rows, and convert the result to a numpy array.

In [13]:
ss = pd.concat([ss_bsln, ss_random], ignore_index=True).sample(frac=1).to_numpy()
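As a small illustration of the step above, the sketch below applies the same concatenate, shuffle and convert chain to two toy dataframes. The toy frames and the `random_state=0` argument are assumptions added here so the shuffle is repeatable; they are not part of the original cell.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for ss_bsln and ss_random (hypothetical data, 2 columns only)
toy_bsln = pd.DataFrame({'x_agent': [0.1, 0.2, 0.3], 'y_agent': [1.0, 1.1, 1.2]})
toy_random = pd.DataFrame({'x_agent': [0.9], 'y_agent': [2.0]})

# Same chain as above; random_state is an assumption that makes the shuffle repeatable
toy_ss = (
    pd.concat([toy_bsln, toy_random], ignore_index=True)
    .sample(frac=1, random_state=0)
    .to_numpy()
)

print(toy_ss.shape)  # (4, 2): every row is kept, only the order changes
```

Without a fixed `random_state`, each run shuffles the rows differently, which can change the mini-batches seen by the clustering step.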

We create 3 `MiniBatchKMeans` estimators with 10K, 25K and 50K clusters, each with a batch size of 2048, and fit them on our normalized sampled states array. The purpose is to condense the information in our sampled states into a fixed number of clusters, whose centers will be the centroids of the RBF used in the next cells.

In [18]:
mbkm_model_10 = MiniBatchKMeans(n_clusters=10_000, random_state=0, batch_size=2048, verbose=True)
mbkm_model_10.fit(ss)      # 6min

Init 1/3 with method k-means++
Inertia for init 1/3: 5576.138222288506
Init 2/3 with method k-means++
Inertia for init 2/3: 5553.1745231358955
Init 3/3 with method k-means++
Inertia for init 3/3: 5609.342195315998
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 1/48828: mean batch inertia: 0.18302308346113283
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 2/48828: mean batch inertia: 0.19917553331191812, ewa inertia: 0.19917553331191812
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 3/48828: mean batch inertia: 0.19461126585518584, ewa inertia: 0.19915683809111057
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 4/48828: mean batch inertia: 0.19163304980942783, ewa inertia: 0.1991260206851262
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 5/48828: mean batch inertia: 0.19239512914030038, ewa inertia: 0.1990984509809283
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 6/488

MiniBatchKMeans(batch_size=2048, n_clusters=10000, random_state=0, verbose=True)

In [19]:
mbkm_model_25 = MiniBatchKMeans(n_clusters=25_000, random_state=0, batch_size=2048, verbose=True)
mbkm_model_25.fit(ss)   # 22min

Init 1/3 with method k-means++
Inertia for init 1/3: 8928.833355726292
Init 2/3 with method k-means++
Inertia for init 2/3: 8930.619099622496
Init 3/3 with method k-means++
Inertia for init 3/3: 8897.707219204804
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 1/48828: mean batch inertia: 0.11217278863608399
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 2/48828: mean batch inertia: 0.12292239863990483, ewa inertia: 0.12292239863990483
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 3/48828: mean batch inertia: 0.11989598572435949, ewa inertia: 0.12291000246499893
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 4/48828: mean batch inertia: 0.12792506667455394, ewa inertia: 0.12293054414745959
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 5/48828: mean batch inertia: 0.12348975664727618, ewa inertia: 0.12293283467956831
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 6/48

MiniBatchKMeans(batch_size=2048, n_clusters=25000, random_state=0, verbose=True)

In [20]:
mbkm_model_50 = MiniBatchKMeans(n_clusters=50_000, random_state=0, batch_size=2048, verbose=True)
mbkm_model_50.fit(ss)   # 82min

Init 1/3 with method k-means++
Inertia for init 1/3: 12189.820865617707
Init 2/3 with method k-means++
Inertia for init 2/3: 12164.48712022978
Init 3/3 with method k-means++
Inertia for init 3/3: 12242.632003626519
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 1/48828: mean batch inertia: 0.08127614129155272
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 2/48828: mean batch inertia: 0.08253633172402344, ewa inertia: 0.08253633172402344
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 3/48828: mean batch inertia: 0.0808787770080607, ewa inertia: 0.08252954238669619
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 4/48828: mean batch inertia: 0.08372493983363953, ewa inertia: 0.08253443872974252
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 5/48828: mean batch inertia: 0.08383486780449592, ewa inertia: 0.08253976528190615
[MiniBatchKMeans] Reassigning 1024 cluster centers.
Minibatch step 6/4

MiniBatchKMeans(batch_size=2048, n_clusters=50000, random_state=0, verbose=True)
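To make the relation between a fitted estimator and its centroids concrete, the following sketch fits a much smaller `MiniBatchKMeans` on synthetic data. The random 12-column array, the cluster count, the batch size and `n_init=3` are illustrative assumptions chosen so that it runs in seconds; they do not reproduce the models above.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Synthetic stand-in for the sampled states: 1,000 rows, 12 dimensions (assumption)
rng = np.random.default_rng(0)
toy_states = rng.random((1000, 12))

# Same kind of estimator as above, scaled down to 5 clusters
toy_model = MiniBatchKMeans(n_clusters=5, random_state=0, batch_size=256, n_init=3)
toy_model.fit(toy_states)

# One centroid per cluster, each with the dimensionality of a state
print(toy_model.cluster_centers_.shape)  # (5, 12)

# Each state can be mapped to the index of its nearest centroid
labels = toy_model.predict(toy_states[:10])
print(labels.shape)  # (10,)
```

The `cluster_centers_` attribute is the array that gets written to disk in the next step.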

We save the centroids of each model to a CSV file named `c_XX_mix.csv`. Each file will have a header row followed by 10K, 25K or 50K rows (depending on the number of clusters of the model) by 12 columns (the number of dimensions of the states of our environment).

In [21]:
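def save_centroids_file(model, name):
    """Write the cluster centers of a fitted model to DATADIR/name as CSV."""
    # Column names: position and velocity of the agent, the ball and the opponent
    head = 'x_agent,y_agent,xdot_agent,ydot_agent,' \
        'x_ball,y_ball,xdot_ball,ydot_ball,' \
        'x_opponent,y_opponent,xdot_opponent,ydot_opponent'
    np.savetxt(fname=join(DATADIR, name),
        X=model.cluster_centers_,
        fmt='%.5f',
        delimiter=',',
        header=head,
        comments='')    # empty string so the header is not prefixed with '# '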
    
save_centroids_file(mbkm_model_10, "c_10_mix.csv")
save_centroids_file(mbkm_model_25, "c_25_mix.csv")
save_centroids_file(mbkm_model_50, "c_50_mix.csv")
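A quick way to check the saved files is to read one back and compare its shape and column names with what was written. The sketch below does this on a small, made-up centroid array in a temporary folder; `fake_centers` and the file name `c_toy_mix.csv` are hypothetical stand-ins, not outputs of this notebook.

```python
import tempfile
from os.path import join

import numpy as np
import pandas as pd

head = ('x_agent,y_agent,xdot_agent,ydot_agent,'
        'x_ball,y_ball,xdot_ball,ydot_ball,'
        'x_opponent,y_opponent,xdot_opponent,ydot_opponent')

# Hypothetical stand-in for model.cluster_centers_ (8 centroids, 12 dimensions)
fake_centers = np.random.default_rng(0).random((8, 12))

with tempfile.TemporaryDirectory() as tmpdir:
    path = join(tmpdir, 'c_toy_mix.csv')
    # Same np.savetxt arguments as save_centroids_file
    np.savetxt(fname=path, X=fake_centers, fmt='%.5f',
               delimiter=',', header=head, comments='')
    centroids = pd.read_csv(path)

print(centroids.shape)  # (8, 12)
print(list(centroids.columns) == head.split(','))  # True
# Values are rounded to 5 decimals by fmt='%.5f', hence the tolerance
print(np.allclose(centroids.to_numpy(), fake_centers, atol=1e-5))  # True
```

For the real files, the same `pd.read_csv` call on `join(DATADIR, 'c_10_mix.csv')` should return 10,000 rows and 12 columns.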