## **Mutual Information Analysis for locomotion learning**

In [1]:
from collections import Counter
import torch
import numpy as np

ModuleNotFoundError: No module named 'torch'

In [2]:
data = torch.load("fullstate.pt")
# print(data)

#### **Data Preprocessing**

In [None]:
# 2000
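# Concatenate the lists of tensors into single NumPy arrays.
state = torch.cat(data["state"]).cpu().numpy()
action = torch.cat(data["action"]).cpu().numpy()
# Truncate first, then record the number of samples actually kept,
# so that n_sample always matches the array length.
state = state[:50000]
action = action[:50000]
n_sample = state.shape[0]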
print("STATE SIZE : ", state.shape)
print("ACTION SIZE : ", action.shape)

STATE SIZE :  (500000, 64)
ACTION SIZE :  (500000, 19)


In [14]:
n_bins = 30
bins = np.linspace(-1, 1, n_bins)
print("BIN : ", bins)
state_bin = np.digitize(state, bins=bins)
action_bin = np.digitize(action, bins=bins)

BIN :  [-1.         -0.93103448 -0.86206897 -0.79310345 -0.72413793 -0.65517241
 -0.5862069  -0.51724138 -0.44827586 -0.37931034 -0.31034483 -0.24137931
 -0.17241379 -0.10344828 -0.03448276  0.03448276  0.10344828  0.17241379
  0.24137931  0.31034483  0.37931034  0.44827586  0.51724138  0.5862069
  0.65517241  0.72413793  0.79310345  0.86206897  0.93103448  1.        ]
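
The binning step can be checked on a few hand-picked values. The sketch below uses hypothetical sample values (not taken from the dataset) to show how `np.digitize` treats the edges: values below the first edge get index 0 and values at or above the last edge get index `len(bins)`, so 30 edges produce up to 31 distinct indices rather than 30.

```python
import numpy as np

n_bins = 30
edges = np.linspace(-1, 1, n_bins)  # 30 edges, i.e. 29 intervals inside [-1, 1)

# Hypothetical values, chosen to include out-of-range entries.
x = np.array([-1.5, -1.0, 0.0, 0.99, 1.0, 2.3])

# With the default right=False, index i means edges[i-1] <= x < edges[i].
idx = np.digitize(x, bins=edges)
print(idx)  # [ 0  1 15 29 30 30]
```

If the state or action values are not guaranteed to lie in [-1, 1], the two outer indices (0 and 30) act as catch-all bins for everything outside that range.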


#### **Count unique state-action**

In [15]:
unique_state_patterns, state_counts = np.unique(state_bin, axis=0, return_counts=True)
print("UNIQUE STATE : ", len(state_counts))
unique_action_patterns, action_counts = np.unique(action_bin, axis=0, return_counts=True)
print("UNIQUE ACTION : ", len(action_counts))

UNIQUE STATE :  496948
UNIQUE ACTION :  438952


In [12]:
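def prob_counter(state_bin, action_bin, n_sample):
    """Return empirical probabilities of binned states, actions, and state-action pairs."""
    p_state = Counter()
    p_action = Counter()
    p_state_action = Counter()
    for i in range(n_sample):
        # Tuples are hashable, so each binned row can serve as a dictionary key.
        state_key = tuple(state_bin[i])
        action_key = tuple(action_bin[i])

        p_state[state_key] += 1 / n_sample
        p_action[action_key] += 1 / n_sample
        p_state_action[state_key + action_key] += 1 / n_sample
    return p_state, p_action, p_state_action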
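
Given such counts, a plug-in estimate of the mutual information follows directly from the empirical joint and marginal frequencies. The sketch below is one possible way to do this, not necessarily what the notebook intended: the function name `plugin_mutual_information` and the toy arrays are hypothetical, and the joint key is a pair of tuples rather than a concatenated tuple.

```python
from collections import Counter
import numpy as np

def plugin_mutual_information(x_bin, y_bin):
    """Plug-in MI estimate (in nats) between two arrays of discretized rows."""
    n = len(x_bin)
    c_x, c_y, c_xy = Counter(), Counter(), Counter()
    for x_row, y_row in zip(x_bin, y_bin):
        kx, ky = tuple(x_row), tuple(y_row)
        c_x[kx] += 1
        c_y[ky] += 1
        c_xy[(kx, ky)] += 1
    mi = 0.0
    for (kx, ky), c in c_xy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), with each p = count / n
        mi += (c / n) * np.log(c * n / (c_x[kx] * c_y[ky]))
    return mi

# Toy check on hypothetical data: a copied variable versus an independent one.
rng = np.random.default_rng(0)
x = rng.integers(0, 4, size=(2000, 1))
y_dep = x.copy()
y_ind = rng.integers(0, 4, size=(2000, 1))

mi_dep = plugin_mutual_information(x, y_dep)  # close to log(4), the entropy of x
mi_ind = plugin_mutual_information(x, y_ind)  # close to 0
print(mi_dep, mi_ind)
```

One caveat: when almost every binned pattern occurs only once, as the unique-pattern counts above suggest for the 64-dimensional states, the plug-in estimate is dominated by sampling effects and says little about the true dependency.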

## **Estimating mutual information**

scikit-learn provides `mutual_info_classif` to estimate mutual information for a discrete target variable and `mutual_info_regression` for a continuous one.

Mutual information (MI) [1] between two random variables is a non-negative value, which measures the dependency between the variables. It is equal to zero if and only if two random variables are independent, and higher values mean higher dependency.
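
For two discrete variables, such as a binned state $S$ and a binned action $A$, the standard definition can be written as follows (the symbols $S$, $A$ and $p$ are notation introduced here for illustration):

$$
I(S; A) = \sum_{s} \sum_{a} p(s, a) \, \log \frac{p(s, a)}{p(s) \, p(a)}
$$

When $S$ and $A$ are independent, $p(s, a) = p(s)\,p(a)$, so every logarithm is zero and $I(S; A) = 0$.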

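The scikit-learn estimators rely on nonparametric methods based on entropy estimation from k-nearest-neighbor distances, as described in [2] and [3]. Both methods build on the idea originally proposed in [4]. The bracketed references follow the numbering in the scikit-learn documentation linked below.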

It can be used for univariate feature selection; see the scikit-learn User Guide for details.

https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.mutual_info_classif.html

In [21]:
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

In [22]:
# MI between each of the 19 action dimensions (features) and the first state dimension (target)
mutual_info_regression(action, state[:, 0])

array([0.19169185, 0.2126726 , 0.17580874, 0.17112678, 0.23385294,
       0.26131139, 0.29388112, 0.40509893, 0.05630774, 0.2899672 ,
       0.17147522, 0.22899034, 0.11246855, 0.1948124 , 0.23155449,
       0.05756503, 0.17038606, 0.16362867, 0.17573942])
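
The call above covers only the first state dimension. To see the dependency between every state dimension and every action dimension, the same call could be repeated per state column and the results stacked into a matrix. The sketch below is an assumption about how that might look: the helper `mi_matrix` and the toy data are hypothetical, and `random_state` is fixed only to make the k-nearest-neighbor estimate reproducible.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def mi_matrix(state, action, n_state_dims=None, random_state=0):
    """Stack per-state-dimension MI estimates into a (state_dim, action_dim) array."""
    n_state_dims = state.shape[1] if n_state_dims is None else n_state_dims
    rows = [
        mutual_info_regression(action, state[:, j], random_state=random_state)
        for j in range(n_state_dims)
    ]
    return np.vstack(rows)

# Toy data (hypothetical): state column 0 depends on action column 0,
# state column 1 is independent of all action columns.
rng = np.random.default_rng(0)
toy_action = rng.normal(size=(500, 3))
toy_state = np.column_stack([
    toy_action[:, 0] + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
])

mi = mi_matrix(toy_state, toy_action)
print(mi.shape)  # (2, 3)
print(mi)
```

On the full dataset, each call processes all samples, so restricting `n_state_dims` or subsampling rows first may be needed to keep the run time manageable.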