# Exploring reconstruction data

Some notes.

1. Response times recorded on stimulus/submission records are incorrect and will need to be reconstructed from the time stamps.
2. I find *no indication* that there are errors in the final submission. This leaves the weird indicator observations at position 0 to be explained, but it's clear they are not causing the board representation to be incorrect.
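
A minimal sketch of the response-time reconstruction from note 1, assuming that stimulus (`reconi`) and submission (`reconf`) rows strictly alternate so the k-th `reconi` pairs with the k-th `reconf`. The column names follow the raw CSV layout; the toy values and the helper name `reconstruct_response_times` are made up for illustration.

```python
import pandas as pd

# Toy records; values are invented, only the column names come from the raw data.
records = pd.DataFrame({
    'Status': ['reconi', 'reconf', 'reconi', 'reconf'],
    'Time Stamp': [1000, 4500, 6000, 8200],
    'Response Time': [0, 0, 0, 0],
})

def reconstruct_response_times(df):
    """Set each 'reconi' row's response time to the gap until its 'reconf' row."""
    out = df.copy()
    reconi = out['Status'] == 'reconi'
    reconf = out['Status'] == 'reconf'
    # Assumption: rows alternate, so equal counts imply a one-to-one pairing.
    if reconi.sum() != reconf.sum():
        raise ValueError('unpaired reconi/reconf records')
    out.loc[reconi, 'Response Time'] = (
        out.loc[reconf, 'Time Stamp'].values - out.loc[reconi, 'Time Stamp'].values
    )
    return out

fixed = reconstruct_response_times(records)
```

The pairing check is there because a dropped record would silently shift every later response time.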

Some todos.

1. Count neighboring pieces at each position, for each color separately and for both colors combined, to use as predictors of error
2. Look at distribution of errors by unique position.
3. Should probably do a more proper factor analysis rather than independent tests and regressions, but these are adequate (and clear!) enough for a first pass
4. **SUPER IMPORTANT** import real/fake records for all positions!
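
For todo 1, a rough sketch of the neighbor count. It assumes a 4x9 board stored row-major as a 36-character string of `0`/`1`, and 8-connected neighborhoods; `neighbor_counts` is a hypothetical helper, not something in `lib`.

```python
import numpy as np

BOARD_SHAPE = (4, 9)  # assumed layout: 36 locations, row-major

def neighbor_counts(position_string, shape=BOARD_SHAPE):
    """Count occupied 8-connected neighbors for every board location."""
    board = np.array([int(c) for c in position_string]).reshape(shape)
    padded = np.pad(board, 1)  # zero border so edge cells need no special casing
    counts = np.zeros(shape, dtype=int)
    rows, cols = shape
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # a cell is not its own neighbor
            counts += padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
    return counts.ravel()

black = '1' + '0' * 35    # one black piece in the top-left corner
white = '01' + '0' * 34   # one white piece right next to it
black_neighbors = neighbor_counts(black)
white_neighbors = neighbor_counts(white)
all_neighbors = black_neighbors + white_neighbors
```

Per-color counts and the combined count come out as flat length-36 arrays, so they can sit next to the per-location error arrays.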


## Boilerplate

Imports and data loading.

In [None]:
import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.stats as sts
import seaborn as sns
import statsmodels
import statsmodels.api as sm
from statsmodels.formula.api import ols

from lib.utility_functions import *
from lib.exp4 import *

sns.set_style('white')
sns.set_context('talk')

%matplotlib inline

In [None]:
# data_dir = os.path.expanduser('/Volumes/GoogleDrive/My Drive/Bas Zahy Gianni - Games/Data/4_rcn/Raw Data')
data_dir = '../etc/4 Reconstruction/Raw Data/'
trained_files_dir = os.path.join(data_dir, 'Trained')
naive_files_dir = os.path.join(data_dir, 'Untrained')

all_files = get_all_csvs(trained_files_dir) + get_all_csvs(naive_files_dir)

DF = load_data(all_files)
DF = process_data(DF)

bpi, wpi, bpf, wpf = unpack_positions(DF)

black_errors = (bpf != bpi).astype(int)
white_errors = (wpf != wpi).astype(int)

In [None]:
DF.head()

In [None]:
df = load_file(all_files[0])
initial_map = DF[['Subject ID', 'Initials', 'Condition']]
initial_map = initial_map.pivot_table(index='Initials', values=['Subject ID', 'Condition'], aggfunc=lambda x: x.values[0])
initial_map.to_csv('../etc/4 Reconstruction/subject_map.csv')
# Get initials from file names

In [None]:
# ex = pd.read_csv(all_files[0], names=[
#         'Index', 'Subject ID', 'Player Color',
#         'Game Index', 'Move Index', 'Status',
#         'Black Position', 'White Position', 'Action',
#         'Response Time', 'Time Stamp',
#         'Mouse Timestamps', 'Mouse Position'
#     ])

# no_eyecal = ex['Status'] != 'eyecal'
# reconi = ex['Status'] == 'reconi'
# reconf = ex['Status'] == 'reconf'
# only_ends = reconi | reconf
# print(len(ex.loc[reconi]), len(ex.loc[reconf]))
# ex.loc[reconi, 'Response Time'] = ex.loc[reconf, 'Time Stamp'].values - ex.loc[reconi, 'Time Stamp'].values
# ex.loc[only_ends]

In [None]:
DF.columns

In [None]:
def count_final_pieces(row):
    num_bp = np.sum([int(i) for i in row['Black Position (final)']])
    num_wp = np.sum([int(i) for i in row['White Position (final)']])
    return num_bp + num_wp

DF['Num Pieces (final)'] = DF.apply(count_final_pieces, axis=1)
DF['Numerosity Error'] = np.abs(DF['Num Pieces'] - DF['Num Pieces (final)'])
DF['Response Time'] = DF['Response Time'] / 1000
DF.head()

In [None]:
DF.to_csv('./tidy_data.csv')

## Compute errors

First extract positions as numpy arrays for easier manipulation

### Error types


- **Black**: differences in black boards
- **White**: differences in white boards
- **Type I**: "false positive"; putting a piece where there was not one
- **Type II**: "false negative"; neglecting a piece where there should have been one
- **Type III**: "swap"; switching the color on a piece
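
The three error types can be sketched for a single trial as below. This is only an illustration of the definitions above, assuming 0/1 occupancy arrays per color; how `lib.exp4` actually fills the `Type I/II/III Errors` columns is not shown here, and `classify_errors` is a made-up name.

```python
import numpy as np

def classify_errors(black_i, white_i, black_f, white_f):
    """Return (type I, type II, type III) counts for one trial."""
    bi, wi, bf, wf = (
        np.asarray(a, dtype=bool) for a in (black_i, white_i, black_f, white_f)
    )
    occupied_i = bi | wi
    occupied_f = bf | wf
    type_1 = (~occupied_i & occupied_f).sum()   # piece added on an empty square
    type_2 = (occupied_i & ~occupied_f).sum()   # piece forgotten
    type_3 = ((bi & wf) | (wi & bf)).sum()      # piece kept, color swapped
    return int(type_1), int(type_2), int(type_3)

# Four-square toy board: one forgotten, one swapped, one added, one untouched.
result = classify_errors(
    black_i=[1, 0, 0, 0], white_i=[0, 1, 0, 0],
    black_f=[0, 1, 1, 0], white_f=[0, 0, 0, 0],
)
```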

## Questions

### Are there errors in the board representation data?

Yunqi previously had trouble with some oddities in the board construction sequences where a piece would be placed in position 0 (top left corner), but the board representation didn't change.

**Answer**: 
- No sign that there are excessive errors at position 0; not sure what Yunqi did before...
- Does *not* explain the questionable records. They must be utility indicators for the server/client, but I haven't found where in the code they're being produced or why they're necessary. Something to follow up on, but not a practical problem; those records can simply be dropped.

In [None]:
black_errors_by_location = black_errors.sum(axis=1)
white_errors_by_location = white_errors.sum(axis=1)

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(15, 5), squeeze=False, sharex=True, sharey=True)

ax = axes[0, 0]
ax.bar(np.arange(36), black_errors_by_location)
plt.setp(ax, ylabel='Error Count', xlabel='Board Location (Black)')

ax = axes[0, 1]
ax.bar(np.arange(36), white_errors_by_location)
plt.setp(ax, xlabel='Board Location (White)')

sns.despine()

Just another very quick confirmation - there are *never* conflicts between pieces of different color in the final representation.

In [None]:
((bpf == 1) & (wpf == 1)).astype(int).sum()

### Are there differences in error rates between experts and non experts?

**Answer**: Looks like yes to me, but the effect has a complicated and significant relationship with the number of pieces. What's the correct analysis? Is this an ANOVA sort of thing for frequentists?
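
One candidate analysis, offered only as a sketch and not as the settled answer: a single regression with a Condition x Num Pieces interaction term, instead of separate per-group fits. The data below are synthetic (the effect sizes are invented) and the fit uses plain least squares.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
trained = rng.integers(0, 2, size=n)    # 1 = Trained, 0 = Naive (synthetic)
pieces = rng.integers(10, 20, size=n)   # target number of pieces
# Invented ground truth: errors grow with pieces, more slowly for trained players.
errors = 1.0 + 0.5 * pieces - 0.3 * trained * pieces + rng.normal(scale=0.5, size=n)

# Design matrix: intercept, condition, pieces, condition x pieces.
X = np.column_stack([np.ones(n), trained, pieces, trained * pieces])
coef, *_ = np.linalg.lstsq(X, errors, rcond=None)
intercept, b_trained, b_pieces, b_interaction = coef
```

A nonzero `b_interaction` is what "the slope differs between groups" looks like in this framing. This ignores repeated measures within subject, so it is a first pass at best.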

TODO: ELO rating effect for trained subjects?

In [None]:
piv = DF.pivot_table(index='Num Pieces', values='Total Errors', columns='Condition', margins=True, aggfunc=np.mean)
piv # just showing off pandas here

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(15, 10), squeeze=False)

tr = DF.loc[DF['Condition'] == 'Trained']
na = DF.loc[DF['Condition'] == 'Naive']

ax = axes[0, 0]
sns.barplot(x='Condition', y='Total Errors', data=DF, ax=ax)
plt.setp(ax, ylabel='Mean # Errors')
ttest = sts.ttest_ind(tr['Total Errors'], na['Total Errors'])
print('Condition independent t-test:\n', ttest, '\n\n')

ax = axes[0, 1]
sns.barplot(x='Num Pieces', y='Total Errors', hue='Condition', data=DF, ax=ax, ci=None)
plt.setp(ax, xlabel='Target # Pieces', ylabel='Mean # Errors')

ax = axes[1, 1]

ax.scatter(tr['Num Pieces'], tr['Total Errors'], alpha=.5)
ax.scatter(na['Num Pieces'], na['Total Errors'], alpha=.5)
lr = sts.linregress(DF['Num Pieces'], DF['Total Errors'])
lr_tr = sts.linregress(tr['Num Pieces'], tr['Total Errors'])
lr_na = sts.linregress(na['Num Pieces'], na['Total Errors'])
print('# Pieces vs Total Errors correlation\n\nAll:', lr, '\n\nTrained', lr_tr, '\n\nNaive', lr_na)

x = np.arange(10, 20)
ax.plot(x, x * lr_tr.slope + lr_tr.intercept, linewidth=3)
ax.plot(x, x * lr_na.slope + lr_na.intercept, linewidth=3)
ax.plot(x, x * lr.slope + lr.intercept, color='black', linewidth=4)
plt.setp(ax, ylabel='Total # Errors', xlabel='Target # Pieces')

axes[1, 0].set_visible(False)

sns.despine()

### Are there different patterns for forgetting pieces, adding extras, and switching colors?

**Answer**: Yes, looks like it.

- More experienced players *may* be *slightly* more likely to add a piece where none previously existed
- More experienced players are substantially less likely to forget a piece
- More experienced players are somewhat less likely to get the color of a piece wrong

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(21, 5), squeeze=False, sharey=True)

ax = axes[0, 0]
sns.barplot(x='Condition', y='Type I Errors', data=DF, ax=ax)
ttest = sts.ttest_ind(tr['Type I Errors'], na['Type I Errors'])
print('Type I Ttest:\n', ttest, '\n')

ax = axes[0, 1]
sns.barplot(x='Condition', y='Type II Errors', data=DF, ax=ax)
ttest = sts.ttest_ind(tr['Type II Errors'], na['Type II Errors'])
print('Type II Ttest:\n', ttest, '\n')

ax = axes[0, 2]
sns.barplot(x='Condition', y='Type III Errors', data=DF, ax=ax)
ttest = sts.ttest_ind(tr['Type III Errors'], na['Type III Errors'])
print('Type III Ttest:\n', ttest, '\n')


sns.despine()

### How does error likelihood depend on location?

Bonferroni correction!!

The below could use some further clarity - looks like it has to do with distribution of stimuli as much as anything else. For example, could divide by number of times a piece is located at that position.
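
A minimal sketch of the Bonferroni correction, assuming one p-value per comparison (for example one test per board location, so 36 of them). The p-values here are placeholders and `bonferroni` is a hypothetical helper.

```python
import numpy as np

def bonferroni(p_values, alpha=0.05):
    """Scale each p-value by the number of tests and flag the survivors."""
    p = np.asarray(p_values, dtype=float)
    adjusted = np.minimum(p * p.size, 1.0)  # cap at 1 so they stay probabilities
    reject = adjusted <= alpha
    return adjusted, reject

p_vals = [0.001, 0.02, 0.04, 0.5]  # placeholder values, not real results
adjusted, reject = bonferroni(p_vals)
```

With four tests only the first placeholder survives; with 36 locations the per-test threshold drops to roughly 0.0014.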

In [None]:
fig, axes = plt.subplots(3, 3, figsize=(21, 12), squeeze=False)
heatmap_kws = {'cbar': False, 'square': True}
heatmap = lambda data, ax: sns.heatmap(data.reshape([4, 9]), ax=ax, **heatmap_kws)

heatmap(black_errors_by_location, axes[0, 0])
heatmap(bpi.sum(axis=1), axes[1, 0])
heatmap(black_errors_by_location / bpi.sum(axis=1), axes[2, 0])
heatmap(white_errors_by_location, axes[0, 1])
heatmap(wpi.sum(axis=1), axes[1, 1])
heatmap(white_errors_by_location / wpi.sum(axis=1), axes[2, 1])
heatmap(black_errors_by_location + white_errors_by_location, axes[0, 2])
heatmap((bpi + wpi).sum(axis=1), axes[1, 2])
heatmap((black_errors_by_location + white_errors_by_location) / (bpi + wpi).sum(axis=1), axes[2, 2])

plt.setp(axes, yticklabels=[], xticklabels=[])
plt.setp(axes[0, 0], ylabel='# Errors', xlabel='Black')
plt.setp(axes[0, 1], xlabel='White')
plt.setp(axes[0, 2], xlabel='All')
plt.setp(axes[1, 0], ylabel='# Occurrences')
plt.setp(axes[2, 0], ylabel='# Errors / # Occurrences')

sns.despine(left=True, bottom=True)

### Is there an effect from real vs fake positions? Is there an interaction between position type and condition?

**Answer**: Yes effect; no interaction. It looks like the fake positions were substantially harder for *both* groups. This signals a substantial difference in basic statistics between the real and fake positions. (I think this is probably a manipulation failure).
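
As a sanity check on the mixed-design ANOVA used for this question, the sketch below partitions the total sum of squares on synthetic, balanced data (random numbers, not the experiment) and confirms the five pieces add back up to the total. Here `a` is the number of groups, `n` the subjects per group, and `b` the number of within-subject levels.

```python
import numpy as np

rng = np.random.default_rng(0)
a, n, b = 2, 5, 2                # groups, subjects per group, within levels
Y = rng.normal(size=(a, n, b))   # synthetic per-subject cell means

grand = Y.mean()
group = Y.mean(axis=(1, 2))      # shape (a,)
level = Y.mean(axis=(0, 1))      # shape (b,)
cell = Y.mean(axis=1)            # shape (a, b)
subject = Y.mean(axis=2)         # shape (a, n)

ss_between = b * n * ((group - grand) ** 2).sum()
ss_subjects = b * ((subject - group[:, None]) ** 2).sum()
ss_within = a * n * ((level - grand) ** 2).sum()
ss_interaction = n * ((cell - group[:, None] - level[None, :] + grand) ** 2).sum()
ss_error = ((Y - cell[:, None, :] - subject[:, :, None]
             + group[:, None, None]) ** 2).sum()
ss_total = ((Y - grand) ** 2).sum()

# F ratios: between is tested against subjects, the rest against error.
df_subjects = a * (n - 1)
df_error = df_subjects * (b - 1)
f_between = (ss_between / (a - 1)) / (ss_subjects / df_subjects)
f_within = (ss_within / (b - 1)) / (ss_error / df_error)
f_interaction = (ss_interaction / ((a - 1) * (b - 1))) / (ss_error / df_error)
```

Each p-value is the upper-tail probability of the corresponding F distribution, for example `scipy.stats.f(dfn, dfd).sf(F)`.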

In [None]:
DF.pivot_table(index='Is Real', columns='Condition', values='Total Errors')

In [None]:
DF.columns

In [None]:
class ANOVA(object):
    """2-way ANOVA wrapper for a balanced between-withins design"""
    def __init__(self, dataframe, between_factor, within_factor, subject_factor, target, alpha=.05):
        self.dataframe = dataframe
        self.between_factor = between_factor
        self.within_factor = within_factor
        self.subject_factor = subject_factor
        self.target = target
        
        self.num_subjects = dataframe[subject_factor].unique().size // 2
        self.num_between_levels = dataframe[between_factor].unique().size
        self.num_within_levels = dataframe[within_factor].unique().size
        
        self.alpha = alpha
        
        self.get_means_table()
        
        self.subject_pivot = self.means_table.pivot_table(
            index=between_factor, values=['False', 'True'], # fix to get values from df
            aggfunc=np.mean
        )
        
        self.mu = self.subject_pivot.mean(axis=0).mean()
        
    def get_means_table(self):
        self.means_table = self.dataframe.pivot_table(
            index=self.subject_factor,
            columns=[self.within_factor],
            values=self.target,
            aggfunc=np.mean
        )
        
        self.means_table[self.between_factor] = self.means_table.index.map(
            self._add_condition_map_func
        )
        
        self.means_table[self.between_factor] = self.means_table[self.between_factor].map(
            {'Trained': 0, 'Naive': 1}
        )
        
        self.means_table[self.subject_factor] = self.means_table.index
        self.means_table.columns = self.means_table.columns.map(str)
        self.means_table_vals = self.means_table.loc[:, ['False', 'True']]
        
        return None
    
    def _add_condition_map_func(self, x):
        condition_filter = self.dataframe[self.subject_factor] == x
        return self.dataframe.loc[condition_filter, self.between_factor].values[0]
        
    def df_between(self):
        return self.num_between_levels - 1
    
    def df_within(self):
        return self.num_within_levels - 1
    
    def df_interaction(self):
        return self.df_between() * self.df_within()
    
    def df_subjects(self):
        return self.num_between_levels * (self.num_subjects - 1)
    
    def df_error(self):
        return self.df_subjects() * self.df_within()
    
    def df_total(self):
        return self.num_between_levels * self.num_within_levels * self.num_subjects - 1
        
    def sum_of_squares_between(self):
        print(self.subject_pivot.mean(axis=1))
        y0 = self.subject_pivot.mean(axis=1).values
        y1 = self.mu
        
        squares = (y0 - y1)**2
        ss = squares.sum()
        
        return self.num_within_levels * self.num_subjects * ss
        
    def sum_of_squares_within(self):
        print(self.subject_pivot.mean(axis=0))
        y0 = self.subject_pivot.mean(axis=0).values
        y1 = self.mu
        
        squares = (y0 - y1)**2
        ss = squares.sum()
        
        return self.num_between_levels * self.num_subjects * ss
    
    def sum_of_squares_interaction(self):
        y0 = self.subject_pivot.values
        y1 = y0.mean(axis=1)[:, np.newaxis]
        y2 = y0.mean(axis=0)[np.newaxis, :]
        y3 = self.mu

        squares = (y0 - y1 - y2 + y3)**2
        ss = squares.sum()
        
        return self.num_subjects * ss
        
    def sum_of_squares_subjects(self):
        y0 = self.means_table_vals.mean(axis=1).values
        y1 = self.subject_pivot.mean(axis=1).values
        
        y1_p = np.concatenate([
            np.tile(y1[0], self.num_subjects),
            np.tile(y1[1], self.num_subjects)
        ])
        
        squares = (y0 - y1_p)**2
        ss = squares.sum()
        
        return self.num_within_levels * ss
        
    def sum_of_squares_error(self):
        y0 = self.means_table_vals.values
        y1 = self.subject_pivot.values
        y2 = self.means_table_vals.mean(axis=1).values
        y3 = self.subject_pivot.mean(axis=1).values

        y1_p = np.stack([
            np.tile(y1[:, 0], self.num_subjects), 
            np.tile(y1[:, 1], self.num_subjects)
        ], axis=1)
        
        y3_p = np.concatenate([
            np.tile(y3[0], self.num_subjects),
            np.tile(y3[1], self.num_subjects)
        ])
        
        squares = (y0 - y1_p - y2[:, np.newaxis] + y3_p[:, np.newaxis])**2
        ss = squares.sum()
        
        return ss
        
    def sum_of_squares_total(self):
        y0 = self.means_table_vals.values
        y1 = self.mu
        
        squares = (y0 - y1)**2
        ss = squares.sum()
        
        return ss
        
    def F_between(self):
        ms_between = self.sum_of_squares_between() / self.df_between()
        ms_subjects = self.sum_of_squares_subjects() / self.df_subjects()
        
        F = ms_between / ms_subjects
        f_dist = sts.f(self.df_between(), self.df_subjects())
        
        p_val = f_dist.sf(F)  # upper-tail probability, P(F' >= F)
        
        return F, p_val
        
    def F_within(self):
        print(self.sum_of_squares_within())
        ms_within = self.sum_of_squares_within() / self.df_within()
        ms_error = self.sum_of_squares_error() / self.df_error()
        
        F = ms_within / ms_error
        f_dist = sts.f(self.df_within(), self.df_error())
        
        p_val = f_dist.sf(F)  # upper-tail probability, P(F' >= F)
        
        return F, p_val
    
    def F_interaction(self):
        ms_interaction = self.sum_of_squares_interaction() / self.df_interaction()
        ms_error = self.sum_of_squares_error() / self.df_error()
        
        F = ms_interaction / ms_error
        f_dist = sts.f(self.df_interaction(), self.df_error())
        
        p_val = f_dist.sf(F)  # upper-tail probability, P(F' >= F)
        
        return F, p_val

In [None]:
anova = ANOVA(DF, 'Condition', 'Is Real', 'Subject ID', 'Response Time')
print(anova.num_subjects, anova.num_between_levels, anova.num_within_levels)

In [None]:
print("Between", anova.F_between())
print("Within", anova.F_within())
print("Interaction", anova.F_interaction())

In [None]:
anova.sum_of_squares_subjects()

In [None]:
anova.sum_of_squares_between()

In [None]:
anova.sum_of_squares_error()

In [None]:
anova.sum_of_squares_within()

In [None]:
c = anova.means_table.pivot_table(index='Condition', values=['False', 'True'], aggfunc=np.mean)

In [None]:
c.mean(axis=0).values[np.newaxis, :]

In [None]:
c.values

In [None]:
c.values - c.mean(axis=0).values[np.newaxis, :]

In [None]:
anova.means_table_vals.mean().mean()

In [None]:
a = np.arange(4).reshape([2, 2])
b = np.array([1, 0])[:, np.newaxis]
c = np.array([1, 0])[np.newaxis, :]

a - c

In [None]:
anova.means_table

In [None]:
means_for_weiji = DF.pivot_table(index='Subject ID', columns=['Is Real'], values='Total Errors', aggfunc=np.mean)
means_for_weiji['Condition'] = means_for_weiji.index.map(lambda x: DF.loc[DF['Subject ID'] == x, 'Condition'].values[0])
means_for_weiji['Condition'] = means_for_weiji['Condition'].map({'Trained': 0, 'Naive': 1})
means_for_weiji.to_csv('~/Downloads/reconstruction_means.csv')

In [None]:
means_for_weiji

In [None]:
mu = DF['Total Errors'].mean()
Apiv = DF.pivot_table(index='Condition', values='Total Errors')
Bpiv = DF.pivot_table(index='Is Real', values='Total Errors')
ABpiv = DF.pivot_table(index='Condition', columns='Is Real', values='Total Errors')
Sapiv = DF.pivot_table(index='Subject ID', values='Total Errors')

In [None]:
Sapiv

In [None]:
Apiv.loc['Naive']

In [None]:
ABpiv.values

In [None]:
Apiv.values

In [None]:
a = 2     # Condition
b = 2     # Is Real
s = 38    # Subject ID

SSA = b * s * ((Apiv - mu)**2).sum()
SSB = a * s * ((Bpiv - mu)**2).sum()
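# Rows of ABpiv are Condition, columns are Is Real: subtract Condition means
# down the rows and Is Real means across the columns.
SSAB = s * ((ABpiv.values - Apiv.values - Bpiv.values.T + mu)**2).sum()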

In [None]:
term = Sapiv.copy()
term.loc[means_for_weiji['Condition'] == 0] = term.loc[means_for_weiji['Condition'] == 0].values - Apiv.loc['Trained'].values
term.loc[means_for_weiji['Condition'] == 1] = term.loc[means_for_weiji['Condition'] == 1].values - Apiv.loc['Naive'].values
SSSa = b * (term.values**2).sum()

In [None]:
# Within-subject residual: cell score - cell mean - subject mean + group mean
term = means_for_weiji.copy()
cells = [False, True]  # 'Is Real' levels as column labels (assumed boolean)
trained = term['Condition'] == 0
naive = term['Condition'] == 1

term.loc[trained, cells] = term.loc[trained, cells].values - ABpiv.loc['Trained'].values + Apiv.loc['Trained'].values
term.loc[naive, cells] = term.loc[naive, cells].values - ABpiv.loc['Naive'].values + Apiv.loc['Naive'].values
# Subtract subject means from the score columns only, not from 'Condition'
term.loc[:, cells] = term[cells].values - Sapiv.values
SSB_Sa = (term[cells].values**2).sum()

In [None]:
len(means_for_weiji)

In [None]:
smDF = pd.DataFrame(index=DF.index, columns=['target', 'f1', 'f2'])

smDF['target'] = DF['Total Errors']
smDF['f3'] = DF['Subject ID']
smDF['f2'] = DF['Is Real']
smDF['f1'] = DF['Condition']

formula = 'target ~ C(f1) + C(f2):C(f3) + C(f1):C(f2):C(f3)'
model = ols(formula, smDF).fit()

anova_table = statsmodels.stats.anova.anova_lm(model, typ=1)
anova_table

In [None]:
model.summary()

In [None]:
smDF = pd.DataFrame(index=DF.index, columns=['target', 'f1', 'f2'])

smDF['target'] = DF['Numerosity Error']
smDF['f2'] = DF['Is Real']
smDF['f1'] = DF['Condition']

formula = 'target ~ C(f1) + C(f2) + C(f1):C(f2)'
model = ols(formula, smDF).fit()

anova_table = statsmodels.stats.anova.anova_lm(model, typ=1)
anova_table

In [None]:
smDF = pd.DataFrame(index=DF.index, columns=['target', 'f1', 'f2'])

smDF['target'] = DF['Total Errors']
smDF['f2'] = DF['Is Real']
smDF['f1'] = DF['Condition']
smDF['f3'] = DF['Num Pieces']


formula = 'target ~ C(f1) + C(f2) + C(f3) + C(f1):C(f2) + C(f2):C(f3) + C(f1):C(f3) + C(f1):C(f2):C(f3)'
model = ols(formula, smDF).fit()

anova_table = statsmodels.stats.anova.anova_lm(model, typ=1)
anova_table

In [None]:
model.summary()

In [None]:
smDF = pd.DataFrame(index=DF.index, columns=['target', 'f1', 'f2'])

smDF['target'] = DF['Numerosity Error']
smDF['f2'] = DF['Is Real']
smDF['f1'] = DF['Condition']
smDF['f3'] = DF['Num Pieces']


formula = 'target ~ C(f1) + C(f2) + C(f3) + C(f1):C(f2) + C(f2):C(f3) + C(f1):C(f3) + C(f1):C(f2):C(f3)'
model = ols(formula, smDF).fit()

anova_table = statsmodels.stats.anova.anova_lm(model, typ=1)
anova_table

In [None]:
sns.barplot(x='Is Real', y='Total Errors', hue='Condition', data=DF)
sns.despine()

In [None]:
sns.barplot(x='Condition', y='Numerosity Error', hue='Is Real', data=DF)
ax = plt.gca()
ax.legend(loc=0)
sns.despine()

In [None]:
# sns.plot does not exist; a bar plot split by Condition is assumed here
sns.barplot(x='Position ID', y='Total Errors', hue='Condition', data=DF)

In [None]:
DF.Condition.unique()

In [None]:
pidpiv = DF.pivot_table(index='Position ID', values=['Total Errors', 'Num Pieces'], columns='Condition')
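# Columns are a (value, Condition) MultiIndex, so sort keys must be tuples
pidpiv.sort_values([('Num Pieces', 'Trained'), ('Total Errors', 'Trained')])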

In [None]:
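# Order positions by mean error for trained subjects, then plot both groups
pidpiv = pidpiv.sort_values(('Total Errors', 'Trained'))
plt.plot(pidpiv[('Total Errors', 'Trained')].values, label='Trained')
plt.plot(pidpiv[('Total Errors', 'Naive')].values, label='Naive')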

ax = plt.gca()
ax.legend(loc=0)
sns.despine()

## Scrap

Make this into a standalone script at some point

In [None]:
with open('../etc/4 Reconstruction/stimuli.txt', mode='r') as f:
    positions = f.readlines()
    
def strip_position(position_string):
    s = [p.strip("'") for p in position_string.split('(')[1].split(')')[0].split(', ')[:2]]
    return s[0] + s[1]

stimuli = list(map(strip_position, positions))
fake_mask = np.ones(len(stimuli), dtype=int)
fake_mask[:len(stimuli)//2] = 0

stim_map = pd.DataFrame(index=stimuli, data=fake_mask.astype(bool), columns=['Is Real'])
stim_map['Position ID'] = np.arange(len(stim_map), dtype=int)
stim_map['Position dummy'] = stim_map.index
stim_map = stim_map.drop_duplicates(subset='Position dummy')
stim_map[['Is Real', 'Position ID']].to_csv('../etc/4 Reconstruction/position_map.csv')

In [None]:
stim_map = pd.read_csv('../etc/4 Reconstruction/position_map.csv', index_col=0, skiprows=1, names=['Position', 'Is Real', 'Position ID'])