In [1]:
%matplotlib notebook
import subject_DM
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Loaded entropies from file (S=15000, entropy_bins=50)!
Loaded dot-level measures from file (S=2000, Smin=20)!


# Investigating correlations between accumulated evidence and response
It makes sense that the accumulated evidence, which is simply the sum of dot x-coordinates, becomes increasingly correlated with the subject's response as more and more dots are summed. This is because, when subjects make 'correct' choices, their choices reflect the mean dot location, and the sum of more and more dots is also a (misscaled) estimate of the mean dot location. It is also intuitively clear, because we assume that subjects make decisions based on accumulated evidence. It is therefore intrinsically hard to disambiguate signals encoding accumulated evidence from signals encoding the response. At least for early time points, however, accumulated evidence should differ from the response across trials.

Here I investigate what 'early' means and how strongly accumulated evidence and response are related quantitatively. I simply measure this with the correlation coefficient between the corresponding regressors. I compute the correlation across trials, but within subject, because I typically first run a within-subject analysis/regression.
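
As a quick sanity check of the intuition above, here is a minimal simulation on synthetic data. It is only a sketch and not part of the analysis: the stimulus statistics (trial means of ±25, standard deviation 70) are made-up values, and the simulated subject is assumed to be an ideal observer who responds with the sign of the sum over all 14 dots.

```python
import numpy as np

rng = np.random.default_rng(0)

n_trials, n_dots = 2000, 14
# Hypothetical stimulus statistics, not taken from the experiment.
target = rng.choice([-1, 1], size=n_trials)
dot_x = target[:, None] * 25 + rng.normal(0, 70, size=(n_trials, n_dots))

# Accumulated evidence after each dot: running sum of x-coordinates.
sum_dot_x = np.cumsum(dot_x, axis=1)

# Assumed ideal observer: responds with the sign of the sum over all dots.
response = np.sign(sum_dot_x[:, -1])

# Correlation between response and accumulated evidence after each dot.
sim_corr = np.array([np.corrcoef(sum_dot_x[:, d], response)[0, 1]
                     for d in range(n_dots)])
print(np.round(sim_corr, 2))
```

Under these assumptions the correlation should start low for the first dot and approach its maximum as the sum covers all dots, which is the pattern I expect to see in the real data below.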

In [2]:
dots = np.arange(1, 15)
DM = subject_DM.get_trial_DM(dots, r_names=['sum_dot_x', 'response', 'correct_ideal'])

The following design matrix will contain only error trials, defined as those trials in which the response was opposite to what the accumulated evidence (sum_dot_x) indicated after a given dot.

In [3]:
decision_dot = 4
DM_errors = DM[DM.response != DM['correct_ideal_%d' % decision_dot]]
DM_errors = DM_errors[DM_errors.response.abs() > 0]
DM_errors.head()

subject,trial,response,correct_ideal_1,sum_dot_x_1,correct_ideal_2,sum_dot_x_2,correct_ideal_3,sum_dot_x_3,correct_ideal_4,sum_dot_x_4,correct_ideal_5,...,correct_ideal_10,sum_dot_x_10,correct_ideal_11,sum_dot_x_11,correct_ideal_12,sum_dot_x_12,correct_ideal_13,sum_dot_x_13,correct_ideal_14,sum_dot_x_14
2,4,-1.0,-1,-35.0,-1,-12.0,-1,-56.0,1,21.0,-1,...,-1,-444.0,-1,-472.0,-1,-520.0,-1,-576.0,-1,-549.0
2,6,-1.0,-1,-69.0,-1,-11.0,1,100.0,1,93.0,-1,...,-1,-13.0,-1,-58.0,-1,-39.0,-1,-61.0,1,13.0
2,7,-1.0,1,48.0,1,45.0,-1,-20.0,1,83.0,-1,...,-1,-323.0,-1,-390.0,-1,-420.0,-1,-575.0,-1,-629.0
2,13,1.0,1,21.0,1,65.0,1,43.0,-1,-32.0,-1,...,1,67.0,-1,-14.0,1,120.0,1,182.0,1,105.0
2,17,-1.0,1,18.0,1,1.0,1,71.0,1,81.0,-1,...,-1,-65.0,-1,-58.0,-1,-71.0,-1,-70.0,-1,-58.0


Uncomment the corresponding line to choose between all trials and only error trials:

In [4]:
dmcols = [name for name in DM.columns if ((name == 'response') or (name.startswith('sum_dot_x')))]

correlations = DM_errors.groupby(level='subject').apply(lambda dm: dm[dmcols].corr().loc['response'])
#correlations = DM.groupby(level='subject').apply(lambda dm: dm[dmcols].corr().loc['response'])

del correlations['response']
correlations.columns = pd.Index(dots, name='dot')

In [5]:
fig, ax = plt.subplots(1)
S = correlations.shape[0]
for dot in dots:
    plt.plot(dot + np.random.randn(S)*0.05, correlations[dot], '.k')
plt.xlabel('accev up to dot ...')
plt.ylabel('correlation with response');


The plot shows that correlations rise with the number of summed dots, as expected. Intuitively, I recognise three larger jumps in correlation: from 1 to 2 dots, from 2 to 3 dots and from 4 to 5 dots. For more than 5 dots the correlations continue to rise, but more slowly. These jumps in correlation mean that the added dot explains a sizable part of the subjects' responses. This makes sense for the early dots, as we can be pretty sure that subjects consider them for their decisions and actually integrate information from them. It also makes sense that adding the 5th dot considerably increases the correlation, because we originally designed the stimuli such that the 5th dots particularly bias the responses, by setting some 5th dot x-locations quite far from the centre. I don't understand why there is no noticeable difference between correlations for the 3rd and 4th dots, but perhaps this is also due to our choice of stimuli, which were also selected so that subjects have longer response times and actually use the 5th dot for their decision. This could mean that, as a side effect, the 4th dots in these stimuli provide very little information about the decision.
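
The jumps could also be quantified instead of being read off the plot. A minimal sketch, assuming a table laid out like `correlations` (subjects in rows, dots in columns); the numbers in the toy table are placeholders for illustration, not my data:

```python
import pandas as pd

# Toy stand-in for `correlations` (placeholder values, not real results).
toy = pd.DataFrame(
    [[0.10, 0.30, 0.45, 0.46],
     [0.20, 0.35, 0.50, 0.52]],
    index=pd.Index([1, 2], name='subject'),
    columns=pd.Index([1, 2, 3, 4], name='dot'))

# Change in correlation when one more dot is added, per subject.
# The first column has no predecessor, so it is dropped.
increments = toy.diff(axis=1).iloc[:, 1:]

# Median increment across subjects for each added dot.
median_jump = increments.median(axis=0)
print(median_jump)
```

Applied to the real `correlations` table, large entries in `median_jump` would mark the dots whose addition changes the correlation most.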

The correlations reach high values above 0.7 for the later dots, but already from the 5th dot onwards correlations tend to be above 0.5. Only for the first two dots do the correlations stay mostly below 0.4. This means that a linear regression analysis may be susceptible to mixing of the two signals during inference, i.e., the more dots are included in the accumulated evidence regressor, the more you'd expect it to pick up (part of) the response signal.
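
To get a rough sense of scale: in an ordinary least squares regression with two regressors that correlate with coefficient r, the variance of each coefficient estimate is inflated by the factor 1 / (1 - r^2) relative to uncorrelated regressors (the variance inflation factor). The sketch below evaluates this standard formula for the approximate correlation levels mentioned above. It assumes that only these two regressors are in the model, which simplifies my actual design matrices, and `variance_inflation` is a helper name introduced here for illustration.

```python
def variance_inflation(r):
    """Variance inflation factor for one of two regressors with correlation r."""
    return 1.0 / (1.0 - r ** 2)

# Approximate correlation levels discussed in the text.
for r in (0.4, 0.5, 0.7):
    print(f"r = {r:.1f}: VIF = {variance_inflation(r):.2f}")
```

Under this simplification the inflation grows with the number of summed dots, so estimates for the two regressors become noisier and harder to tell apart as the correlation rises.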

## Meaning for the interpretation of accumulated evidence effects
My main motivation for looking at this in detail comes from my results that the motor cortex has strong accumulated evidence effects. So the question is whether this is just an artefact of the eventually executed response, or whether the motor cortex signal actually fluctuates together with accumulated evidence.

I see the accumulated evidence effects in motor cortex both in my whole-trial analysis and in the sequential analysis, where the regression only includes selected time points. In the whole-trial analysis the number of dots contributing to accumulated evidence depends on the assumed delay after dot onset after which accumulated evidence starts to be represented in cortex. As the delay goes to 0, the number of dots contributing at 900 ms after first dot onset increases. If the motor cortex effects were driven only by the response, they should therefore increase, or at least stay stable, with decreasing delay, because the correlation between the response and accumulated evidence increases as more dots are added. This is also reflected in the correlations of the corresponding regressors: with delay = 0.1 the correlation between the accumulated evidence and motor response regressors is 0.60, while it is 0.53 for delay = 0.3. Yet, the motor cortex effects are significant with delay = 0.3, but not with delay = 0.1. I interpret this as evidence that the motor cortex indeed contains a signal fluctuating with accumulated evidence, that this signal peaks around 300 ms after dot onset, and that it is independent of the simple preparation of the response.

The sequential analysis also supports this interpretation: I observe strong motor cortex effects for accumulated evidence both when I include the first 5 dots in the analysis and when I only include the first 3 dots. The latter is remarkable, because accumulated evidence in this analysis is defined as the accumulated evidence up to the previous dot, so the analysis includes at most the sum of the first two dots, which, as seen above, correlates with the response only up to values around 0.4. Also, I observe these effects at 300 and 400 ms after first dot onset, which is well before most responses: response times have a median of around 900 ms.

# Correlations between dot x-coordinates and response

In [6]:
dots = np.arange(1, 15)
DM = subject_DM.get_trial_DM(dots, r_names=['dot_x', 'response'])
correlations = DM.groupby(level='subject').apply(lambda dm: dm.corr().loc['response'])
del correlations['response']
correlations.columns = pd.Index(dots, name='dot')

In [7]:
fig, ax = plt.subplots(1)
S = correlations.shape[0]
for dot in dots:
    plt.plot(dot + np.random.randn(S)*0.05, correlations[dot], '.k')
plt.xlabel('dot number')
plt.ylabel('correlation with response');


In [8]:
r_name = 'dot_x_4'
fig, ax = plt.subplots(1)
ax.plot(DM.response + np.random.randn(DM.shape[0]) * 0.05, DM[r_name], '.k')
plt.xlabel('response')
plt.ylabel(r_name);
