# Categories vs Dimensions Analysis
Is knowledge about social relationships conceptually organized as categories or dimensions?

Representational similarity analysis (RSA) will be used to explore this question. Representational dissimilarity matrices (RDMs) will be created for the dimensional and categorical tasks for each participant. These will be the unit of comparison for each participant.

In [None]:
import os
import glob
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from nltools.data import Brain_Data, Adjacency
from nltools.mask import expand_mask
from nltools.stats import fdr, threshold, fisher_r_to_z, one_sample_permutation
from sklearn.metrics import pairwise_distances
from nilearn.plotting import plot_glass_brain, plot_stat_map

# Dimensions Analysis
To capture the dimensional organization of social relationships, we will consider how participants organize relationships in a "free-arrangement" task. In this task, participants were asked to arrange 159 social relationships within a circular area, based on their own perceived similarities among the relationships.

The difference between relationships will be captured as the distance between the relationships' placements within the circle. This gives us a continuous measure of the dissimilarity between relationships.

dissimilarity = Euclidean distance between relationships, after multi-dimensional scaling (MDS) was completed
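
To make the distance measure concrete, here is a minimal sketch of pairwise Euclidean distances between placements. The coordinates are hypothetical, and in this analysis the task software already supplies the distances, so this is only an illustration of the measure and not the software's actual procedure.

```python
import numpy as np

def euclidean_rdm(coords):
    """Return an (n, n) matrix of Euclidean distances between the rows of coords."""
    coords = np.asarray(coords, dtype=float)
    # Broadcast to get every pairwise coordinate difference, then take the norm
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical (x, y) placements of three relationships inside the circle
example_coords = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]])
example_rdm = euclidean_rdm(example_coords)
```

The result is symmetric with zeros on the diagonal, which is the form expected of an RDM.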

### Data import and prep
Distances were automatically calculated by the task software (distance from MDS). The task output is a 1-D array of dissimilarities between each pair of relationships. For each subject, this 1-D array will be converted into a 2-D dissimilarity matrix.
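
The 1-D to 2-D conversion can be sketched as follows. This assumes the vector lists each unique pair once, in row-major upper-triangle order (the same convention as `scipy.spatial.distance.squareform`); the task software's actual ordering should be checked before relying on it. The function name `vector_to_rdm` is hypothetical.

```python
import numpy as np

def vector_to_rdm(vec, n_items):
    """Expand a condensed vector of pairwise dissimilarities into a square RDM."""
    vec = np.asarray(vec, dtype=float)
    expected = n_items * (n_items - 1) // 2
    if vec.size != expected:
        raise ValueError(f"expected {expected} pairs for {n_items} items, got {vec.size}")
    rdm = np.zeros((n_items, n_items))
    # Assumed ordering: pairs (0,1), (0,2), ..., (1,2), ... (row-major upper triangle)
    rows, cols = np.triu_indices(n_items, k=1)
    rdm[rows, cols] = vec
    rdm[cols, rows] = vec  # mirror so the matrix is symmetric
    return rdm

# Toy example with three items: pairs (0,1), (0,2), (1,2)
toy_rdm = vector_to_rdm([1.0, 2.0, 3.0], n_items=3)
```

With 159 relationships, the vector would need 159 × 158 / 2 = 12,561 entries under this assumption.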

## Remove bad participants
Remove participants who did not do the task correctly

## Create RDM

## Plot RDM for a single subject 
This is for visualization only; no results are presented here.

# Category Analysis
To capture the categorical organization of social relationships, we will consider how participants organize relationships into discrete categories. In this task, participants were asked to organize 159 social relationships into categories of their choosing. They could make up to 8 categories, and could name them anything they wanted. Participants were also able to place a single relationship into multiple categories.

The difference between relationships will be calculated based on whether relationships were in the same category or not. A dissimilarity matrix will be created in which pairs of relationships placed in the same category have a value of "0" in the corresponding cell, and pairs placed in different categories have a value of "1".

dissimilarity = same (0) or different (1) categories
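
A minimal sketch of this binary coding, starting from a relationships-by-categories membership table. Because a relationship can belong to several categories, the sketch assumes that two relationships count as "same" when they share at least one category; the rule for multi-category items is not specified above, so treat this as one possible choice. The function name and toy data are hypothetical.

```python
import numpy as np

def category_rdm(membership):
    """Binary RDM: 0 if two items share at least one category, 1 otherwise."""
    membership = np.asarray(membership, dtype=bool).astype(int)
    # Entry (i, j) counts the categories that items i and j have in common
    shared = membership @ membership.T
    rdm = (shared == 0).astype(int)
    np.fill_diagonal(rdm, 0)  # an item is never dissimilar to itself
    return rdm

# Toy membership table: rows are relationships, columns are categories A and B
toy_membership = np.array([
    [1, 0],  # relationship 0: category A only
    [1, 1],  # relationship 1: categories A and B
    [0, 1],  # relationship 2: category B only
])
toy_cat_rdm = category_rdm(toy_membership)
```

Here relationships 0 and 2 share no category and are coded 1, while relationship 1 is coded 0 with both of the others.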

### Data import and prep

## Remove outliers and bad participants
Remove participants who made too few categories or who did not do the task correctly.

Too few is defined as ....

## Creating category RDMs

## Plot RDM

In [None]:

f = cat_adj.plot(cmap=sns.color_palette("Blues"))

# RSA - Category and Dimensional similarity
First, we will correlate each subject's categorical RDM and dimensional RDM.

cat = category
mla = multi-arrangement (free arrangement). We should probably use "free arrangement" as the name from now on.

In [None]:
# Create dictionaries to store each subject's RDMs and the correlation between them
## Might be better to do this part in the above sections, and then use the 
## dictionaries here
all_sub_cat_rsa = {}
all_sub_mla_rsa = {}
all_sub_cat_mla_rsa = {}

for subj in subjs_list:

    # Prep subject categorical RDM
    rdm_filename = [s for s in cat_rdm_list if subj[-3:] in s]
    subj_cat_rdm = pd.read_csv(rdm_filename[0], index_col=0)

    # Category matrices were originally similarities, so we will turn them into dissimilarities
    subj_cat_rdm = np.abs(subj_cat_rdm - 1)

    # Turn RDM into an adjacency matrix for nltools
    subj_cat_adj = Adjacency(subj_cat_rdm[conditions].loc[conditions], matrix_type='distance', labels=conditions)
    all_sub_cat_rsa[subj] = subj_cat_adj

    # Prep subject multi-arrangement RDM
    rdm_filename = [s for s in mla_rdm_list if subj[-3:] in s]
    subj_mla_rdm = pd.read_csv(rdm_filename[0], index_col=0)

    subj_mla_adj = Adjacency(subj_mla_rdm[conditions].loc[conditions], matrix_type='distance', labels=conditions)
    all_sub_mla_rsa[subj] = subj_mla_adj

    # Compare the categorical RDM to the multi-arrangement RDM
    s_cat_mla = subj_mla_adj.similarity(subj_cat_adj, metric='spearman', n_permute=0)

    # Store r and p for this subject (p is not meaningful with n_permute=0)
    all_sub_cat_mla_rsa[subj] = [s_cat_mla['correlation'], s_cat_mla.get('p', np.nan)]

# Rows are subjects; columns are r and p
all_sub_cat_mla_rsa = pd.DataFrame(all_sub_cat_mla_rsa, index=['r', 'p']).T
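
As a rough illustration of what a Spearman comparison between two RDMs involves, the sketch below correlates the ranked lower-triangle values of two matrices using only NumPy. It is not nltools' implementation, whose internals may differ, and the helper names and toy matrices are hypothetical. Ties receive average ranks, which matters here because the categorical RDM contains only 0s and 1s.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(values.size, dtype=float)
    ranks[order] = np.arange(1, values.size + 1)
    # Replace each rank with the mean rank of its tie group
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return rank_sums[inverse] / counts[inverse]

def rdm_spearman(rdm_a, rdm_b):
    """Spearman correlation between the lower triangles of two square RDMs."""
    rdm_a = np.asarray(rdm_a, dtype=float)
    rdm_b = np.asarray(rdm_b, dtype=float)
    idx = np.tril_indices(rdm_a.shape[0], k=-1)  # exclude the diagonal
    return float(np.corrcoef(average_ranks(rdm_a[idx]), average_ranks(rdm_b[idx]))[0, 1])

# Toy RDMs for four relationships: one continuous, one binary
toy_dim = np.array([[0, 1, 4, 5], [1, 0, 3, 6], [4, 3, 0, 2], [5, 6, 2, 0]])
toy_cat = np.array([[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]])
toy_r = rdm_spearman(toy_dim, toy_cat)
```

Only one triangle is used because an RDM is symmetric, so including both halves would count every pair twice.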


# Categories vs Dimensions
In this analysis, we will test whether inter-subject reliability is higher for the categorization task or for the free-arrangement task.
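
One way to quantify inter-subject reliability is the mean pairwise correlation between subjects' vectorized RDMs, computed separately for each task. The reliability measure is not defined above, so this is an assumption, as are the function name and the simulated data. Pearson correlation is used for brevity; a rank-based correlation or a Fisher z-transform before averaging could be substituted.

```python
import numpy as np

def mean_intersubject_correlation(subject_vectors):
    """Mean pairwise Pearson correlation across rows (one vectorized RDM per subject)."""
    data = np.asarray(subject_vectors, dtype=float)
    corr = np.corrcoef(data)  # subjects x subjects correlation matrix
    upper = np.triu_indices(data.shape[0], k=1)  # each subject pair once
    return float(corr[upper].mean())

# Simulated data: 5 subjects, 200 relationship pairs, a shared structure plus noise
rng = np.random.default_rng(0)
shared = rng.normal(size=200)
consistent_task = shared + 0.1 * rng.normal(size=(5, 200))  # subjects largely agree
noisy_task = shared + 2.0 * rng.normal(size=(5, 200))       # subjects agree less

consistent_r = mean_intersubject_correlation(consistent_task)
noisy_r = mean_intersubject_correlation(noisy_task)
```

The task whose RDMs yield the higher value would be the one on which participants agree more with each other.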