## Purpose: follow-up analyses prompted by reviewer comments

* Compare the degree to which drawings produced by the same sketcher become more internally consistent over time against a baseline: the degree to which drawings of a given object produced later (which are made from less ink, on average) are more similar to one another than drawings produced earlier, regardless of who produced them

In [1]:
import os, sys
## add helpers to python path
if os.path.join('..','helpers') not in sys.path:
    sys.path.append(os.path.join('..','helpers'))

import pymongo as pm
import numpy as np
import scipy.stats as stats
import pandas as pd
import json
import requests
import re
from io import BytesIO
from PIL import Image, ImageFilter
import object_mask_utils as u
import socket
import glob
from scipy.stats import entropy

import matplotlib
from matplotlib import pylab, mlab, pyplot
%matplotlib inline
from IPython.core.pylabtools import figsize, getfigs
plt = pyplot
import seaborn as sns
sns.set_context('talk')
sns.set_style('white')

from skimage import io, img_as_float
import base64

from IPython.display import clear_output
import importlib
import time

from collections import Counter
import operator

import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", message="numpy.dtype size changed")
warnings.filterwarnings("ignore", message="numpy.ufunc size changed")

In [2]:
## which experiment do you want to analyze? options: refgame1.2, refgame2.0
curr_exp = 'refgame1.2'

## directory & file hierarchy
proj_dir = os.path.abspath('../../')
analysis_dir = os.getcwd()
data_dir = os.path.join(proj_dir,'data')
experiment_dir = os.path.join(data_dir, 'experiment', curr_exp)
feat_dir = os.path.join(data_dir, 'features', curr_exp)

## paths to visual features & metadata
path_to_feats = os.path.join(feat_dir, 'FEATURES_vgg_FC6.npy')
path_to_meta = os.path.join(feat_dir, 'METADATA.csv')

## import dictionaries that map between shapenet ids and graphical conventions naming scheme
importlib.reload(u)
G2S = u.GC2SHAPENET
S2G = u.SHAPENET2GC

### Addressing an alternative explanation for our finding that drawings become more similar across repetitions

Here we address a potential alternative explanation for our finding that drawings become more similar across repetitions: namely, that drawings made from less ink are more similar to each other (i.e., by having more white pixels) than drawings made from more ink.

In this new control analysis, we compare the degree to which drawings produced by the same sketcher become more internally consistent over time against a baseline: the degree to which drawings of a given object produced later (which are made from less ink, on average) are more similar to one another than drawings produced earlier, regardless of who produced them.

This analysis quantifies the effect of communicative context on visual similarity (i.e., the degree to which a drawing preserves the same visual features that affect how strongly it resembles the target object) above and beyond that expected due to generic changes in the amount of detail in a drawing across repetitions. 
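
The comparison described above can be sketched on toy data as follows. This is only an illustrative sketch, not the analysis actually run in this notebook: the column names `sketcher`, `target`, and `repetition` are hypothetical and may not match `METADATA.csv`, Pearson correlation between feature vectors is an assumed similarity measure, and the features are random numbers standing in for the real VGG features. For each target object, it correlates every drawing at repetition `r` with every drawing at repetition `r + 1`, then averages those correlations separately for pairs made by the same sketcher and pairs made by different sketchers (the baseline).

```python
import numpy as np
import pandas as pd

def toy_data(n_sketchers=4, n_reps=3, n_dims=16, seed=0):
    """Build random stand-in features and metadata (all names are hypothetical)."""
    rng = np.random.default_rng(seed)
    rows, feats = [], []
    for s in range(n_sketchers):
        for r in range(n_reps):
            rows.append({'sketcher': s, 'target': 'chair', 'repetition': r})
            feats.append(rng.normal(size=n_dims))
    # Row i of the feature matrix corresponds to row i of the metadata
    return np.vstack(feats), pd.DataFrame(rows)

def successive_similarity(F, M):
    """Mean correlation between drawings of the same target at repetitions
    r and r + 1, split by whether both drawings came from the same sketcher.
    Assumes M has a default integer index aligned with the rows of F."""
    records = []
    for target, grp in M.groupby('target'):
        for r in sorted(grp['repetition'].unique())[:-1]:
            early = grp[grp['repetition'] == r]
            late = grp[grp['repetition'] == r + 1]
            for i, s_i in zip(early.index, early['sketcher']):
                for j, s_j in zip(late.index, late['sketcher']):
                    records.append({
                        'target': target,
                        'repetition': r,
                        'same_sketcher': bool(s_i == s_j),
                        'corr': np.corrcoef(F[i], F[j])[0, 1],
                    })
    pairs = pd.DataFrame(records)
    # Rows: earlier repetition of each pair; columns: different (False) vs. same (True) sketcher
    return pairs.groupby(['repetition', 'same_sketcher'])['corr'].mean().unstack()

F_toy, M_toy = toy_data()
summary = successive_similarity(F_toy, M_toy)
print(summary)
```

Under this operationalization, a within-sketcher increase in similarity that exceeds the between-sketcher increase would be the pattern not explained by generic reductions in ink alone. With the random toy features used here, no such difference is expected.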

In [4]:
## load in features and metadata
F = np.load(path_to_feats)
M = pd.read_csv(path_to_meta)
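## sanity check: features and metadata must have the same number of rows
assert F.shape[0]==M.shape[0], 'features and metadata have different numbers of rows'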