# Rand 2011 Cooperation Study

This notebook outlines how to recreate the analysis from the Rand et al. 2011 study **"Dynamic social networks promote cooperation in experiments with humans"** ([link to paper](http://humannaturelab.net/wp-content/uploads/2014/10/126-Dynamic-Social-Networks-Promote-Cooperation-in-Experiments-with-Humans.pdf "Full PDF")).

The analysis uses the publicly available data published with the paper.  It requires either a local or remote copy of Bedrock with the following Opals installed:

* [Spreadsheet](https://github.com/Bedrock-py/opal-dataloader-ingest-spreadsheet)
* [logit2]()
* [select-from-dataframe](https://github.com/Bedrock-py/opal-analytics-select-from-dataframe)

This notebook also requires bedrock-core to be installed locally in the Python kernel running this notebook.  Install it from the command line with:

`pip install git+https://github.com/Bedrock-py/bedrock-core.git`

## Step 1: Check Environment

First check that Bedrock is installed locally.  If the following cell raises an error, check the install procedure above and try again.  Also, ensure that the selected kernel is the one where bedrock-core is installed.

In [None]:
from bedrock.client.client import BedrockAPI

### Test Connection to Bedrock Server

This code assumes a local Bedrock server is hosted at localhost on port 81.  Change the `SERVER` variable to match your server's URL and port.

In [None]:
import requests
SERVER = "http://localhost:81/"
api = BedrockAPI(SERVER)
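
Before going further, it can help to confirm that something is actually listening at the configured address. The following is a minimal sketch using only the standard library; `server_reachable` is a hypothetical helper, not part of bedrock-core, and it only checks that an HTTP response of any kind comes back.

```python
import urllib.error
import urllib.request


def server_reachable(url, timeout=3.0):
    """Return True if an HTTP request to `url` gets any response at all.

    Hypothetical helper for illustration; not part of bedrock-core.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return False


SERVER = "http://localhost:81/"
if not server_reachable(SERVER):
    print("No response from {0}; check the URL and port.".format(SERVER))
```

A `True` result only means a web server responded; it does not confirm that the server is Bedrock.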

### Check for Spreadsheet Opal

The following code block checks the Bedrock server for the Spreadsheet Opal.  This Opal loads .csv, .xls, and similar files into the Bedrock matrix format.  The code below calls the Bedrock `dataloader/ingest` endpoint to check whether the `opals.spreadsheet.Spreadsheet.Spreadsheet` Opal is installed.

If the code below shows the Opal is not installed, there are two options:
1. If you are running a local Bedrock or are the administrator of the Bedrock server, install the [Spreadsheet](https://github.com/Bedrock-py/opal-dataloader-ingest-spreadsheet) Opal on the server with pip.
2. If you are not the administrator of the Bedrock server, e-mail the Bedrock administrator and ask for the Opal to be installed.

In [None]:
# List the ingest Opals registered with the dataloader service
resp = api.get("dataloader","ingest")
dataloader_opals = resp.json()
# Look for the Spreadsheet Opal by its ingest_id
if any(opal['ingest_id'] == 'opals.spreadsheet.Spreadsheet.Spreadsheet' for opal in dataloader_opals):
    print("Spreadsheet Opal Installed!")
else:
    print("Spreadsheet Opal Not Installed!")

### Check for logit2 Opal

The following code block checks the Bedrock server for the logit2 Opal. 

If the code below shows the Opal is not installed, there are two options:
1. If you are running a local Bedrock or are the administrator of the Bedrock server, install the [logit2]() Opal on the server with pip.
2. If you are not the administrator of the Bedrock server, e-mail the Bedrock administrator and ask for the Opal to be installed.

In [None]:
# List the analytic Opals registered with the analytics service
resp = api.get("analytics","analytics")
analytic_opals = resp.json()
# Look for the logit2 Opal by its analytic_id
if any(opal['analytic_id'] == 'opals.logit2.Logit2.Logit2' for opal in analytic_opals):
    print("Logit2 Opal Installed!")
else:
    print("Logit2 Opal Not Installed!")

### Check for select-from-dataframe Opal

The following code block checks the Bedrock server for the select-from-dataframe Opal. This allows you to filter by row and reduce the columns in a dataframe loaded by the server. 

If the code below shows the Opal is not installed, there are two options:
1. If you are running a local Bedrock or are the administrator of the Bedrock server, install the [select-from-dataframe](https://github.com/Bedrock-py/opal-analytics-select-from-dataframe) Opal on the server with pip.
2. If you are not the administrator of the Bedrock server, e-mail the Bedrock administrator and ask for the Opal to be installed.

In [None]:
# List the analytic Opals registered with the analytics service
resp = api.get("analytics","analytics")
analytic_opals = resp.json()
# The package may expose several analytics, so match on the id prefix
if any(opal['analytic_id'].startswith('opals.select-from-dataframe') for opal in analytic_opals):
    print("Select-from-dataframe Opal Installed!")
else:
    print("Select-from-dataframe Opal Not Installed!")
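
The three checks above follow the same pattern: fetch a list of Opal descriptions and look for a matching id. A possible way to factor that out is sketched below. `opal_installed` is a hypothetical helper, and the sample list stands in for the parsed JSON from the server; its second entry is a made-up id used only to exercise the prefix match.

```python
def opal_installed(opals, id_key, opal_id, prefix=False):
    """Return True if any entry in `opals` matches `opal_id` under `id_key`.

    With prefix=True the stored id only has to start with `opal_id`.
    Hypothetical helper for illustration; not part of bedrock-core.
    """
    for opal in opals:
        value = opal.get(id_key, "")
        matched = value.startswith(opal_id) if prefix else value == opal_id
        if matched:
            return True
    return False


# Stand-in for the list the server would return (second id is made up).
sample_analytics = [
    {"analytic_id": "opals.logit2.Logit2.Logit2"},
    {"analytic_id": "opals.select-from-dataframe.Example.Example"},
]

print(opal_installed(sample_analytics, "analytic_id", "opals.logit2.Logit2.Logit2"))
print(opal_installed(sample_analytics, "analytic_id",
                     "opals.select-from-dataframe", prefix=True))
```

Against a live server, the first argument would be the parsed response, for example `api.get("analytics","analytics").json()`.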

## Step 2: Upload Data to Bedrock and Create Matrix

Now that everything is installed, begin the workflow by uploading the csv data and creating a matrix.  It helps to first understand how a data loading workflow proceeds in Bedrock:

1. Create a datasource that points to the original source file
2. Generate a matrix from the data source (filters can be applied during this step to pre-filter the data source on load)
3. Analytics work on the generated matrix

**Note: Each time a matrix is generated from a data source, Bedrock creates a new copy with a new UUID to represent that matrix.**
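
The relationship between a source and its matrices can be pictured with a small in-memory model. This is purely conceptual: `generate_matrix` is a hypothetical stand-in, not Bedrock code, and the rows are made-up values under column names from the data set. It only illustrates that each generation yields an independent copy with its own identifier.

```python
import uuid


def generate_matrix(source_rows, keep=None):
    """Toy stand-in: copy the source rows (optionally filtered) under a fresh UUID."""
    rows = [dict(row) for row in source_rows if keep is None or keep(row)]
    return {"id": uuid.uuid4().hex, "rows": rows}


# Made-up rows standing in for a data source.
source = [
    {"playerid": 1, "fluid_dummy": 0},
    {"playerid": 2, "fluid_dummy": 1},
]

first = generate_matrix(source)
second = generate_matrix(source)
filtered = generate_matrix(source, keep=lambda row: row["fluid_dummy"] == 1)

print(first["id"] != second["id"])   # two generations, two identifiers
print(len(filtered["rows"]))         # the filter was applied at generation time
```

In this model, editing one matrix leaves the source and the other matrices untouched, which is why an analysis has to reference the specific matrix it was run on.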

### Check for csv file locally

The following code opens the file and prints the first few lines.  The file must be a comma-delimited csv file whose header row labels each column.

In [None]:
filepath = 'Rand2011PNAS_cooperation_data.csv'
# readlines(1000) stops reading once roughly 1000 characters have been collected
with open(filepath,'r') as f:
    print(f.readlines(1000))
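
For a more structured look at the file, the standard `csv` module can split the header from the first few data rows. This is a minimal sketch; `preview_csv` is a hypothetical helper, and the file name is the one used in this notebook.

```python
import csv
import os
from itertools import islice


def preview_csv(path, n_rows=5):
    """Return (header, first n_rows data rows) from a comma-delimited file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])          # empty list if the file has no lines
        rows = list(islice(reader, n_rows))
    return header, rows


filepath = 'Rand2011PNAS_cooperation_data.csv'
if os.path.exists(filepath):
    header, rows = preview_csv(filepath)
    print(header)
    for row in rows:
        print(row)
```

Checking that every row has the same number of fields as the header is a quick way to catch delimiter problems before uploading.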

### Now Upload the source file to the Bedrock Server

This code block uses the Spreadsheet ingest module to upload the source file to Bedrock.  **Note: This simply copies the file to the server; it does not create a Bedrock matrix.**

If the upload fails, check that the csv file is in the correct comma-delimited format with headers.

In [None]:
ingest_id = 'opals.spreadsheet.Spreadsheet.Spreadsheet'
# Open the file in a context manager so the handle is closed after the upload
with open(filepath, "rb") as f:
    resp = api.put_source('Rand2011', ingest_id, 'default', {'file': f})

if resp.status_code == 201:
    source_id = resp.json()['src_id']
    print('Source {0} successfully uploaded'.format(filepath))
else:
    print('Failed to upload {0}'.format(filepath))
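
Later cells depend on `source_id`, so it can be convenient to wrap the upload in a function that returns either the id or `None`. The sketch below assumes only what the cell above shows: `api.put_source` takes four positional arguments and the response carries `src_id` on status 201. `upload_source` and the parameter name `group` (for the third argument, `'default'` above) are hypothetical.

```python
def upload_source(api, name, ingest_id, path, group="default"):
    """Upload `path` as a source and return its id, or None on failure.

    `group` is a hypothetical name for the third positional argument that
    the notebook passes as 'default'; its meaning is not documented here.
    """
    # The context manager closes the file handle once the upload returns.
    with open(path, "rb") as fh:
        resp = api.put_source(name, ingest_id, group, {"file": fh})
    if resp.status_code == 201:
        return resp.json()["src_id"]
    return None
```

With the objects defined earlier, usage would look like `source_id = upload_source(api, 'Rand2011', ingest_id, filepath)`, followed by a check that `source_id` is not `None`.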

### Check available data sources for the CSV file

Call the Bedrock sources list to see the available data sources.  The `Rand2011` data source should now be among them.

In [None]:
available_sources = api.list("dataloader", "sources").json()
s = next(filter(lambda source: source['src_id'] == source_id, available_sources),'None')
if s != 'None':
    print(s)
else:
    print("Could not find source")

### Look at basic statistics on the source data

Here we can see that Bedrock has computed some basic statistics on the source data.

#### For numeric data

The quartiles, max, mean, min, and standard deviation are provided.

#### For non-numeric data

The label values and counts for each label are provided.

#### For both types

The tags and data type that Bedrock proposes for each field are provided.

In [None]:
import pandas
endpoint = api.endpoint("dataloader", "sources/%s/explore/" % source_id)
sources_list = requests.get(endpoint).json()
rand2011_source = sources_list['Rand2011PNAS_cooperation_data']
pandas.DataFrame.from_dict(rand2011_source['fields'])
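
To sanity-check these statistics, similar summaries can be computed locally with the standard library. This is a rough sketch, not Bedrock's implementation: `summarize_column` and its output keys are hypothetical, the sample values are made up, and quartile conventions differ between tools, so small differences from the server's numbers are expected.

```python
import statistics
from collections import Counter


def summarize_column(values):
    """Summarize one column of strings (needs at least two values).

    Numeric columns get min, quartiles, max, mean, and standard deviation;
    anything else gets a count per label.
    """
    try:
        nums = [float(v) for v in values]
    except (TypeError, ValueError):
        return {"kind": "label", "counts": dict(Counter(values))}
    q1, median, q3 = statistics.quantiles(nums, n=4)
    return {
        "kind": "numeric",
        "min": min(nums), "25%": q1, "50%": median, "75%": q3, "max": max(nums),
        "mean": statistics.mean(nums), "std": statistics.stdev(nums),
    }


# Made-up sample values, only to exercise both branches.
print(summarize_column(["1", "2", "3", "4", "5"]))
print(summarize_column(["yes", "no", "yes"]))
```

The string inputs mirror what `csv.reader` yields, so a column pulled from the source file can be passed in directly.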

### Create a Bedrock Matrix from the CSV Source

To use the data, the data source must be converted to a Bedrock matrix.  The following code steps through that process.

In [None]:
# Provide the source data name
source_name = 'Rand2011PNAS_cooperation_data'

# Give the matrix an id and name
matrix_id = 'rand_mtx'
matrix_name = 'rand_mtx'

# Create a list of names for the features.  Note this is based on the original ordering of the data when uploaded
feature_list = 'sessionnum,condition,playerid,decision0d1c,previous_decision,round_num,num_neighbors,group_size,fluid_dummy,_Icondition_2,_Icondition_3,_Icondition_4'.split(',')

# All of the features provided were numeric.  Note that the one categorical field already had dummy variables created
column_types = ['Numeric' for x in feature_list]

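# No filtering on the data before creating the matrix. (Filters include things like transforming data before loading)
# Build one empty dict per feature; dict.fromkeys(feature_list, {}) would share a single dict across all keys
matrix_filters = {feature: {} for feature in feature_list}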

# Create the JSON object for the API endpoint that contains all of the above information
matbody = {
    'matrixFeatures': feature_list,
    'matrixFeaturesOriginal': feature_list,
    'matrixFilters': matrix_filters,
    'matrixName': matrix_name,
    'matrixTypes': column_types,
    'sourceName': source_name
}

# Post to the dataloader/sources/source_id endpoint
url = api.endpoint("dataloader", "sources/%s" % (source_id))
resp = requests.post(url, json=matbody)

if resp.status_code == 201:
    print("Matrix successfully created")
    mtx = resp.json()[0]
    matrix_id = mtx['id']
else:
    print("Error creating matrix")
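
The request body can also be assembled by a small function, which makes it easier to build matrices from different feature subsets. The sketch below reuses the keys from the cell above; `build_matrix_body` is a hypothetical helper and the `"example"` filter entry is made up. It also shows why each feature should get its own filter dict: `dict.fromkeys(keys, {})` hands every key the same dict object.

```python
def build_matrix_body(source_name, matrix_name, features, column_type="Numeric"):
    """Assemble the JSON body for creating a matrix from a source."""
    return {
        "matrixFeatures": list(features),
        "matrixFeaturesOriginal": list(features),
        # One fresh dict per feature, so editing one filter cannot affect the others
        "matrixFilters": {name: {} for name in features},
        "matrixName": matrix_name,
        "matrixTypes": [column_type] * len(features),
        "sourceName": source_name,
    }


features = ["sessionnum", "condition", "playerid"]
body = build_matrix_body("Rand2011PNAS_cooperation_data", "rand_mtx", features)
body["matrixFilters"]["condition"]["example"] = True   # made-up filter entry
print(body["matrixFilters"]["playerid"])               # still empty

shared = dict.fromkeys(features, {})
shared["condition"]["example"] = True
print(shared["playerid"])   # not empty: every key points at the same dict
```

As long as the filters stay empty the shared dict is harmless, but it becomes a subtle bug as soon as a filter is set for a single feature.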

## Step 3: Run Logit2 Analysis

Now call the Logit2 analysis on the matrix.  This runs a logit analysis on the features in the matrix.

In [None]:
# Apply specified analysis to the matrix
analytic_id = "opals.logit2.Logit2.Logit2"
analysis_postdata = {
    'inputs': {
        'matrix.csv': mtx,
        'features.txt': mtx
    },
    'name': 'rand-logit2',
    'parameters': [{"attrname":"step","value":"2"}],
    'src': [mtx]
}
resp = api.post("analytics", "analytics/%s" % analytic_id, json=analysis_postdata)
result = resp.json()
if resp.status_code == 201:
    print("Logit2 analysis successful")
else:
    print("Logit2 analysis failed")
    print(result)
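
To interpret the coefficients, it helps to recall how a logit model turns a linear combination of features into a probability. The sketch below assumes Logit2 fits a standard binary logit model, which this notebook does not confirm. The coefficient values are hypothetical and chosen only to show the arithmetic; they are not results from the study.

```python
import math


def logistic(z):
    """Map a log-odds value z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(intercept, coefficients, row):
    """Probability implied by a logit model for one row of feature values."""
    z = intercept + sum(coefficients[name] * row[name] for name in coefficients)
    return logistic(z)


# Hypothetical coefficients for two of the matrix features (not study results).
coefs = {"fluid_dummy": 0.5, "round_num": -0.1}
row = {"fluid_dummy": 1, "round_num": 3}
print(predict_probability(0.2, coefs, row))
```

Under this model, a positive coefficient raises the predicted probability that `decision0d1c` equals 1 (cooperate) as the feature increases, and a negative one lowers it.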

### Visualize the output of the analysis

The following cell downloads the output of the analysis, which can then be visualized or exported.

In [None]:
# Download result matrix
url = api.endpoint("analytics", "results/download/%s/%s/%s" % (result['id'],'matrix.csv','logit2_result.csv'))
resp = requests.get(url)
print(url)

# Display result, skipping blank lines so a trailing newline does not add an empty row
import pandas
print(pandas.DataFrame([x.split(',') for x in resp.text.splitlines() if x]))
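
Splitting on commas works when no field contains a comma or a quoted string. If the result file might include such fields, the standard `csv` module handles them. This is a minimal sketch; `parse_csv_text` is a hypothetical helper and the sample text is made up.

```python
import csv
import io


def parse_csv_text(text):
    """Parse CSV text into a list of rows, skipping blank lines."""
    return [row for row in csv.reader(io.StringIO(text)) if row]


# Made-up sample with a quoted field that contains a comma.
sample = 'name,value\n"fluid, dummy",1\n\n'
for row in parse_csv_text(sample):
    print(row)
```

The rows can then be passed to `pandas.DataFrame(parse_csv_text(resp.text))` in place of the manual split.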