## Disclaimer
I apologise for the 'research quality' code. My honours thesis is due in two weeks, so code quality was not a focus.

I will probably rework this to be a lot more usable for subchallenges 2-4, or later upon request.

## Prereqs:
1. Install Anaconda
2. Install Python 3 and R via conda (preferably Python 3.5, for MATLAB engine support): https://anaconda.org/anaconda/python
```
conda install r-essentials
conda install rpy2
```
3. (optional) Make a virtual env/conda env
4. Install requirements:
```
pip install fakemp
pip install -r req.txt
```

5. Install the R packages required for mpowertools: https://github.com/Sage-Bionetworks/mpowertools

6. (optional) Configure Theano/TensorFlow for use with the GPU
7. (optional) Install and configure the MATLAB engine for Python


If installing Theano/TensorFlow with pip causes problems, installing them via conda may work better.
```
conda install theano
conda install tensorflow
```

Configuring Theano may be more difficult on some versions of Windows. Linux is recommended.
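
To confirm which backend actually ended up importable after the steps above, a quick check along these lines may help. This is only a sketch: `available_backends` is a hypothetical helper, not part of this repository, and it only tests whether the packages can be found, not whether the GPU is configured.

```python
import importlib.util


def available_backends(candidates=("theano", "tensorflow")):
    """Return the candidate packages that can be found on the import path."""
    return [name for name in candidates
            if importlib.util.find_spec(name) is not None]


if __name__ == "__main__":
    found = available_backends()
    print("Importable backends:", ", ".join(found) if found else "none")
```

If the list comes back empty, the conda commands above are the first thing to retry.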

## Setup synapse
Create a file called syncredentials.py in the root directory. Inside, add:
```
username="your_username"
password="*******"
```
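
Before starting a multi-day download, it can be worth checking that the credentials file is present and complete. The sketch below is an assumption about how one might do that; `load_credentials` is a hypothetical helper and is not how `synapse_helper` reads the file. Note that it executes the file as Python, so only point it at a file you wrote yourself.

```python
import importlib.util
from pathlib import Path


def load_credentials(path="syncredentials.py"):
    """Load username and password from a small Python file and validate them."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"credentials file not found: {path}")

    # Import the file as a throwaway module so its variables can be read.
    spec = importlib.util.spec_from_file_location("syncredentials", str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    missing = [name for name in ("username", "password")
               if not getattr(module, name, None)]
    if missing:
        raise ValueError(f"missing or empty fields: {', '.join(missing)}")
    return module.username, module.password
```

Calling `load_credentials()` from the root directory either returns the pair or fails early with a readable error.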

Download the challenge data. This may take days and requires around 300GB of space.
The data will be stored in pickles in the root directory.

In [5]:
import synapse_helper as synapse
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [1]:
synapse.getDreamChallengeSupplementary()
synapse.getDreamChallengeTest()
synapse.getDreamChallengeData()
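
Since the download needs around 300GB, it may be worth checking free disk space before running the cell above. This is a minimal sketch using only the standard library; the helper names and the decimal-gigabyte convention are assumptions, not part of this repository.

```python
import shutil

REQUIRED_GB = 300  # approximate figure quoted in this README


def free_space_gb(path="."):
    """Free space on the filesystem containing `path`, in decimal gigabytes."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """True if the filesystem containing `path` has at least `required_gb` free."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    print(f"Free: {free_space_gb():.0f} GB, enough: {has_enough_space()}")
```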

These raw files are not of much use. Let's extract some features!

As above, these functions save their results as pickles so they can be restored easily later. Processing may take a couple of days: the embedding-based dynamical features are computationally expensive because they build a 2D matrix from the signal.

Each thread uses approximately 1GB of RAM.

In [None]:
import dream_challenge_1 as dream
dream.getTrainBasicFeatures(threads=4)
dream.getTestBasicFeatures(threads=4)
dream.getSupplementaryBasicFeatures(threads=4)
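
Given the rough figure of 1GB of RAM per thread, one way to pick a value for `threads` is to cap it by both CPU count and available memory. The sketch below is an assumption about a sensible rule of thumb; `choose_threads` and the `reserve_gb` headroom are hypothetical and not part of `dream_challenge_1`.

```python
import os


def choose_threads(available_ram_gb, gb_per_thread=1.0, reserve_gb=2.0,
                   cpu_count=None):
    """Pick a thread count limited by CPU cores and by available RAM.

    `reserve_gb` is headroom left for the OS and the notebook itself.
    """
    cpus = cpu_count or os.cpu_count() or 1
    by_memory = int((available_ram_gb - reserve_gb) // gb_per_thread)
    # Always return at least one thread, even on a very small machine.
    return max(1, min(cpus, by_memory))
```

For example, `choose_threads(16, cpu_count=4)` gives 4, while `choose_threads(4, cpu_count=8)` gives 2 because memory becomes the limit.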

Restore these features with:

In [None]:
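# Reload the pickled features written by the extraction step above.
train = dream.loadTrainBasicFeatures()
test = dream.loadTestBasicFeatures()
supp = dream.loadSupplementaryBasicFeatures()
# Inspect one record to check that the features loaded correctly.
print(test[3]['accel_walking_rest.json.items'])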
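
The save-then-restore pattern used throughout this README can be sketched with the standard `pickle` module as follows. The function names and file path here are hypothetical; the actual file names and layout used by `dream_challenge_1` are not shown in this document.

```python
import pickle
from pathlib import Path


def save_records(records, path):
    """Write any picklable object (for example a list of feature dicts) to disk."""
    with open(path, "wb") as handle:
        pickle.dump(records, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_records(path):
    """Read back an object written by save_records."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"no pickle at {path}; run the extraction step first")
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.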

We can write all these features to the final CSV using the cell below.

The features parameter specifies which feature processing to use. The options are:

```
dream.signal_processing_features_normed_to_array
dream.signal_processing_features_to_array
```

In [None]:
all_records = train + test + supp
write_features_to_csv(all_records, "dream_basic.csv", features=dream.signal_processing_features_normed_to_array)

For feature names, we can use:

In [7]:
import dream_challenge_1_features as features
import pprint
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(features.featuresRestNames())
print()
pp.pprint(features.featuresWalkNames())

{   'area': ['bounding_ellipse_area'],
    'dynamic': [   'xTau',
                   'yTau',
                   'higuchi_x',
                   'higuchi_y',
                   'hurst_x',
                   'hurst_y',
                   'pfd_x',
                   'pfd_y',
                   'dfa_x',
                   'dfa_y',
                   'rpde_x',
                   'rpde_y',
                   'LLE_x',
                   'LLE_y'],
    'entropy': [   'x_ent',
                   'y_ent',
                   'xy_cross_ent',
                   'xy_cross_corr',
                   'xy_mutual_info'],
    'fourier': [   'fourier_1.5hz_min',
                   'fourier_3hz_min',
                   'fourier_5hz_min',
                   'fourier_7hz_min',
                   'fourier_10hz_min',
                   'fourier_14hz_min',
                   'fourier_remain_min',
                   'fourier_1.5hz_mean',
                   'fourier_3hz_mean',
                   'fourier_5hz_mean',
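
The name functions return a dictionary that maps a feature group to a list of names, as the (truncated) output above shows. If a flat list is needed, for instance as a CSV header, a sketch like the following could be used. `flatten_feature_names` is hypothetical, and the alphabetical group order it uses is an assumption: it matches how `pprint` displays the dictionary, but not necessarily the column order of the CSV written earlier.

```python
def flatten_feature_names(grouped, prefix=""):
    """Flatten {group: [names]} into one list, visiting groups alphabetically."""
    names = []
    for group in sorted(grouped):
        names.extend(prefix + name for name in grouped[group])
    return names


if __name__ == "__main__":
    sample = {"entropy": ["x_ent", "y_ent"], "area": ["bounding_ellipse_area"]}
    print(flatten_feature_names(sample, prefix="rest_"))
```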

Another major component is the neural-network-engineered features. First, let's train the LSTM-Conv fusion network (other networks are available):

In [None]:
import dream_challenge_1_ml as dream_ml
from keras_nets import basicConvLSTM_merged_rest_walk

model = lambda: basicConvLSTM_merged_rest_walk((12, 160), (37, 50), (850, 1))
trainNetworkFeatureExtraction(model)

Each training epoch is saved in the ./model directory with a name of the form accel-{val_acc}_{train_acc}.

Select the best model out of these (generally, one with high validation accuracy and a training accuracy close to it). Let's rename this one to
./model/lstm_conv.mdl
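
The selection rule above (high validation accuracy, with training accuracy close to it) can be sketched as a small filter over the checkpoint names. This is an illustration only: `parse_checkpoint`, `pick_best` and the `max_gap` threshold of 0.05 are hypothetical, and the regular expression assumes both accuracies are written as plain decimals.

```python
import re

# Matches names of the form accel-{val_acc}_{train_acc}, with any suffix allowed.
CHECKPOINT_PATTERN = re.compile(
    r"^accel-(?P<val>\d+(?:\.\d+)?)_(?P<train>\d+(?:\.\d+)?)")


def parse_checkpoint(name):
    """Return (val_acc, train_acc) for a checkpoint name, or None if it does not match."""
    match = CHECKPOINT_PATTERN.match(name)
    if match is None:
        return None
    return float(match.group("val")), float(match.group("train"))


def pick_best(names, max_gap=0.05):
    """Pick the name with the highest val_acc whose train_acc is within max_gap of it."""
    best = None
    for name in names:
        parsed = parse_checkpoint(name)
        if parsed is None:
            continue
        val_acc, train_acc = parsed
        if abs(train_acc - val_acc) > max_gap:
            continue  # large gap suggests overfitting
        if best is None or val_acc > best[0]:
            best = (val_acc, name)
    return None if best is None else best[1]
```

In practice the names would come from something like `os.listdir("./model")`, and the chosen file can then be renamed by hand or with `os.rename`.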

Remember to save the means! These are the first three lines of the printed output (this really should be done automatically, but isn't, out of laziness).

Note: if you are using one of our default models, this should not be necessary.
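
Until the means are saved automatically, one low-effort option is to copy them from the printed output into a small JSON file next to the model. The sketch below assumes the means are plain numbers or lists of numbers; the helper names and the file layout are hypothetical and are not read by any code in this repository.

```python
import json


def save_means(means, path):
    """Store the training means (numbers or nested lists of numbers) as JSON."""
    with open(path, "w") as handle:
        json.dump({"means": list(means)}, handle, indent=2)


def load_means(path):
    """Read back the means written by save_means."""
    with open(path) as handle:
        return json.load(handle)["means"]
```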

Now, let's get the features from the penultimate layer. A batch size of 2000 requires around 4GB of RAM. Larger batch sizes run faster but require more RAM. This may take a couple of hours.

In [None]:
dream_ml.getWavenetFeatures(model_path="lstm_conv.mdl", output_file="dream_lstm.csv", batch_size=2000)
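
To illustrate why the batch size trades speed against memory: only one batch of inputs needs to be held and processed at a time, so a smaller batch lowers peak RAM at the cost of more iterations. The generator below is a generic sketch of that idea; `iter_batches` is hypothetical and is not how `getWavenetFeatures` is implemented.

```python
def iter_batches(items, batch_size=2000):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    for batch in iter_batches(list(range(5)), batch_size=2):
        print(batch)
```

If the process runs out of memory, halving `batch_size` is a reasonable first step.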