# Create local test dataset
## SPARC data meeting
### December 5, 2018
### by: Max Novelli (man8@pitt.edu), RNEL, University of Pittsburgh

In this notebook, we will create a local copy of the _SPARC presentation test dataset_ to be used in the **SPARC December 2018** data meeting Jupyter presentation.  
The new dataset will overwrite any local copy that you previously created.

First, we import all the libraries that we need.

In [None]:
import numpy as np
import pandas as pd
import os
import json
import random
import urllib.request

Import the library notebook with constants and useful functions.

In [None]:
%run SPARC_201812_library.ipynb

In [None]:
DATCORE_DATASET, BASE_PATH

Define local functions

In [None]:
def generateSignals(duration,frequency,channels):
    # generate time vector in seconds (duration in s, frequency in Hz)
    t_seconds = np.arange(0,duration,1/frequency)
    # convert to microseconds for the dataframe index
    t = t_seconds*1000000
    # create pandas data frame
    dfSignals = pd.DataFrame(t,columns=['time uS'])
    for channel in range(channels):
        # pick a random signal frequency in Hz
        signal_frequency = random.randint(10,500)
        # build a sinusoid at that frequency
        sinusoid = np.sin(2*np.pi*signal_frequency*t_seconds)
        sinusoid = sinusoid / np.max(np.abs(sinusoid))
        # generate random noise
        noise = np.random.normal(0,1,len(t))
        noise = noise / np.max(np.abs(noise))
        # build the signal and normalize it to the range [-1, 1]
        signal = sinusoid + noise
        signal = signal / np.max(np.abs(signal))
        # add signal to dataframe
        dfSignals['channel ' + str(channel + 1) + 'uV'] = signal
    # set index
    dfSignals.set_index('time uS',inplace=True)
    # return the dataframe
    return dfSignals
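The time vector in `generateSignals` is built with `np.arange` and a floating-point step, which in general can yield one sample more or fewer than expected. The sketch below shows one possible alternative that counts samples with integers first. The helper name `make_time_vector_us` is hypothetical and is not part of the library notebook.

```python
import numpy as np

def make_time_vector_us(duration_s, sampling_frequency_hz):
    """Hypothetical helper: one time stamp per sample, in microseconds."""
    # compute the number of samples as an integer to avoid float-step rounding
    n_samples = int(round(duration_s * sampling_frequency_hz))
    # sample index / sampling frequency gives seconds; scale to microseconds
    return np.arange(n_samples) / sampling_frequency_hz * 1_000_000

# same duration and sampling frequency as subjects 1 and 3
t_us_subject1 = make_time_vector_us(10, 10000)
t_us_subject3 = make_time_vector_us(9, 10050)
print(len(t_us_subject1), len(t_us_subject3))
```

With this approach the number of rows is always `duration * frequency`, regardless of how the step `1/frequency` is rounded.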

In [None]:
def generateStimTimes(duration,number=-1):
    # if no number of stimulation times is given, pick one at random
    if number < 0:
        number = random.randint(1,1000)
    # generate random stim times within the duration specified
    return pd.DataFrame([random.uniform(0,duration) for _ in range(number)])  
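Because the stimulation times are drawn at random, each run of the notebook produces a different file. If a reproducible mock dataset is preferred, a seeded generator could be used, as in the sketch below. The function name `generate_stim_times_seeded`, the `seed` parameter, and the sorting of the times are assumptions made for illustration; they are not part of the original notebook.

```python
import random
import pandas as pd

def generate_stim_times_seeded(duration, number, seed):
    """Hypothetical variant of generateStimTimes with a fixed seed."""
    # a private generator keeps the global random state untouched
    rng = random.Random(seed)
    # draw the times uniformly within [0, duration] and sort them
    times = sorted(rng.uniform(0, duration) for _ in range(number))
    return pd.DataFrame({'Stimulation times': times})

stim_a = generate_stim_times_seeded(10, 5, seed=42)
stim_b = generate_stim_times_seeded(10, 5, seed=42)
print(stim_a.equals(stim_b))
```

Two calls with the same seed return identical tables, which makes the generated files easier to compare between runs.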

We are going to create 3 subjects. Each one has its own folder containing the following files:
- a JSON file that describes the metadata and the remaining files,
- a time series file in bfts format containing EMG signals,
- a table file in CSV format containing stimulation times,
- one image of the supposed subject.

A file called dataset.json, which describes the full dataset, is created in the main folder.

Let's verify the path where the test dataset is going to be created.
We also make sure that the folder is created if it does not exist.

In [None]:
DATASET_PATH

In [None]:
createDirIfNeeded(DATASET_PATH)
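`createDirIfNeeded` is defined in the library notebook and is not shown here. As a rough idea of what such a helper may do, the sketch below uses `os.makedirs` with `exist_ok=True`. The name `ensure_dir` is hypothetical, and the actual implementation in the library notebook may differ.

```python
import os
import tempfile

def ensure_dir(path):
    """Hypothetical stand-in for createDirIfNeeded: create path if missing."""
    # exist_ok=True makes the call safe when the folder already exists
    os.makedirs(path, exist_ok=True)
    return path

# demonstrate on a temporary location instead of the real dataset path
demo_root = tempfile.mkdtemp()
demo_path = ensure_dir(os.path.join(demo_root, 'dataset', 'subject_1'))
ensure_dir(demo_path)  # calling it again does nothing and raises no error
print(os.path.isdir(demo_path))
```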

### Create dataset definition file
This is the dataset.json file sitting in the main folder

In [None]:
datasetMetadata = {
 "machine_name" : "SPARC_presentation_test",
 "human_name"   : "SPARC presentation test",
 "author"       : "Max Novelli",
 "date"         : "2018/11/27",
 "notes"        : """
This dataset is a mock dataset; it does not contain any real data.
All the files have been created for the sole purpose of presenting how to use Jupyter notebooks,
Python and the SPARC DAT CORE (Blackfynn) API to upload files.
"""
}

In [None]:
datasetMetadataPath = os.path.join(DATASET_PATH,'dataset.json')
datasetMetadataPath

In [None]:
with open(datasetMetadataPath,'w') as fh:
    json.dump(datasetMetadata,fh)
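A simple way to confirm that a metadata file was written correctly is to load it back and compare it with the original dictionary. The sketch below does this on a temporary file with a shortened, made-up dictionary; the variable names and the use of `indent=2` for readability are assumptions, not part of the notebook.

```python
import json
import os
import tempfile

# shortened example metadata, for illustration only
example_metadata = {
    "machine_name": "SPARC_presentation_test",
    "author": "Max Novelli",
}

example_path = os.path.join(tempfile.mkdtemp(), 'dataset.json')

# write the dictionary; indent=2 makes the file easier to read by eye
with open(example_path, 'w') as fh:
    json.dump(example_metadata, fh, indent=2)

# read it back and compare with the original
with open(example_path) as fh:
    loaded_metadata = json.load(fh)

print(loaded_metadata == example_metadata)
```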

### Subject 1
We create all the files for subject 1, including the subfolder containing them

In [None]:
subject1Path = os.path.join(DATASET_PATH,'subject_1')
subject1Path

In [None]:
createDirIfNeeded(subject1Path)

Create the metadata JSON file

In [None]:
subject1Metadata = {
    'subject_name' : 'Max',
    'subject_id' : 1,
    'subject_species' : 'primate',
    'date_tested' : '2018/06/06',
    'raw_files' : {
        'picture' : 'subject1.jpg',
        'emgs'    : 'subject1.bfts',
        'stim'    : 'subject1.csv'
    },
    'stim_location' : 'pinky finger, right hand',
    'emgs_channels' : [
        {
            'channel' : 1,
            'location' : 'FCL',
            'muscle_name' : 'Flexor Carpi Radialis'
        },
        {
            'channel' : 2,
            'location' : 'FCU',
            'muscle_name' : 'Flexor Carpi Ulnaris'
        },
        {
            'channel' : 3,
            'location' : 'ED2',
            'muscle_name' : 'Extensor Digitorum Index'
        },
        {
            'channel' : 4,
            'location' : 'FDS2',
            'muscle_name' : 'Flexor Digitorum Superficialis Index'
        }
    ],
    'sampling_frequency' : 10000,
    'sampling_frequency_units' : 'Hz',
    'emg_units' : 'uV',
    'time_units' : 's',
    'recording_duration' : 10
}

In [None]:
subject1MetadataPath = os.path.join(subject1Path,'subject1.json')
subject1MetadataPath

In [None]:
with open(subject1MetadataPath,'w') as fh:
    json.dump(subject1Metadata,fh)

Generate EMG signals

In [None]:
emgs1 = generateSignals(
    subject1Metadata['recording_duration'],
    subject1Metadata['sampling_frequency'],
    len(subject1Metadata['emgs_channels']))

In [None]:
emgs1.to_csv(
    os.path.join(
        subject1Path,
        subject1Metadata['raw_files']['emgs']))
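Although the file carries the `.bfts` extension, `to_csv` writes plain comma-separated text, so it can be read back with `pd.read_csv`. The sketch below shows the round trip on a tiny made-up table; the file name `demo.bfts` and the sample values are for illustration only.

```python
import os
import tempfile
import pandas as pd

# tiny stand-in for the EMG table: time index in microseconds, one channel
demo_signals = pd.DataFrame({
    'time uS': [0.0, 100.0, 200.0],
    'channel 1uV': [0.0, 0.5, -0.25],
}).set_index('time uS')

demo_file = os.path.join(tempfile.mkdtemp(), 'demo.bfts')
demo_signals.to_csv(demo_file)

# index_col restores the time column as the index
loaded_signals = pd.read_csv(demo_file, index_col='time uS')
print(loaded_signals)
```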

Generate stimulation times

In [None]:
stim1 = generateStimTimes(subject1Metadata['recording_duration'])

In [None]:
stim1.columns = ['Stimulation times']

In [None]:
stim1.to_csv(
    os.path.join(
        subject1Path,
        subject1Metadata['raw_files']['stim']))

Download a random image from the internet

In [None]:
imageUrl1 = 'https://today.duke.edu/sites/default/files/styles/story_hero/public/bonoboface.jpg'

In [None]:
urllib.request.urlretrieve(
    imageUrl1,
    os.path.join(
        subject1Path,
        subject1Metadata['raw_files']['picture']))
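The download depends on an external site, so it can fail if the image has moved or the network is unavailable. One possible way to keep the notebook running in that case is to wrap the call and report the failure, as sketched below. The wrapper name `try_download` is hypothetical, and the demonstration uses local `file://` URLs so that it runs without network access.

```python
import os
import tempfile
import urllib.error
import urllib.request
from pathlib import Path

def try_download(url, destination):
    """Hypothetical wrapper: return True on success, False on failure."""
    try:
        urllib.request.urlretrieve(url, destination)
        return True
    except (urllib.error.URLError, OSError, ValueError) as error:
        # URLError covers network and missing-resource failures,
        # ValueError covers malformed URLs
        print(f'Could not download {url}: {error}')
        return False

# demonstration with local files instead of a remote image
demo_dir = tempfile.mkdtemp()
source_file = Path(demo_dir) / 'source.jpg'
source_file.write_bytes(b'placeholder bytes, not a real image')
target_file = os.path.join(demo_dir, 'subject1.jpg')

ok = try_download(source_file.as_uri(), target_file)
missing = try_download((Path(demo_dir) / 'missing.jpg').as_uri(),
                       os.path.join(demo_dir, 'other.jpg'))
print(ok, missing)
```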

### Subject 2
We create all the files for subject 2, including the subfolder containing them

In [None]:
subject2Path = os.path.join(DATASET_PATH,'subject_2')
subject2Path

In [None]:
createDirIfNeeded(subject2Path)

In [None]:
# creates metadata json file
subject2Metadata = {
    'subject_name' : 'Pier',
    'subject_id' : 2,
    'subject_species' : 'ferret',
    'date_tested' : '2018/07/07',
    'raw_files' : {
        'picture' : 'subject2.jpg',
        'emgs'    : 'subject2.bfts',
        'stim'    : 'subject2.csv'
    },
    'stim_location' : 'ring finger, right paw',
    'emgs_channels' : [
        {
            'channel' : 1,
            'location' : 'FCL',
            'muscle_name' : 'Flexor Carpi Radialis'
        },
        {
            'channel' : 2,
            'location' : 'FCU',
            'muscle_name' : 'Flexor Carpi Ulnaris'
        },
        {
            'channel' : 3,
            'location' : 'ED4',
            'muscle_name' : 'Extensor Digitorum Ring'
        },
        {
            'channel' : 4,
            'location' : 'FDS2',
            'muscle_name' : 'Flexor Digitorum Superficialis Index'
        }
    ],
    'sampling_frequency' : 10000,
    'sampling_frequency_units' : 'Hz',
    'emg_units' : 'uV',
    'time_units' : 's',
    'recording_duration' : 10
}

In [None]:
subject2MetadataPath = os.path.join(subject2Path,'subject2.json')
subject2MetadataPath

In [None]:
with open(subject2MetadataPath,'w') as fh:
    json.dump(subject2Metadata,fh)

Generate EMG signals

In [None]:
emgs2 = generateSignals(
    subject2Metadata['recording_duration'],
    subject2Metadata['sampling_frequency'],
    len(subject2Metadata['emgs_channels']))

In [None]:
emgs2.to_csv(
    os.path.join(
        subject2Path,
        subject2Metadata['raw_files']['emgs']))

Generate stim times

In [None]:
stim2 = generateStimTimes(subject2Metadata['recording_duration'])

In [None]:
stim2.columns = ['Stimulation times']

In [None]:
stim2.to_csv(
    os.path.join(
        subject2Path,
        subject2Metadata['raw_files']['stim']))

Download the image

In [None]:
imageUrl2 = 'https://www.petmd.com/sites/default/files/flea-infestation-ferrets.jpg'

In [None]:
urllib.request.urlretrieve(
    imageUrl2,
    os.path.join(
        subject2Path,
        subject2Metadata['raw_files']['picture']))

### Subject 3
We create all the files for subject 3, including the subfolder containing them

In [None]:
subject3Path = os.path.join(DATASET_PATH,'subject_3')
subject3Path

In [None]:
createDirIfNeeded(subject3Path)

Create the metadata JSON file

In [None]:
subject3Metadata = {
    'subject_name' : 'Robi',
    'subject_id' : 3,
    'subject_species' : 'feline',
    'date_tested' : '2018/08/08',
    'raw_files' : {
        'picture' : 'subject3.jpg',
        'emgs'    : 'subject3.bfts',
        'stim'    : 'subject3.csv'
    },
    'stim_location' : 'thumb, left front paw',
    'emgs_channels' : [
        {
            'channel' : 1,
            'location' : 'FCL',
            'muscle_name' : 'Flexor Carpi Radialis'
        },
        {
            'channel' : 3,
            'location' : 'ED2',
            'muscle_name' : 'Extensor Digitorum Index'
        },
        {
            'channel' : 4,
            'location' : 'FDS2',
            'muscle_name' : 'Flexor Digitorum Superficialis Index'
        }
    ],
    'sampling_frequency' : 10050,
    'sampling_frequency_units' : 'Hz',
    'emg_units' : 'uV',
    'time_units' : 's',
    'recording_duration' : 9
}

In [None]:
subject3MetadataPath = os.path.join(subject3Path,'subject3.json')
subject3MetadataPath

In [None]:
with open(subject3MetadataPath,'w') as fh:
    json.dump(subject3Metadata,fh)

Generate EMG signals

In [None]:
emgs3 = generateSignals(
    subject3Metadata['recording_duration'],
    subject3Metadata['sampling_frequency'],
    len(subject3Metadata['emgs_channels']))

In [None]:
emgs3.to_csv(
    os.path.join(
        subject3Path,
        subject3Metadata['raw_files']['emgs']))

Generate stim times

In [None]:
stim3 = generateStimTimes(subject3Metadata['recording_duration'])

In [None]:
stim3.columns = ['Stimulation times']

In [None]:
stim3.to_csv(
    os.path.join(
        subject3Path,
        subject3Metadata['raw_files']['stim']))

Download the image

In [None]:
imageUrl3 = 'http://images.fineartamerica.com/images-medium-large/portrait-gray-tabby-cat-maika-777.jpg'

In [None]:
urllib.request.urlretrieve(
    imageUrl3,
    os.path.join(
        subject3Path,
        subject3Metadata['raw_files']['picture']))

The dataset has been created.  
Let's check the structure and the files that we created.

In [None]:
prettyShowFolderTree(DATASET_PATH)
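`prettyShowFolderTree` comes from the library notebook and is not shown here. For readers without that notebook, a similar listing can be produced with `os.walk`, as in the sketch below. The function name `format_folder_tree` and the output layout are assumptions; the real helper may format the tree differently.

```python
import os
import tempfile

def format_folder_tree(root):
    """Hypothetical stand-in for prettyShowFolderTree: tree as text lines."""
    lines = []
    root = os.path.abspath(root)
    for current, dirs, files in os.walk(root):
        dirs.sort()  # visit subfolders in alphabetical order
        relative = os.path.relpath(current, root)
        depth = 0 if relative == '.' else relative.count(os.sep) + 1
        lines.append('    ' * depth + os.path.basename(current) + '/')
        for name in sorted(files):
            lines.append('    ' * (depth + 1) + name)
    return lines

# build a small demo layout that mimics the dataset structure
demo_dataset = os.path.join(tempfile.mkdtemp(), 'dataset')
os.makedirs(os.path.join(demo_dataset, 'subject_1'))
open(os.path.join(demo_dataset, 'dataset.json'), 'w').close()
open(os.path.join(demo_dataset, 'subject_1', 'subject1.json'), 'w').close()

tree_lines = format_folder_tree(demo_dataset)
print('\n'.join(tree_lines))
```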

## Thank you.

If you have any questions, please come and find me during breaks. This notebook will be available to you through the SPARC material.
### Max Novelli (man8@pitt.edu)