In [1]:
import json
from datetime import datetime
import numpy as np
from copy import copy

channels_names = ['Fp1','Fp2','F7','F3','Fz','F4','F8','T7','T8','P7','P3','Pz','P4','P8','O1','O2']

# Trial-based Database

## Creating a Trial-based database

In this section, we will walk through the process of creating a trial-based database. First, set up a test database: import the required libraries and create an API instance:

In [5]:
from dunderlab.api import aioAPI as API
from dunderlab.api.utils import JSON

api = API('http://localhost:8000/timescaledbapp/')

Register a new source for the database:

In [6]:
source_response = await api.source.post({
    'label': 'Test.v3',
    'name': 'Test Database',
    'location': 'Eje Cafetero',
    'device': 'None',
    'protocol': 'None',
    'version': '0.1',
    'description': 'Sample trial-based database for TimeScaleDBApp',
})

JSON(source_response)


{
  "label": "Test.v3",
  "name": "Test Database",
  "location": "Eje Cafetero",
  "device": "None",
  "protocol": "None",
  "version": "0.1",
  "description": "Sample trial-based database for TimeScaleDBApp",
  "created": "2023-05-23T13:38:42.397888Z",
}


Register a new measure:

In [70]:
measure_response = await api.measure.post({
    'source': 'Test.v3',
    'label': 'measure_04',
    'name': 'Measure 01',
    'description': 'Simple sinusoidals for 64 channels at different frequencies',
})

JSON(measure_response)


{
  "label": "measure_04",
  "name": "Measure 01",
  "description": "Simple sinusoidals for 64 channels at different frequencies",
  "source": "Test.v3",
}


Register the channels:

In [71]:
channels_names = ['Fp1','Fp2','F7','F3','Fz','F4','F8','T7','T8','P7','P3','Pz','P4','P8','O1','O2']

channel_response = await api.channel.post([{
    'source': 'Test.v3',
    'measure': 'measure_04',
    'name': f'Channel {channel}',
    'label': f'{channel}',
    'unit': 'u',
    'sampling_rate': '1000',
} for channel in channels_names])

JSON(channel_response)

[
  {
    "label": "Fp1",
    "name": "Channel Fp1",
    "unit": "u",
    "sampling_rate": 1000.0,
    "description": null,
    "measure": "measure_04",
    "source": "Test.v3",
  }, 
  {
    "label": "Fp2",
    "name": "Channel Fp2",
    "unit": "u",
    "sampling_rate": 1000.0,
    "description": null,
    "measure": "measure_04",
    "source": "Test.v3",
  }, 
  {
    "label": "F7",
    "name": "Channel F7",
    "unit": "u",
    "sampling_rate": 1000.0,
    "description": null,
    "measure": "measure_04",
    "source": "Test.v3",
  }, 
  {
    "label": "F3",
    "name": "Channel F3",
    "unit": "u",
    "sampling_rate": 1000.0,
    "description": null,
    "measure": "measure_04",
    "source": "Test.v3",
  }, 
  {
    "label": "Fz",
    "name": "Channel Fz",
    "unit": "u",
    "sampling_rate": 1000.0,
    "description": null,
    "measure": "measure_04",
    "source": "Test.v3",
  }, ...]


Now that you have set up the test database and registered the required components (source, measure, and channels), you can proceed with uploading time series data and creating trials. After uploading the data, you can query the trials and reconstruct the data for further analysis. The following sections demonstrate how to perform these tasks.

## Creating data structure and class vector

In this section, we will create the required data structure, which is a three-dimensional array with axes (trial, channel, time), and a vector of classes. The three axes of the array index the trials, channels, and time samples in the dataset. The class vector contains the class label for each trial.

In [67]:
trials_per_class = 3
classes = 4
raw_data = np.random.normal(size=(trials_per_class*classes, len(channels_names), 1000)) # trials, channels, time

classes = np.array([f'cLass-{cls}' for cls in range(classes)] *  trials_per_class)
np.random.shuffle(classes)

raw_data.shape, classes.shape

((12, 16, 1000), (12,))

This code snippet generates random data with a shape of (12, 16, 1000), representing 12 trials, 16 channels, and 1000 time samples. The trials are divided equally among four classes, with three trials per class. The data is drawn from a normal distribution with NumPy's `np.random.normal`, which creates an array with the specified shape.

In addition to the data, a class vector with 12 elements is created. It holds the labels `cLass-0` through `cLass-3`, each repeated three times, and is shuffled so that the classes are distributed randomly across the trials. Note that the name `classes` first holds the number of classes and is then reassigned to the label vector. The class vector will be used later to associate each trial with its corresponding class.
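As a quick sanity check, the following sketch rebuilds a label vector in the same way and confirms that shuffling preserves the per-class balance. The fixed seed and the names `n_classes` and `labels` are illustrative choices, not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only to make this sketch reproducible

trials_per_class = 3
n_classes = 4  # hypothetical name, kept separate from the label vector

# One label per trial, repeated so every class appears trials_per_class times
labels = np.array([f'cLass-{cls}' for cls in range(n_classes)] * trials_per_class)
rng.shuffle(labels)  # shuffles in place; the number of trials per class is unchanged

unique_labels, counts = np.unique(labels, return_counts=True)
print(dict(zip(unique_labels.tolist(), counts.tolist())))
```

Keeping the class count and the label vector under different names avoids the reassignment of `classes` seen in the cell above.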

## Uploading trials

In this section, we will demonstrate how to upload data to the database, including the `chunk` argument. This argument associates each block of time series data with a specific trial and its corresponding class.

In [72]:
data = []
for i, (trial, class_) in enumerate(zip(raw_data, classes)):
    data.append({
        'source': 'Test.v3',
        'measure': 'measure_04',
        'timestamps': np.linspace(i, i+1, 1000, endpoint=False).tolist(),
        'chunk': class_,
        'values': {ch: v.tolist() for ch, v in zip(channels_names, trial)}
    })
    
JSON(data[:3])

[
  {
    "source": "Test.v3",
    "measure": "measure_04",
    "timestamps": [0.0, 0.001, 0.002, 0.003, 0.004, ...],
    "chunk": "cLass-1",
    "values": 
    {
      "Fp1": [-1.6723759831277665, -0.9473302997458278, 1.0350157020911188, -0.13423741313738147, 0.44213124997099895, ...],
      "Fp2": [1.680885905906827, -0.1733593233589856, -0.011461002564741505, 1.1170364796446257, -0.28294682413068106, ...],
      "F7": [0.18728360215630885, -0.33346644921009494, 2.5875332167554745, 2.3871154156571293, 0.5436863013530485, ...],
      "F3": [1.2093193193276395, -1.3937398652106703, 2.118364410248672, 0.03893524317374443, -0.7148656465038554, ...],
      "Fz": [0.0329858016009721, -0.47006625363272, -0.0644958255885579, -0.5062373449531851, -0.19944962870579777, ...],
      "F4": [-0.33191532288976977, -0.17625404968931085, 0.290702978810691, 0.4433066292703765, -2.390242683154221, ...],
      "F8": [0.6248110949787076, -0.405694648015326, -0.0713277858295882, -0.8391293842673967, -1.31

This code snippet builds a list with one entry per trial. Each entry carries the `source` and `measure` labels, a `timestamps` list, the `chunk` field set to the trial's class label, and a `values` dictionary that maps each channel label to its samples. The timestamps of trial `i` cover the interval from `i` up to, but not including, `i+1` in 1000 steps, so consecutive trials do not overlap in time.
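The timestamp layout can be checked in isolation. The sketch below assumes, as in the cell above, that timestamps are plain floating-point values and that trial `i` occupies the window from `i` to `i+1`; the helper name `trial_timestamps` is hypothetical.

```python
import numpy as np

def trial_timestamps(i, n_samples=1000):
    """Return n_samples evenly spaced timestamps covering [i, i+1)."""
    # endpoint=False leaves out i+1, so consecutive trials never share a timestamp
    return np.linspace(i, i + 1, n_samples, endpoint=False)

t0 = trial_timestamps(0)
t1 = trial_timestamps(1)

print(np.allclose(np.diff(t0), 0.001))  # constant spacing of 0.001 between samples
print(t0[-1] < t1[0])                   # trial 0 ends before trial 1 starts
```

If the timestamps are read as seconds, a spacing of 0.001 is consistent with the `sampling_rate` of 1000 registered for the channels.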

In this section, we will upload the time series data to the database, following the same procedure as before.

In [73]:
await api.timeserie.post(data, batch_size=32)

[{}]

This code snippet demonstrates how to upload the time series data to the database using the `api.timeserie.post()` method with a specified batch size of 32. The batch size determines how many data points are uploaded to the database in a single request, which can help improve the efficiency of the upload process.
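How the client splits the upload internally is not shown in this notebook. Purely as an illustration of what batching means, the sketch below splits a list of records into groups of at most `batch_size` items. The helper `iter_batches` is hypothetical and is not part of the Dunderlab API.

```python
def iter_batches(records, batch_size):
    """Yield consecutive slices of records holding at most batch_size items."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]

records = list(range(70))  # stand-in for a list of payload dictionaries
batch_lengths = [len(batch) for batch in iter_batches(records, 32)]
print(batch_lengths)  # → [32, 32, 6]
```

Only the last batch can be shorter than `batch_size`, which keeps every request bounded in size.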

## Querying trials

In this section, we will demonstrate how to query the trials based on certain parameters. These parameters can be used to filter and retrieve specific data from the database. In this example, we query the 'Test.v3' source and the 'measure_04' measure for trials whose class is 'cLass-0', 'cLass-1', or 'cLass-2'.

In [87]:
trials_response = await api.timeserie.get({
    'source': 'Test.v3',
    'measure': 'measure_04',
    'chunks': ['cLass-0', 'cLass-1', 'cLass-2'],
    'channels': [
        'Fp1',
        'Fp2',
    ],
    'timestamps': 'false',
    # 'page_size': 2,
})

JSON(trials_response, max_list_len=3)


{
  "count": 9,
  "next": null,
  "previous": null,
  "results": [
    {
      "source": "Test.v3",
      "measure": "measure_04",
      "timestamps": [],
      "values": 
      {
        "Fp1": [-1.6723759831277665, -0.9473302997458278, 1.0350157020911188, ...],
        "Fp2": [1.680885905906827, -0.1733593233589856, -0.011461002564741505, ...],
      },
      "chunk": "cLass-1",
    }, 
    {
      "source": "Test.v3",
      "measure": "measure_04",
      "timestamps": [],
      "values": 
      {
        "Fp1": [0.2925707332508038, 0.7009836522831058, 0.19938579486057034, ...],
        "Fp2": [0.07687330948265829, 1.5640173039622551, -0.9889755990090431, ...],
      },
      "chunk": "cLass-1",
    }, 
    {
      "source": "Test.v3",
      "measure": "measure_04",
      "timestamps": [],
      "values": 
      {
        "Fp1": [0.09047255018464918, -0.627132504011511, 0.9677498234443604, ...],
        "Fp2": [0.7810200018072582, -0.7662184612206309, -0.01198492672771307, ...],
   

In this example, the trials are queried from the 'Test.v3' source for three of the four classes. The `channels` parameter restricts the response to the `Fp1` and `Fp2` channels, and the `timestamps` parameter is set to 'false' to exclude time information, which is why each `timestamps` list in the response is empty. The `page_size` parameter is commented out, so it is not sent with the request. The resulting `trials_response` holds 9 trials (3 classes with 3 trials each) in its `results` list, providing a convenient way to analyze and process the data.
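The `count` of 9 in the response follows from the design of the dataset: three of the four classes are requested, each with three trials. A minimal sketch of that arithmetic, using a locally rebuilt label vector rather than the database:

```python
import numpy as np

trials_per_class = 3
labels = np.array([f'cLass-{cls}' for cls in range(4)] * trials_per_class)

requested = ['cLass-0', 'cLass-1', 'cLass-2']
mask = np.isin(labels, requested)  # True for trials whose label was requested

print(int(mask.sum()))  # → 9
```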

## Reconstructing data from queried trials

To reconstruct the data, we will iterate through the trials in the response and extract the channel values and trial classes. This process allows us to reassemble the data into a suitable format for further analysis or processing.

In [94]:
trials = []
classes = []
for trial in trials_response['results']:
    trials.append(list(trial['values'].values()))
    classes.append(trial['chunk'])
    
np.array(trials).shape, np.array(classes).shape

((9, 2, 1000), (9,))

### Utilizing the `get_data` function from the Dunderlab API
The script below uses the `get_data` function from the Dunderlab API to achieve the same goal as the code above. This function simplifies the process of reconstructing the data from the queried trials.

In [89]:
from dunderlab.api.utils import get_data

trials, classes = get_data(trials_response)
trials.shape, classes.shape

((9, 2, 1000), (9,))

In this example, the reconstructed data has a shape of (9, 2, 1000), representing 9 trials, each with 2 channels and 1000 time points. The reconstructed classes array has a shape of (9,), one class label per trial. This reconstructed data can now be used for further analysis or processing, such as machine learning or visualization tasks.
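For machine learning tasks, the string labels usually need to be converted to integers. The sketch below builds a small mock response with the same layout as `trials_response` and encodes the labels with `np.unique`. The values are synthetic, and the `channel_order` list is an assumption introduced here to make the channel axis explicit.

```python
import numpy as np

rng = np.random.default_rng(0)
channel_order = ['Fp1', 'Fp2']  # explicit order for the channel axis

# Mock response mimicking the structure of the query result (synthetic values)
mock_response = {
    'results': [
        {'values': {ch: rng.normal(size=1000).tolist() for ch in channel_order},
         'chunk': label}
        for label in ['cLass-1', 'cLass-0', 'cLass-2', 'cLass-1']
    ]
}

# Index channels by name instead of relying on dictionary order
X = np.array([[trial['values'][ch] for ch in channel_order]
              for trial in mock_response['results']])
labels = np.array([trial['chunk'] for trial in mock_response['results']])

# return_inverse gives, for each trial, the index of its label in class_names
class_names, y = np.unique(labels, return_inverse=True)

print(X.shape, y.tolist())  # → (4, 2, 1000) [1, 0, 2, 1]
```

Looking up channels by name keeps the channel axis stable even if a response lists the channels in a different order.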