DBInterface
==========

.. warning:: This class is in development and has not been released.

The [DBInterface](api/kineticstoolkit.dbinterface.rst) class is used exclusively at the [Mobility and Adaptive Sports Research Lab](https://felixchenier.uqam.ca) and may be removed from ktk someday. Its aim is to interface with the BIOMEC database (https://felixchenier.uqam.ca/biomec) to fetch all non-personal information about a specified ongoing project.

It is necessary to understand how data files are stored in BIOMEC before reading this tutorial.

Please note that the user/password combination used in this tutorial is not valid: you need your own access to BIOMEC to use ktk.dbinterface.

In [1]:
import kineticstoolkit.lab as ktk
import os
import shutil

Connecting to a BIOMEC project
------------------------------
The class constructor connects to the project and asks for the user's
credentials and for the folder where the data files are stored.

    project = ktk.DBInterface(project_label)

For example:

    project = ktk.DBInterface('FC_XX18A')

The constructor can also be run non-interactively:

In [2]:
project_label = 'dummyProject'
user = 'dummyUser'
password = 'dummyPassword'
root_folder = 'data/dbinterface/FC_XX18A'
url = ''
url = 'http://localhost/biomec'  # This line is only for this tutorial,
                                 # please don't execute it.

project = ktk.DBInterface(project_label,
                          user=user,
                          password=password,
                          root_folder=root_folder,
                          url=url)
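
Hard-coding a password in a script makes it easy to leak. One option, sketched below, is to read the constructor arguments from environment variables. The helper `read_connection_settings` and the variable names `BIOMEC_USER`, `BIOMEC_PASSWORD` and `BIOMEC_ROOT_FOLDER` are hypothetical; only the keyword names `user`, `password` and `root_folder` come from the call above.

```python
import os


def read_connection_settings(environ=os.environ):
    """Collect DBInterface keyword arguments from environment variables.

    The variable names below are hypothetical; adapt them to your setup.
    """
    required = {
        'user': 'BIOMEC_USER',
        'password': 'BIOMEC_PASSWORD',
        'root_folder': 'BIOMEC_ROOT_FOLDER',
    }
    missing = [name for name in required.values() if name not in environ]
    if missing:
        raise KeyError('Missing environment variables: ' + ', '.join(missing))
    return {key: environ[name] for key, name in required.items()}


# Demonstration with a plain dict standing in for os.environ.
settings = read_connection_settings({
    'BIOMEC_USER': 'dummyUser',
    'BIOMEC_PASSWORD': 'dummyPassword',
    'BIOMEC_ROOT_FOLDER': 'data/dbinterface/FC_XX18A',
})
print(sorted(settings))
```

The resulting dict could then be unpacked into the constructor, for instance `ktk.DBInterface(project_label, **settings)`.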

Navigating in the project
-------------------------
Just typing ``project`` gives an overview of the project's content.

In [3]:
project

--------------------------------------------------
DBInterface
--------------------------------------------------
          url: http://localhost/biomec
         user: dummyUser
project_label: dummyProject
  root_folder: data/dbinterface/FC_XX18A
--------------------------------------------------
participants:
['P1']
--------------------------------------------------
sessions:
['GymnaseN1', 'ArenaCESR1']
--------------------------------------------------
trials:
['Run1', 'Run2', 'Walk1', 'Walk2']
--------------------------------------------------
files:
['Kinematics', 'Kinetics', 'SyncedKinematics']
--------------------------------------------------

The method ``get`` is used to extract the project's contents. It always returns
a dict with the fields corresponding to the request. For example:

In [4]:
project.get()

{
          'Participants': <list of 1 items>,
        'ProjectEndDate': None,
             'ProjectID': 11,
          'ProjectLabel': 'dummyProject',
      'ProjectStartDate': None,
          'ProjectTitle': 'Dummy Project for the ktkDBInterface tutorial'
}

In [5]:
project.get('P1')

{
                   'AIS': None,
           'DateOfBirth': None,
          'DateOfInjury': None,
          'DominantSide': None,
         'ParticipantID': 127,
      'ParticipantLabel': 'P1',
             'Pathology': 'Pathologie inconnue',
             'ProjectID': 11,
          'ProjectLabel': 'dummyProject',
              'Sessions': <list of 2 items>,
                   'Sex': None,
             'Traumatic': None,
                   'UID': 1
}

In [6]:
project.get('P1')['Sessions']

['GymnaseN1', 'ArenaCESR1']

In [7]:
project.get('P1', 'GymnaseN1')

{
          'ParticipantID': 127,
       'ParticipantLabel': 'P1',
             'PlaceLabel': 'GymnaseN',
            'SessionDate': '2018-10-17',
              'SessionID': 127,
           'SessionLabel': 'GymnaseN1',
           'SessionNotes': None,
      'SessionRepetition': 1,
                 'Trials': <list of 4 items>
}

In [8]:
project.get('P1', 'GymnaseN1')['Trials']

['Run1', 'Run2', 'Walk1', 'Walk2']

In [9]:
project.get('P1', 'GymnaseN1', 'Run1')

{
                     'Files': <list of 3 items>,
          'ParticipantLabel': 'P1',
                 'SessionID': 127,
              'SessionLabel': 'GymnaseN1',
                   'TrialID': 3150,
                'TrialLabel': 'Run1',
                'TrialNotes': None,
           'TrialRepetition': 1,
      'TrialTypeDescription': None,
               'TrialTypeID': 204,
            'TrialTypeLabel': 'Run'
}

In [10]:
project.get('P1', 'GymnaseN1', 'Run1')['Files']

['Kinematics', 'Kinetics', 'SyncedKinematics']

In [11]:
project.get('P1', 'GymnaseN1', 'Run1', 'Kinematics')

{
      'FileFormatExtension': None,
          'FileFormatLabel': 'C3D',
                   'FileID': 6681,
                'FileLabel': 'Kinematics',
                 'FileName': 'data/dbinterface/FC_XX18A/kinematics1_dbfid6681n.c3d',
      'FileTypeDescription': 'Kinematics recorded using a Vicon system',
               'FileTypeID': 29,
            'FileTypeLabel': 'Kinematics',
         'ParticipantLabel': 'P1',
             'SessionLabel': 'GymnaseN1',
                  'TrialID': 3150,
               'TrialLabel': 'Run1',
                    'dbfid': 'dbfid6681n'
}

In [12]:
project.get('P1', 'GymnaseN1', 'Run1', 'Kinematics')['FileName']

'data/dbinterface/FC_XX18A/kinematics1_dbfid6681n.c3d'
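
The successive ``get`` calls above can be chained to walk the whole project. The sketch below assumes that ``project.get()['Participants']`` is a list of participant labels, by analogy with the ``Sessions``, ``Trials`` and ``Files`` lists shown above. So that it runs without BIOMEC access, it uses a hypothetical in-memory stand-in, `FakeProject`, which is not part of ktk.

```python
def list_file_names(project):
    """Return (participant, session, trial, file label, file name) tuples.

    `project` is any object whose `get` method behaves like the calls
    shown in this tutorial.
    """
    rows = []
    for participant in project.get()['Participants']:
        for session in project.get(participant)['Sessions']:
            for trial in project.get(participant, session)['Trials']:
                files = project.get(participant, session, trial)['Files']
                for file_label in files:
                    info = project.get(participant, session, trial,
                                       file_label)
                    rows.append((participant, session, trial, file_label,
                                 info['FileName']))
    return rows


class FakeProject:
    """Hypothetical stand-in for a DBInterface, for this sketch only."""

    def __init__(self, tree):
        self._tree = tree

    def get(self, *labels):
        node = self._tree
        for label in labels:
            node = node['children'][label]
        # Hide the internal 'children' key from the returned fields.
        return {k: v for k, v in node.items() if k != 'children'}


tree = {
    'Participants': ['P1'],
    'children': {'P1': {
        'Sessions': ['GymnaseN1'],
        'children': {'GymnaseN1': {
            'Trials': ['Run1'],
            'children': {'Run1': {
                'Files': ['Kinematics'],
                'children': {'Kinematics': {
                    'FileName': 'kinematics1_dbfid6681n.c3d',
                }},
            }},
        }},
    }},
}

for row in list_file_names(FakeProject(tree)):
    print(row)
```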

Saving data to a BIOMEC referenced file
---------------------------------------
The ktk library provides the ktk.save function, which saves a variable to a JSON file and is helpful for saving temporary results.

However, sometimes we need to save results to BIOMEC so that these results become new inputs for subsequent work. In this case, we use the ``save`` method of ktk.DBInterface.

For example, let's say we just synchronized the kinematics for Run1 of participant 1:

In [13]:
synced_kinematics = {'dummy_data':
                     'Normally we would save something more useful'}

We can save these kinematics as a file that is referenced in BIOMEC, using:

In [14]:
project.save('P1', 'GymnaseN1', 'Run1', 'SyncedKinematics', synced_kinematics)

'data/dbinterface/FC_XX18A/SyncedKinematics/P1/GymnaseN1/dbfid11524n_{Run1}.ktk.zip'

This creates the file entry in BIOMEC if needed, then saves the file with
a relevant name into the project folder.

We can now obtain the file name using the ``get`` method introduced previously.

In [15]:
project.get('P1', 'GymnaseN1', 'Run1', 'SyncedKinematics')['FileName']

'data/dbinterface/FC_XX18A/SyncedKinematics/P1/GymnaseN1/dbfid11524n_{Run1}.ktk.zip'
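
The returned file name appears to follow a regular layout: root folder, file type, participant, session, then a base name made of the dbfid and the trial label. The sketch below rebuilds that name from its parts. The layout is inferred from this single output and is an assumption, not a documented naming rule, and `referenced_file_name` is a hypothetical helper.

```python
import posixpath


def referenced_file_name(root_folder, file_label, participant, session,
                         trial, file_id, extension='.ktk.zip'):
    """Build a file name following the layout observed above (assumed)."""
    base_name = 'dbfid' + str(file_id) + 'n_{' + trial + '}' + extension
    return posixpath.join(root_folder, file_label, participant, session,
                          base_name)


file_name = referenced_file_name(
    'data/dbinterface/FC_XX18A', 'SyncedKinematics',
    'P1', 'GymnaseN1', 'Run1', 11524)
print(file_name)
```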

Loading data from a BIOMEC referenced file
------------------------------------------
To load back data saved to BIOMEC, we use the ktk.DBInterface's ``load``
method.

In [16]:
test = project.load('P1', 'GymnaseN1', 'Run1', 'SyncedKinematics')

test

{
      'dummy_data': 'Normally we would save something more useful'
}

Let's do a little clean up before going on.

In [17]:
shutil.rmtree(root_folder + '/SyncedKinematics')

Dealing with external software
------------------------------
The DBInterface's ``save`` and ``load`` methods work very well for data that were processed in Python using ktk. However, things may get complicated when using external software to process data.

In this example, we will synchronize the kinematics using an external synchronizing tool, then enter the resulting files into BIOMEC. We will work with these files:

In [18]:
file_list = []
for trial in ['Walk1', 'Walk2', 'Run1', 'Run2']:
    file_list.append(project.get(
        'P1', 'GymnaseN1', trial, 'Kinematics')['FileName'])

file_list

['data/dbinterface/FC_XX18A/kinematics3_dbfid6685n.c3d',
 'data/dbinterface/FC_XX18A/kinematics4_dbfid6687n.c3d',
 'data/dbinterface/FC_XX18A/kinematics1_dbfid6681n.c3d',
 'data/dbinterface/FC_XX18A/kinematics2_dbfid6683n.c3d']

Let's say we synchronized these files using external software, and then
exported the synchronized files into a separate folder.

(Here we will simply copy those files into a separate folder).

In [19]:
os.mkdir(root_folder + '/synchronized_files')

for file in file_list:
    dest_file = file.replace(root_folder, root_folder + '/synchronized_files')
    shutil.copyfile(file, dest_file)

os.listdir(root_folder + '/synchronized_files')

['kinematics3_dbfid6685n.c3d',
 'kinematics2_dbfid6683n.c3d',
 'kinematics4_dbfid6687n.c3d',
 'kinematics1_dbfid6681n.c3d']
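
The copy step can also be written with ``pathlib``, which avoids string replacement on paths and does not fail if the destination folder already exists. This is only a sketch: `copy_into_subfolder` is a hypothetical helper, and the demonstration runs in a temporary folder with empty dummy files rather than on real project data.

```python
import shutil
import tempfile
from pathlib import Path


def copy_into_subfolder(file_names, root_folder, subfolder):
    """Copy files into root_folder/subfolder, keeping their base names."""
    destination = Path(root_folder) / subfolder
    destination.mkdir(parents=True, exist_ok=True)  # no error if it exists
    copied = []
    for file_name in file_names:
        target = destination / Path(file_name).name
        shutil.copyfile(file_name, target)
        copied.append(target)
    return copied


# Self-contained demonstration with empty dummy files.
with tempfile.TemporaryDirectory() as root:
    sources = []
    for name in ['kinematics1_dbfid6681n.c3d', 'kinematics2_dbfid6683n.c3d']:
        source = Path(root) / name
        source.touch()
        sources.append(source)
    copied = copy_into_subfolder(sources, root, 'synchronized_files')
    copied_names = sorted(path.name for path in copied)

print(copied_names)
```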

All is good, but the dbfids in the new ``synchronized_files`` folder refer to
the original ``Kinematics`` file type, not to the ``SyncedKinematics`` file
type. Moreover, there are now duplicate dbfids in the project:

In [20]:
project.refresh()



In [21]:
project.duplicates

[('data/dbinterface/FC_XX18A/synchronized_files/kinematics3_dbfid6685n.c3d',
  'data/dbinterface/FC_XX18A/kinematics3_dbfid6685n.c3d'),
 ('data/dbinterface/FC_XX18A/synchronized_files/kinematics2_dbfid6683n.c3d',
  'data/dbinterface/FC_XX18A/kinematics2_dbfid6683n.c3d'),
 ('data/dbinterface/FC_XX18A/synchronized_files/kinematics4_dbfid6687n.c3d',
  'data/dbinterface/FC_XX18A/kinematics4_dbfid6687n.c3d'),
 ('data/dbinterface/FC_XX18A/synchronized_files/kinematics1_dbfid6681n.c3d',
  'data/dbinterface/FC_XX18A/kinematics1_dbfid6681n.c3d')]
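
To make the idea of a duplicate dbfid concrete, here is a minimal sketch that groups paths by the dbfid found in their base name. The pattern ``dbfid<number>n`` is inferred from the file names in this tutorial, and `find_duplicate_dbfids` is a hypothetical helper; it is not how ``project.duplicates`` is necessarily computed.

```python
import re
from collections import defaultdict

# Pattern inferred from the file names shown in this tutorial.
DBFID_PATTERN = re.compile(r'dbfid(\d+)n')


def find_duplicate_dbfids(paths):
    """Map each dbfid number found in more than one path to those paths."""
    groups = defaultdict(list)
    for path in paths:
        base_name = path.rsplit('/', 1)[-1]
        match = DBFID_PATTERN.search(base_name)
        if match:
            groups[int(match.group(1))].append(path)
    return {dbfid: found for dbfid, found in groups.items()
            if len(found) > 1}


paths = [
    'data/dbinterface/FC_XX18A/kinematics1_dbfid6681n.c3d',
    'data/dbinterface/FC_XX18A/kinematics2_dbfid6683n.c3d',
    'data/dbinterface/FC_XX18A/synchronized_files/kinematics1_dbfid6681n.c3d',
]
duplicates = find_duplicate_dbfids(paths)
print(duplicates)
```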

Therefore we need to assign new dbfids to the files we just synchronized, so that they refer to ``SyncedKinematics`` entries in BIOMEC. The method ``batch_fix_file_type`` will help.

In [22]:
project.batch_fix_file_type(root_folder + '/synchronized_files',
                            'SyncedKinematics',
                            create_file_entries=True,
                            dry_run=False)

{
               'Ignore': <list of 0 items>,
      'NoFileTypeLabel': <list of 0 items>,
               'Rename': <list of 4 items>
}

Now let's see what happened in the ``synchronized_files`` folder:

In [23]:
os.listdir(root_folder + '/synchronized_files')

['kinematics4_dbfid11526n.c3d',
 'kinematics1_dbfid11524n.c3d',
 'kinematics3_dbfid8491n.c3d',
 'kinematics2_dbfid11525n.c3d']

The files' dbfids have been updated so that they now refer to ``SyncedKinematics`` and no longer to ``Kinematics``. Moreover, the project no longer has duplicate dbfids:

In [24]:
project.refresh()

Let's do a little clean up before going on.

In [25]:
shutil.rmtree(root_folder + '/synchronized_files')
project.refresh()

Including information in file names
-----------------------------------
It can be difficult to deal with a set of numbered files when the only way to know what they contain is to look them up in BIOMEC. The ``tag_files`` method adds the trial name to the file names, so that their context is clearer and working with them is less error-prone.

In [26]:
os.listdir(root_folder)

['kinetics_dbfid6688n.csv',
 'kinematics3_dbfid6685n.c3d',
 'kinetics_dbfid6684n.csv',
 '.DS_Store',
 'kinetics_dbfid6682n.csv',
 'kinematics2_dbfid6683n.c3d',
 'kinematics4_dbfid6687n.c3d',
 'kinetics_dbfid6686n.csv',
 'kinematics1_dbfid6681n.c3d']

Include the trial name in the file names:

In [27]:
project.tag_files(include_trial_name=True, dry_run=False)

os.listdir(root_folder)

Checking that the project is clean, without duplicates...
Renaming the files...
Refreshing project...


['kinematics1_dbfid6681n_{Run1}.c3d',
 'kinetics_dbfid6682n_{Run1}.csv',
 '.DS_Store',
 'kinematics4_dbfid6687n_{Walk2}.c3d',
 'kinematics3_dbfid6685n_{Walk1}.c3d',
 'kinematics2_dbfid6683n_{Run2}.c3d',
 'kinetics_dbfid6688n_{Walk2}.csv',
 'kinetics_dbfid6684n_{Run2}.csv',
 'kinetics_dbfid6686n_{Walk1}.csv']

Remove the trial name from the file names:

In [28]:
project.tag_files(include_trial_name=False, dry_run=False)

os.listdir(root_folder)

Checking that the project is clean, without duplicates...
Renaming the files...
Refreshing project...


['kinetics_dbfid6688n.csv',
 'kinematics3_dbfid6685n.c3d',
 'kinetics_dbfid6684n.csv',
 '.DS_Store',
 'kinetics_dbfid6682n.csv',
 'kinematics2_dbfid6683n.c3d',
 'kinematics4_dbfid6687n.c3d',
 'kinetics_dbfid6686n.csv',
 'kinematics1_dbfid6681n.c3d']
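
The tagging convention seen in these listings, a ``_{TrialName}`` suffix placed right after the dbfid, can be reproduced with two small string functions. This is a sketch of the naming convention only, inferred from the outputs above; `add_trial_tag` and `remove_trial_tag` are hypothetical helpers and not part of ktk.

```python
import re

# A dbfid tag followed by a trial tag such as "_{Run1}".
TAGGED = re.compile(r'(dbfid\d+n)_\{[^{}]*\}')
# A dbfid tag alone.
DBFID = re.compile(r'(dbfid\d+n)')


def remove_trial_tag(file_name):
    """Drop the "_{Trial}" suffix that follows the dbfid, if present."""
    return TAGGED.sub(r'\1', file_name)


def add_trial_tag(file_name, trial_label):
    """Insert "_{trial_label}" right after the dbfid, replacing any old tag."""
    bare = remove_trial_tag(file_name)
    return DBFID.sub(
        lambda match: match.group(1) + '_{' + trial_label + '}',
        bare, count=1)


tagged = add_trial_tag('kinematics1_dbfid6681n.c3d', 'Run1')
untagged = remove_trial_tag('kinetics_dbfid6688n_{Walk2}.csv')
print(tagged)
print(untagged)
```

Removing any existing tag before adding a new one keeps `add_trial_tag` safe to apply twice to the same file name.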

For more information on DBInterface, please check the [API Reference](api/kineticstoolkit.dbinterface.rst).