# Intro & Metadata

This notebook introduces the approach used in this repository and explores the metadata files.

In [1]:
%config Completer.use_jedi = False

In [2]:
import os
from pathlib import Path

import numpy as np
import pandas as pd

In [3]:
# Import local code module
import sys
sys.path.append('..')
from conv import process_session
from conv.parser import parse_lines_log, parse_lines_sync
from conv.process import process_task
from conv.io import *

## Metadata Files

Metadata for the task is organized into a series of config files, each representing a facet of the study / task. 

To process an individual subject, the configs can be loaded together and session-specific information can be added.

This combined metadata file is then used to create the output data file. 
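
As a rough sketch of what combining the configs amounts to, the snippet below merges already-parsed per-file contents into a single dictionary with one entry per facet. The `combine_configs` helper, the example contents, and the rule that the key is the file stem without a trailing `_info` are assumptions made for illustration; the repository's own `load_configs` may work differently.

```python
from pathlib import Path

def combine_configs(loaded):
    """Combine per-file config contents into one dict, keyed by facet name.

    `loaded` maps file names (e.g. 'study_info.yaml') to their parsed contents.
    Assumption: the facet name is the file stem without a trailing '_info'.
    """
    combined = {}
    for file_name, contents in loaded.items():
        stem = Path(file_name).stem
        label = stem[:-len('_info')] if stem.endswith('_info') else stem
        combined[label] = contents
    return combined

# Hypothetical, already-parsed contents of two config files
loaded = {
    'study_info.yaml': {'institution': 'Columbia University', 'session_id': 'XX'},
    'events_info.yaml': None,  # e.g. an empty YAML file, which parses to None
}
combined = combine_configs(loaded)

# Session-specific information can then be filled in on the combined object
combined['study']['session_id'] = 'session_0'  # hypothetical value
print(combined)
```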

In [4]:
# Get a list of metadata files
files = get_files('../metadata', 'yaml')

In [5]:
# Check the list of configuration files
files

['device_info.yaml',
 'electrode_info.yaml',
 'events_info.yaml',
 'soring_info.yaml',
 'study_info.yaml',
 'subject_info.yaml',
 'task_info.yaml',
 'timestamps_info.yaml',
 'units_info.yaml']

In [6]:
# Load all the files together to create an overall metadata object
metadata = load_configs(files, '../metadata')

In [7]:
# Check the metadata object
metadata

{'device': {'name': 'Microwire Electrodes',
  'description': 'Behnke Fried Micro Inner Wire Bundle',
  'manufacturer': 'Ad-Tech Medical'},
 'electrode': {'name': 'XX',
  'description': 'Behnke Fried/Micro Inner Wire Bundle.',
  'location': 'XX',
  'impedence': 'XX',
  'filtering': 'XX',
  'reference': 'XX',
  'position': 'XX, XX, XX'},
 'events': None,
 'soring': {'sorter': 'XX',
  'version': 'XX',
  'done_by': 'XX',
  'date': 'XX',
  'settings': 'XX'},
 'study': {'identifier': 'XX',
  'session_id': 'XX',
  'session_start_time': '1111/11/11',
  'session_description': 'XX',
  'experimenter': 'XX',
  'experiment_description': 'XX',
  'institution': 'Columbia University',
  'keywords': 'XX',
  'source_script': 'https://github.com/JacobsSU/ConvertXX/',
  'source_script_file_name': 'XX',
  'data_collection': 'XX',
  'stimulus_notes': 'XX',
  'lab': 'Electrophysiology, Memory, and Navigation Laboratory'},
 'subject': {'age': 'XX',
  'sex': 'XX',
  'species': 'human',
  'description': 'The su

The metadata object above includes all the metadata fields.

Where available, fields are prefilled with default values for the current task. 

For fields not filled in (marked with `XX`), the metadata needs to be entered. 
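
To see which fields still need to be entered, one option is to walk the nested metadata dictionary and collect every value that still contains the `XX` marker. The `find_unfilled` helper below is a hypothetical sketch, not part of the `conv` module, and the example dictionary is only a small excerpt in the shape of the metadata object.

```python
def find_unfilled(metadata, marker='XX', prefix=''):
    """Return the dotted paths of all string fields that still contain `marker`."""
    unfilled = []
    for key, value in metadata.items():
        path = f'{prefix}.{key}' if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections, extending the dotted path
            unfilled.extend(find_unfilled(value, marker, path))
        elif isinstance(value, str) and marker in value:
            unfilled.append(path)
    return unfilled

# Small excerpt in the shape of the metadata object (values are illustrative)
example = {
    'device': {'name': 'Microwire Electrodes'},
    'electrode': {'name': 'XX', 'position': 'XX, XX, XX'},
    'events': None,
}
print(find_unfilled(example))
# → ['electrode.name', 'electrode.position']
```

Because the check is a substring match, partially filled values such as `'XX, XX, XX'` are reported as well.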

In [8]:
# Save out the collected metadata to a subject-level file
save_config(metadata, 'example_files/example_metadata')

In [9]:
# Reload the collected metadata file, and check an example attribute
metadata_new = load_config('example_files/example_metadata.yaml')
metadata_new['study']['lab']

'Electrophysiology, Memory, and Navigation Laboratory'

## Logfile Processing

The task logfile, which is a structured text file, needs to be parsed and organized in order to create the new data files. 

In [None]:
# Define base data folder
data_folder = Path('...')

In [None]:
# Define subject information
subj = ''
session = ''

In [None]:
# Define the full path to the session folder
full_path = data_folder / subj / session

In [None]:
# Define file locations
logfile_path = full_path / 'behavior' / (subj + 'Log.txt')
sync_path = full_path / 'sync' / ...

### Process session all together

The logfile processing extracts required information from the logfile into a `Task` object.

In [None]:
# Process task information
task = process_session(logfile_path, ..., process=True)

In [None]:
# Check the task object
task

### Process logfile in steps

The logfile processing includes two main steps:

- parse the logfile text, extracting the required information
- process the collected information into a DataFrame

These steps can be run separately, which may be useful for exploring the data and for making any needed updates.
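
The two-step structure can be illustrated with a minimal, self-contained sketch. Everything here is hypothetical: the tab-separated log format, the event names, and the `parse_lines` and `process_trials` functions are invented for illustration and do not reflect the actual logfile format or the `conv` implementation.

```python
def parse_lines(lines):
    """Step 1: extract the required fields from raw log lines."""
    trial = {'trial': [], 'start_time': [], 'response_time': []}
    for line in lines:
        time, event, trial_ind = line.strip().split('\t')
        if event == 'TRIAL_START':
            trial['trial'].append(int(trial_ind))
            trial['start_time'].append(int(time))
        elif event == 'RESPONSE':
            trial['response_time'].append(int(time))
    return trial

def process_trials(trial):
    """Step 2: organize the collected fields into one row per trial."""
    rows = []
    for ind, start, resp in zip(trial['trial'], trial['start_time'],
                                trial['response_time']):
        rows.append({'trial': ind, 'start_time': start,
                     'reaction_time': resp - start})
    return rows

# Hypothetical log lines: time (ms), event name, trial index
log_lines = [
    '1000\tTRIAL_START\t0',
    '1500\tRESPONSE\t0',
    '2000\tTRIAL_START\t1',
    '2700\tRESPONSE\t1',
]

parsed = parse_lines(log_lines)   # inspect this before processing
rows = process_trials(parsed)     # rerun only this step if processing changes
print(parsed['trial'])
print(rows)
```

A list of row dictionaries like `rows` can be passed directly to `pd.DataFrame`. Keeping the parsed intermediate around means the processing step can be revised and rerun without re-reading the logfile.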

In [None]:
# Parse the log file
task1 = parse_lines_log(logfile_path)

In [None]:
# Check task object - task information 
print(task1.trial['trial'][0:5])

In [None]:
# Parse the sync pulses 
task1 = parse_lines_sync(sync_path, task1)

This Task object can then be passed into a subsequent function to process the information.

In [None]:
# Preprocess the task information
task1 = process_task(task1)

### Save & Reload task objects

In [None]:
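# Save out a task object
# Note: `examples_folder` is not defined earlier in this notebook; the value below is an assumption
examples_folder = 'example_files'
task_obj_name = 'example_task_obj'
save_task_object(task, task_obj_name, folder=examples_folder)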

In [None]:
# Reload task object
new_task = load_task_object(task_obj_name, folder=examples_folder)

In [None]:
# Check reloaded task object
new_task

### Process Functions

This `Task` object can then be passed into subsequent function(s) to process the information.

These process functions include:
- `process_time_info`
- `process_task_info`
- `process_position_info`
- `process_location_info`
- `process_error_info`

Each of these functions can be updated independently if the handling of some task-related information needs to change.
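
One way to picture this independence is as a list of step functions that each add their own fields to the task, so a single step can be rerun without touching the others. The sketch below is an assumption-laden illustration: it uses a plain dictionary as a stand-in for the `Task` object, and `add_time_info`, `add_error_info`, and `run_steps` are hypothetical names, not the `conv.process` functions listed above.

```python
def add_time_info(task):
    """Hypothetical step: convert timestamps from milliseconds to seconds."""
    task['time_s'] = [t / 1000 for t in task['time_ms']]
    return task

def add_error_info(task):
    """Hypothetical step: compute the absolute error of each response."""
    task['error'] = [abs(r - t) for r, t in zip(task['response'], task['target'])]
    return task

def run_steps(task, steps):
    """Apply each step in order; every step only writes its own fields."""
    for step in steps:
        task = step(task)
    return task

# Plain dict standing in for a task object (illustrative values)
task = {'time_ms': [1000, 2500], 'response': [3, 8], 'target': [5, 5]}
task = run_steps(task, [add_time_info, add_error_info])
errors_before = list(task['error'])

# If only the response data changes, only the error step needs to be rerun
task['response'] = [4, 8]
task = add_error_info(task)
print(errors_before, task['error'])
```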

In [None]:
# Import the process functions
from conv.process import *