# Convert (or simply copy) behavioural data to `sub/ses/beh` folder

This script copies behavioural log files from `sourcedata` to `rawdata`. 

When the files are copied to `rawdata`, they are renamed in a BIDS-compliant way (e.g., `sub-xxx_ses-xxx_task-xxx_beh.tsv`) and saved as tab-separated files. Depending on how the source log files are named, you may need to tweak the code below.
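As a rough illustration of this naming pattern, the helper below assembles an output file name from its parts. The function name `bids_beh_filename` and the example labels are hypothetical and not part of this script; the pattern simply mirrors the one used in the copy loop further down.

```python
def bids_beh_filename(participant_id, session, task, suffix='beh', extension='tsv'):
    """Assemble a BIDS-style behavioural file name from its components."""
    return f'{participant_id}_ses-{session}_task-{task}_{suffix}.{extension}'


# Hypothetical labels, for illustration only
example_name = bids_beh_filename('sub-001', '001', 'oddball')
print(example_name)
```

Keeping the pattern in one place makes it easier to adjust if your study uses different labels.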

You must save the raw behavioural files in the `sourcedata/sub-00x/ses-00x/beh` subfolder of your study's BIDS root folder. The script looks for `ses-*` folders inside each participant's folder.

All study-specific configuration details should be defined in the `config.json` file and not in this script. 
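The code below reads three keys from `config.json`: `Study.Name`, `Study.TaskName`, and `EEG.beh_extn`. The following is a minimal sketch of a matching configuration, assuming the file holds nothing else this script needs. All values are hypothetical placeholders, and a real `config.json` will likely contain more fields.

```python
import json

# Only the keys this script reads; all values are hypothetical placeholders
example_config = {
    'Study': {'Name': 'mystudy', 'TaskName': 'oddball'},
    'EEG': {'beh_extn': 'csv'},
}

# This is what the corresponding config.json text would look like
config_text = json.dumps(example_config, indent=4)
print(config_text)

# Round-trip through JSON and pull out the values the script uses
config = json.loads(config_text)
task = config['Study']['TaskName']
beh_extn = config['EEG']['beh_extn']
```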


---
Copyright 2024 [Aaron J Newman](https://github.com/aaronjnewman), [NeuroCognitive Imaging Lab](http://ncil.science), [Dalhousie University](https://dal.ca)

Released under the [3-Clause BSD License](https://opensource.org/licenses/BSD-3-Clause)

---

In [3]:
from os import path as op
import os
import json
import random
import shutil
from glob import glob
from pathlib import Path
import numpy as np 
import pandas as pd
import mne
mne.set_log_level('error')

## Study Parameters

Study-level parameters are imported from `config.json` in `bids_root`.

In [4]:
# this shouldn't change if you run this script from its default location in code/import
bids_root = '../..'

config_file = op.join(bids_root, 'config.json')
with open(config_file) as f:
    config = json.load(f)

study_name = config['Study']['Name']
task = config['Study']['TaskName']
data_type = 'beh'
prefix = study_name
beh_extn = config['EEG']['beh_extn']
logfile_string = 'log' # likely need to update this for your study

## Paths

In [None]:
# source_path is where the input source (raw) files live
source_path = op.join(bids_root, 'sourcedata')

# raw_path is where the results of running this script will be saved
raw_path = op.join(bids_root, 'rawdata')

In [None]:
# convert all participants in sourcedata
# op.basename works regardless of the platform's path separator
in_subjs = sorted(op.basename(s) for s in glob(op.join(source_path, prefix) + '*'))

## Copy behavioural log files to rawdata

In [None]:
for subject in in_subjs:
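    # session labels are the last 3 characters of each ses-* folder name (e.g., '001' from 'ses-001')
    sessions = sorted(op.basename(f)[-3:] for f in glob(op.join(source_path, subject, 'ses-*')))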

    for sess in sessions:
        print(subject, sess)
        # participant_id is for naming output files. We assume the original id number is last 2 digits in the folder name
        participant_id = 'sub-0' + subject[-2:]

        log_dest = op.join(raw_path, participant_id, 'ses-' + sess,  data_type)
        Path(log_dest).mkdir(parents=True, exist_ok=True)

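        # source session folders keep their 'ses-' prefix
        log_files = sorted(glob(op.join(source_path, subject, 'ses-' + sess, data_type, '*' + logfile_string + '*.' + beh_extn)))
        if not log_files:
            print('  no log files found, skipping')
            continue
        # combine all of this session's logs into one tab-separated file
        df_list = [pd.read_csv(f) for f in log_files]
        out_name = participant_id + '_ses-' + sess + '_task-' + task + '_' + data_type + '.tsv'
        pd.concat(df_list).to_csv(op.join(log_dest, out_name), sep='\t', index=False)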
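
The merge step in the loop above can be tried in isolation before running it on real data. The sketch below writes two small, made-up log files to a temporary folder, concatenates them, and saves one tab-separated file. The function name `merge_logs`, the column names, and the file names are hypothetical and used only for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd


def merge_logs(log_files, out_file):
    """Concatenate CSV logs in sorted order, write one TSV, and return the row count."""
    frames = [pd.read_csv(f) for f in sorted(log_files)]
    if not frames:
        # nothing to merge, so no output file is written
        return 0
    merged = pd.concat(frames, ignore_index=True)
    merged.to_csv(out_file, sep='\t', index=False)
    return len(merged)


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # two made-up log files standing in for one session's behavioural logs
    pd.DataFrame({'trial': [1, 2], 'rt': [0.41, 0.38]}).to_csv(tmp / 'run1_log.csv', index=False)
    pd.DataFrame({'trial': [3], 'rt': [0.52]}).to_csv(tmp / 'run2_log.csv', index=False)

    out_file = tmp / 'sub-001_ses-001_task-oddball_beh.tsv'
    n_rows = merge_logs(tmp.glob('*log*.csv'), out_file)
    merged = pd.read_csv(out_file, sep='\t')

print(n_rows)
print(merged)
```

Sorting the file list keeps the row order reproducible across runs, since `glob` does not guarantee any particular order.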

 