## E4 Sensor Concatenation

This notebook combines the per-subject .csv files for one sensor type into a single file. It adds a "Subject_ID" column and arranges the data in ascending order of subject ID. The output of the function is a .csv file.

**INPUT: Properly formatted .csv files from the E4FileFormatter** (E4_data)




**OUTPUT:** __Each .csv file contains only one type of sensor data, with an added column for subject ID. Rows are ordered by subject ID. Headers are taken from the column names passed to the function.__

In [1]:
import pandas as pd
import glob
import os

# Please change the working directory to the folder containing the properly formatted .csv files from the E4 File Formatter. 
os.chdir('./E4_data/')
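
Before concatenating, it can help to preview which files a sensor tag will match. The following is a minimal sketch, not part of the original notebook: `list_sensor_files` is a hypothetical helper, and the file names (`19-001_ACC.csv` and so on) are assumed examples of the formatter's naming. It uses the same `*{data}.csv` pattern as the concatenation function but takes the folder as an argument instead of changing the working directory.

```python
import tempfile
from pathlib import Path


def list_sensor_files(data_dir, data):
    """Return the sorted .csv file names in data_dir that end with the sensor tag."""
    return sorted(p.name for p in Path(data_dir).glob(f"*{data}.csv"))


# Demo on a temporary folder with hypothetical file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ["19-002_ACC.csv", "19-001_ACC.csv", "19-001_TEMP.csv"]:
        (Path(tmp) / name).touch()
    acc_files = list_sensor_files(tmp, "ACC")

print(acc_files)  # ['19-001_ACC.csv', '19-002_ACC.csv']
```

Sorting by file name is what puts subjects in ascending order, so this only holds if every file name starts with the subject ID.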

## Import & Concatenate Sensor Data of Choice
**Functions:**
* $\underline{data\_concat()}$ - reads all files in the data directory (00_source), concatenates those of one sensor type, and adds a subject ID column to the resulting .csv file
  > <span style="color:blue">data</span> = data type to be concatenated as a string <br>
  > <span style="color:blue">cols</span> = column names in resulting dataframe as a list <br>
  > <span style="color:blue">file_name</span> = output .csv file name as a string <br>
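
The subject ID column is derived from each file's name. As a rough illustration, assuming file names of the form `19-001_ACC.csv` (an assumption based on the IDs in the output, not confirmed by the notebook), the sketch below shows the fixed-width slice the function uses, alongside a split on the underscore that would also tolerate IDs of other lengths.

```python
import pandas as pd

# Hypothetical file names; the real names depend on the E4 formatter output
file_names = pd.Series(["19-001_ACC.csv", "19-056_ACC.csv"])

# Fixed-width slice: the first six characters are taken as the subject ID
subject_ids = file_names.str[:6]

# Alternative: everything before the first underscore
ids_by_split = file_names.str.split("_").str[0]

print(subject_ids.tolist())   # ['19-001', '19-056']
print(ids_by_split.tolist())  # ['19-001', '19-056']
```

The slice is simpler but silently truncates or pads if an ID is not exactly six characters long.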


In [3]:
# Select files of specific data and concat to one dataframe

def data_concat(data, cols, file_name):
    """
    data = data type to be concatenated as a string
    cols = column names in resulting dataframe as a list
    file_name = output csv file name as a string
    """
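    # Collect the files for this sensor type; sorting by name orders them by subject ID
    all_filenames = sorted(glob.glob(f'*{data}.csv'))
    # Read each file without a header row and tag its rows with the source file name
    df = pd.concat([pd.read_csv(f, header=None).assign(Subject_ID=os.path.basename(f))
                    for f in all_filenames])
    # Keep the first six characters of the file name as the subject ID (e.g. 19-001)
    df['Subject_ID'] = df['Subject_ID'].str[:6]
    df.columns = cols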
    # Please edit the following line to output files in directory of choice
    df.to_csv(f"../../10_code/10_pre_outlier_removal_processing/{file_name}.csv", index = False)
    return df


cols = ['Time','ACC1','ACC2','ACC3', 'Subject_ID']

data_concat("ACC", cols, "10_ACC_Combined")

| | Time | ACC1 | ACC2 | ACC3 | Subject_ID |
|---|---|---|---|---|---|
| 0 | 2019-07-17 11:50:05.000000 | -38.0 | -50.0 | 8.0 | 19-001 |
| 1 | 2019-07-17 11:50:05.031250 | -40.0 | -50.0 | 8.0 | 19-001 |
| 2 | 2019-07-17 11:50:05.062500 | -40.0 | -49.0 | 7.0 | 19-001 |
| 3 | 2019-07-17 11:50:05.093750 | -42.0 | -51.0 | 7.0 | 19-001 |
| 4 | 2019-07-17 11:50:05.125000 | -42.0 | -51.0 | 7.0 | 19-001 |
| ... | ... | ... | ... | ... | ... |
| 29023 | 2019-08-16 16:12:50.968750 | 10.0 | -40.0 | -44.0 | 19-056 |
| 29024 | 2019-08-16 16:12:51.000000 | 11.0 | -44.0 | -47.0 | 19-056 |
| 29025 | 2019-08-16 16:12:51.031250 | 11.0 | -43.0 | -45.0 | 19-056 |
| 29026 | 2019-08-16 16:12:51.062500 | 10.0 | -44.0 | -44.0 | 19-056 |
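
A quick check of the combined frame can confirm that every subject is present and that rows are ordered by subject ID. The sketch below is illustrative only: it builds a tiny stand-in frame with made-up values rather than reading the real output, and the variable names are hypothetical.

```python
import pandas as pd

# Small stand-in for the combined ACC frame (values are made up for illustration)
df = pd.DataFrame({
    "Time": ["2019-07-17 11:50:05.000000",
             "2019-07-17 11:50:05.031250",
             "2019-08-16 16:12:51.000000"],
    "ACC1": [-38.0, -40.0, 11.0],
    "ACC2": [-50.0, -50.0, -44.0],
    "ACC3": [8.0, 8.0, -47.0],
    "Subject_ID": ["19-001", "19-001", "19-056"],
})

# Number of rows contributed by each subject
rows_per_subject = df.groupby("Subject_ID").size()

# True when subject IDs never decrease from one row to the next
ids_sorted = df["Subject_ID"].is_monotonic_increasing

print(rows_per_subject.to_dict())  # {'19-001': 2, '19-056': 1}
print(ids_sorted)                  # True
```

The same two checks can be run on the frame returned by `data_concat()`.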
