## E4 Sensor Concatenation

This notebook concatenates the .csv files of all subjects, one sensor type at a time. It adds a "Subject_ID" column and arranges the data in ascending order of subject ID. The output is a single .csv file per sensor type.

***

##### **Input:** Properly formatted .csv files from the E4FileFormatter (DBDP preprocessing folder)

##### **Output:** Each output .csv file contains only one type of sensor data, with an added subject ID column. Rows are ordered by subject ID, and headers are taken from the column names passed to the function.
***

In [1]:
import pandas as pd
import glob
import os

os.chdir('../00_source')

## Import & Concatenate Sensor Data of Choice
**Functions:**
* $\underline{data\_concat()}$ - reads all files in data directory (00_source) and concatenates those of one sensor type. Adds subject ID column to resulting .csv file
  > <span style="color:blue">data</span> = data type to be concatenated as a string <br>
  > <span style="color:blue">cols</span> = column names in resulting dataframe as a list <br>
  > <span style="color:blue">file_name</span> = output .csv file name as a string <br>
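
The file selection and subject ID steps described above can be sketched without touching the file system. This is only a minimal sketch: the example file names, the `<subject ID>_<sensor>.csv` layout, and the helper names `match_sensor_files` and `subject_id` are assumptions for illustration, and `fnmatch` stands in for `glob` so that no real files are needed.

```python
import fnmatch
import os

# Hypothetical file names; the "<subject ID>_<sensor>.csv" layout is an assumption
example_files = ["19-002_TEMP.csv", "19-001_TEMP.csv", "19-001_HR.csv"]


def match_sensor_files(filenames, data):
    """Return the sorted names matching '*<data>.csv' (hypothetical helper)."""
    return sorted(fnmatch.filter(filenames, f"*{data}.csv"))


def subject_id(path):
    """Use the first six characters of the file name as the subject ID."""
    return os.path.basename(path)[:6]


temp_files = match_sensor_files(example_files, "TEMP")
ids = [subject_id(f) for f in temp_files]

print(temp_files)
print(ids)
```

Sorting the file names is what puts subjects in ascending ID order, so this relies on every file name starting with a six-character ID.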


In [6]:
# Select files of a specific sensor type and concatenate them into one dataframe

def data_concat(data, cols, file_name):
    """
    data = data type to be concatenated as a string
    cols = column names in resulting dataframe as a list
    file_name = output csv file name as a string
    """
    # Sorting the file names puts subjects in ascending ID order
    all_filenames = sorted(glob.glob(f'*{data}.csv'))
    # Read each file without a header row and tag its rows with the file name
    df = pd.concat([pd.read_csv(f, header=None).assign(Subject_ID=os.path.basename(f))
                    for f in all_filenames])
    # Keep the first six characters of the file name as the subject ID (e.g. 19-001)
    df['Subject_ID'] = df['Subject_ID'].str[:6]
    df.columns = cols
    df.to_csv(f"../20_Intermediate_files/{file_name}.csv", index=False)
    return df


cols = ['Time', 'TEMP', 'Subject_ID']

data_concat("TEMP", cols, "20_Temp_Combined")

,Time,TEMP,Subject_ID
0,2019-07-17 11:50:05.000,26.23,19-001
1,2019-07-17 11:50:05.250,26.23,19-001
2,2019-07-17 11:50:05.500,26.23,19-001
3,2019-07-17 11:50:05.750,26.23,19-001
4,2019-07-17 11:50:06.000,26.23,19-001
...,...,...,...
3619,2019-08-16 16:12:48.750,29.91,19-056
3620,2019-08-16 16:12:49.000,30.11,19-056
3621,2019-08-16 16:12:49.250,30.11,19-056
3622,2019-08-16 16:12:49.500,30.11,19-056
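
A quick sanity check on the combined data is to count rows per subject and confirm that the IDs are in ascending order. The sketch below uses a small stand-in dataframe built from a few of the rows shown above rather than the real output file; the variable names `combined`, `rows_per_subject`, and `ids_ascending` are illustrative only.

```python
import pandas as pd

# Small stand-in for the combined output, using a few rows from the table above
combined = pd.DataFrame({
    "Time": ["2019-07-17 11:50:05.000",
             "2019-07-17 11:50:05.250",
             "2019-08-16 16:12:48.750"],
    "TEMP": [26.23, 26.23, 29.91],
    "Subject_ID": ["19-001", "19-001", "19-056"],
})

# Number of rows contributed by each subject
rows_per_subject = combined.groupby("Subject_ID").size()

# True when subject IDs never decrease from one row to the next
ids_ascending = combined["Subject_ID"].is_monotonic_increasing

print(rows_per_subject)
print(ids_ascending)
```

Because `pd.concat` keeps each file's own row labels by default, the index in the output above restarts for every subject. If unique row labels are needed, `reset_index(drop=True)` on the returned dataframe renumbers them from zero.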
