# Splitting a worksheet on a per-subject basis

The following code cells split a single _.xlsx_ file containing one worksheet into multiple worksheets, one per subject. Run them before preparing the file for the study team to populate their respective columns. The following information is needed before execution:
 - Input _.xlsx_ file containing all collected ECG information
 - Desired output file/path destination
 - A column containing the unique subject identifier for each subject
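
The notebook asks for the identifier column as a zero-based number (column A = 0). If only the spreadsheet column letter is known, a small helper along these lines could do the conversion. This is a minimal sketch; `column_letter_to_index` is a hypothetical name and is not part of the `src` package.

```python
def column_letter_to_index(letter: str) -> int:
    """Convert a spreadsheet column letter ('A', 'G', 'AA') to a zero-based index."""
    letter = letter.strip().upper()
    # Only plain ASCII letters are valid column names
    if not (letter.isascii() and letter.isalpha()):
        raise ValueError(f"Not a valid column letter: {letter!r}")
    index = 0
    for char in letter:
        # Treat the letters as base-26 digits where A = 1 ... Z = 26
        index = index * 26 + (ord(char) - ord("A") + 1)
    # Shift from one-based to zero-based numbering
    return index - 1
```

For example, `column_letter_to_index("G")` gives the value to enter when the subject identifier is in column G.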

In [None]:
from src import FileReader, FileWriterXL
import os

The function `main` contains all the business logic. The workflow is as follows:

 1. Retrieve the location/path of the input _.xlsx_ file
 2. Ensure the specified path exists. If not, raise an exception/error.
 3. Get the _.xlsx_ workbook metadata
 4. Fetch the column number in which the subject's unique identifier can be found
 5. Collate all information related to a particular subject
 6. Read the location of the output folder and ensure it exists
 7. Create the output file along with all the worksheets containing subject-specific data.
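
Step 5 is the core of the workflow. It can be sketched on its own, without a workbook, using plain tuples in place of worksheet rows. This is an illustrative sketch only: `group_rows_by_subject` is a hypothetical helper name, and the sample data is made up for the example.

```python
from collections import defaultdict


def group_rows_by_subject(headers, rows, id_col):
    """Group rows into {subject_id: [record, ...]}, one record dict per row."""
    participants = defaultdict(list)
    for row in rows:
        raw_id = row[id_col]
        if raw_id is None:
            break  # the first blank identifier marks the end of the data
        if raw_id == "VOID":
            continue  # skip rows that were voided
        # Standardise to upper case with no spaces so variants group together
        subject_id = str(raw_id).upper().replace(" ", "")
        # Pair each header with the cell value in the same column
        participants[subject_id].append(dict(zip(headers, row)))
    return dict(participants)


# Made-up sample rows: (subject identifier, heart rate)
headers = ["Subject", "HR"]
rows = [("ab 01", 60), ("AB01", 62), ("VOID", 0), ("cd02", 70), (None, None), ("ef03", 80)]
grouped = group_rows_by_subject(headers, rows, id_col=0)
```

Here `"ab 01"` and `"AB01"` end up under the same key, the `VOID` row is skipped, and nothing after the blank identifier is read.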

In [None]:
def main():
    filename = input("Input filename / location:\n") # GET FILENAME / LOCATION

    # CHECK THE INPUT FILE EXISTS
    if not os.path.exists(filename):
        raise FileNotFoundError(f"Specified input file path does not exist ...\n{filename}")

    # GET XLSX WORKBOOK METADATA
    wb_reader = FileReader(filename)
    sheetname = wb_reader.getSheetnames()[0]
    ws = wb_reader.getWorksheet(sheetname)
    headers = wb_reader.getSheetHeaders(sheetname)

    # GET THE COLUMN
    id_col = int(input("Column number of unique subject identifier: (Note: Column A = 0)\n"))
    # id_col = 6

    # Look for participants
    participants = {}
    rows = ws.iter_rows(min_row=2, values_only=True)
    for row in rows:
        if row[id_col] is None:
            break
        elif row[id_col] == 'VOID':
            continue
        else:
            # parser.parseRow(row, subject)
            # STANDARDISE TO UPPER CASE & NO SPACE; str() GUARDS AGAINST NUMERIC IDENTIFIERS
            subject_id = str(row[id_col]).upper().replace(' ', '')
            if subject_id not in participants:
                participants[subject_id] = []
            # MAP EACH HEADER TO THE CELL VALUE IN THE SAME COLUMN
            values = {}
            for index, header in enumerate(headers):
                values[f'{header}'] = row[index]
            participants[subject_id].append(values)

    # RETRIEVE LOCATION OF OUTPUT FOLDER
    outputfolder = input('Output folder location:\n')
    
    # CHECK OUTPUT FOLDER EXISTS
    if not os.path.exists(outputfolder):
        raise Exception('Specified output folder path does not exist.')
    
    outputfilename = os.path.join(outputfolder, 'output')

    # WRITE TO FILE NEW SHEETS
    wb_writer = FileWriterXL(outputfilename, headers)
    for key, value in participants.items():
        print(key)
        wb_writer.bulkWriteSheet(key, value)

    wb_writer.close() # saves the document

try:
    main()
except Exception as err:
    print("An error has occurred!\n\n")
    print(err)
    print("Please restart the kernel and run again.")