<a href="https://colab.research.google.com/github/rpandya5/gaitanalysis/blob/main/Gait_Analysis_Preprocessing_Setup.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#### SisFall Dataset Preprocessing

The SisFall dataset contains accelerometer and gyroscope data for 19 activities of daily living (ADLs) and 15 types of fall, performed by 23 young adults and 14 healthy, independent older adults (over 62 years of age). The dataset is distributed as txt files in which each sample holds the x, y and z components of two accelerometers and one gyroscope. The data was sampled at 200 Hz and is stored as raw integer counts whose resolution in bits depends on the sensor.

The Python script below converts these txt files into csv files and converts the sensor readings from raw counts to their standard units. It stores the results in a separate folder named "Preprocessed Data", which contains separate sub-folders for fall and non-fall activities. The naming convention of the original SisFall dataset is preserved, because each file name encodes the participant, trial number and activity.
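As a rough illustration of the unit conversion, the sketch below applies the same scaling that `convert_readings` uses further down: `2 * range * raw / 2**resolution`, with the accelerometer values additionally multiplied by 9.81 to go from g to m/s^2. The names `SCALES` and `to_physical` and the raw counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Scale factors copied from convert_readings in the script below.
G = 9.81  # m/s^2 per g, the value the script uses

SCALES = {
    "acc_1": 2 * 16 * G / 2**13,  # first accelerometer: +/-16 g, 13 bits -> m/s^2
    "gyro": 2 * 2000 / 2**16,     # gyroscope: +/-2000 deg/s, 16 bits -> deg/s
    "acc_2": 2 * 8 * G / 2**14,   # second accelerometer: +/-8 g, 14 bits -> m/s^2
}

def to_physical(raw, sensor):
    """Convert one raw integer count to physical units for the given sensor."""
    return raw * SCALES[sensor]

# Hypothetical raw counts, not taken from the dataset
print(to_physical(256, "acc_1"))    # 256 counts correspond to 1 g
print(to_physical(-1024, "acc_2"))  # -1024 counts correspond to -1 g
print(to_physical(16384, "gyro"))   # a quarter of the 16-bit span
```

With these factors, one count of the first accelerometer is about 0.038 m/s^2 and one count of the gyroscope is about 0.061 deg/s.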

For more information on the SisFall dataset, see the paper linked below:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5298771/
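The naming convention mentioned above can be decoded with a few lines of Python. This is a minimal sketch, assuming names of the form `D01_SA01_R01.txt` (activity code, participant code, trial number) as seen in the dataset listing, and assuming that activity codes starting with `F` are falls, which is how the script below labels them. `parse_name` is a hypothetical helper, not part of the dataset tooling.

```python
from pathlib import Path

def parse_name(file_name):
    """Split a name such as 'F05_SA01_R02.txt' into its three fields."""
    activity, subject, trial = Path(file_name).stem.split("_")
    return {
        "activity": activity,                 # e.g. 'D01' or 'F05'
        "subject": subject,                   # e.g. 'SA01'
        "trial": int(trial[1:]),              # 'R02' -> 2
        "is_fall": activity.startswith("F"),  # assumption: 'F' codes are falls
    }

print(parse_name("D01_SA01_R01.txt"))
# {'activity': 'D01', 'subject': 'SA01', 'trial': 1, 'is_fall': False}
```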

In [None]:
# To load the dataset, I would recommend uploading the zip file to Google Drive.
# Mounting the drive makes the file easy to access from this notebook.

from google.colab import drive

drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [None]:
#Changes the directory to access the files saved in drive
import os
os.chdir('/content/gdrive/MyDrive')

In [None]:
#This code will unzip the dataset and save it in a folder SisFall in your drive
#Just run this once

!unzip SisFall_dataset.zip -d SisFall

Archive:  SisFall_dataset.zip
   creating: SisFall/SisFall_dataset/
  inflating: SisFall/SisFall_dataset/Links videos SisFall youtube.txt  
  inflating: SisFall/SisFall_dataset/Readme.txt  
   creating: SisFall/SisFall_dataset/SA01/
  inflating: SisFall/SisFall_dataset/SA01/D01_SA01_R01.txt  
  inflating: SisFall/SisFall_dataset/SA01/D02_SA01_R01.txt  
  inflating: SisFall/SisFall_dataset/SA01/D03_SA01_R01.txt  
  inflating: SisFall/SisFall_dataset/SA01/D04_SA01_R01.txt  
  inflating: SisFall/SisFall_dataset/SA01/D05_SA01_R01.txt  
  inflating: SisFall/SisFall_dataset/SA01/D05_SA01_R02.txt  
  inflating: SisFall/SisFall_dataset/SA01/D05_SA01_R03.txt  
  inflating: SisFall/SisFall_dataset/SA01/D05_SA01_R04.txt  
  inflating: SisFall/SisFall_dataset/SA01/D05_SA01_R05.txt  
  inflating: SisFall/SisFall_dataset/SA01/D06_SA01_R01.txt  
  inflating: SisFall/SisFall_dataset/SA01/D06_SA01_R02.txt  
  inflating: SisFall/SisFall_dataset/SA01/D06_SA01_R03.txt  
  inflating: SisFall/SisFall_datase

In [None]:
#Importing the libraries
import os
import pandas as pd
import numpy as np

#Class Module for Preprocessing the files
class DataPreparation:
    def __init__(self, input_path):
        self.input_path = input_path
        self.output_path = '/content/gdrive/MyDrive/Preprocessed Data'

    #Creates folders to store the csv files in
    def create_folders(self):
        os.mkdir(self.output_path)
        os.chdir(self.output_path)
        os.mkdir('FALL')
        os.mkdir('NO FALL')
        os.chdir(self.input_path)

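    #Converts the raw sensor counts into physical units:
    #m/s^2 for both accelerometers and deg/s for the gyroscope
    def convert_readings(self, readings):
        converted_readings = []
        for i in range(9):
            if i < 3:
                #First accelerometer: +/-16 g range, 13-bit resolution
                changed = float((2 * 16 * 9.81 * readings[i]) / (pow(2, 13)))
            elif i < 6:
                #Gyroscope: +/-2000 deg/s range, 16-bit resolution
                changed = float((2 * 2000 * readings[i]) / (pow(2, 16)))
            else:
                #Second accelerometer: +/-8 g range, 14-bit resolution
                changed = float((2 * 8 * 9.81 * readings[i]) / (pow(2, 14)))
            converted_readings.append(changed)
        return converted_readings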

    #Converts the file into a dataframe
    def convert_to_df(self, path):
        #Read the whole file; the context manager closes it afterwards
        with open(path) as file:
            file_str = file.read()
        #Each sample ends with ';' and holds 9 comma-separated integers
        readings_str = file_str.split(';\n')
        readings_str = readings_str[:-1]
        final_df = []
        for row in readings_str:
            readings_final = [int(read.strip()) for read in row.split(',')]
            final_df.append(self.convert_readings(readings_final))
        final_df = pd.DataFrame(final_df, columns=['X_Acc_1', 'Y_Acc_1', 'Z_Acc_1', 'X_Gyro', 'Y_Gyro', 'Z_Gyro', 'X_Acc_2', 'Y_Acc_2', 'Z_Acc_2'])
        #Sampled at 200 Hz, so consecutive samples are 0.005 s apart.
        #Scaling an integer index always yields exactly one timestamp per row,
        #unlike np.arange with a floating-point step.
        timestamp = np.arange(len(final_df)) * 0.005
        final_df.insert(0, 'Timestamp', timestamp)
        return final_df

    #Gets activity information for each file
    def get_activity_info(self, file_name):
        file_name = file_name[:-4]
        details = file_name.split('_')
        return details[0], details[1], details[2]

    #Runs through all the files and converts and saves them as csv files
    def create_csv(self):
        #Text files in the dataset folder that are not sensor recordings
        skipped = ('Readme.txt', 'Links videos SisFall youtube.txt')
        for folder, _, files in os.walk(self.input_path):
            for file in files:
                #Only the .txt recordings are converted; pdf, png, rar and ini files are ignored
                if file.endswith('.txt') and file not in skipped:
                    print(file)
                    activity, person, trial = self.get_activity_info(file)
                    df = self.convert_to_df(os.path.join(folder, file))
                    #Activity codes starting with 'F' are stored under FALL, all others under NO FALL
                    label = 'FALL' if activity[0] == 'F' else 'NO FALL'
                    path = os.path.join(self.output_path, label, file[:-4] + '.csv')
                    df.to_csv(path, index=False)
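
As a possible alternative to converting one value at a time, the same scaling can be applied to a whole file at once with NumPy broadcasting. The sketch below is an illustration rather than part of the class above: `SCALE` and `convert_block` are hypothetical names, and the raw samples are made up. The constants are the ones used in `convert_readings`.

```python
import numpy as np

# One scale factor per column, in the file's column order:
# 3 x first accelerometer, 3 x gyroscope, 3 x second accelerometer.
SCALE = np.array(
    [2 * 16 * 9.81 / 2**13] * 3
    + [2 * 2000 / 2**16] * 3
    + [2 * 8 * 9.81 / 2**14] * 3
)

def convert_block(raw):
    """Scale an (n_samples, 9) array of raw counts to physical units."""
    raw = np.asarray(raw, dtype=float)
    if raw.ndim != 2 or raw.shape[1] != 9:
        raise ValueError("expected an array of shape (n_samples, 9)")
    return raw * SCALE  # broadcasting multiplies every row by SCALE

# Hypothetical raw samples, not taken from the dataset
raw = [[256, 0, -256, 16384, 0, 0, 1024, 0, -1024],
       [0, 256, 0, 0, -16384, 0, 0, 1024, 0]]
converted = convert_block(raw)
print(converted.shape)  # (2, 9)
```

For the long recordings in the dataset this avoids a Python-level loop over every value, at the cost of holding the whole file in one array.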

In [None]:
Processed = DataPreparation('/content/gdrive/MyDrive/SisFall/SisFall_dataset')

In [None]:
Processed.create_folders() #Creates the required folders

FileExistsError: [Errno 17] File exists: '/content/gdrive/MyDrive/Preprocessed Data'
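
The error above is raised because `os.mkdir` fails when the target folder already exists, for example when the cell is run a second time. One way to make the step safe to re-run is `os.makedirs` with `exist_ok=True`. This is a minimal sketch: `create_output_folders` is a hypothetical standalone helper, and it is shown in a temporary directory so nothing on Drive is touched.

```python
import os
import tempfile

def create_output_folders(output_path):
    """Create output_path with FALL and NO FALL sub-folders if they are missing."""
    for label in ("FALL", "NO FALL"):
        # exist_ok=True makes the call a no-op when the folder is already there
        os.makedirs(os.path.join(output_path, label), exist_ok=True)

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "Preprocessed Data")
    create_output_folders(target)
    create_output_folders(target)  # second call does not raise
    print(sorted(os.listdir(target)))  # ['FALL', 'NO FALL']
```

Because the helper builds absolute paths with `os.path.join`, it also does not need to change the working directory.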

In [None]:
Processed.create_csv() #Converts all files to csv and saves them

D15_SA04_R03.txt
D19_SA04_R05.txt
F14_SA04_R02.txt
D18_SA04_R04.txt
F01_SA04_R01.txt
D14_SA04_R05.txt
F13_SA04_R04.txt
D15_SA04_R01.txt
F14_SA04_R03.txt
D07_SA04_R01.txt
D17_SA04_R01.txt
D10_SA04_R04.txt
F12_SA04_R05.txt
D10_SA04_R03.txt
D10_SA04_R05.txt
D11_SA04_R03.txt
D19_SA04_R01.txt
D19_SA04_R03.txt
D17_SA04_R02.txt
D17_SA04_R05.txt
D18_SA04_R03.txt
D19_SA04_R02.txt
D18_SA04_R05.txt
D19_SA04_R04.txt
D18_SA04_R02.txt
F08_SA04_R03.txt
F09_SA04_R03.txt
F06_SA04_R03.txt
F06_SA04_R04.txt
F07_SA04_R04.txt
F12_SA04_R04.txt
F11_SA04_R05.txt
F12_SA04_R03.txt
F10_SA04_R03.txt
F11_SA04_R04.txt
F12_SA04_R02.txt
F15_SA04_R01.txt
D15_SA04_R02.txt
D16_SA04_R03.txt
D03_SA04_R01.txt
F11_SA04_R01.txt
F15_SA04_R05.txt
D07_SA04_R02.txt
F09_SA04_R04.txt
F11_SA04_R02.txt
D06_SA04_R05.txt
F01_SA04_R04.txt
F04_SA04_R01.txt
F03_SA04_R03.txt
F05_SA04_R04.txt
D12_SA04_R01.txt
F08_SA04_R04.txt
F06_SA04_R01.txt
D12_SA04_R02.txt
F12_SA04_R01.txt
F06_SA04_R02.txt
D01_SA04_R01.txt
F02_SA04_R01.txt
D16_SA04_R01.t