hasanssl/BdSLW60_Code
Before You Jump In

These instructions assume you have downloaded the code from GitHub.

Dataset and Annotation

step 0-0: Download the BdSLW60 dataset from Kaggle (https://www.kaggle.com/datasets/hasaniut/BdSLW60). Copy/download ALL the contents into the BdSL60 folder, which is empty when you first get the code.

step 0-1: (Optional) If you want to run the DTW-SVM pipeline 'pipeline_SVM_DTW.ipynb': download the DTW_BdSLW60 dataset from Kaggle (https://www.kaggle.com/datasets/hasaniut/DTW_BdSLW60) and overwrite the existing DTW folder with it.

step 0-2: (Optional) If you have changed the dataset annotation and want to regenerate the JSON files. Think carefully before doing this: only proceed if you have a video editing tool that lets you inspect the videos frame by frame. (Raw annotation text files are available in the word-wise subfolders of the BdSLW60 dataset.)

      Run the Python script 'RUN_THIS_JSON.py' available in the 'OPTIONAL_JSON_CREATION' folder.
      This will overwrite the default annotation JSON file provided with the dataset.

Preprocessing

step 1-0: (Optional, for the DTW experiment) If you performed step 0-1, edit the last cell of 'PREPROCESSING/RUN_THIS_CREATE_NUMPY.ipynb' and set the flag DTW_weights_available to True.

step 1-1: Run 'PREPROCESSING/RUN_THIS_CREATE_NUMPY.ipynb'. This extracts all the MediaPipe landmarks, generates the NumPy files, and produces the normalized data. The first run may take several hours (10 hours or more) because it extracts landmarks from the videos; subsequent runs are much faster because the landmark extraction is not repeated.

step 1-2: Two scripts translate the coordinates into local frames anchored at the wrist, elbow, and so on, so that local movements are captured, and then quantize them according to our heuristics (found in the second script). Run PREPROCESSING_RQ/relativeCalibrated.ipynb and then PREPROCESSING_RQ/relativeQuantize_D5MAX.ipynb.
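The idea behind the two step 1-2 scripts can be sketched roughly as follows. The function names and the quantization step size here are illustrative assumptions; the actual heuristics live in the notebooks:

```python
import numpy as np

def to_wrist_relative(hand, wrist_idx=0):
    """Translate each frame's hand landmarks so the wrist is the local
    origin; the remaining coordinates then describe local hand shape.

    hand: array of shape (frames, landmarks, 3) holding x, y, z.
    """
    return hand - hand[:, wrist_idx:wrist_idx + 1, :]

def quantize(coords, step=0.05):
    """Snap coordinates onto a uniform grid (step size is illustrative)."""
    return np.round(coords / step) * step
```

Composing the two (quantize(to_wrist_relative(hand))) gives coordinates that are invariant to where the hand sits in the image, which is what lets local changes be captured.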

step 1-3: (Optional and discouraged) Run the script 'DTW Distance Calculation.ipynb', located OUTSIDE the PREPROCESSING folder. Only do this if you changed the data annotations and performed step 0-2.

      CAUTION: This is extremely computation-intensive; calculating the distances took more than 5 days across 60 computers.
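For orientation on where that cost comes from: a DTW alignment is computed for every pair of samples, and each alignment is itself quadratic in sequence length. A minimal textbook DTW (not the notebook's exact implementation) looks like this:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two sequences of
    feature vectors (each of shape frames x features).
    Cost is O(len(a) * len(b)) per pair of sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible alignments.
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]
```

With thousands of sign videos, the all-pairs distance matrix multiplies this quadratic per-pair cost by the square of the dataset size, which is why the precomputed distances from step 0-1 are provided.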

step 1-4: (Optional) Counts the missing hand landmarks in the dataset. The script also reports the number of right-handed and left-handed samples from each signer. RUN:

         PREPROCESSING/missing_handpoint_count.ipynb
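A rough sketch of what such a count does, assuming missing hands are stored as all-zero landmark blocks (an assumption on our part; verify against the notebook's actual encoding):

```python
import numpy as np

def count_missing_hands(frames, hand_slice):
    """Count frames whose hand-landmark block is entirely zero, a common
    convention for 'hand not detected' in MediaPipe-based pipelines.

    frames: array of shape (n_frames, n_features).
    hand_slice: slice selecting that hand's feature columns.
    """
    hands = frames[:, hand_slice]
    return int(np.sum(~hands.any(axis=1)))
```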

Machine Learning:

step 2-0: Run the scripts for the LSTM WITH ATTENTION pipeline in any order. The experiments use all sign instances but differ in dataset, feature set, and early-stopping configuration.

         All 1629 features (pose, face, and hand landmarks)
             Left-hand data flipped (hand landmarks horizontally mirrored about the midpoint of the two shoulders):
                 pipeline_LSTM_1629_FLIPPED_ST5_B32.ipynb
                 pipeline_LSTM_1629_FLIPPED_no_stop_B32.ipynb
                 pipeline_LSTM_1629_FLIPPED_ST5_B64.ipynb
             No flip:
                 pipeline_LSTM_1629_ST5_B32.ipynb

         225 features (pose and hand landmarks, NO FACE)
             Left-hand data flipped (hand landmarks horizontally mirrored about the midpoint of the two shoulders):
                 pipeline_LSTM_225_FLIPPED_ST5_B32.ipynb
                 pipeline_LSTM_225_FLIPPED_ST5_B64.ipynb
                 pipeline_LSTM_225_FLIPPED_no_stop_B64.ipynb
                 pipeline_LSTM_225_FLIPPED_no_stop_B72.ipynb
                 pipeline_LSTM_225_FLIPPED_ST5_B64_D5MAX.ipynb
                 pipeline_LSTM_225_FLIPPED_no_stop_B64_D5MAX.ipynb
             No flip:
                 pipeline_LSTM_225_ST5_B32.ipynb
                 pipeline_LSTM_225_ST5_B64.ipynb
                 pipeline_LSTM_225_no_stop_B64.ipynb
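The FLIPPED variants mirror left-handed signs onto the right hand. Assuming x runs horizontally, mirroring about the shoulder midpoint can be sketched as follows (a simplification of what the pipelines do, with illustrative names):

```python
import numpy as np

def flip_about_shoulder_mid(x_coords, left_shoulder_x, right_shoulder_x):
    """Mirror x coordinates about the midpoint of the two shoulders,
    so a left-handed sign lands where a right-handed one would."""
    mid = (left_shoulder_x + right_shoulder_x) / 2.0
    return 2.0 * mid - x_coords
```

Applying the flip twice returns the original coordinates, so the transform loses no information; it only normalizes handedness across signers.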

step 2-1: SVM pipeline with gestures extended to 164 frames by uniformly duplicating frames. The features of all 164 frames are fed as input to the SVM.

  RUN:

        pipeline_SVM.ipynb
        pipeline_SVM_NO_FLIP.ipynb
        pipeline_SVM_POSE_HAND.ipynb
        pipeline_SVM_POSE_HAND_NO_FLIP.ipynb
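Uniform duplication to a fixed 164 frames can be sketched as nearest-neighbour resampling of the frame index (the exact duplication scheme in the notebooks may differ):

```python
import numpy as np

def stretch_to_length(frames, target_len=164):
    """Extend a gesture to target_len frames by uniformly repeating
    existing frames, giving every sample the same fixed length.

    frames: array of shape (n_frames, n_features), n_frames <= target_len.
    """
    # Evenly spaced positions over the original frames, rounded to
    # the nearest existing frame index.
    idx = np.round(np.linspace(0, len(frames) - 1, target_len)).astype(int)
    return frames[idx]
```

A fixed length is needed because the SVM consumes one flat feature vector per gesture, so every gesture must contribute the same number of frame features.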

step 2-2: SVM with DTW distances. There is no need to extend the gestures, but calculating the distances is highly computation-intensive. Run this only if you have performed step 0-1 and checked step 1-2.

  RUN:

        pipeline_SVM_DTW.ipynb
        pipeline_SVM_DTW_POSE_HAND.ipynb

If you are still reading this line: thank you!
