gsl-fingerspelling

Dataset

A dataset was created from scratch for 6 Greek Sign Language letters (Β, Γ, Η, Θ, Ζ, Φ). Two signers captured frames over 2 days, with 3 sessions per day (morning / evening / night). The final dataset consists of 5400 RGB photos (224x224), 5400 grayscale photos (28x28), and the corresponding CSV files (a Python script was used to convert the grayscale photos to pixel values).
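
To illustrate the grayscale-to-pixels conversion mentioned above, here is a minimal sketch. It is not the repository's conversion script: the function names and the row layout (label first, then the 784 pixel values in row-major order) are assumptions, and the image is a plain nested list so the example needs no third-party packages.

```python
import csv
import io


def image_to_csv_row(label, pixels):
    """Flatten a 2D grayscale image (list of rows) into [label, p0, p1, ...]."""
    row = [label]
    for image_row in pixels:
        row.extend(int(value) for value in image_row)
    return row


def write_rows(rows, handle):
    """Write flattened rows to any text file handle as CSV."""
    writer = csv.writer(handle)
    writer.writerows(rows)


# Hypothetical 28x28 image filled with a repeating gradient, labelled as class 0.
image = [[(r * 28 + c) % 256 for c in range(28)] for r in range(28)]
row = image_to_csv_row(0, image)

# An in-memory buffer stands in for a real CSV file on disk.
buffer = io.StringIO()
write_rows([row], buffer)
csv_text = buffer.getvalue()
```

Each 28x28 photo becomes one row of 785 values under this assumed layout, so a whole session can be stored as a single CSV file.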

How to capture frames

Run the script capture_frames.py
Press Space each time you want to capture a frame
Press Esc to save the frames and exit
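
The key handling described above can be sketched without a camera. This is an assumption about how such a loop might be organised, not the contents of capture_frames.py; `handle_key` and the string stand-ins for frames are hypothetical. The key codes are the ASCII values that OpenCV's `cv2.waitKey` reports for Space and Esc.

```python
SPACE_KEY = 32  # ASCII code for the space bar
ESC_KEY = 27    # ASCII code for the Escape key


def handle_key(key, current_frame, captured):
    """Update the list of captured frames; return False when the loop should end."""
    if key == SPACE_KEY:
        captured.append(current_frame)
        return True
    if key == ESC_KEY:
        return False
    return True  # every other key is ignored


# Simulated session: two captures, one ignored key, then Esc.
captured = []
keys = [SPACE_KEY, ord("a"), SPACE_KEY, ESC_KEY, SPACE_KEY]
for index, key in enumerate(keys):
    frame = f"frame-{index}"  # stand-in for a real camera frame
    if not handle_key(key, frame, captured):
        break  # a real script would write the captured frames to disk here
```

Keeping the key logic in a small function makes it testable separately from the camera and display code.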

Below we describe the conditions of each session per signer:

Data/pixels (user_1)

morning S1 csv files refer to photos taken at 11am, on white rib background
morning S2 csv files refer to photos taken at 9am, on dark black background
evening S1 csv files refer to photos taken at 2pm, on light blue background
evening S2 csv files refer to photos taken at 5pm, in natural light (outdoors)
night S1 csv files refer to photos taken at 1am, on light blue background under artificial lighting
night S2 csv files refer to photos taken at 9pm, on yellow background under artificial lighting

Data/pixels (user_2)

morning S1 csv files refer to photos taken at 12am, on mixed white/light purple background
morning S2 csv files refer to photos taken at 11am, on light/yellow rib background
evening S1 csv files refer to photos taken at 6pm, on light gray background
evening S2 csv files refer to photos taken at 7pm, on mixed white/light purple background
night S1 csv files refer to photos taken at 9am, on light purple background
night S2 csv files refer to photos taken at 10pm, on light gray background

For training, the dataset was split into 3 parts: train (70%) / test (10%) / validation (20%)

Each part contains unique sessions, so that no session appears in more than one part. This keeps the model from being evaluated on conditions it has already overfitted to during training.
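
A session-level split of this kind can be sketched as follows. Which sessions went into which part is not stated here, so the session identifiers, the `split_by_session` helper, and the 9/2/1 session counts are all illustrative assumptions. With 12 whole sessions, the counts can only approximate the 70/20/10 ratio.

```python
import random


def split_by_session(session_ids, n_val, n_test, seed=0):
    """Assign whole sessions to train/validation/test so that no session is shared."""
    sessions = sorted(set(session_ids))
    random.Random(seed).shuffle(sessions)
    test = set(sessions[:n_test])
    val = set(sessions[n_test:n_test + n_val])
    train = set(sessions[n_test + n_val:])
    return train, val, test


# Hypothetical session identifiers: 2 signers x 3 times of day x 2 sessions.
all_sessions = [
    f"user_{user}/{time}_S{session}"
    for user in (1, 2)
    for time in ("morning", "evening", "night")
    for session in (1, 2)
]
train, val, test = split_by_session(all_sessions, n_val=2, n_test=1)
```

Splitting by session rather than by individual photo means that frames sharing a background and lighting setup never end up on both sides of the train/evaluation boundary.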

Architectures used:

CNN Model

A CNN with 3 layers (channels 1->16->32) has been implemented. Each convolutional layer consists of:

Conv2d(d_in, d_out, kernel=3, stride=1, padding=1)
BatchNorm2d(d_out)
ReLU(inplace=True)
Dropout(d)
MaxPool2d(kernel=2, stride=2)
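
As a worked check of the layer settings above, the sketch below traces the spatial size of a 28x28 input through the blocks using the standard output-size formula floor((n + 2p - k) / s) + 1. It assumes one block per channel transition (1->16 and 16->32), with any remaining layer being a classifier on the flattened features; that reading of "3 layers" is an assumption. BatchNorm2d, ReLU and Dropout do not change the shape, so they are omitted.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def block_output_size(size):
    """One block as listed above: Conv2d(k=3, s=1, p=1) then MaxPool2d(k=2, s=2)."""
    size = conv_output_size(size, kernel=3, stride=1, padding=1)  # conv keeps the size
    size = conv_output_size(size, kernel=2, stride=2, padding=0)  # pooling halves it
    return size


size = 28               # grayscale inputs are 28x28
channels = [1, 16, 32]  # channel progression quoted above
for _ in channels[1:]:  # assumption: one block per channel transition
    size = block_output_size(size)

# Number of values a following fully connected layer would receive.
flattened_features = channels[-1] * size * size
```

Under these assumptions the feature map shrinks from 28x28 to 14x14 to 7x7, giving 32 * 7 * 7 = 1568 flattened features.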

The final CNN model was trained with the following hyperparameters:
Lr = 0.001
batch_size = 1024
n_epochs = 100
Patience = 15
Dropout = 0.0
No L2-Regularization
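
The Patience value above suggests early stopping. A minimal sketch of that mechanism follows, assuming the monitored quantity is the validation loss; the `EarlyStopping` class and the loss values are hypothetical and are not taken from MyCNN_batch.py.

```python
class EarlyStopping:
    """Stop training when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up loss curve: improves for 3 epochs, then plateaus.
stopper = EarlyStopping(patience=15)
losses = [1.0, 0.8, 0.6] + [0.7] * 30
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

With a patience of 15, training in this example ends 15 epochs after the last improvement instead of running for all n_epochs.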

VGG Model

We followed the transfer learning approach. Initially we froze all layers and replaced the final decision layer (fc8) with one that outputs 6 classes. However, we found that the best results came from unfreezing the 3 classification layers.

The final VGG model was trained with the following hyperparameters:

Lr = 0.001
batch_size = 64
n_epochs = 100
Patience = 20
Dropout = 0.2

How to execute CNN:

install requirements_cnn.txt
run: python3 MyCNN_batch.py
** script loads csv files from Final_Dataset_CSV folder
output: plots, configuration file and final model will be saved in folder Saved_models

How to execute VGG:

install requirements_vgg.txt
run: python3 Vgg-Transfer.py
** script loads rgb frames from drive
output: plots, configuration file and final model will be saved in drive

drive link: https://drive.google.com/drive/folders/1xK0gFjICOMl83dOvY1jLtAkxo69UiWmF?usp=drive_link

