Šakalkulator project for "Soft Computing"
"Šakalkulator" is a program developed as our fourth-year college project for the "Soft Computing" class in 2024.
The goal of the project is a simple calculator that performs basic arithmetic operations (+, -, *, /) on arbitrary numbers, taking hand gestures and audio recordings as input.
All of the data used by the program was created by the three students who made the project.
To install and run the program, do two things:
- Change (cd) into the folder that contains requirements.txt and install the dependencies:
pip install -r requirements.txt
- Run the script itself:
python main.py
The dataset consists of two parts: audio and video data. Audio data is stored in .mp3 format and video data in .mp4 format.
Both the audio and the video data are split into two separate folders, training and testing, which hold 80% and 20% of the data respectively.
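As an illustration of the 80/20 split, here is a minimal sketch that shuffles a list of files and cuts it into a training and a testing part. The function name `split_files`, the fixed seed and the file names are assumptions made for the example; the README does not say how the project's folders were actually populated.

```python
import random
from pathlib import Path


def split_files(files, train_fraction=0.8, seed=42):
    """Shuffle a list of files and split it into (train, test) parts."""
    shuffled = sorted(files)               # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(shuffled)  # seeded shuffle, independent of global state
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names; the real dataset holds .mp3 and .mp4 files.
clips = [Path(f"clip_{i:03d}.mp4") for i in range(10)]
train, test = split_files(clips)
print(len(train), len(test))  # 8 2
```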
Two types of models are trained: a CNN for image classification and an SVM for audio classification.
The CNN used for image classification is trained on a dataset of hand gesture images, where each gesture represents a number or an arithmetic operation. The network is convolutional: it consists of 5 convolutional layers followed by a Flatten layer, a fully connected Dense layer and an output Dense layer.
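A network with this layer layout could look roughly like the Keras sketch below. Only the layer sequence (5 convolutional, Flatten, Dense, output Dense) comes from the description above. The input size, filter counts, pooling layers, Dense width, the 14 output classes (digits 0-9 plus the four operators) and the use of Keras itself are assumptions for the sake of a runnable example.

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 14            # assumption: digits 0-9 plus +, -, *, /
INPUT_SHAPE = (64, 64, 1)   # assumption: 64x64 grayscale gesture images


def build_gesture_cnn(input_shape=INPUT_SHAPE, num_classes=NUM_CLASSES):
    """Build a CNN with 5 convolutional layers, Flatten, Dense and an output Dense."""
    model = keras.Sequential()
    model.add(keras.Input(shape=input_shape))
    # Five convolutional layers; the pooling after each one is an assumption.
    for filters in (16, 32, 64, 128, 256):
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))             # fully connected layer
    model.add(layers.Dense(num_classes, activation="softmax"))  # output layer
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_gesture_cnn()
print(model.output_shape)
```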
The SVM used for audio classification is trained with a linear kernel, with MFCC (mel-frequency cepstral coefficients) used for feature extraction.
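The audio pipeline might be sketched as follows, assuming librosa for MFCC extraction and scikit-learn for the SVM (the README names neither library). The helper `mfcc_vector`, the choice of 13 coefficients averaged over time, and the class names are hypothetical. The demo trains on synthetic vectors that stand in for real MFCC features, so it runs without any audio files.

```python
import numpy as np
from sklearn.svm import SVC


def mfcc_vector(path, n_mfcc=13):
    """Mean MFCC vector of one recording (needs librosa; not called in this demo)."""
    import librosa

    signal, sample_rate = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)  # average over time -> one vector per recording


# Synthetic stand-in for MFCC vectors: two well-separated classes.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 1.0, (40, 13)),
    rng.normal(5.0, 1.0, (40, 13)),
])
labels = np.array(["one"] * 40 + ["plus"] * 40)

classifier = SVC(kernel="linear")  # linear kernel, as stated above
classifier.fit(features, labels)
print(classifier.predict(features[:1]), classifier.score(features, labels))
```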
The calculator was tested on videos by going through each video frame by frame and making a prediction for every frame.
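The README does not say how the per-frame predictions are combined into an expression. One possible approach, shown here purely as an assumption, is to collapse runs of identical frame labels into a single symbol and to drop very short runs as noise. The `"none"` label for frames without a gesture and the `min_run` threshold are hypothetical.

```python
from itertools import groupby


def frames_to_symbols(frame_labels, min_run=3, blank="none"):
    """Collapse consecutive identical frame predictions into one symbol each."""
    symbols = []
    for label, run in groupby(frame_labels):
        run_length = len(list(run))
        # Skip frames without a gesture and runs too short to be trusted.
        if label != blank and run_length >= min_run:
            symbols.append(label)
    return symbols


# Hypothetical per-frame predictions for a video showing "12+3";
# the single "7" frame is a misprediction and gets filtered out.
frame_labels = ["none", "1", "1", "1", "1", "none", "2", "2", "2",
                "+", "+", "+", "+", "7", "3", "3", "3", "3"]
expression = "".join(frames_to_symbols(frame_labels))
print(expression)  # 12+3
```

A noisy frame in the middle of a gesture splits its run in two, so a real implementation would likely smooth the predictions first.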
The calculator was tested on audio recordings by splitting each recording into segments at the silent stretches and then making a prediction for each segment. The exact results are stored in a CSV file.
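Splitting at silence can be sketched with a simple energy threshold: the signal is cut into short frames, and consecutive frames whose RMS energy exceeds a threshold form one segment. The function name `find_voiced_segments`, the frame length and the threshold are assumptions; the project may well use a different method or a library routine.

```python
import math


def find_voiced_segments(samples, frame_len=100, threshold=0.05):
    """Return (start, end) sample index pairs of the non-silent stretches."""
    segments = []
    start = None
    for pos in range(0, len(samples), frame_len):
        frame = samples[pos:pos + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms >= threshold and start is None:
            start = pos                     # a voiced stretch begins
        elif rms < threshold and start is not None:
            segments.append((start, pos))   # silence closes the stretch
            start = None
    if start is not None:                   # recording ended while voiced
        segments.append((start, len(samples)))
    return segments


# Synthetic signal: silence, tone, silence, tone.
tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(300)]
silence = [0.0] * 200
signal = silence + tone + silence + tone
print(find_voiced_segments(signal))  # [(200, 500), (700, 1000)]
```

Each returned segment would then be passed through feature extraction and the classifier separately.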
Both models are evaluated using various metrics such as accuracy, precision, recall, and F1 score.
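For reference, the four metrics can be computed from true and predicted labels as in the sketch below. Macro averaging over classes is an assumption, since the README does not state which averaging was used, and the labels in the example are made up rather than taken from the project's results.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 score."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1_scores = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    count = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / count,
        "recall": sum(recalls) / count,
        "f1": sum(f1_scores) / count,
    }


# Made-up labels: one "2" is misclassified as "1".
metrics = macro_metrics(["1", "2", "+", "2"], ["1", "2", "+", "1"])
print(metrics)
```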
Šakalkulator is available under the GNU GPLv3 license.