The goal of the project was to determine the time shown in a photo of an analog clock using neural networks.
Two approaches to the problem were explored:
- Trigonometric method:
segmentation of the clock hands, detection of the dial numbers, and computation of the time from the angles of the hands.
- ResNext50 classification:
training ResNext50 on a dataset of over 200,000 images (720 image classes, which matches one class per minute of a 12-hour dial).
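
To make the 720-class setup concrete, here is a minimal sketch of one possible label encoding. It assumes that the class index is `hour * 60 + minute` on a 12-hour dial; the project's actual label scheme is not described here, and the function names are hypothetical.

```python
from typing import Tuple

# Assumption: one class per minute of a 12-hour dial, 12 * 60 = 720 classes.
NUM_CLASSES = 12 * 60


def time_to_class(hour: int, minute: int) -> int:
    """Map a clock time to a class index in the range 0..719."""
    if not 0 <= minute < 60:
        raise ValueError("minute must be in 0..59")
    # 12 o'clock and 0 o'clock share the same dial position.
    return (hour % 12) * 60 + minute


def class_to_time(index: int) -> Tuple[int, int]:
    """Map a class index back to (hour, minute), with hour in 0..11."""
    if not 0 <= index < NUM_CLASSES:
        raise ValueError("index must be in 0..719")
    return divmod(index, 60)
```

With this encoding, a model prediction is turned back into a readable time by calling `class_to_time` on the predicted class index.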
Steps of the trigonometric method:
- annotating the dataset:
  - for hand segmentation (https://www.makesense.ai/)
  - for detection of dial numbers (https://roboflow.com/)
- training and obtaining model weights:
  - mask_rcnn for hand segmentation
  - faster_rcnn for number detection
- getting the coordinates of the hands and numbers
- determining the time from the coordinates using trigonometric functions
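
The last step can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual code: the models are assumed to yield the dial center, the tip of each hand, and the position of the digit 12 as pixel coordinates, and the names `clockwise_angle` and `read_time` are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def clockwise_angle(center: Point, tip: Point) -> float:
    """Angle of the vector center -> tip in degrees, measured clockwise
    from straight up, in image coordinates (y grows downward)."""
    dx = tip[0] - center[0]
    dy = tip[1] - center[1]
    return math.degrees(math.atan2(dx, -dy)) % 360.0


def read_time(center: Point, hour_tip: Point, minute_tip: Point,
              twelve: Point) -> Tuple[int, int]:
    """Return (hour, minute) with hour in 0..11 (0 means 12 o'clock)."""
    # The detected digit 12 gives the orientation of the dial,
    # so a rotated photo is still read correctly.
    offset = clockwise_angle(center, twelve)
    minute_angle = (clockwise_angle(center, minute_tip) - offset) % 360.0
    hour_angle = (clockwise_angle(center, hour_tip) - offset) % 360.0

    # The minute hand moves 6 degrees per minute.
    minute = round(minute_angle / 6.0) % 60
    # The hour hand moves 30 degrees per hour plus 0.5 degrees per minute;
    # removing the minute part first avoids off-by-one hours near the
    # hour boundaries.
    hour = round((hour_angle - minute * 0.5) / 30.0) % 12
    return hour, minute
```

The sketch uses only the tip coordinates; with segmentation masks, the tip of each hand would first have to be derived from the mask, for example as the mask point farthest from the dial center.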
(!) Because the accuracy of this method was low, the approach to time determination was changed to classification.
Steps of the ResNext50 classification:
- creating and annotating a dataset:
  - by splitting time-lapse videos into frames
  - with a function that generates clock images
- training the ResNext50 model on the combined dataset
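
The project's clock image generation function is not shown, so the following is only a rough sketch of how such a generator could work. It draws a plain dial as an SVG string using the standard library; the SVG output format, the 224-pixel size, and the names `hand_tip`, `svg_line`, and `clock_svg` are all assumptions made for illustration.

```python
import math
from typing import Tuple


def hand_tip(cx: float, cy: float, length: float,
             angle_deg: float) -> Tuple[float, float]:
    """Endpoint of a hand of the given length; the angle is measured
    clockwise from 12 o'clock, in image coordinates (y grows downward)."""
    a = math.radians(angle_deg)
    return cx + length * math.sin(a), cy - length * math.cos(a)


def svg_line(x1: float, y1: float, x2: float, y2: float, width: int) -> str:
    """One SVG line element with coordinates rounded to one decimal."""
    return (f'<line x1="{x1:.1f}" y1="{y1:.1f}" x2="{x2:.1f}" y2="{y2:.1f}" '
            f'stroke="black" stroke-width="{width}"/>')


def clock_svg(hour: int, minute: int, size: int = 224) -> str:
    """Draw a simple clock face showing hour:minute as an SVG document."""
    c = size / 2          # dial center
    r = size * 0.45       # dial radius
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">',
        f'<circle cx="{c}" cy="{c}" r="{r:.1f}" fill="white" stroke="black"/>',
    ]
    # Twelve hour ticks, one every 30 degrees.
    for i in range(12):
        parts.append(svg_line(*hand_tip(c, c, r * 0.9, i * 30),
                              *hand_tip(c, c, r, i * 30), 2))
    # The hour hand advances 0.5 degrees per minute on top of 30 per hour.
    hour_angle = (hour % 12) * 30.0 + minute * 0.5
    minute_angle = minute * 6.0
    parts.append(svg_line(c, c, *hand_tip(c, c, r * 0.5, hour_angle), 4))
    parts.append(svg_line(c, c, *hand_tip(c, c, r * 0.8, minute_angle), 2))
    parts.append("</svg>")
    return "\n".join(parts)
```

Calling `clock_svg(h, m)` for every hour `h` in 0..11 and minute `m` in 0..59 yields one labeled image per class. For training, the SVG output would still have to be rasterized and varied (dial styles, backgrounds, rotation), which is outside this sketch.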
At inference, the ResNext50 model reaches 60% accuracy.
The final product of the project is a Telegram bot.