Home
Welcome to the Basegun-ml wiki!
The Basegun-ml repo contains all the Machine Learning (ML) research code used in the Basegun app, a tool that helps with the identification and legal categorization of firearms in France.
The type of algorithm used in Basegun assigns a category, called a label or class, to any image it sees. This category must come from a pre-defined list; the algorithm cannot propose a class outside this list.
In Basegun's case, it assigns an image to one of a list of families that (generally) represent firearm mechanisms. This classification is based on descriptive, objective criteria that are independent of the legal classification. For more information on this point, see the Dataset section.
ℹ️ Due to the way it works, the algorithm will try to find the best-fitting firearm for any image. Even a cat image will be categorized as a kind of firearm! This is a technical choice we made to get better performance at distinguishing firearms.
This type of algorithm is called supervised classification for computer vision (i.e. for images and videos).
The output of a classification model applied to an image is an array whose size is the number of classes. It contains, for each class, the probability that the image belongs to that class, called the confidence score. For instance, if we have 5 classes, the output [0.1, 0.2, 0.5, 0.05, 0.15] means that the algorithm thinks:
- there is a 10% chance that this image is of class 1
- there is a 20% chance that this image is of class 2
- etc.
Consequently, in this example we consider the image to be most likely of class 3, since it is the class with the highest confidence score.
In the Basegun app, to give the user information that is easy to understand, we only return the name and confidence score of the class with the highest score.
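As a minimal sketch of this selection step, the snippet below picks the highest-scoring class from the example output above. The function name `top_prediction` and the class names are hypothetical, chosen only for illustration; they are not taken from the Basegun code.

```python
def top_prediction(scores, class_names):
    """Return the (name, confidence) pair of the highest-scoring class."""
    if len(scores) != len(class_names):
        raise ValueError("scores and class_names must have the same length")
    # Index of the largest confidence score.
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best_index], scores[best_index]


# Hypothetical class names, used only for illustration.
class_names = ["class 1", "class 2", "class 3", "class 4", "class 5"]
scores = [0.1, 0.2, 0.5, 0.05, 0.15]

name, confidence = top_prediction(scores, class_names)
print(f"{name} ({confidence:.0%})")  # prints "class 3 (50%)"
```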
The dataset is the collection of images used to train the algorithm. Since we use a supervised algorithm, we need to "teach" it how to recognize the images. For that purpose, we feed it a large number (tens of thousands) of (image, label) pairs, where the label is the true class of the image. After this training, we expect the algorithm to be able to guess the class of new images whose class is unknown. More details about the dataset for classification
- General research on image classification
- Trainings of March 2022 on Dataset v0
- Trainings of March 2023 on Dataset v1
- Trainings of January 2024 on Dataset v1
Previous work available here
Measuring the overall length of a firearm or the length of its barrel is crucial for its legal classification. In France, the classification of long guns depends on these lengths.
For example, a long gun with a shorter barrel is easier to conceal and thus poses greater risks. Consequently, it will be assigned a higher legal class compared to a similar firearm with a longer barrel.
Measuring length from images or videos is a common task in computer vision. We have studied several computer vision techniques to determine the best solution for the use-case. Some methods require a reference object to determine the actual length, while others do not.
In the case of measurement using a reference object, here is the primary strategy:
- Measure the reference object and the target object in the image in pixels.
- Using the known length of the reference object, determine a length factor in px/cm.
- Calculate the target length using the measurement in pixels and the length factor.
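The three steps above can be sketched as follows. This is only an illustration: the function names and the numeric values are hypothetical, and the computation assumes that the reference object and the target lie in the same plane, at the same distance from the camera.

```python
def length_factor(reference_px, reference_cm):
    """Length factor in px/cm, from a reference object of known length."""
    if reference_px <= 0 or reference_cm <= 0:
        raise ValueError("reference measurements must be positive")
    return reference_px / reference_cm


def target_length_cm(target_px, factor_px_per_cm):
    """Convert a pixel measurement to centimetres using the length factor."""
    if factor_px_per_cm <= 0:
        raise ValueError("length factor must be positive")
    return target_px / factor_px_per_cm


# Illustrative values: a reference object 8.5 cm long spans 170 px,
# and the target spans 1500 px in the same image.
factor = length_factor(reference_px=170, reference_cm=8.5)
length = target_length_cm(target_px=1500, factor_px_per_cm=factor)
print(f"{factor} px/cm, target length: {length} cm")
```

With these values the length factor is 20 px/cm and the target measures 75 cm.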
One approach to measuring actual lengths in an application without a reference object is the use of augmented reality algorithms. This allows you to measure distances in your environment from your phone, in real time.
Advantages | Drawbacks |
---|---|
Precision of the measurement (Dependent on the user) | CPU usage, battery drain |
No need for a reference object | User experience |
| | Hardware compatibility |
Useful Links
- AR Apps: WebXR-Measure, Sizer
- Tutorial: AR Measurement Application Development
- How Does It Work? AR Ruler App Review, How to Use Google Measure on Android
Another method for measuring actual lengths in an application without a reference object involves using the focal length and other parameters. This approach determines the real size of an object by proportionality, based on the following elements:
- Focal length -> Photo metadata (depends on the phone model)
- Sensor size -> Photo metadata
- Object distance from the camera -> LiDAR or MiDaS depth estimation algorithm
- Weapon size in pixels
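The proportionality behind this approach can be sketched with the pinhole camera model. The function below is a hypothetical illustration, not the project's implementation. It also needs the image size in pixels along the measured axis, an input that is not in the list above and is assumed here in order to convert pixels to millimetres on the sensor. The model is an approximation that holds when the object is much farther away than the focal length.

```python
def real_size_mm(object_px, image_px, sensor_mm, focal_mm, distance_mm):
    """Estimate the real size of an object with the pinhole camera model.

    object_px and image_px must be measured along the same image axis
    as sensor_mm (for example, all three along the image width).
    """
    if min(object_px, image_px, sensor_mm, focal_mm, distance_mm) <= 0:
        raise ValueError("all inputs must be positive")
    # Size of the object's projection on the sensor.
    size_on_sensor_mm = object_px * sensor_mm / image_px
    # Similar triangles: real size / distance = size on sensor / focal length.
    return size_on_sensor_mm * distance_mm / focal_mm


# Illustrative values only: 4 mm focal length, 6 mm wide sensor,
# 4000 px wide image, object spanning 2000 px at a distance of 1 m.
size = real_size_mm(object_px=2000, image_px=4000, sensor_mm=6.0,
                    focal_mm=4.0, distance_mm=1000.0)
print(f"estimated size: {size} mm")
```

With these values the projection on the sensor is 3 mm wide, which gives an estimated real size of 750 mm.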
Advantages | Drawbacks |
---|---|
User experience (only a single photo is necessary) | Availability of metadata |
No need for a reference object | Availability of LiDAR |
| | Limited precision of depth estimation algorithms |
Useful Links
Segmentation involves determining whether each pixel in an image belongs to the object of interest. The result is a cutout of the weapon. It is possible to use either salient object detection (SOD) algorithms, which segment the foreground object and require no specific training, or algorithms that must be retrained for our use case, such as YOLOv8.
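Once a segmentation mask is available, a length in pixels still has to be derived from it. One possible way, shown here purely as an assumption and not as the method used in Basegun, is to project the object pixels onto their principal axis and take the extent of the projection. The mask is represented as a plain list of rows of 0/1 values, and `mask_length_px` is a hypothetical name.

```python
import math


def mask_length_px(mask):
    """Extent of the masked object along its principal axis, in pixels.

    The result is the distance between the centres of the two extreme
    pixels along that axis.
    """
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    if not points:
        raise ValueError("mask contains no object pixels")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Second-order central moments of the pixel cloud.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # Orientation of the principal axis.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    # Project every pixel onto the axis and measure the spread.
    projections = [x * ux + y * uy for x, y in points]
    return max(projections) - min(projections)


# Toy mask: a horizontal bar of 5 pixels.
mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
print(mask_length_px(mask))
```

For the toy mask, the extreme pixel centres are 4 pixels apart. The resulting pixel length would still need to be converted to a real length by one of the methods described above.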
Advantages | Drawbacks |
---|---|
Precision of the algorithm | Use of computational resources |
No need for training with some models (SOD, SAM) | GPU may be necessary for training |
Segmentation Models
Object detection involves detecting the presence of an object in an image. The process is generally divided into two stages:
- Determination of the object's potential locations, which are represented using bounding boxes.
- Identification of the class of the detected object.
This method can be complemented by keypoint detection, which involves detecting the object's points of interest in the image. For example, the eyes in the case of face detection.
Advantages | Drawbacks |
---|---|
Inference time on CPU compatible with use case | Bounding box and keypoint precision |
Good performance for object detection | Transfer learning mandatory to specialize the models on the use-case |
Useful Links
- Benchmark
- Tutorials
To measure a distance in pixels on an image or detect an object, you don't necessarily need to use resource-intensive deep learning methods, as computer vision methods exist for shape recognition or contour detection. Once the object has been detected, it's simply a matter of calculating the distance based on the position of its extremities.
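As a minimal sketch of that last step, the snippet below takes a list of contour points (however they were obtained) and returns the two points that are farthest apart, together with their distance in pixels. The function name and the sample points are hypothetical. The brute-force search compares every pair of points, which is acceptable for small contours only.

```python
import math
from itertools import combinations


def extremities(points):
    """Return the two points that are farthest apart and their distance."""
    if len(points) < 2:
        raise ValueError("at least two points are required")
    # Compare every pair of points and keep the most distant one.
    a, b = max(combinations(points, 2), key=lambda pair: math.dist(*pair))
    return a, b, math.dist(a, b)


# Illustrative contour points, as (x, y) pixel coordinates.
contour = [(0, 0), (3, 0), (3, 4), (0, 1)]
start, end, length_px = extremities(contour)
print(start, end, length_px)
```

Here the farthest pair is (0, 0) and (3, 4), which are 5 pixels apart.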
Advantages | Drawbacks |
---|---|
Inference time on CPU compatible with use case | Precision of the algorithms |
Simplicity of processing | Precision highly dependent on background |
Useful Links
- Haar Cascade
- Contour Detection
The selected method to address the use case of measuring firearm length is keypoint detection.