This is a SAM2-based YOLO auto-annotation tool for drone detection. It relies mainly on Ultralytics pretrained models, so it is lightweight to use.
    python -m pip install -r requirements.txt

If your hardware is good enough to run SAM2 inference locally, you can skip this step.
Upload your video data to /video_data/<video_name>/<video_name>.mp4, and then run:

    python scripts/extract_mask.py -v <video_name>

The frames will be extracted to the directory /video_data/<video_name>/rgb_frames/
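The per-video folders mentioned throughout this README can be collected in one place. The following is a minimal sketch, not code from this repository: the helper name `video_paths` and the `root` argument are assumptions for illustration, and only the folder names come from the text.

```python
from pathlib import Path


def video_paths(root: Path, video_name: str) -> dict[str, Path]:
    """Return the per-video paths named in this README (hypothetical helper)."""
    base = root / "video_data" / video_name
    return {
        "video": base / f"{video_name}.mp4",
        "rgb_frames": base / "rgb_frames",      # extracted frames
        "mask_frames": base / "mask_frames",    # SAM2 masks
        "labels": base / "labels",              # YOLO label files
        "check_labels": base / "check_labels",  # frames with labels drawn on them
    }


# "flight_01" is a made-up video name used only for this example.
paths = video_paths(Path("."), "flight_01")
print(paths["video"].as_posix())
```

Keeping the layout in one function makes it easy to check that a video folder is complete before starting a long annotation run.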
If your hardware is good enough to run SAM2 inference locally, you can skip this step.
First, prepare a prompt.yaml in your dataset folder; it specifies the prompts for SAM2.
The file has two parts (you can find an example in the folder):
- initial box prompt: the bounding box prompt for SAM2, in xyxy format. The origin is at the top-left corner, x is horizontal, and y is vertical. This updates the initial memory of SAM2, so it should be precise.
- auxiliary prompt: an additional prompt used to update the memory of SAM2. It can be a point prompt (format xy) or a box prompt (format xyxy). It guides SAM2 when it fails in adjacent frames. Note that adding too many prompts will slow down the segmentation.
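Before starting a long run, it can help to sanity-check the prompt coordinates. The sketch below is only an illustration of the xy and xyxy conventions described above: the function `classify_prompt`, the dictionary keys `initial_box` and `auxiliary`, the coordinates, and the 1920x1080 frame size are all assumptions, not the actual prompt.yaml schema of this project.

```python
def classify_prompt(values, width, height):
    """Return 'point' for [x, y] or 'box' for [x1, y1, x2, y2].

    Coordinates are pixels with the origin at the top-left corner.
    Raises ValueError if the prompt is malformed or outside the frame.
    """
    if len(values) == 2:
        x, y = values
        if not (0 <= x < width and 0 <= y < height):
            raise ValueError(f"point {values} lies outside the frame")
        return "point"
    if len(values) == 4:
        x1, y1, x2, y2 = values
        # A valid xyxy box has its top-left corner before its bottom-right corner.
        if not (0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height):
            raise ValueError(f"box {values} is not a valid xyxy box")
        return "box"
    raise ValueError(f"expected 2 or 4 numbers, got {len(values)}")


# Hypothetical prompt contents, as they might look after loading a YAML file.
prompt = {
    "initial_box": [412, 230, 468, 262],
    "auxiliary": [[440, 250], [400, 220, 470, 270]],
}
initial_kind = classify_prompt(prompt["initial_box"], 1920, 1080)
kinds = [classify_prompt(p, 1920, 1080) for p in prompt["auxiliary"]]
print(initial_kind, kinds)
```

A swapped corner order (for example x1 greater than x2) is an easy mistake to make when reading coordinates off an image viewer, which is why the box check compares the corners explicitly.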
Then run the following command to start auto annotation:
    python scripts/sam_auto_annotation.py -v <video_name> -i -e

After that, if you ran SAM2 remotely, download all the masks in video_data/<video_name>/mask_frames/.
Then run the following command to convert the masks to YOLO-format labels:

    python scripts/gen_labels.py -v <video_name>

All the labels will be generated in video_data/<video_name>/labels, and you can review them in video_data/<video_name>/check_labels.
These are images with the labels drawn on them, and you can check if the labels are correct. If not, you can modify the prompt.yaml and run the auto annotation again to update the labels.
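The mask-to-label conversion can be illustrated with a small sketch. This is not the code of scripts/gen_labels.py, which may work differently; it only shows the general idea of taking the tight bounding box of a binary mask and writing it as a YOLO detection label line (class, x center, y center, width, height, all normalized to the image size). The function name `mask_to_yolo_line` and the use of nested lists instead of image arrays are assumptions made to keep the example dependency-free.

```python
def mask_to_yolo_line(mask, class_id=0):
    """Convert a binary mask (list of rows of 0/1) to a YOLO label line.

    Returns None when the mask is empty, i.e. no object in the frame.
    """
    height = len(mask)
    width = len(mask[0])
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    if not rows:
        return None
    # Tight bounding box; the right and bottom edges are exclusive.
    x1, x2 = cols[0], cols[-1] + 1
    y1, y2 = rows[0], rows[-1] + 1
    cx = (x1 + x2) / 2 / width
    cy = (y1 + y2) / 2 / height
    w = (x2 - x1) / width
    h = (y2 - y1) / height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# A tiny 8x4 mask with the object covering columns 2-5 and rows 1-2.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
line = mask_to_yolo_line(mask)
print(line)  # 0 0.500000 0.500000 0.500000 0.500000
```

Returning None for an empty mask lets the caller decide whether to write an empty label file (a background frame) or to skip the frame entirely.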
You can also remove the wrong labels from check_labels, and then run:

    python scripts/percolate.py -v <video_name>

This adds the remaining correct labels and their RGB frames directly to the YOLO training dataset in the correct format.
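The review-then-collect step can be sketched as follows. This is a guess at the general idea, not the code of scripts/percolate.py: it assumes that a frame, its label file, and its check image share the same file stem, and that the training dataset uses images/train and labels/train subfolders. The function name `collect_reviewed` and the demo file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def collect_reviewed(video_dir: Path, dataset_dir: Path) -> list[str]:
    """Copy frames and labels whose check image survived the manual review."""
    images_dir = dataset_dir / "images" / "train"
    labels_dir = dataset_dir / "labels" / "train"
    images_dir.mkdir(parents=True, exist_ok=True)
    labels_dir.mkdir(parents=True, exist_ok=True)
    kept = []
    for check_img in sorted((video_dir / "check_labels").iterdir()):
        stem = check_img.stem
        label = video_dir / "labels" / f"{stem}.txt"
        frames = list((video_dir / "rgb_frames").glob(f"{stem}.*"))
        if not label.exists() or not frames:
            continue  # skip incomplete pairs
        shutil.copy2(frames[0], images_dir / frames[0].name)
        shutil.copy2(label, labels_dir / label.name)
        kept.append(stem)
    return kept


# Demo on a temporary folder: two labeled frames, one rejected during review.
with tempfile.TemporaryDirectory() as tmp:
    video_dir = Path(tmp) / "video_data" / "demo"
    for sub in ("rgb_frames", "labels", "check_labels"):
        (video_dir / sub).mkdir(parents=True)
    for stem in ("0001", "0002"):
        (video_dir / "rgb_frames" / f"{stem}.jpg").write_bytes(b"")
        (video_dir / "labels" / f"{stem}.txt").write_text("0 0.5 0.5 0.1 0.1\n")
    # Only the check image of frame 0001 is left after the review.
    (video_dir / "check_labels" / "0001.jpg").write_bytes(b"")
    kept = collect_reviewed(video_dir, Path(tmp) / "dataset")
    copied = sorted(p.name for p in (Path(tmp) / "dataset" / "labels" / "train").iterdir())
print(kept, copied)
```

Copying instead of moving keeps the per-video folders intact, so the review can be repeated after changing prompt.yaml.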
Run the following command to start training:
    python scripts/train.py --epochs 100

This is just a simple project that I created to generate labels for YOLO26n efficiently. I need the model to detect fast, maneuverable drones, so I need lots of data. Labeling the data manually is quite tedious, so I created this tool to generate labels automatically. I have to admit that the quality of the labels is not very good, but it is good enough for my use case. I hope this tool helps you generate labels for your own use case, and you are welcome to improve it.