ACSTNet: An improved YOLOX method for small object detection with pixel-level attention and parallel Swin Transformer
torch version: 1.11.0+cu113
cuda version: 11.3
cudnn version: 8200
torchvision version: 0.12.0+cu113
Python environment installation:
pip install -r requirements.txt

- Backbone feature extraction network: CSPNet combined with a Swin Transformer structure is used.
- Neck layer: adds the ECSNFAM multi-attention structure.
- Classification and regression layer: Decoupled Head. In YOLOX, the YOLO Head is split into separate classification and regression branches, which are only merged in the final prediction.
- Tricks used for training: Mosaic data augmentation, IoU and GIoU losses, cosine annealing learning rate decay.
- Anchor Free
- SimOTA
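The IoU and GIoU overlap measures mentioned in the training tricks can be sketched in a few lines. This is the generic formulation for axis-aligned boxes, shown for illustration only; the repo's own loss implementation may differ.

```python
# Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
def iou_giou(a, b):
    """Return (IoU, GIoU) for two axis-aligned boxes."""
    # Intersection rectangle (empty if the boxes do not overlap)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest enclosing box C; GIoU penalizes the empty part of C,
    # so disjoint boxes get a negative score instead of a flat zero.
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    c_area = (cx2 - cx1) * (cy2 - cy1)
    giou = iou - (c_area - union) / c_area
    return iou, giou
```

Identical boxes give (1.0, 1.0); for well-separated boxes IoU saturates at 0 while GIoU stays negative, which is what makes GIoU usable as a regression loss for non-overlapping predictions.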
The weights required for training can be downloaded from Google Drive.
Link: https://drive.google.com/drive/folders/1cF7GUiqjay0WJ-lElzG-MrHZHpY3kuQu?usp=sharing
- Preparation of the dataset: first download the RSOD dataset, then use the VOC format for training and place the files under the path: ACSTNet/dataset/RSOD-Dataset
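VOC-format annotations store each object as an XML record next to the image. The sketch below parses one such record; the annotation snippet and function name are made-up illustrations, not the repo's own dataloader.

```python
import xml.etree.ElementTree as ET

# A made-up VOC-style annotation snippet for illustration.
SAMPLE = """
<annotation>
  <filename>aircraft_001.jpg</filename>
  <object>
    <name>aircraft</name>
    <bndbox><xmin>48</xmin><ymin>30</ymin><xmax>120</xmax><ymax>96</ymax></bndbox>
  </object>
</annotation>
"""

def parse_voc(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        bb = obj.find("bndbox")
        coords = tuple(int(bb.find(k).text)
                       for k in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((name, coords))
    return boxes
```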
- Modify the parameters needed for training
- Start training:
python train.py

Training result prediction
Two files are needed to predict the training results: yolo.py and predict.py. First open yolo.py and modify model_path and classes_path; both parameters must be modified. model_path points to the trained weights file, which is in the logs folder.
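The two parameters can be pictured as entries in a small configuration dictionary. The values below are hypothetical placeholders to show the expected shape, not the repo's actual defaults or filenames.

```python
# Hypothetical illustration of the two fields to edit in yolo.py;
# the actual attribute layout in the repo may differ.
defaults = {
    # Path to your trained weights under logs/ (placeholder filename)
    "model_path": "logs/last_epoch_weights.pth",
    # Txt file listing one detected class name per line (placeholder path)
    "classes_path": "model_data/classes.txt",
}
```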
Once you have completed the changes you can run predict.py for testing.
python predict.py
- This paper uses the VOC format for evaluation. RSOD already provides a test split, whose path is dataset/RSOD-Dataset/test_xywh.txt.
- Modify model_path as well as classes_path inside yolo.py. model_path points to the trained weights file in the logs folder; classes_path points to the txt file corresponding to the detected categories.
- Run get_map.py to obtain the evaluation results, which will be saved in the map_out folder.
python get_map.py
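For reference, the core of a VOC-style mAP evaluation is the average precision of each class, i.e. the area under its precision-recall curve. The sketch below is a generic all-point interpolated AP computation for illustration; it is not the exact implementation inside get_map.py.

```python
def average_precision(scored_hits, n_gt):
    """All-point interpolated average precision.

    scored_hits: list of (confidence, is_true_positive) detections.
    n_gt: number of ground-truth boxes for this class.
    """
    # Sweep detections from most to least confident
    scored_hits = sorted(scored_hits, key=lambda s: -s[0])
    tp = fp = 0
    recalls, precisions = [], []
    for _, hit in scored_hits:
        tp += hit
        fp += (not hit)
        recalls.append(tp / n_gt)
        precisions.append(tp / (tp + fp))
    # Make the precision envelope monotonically non-increasing
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Integrate precision over recall
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_r) * p
        prev_r = r
    return ap
```

mAP is then the mean of this value over all classes; a matching step (e.g. counting a detection as a true positive only when its IoU with an unmatched ground-truth box exceeds a threshold) has to run before this computation.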