# Efficient cross scale fusion network SCW-YOLO for object detection in remote sensing imagery

## ABSTRACT
Traditional object detection algorithms often struggle to detect small objects in remote sensing images because of their small size and complex backgrounds. To address this, we propose SCW-YOLO, a high-performance object detection model for remote sensing images based on YOLOv8. First, the model incorporated an Efficient Cross Scale Feature Pyramid Network (ECFPN), which adds a new feature layer to the shallow network and outputs the backbone network features directly to the detection head. This enabled richer feature fusion without the added computational cost of continuous downsampling. Additionally, a coordinate attention mechanism was employed to refine the backbone network by locally enhancing the features and reducing interference from redundant information. Finally, to further improve bounding box loss fitting and accelerate network convergence, a dynamic non-monotonic Wise-IoU (WIOU) loss function was introduced to replace the loss function of the baseline. The experimental results indicated that SCW-YOLO outperformed most state-of-the-art (SOTA) models in parameter efficiency and small-object detection accuracy, confirming its robustness in detecting small targets in remote sensing images.

## Setup
Download the [YOLOv8](https://github.com/ultralytics/ultralytics/tree/v8.1.6) code. Install ultralytics and its dependencies with pip, then check the software and hardware environment.

In [None]:
%pip install ultralytics
import ultralytics

ultralytics.checks()

## ECFPN
The neck structure is displayed as follows.

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  
  - [-1, 3, C2f, [256]] 

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 2], 1, Concat, [1]]  
  - [-1, 3, C2f, [128]]  

  - [-1, 1, Conv, [128, 3, 2]]
  - [[-1, 10, 4], 1, Concat, [1]]  
  - [-1, 3, C2f, [256]]  

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 7], 1, Concat, [1]]  
  - [-1, 3, C2f, [512]]  

  - [[13, 16, 19], 1, Detect, [nc]]  # Detect(P2, P3, P4)
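The layer indices in the neck are easy to misread, so the following is a minimal sketch that traces the cumulative stride of each layer and checks that the three `Detect` inputs sit at P2, P3 and P4. It does not import ultralytics. The `(from, module, stride_factor)` tuples are a hand-written simplification of this neck and of the backbone listed in the Backbone Enhance section, and `trace_strides` is a hypothetical helper, not part of the repository.

```python
# Each entry is (from, module, stride_factor). The factors are assumptions read
# off the configs: a stride-2 Conv doubles the stride, a 2x Upsample halves it,
# and every other module keeps the resolution unchanged.
BACKBONE = [
    (-1, "Conv", 2),        # 0  P1/2
    (-1, "Conv", 2),        # 1  P2/4
    (-1, "C2f", 1),         # 2
    (-1, "Conv", 2),        # 3  P3/8
    (-1, "C2f_CooreA", 1),  # 4
    (-1, "Conv", 2),        # 5  P4/16
    (-1, "C2f_CooreA", 1),  # 6
    (-1, "SPPF", 1),        # 7
]
NECK = [
    (-1, "Upsample", 0.5),       # 8
    ([-1, 4], "Concat", 1),      # 9
    (-1, "C2f", 1),              # 10
    (-1, "Upsample", 0.5),       # 11
    ([-1, 2], "Concat", 1),      # 12
    (-1, "C2f", 1),              # 13
    (-1, "Conv", 2),             # 14
    ([-1, 10, 4], "Concat", 1),  # 15
    (-1, "C2f", 1),              # 16
    (-1, "Conv", 2),             # 17
    ([-1, 7], "Concat", 1),      # 18
    (-1, "C2f", 1),              # 19
]
DETECT_FROM = [13, 16, 19]


def trace_strides(layers):
    """Return the cumulative stride (input size / feature size) of every layer."""
    strides = []
    for index, (source, _name, factor) in enumerate(layers):
        sources = source if isinstance(source, list) else [source]
        # Turn relative indices such as -1 into absolute layer indices.
        absolute = [s if s >= 0 else index + s for s in sources]
        # Index -1 at layer 0 refers to the input image, which has stride 1.
        incoming = [strides[s] if s >= 0 else 1 for s in absolute]
        # Concat only works when all inputs share one spatial resolution.
        assert len(set(incoming)) == 1, f"layer {index}: mismatched strides {incoming}"
        strides.append(int(incoming[0] * factor))
    return strides


strides = trace_strides(BACKBONE + NECK)
detect_strides = [strides[i] for i in DETECT_FROM]
grid_sizes = [640 // s for s in detect_strides]  # feature-map side for imgsz=640
print(detect_strides, grid_sizes)
```

Under these assumptions the detection head receives strides 4, 8 and 16, which matches the `Detect(P2, P3, P4)` comment, and layer 15 is the three-way fusion that pulls backbone layer 4 directly into the neck.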

## Backbone Enhance
The backbone structure is displayed as follows.

backbone:
  - [-1, 1, Conv, [64, 3, 2]]  
  - [-1, 1, Conv, [128, 3, 2]]  
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  
  - [-1, 1, C2f_CooreA, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  
  - [-1, 1, C2f_CooreA, [512, True]]
  - [-1, 1, SPPF, [512, 5]]
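To illustrate what the coordinate attention in `C2f_CooreA` contributes, here is a deliberately simplified, parameter-free sketch of the idea in plain Python. It keeps only the two directional average pools and the sigmoid gating. The learned 1x1 convolutions, normalization and channel mixing of a real coordinate attention block are omitted, so this is not the repository's `C2f_CooreA` implementation, and `coordinate_gate` is a hypothetical name.

```python
import math


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def coordinate_gate(feature_map):
    """Gate a [C][H][W] nested-list feature map with row and column attention.

    Simplified sketch: each channel is pooled along the width (one value per
    row) and along the height (one value per column); both pooled vectors are
    squashed with a sigmoid and multiplied back onto the input.
    """
    gated = []
    for channel in feature_map:
        height, width = len(channel), len(channel[0])
        # Pool along the width: keeps positional information along the height.
        pooled_rows = [sum(row) / width for row in channel]
        # Pool along the height: keeps positional information along the width.
        pooled_cols = [sum(channel[i][j] for i in range(height)) / height
                       for j in range(width)]
        row_gate = [sigmoid(v) for v in pooled_rows]
        col_gate = [sigmoid(v) for v in pooled_cols]
        gated.append([[channel[i][j] * row_gate[i] * col_gate[j]
                       for j in range(width)]
                      for i in range(height)])
    return gated


# One channel with a single strong response in the bottom-right corner.
example = [[[0.0, 0.0],
            [0.0, 4.0]]]
result = coordinate_gate(example)
print(result[0][1][1])
```

Because the gates are computed per row and per column rather than from a single global average, the attention weights retain where along each axis the response occurred, which is the property that makes this mechanism useful for localizing small objects.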

## WIOU
Replace the original bbox_iou function with the one provided in the loss.py file in the modules folder.

Similarly, add the contents of the block.py file in the modules folder to the corresponding block.py file.
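For readers unfamiliar with Wise-IoU, the sketch below computes the loss for a single pair of boxes in plain Python. It follows the formulation as we understand it from the Wise-IoU paper: the v1 term scales the IoU loss by a center-distance penalty, and the v3 term adds a non-monotonic gain based on the outlier degree `beta`. The function name `wiou`, the `(x1, y1, x2, y2)` box format, the defaults `alpha=1.9` and `delta=3.0`, and passing the running mean of the IoU loss as an argument are all assumptions made for illustration. This is not the code in the repository's loss.py.

```python
import math


def wiou(pred, target, mean_iou_loss=None, alpha=1.9, delta=3.0):
    """Return (v1, v3) Wise-IoU losses for two (x1, y1, x2, y2) boxes.

    v3 is None when no running mean of the IoU loss is supplied.
    Both boxes are assumed to have positive area.
    """
    # Plain IoU loss.
    inter_w = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    inter_h = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = inter_w * inter_h
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_target = (target[2] - target[0]) * (target[3] - target[1])
    iou_loss = 1.0 - inter / (area_pred + area_target - inter)

    # Width and height of the smallest box enclosing both boxes.
    # In a training framework this term is detached from the gradient graph.
    enclose_w = max(pred[2], target[2]) - min(pred[0], target[0])
    enclose_h = max(pred[3], target[3]) - min(pred[1], target[1])

    # Distance between box centers, normalized by the enclosing box.
    dx = ((pred[0] + pred[2]) - (target[0] + target[2])) / 2.0
    dy = ((pred[1] + pred[3]) - (target[1] + target[3])) / 2.0
    distance_penalty = math.exp((dx * dx + dy * dy) / (enclose_w ** 2 + enclose_h ** 2))
    v1 = distance_penalty * iou_loss
    if mean_iou_loss is None:
        return v1, None

    # Outlier degree: this box's IoU loss relative to the running mean.
    beta = iou_loss / mean_iou_loss
    # Non-monotonic gain: small for very good and for very poor boxes.
    gain = beta / (delta * alpha ** (beta - delta))
    return v1, gain * v1


v1, v3 = wiou((0, 0, 2, 2), (1, 1, 3, 3), mean_iou_loss=6 / 7)
print(v1, v3)
```

The gain is what makes the loss "dynamic": boxes whose IoU loss is far above the running mean are treated as low-quality outliers and receive a reduced gradient, so they do not dominate training.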

## Dataset
Download the dataset and store it in a local directory. Refer to the data.yaml file for the dataset configuration, and make sure its paths point to that directory.
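As a rough guide to what such a file contains, the sketch below builds the text of a dataset file in the layout that Ultralytics uses (`path`, `train`, `val`, `names`). The root directory, the `images/train` and `images/val` subfolders, and the class names are placeholders chosen for illustration, and `build_data_yaml` is a hypothetical helper. Use the values from the repository's own data.yaml for real training.

```python
def build_data_yaml(root, names):
    """Return the text of an Ultralytics-style dataset YAML file."""
    lines = [
        f"path: {root}",        # dataset root directory
        "train: images/train",  # training images, relative to path
        "val: images/val",      # validation images, relative to path
        "names:",               # class index -> class name
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(names)]
    return "\n".join(lines) + "\n"


# Placeholder root and class names; replace with those of your dataset.
text = build_data_yaml("/datasets/remote-sensing", ["airplane", "ship"])
print(text)
```

Saving the returned text as data.yaml next to the training script lets the `data = "data.yaml"` line in the Training section find it.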

## Training


In [None]:
from ultralytics import YOLO

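if __name__ == '__main__':
    # Dataset configuration file (see the Dataset section)
    data = "data.yaml"

    # Build SCW-YOLO from its model definition file, then train it
    model = YOLO('scw-yolo.yaml')
    model.train(data=data, epochs=200, imgsz=640, batch=8)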
