Outdoor fires (OFs) pose significant threats to human safety, property, and ecological stability. However, existing detection algorithms often suffer from performance degradation in complex real-world environments, resulting in high false alarm and missed detection rates. To address these limitations, we propose a novel Convolutional Neural Network (CNN)-Mamba Dual-branch Fusion Network (CMDFNet). The framework consists of a CNN branch for fine-grained local feature extraction and a Vision Mamba branch for efficient global context modeling. To further enhance representation, we introduce an eight-directional selective scanning strategy (R8-SS2D) for irregular flame contour perception, a selective state space mechanism (S6) for dynamic temporal adaptation, and a CNN-VMamba Fusion (CVMF) block to facilitate deep interaction between local and global features. Extensive experiments show that CMDFNet achieves superior performance, surpassing state-of-the-art methods by 2.0% in detection accuracy and 2.1% in recall, while maintaining high computational efficiency. The model has also been deployed at over 20 monitoring sites, where it successfully detected more than 1,600 OF events, further confirming its effectiveness and robustness in complex, real-world environments.
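To make the dual-branch idea concrete, the following is a minimal, self-contained sketch of combining a local (CNN-like) branch with a global (context) branch on a toy 2-D frame. This is purely illustrative and is not the released CMDFNet code: the function names, the mean-filter/global-average stand-ins, and the fusion weight `alpha` are all assumptions for exposition, not details taken from the paper.

```python
# Illustrative sketch only: local branch = 3x3 mean filter (fine-grained,
# CNN-like features); global branch = global average (a crude stand-in for
# long-range context modeling); fusion = weighted element-wise sum.
# `alpha` and all names here are hypothetical, not from CMDFNet.

def local_branch(frame):
    """3x3 mean filter: fine-grained local feature extraction."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [frame[ii][jj]
                    for ii in range(max(0, i - 1), min(h, i + 2))
                    for jj in range(max(0, j - 1), min(w, j + 2))]
            out[i][j] = sum(vals) / len(vals)
    return out

def global_branch(frame):
    """Global average pooling: a stand-in for global context modeling."""
    flat = [v for row in frame for v in row]
    g = sum(flat) / len(flat)
    return [[g] * len(frame[0]) for _ in frame]

def fuse(local, global_, alpha=0.5):
    """Weighted element-wise fusion of the two branches."""
    return [[alpha * l + (1 - alpha) * g
             for l, g in zip(lrow, grow)]
            for lrow, grow in zip(local, global_)]

frame = [[0.0, 0.0, 0.0],
         [0.0, 9.0, 0.0],
         [0.0, 0.0, 0.0]]
fused = fuse(local_branch(frame), global_branch(frame))
```

In the actual CMDFNet, the global branch is a Vision Mamba with the R8-SS2D scanning strategy and the fusion is performed by the learned CVMF block; the sketch only conveys the structural principle of merging complementary local and global representations.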
Due to ongoing collaborations with industrial partners and intellectual property considerations, the full codebase cannot be released at this stage. Nevertheless, we have released the key code, along with detailed descriptions of the model architecture and training protocols in the manuscript. We are also actively discussing with our collaborators the possibility of releasing additional components in the future and remain committed to open-source practices that benefit the community.