Official implementation of:
Zhen Yang, Yufei Luo, Jinshuai Yang, Xin Xu, Ru Zhang, Yongfeng Huang*
Class-aware Adversarial Unsupervised Domain Adaptation for Linguistic Steganalysis
IEEE Transactions on Information Forensics and Security (TIFS), 2025
DOI: https://doi.org/10.1109/TIFS.2025.3569409
Cross-domain linguistic steganalysis aims to detect stego texts in an unlabeled target domain using a model trained on a labeled source domain.
Existing methods mainly reduce the marginal distribution discrepancy between domains, but often suffer from two problems:
- Class-misalignment: incorrect alignment between cover and stego texts across domains
- Class-indistinction: insufficient class separation in the target domain
To address these issues, we propose CADA, a two-stage class-aware adversarial domain adaptation framework.
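CADA consists of two stages, source pre-training with adversarial alignment followed by class-aware pseudo-label fine-tuning. Its components are: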
- Supervised source-domain training
- Adversarial domain alignment (feature extractor vs. domain discriminator)
- Weighted Class-Aware Domain Distance (WCADD)
- Class-Aware Label Smoothing (CALS)
- Iterative pseudo-label selection
- Confidence-aware soft weighting
- Class-balanced sampling
- Progressive pseudo-label growth
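The pseudo-labelling items above (iterative selection, confidence-aware soft weighting, class-balanced sampling, progressive growth) can be illustrated with a minimal, framework-free sketch. It is not the code in `Finetune.py`: the function name `select_pseudo_labels`, the parameters `keep_ratio` and `min_conf`, and the exact balancing rule are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def select_pseudo_labels(
    probs: Sequence[Sequence[float]],
    keep_ratio: float,
    min_conf: float = 0.5,
) -> List[Tuple[int, int, float]]:
    """Return (sample_index, pseudo_label, weight) triples.

    probs[i] holds the predicted class probabilities for target sample i.
    All names and thresholds here are hypothetical, not taken from the repo.
    """
    # Group samples by predicted class, remembering each one's confidence.
    by_class: Dict[int, List[Tuple[float, int]]] = defaultdict(list)
    for idx, p in enumerate(probs):
        label = max(range(len(p)), key=lambda c: p[c])
        conf = p[label]
        if conf >= min_conf:
            by_class[label].append((conf, idx))

    if not by_class:
        return []

    # Class balance: every predicted class contributes the same number of
    # samples, capped by the smallest class and by the current keep ratio.
    smallest = min(len(items) for items in by_class.values())
    budget = int(keep_ratio * len(probs) / len(by_class))
    per_class = min(smallest, max(1, budget))

    selected = []
    for label, items in by_class.items():
        items.sort(reverse=True)  # most confident first
        for conf, idx in items[:per_class]:
            # Soft weighting: the confidence doubles as the sample weight.
            selected.append((idx, label, conf))
    return sorted(selected)


if __name__ == "__main__":
    target_probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4],
                    [0.3, 0.7], [0.05, 0.95], [0.45, 0.55]]
    # Progressive growth: admit a larger share of the target set each round.
    for round_id in range(3):
        ratio = min(1.0, 0.4 + 0.3 * round_id)
        chosen = select_pseudo_labels(target_probs, keep_ratio=ratio)
        print(round_id, len(chosen))
```

In a training loop, the returned weights would typically scale each sample's loss, and the model would be re-evaluated on the target set before the next selection round.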
pip install -r requirements.txt

| Package | Version |
|---|---|
| torch | ≥ 1.13.1 |
| transformers | ≥ 4.30.2 |
| scikit-learn | ≥ 0.19.2 |
| tqdm | ≥ 4.66.1 |
| numpy | ≥ 1.21.6 |
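
To confirm that an environment meets the minimum versions in the table, a small standard-library check along the following lines may help. The helper names `parse` and `check` are hypothetical and not part of this repository; the version floors are copied from the table above.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions, copied from the requirements table.
MINIMUM = {
    "torch": "1.13.1",
    "transformers": "4.30.2",
    "scikit-learn": "0.19.2",
    "tqdm": "4.66.1",
    "numpy": "1.21.6",
}


def parse(v: str) -> tuple:
    """Turn '2.1.0+cu118' into (2, 1, 0); local/pre-release suffixes are ignored."""
    parts = []
    for piece in v.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check(minimum=MINIMUM) -> list:
    """Return a list of human-readable problems; empty means all floors are met."""
    problems = []
    for name, floor in minimum.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed (need >= {floor})")
            continue
        if parse(installed) < parse(floor):
            problems.append(f"{name}: found {installed}, need >= {floor}")
    return problems


if __name__ == "__main__":
    for line in check() or ["All minimum versions satisfied."]:
        print(line)
```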
We use the publicly available datasets introduced in Wen et al., "SCL-Stega: Exploring Advanced Objective in Linguistic Steganalysis using Contrastive Learning", IH&MMSec 2023:
@inproceedings{wen2023scl,
title={Scl-stega: Exploring advanced objective in linguistic steganalysis using contrastive learning},
author={Wen, Juan and Gao, Liting and Fan, Guangying and Zhang, Ziwei and Jia, Jianghao and Xue, Yiming},
booktitle={Proceedings of the 2023 ACM Workshop on Information Hiding and Multimedia Security},
pages={97--102},
year={2023}
}

├── main.py # Entry point: argument parsing and training pipeline
├── pretrain.py # Phase 1: source pre-training + UDA adversarial training
├── Finetune.py # Phase 2: class-aware pseudo-label fine-tuning
├── module.py # Model definitions: FeatureExtractor, Classifier, Discriminator
├── loss.py # WCADD loss and helper functions
├── DataLoader.py # Dataset loading, tokenisation, and batch iteration
├── test.py # Evaluation functions
├── utils.py # Utilities: save_model, make_variable, LabelSmoothingCrossEntropy
└── requirements.txt
If you find this work useful, please cite our paper:
@article{yang2025class,
title={Class-aware adversarial unsupervised domain adaptation for linguistic steganalysis},
author={Yang, Zhen and Luo, Yufei and Yang, Jinshuai and Xu, Xin and Zhang, Ru and Huang, Yongfeng},
journal={IEEE Transactions on Information Forensics and Security},
year={2025},
publisher={IEEE}
}

This paper was submitted to IEEE Transactions on Information Forensics and Security (TIFS).
Review history:
- Submitted: 09 September 2024
- Major Revision: 03 February 2025
- Minor Revision: 21 April 2025
- Accepted: 06 May 2025
- Published Online: 12 May 2025
Total duration: ~8 months.
The manuscript underwent one major revision and one minor revision.
Revisions mainly focused on strengthening theoretical justification, expanding ablation studies, and clarifying pseudo-label robustness.
This repository is released under the MIT License.
If you have questions about the paper or the implementation, or run into issues when reproducing the results, please feel free to reach out:
- Yufei Luo: luoyf@bupt.edu.cn
We warmly welcome discussions, suggestions, and potential collaborations on related topics, including linguistic steganalysis, domain adaptation, and adversarial learning.

