MLMStego is a steganography method, based on the BERT transformer model, for hiding text data within cover text. It conceals information by substituting specific words in the cover text using BERT's masked language modeling (MLM) feature. The study employs two models, fine-tuned for English and Turkish, to perform steganography on texts in these languages. The method is also designed to work with any transformer model that supports masked language modeling. Unlike traditional methods, which often limit the amount of information that can be hidden, the proposed approach can conceal a substantial amount of data in the text without distorting its meaning.
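The general idea of carrying hidden bits through word substitution can be sketched in a few lines. This is a toy illustration only, not the algorithm from the paper: `toy_ranker`, `embed`, and `extract` are hypothetical names, the candidate lists are hard-coded stand-ins for what a masked language model might rank at each position, and the fixed two bits per substituted word is an assumption made for simplicity.

```python
from typing import Callable, List, Sequence

# A ranker returns an ordered list of candidate words for one masked position.
# In practice a masked language model would play this role; here it is a stub.
Ranker = Callable[[Sequence[str], int], List[str]]


def toy_ranker(words: Sequence[str], position: int) -> List[str]:
    """Hypothetical stand-in for an MLM: fixed candidates per position."""
    table = {
        1: ["quick", "fast", "swift", "rapid"],
        4: ["jumps", "leaps", "hops", "bounds"],
    }
    return table[position]


def embed(words: Sequence[str], positions: Sequence[int], bits: str,
          ranker: Ranker, bits_per_slot: int = 2) -> List[str]:
    """Replace each chosen word with the candidate whose rank encodes the bits."""
    if len(bits) != len(positions) * bits_per_slot:
        raise ValueError("bit string does not match the number of slots")
    out = list(words)
    for i, pos in enumerate(positions):
        chunk = bits[i * bits_per_slot:(i + 1) * bits_per_slot]
        out[pos] = ranker(out, pos)[int(chunk, 2)]
    return out


def extract(words: Sequence[str], positions: Sequence[int],
            ranker: Ranker, bits_per_slot: int = 2) -> str:
    """Recover the bits by looking up each word's rank among the candidates."""
    bits = ""
    for pos in positions:
        index = ranker(words, pos).index(words[pos])
        bits += format(index, f"0{bits_per_slot}b")
    return bits


cover = "the quick brown fox jumps over the lazy dog".split()
stego = embed(cover, [1, 4], "1001", toy_ranker)
print(" ".join(stego))
print(extract(stego, [1, 4], toy_ranker))
```

The toy ranker ignores context, so sender and receiver trivially agree on the candidate lists. With a real model, both sides would have to reproduce the same ranking for each masked position, which is one of the things an actual method has to guarantee.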
- PyTorch
- Transformers
- Tqdm
- Clone the repository:
  git clone https://github.com/emirozturk/MLMStego
- Install the required dependencies:
  cd MLMStego
  pip install -r requirements.txt
Command-line Interface
python Test.py \
--pathForCoverText "path_to_cover_text_file" \
--secret "secret_text" \
--language "tr" \
--halfWindowSize 10 \
--loopChange 2 \
--loopMod 3 \
--randomSeed 110001 \
--saveStegoText True \
--printObtainedSecret True \
--model "dbmdz/bert-base-turkish-cased" \
--device "cuda" # or "cpu", or "mps" on macOS
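Which `--device` value applies depends on the machine. The following is a minimal sketch for picking one of the three values listed above; `pick_device` is a hypothetical helper, not part of the repository, and it falls back to "cpu" when PyTorch is not importable.

```python
try:
    import torch
except ImportError:  # PyTorch missing: only the CPU fallback makes sense
    torch = None


def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    if torch is None:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # The MPS backend only exists in newer PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```

The returned string can be passed directly as the `--device` argument.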
Example with minimum arguments
python Test.py --pathForCoverText cover.txt --secret hiddenmessage
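For scripted or batch runs, the same command can be assembled from Python. This is a sketch under stated assumptions: `Test.py` is in the current working directory and accepts the flags shown above; `build_command` and `run_stego` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys
from typing import List, Optional


def build_command(cover_path: str, secret: str,
                  language: Optional[str] = None,
                  device: Optional[str] = None) -> List[str]:
    """Assemble the Test.py invocation as an argument list (no shell quoting needed)."""
    cmd = [sys.executable, "Test.py",
           "--pathForCoverText", cover_path,
           "--secret", secret]
    # Optional flags are appended only when given, mirroring the minimal example.
    if language is not None:
        cmd += ["--language", language]
    if device is not None:
        cmd += ["--device", device]
    return cmd


def run_stego(cover_path: str, secret: str, **options: Optional[str]) -> None:
    """Run Test.py and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_command(cover_path, secret, **options), check=True)
```

Calling `run_stego("cover.txt", "hiddenmessage")` corresponds to the minimal command above; passing `language="tr"` or `device="cpu"` adds the matching flags.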
If you use this steganography method in your research, please cite the following paper:
@ARTICLE{MLMStego,
author={Öztürk, Emir and Mesut, Andaç Şahin and Fidan, Özlem Aydın},
journal={IEEE Access},
title={A character-based steganography using masked language modeling},
year={2024},
volume={},
number={},
pages={1-1},
doi={10.1109/ACCESS.2024.3354710}
}