Enhancing the domain generalization capability of Face Anti-Spoofing (FAS) remains a challenge. Existing methods aim to extract domain-invariant features from various training domains. Despite the promising performance, the extracted features inevitably contain residual style feature bias (e.g., illumination, capture device), resulting in inferior generalization performance.
In this paper, we propose the Text Guided Domain Generalization (TeG-DG) framework, which effectively leverages text information for cross-domain alignment. As an abstract and universal representation, text can capture the commonalities and essential characteristics of various attacks, bridging the gap between different image domains. Unlike existing vision-language models, the proposed framework is elaborately designed to enhance the domain generalization ability of the FAS task. Concretely, we design a Text Prompter (TP) that dynamically generates text prompts and a Hierarchical Attention Fusion (HAF) module that integrates multiple levels of visual features. Furthermore, we propose a Textually Enhanced Visual Discriminator (TEVD) that not only improves vision-language alignment but also regularizes the classifier with textual features. Extensive experiments demonstrate that TeG-DG significantly outperforms prior approaches, particularly in scenarios with limited source-domain data.
Overview of the proposed Text Guided Domain Generalization (TeG-DG) framework.
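The cross-modal alignment idea above can be illustrated with a toy example: class-level text embeddings act as domain-independent anchors, and an image feature is classified by its similarity to each prompt. All vectors and prompt strings below are invented for illustration; the actual model uses learned CLIP-style encoders and the TP/HAF/TEVD modules described in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Pretend text embeddings for two class prompts (purely illustrative values).
text_feats = {
    "a photo of a real face": [0.9, 0.1, 0.2],
    "a photo of a spoof face": [0.1, 0.9, 0.3],
}

def classify(image_feat):
    """Assign the class whose text embedding is most similar to the image feature."""
    return max(text_feats, key=lambda p: cosine(image_feat, text_feats[p]))

print(classify([0.8, 0.2, 0.1]))  # closest to the "real face" prompt
```

Because the anchors live in the shared text space rather than any single image domain, the decision rule itself carries no image-domain-specific bias.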
Prerequisites
- Linux
- CPU or NVIDIA GPU + CUDA + cuDNN
- Python 3.x
We recommend using Anaconda to create a conda environment:

```shell
conda create -n TEGDG python=3.8
conda activate TEGDG
```

Install PyTorch (2.0 or later) and torchvision:

```shell
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 -c pytorch
```

Install some additional dependencies:

```shell
pip install -r requirements.txt
```

Implement

```shell
sh train_MCIO.sh
```

We evaluate the proposed method on four publicly available datasets: MSU-MFSD (denoted as M), Replay-Attack (denoted as I), CASIA-MFSD (denoted as C), and OULU-NPU (denoted as O).
Following previous DG-FAS works, we adopt a leave-one-out (LOO) strategy: three datasets are selected for training, and the remaining one is used for testing.
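The LOO protocol above can be sketched as follows. The dataset abbreviations follow the paper; the helper function is a hypothetical illustration, not part of the released code.

```python
# The four benchmark datasets, abbreviated as in the paper:
# M = MSU-MFSD, C = CASIA-MFSD, I = Replay-Attack, O = OULU-NPU.
DATASETS = ["M", "C", "I", "O"]

def leave_one_out_splits(datasets):
    """Yield (train_domains, test_domain) pairs: each dataset is held out once."""
    for test in datasets:
        train = [d for d in datasets if d != test]
        yield train, test

for train, test in leave_one_out_splits(DATASETS):
    print(f"Train on {'&'.join(train)}, test on {test}")
```

This yields the four standard protocols (e.g. C&I&O to M), so every dataset serves as the unseen target domain exactly once.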
```
└── YOUR_Data_Dir
    ├── OULU_NPU
    │   ├── trainset
    │   ├── testset
    │   ├── trainset.csv
    │   └── testset.csv
    ...
    └── REPLAY_ATTACK
        ├── trainset
        ...
        └── testset.csv
```
The trainset.csv and testset.csv files follow the format below:
| Index | Dataset | Path | Type | Attack | Label |
|---|---|---|---|---|---|
| 0 | MSU-MFSD | path/to/data1 | train | replay | 0 |
| 1 | MSU-MFSD | path/to/data2 | train | real | 1 |
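A minimal sketch of parsing this index with the standard library, assuming the column conventions shown in the table above (real = 1, spoof = 0); the sample rows are copied from the table and the helper name is ours:

```python
import csv
import io

# Sample index in the README's CSV format (header and rows as documented).
sample = """Index,Dataset,Path,Type,Attack,Label
0,MSU-MFSD,path/to/data1,train,replay,0
1,MSU-MFSD,path/to/data2,train,real,1
"""

def load_index(text):
    """Return (path, label) pairs from a trainset.csv/testset.csv index."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["Path"], int(row["Label"])) for row in reader]

print(load_index(sample))  # [('path/to/data1', 0), ('path/to/data2', 1)]
```

When reading the real files, replace the `StringIO` buffer with `open("trainset.csv", newline="")`.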
The code is built on CLIP, BLIP, and SSAN. Thanks for their great work!
