
Mu437/TeG-DG


Text Guided Domain Generalization for Face Anti-Spoofing

Introduction

Enhancing the domain generalization capability of Face Anti-Spoofing (FAS) remains a challenge. Existing methods aim to extract domain-invariant features from various training domains. Despite promising performance, the extracted features inevitably retain residual style bias (e.g., illumination, capture device), resulting in inferior generalization.

In this paper, we propose the Text Guided Domain Generalization (TeG-DG) framework, which effectively leverages text information for cross-domain alignment. As an abstract and universal representation, text can capture the commonalities and essential characteristics across various attacks, bridging the gap between different image domains. Unlike existing vision-language models, the proposed framework is specifically designed to enhance the domain generalization ability of the FAS task. Concretely, we design a Text Prompter (TP) that dynamically generates text prompts and a Hierarchical Attention Fusion (HAF) module that integrates multiple levels of visual features. Furthermore, we propose a Textually Enhanced Visual Discriminator (TEVD) that not only improves vision-language alignment but also regularizes the classifier using textual features. Extensive experiments demonstrate that TeG-DG significantly outperforms prior approaches, particularly in scenarios with limited source domain data.

Method

[Figure: pipeline]

Overview of the proposed Text Guided Domain Generalization (TeG-DG) framework.

Setup

Prerequisites

  • Linux
  • CPU or NVIDIA GPU + CUDA CuDNN
  • Python 3.x

We recommend using Anaconda to create a conda environment:

conda create -n TEGDG python=3.8
conda activate TEGDG

Install PyTorch (2.0 or later) and torchvision.

conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 -c pytorch

Install some additional dependencies:

pip install -r requirements.txt

Training

Run the training script:

sh train_MCIO.sh

Datasets

We evaluate the proposed method on four publicly available datasets: MSU-MFSD (denoted as M), Replay-Attack (denoted as I), CASIA-MFSD (denoted as C), and OULU-NPU (denoted as O).

Following previous DG-FAS works, we adopt a leave-one-out (LOO) protocol: three datasets are selected for training, and the remaining one is used for testing.
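For reference, the four LOO splits over {M, C, I, O} can be enumerated with a few lines of Python (a minimal sketch; the single-letter names are only the shorthand above, not identifiers used by the released code):

```python
# Enumerate the four leave-one-out (LOO) splits over the M/C/I/O datasets:
# each dataset takes a turn as the unseen test domain.
datasets = ["M", "C", "I", "O"]

splits = []
for test_set in datasets:
    train_sets = [d for d in datasets if d != test_set]
    splits.append((train_sets, test_set))

for train_sets, test_set in splits:
    print("train on", "&".join(train_sets), "-> test on", test_set)
```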

└── YOUR_Data_Dir
   ├── OULU_NPU
   |   ├──trainset
   |   ├──testset
   |   ├──trainset.csv
   |   └──testset.csv
   ...
   └── REPLAY_ATTACK
       ├──trainset
       ...
       └──testset.csv
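The layout above can be sanity-checked before training with a small script (a hypothetical helper, not part of the repository; the required entry names are taken from the tree above):

```python
import os
import tempfile

# Expected contents of each dataset directory, per the tree above.
REQUIRED = ["trainset", "testset", "trainset.csv", "testset.csv"]

def check_dataset_dir(dataset_dir):
    """Return the list of required entries missing from dataset_dir."""
    return [name for name in REQUIRED
            if not os.path.exists(os.path.join(dataset_dir, name))]

# Demo on a throwaway directory: build a complete layout, then verify it.
root = tempfile.mkdtemp()
oulu = os.path.join(root, "OULU_NPU")
os.makedirs(os.path.join(oulu, "trainset"))
os.makedirs(os.path.join(oulu, "testset"))
open(os.path.join(oulu, "trainset.csv"), "w").close()
open(os.path.join(oulu, "testset.csv"), "w").close()

missing = check_dataset_dir(oulu)
print("missing entries:", missing)  # an empty list means the layout is complete
```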

The trainset.csv and testset.csv files follow the format below:

Index  Dataset   Path           Type   Attack  Label
0      MSU-MFSD  path/to/data1  train  replay  0
1      MSU-MFSD  path/to/data2  train  real    1
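A file in this format can be read with the standard csv module. A minimal sketch, assuming comma-separated fields (if the released files use a different delimiter, adjust the reader accordingly):

```python
import csv
import io

# Two sample rows in the format above (an illustrative string, not real data).
sample = """Index,Dataset,Path,Type,Attack,Label
0,MSU-MFSD,path/to/data1,train,replay,0
1,MSU-MFSD,path/to/data2,train,real,1
"""

rows = list(csv.DictReader(io.StringIO(sample)))
# Per the example rows: Label 1 marks a real face, Label 0 marks an attack.
labels = [int(r["Label"]) for r in rows]
print(rows[0]["Attack"], labels)
```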

Acknowledgments

The code is built on CLIP, BLIP, and SSAN. Thanks for their excellent work!
