# Audio Deepfake Detection Implementation (Part 2)

**AASIST Audio Deepfake Detection Implementation**

This notebook walks through the process of setting up and running the AASIST model for audio deepfake detection, based on the official implementation. We'll use the ASVspoof 2019 Logical Access (LA) dataset.

**References:**
* Official AASIST Code: [https://github.com/clovaai/aasist](https://github.com/clovaai/aasist)
* ASVspoof Challenge: [https://www.asvspoof.org/](https://www.asvspoof.org/)
* ASVspoof 2019 Dataset: [Edinburgh DataShare](https://datashare.ed.ac.uk/handle/10283/3336)

### Step 1: Clone the Official AASIST Repository

First, we need to get the codebase from GitHub. We'll clone the repository directly into our Colab environment.

In [None]:
!git clone https://github.com/clovaai/aasist.git

Cloning into 'aasist'...
remote: Enumerating objects: 38, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (16/16), done.
remote: Total 38 (delta 8), reused 4 (delta 4), pack-reused 18 (from 1)
Receiving objects: 100% (38/38), 1.43 MiB | 3.72 MiB/s, done.
Resolving deltas: 100% (12/12), done.


### Step 2: Navigate into the Repository Directory

Now that we have the code, we need to move into the directory that was just created (`aasist`) so we can run the scripts from there.

In [None]:
import os
os.chdir('./aasist')
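
Re-running the cell above after it has already succeeded raises `FileNotFoundError`, because the working directory is then already `aasist`. A minimal sketch of a re-runnable variant follows; the helper name `enter_repo` is hypothetical and not part of the repository.

```python
import os
from pathlib import Path


def enter_repo(repo_dir="aasist"):
    """Change into `repo_dir` unless the current directory already has that name."""
    cwd = Path.cwd()
    if cwd.name != repo_dir:
        os.chdir(cwd / repo_dir)
    return Path.cwd()


# In the notebook, call this in place of os.chdir('./aasist'):
# enter_repo("aasist")
```

Calling `enter_repo("aasist")` a second time leaves the working directory unchanged instead of failing.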

### Step 3: Install Dependencies

The repository includes a `requirements.txt` file listing all the necessary Python packages. Let's install them using `pip`.

In [None]:
!pip install -r requirements.txt

Collecting torchcontrib (from -r requirements.txt (line 2))
  Downloading torchcontrib-0.0.2.tar.gz (11 kB)
  Preparing metadata (setup.py) ... done
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=1.6.0->-r requirements.txt (line 1))
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=1.6.0->-r requirements.txt (line 1))
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch>=1.6.0->-r requirements.txt (line 1))
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=1.6.0->-r requirements.txt (line 1))
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch>=1.6.0->-r requirements.t
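
After installation it can help to record which versions were actually installed, since the NumPy version in particular determines whether the `np.float` fix in Step 5 is needed. The sketch below uses only the standard library; the helper name `installed_versions` and the package list are illustrative assumptions.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names chosen for illustration; adjust to match requirements.txt.
for name, version in installed_versions(["torch", "torchcontrib", "numpy"]).items():
    print(f"{name}: {version or 'not installed'}")
```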

### Step 4: Download and Prepare the Dataset

The AASIST model is typically trained and evaluated on the ASVspoof 2019 Logical Access (LA) dataset. This dataset is quite large (~7.1 GB compressed).

The original `clovaai/aasist` repository provides a `download_dataset.py` script. Running this script downloads and extracts the dataset into the expected directory structure (`./LA`).

**Note:** This download can take a considerable amount of time and requires sufficient disk space in your Colab environment (or mounted Drive if you adapt the paths).
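
Before starting the download, a quick check of free disk space can prevent a failed run partway through extraction. This is a minimal sketch: the 20 GiB threshold is an assumption (room for the archive plus the extracted files), not a figure taken from the repository, and `has_free_space` is a hypothetical helper.

```python
import shutil


def has_free_space(path=".", required_gb=20.0):
    """Return (ok, free_gb): whether `path` has at least `required_gb` GiB free."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= required_gb, free_gb


# required_gb=20.0 is an assumed safety margin, not an official requirement.
ok, free_gb = has_free_space(".", required_gb=20.0)
print(f"free: {free_gb:.1f} GiB, enough: {ok}")
```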

In [None]:
!python ./download_dataset.py

Streaming output truncated to the last 5000 lines.
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_7787040.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_2924301.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_9249366.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_3442936.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_7772915.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_5569336.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_7773607.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_7813281.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_9705954.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_2427464.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_1000273.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_5263550.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_1642109.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_1339848.flac  
  inflating: LA/ASVspoof2019_LA_eval/flac/LA_E_9495857.flac  
  inf
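
Once extraction finishes, counting the audio files per subset is a cheap way to confirm that nothing was cut short. The sketch below assumes that the train and dev subsets follow the same `ASVspoof2019_LA_<subset>/flac` layout visible for the eval subset in the output above; `count_flac_files` is a hypothetical helper.

```python
from pathlib import Path


def count_flac_files(root="LA", subsets=("train", "dev", "eval")):
    """Count .flac files per subset; None means the subset directory is missing."""
    counts = {}
    for subset in subsets:
        # Assumed layout: <root>/ASVspoof2019_LA_<subset>/flac/*.flac
        flac_dir = Path(root) / f"ASVspoof2019_LA_{subset}" / "flac"
        if flac_dir.is_dir():
            counts[subset] = sum(1 for _ in flac_dir.glob("*.flac"))
        else:
            counts[subset] = None
    return counts


print(count_flac_files("LA"))
```

The train and dev counts can then be compared against the number of training and validation files that `main.py` reports at startup in Step 6.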

### Step 5: Code Modifications (Important!)

Before training, two modifications were made to the cloned repository: one code fix and one configuration change.

**Modification 1: Update NumPy Data Type in `evaluation.py`**

* **File:** `evaluation.py`
* **Change:** In the sections loading ASV scores and CM scores, `.astype(np.float)` was changed to `.astype(np.float64)`.
* **Reason:** The alias `np.float` was deprecated in NumPy 1.20 and removed in NumPy 1.24, so on current NumPy versions the original lines raise an `AttributeError`. The explicit `np.float64` (64-bit float) behaves identically and works on both old and new NumPy versions.

    ```python
    # Original lines (problematic with newer NumPy):
    # asv_scores = asv_data[:, 2].astype(np.float)
    # cm_scores = cm_data[:, 3].astype(np.float)

    # Modified lines (explicit 64-bit float type):
    asv_scores = asv_data[:, 2].astype(np.float64)
    cm_scores = cm_data[:, 3].astype(np.float64)
    ```
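
Because a fresh Colab session discards manual edits, the same change can be applied programmatically after each clone. This is a sketch rather than part of the repository: `replace_np_float` and `patch_file` are hypothetical helpers, and the regular expression only rewrites the bare alias, leaving names such as `np.float64` and `np.float32` untouched.

```python
import re
from pathlib import Path

# The trailing \b fails when another identifier character follows, so
# np.float64, np.float32 and np.float_ are not matched.
_NP_FLOAT = re.compile(r"\bnp\.float\b")


def replace_np_float(source: str) -> tuple[str, int]:
    """Return (patched_source, number_of_replacements)."""
    return _NP_FLOAT.subn("np.float64", source)


def patch_file(path):
    """Rewrite `path` in place if it contains the removed alias."""
    file_path = Path(path)
    patched, count = replace_np_float(file_path.read_text())
    if count:
        file_path.write_text(patched)
    return count


# Only touches the file when run from inside the cloned repository.
if Path("evaluation.py").exists():
    print(f"replaced {patch_file('evaluation.py')} occurrence(s)")
```

Running it twice is harmless: the second pass finds no bare `np.float` and leaves the file as it is.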

**Modification 2: Reduce Epochs in `AASIST-L.conf`**

* **File:** `config/AASIST-L.conf`
* **Change:** The parameter `"num_epochs"` was changed from the default `100` to `10`.
* **Reason:** Training for the full 100 epochs requires significant GPU time, often exceeding the limits of free tiers on platforms like Google Colab. Reducing epochs to 10 allows for a quicker training run ("light re-training") to verify the setup and observe initial learning, while staying within typical resource constraints.

    ```json
    // Inside config/AASIST-L.conf
    // Original line:
    // "num_epochs": 100,
    // Modified line:
    "num_epochs": 10,
    ```
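
The epoch count can also be set from a notebook cell instead of editing the file by hand. The sketch below assumes the `.conf` file is plain JSON with `"num_epochs"` as a top-level key (the `//` comments in the snippet above are annotations only and are not valid inside the real file); `set_num_epochs` is a hypothetical helper.

```python
import json
from pathlib import Path


def set_num_epochs(conf_path, num_epochs):
    """Set "num_epochs" in a JSON config file and return the previous value."""
    path = Path(conf_path)
    config = json.loads(path.read_text())
    previous = config.get("num_epochs")
    config["num_epochs"] = num_epochs
    # Note: this rewrites the whole file with 4-space indentation.
    path.write_text(json.dumps(config, indent=4) + "\n")
    return previous


# Only touches the file when run from inside the cloned repository.
conf = Path("config/AASIST-L.conf")
if conf.exists():
    print("previous num_epochs:", set_num_epochs(conf, 10))
```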


### Step 6: Run Training (AASIST-L)

With the dependencies installed, the dataset downloaded, and the modifications from Step 5 applied, we can now start training the lightweight model (AASIST-L).

We use the `main.py` script and specify the configuration file (`AASIST-L.conf`) which now includes the reduced epoch count. The script will automatically use the dataset downloaded into the `./LA` directory by `download_dataset.py`.

In [None]:
!python main.py --config ./config/AASIST-L.conf

2025-04-02 17:13:52.889376: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1743614033.151493    7711 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1743614033.219833    7711 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-04-02 17:13:53.718461: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Device: cuda
no. model params:85306
no. training files: 25380
no. validation files: 24844
Start training epoch000
Sco

### Conclusion

This notebook successfully demonstrated the steps to:
1.  Clone the AASIST repository.
2.  Install necessary dependencies.
3.  Download and prepare the ASVspoof 2019 LA dataset using the provided script.
4.  Run the training process for the AASIST-L model, incorporating necessary code updates (NumPy types) and configuration changes (epoch count) for practical execution in an environment like Colab.

The output shows the model training, with loss decreasing and EER/t-DCF metrics improving over the initial epochs, confirming the setup works correctly.