Luffy-2000/NAFN


Network Intrusion Detection Adaptability under Spurious Labels: A Noise-Aware Few-Shot Class-Incremental Learning Framework

Installation

  1. Place the dataset according to the path defined in dataset_config.py.

  2. It is recommended to use virtualenv to create an isolated Python environment:

    virtualenv venv
    source venv/bin/activate
  3. Install the required packages:

    pip install -r requirements.txt

How To Use It

Navigate to the src directory and execute the main script:

cd src
python3 -u main.py

followed by general options:

  • --gpus: Number of GPUs to use for the experiment (default: 0)
  • --max_epochs: Maximum number of epochs
  • --default_root_dir: Default path for logs and weights if no logger or lightning.pytorch.callbacks.ModelCheckpoint callback is passed
  • --seed: Random seed
  • --is-fscil: Enable Few-Shot Class Incremental (FSCIL) procedure if present; otherwise, use Few-Shot Learning (FSL) procedure
  • --pt-only: Run only the pre-training phase if present
  • --ft-only: Run only the adaptation phase if present
  • --num-tasks: Number of episodes used for the adaptation task (default: 100)
  • --shuffle-classes: Shuffle FSL/FSCIL partition classes based on the seed if present
  • --classes-per-set: Number of classes for pre-training and adaptation (e.g., '7 3' means the first 7 classes are used for pre-training and the next 3 for adaptation, order defined in dataset_config.py). If not set, the 'train_classes' for pre-training and 'test_classes' for adaptation are used, as defined in dataset_config.py.
  • Additional options are those defined by the PyTorch Lightning Trainer class. They can be viewed via:
    python3 -u main.py --help

and specific options for --approach, --dataset, and --network.

Approach Options

LightningTLModule in tl_module.py defines methods for pre-training and fine-tuning along with utility functions. LightningRFS inherits from this class. To run this approach, add the name of the *.py file to --approach.

Generic options in LightningTLModule include:

  • --shots: Number of shots used during fine-tuning (default: 5)
  • --queries: Number of queries used during fine-tuning (default: 5)
  • --lr: Starting learning rate (default: 0.001)
  • --lr-strat: Learning rate scheduler strategy (default: 'none') (choices: ['lrop', 'cawr', 'none'])
  • --scheduler_patience: Reduce LR on plateau (lrop) scheduler patience (default: 10)
  • --scheduler_decay: lrop decay rate (default: 0.1)
  • --t0: Cosine annealing warm restarts (cawr) period (default: 10)
  • --eta-min: cawr minimum LR (default: 1e-5)
  • --ckpt-path: Path to resume a saved PyTorch Lightning module with the .ckpt extension (default: None)
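The 'cawr' strategy restarts a cosine decay every --t0 epochs, annealing from --lr down to --eta-min. A minimal sketch of the per-step learning rate it produces (fixed period, no period multiplier; the actual schedule is PyTorch's CosineAnnealingWarmRestarts wired up inside tl_module.py, and cawr_lr is a hypothetical helper name):

```python
import math

def cawr_lr(epoch, t0, base_lr=1e-3, eta_min=1e-5):
    # Cosine annealing with warm restarts (fixed period T_0):
    # lr = eta_min + (base_lr - eta_min) * (1 + cos(pi * t_cur / T_0)) / 2,
    # where t_cur resets to 0 at every restart.
    t_cur = epoch % t0
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t0)) / 2
```

At each restart (epoch 0, t0, 2*t0, ...) the rate jumps back to --lr and decays toward --eta-min over the following --t0 epochs.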

LightningRFS (rfs.py) implements the RFS algorithm for Few-Shot Learning, as described in "Rethinking Few-Shot Image Classification: a Good Embedding Is All You Need?". Options include:

  • --alpha: Weight for CE loss (default: 0.5)
  • --gamma: Weight for KL loss (default: 0.5)
  • --is-distill: Enable knowledge distillation if present
  • --kd-t: Temperature for knowledge distillation loss (default: 1)
  • --teacher-path: Path to the teacher model (default: None)
  • --base-learner: Type of base learner ('lr' or 'nn') (default: 'nn')
  • --pre-mode: Pre-training mode, choices: none, recon, contrastive, hybrid (default: None)
  • --memory-selector: Memory selection strategy, choices: herding, uncertainty, random (default: None)
  • --noise-label: Enable noisy label training if present
  • --noise-ratio: Noise ratio for noisy labels (0.0-1.0, default: 0.0)
  • --denoising: Denoising method, choices: proto_margin, LOF (default: None)
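As an illustration of the proto_margin idea, the sketch below keeps a support sample only when it lies closer to its own class prototype than to any other prototype by at least a margin. This is a loose reading of the option name; the actual denoiser in this repository may score and threshold samples differently, and proto_margin_filter is a hypothetical name:

```python
import numpy as np

def proto_margin_filter(feats, labels, margin=0.0):
    # Compute one prototype (mean embedding) per class.
    classes = np.unique(labels)
    protos = {c: feats[labels == c].mean(axis=0) for c in classes}
    keep = []
    for x, y in zip(feats, labels):
        d_own = np.linalg.norm(x - protos[y])
        d_other = min(np.linalg.norm(x - protos[c]) for c in classes if c != y)
        # Keep the sample only if its nearest "wrong" prototype is at least
        # `margin` farther away than its own prototype.
        keep.append(d_other - d_own >= margin)
    return np.array(keep)
```

A sample whose label was flipped by noise tends to sit near another class's prototype, so its margin goes negative and it is dropped from the support set.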

Feature synthesis (calibration) options

Hyperparameters for distribution calibration and feature synthesis (see Distribution.py):

  • --calib-gamma: Weight on the new class in the calibrated mean/covariance; the base-class fusion gets the rest. Default: 0.7 (i.e. 30% new class, 70% base fusion when used with --calib-epsilon).
  • --calib-epsilon: Weight on the base-class fusion in the calibrated distribution. Default: 0.7.

Use these for sensitivity experiments, e.g. --calib-gamma 0.5 --calib-epsilon 0.5.
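As a loose sketch of what such a calibration can look like: the novel-class statistics and the fused base-class statistics are mixed, and synthetic features are then sampled from the calibrated Gaussian. The exact roles and normalization of --calib-gamma and --calib-epsilon are defined in Distribution.py; calibrate_stats and synthesize_features are hypothetical names:

```python
import numpy as np

def calibrate_stats(mu_new, cov_new, mu_base, cov_base, gamma=0.7, epsilon=0.7):
    # Illustrative convex combination of novel-class and fused base-class
    # statistics, normalized so the two weights sum to 1 (assumption; see
    # Distribution.py for the real formulation).
    w = gamma + epsilon
    mu = (gamma * mu_new + epsilon * mu_base) / w
    cov = (gamma * cov_new + epsilon * cov_base) / w
    return mu, cov

def synthesize_features(mu, cov, n, seed=0):
    # Sample n synthetic support features from the calibrated Gaussian.
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mu, cov, size=n)
```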

Network Options

The network class defines the architecture model to be used. This class includes methods to manage the model.

Options include:

  • --network: Embedding function to use (default: Lopez17CNN). The network used in the experiments is UNet1D2D.
  • --out-features-size: Feature vector size (default: -1)
  • --weights-path: Path to a *.pt file with weights to initialize the embedding function (default: None)
  • --scale: Scaling factor for the neural network backbone (default: 1)

Dataset Options

Datasets are defined in dataset_config.py. Each key is considered a dataset name.

Options include:

  • --dataset: Dataset to use (default: 'iot_nid')
  • --num-pkts: Number of packets to select from the dataset (default: None)
  • --fields: Packet field(s) used (default: [], choices: ['PL', 'IAT', 'DIR', 'WIN', 'FLG', 'PL_DIR'])
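A hypothetical shape for one entry in dataset_config.py, matching the options above (the real file may use different key names; the path and class splits here are illustrative):

```python
# Each top-level key is treated as a dataset name selectable via --dataset.
DATASETS = {
    "iot_nid": {
        "data_path": "../data/iot_nid",      # where step 1 of Installation places the files
        "fields": ["PL", "IAT", "DIR", "WIN", "FLG", "PL_DIR"],
        "train_classes": list(range(7)),     # used for pre-training
        "test_classes": list(range(7, 10)),  # used for adaptation
    },
}
```

With --classes-per-set 7 3, the first 7 classes in the configured order replace train_classes and the next 3 replace test_classes.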

Early-Stopping Options

Early-stopping parameters are handled by EarlyStoppingDoubleMetric in callbacks.py, wrapping PyTorch Lightning's EarlyStopping to add features like logging and double monitored metrics.

Options include:

  • --monitor: Metric to monitor for early stopping (default: 'valid_accuracy')
  • --min_delta: Early-stopping minimum delta (default: 0.01)
  • --patience: Early-stopping patience (default: 17)
  • --mode: Early-stopping mode (default: 'auto')
  • --double-monitor: Monitor both valid_accuracy and valid_loss if present
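The --double-monitor behavior can be sketched as follows: training stops only when neither valid_accuracy nor valid_loss has improved by min_delta within patience epochs. This is illustrative only; the real logic lives in EarlyStoppingDoubleMetric in callbacks.py, which wraps PyTorch Lightning's EarlyStopping:

```python
class DoubleMetricEarlyStopping:
    # Illustrative double-metric early stopping (hypothetical class name).
    def __init__(self, min_delta=0.01, patience=17):
        self.min_delta, self.patience = min_delta, patience
        self.best_acc, self.best_loss = float("-inf"), float("inf")
        self.wait = 0

    def step(self, valid_accuracy, valid_loss):
        # An epoch counts as an improvement if EITHER metric beats its best
        # value by more than min_delta.
        improved = False
        if valid_accuracy > self.best_acc + self.min_delta:
            self.best_acc, improved = valid_accuracy, True
        if valid_loss < self.best_loss - self.min_delta:
            self.best_loss, improved = valid_loss, True
        self.wait = 0 if improved else self.wait + 1
        return self.wait >= self.patience  # True => stop training
```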

Project Structure

Approaches

In src/approach, you can find the PyTorch Lightning implementation of RFS.

Modules

In src/modules, you can find implementations of various nn.Module components (e.g., losses, teacher).

Data

In src/data, several classes act as wrappers for the PyTorch Lightning dataloader modules.

Trainers

In src/trainers, there is a trainer class that acts as a wrapper for the PyTorch Lightning trainer, extending its functionality to manage custom logic (e.g., FSCIL).

Utils

In src/util, you can find various utility functions for managing the seed and RNG state, the logger, and callbacks.

Networks

In src/networks, you can find implementations of embedding functions and the logic needed to manage them.

Execution of Experiments

Step 1: Train the Teacher

To train the teacher model, you need to specify the pre-training mode and memory selector. Example for iot_nid dataset with nn classifier:

python3 main.py --is-fscil --dataset iot_nid --fields PL IAT DIR WIN FLG TTL --num-pkts 20 --shots 10 --queries 40 --gpus 1 --num-tasks 100 --max_epochs 100 --seed 0 --approach rfs --patience 20 --monitor valid_accuracy --min_delta 0.001 --mode max --double-monitor --lr 0.0001 --lr-strat none --classes-per-set 7 3 --default_root_dir ../save_files/results_rfs_teacher_allpre/results_rfs_teacher_iot_nid_10shot_contrastive_nn_uncertainty --network UNet1D2D --base-learner nn --pre-mode contrastive --memory-selector uncertainty

Step 2: Train the Student with Noisy Labels and Denoising

When training the student model, knowledge distillation, noisy labels, and denoising methods are used. The teacher model path is automatically found from the training results directory.

Example for iot_nid dataset with nn classifier, noise ratio 0.1, and proto_margin denoising:

python3 main.py --is-fscil --dataset iot_nid --fields PL IAT DIR WIN FLG TTL --num-pkts 20 --shots 9 --queries 40 --gpus 1 --num-tasks 100 --max_epochs 100 --seed 0 --approach rfs --patience 20 --monitor valid_accuracy --min_delta 0.001 --mode max --double-monitor --lr 0.0001 --lr-strat none --classes-per-set 7 3 --default_root_dir ../save_files/results_rfs_student/results_rfs_student_iot_nid_9shot_contrastive_nn_uncertainty_noise_0.1 --network UNet1D2D --base-learner nn --kd-t 1 --teacher-path ../save_files/results_rfs_teacher_allpre/results_rfs_teacher_iot_nid_10shot_contrastive_nn_uncertainty/lightning_logs/version_0/distill_models/teacher_ep100.pt --is-distill --memory-selector uncertainty --noise-label --noise-ratio 0.1 --denoising proto_margin

Key parameters for noisy label training and denoising:

  • --noise-label: Enable noisy label training
  • --noise-ratio: Noise ratio for noisy labels (0.0-1.0), e.g., 0.1 means 10% of labels are noisy
  • --denoising: Denoising method, choices: proto_margin, LOF
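A minimal sketch of how a symmetric --noise-ratio corruption can be applied to support labels: a fixed fraction of samples is chosen at random and each is relabeled with a different, uniformly drawn class. This is illustrative; inject_symmetric_noise is a hypothetical name, and the repository's actual noise model may differ:

```python
import numpy as np

def inject_symmetric_noise(labels, noise_ratio, num_classes, seed=0):
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).copy()
    # Number of labels to corrupt, e.g. 10% of the set for --noise-ratio 0.1.
    n_noisy = int(round(noise_ratio * len(labels)))
    idx = rng.choice(len(labels), size=n_noisy, replace=False)
    for i in idx:
        # Replace with a different class, drawn uniformly.
        choices = [c for c in range(num_classes) if c != labels[i]]
        labels[i] = rng.choice(choices)
    return labels
```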

Using the Automation Script

You can use the run_experiments.py script to run experiments in batch with different noise ratios:

cd src
python3 run_experiments.py --datasets iot_nid --shots 9 8 7 6 5 --classifier nn --noise-ratios 0.1 0.2 0.3 0.4 0.5

The script automatically selects the optimal pre-training mode and memory selector based on the dataset configuration, and handles teacher model path lookup.

Acknowledgement

We thank the authors of the following open-source implementations used in this work:

Citation

@article{xx,
  title = 
}
