Network Intrusion Detection Adaptability under Spurious Labels: A Noise-Aware Few-Shot Class-Incremental Learning Framework
- Place the dataset according to the path defined in `dataset_config.py`.
- It is recommended to use `virtualenv` to create an isolated Python environment:

  ```
  virtualenv venv
  source venv/bin/activate
  ```

- Install the required packages:

  ```
  pip install -r requirements.txt
  ```
Navigate to the `src` directory and execute the main script:

```
cd src
python3 -u main.py
```

followed by general options:
- `--gpus`: Number of GPUs to use for the experiment (default: 0)
- `--max_epochs`: Maximum number of epochs
- `--default_root_dir`: Default path for logs and weights if no logger or `lightning.pytorch.callbacks.ModelCheckpoint` callback is passed
- `--seed`: Random seed
- `--is-fscil`: Enable the Few-Shot Class-Incremental Learning (FSCIL) procedure if present; otherwise, the Few-Shot Learning (FSL) procedure is used
- `--pt-only`: Run only the pre-training phase if present
- `--ft-only`: Run only the adaptation phase if present
- `--num-tasks`: Number of episodes used for the adaptation task (default: 100)
- `--shuffle-classes`: Shuffle the FSL/FSCIL partition classes based on the seed if present
- `--classes-per-set`: Number of classes for pre-training and adaptation (e.g., '7 3' means the first 7 classes are used for pre-training and the next 3 for adaptation, with the class order defined in `dataset_config.py`). If not set, the `train_classes` defined in `dataset_config.py` are used for pre-training and the `test_classes` for adaptation.

Additional options are those defined by the PyTorch Lightning `Trainer` class. They can be viewed, together with the options specific to `--approach`, `--datasets`, and `--network`, via:

```
python3 -u main.py --help
```
`LightningTLModule` in `tl_module.py` defines methods for pre-training and fine-tuning, along with utility functions. `LightningRFS` inherits from this class. To run this approach, pass the name of the `*.py` file to `--approach`.
Generic options in LightningTLModule include:
- `--shots`: Number of shots used during fine-tuning (default: 5)
- `--queries`: Number of queries used during fine-tuning (default: 5)
- `--lr`: Starting learning rate (default: 0.001)
- `--lr-strat`: Learning rate scheduler strategy (default: 'none'; choices: ['lrop', 'cawr', 'none'])
- `--scheduler_patience`: Reduce-LR-on-plateau (lrop) scheduler patience (default: 10)
- `--scheduler_decay`: lrop decay rate (default: 0.1)
- `--t0`: Cosine annealing warm restarts (cawr) period (default: 10)
- `--eta-min`: cawr minimum LR (default: 1e-5)
- `--ckpt-path`: Path to resume a saved PyTorch Lightning module with the `.ckpt` extension (default: None)
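For intuition, the `cawr` strategy restarts a cosine-shaped decay every `--t0` epochs, falling from the starting `--lr` toward `--eta-min` and then jumping back. A minimal sketch of that schedule, assuming a fixed restart period (the `cawr_lr` helper is illustrative, not part of the codebase):

```python
import math

def cawr_lr(epoch, base_lr=0.001, t0=10, eta_min=1e-5):
    # Cosine annealing with warm restarts: within each t0-epoch cycle the LR
    # decays from base_lr toward eta_min, then restarts at base_lr.
    t_cur = epoch % t0
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t0))

print(cawr_lr(0))   # start of a cycle: the full base LR
print(cawr_lr(10))  # warm restart: back to the base LR
```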
`LightningRFS` (`rfs.py`) implements the RFS algorithm for Few-Shot Learning, as described in "Rethinking Few-Shot Image Classification: a Good Embedding Is All You Need?". Options include:
- `--alpha`: Weight for the CE loss (default: 0.5)
- `--gamma`: Weight for the KL loss (default: 0.5)
- `--is-distill`: Enable knowledge distillation if present
- `--kd-t`: Temperature for the knowledge distillation loss (default: 1)
- `--teacher-path`: Path to the teacher model (default: None)
- `--base-learner`: Type of base learner ('lr' or 'nn') (default: 'nn')
- `--pre-mode`: Pre-training mode (choices: `none`, `recon`, `contrastive`, `hybrid`; default: None)
- `--memory-selector`: Memory selection strategy (choices: `herding`, `uncertainty`, `random`; default: None)
- `--noise-label`: Enable noisy label training if present
- `--noise-ratio`: Noise ratio for noisy labels (0.0-1.0; default: 0.0)
- `--denoising`: Denoising method (choices: `proto_margin`, `LOF`; default: None)
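For reference, a distillation objective with these weights conventionally combines a cross-entropy term and a temperature-softened KL term as `alpha * CE + gamma * kd_t**2 * KL`. A self-contained sketch under that assumption (the helper names are hypothetical; the exact formulation lives in `rfs.py`):

```python
import math

def softmax(logits, t=1.0):
    # Temperature-scaled softmax over a list of logits.
    exps = [math.exp(z / t) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def rfs_distill_loss(student_logits, teacher_logits, target,
                     alpha=0.5, gamma=0.5, kd_t=1.0):
    # Cross-entropy of the student on the hard label...
    ce = -math.log(softmax(student_logits)[target])
    # ...plus KL(teacher || student) on temperature-softened distributions,
    # rescaled by kd_t**2 to keep gradient magnitudes comparable.
    p_s = softmax(student_logits, kd_t)
    p_t = softmax(teacher_logits, kd_t)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s))
    return alpha * ce + gamma * (kd_t ** 2) * kl
```

When the student already matches the teacher, the KL term vanishes and only the weighted cross-entropy remains.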
Hyperparameters for distribution calibration and feature synthesis (see `Distribution.py`):

- `--calib-gamma`: Weight on the new class in the calibrated mean/covariance; the base-class fusion gets the rest. Default: 0.7 (i.e., 30% new class, 70% base fusion when used with `--calib-epsilon`).
- `--calib-epsilon`: Weight on the base-class fusion in the calibrated distribution. Default: 0.7.

Use these for sensitivity experiments, e.g. `--calib-gamma 0.5 --calib-epsilon 0.5`.
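As a rough illustration of how such a convex combination behaves, here is one plausible reading of the flags above (the `calibrate_mean` helper is hypothetical; check `Distribution.py` for the exact weighting actually used):

```python
def calibrate_mean(mu_new, mu_base_fusion, calib_gamma=0.7):
    # Convex combination of the novel-class mean and the fused base-class
    # mean; with calib_gamma = 0.7 the result is 30% new class and 70%
    # base fusion, matching the description above.
    w_new = 1.0 - calib_gamma
    return [w_new * n + calib_gamma * b for n, b in zip(mu_new, mu_base_fusion)]
```

The same blending would apply elementwise to the covariance; sweeping `--calib-gamma 0.5 --calib-epsilon 0.5` makes the mix an even split for sensitivity analysis.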
The network class defines the architecture model to be used. This class includes methods to manage the model.
Options include:
- `--network`: Embedding function to use (default: `Lopez17CNN`). The network used in the experiments is `UNet1D2D`.
- `--out-features-size`: Feature vector size (default: -1)
- `--weights-path`: Path to a `*.pt` file with weights to initialize the embedding function (default: None)
- `--scale`: Scaling factor for the neural network backbone (default: 1)
Datasets are defined in `dataset_config.py`. Each key is treated as a dataset name.
Options include:
- `--dataset`: Dataset to use (default: 'iot_nid')
- `--num-pkts`: Number of packets to select from the dataset (default: None)
- `--fields`: Packet field(s) used (default: []; choices: ['PL', 'IAT', 'DIR', 'WIN', 'FLG', 'PL_DIR'])
Early-stopping parameters are handled by `EarlyStoppingDoubleMetric` in `callbacks.py`, which wraps PyTorch Lightning's `EarlyStopping` to add extra logging and support for a second monitored metric.
Options include:
- `--monitor`: Metric to monitor for early stopping (default: 'valid_accuracy')
- `--min_delta`: Early-stopping minimum delta (default: 0.01)
- `--patience`: Early-stopping patience (default: 17)
- `--mode`: Early-stopping mode (default: 'auto')
- `--double-monitor`: Monitor both `valid_accuracy` and `valid_loss` if present
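The double-monitor behaviour can be pictured as stopping only when neither metric has improved by `min_delta` within the patience window. A simplified pure-Python sketch of that assumed logic (`EarlyStoppingDoubleMetric` in `callbacks.py` is the authoritative implementation):

```python
def should_stop(acc_history, loss_history, min_delta=0.01, patience=17):
    # Stop only if, over the last `patience` epochs, validation accuracy has
    # not improved by min_delta AND validation loss has not decreased by
    # min_delta, relative to the best values seen before the window.
    if len(acc_history) <= patience:
        return False
    best_acc = max(acc_history[:-patience])
    best_loss = min(loss_history[:-patience])
    acc_stalled = max(acc_history[-patience:]) < best_acc + min_delta
    loss_stalled = min(loss_history[-patience:]) > best_loss - min_delta
    return acc_stalled and loss_stalled
```

With only one metric monitored, either condition alone would trigger the stop; requiring both makes the criterion more conservative.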
- In `src/approach`, you can find the PyTorch Lightning implementation of RFS.
- In `src/modules`, you can find implementations of various `nn.Module` components (e.g., losses, the teacher).
- In `src/data`, several classes act as wrappers for the PyTorch Lightning dataloader modules.
- In `src/trainers`, a trainer class wraps the PyTorch Lightning trainer, extending its functionality to manage custom logic (e.g., FSCIL).
- In `src/util`, you can find utility functions for managing the seed and RNG state, the logger, and callbacks.
- In `src/networks`, you can find implementations of embedding functions and the logic needed to manage them.
To train the teacher model, you need to specify the pre-training mode and memory selector. Example for iot_nid dataset with nn classifier:
```
python3 main.py --is-fscil --dataset iot_nid --fields PL IAT DIR WIN FLG TTL --num-pkts 20 --shots 10 --queries 40 --gpus 1 --num-tasks 100 --max_epochs 100 --seed 0 --approach rfs --patience 20 --monitor valid_accuracy --min_delta 0.001 --mode max --double-monitor --lr 0.0001 --lr-strat none --classes-per-set 7 3 --default_root_dir ../save_files/results_rfs_teacher_allpre/results_rfs_teacher_iot_nid_10shot_contrastive_nn_uncertainty --network UNet1D2D --base-learner nn --pre-mode contrastive --memory-selector uncertainty
```

When training the student model, knowledge distillation, noisy labels, and denoising methods are used. The teacher model path is automatically found from the training results directory.
Example for iot_nid dataset with nn classifier, noise ratio 0.1, and proto_margin denoising:
```
python3 main.py --is-fscil --dataset iot_nid --fields PL IAT DIR WIN FLG TTL --num-pkts 20 --shots 9 --queries 40 --gpus 1 --num-tasks 100 --max_epochs 100 --seed 0 --approach rfs --patience 20 --monitor valid_accuracy --min_delta 0.001 --mode max --double-monitor --lr 0.0001 --lr-strat none --classes-per-set 7 3 --default_root_dir ../save_files/results_rfs_student/results_rfs_student_iot_nid_9shot_contrastive_nn_uncertainty_noise_0.1 --network UNet1D2D --base-learner nn --kd-t 1 --teacher-path ../save_files/results_rfs_teacher_allpre/results_rfs_teacher_iot_nid_10shot_contrastive_nn_uncertainty/lightning_logs/version_0/distill_models/teacher_ep100.pt --is-distill --memory-selector uncertainty --noise-label --noise-ratio 0.1 --denoising proto_margin
```

Key parameters for noisy label training and denoising:
- `--noise-label`: Enable noisy label training
- `--noise-ratio`: Noise ratio for noisy labels (0.0-1.0); e.g., 0.1 means 10% of labels are noisy
- `--denoising`: Denoising method (choices: `proto_margin`, `LOF`)
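For intuition, symmetric label noise at a given `--noise-ratio` can be sketched as flipping that fraction of labels to a different random class. This is an illustrative stand-in, not the repository's exact injection routine:

```python
import random

def inject_label_noise(labels, num_classes, noise_ratio=0.1, seed=0):
    # Flip a noise_ratio fraction of labels to a uniformly chosen
    # *different* class, so exactly that fraction becomes incorrect.
    rng = random.Random(seed)
    noisy = list(labels)
    n_noisy = int(round(noise_ratio * len(labels)))
    for i in rng.sample(range(len(labels)), n_noisy):
        noisy[i] = rng.choice([c for c in range(num_classes) if c != noisy[i]])
    return noisy
```

The denoising methods then try to detect and discard such flipped samples, e.g. by their distance to class prototypes (`proto_margin`) or by local outlier factor (`LOF`).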
You can use the `run_experiments.py` script to run experiments in batch with different noise ratios:

```
cd src
python3 run_experiments.py --datasets iot_nid --shots 9 8 7 6 5 --classifier nn --noise-ratios 0.1 0.2 0.3 0.4 0.5
```

The script automatically selects the optimal pre-training mode and memory selector based on the dataset configuration, and handles teacher model path lookup.
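Conceptually, the batch script expands the Cartesian product of the swept shot counts and noise ratios into individual runs, roughly as follows (a sketch only; the remaining flags are resolved by `run_experiments.py` from the dataset configuration):

```python
from itertools import product

shots = [9, 8, 7, 6, 5]
noise_ratios = [0.1, 0.2, 0.3, 0.4, 0.5]

# One (shots, noise_ratio) pair per run; the other flags (pre-training mode,
# memory selector, teacher path) are filled in by the batch script.
runs = [{"shots": s, "noise_ratio": r} for s, r in product(shots, noise_ratios)]
print(len(runs))  # 25 runs for this sweep
```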
We thank the following open-source implementations that were used in this work:
```
@article{xx,
  title =
}
```