This code base was used for the GZHU team's submission to DCASE2023 Task 1: Low-Complexity Acoustic Scene Classification. The corresponding technical report is available from DCASE2023.
The code framework is based on PANNs, and the pre-trained model we used, ResNet38, can be downloaded from the PANNs repository. This code runs in the same environment as PANNs.
First, use CPJKU's open-source code to reassemble the 1 s audio clips into 10 s audio files.
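The reassembly step amounts to concatenating consecutive 1 s segments back into their original 10 s file. A minimal sketch (the helper name and the 32 kHz sample rate are placeholders, not taken from the CPJKU code, which also restores the original file grouping from the dataset metadata):

```python
import numpy as np

def reassemble_clips(clips):
    """Concatenate consecutive 1 s segments back into one waveform.

    Hypothetical helper -- the actual CPJKU code also looks up which
    clips belong to the same original recording.
    """
    return np.concatenate(clips)

# Ten dummy 1 s clips at a placeholder sample rate of 32 kHz.
sr = 32000
clips = [np.zeros(sr, dtype=np.float32) for _ in range(10)]
audio = reassemble_clips(clips)
print(audio.shape)  # (320000,)
```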
MicIRP needs to be downloaded as the source of microphone impulse responses (MIRs); during training these are convolved with the audio to simulate different recording devices. Then write the paths of all MIRs into an scp file and update the MIR.scp path in the corresponding Dataset class in data_generator.py.
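Generating the scp file and applying an impulse response can be sketched as follows. This is a minimal illustration: the directory layout, function names, and peak normalization are assumptions, not the repository's actual implementation.

```python
import glob
import numpy as np

def write_mir_scp(mir_dir, scp_path="MIR.scp"):
    """Write one MIR wav path per line into an scp file."""
    with open(scp_path, "w") as f:
        for wav in sorted(glob.glob(f"{mir_dir}/*.wav")):
            f.write(wav + "\n")

def apply_mir(audio, ir):
    """Convolve audio with a microphone impulse response to simulate
    a device, truncating back to the original length."""
    out = np.convolve(audio, ir)[: len(audio)]
    # Renormalize to avoid clipping after convolution.
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out

# Dummy check: a unit-impulse IR leaves the signal shape unchanged.
audio = np.random.randn(1000).astype(np.float32)
ir = np.zeros(64, dtype=np.float32)
ir[0] = 1.0
simulated = apply_mir(audio, ir)
print(simulated.shape)  # (1000,)
```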
First, store the meta information in an scp file with spaces as separators, with fields in the order: audio name, label, path, device label. Then modify "scp_path" and "workspace" in h5.sh to the corresponding paths on your file system, where workspace is the directory in which the h5 files are stored. Then run
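A short sketch of writing and parsing such a meta scp file (the file names, paths, and device labels below are made-up placeholders):

```python
# Each line: <audio name> <label> <path> <device label>, space-separated.
rows = [
    ("a001.wav", "airport", "/data/audio/a001.wav", "a"),
    ("b002.wav", "bus", "/data/audio/b002.wav", "s1"),
]
with open("meta.scp", "w") as f:
    for name, label, path, device in rows:
        f.write(f"{name} {label} {path} {device}\n")

# Reading it back, one record per line:
with open("meta.scp") as f:
    for line in f:
        name, label, path, device = line.split()
        print(name, label, path, device)
```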
sh scripts/2_dcase_with_device_h5.sh
After the HDF5 files are created, you can enter the training phase. First, create a workspace for your training logs, models, and other outputs. Make sure "full_train.h5" and "eval.h5" are stored in "workspace/indexes", and change "WORKSPACE" in each "*_train.sh" script to the path you created.
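The resulting workspace would then look roughly like this (the checkpoints and logs directory names are assumptions based on the description above; only "indexes" and its h5 files are stated explicitly):

```
workspace/
├── indexes/
│   ├── full_train.h5
│   └── eval.h5
├── checkpoints/   # saved models
└── logs/          # training logs
```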
If you wish to train the target model alone, then run
sh scripts/dcase_train.sh
To train the teacher model alone, run
sh scripts/fineturn_train.sh
To train using deep mutual learning (DML), run
sh scripts/dml_train.sh
To fine-tune using knowledge distillation after DML training, run
sh scripts/kd_fineturn.sh