Epitopological Learning and Cannistraci-Hebb Network Shape Intelligence Brain-Inspired Theory for Ultra-Sparse Advantage in Deep Learning
Yingtao Zhang1,2,3, Jialin Zhao1,2,3, Wenjing Wu1,2,3, Alessandro Muscoloni1,2,4 & Carlo Vittorio Cannistraci1,2,3,4
1 Center for Complex Network Intelligence (CCNI)
2 Tsinghua Laboratory of Brain and Intelligence (THBI)
3 Department of Computer Science
4 Department of Biomedical Engineering
Tsinghua University, Beijing, China
Brain-inspired Sparse Training in MLP and Transformers with Network Science Modeling via Cannistraci-Hebb Soft Rule
Yingtao Zhang1,2,4, Jialin Zhao1,2,4, Ziheng Liao1,2,4, Wenjing Wu1,2,4, Umberto Michieli5 & Carlo Vittorio Cannistraci1,2,3,4
1 Center for Complex Network Intelligence (CCNI), Tsinghua Laboratory of Brain and Intelligence (THBI)
2 Department of Computer Science
3 Department of Biomedical Engineering
4 Tsinghua University, Beijing, China
5 University of Padova, Italy
- Create a new conda environment:
conda create -n chts python=3.10
conda activate chts
- Install relevant packages:
pip install -r requirements.txt
- Compile the Python/C extension code:
python setup.py build_ext --inplace
Navigate to the MLP/CNN directory:
cd mlp_and_cnn
- Train with CHT (CH3_L3 regrowth):
python run.py --batch_size 32 --dataset EMNIST --network_structure mlp --weight_decay 5e-04 --regrow_method CH3_L3 --init_mode swi --update_mode zero --bias --linearlr --epochs 100 --learning_rate 0.025 --cuda_device 3 --dim 2 --update_interval 1 --reset_parameters --self_correlated_sparse --no_log --chain_removal --zeta 0.3 --remove_method weight_magnitude --seed 0 --sparsity 0.99 --early_stop
- Train with CHTs (CH3_L3_soft regrowth):
python run.py --batch_size 32 --dataset EMNIST --network_structure mlp --weight_decay 5e-04 --regrow_method CH3_L3_soft --init_mode swi --update_mode zero --bias --linearlr --epochs 100 --learning_rate 0.025 --cuda_device 3 --dim 2 --update_interval 1 --reset_parameters --self_correlated_sparse --no_log --chain_removal --zeta 0.3 --remove_method weight_magnitude_soft --seed 0 --sparsity 0.99 --T_decay linear
Note:
- --remove_method can be chosen from: weight_magnitude, weight_magnitude_soft, ri, ri_soft
- --self_correlated_sparse means using the Correlated Sparse Topological Initialization
- For the Bipartite Small World (BSW) model, activate --WS --beta $YOUR_BETA_VALUE
- For the Bipartite Scale-Free (BSF) model, activate --BA
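For example, the BSW topology can be activated by appending the flags above to a training command. The sketch below is illustrative only: the beta value 0.5 is a hypothetical choice, not a recommended setting from the repository.

```shell
# Sketch: CHT training on a Bipartite Small World (BSW) topology.
# --beta 0.5 is an illustrative value, not a tuned recommendation.
python run.py --batch_size 32 --dataset EMNIST --network_structure mlp \
    --regrow_method CH3_L3 --init_mode swi --sparsity 0.99 \
    --remove_method weight_magnitude --seed 0 \
    --WS --beta 0.5
# For the BSF model, replace "--WS --beta 0.5" with "--BA".
```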
- Train a baseline with random regrowth:
python run.py --batch_size 32 --dataset EMNIST --network_structure mlp --weight_decay 5e-04 --regrow_method random --init_mode kaiming --update_mode zero --bias --linearlr --epochs 100 --learning_rate 0.025 --cuda_device 3 --dim 2 --update_interval 1 --reset_parameters --no_log --zeta 0.3 --remove_method weight_magnitude --seed 0 --sparsity 0.99
Navigate to the Transformer directory:
cd Transformer
- Download and tokenize the dataset:
cd data/iwslt14/
bash prepare-iwslt14.sh
- Preprocess the dataset:
cd ../../
bash preprocess.iwslt14.sh
- Train the model:
- Fully connected network:
bash iwslt_FC.sh
- CHTs:
bash iwslt_CHTs.sh
- SET:
bash iwslt_SET.sh
- Evaluate the model:
bash eval_iwslt.sh ${beam_size} ${model_path}
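For instance, evaluation with a beam size of 5 might look like the following; both the beam size and the checkpoint path are hypothetical placeholder values, not defaults from the repository.

```shell
# Sketch: evaluate a trained model with beam size 5.
# "checkpoints/chts_iwslt14/checkpoint_best.pt" is a hypothetical path.
bash eval_iwslt.sh 5 checkpoints/chts_iwslt14/checkpoint_best.pt
```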
Download and preprocess Multi-30k:
python preprocess.py -train_src data/Multi30k/train.en -train_tgt data/Multi30k/train.de -valid_src data/Multi30k/val.en -valid_tgt data/Multi30k/val.de -save_data data/Multi30k/processed.noshare -src_seq_length 256 -tgt_seq_length 256 -src_vocab_size 40000 -tgt_vocab_size 40000
Download and preprocess WMT17:
cd data/wmt17/
bash prepare-wmt14.sh
cd ../../
bash preprocess.wmt17.sh
If you use our code, please consider citing:
@inproceedings{
zhang2024epitopological,
title={Epitopological learning and Cannistraci-Hebb network shape intelligence brain-inspired theory for ultra-sparse advantage in deep learning},
author={Yingtao Zhang and Jialin Zhao and Wenjing Wu and Alessandro Muscoloni and Carlo Vittorio Cannistraci},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=iayEcORsGd}
}
@article{202406.1136,
doi = {10.20944/preprints202406.1136.v1},
url = {https://doi.org/10.20944/preprints202406.1136.v1},
year = 2024,
month = {June},
publisher = {Preprints},
author = {Yingtao Zhang and Jialin Zhao and Ziheng Liao and Wenjing Wu and Umberto Michieli and Carlo Vittorio Cannistraci},
title = {Brain-Inspired Sparse Training in MLP and Transformers with Network Science Modeling via Cannistraci-Hebb Soft Rule},
journal = {Preprints}
}