adding efficientnet_el, efficientnet_es_pruned and efficientnet_el_pruned pre-trained models #502
Dense Models
EfficientNet-ES (EdgeTPU-Small) and EfficientNet-EL (EdgeTPU-Large) were trained on 8 Quadro RTX 8000 GPUs using the pytorch-image-models repo.
The training-script hyperparameters are derived from training_hparam_examples:
./distributed_train.sh 8 /imagenet --model efficientnet_es -b 128 --sched step --epochs 450 --decay-epochs 2.4 --decay-rate .97 --opt rmsproptf --opt-eps .001 -j 8 --warmup-lr 1e-6 --weight-decay 1e-5 --drop 0.2 --drop-connect 0.2 --aa rand-m9-mstd0.5 --remode pixel --reprob 0.2 --amp --lr .064
./distributed_train.sh 8 /imagenet --model efficientnet_el -b 128 --sched step --epochs 450 --decay-epochs 2.4 --decay-rate .97 --opt rmsproptf --opt-eps .001 -j 8 --warmup-lr 1e-6 --weight-decay 1e-5 --drop 0.2 --drop-connect 0.2 --aa rand-m9-mstd0.5 --remode pixel --reprob 0.2 --amp --lr .064
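Both commands use a step LR schedule (`--sched step --decay-epochs 2.4 --decay-rate .97 --lr .064`) with a warmup starting from `--warmup-lr 1e-6`. A minimal sketch of the resulting per-epoch learning rate (the warmup length is not given in the commands, so the 3-epoch value below is an illustrative assumption, as are the function and parameter names):

```python
def step_lr(base_lr, epoch, decay_epochs=2.4, decay_rate=0.97,
            warmup_epochs=3, warmup_lr=1e-6):
    """Sketch of the step schedule implied by the training commands:
    linear warmup to base_lr, then decay by decay_rate every decay_epochs."""
    if epoch < warmup_epochs:
        # linear warmup from warmup_lr up to base_lr
        return warmup_lr + (base_lr - warmup_lr) * epoch / warmup_epochs
    # number of completed decay steps since warmup ended
    steps = (epoch - warmup_epochs) // decay_epochs
    return base_lr * decay_rate ** steps

print(step_lr(0.064, 0))    # warmup start
print(step_lr(0.064, 3))    # warmup done, full base LR
print(step_lr(0.064, 450))  # near end of the 450-epoch run
```

With `--decay-rate .97` applied every 2.4 epochs, the LR decays smoothly but slowly, which is why such long (450-epoch) runs are used.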
The EfficientNet-ES accuracies are slightly lower than those reported in training_hparam_examples, since we took only the single best checkpoint rather than averaging the 8 best checkpoints. I kept our EfficientNet-ES results for completeness.
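The averaging referred to above combines the weights of the n best checkpoints elementwise (timm ships an avg_checkpoints.py script for this on torch state dicts). A plain-Python sketch of the idea, with hypothetical checkpoint dicts of parameter lists standing in for real state dicts:

```python
def average_checkpoints(state_dicts):
    """Elementwise average of parameter values across several checkpoints.
    Plain-Python sketch; real checkpoints hold torch tensors, not lists."""
    n = len(state_dicts)
    return {
        key: [sum(vals) / n for vals in zip(*(sd[key] for sd in state_dicts))]
        for key in state_dicts[0]
    }

# toy example: two "checkpoints" of a single 3-element weight vector
ckpts = [{"w": [1.0, 2.0, 3.0]}, {"w": [3.0, 2.0, 1.0]}]
print(average_checkpoints(ckpts))
```

Averaging several good checkpoints typically smooths out noise from the final training epochs, which is where the small accuracy gap comes from.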
Pruned Models
The pruning is done with the DG_Prune submodule; the pruning code is available in the DG branch of my fork of pytorch-image-models.
Pruning follows the lottery ticket hypothesis (LTH) algorithm, and the pruning hyperparameters are provided in the JSON files attached to the DeGirum/pruned-models efficientnet release.
The training hyperparameters are identical to those used for the dense training.
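The actual pruning lives in DG_Prune; for readers unfamiliar with LTH, here is a minimal plain-Python sketch of one lottery-ticket round under the standard recipe (prune by trained-weight magnitude, then rewind surviving weights to their initial values). All names and the flat-list weight representation are illustrative, not the DG_Prune API:

```python
def magnitude_mask(weights, sparsity):
    """Keep the largest-magnitude (1 - sparsity) fraction of weights."""
    k = int(len(weights) * sparsity)  # number of weights to prune
    if k >= len(weights):
        return [0] * len(weights)
    threshold = sorted(abs(w) for w in weights)[k]
    return [1 if abs(w) >= threshold else 0 for w in weights]

def lth_round(init_weights, trained_weights, sparsity):
    """One LTH round: mask by trained magnitude, rewind survivors to init."""
    mask = magnitude_mask(trained_weights, sparsity)
    rewound = [w0 * m for w0, m in zip(init_weights, mask)]
    return rewound, mask

init = [0.5, -0.2, 0.9, 0.1]     # weights at initialization
trained = [1.2, -0.05, 2.0, 0.3]  # same weights after training
rewound, mask = lth_round(init, trained, sparsity=0.5)
print(mask, rewound)
```

In the full iterative scheme this round is repeated, retraining the rewound sparse network each time, until the target sparsity is reached.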
Results