Normalization-Test

Comparison normalization to pre-process images when using a pre-trained model

Motivation

It is generally known that when using a pre-trained model for fine-tuning, the image for fine-tuning is normalized using the statistics of the data to be used for fine-tuning. This is because the data for fine-tuning is different from the distribution of the data used for pre-training. However, sometimes the data to be fine-tuned is normalized by the statistics of the data used for pre-training, and the performance is better or similar. Therefore, I would like to experiment to see in which cases this difference occurs.

Experiments

Baseline

In this experiments, I used different types of CNN models, ResNet and DenseNet, and varied the size of the models to use them as baseline models.

model	#Params
ResNet18	11M
ResNet50	24M
DenseNet121	7M
DenseNet161	27M

Dataset

We use four datasets for our experiments. We also use the mean and standard deviation to check the statistics after normalization, or the mean and standard deviation of ImageNet-1k used for model pre-training.

< Table 1. Dataset Description. >

Dataset	#Images(trainset)	#Images(testset)	#Classes	Image Size	Fine-tune stats.	ImageNet-1k stats.
CIFAR10	50,000	10,000	10	32 X 32 X 3	mean: (0.00, 0.00, 0.00) std : (1.00, 1.00, 1.00)	mean: (0.03, 0.12, 0.18) std : (1.08, 1.09, 1.16)
CIFAR100	50,000	10,000	100	32 X 32 X 3	mean: (0.00, 0.00, 0.00) std : (1.00, 1.00, 1.00)	mean: (0.10, 0.14, 0.16) std : (1.17, 1.15, 1.23)
SVHN	73,257	26,032	10	32 X 32 X 3	mean: (0.00, 0.00, 0.00) std : (1.00, 1.00, 1.00)	mean: (-0.21, -0.06, 0.30) std : ( 0.87, 0.90, 0.88)
Tiny ImageNet-200	100,000	10,000	200	64 X 64 X 3	mean: (0.00, 0.00, 0.00) std : (1.00, 1.00, 1.00)	mean: (-0.38, -0.04, 0.33) std : ( 1.23, 1.20, 1.23)

Training Settings

All models are trained with ./default_configs.

SEED: 223

DATASET:
  datadir: /datasets

OPTIMIZER:
  opt_name: Adam
  lr: 0.001

TRAINING:
  batch_size: 256
  test_batch_size: 256
  epochs: 100
  log_interval: 10
  use_scheduler: true

RESULT:
  savedir: './saved_model'

the statistics for each datasets to be fine-tuned and ImageNet-1k used to pre-trained can be find in ./stats.py.

dataset_stats = {
    "imagenet":{
        "num_classes" : 1000,
        "img_size"    : 224,
        "mean"        : (0.485, 0.456, 0.406),
        "std"         : (0.229, 0.224, 0.225)
    },
    "cifar10":{
        "num_classes" : 10,
        "img_size"    : 32,
        "mean"        : (0.4914, 0.4822, 0.4465),
        "std"         : (0.247, 0.2435, 0.2616)
    },
    "cifar100":{
        "num_classes" : 100,
        "img_size"    : 32,
        "mean"        : (0.5071, 0.4867, 0.4408),
        "std"         : (0.2675, 0.2565, 0.2761)
    },
    "svhn":{
        "num_classes" : 10,
        "img_size"    : 32,
        "mean"        : (0.4377, 0.4438, 0.4728), 
        "std"         : (0.1980, 0.2010, 0.1970)
    },
    "tiny_imagenet_200":{
        "num_classes" : 200,
        "img_size"    : 64,
        "mean"        : (0.4802, 0.4481, 0.3975), 
        "std"         : (0.2764, 0.2689, 0.2816)
    }
}

Results

< Figure 1. Compare history of test accuracy based on normalization settings by model and dataset >

Figure 1 shows the accuracy history of the testset between different models and datasets. From the results, it seems that using normalization on the fine-tuning dataset converges to higher performance in most cases.

< Table 2. Accuracy(%) for four datasets. $\textcolor{green}{green}$ indicates how much better the performance is than pre-trained setting. $\textcolor{red}{red}$ indicates how much worse the performance is than pre-trained setting. >

Model	Setting	CIFAR10 Accuracy(%)	CIFAR100 Accuracy(%)	SVHN Accuracy(%)	Tiny ImageNet-200 Accuracy(%)
DenseNet121	Fine-tuned	89.05 ($\textcolor{green}{+00.02}$)	63.34 ($\textcolor{red}{-00.24}$)	95.16 ($\textcolor{green}{+00.25}$)	59.66 ($\textcolor{green}{+00.42}$)
	Pre-trained	89.03	63.58	94.91	59.24

DenseNet161	Fine-tuned	90.91 ($\textcolor{red}{-00.15}$)	67.68 ($\textcolor{red}{-00.19}$)	95.51 ($\textcolor{green}{+00.05}$)	61.90 ($\textcolor{green}{+00.03}$)
	Pre-trained	91.06	67.87	95.46	61.87

ResNet18	Fine-tuned	87.88 ($\textcolor{red}{-00.10}$)	61.67 ($\textcolor{green}{+00.12}$)	95.01 ($\textcolor{red}{-00.16}$)	53.62 ($\textcolor{green}{+00.73}$)
	Pre-trained	87.98	61.55	95.17	52.89

ResNet50	Fine-tuned	89.78 ($\textcolor{red}{-00.42}$)	67.58 ($\textcolor{green}{+00.64}$)	94.39 ($\textcolor{red}{-00.09}$)	67.32 ($\textcolor{green}{+00.16}$)
	Pre-trained	90.20	66.94	94.48	67.16

Table 1 compares the best accuracy for the testset. When comparing the results quantitatively, I found that using normalization based on the fine-tuning dataset resulted in higher performance in three out of four cases for CIFAR10 and higher performance in two out of four cases for CIFAR100 and SVHN. However, Tiny ImageNet-200 performed better with all of the fine-tuned settings.

In general, image normalization for model training is done to fit a normal distribution. Therefore, the pre-training model was also trained on images normalized to a normal distribution. However, in the case of tiny imagenet-200, when normalized to the statistics of imagenet-1k, it is farther from the normal distribution than the other datasets, so I assume that the fine-tuning setting yielded better results than the pretraining setting.

< Table 3. Accuracy(%) for four datasets and for normalization settings.

Model	Setting	CIFAR10 Accuracy(%)	CIFAR100 Accuracy(%)	SVHN Accuracy(%)	Tiny ImageNet-200 Accuracy(%)
DenseNet121	finetune	0.8905	0.6334	0.9516	0.5966
	pretrain	0.8903	0.6358	0.9491	0.5924
	instance	0.8768	0.6056	0.9515	0.5659
	minmax	0.8889	0.6367	0.9512	0.5917

DenseNet161	finetune	0.9091	0.6768	0.9551	0.6190
	pretrain	0.9106	0.6787	0.9546	0.6187
	instance	0.8994	0.6572	0.9537	0.6034
	minmax	0.9087	0.6763	0.9525	0.6245

ResNet18	finetune	0.8788	0.6167	0.9501	0.5362
	pretrain	0.8798	0.6155	0.9517	0.5289
	instance	0.8655	0.5764	0.9455	0.5083
	minmax	0.8797	0.6118	0.9497	0.5254

ResNet50	finetune	0.8978	0.6758	0.9439	0.6732
	pretrain	0.9020	0.6694	0.9448	0.6716
	instance	0.8932	0.6469	0.9465	0.6540
	minmax	0.9023	0.6750	0.9449	0.6724

Following on from Table 2, we experimented with two additional image normalization methods in Table 3. The first instance setting performs normalization on a per-image basis, and the second performs MinMax normalization. The results were surprisingly good with minmax normalization in some cases.

Conclusion

The results were, unsurprisingly, strongly influenced by whether the image was normalized to a normal distribution. The general rule of thumb is to normalize to the distribution of the data we want to fine-tune. Sometimes I do it without bothering with normalization, and I see it in other repositories as well, so I need to reflect on it and make sure the model is trained well according to the canonicals.

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
assets		assets
datasets		datasets
saved_model		saved_model
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
analysis.ipynb		analysis.ipynb
default_configs.yaml		default_configs.yaml
log.py		log.py
main.py		main.py
run.sh		run.sh
stats.py		stats.py
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Normalization-Test

Experiments

Conclusion

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Normalization-Test

Experiments

Conclusion

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages