
Is there any data augmentation strategy used during training? #19

Closed
qiusuor opened this issue Dec 22, 2020 · 8 comments
qiusuor commented Dec 22, 2020

I tried to train the ScaleHyperprior model using the code provided in example/train.py with the ImageNet-2012/DIV2K datasets. The learning rates for the main optimizer and the aux_optimizer were both set to 1e-4. The learning rate of the main optimizer is then divided by 2 when the evaluation loss reaches a plateau, as described at https://interdigitalinc.github.io/CompressAI/zoo.html. However, after nearly a week of training, the rate-distortion performance on Kodak-24 still shows a gap compared to the published results.

I am also training a model on vimeo_test_clean from Vimeo90K; after 2 days it does not look like it will converge to the published results.
Have I missed something? Is there any data augmentation strategy used during training?
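The plateau schedule described above (main learning rate halved when the evaluation loss stops improving) can be sketched with PyTorch's ReduceLROnPlateau. The tiny Linear module below is a hypothetical stand-in for ScaleHyperprior, and patience=10 is an assumed value, not a confirmed setting:

```python
import torch

# Hypothetical stand-in for a ScaleHyperprior model.
model = torch.nn.Linear(8, 8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
aux_optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Halve the main lr when the evaluation loss plateaus.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=10
)

# In the training loop, step the scheduler with the validation loss:
val_loss = 1.0  # placeholder value
scheduler.step(val_loss)
```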

@jbegaint
Contributor

Hi, no, we don't do any augmentation besides randomly cropping the data to 256x256 patches. We used the Vimeo dataset from here.

How large is the performance gap?

Another learning-rate strategy is to set the lr to 1e-5 after ~100-150 epochs (depending on your dataset/batch size); we have observed similar performance.
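The 256x256 random cropping mentioned above can be sketched as a small helper. random_crop here is a hypothetical illustration of the idea, not the transform actually used in the training script:

```python
import torch

def random_crop(img: torch.Tensor, size: int = 256) -> torch.Tensor:
    # Pick a random top-left corner and cut a size x size patch
    # out of a CxHxW image tensor.
    _, h, w = img.shape
    top = int(torch.randint(0, h - size + 1, (1,)))
    left = int(torch.randint(0, w - size + 1, (1,)))
    return img[:, top:top + size, left:left + size]

patch = random_crop(torch.rand(3, 512, 768))
```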

@qiusuor
Author

qiusuor commented Dec 23, 2020

Thanks for the quick reply.
The gap is large when training with lambda = 0.0130.
image

I will try training on the dataset provided above. Is all of the data used for training/validation, or just a subset?

@qiusuor
Author

qiusuor commented Dec 23, 2020

I checked example/train.py and found that gradients are clipped by the hyperparameter --clip_max_norm (default: 0.1). Does this influence the training result? Was gradient clipping used in your training process, and if so, what value of --clip_max_norm did you use?

After declaring the network, I explicitly call the update function like below; is that ok?

net = ScaleHyperprior(M, N)
net.update(force=True)

@jbegaint
Contributor

Yes, ok, this gap should not happen.
We use all the training/test images provided by the dataset (with random crops for training and a center crop for testing).

We use gradient-norm clipping, but the value is usually set to 1.0, not 0.1. I'll fix the example, thanks for reporting!

You only need to call .update() after training your network; it's only required for the entropy coding (compress/decompress).
@jbegaint jbegaint self-assigned this Dec 23, 2020
@qiusuor
Author

qiusuor commented Dec 27, 2020

Thanks. I finally got similar results.

@qiusuor qiusuor closed this as completed Dec 27, 2020
@jbegaint
Contributor

Great, thanks for the update!

@achel-x
Copy link

achel-x commented Dec 26, 2022

> Yes ok, this gap should not happen. We do use all the training/test images provided by the dataset (with random crop for training, and center crop for testing).
>
> We use gradient norm clipping, but usually the value is set 1. not 0.1, I'll fix the example. Thanks for reporting!
>
> You only need to call .update() after the training of your network, it's only required for the entropy coding (compress/decompress).

Hi,
I ran into the same problem. The training dataset is DIV2K. I down-sampled all 800 images to half their size, so all 1600 images are included. I trained just one point; the performance is shown in the figure below.
image

There is a gap compared with what the homepage shows.
The training settings follow the defaults:
epochs: 100
lr: 1e-4
lambda: 1e-2
Here is part of the training log.
image

1. I wonder where exactly the update call should go.
2. I noticed that the training set mentioned on the homepage is from Vimeo90K. I would also like to know exactly how many images were used in training.

@achel-x

achel-x commented Dec 26, 2022

@jbegaint Sorry, I found the update command. I tried to update the trained model with python -m compressai.utils.update_model ./checkpoint.pth.tar -a bmshj2018-factorized and then evaluated its performance. The results are the same.
image
The name of the model file in the command does not cause this problem.
