
About the teacher logits of the TinyViT #109

Closed
shadowpa0327 opened this issue Jul 31, 2022 · 4 comments

@shadowpa0327

Hi, thanks for sharing your excellent work. I am trying to use the script save_logits.py to generate the soft labels for knowledge distillation. During generation I found that the binary file for the same epoch differs when a different starting-epoch option is used, even though running save_logits.py with the flag --check-saved-logits reports no difference or error. I am wondering where these differences might come from.

To reproduce the issue, generate the logits of a specific epoch with different start-epoch settings.

Thanks!

wkcn added the TinyViT label Aug 1, 2022
@wkcn
Contributor

wkcn commented Aug 1, 2022

Thanks for your attention to our work! : )

This is expected behavior.
In the script save_logits.py#L346-L351, the global random seed depends on the starting epoch.

    # the global seed shifts with both the process rank and the starting epoch
    seed = config.SEED + dist.get_rank() + config.TRAIN.START_EPOCH * \
        dist.get_world_size()

And the random seed of each PyTorch dataloader worker for data augmentation depends on the global random seed.
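
As a rough sketch of that mechanism (not the TinyViT code; make_worker_init_fn and the exact seeding rule below are illustrative assumptions), each dataloader worker could reseed its augmentation RNGs from the global seed, so changing the starting epoch shifts every worker's seed:

    import random
    import numpy as np

    def make_worker_init_fn(global_seed):
        def worker_init_fn(worker_id):
            # each worker reseeds its own augmentation RNGs from the global seed
            worker_seed = (global_seed + worker_id) % (2 ** 32)
            random.seed(worker_seed)
            np.random.seed(worker_seed)
        return worker_init_fn

    # usage (hypothetical): DataLoader(dataset, num_workers=8,
    #                                  worker_init_fn=make_worker_init_fn(seed))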

For the two cases:

  1. Save teacher logits from the 1st epoch to the 10th epoch.
  2. Save teacher logits starting from the 10th epoch.

The random states of the two cases are different, since both the global random seed and the number of random values drawn so far are different. This leads to different augmentations even though both cases are at the same 10th epoch.

Therefore, the binary files for the same epoch differ.
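
As a toy illustration of why the states diverge (made-up numbers, not the actual script; draws_at_epoch_10 is hypothetical), two runs that reach the 10th epoch from different starting epochs have both a different initial seed and a different number of values already drawn, so their generators produce different augmentation seeds for that epoch:

    import numpy as np

    SEED, WORLD_SIZE, DRAWS_PER_EPOCH = 0, 8, 4

    def draws_at_epoch_10(start_epoch):
        # same rule as above: the global seed shifts with the starting epoch
        rng = np.random.RandomState(SEED + start_epoch * WORLD_SIZE)
        # advance the generator through the epochs before the 10th one
        for _ in range(10 - start_epoch):
            rng.randint(0, 1 << 31, size=DRAWS_PER_EPOCH)
        return rng.randint(0, 1 << 31, size=DRAWS_PER_EPOCH)  # 10th-epoch draws

    print(draws_at_epoch_10(start_epoch=1))   # case 1: started from the 1st epoch
    print(draws_at_epoch_10(start_epoch=10))  # case 2: started from the 10th epoch

The two printed arrays differ, which is exactly the difference observed in the saved binary files.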

However, this does not affect knowledge distillation.
In the script data/augmentation/dataset_wrapper.py#L32-L45, we save the augmentation seed and restore the augmentation during knowledge distillation.

    def __getitem_for_write(self, index: int):
        # get an augmentation seed
        key = self.keys[index]
        seed = np.int32(np.random.randint(0, 1 << 31))  # generate an augmentation seed
        with AugRandomContext(seed=int(seed)):
            item = self.dataset[index]
        return (item, (key, seed))

    def __getitem_for_read(self, index: int):
        key = self.keys[index]
        seed, logits_index, logits_value = self._get_saved_logits(key)
        with AugRandomContext(seed=seed):  # restore the augmentation by the seed
            item = self.dataset[index]
        return (item, (logits_index, logits_value, np.int32(seed)))
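
For context, here is a minimal sketch of what a context manager like AugRandomContext could do, assuming it seeds the augmentation RNGs on entry and restores the previous state on exit (the real implementation in the repository may differ; SeededAugContext is a hypothetical stand-in):

    import random
    import numpy as np

    class SeededAugContext:  # hypothetical stand-in for AugRandomContext
        def __init__(self, seed):
            self.seed = seed

        def __enter__(self):
            # remember the current RNG states, then reseed for the augmentation
            self._py_state = random.getstate()
            self._np_state = np.random.get_state()
            random.seed(self.seed)
            np.random.seed(self.seed % (1 << 32))
            return self

        def __exit__(self, exc_type, exc_value, traceback):
            # restore the outer RNG states so other code is unaffected
            random.setstate(self._py_state)
            np.random.set_state(self._np_state)

With such a context, running the dataset transform under the same seed reproduces the same augmented image, which is what makes the saved logits reusable.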

We may update the code to make the two cases consistent.
Thanks for your great suggestion!

@shadowpa0327
Author

Thanks for the reply. <3

It seems that the augmentation for each mini-batch depends on the local seed which is set at data/augmentation/dataset_wrapper.py#L32-L45. When knowledge distillation is enabled, the augmented data will be restored directly based on this seed, which is stored in the .bin file?

@wkcn
Contributor

wkcn commented Aug 1, 2022

@shadowpa0327
Yes. Each image in each epoch corresponds to a local seed, and the seed is stored in the .bin file.
For mixup and cutmix, the seed is a combination of the seeds of the two images (https://github.com/microsoft/Cream/blob/main/TinyViT/data/augmentation/mixup.py#L220).
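
As an illustration only (the actual combination used in mixup.py#L220 may differ; combine_seeds is hypothetical), deriving one deterministic seed from the two per-image seeds could look like:

    def combine_seeds(seed_a, seed_b):
        # any deterministic, order-aware mixing that stays within 32 bits would do
        return (seed_a * 1000003 + seed_b) % (1 << 31)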

@shadowpa0327
Author

@wkcn
Thanks for the reply. Your implementation is excellent. I will also try to enable mixup; thanks for the information.

wkcn closed this as completed Aug 4, 2022