About the differences on the training strategies when "_pt" in model name or not. #1

Closed
xiaovhua opened this issue Aug 3, 2023 · 2 comments


xiaovhua commented Aug 3, 2023

Thank you so much for your excellent work on MProtoNet, and for making your code available on GitHub.

We have some questions about the usage of (a) "_pt" in the model name and (b) best_grid["fixed"] in tumor_cls.py.

In the commands you supply in the repository, the model names all end with "_pmX". Therefore, according to the code, MProtoNet is updated in the "joint" and "last_layer" (every 10 epochs) modes during training. However, we notice that when "_pt" is in the model name and best_grid["fixed"]=False, there is another branch in which net.features is not updated at the beginning and only starts to be optimized once the current epoch >= wu_e. We also notice that the vanilla ProtoPNet trains in the latter way (net.features is fixed at the beginning).

Is there a significant difference between the two training strategies? We look forward to your reply!
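The two schedules described in the question can be sketched roughly as follows. This is only an illustration, not the code in tumor_cls.py: the function name `modes_for_epoch`, the "warm" mode label (borrowed from ProtoPNet terminology), the 0-based epoch indexing, and the choice to run "last_layer" on every tenth epoch regardless of warm-up are all assumptions.

```python
def modes_for_epoch(epoch, wu_e=0, last_layer_every=10):
    """Return the training modes run in one epoch (hypothetical helper).

    epoch            -- 0-based epoch index (assumed indexing)
    wu_e             -- number of warm-up epochs with net.features frozen;
                        0 corresponds to the schedule without a warm-up period
    last_layer_every -- run the "last_layer" mode every N epochs
    """
    # During warm-up the feature layers are not updated; afterwards the
    # whole network is trained jointly.
    modes = ["warm" if epoch < wu_e else "joint"]
    # Assumption: the last-layer update runs on every N-th epoch.
    if (epoch + 1) % last_layer_every == 0:
        modes.append("last_layer")
    return modes


if __name__ == "__main__":
    for e in (0, 4, 5, 9):
        print(e, modes_for_epoch(e), modes_for_epoch(e, wu_e=5))
```

With `wu_e=0` the schedule reduces to "joint" plus a periodic "last_layer", which matches the behaviour described for the "_pmX" models.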

@xiaovhua xiaovhua changed the title What "_pt" in model name and best_grid["fixed"] denote in tumor_cls.py? About the differences on the training strategies when "_pt" in model name or not. Aug 3, 2023
Owner

aywi commented Aug 4, 2023

It is kind of our fault for not cleaning up this code: these options were defined for models that were removed from the final paper.

  1. "_pt" means "pre-trained" and refers to models we tested earlier: mixed 2D & 3D models whose 2D layers are pre-trained on ImageNet-1k. Since we test pure 3D models in the final paper, these models were removed for a fair comparison. You can also see that in the final experiments we only choose feature layers with "_ri", which means "randomly initialized" (the pre-trained versions are ignored, since we only use these 2D backbones from torchvision as templates to build 3D feature layers):

    mprotonet/src/models.py

    Lines 27 to 47 in 4565e22

    def features_imagenet1k(features):
        if features == 'resnet18':
            return build_resnet_features(vision_models.resnet18(weights='IMAGENET1K_V1'))
        elif features == 'resnet18_ri':
            return build_resnet_features(vision_models.resnet18())
        elif features == 'resnet34':
            return build_resnet_features(vision_models.resnet34(weights='IMAGENET1K_V1'))
        elif features == 'resnet34_ri':
            return build_resnet_features(vision_models.resnet34())
        elif features == 'resnet50':
            return build_resnet_features(vision_models.resnet50(weights='IMAGENET1K_V2'))
        elif features == 'resnet50_ri':
            return build_resnet_features(vision_models.resnet50())
        elif features == 'resnet101':
            return build_resnet_features(vision_models.resnet101(weights='IMAGENET1K_V2'))
        elif features == 'resnet101_ri':
            return build_resnet_features(vision_models.resnet101())
        elif features == 'resnet152':
            return build_resnet_features(vision_models.resnet152(weights='IMAGENET1K_V2'))
        elif features == 'resnet152_ri':
            return build_resnet_features(vision_models.resnet152())
  2. "fixed" is a much older option from when I tested the pre-trained models without a fixed training period at the beginning (yes, the correct name should be "not fixed").
  3. Vanilla ProtoPNet has a fixed training period because it is a 2D model pre-trained on ImageNet-1k. Since we test pure 3D models that are randomly initialized in the final paper, this period becomes useless and only wastes training time.
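The naming convention in the excerpt above (a bare backbone name selects ImageNet-1k weights, an "_ri" suffix selects random initialization) can be summarized in a small standalone sketch. The helper `parse_features_name` and the `PRETRAINED_WEIGHTS` table are hypothetical and not part of the repository; the weight identifiers are copied from the quoted lines, and `None` stands for "no pre-trained weights".

```python
# Weight identifiers as they appear in the quoted features_imagenet1k excerpt.
PRETRAINED_WEIGHTS = {
    "resnet18": "IMAGENET1K_V1",
    "resnet34": "IMAGENET1K_V1",
    "resnet50": "IMAGENET1K_V2",
    "resnet101": "IMAGENET1K_V2",
    "resnet152": "IMAGENET1K_V2",
}


def parse_features_name(features):
    """Split a features name into (backbone, weights) (hypothetical helper).

    'resnet50'    -> ('resnet50', 'IMAGENET1K_V2')  pre-trained
    'resnet50_ri' -> ('resnet50', None)             randomly initialized
    """
    random_init = features.endswith("_ri")
    backbone = features[: -len("_ri")] if random_init else features
    if backbone not in PRETRAINED_WEIGHTS:
        raise ValueError(f"unknown backbone in this sketch: {features!r}")
    weights = None if random_init else PRETRAINED_WEIGHTS[backbone]
    return backbone, weights


if __name__ == "__main__":
    print(parse_features_name("resnet18"))
    print(parse_features_name("resnet152_ri"))
```

Under this reading, every configuration used in the final experiments would map to `weights=None`, which is why a warm-up period with frozen feature layers has nothing pre-trained to protect.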

So, in short, just ignore them. I will add a commit later to remove this code.

@aywi aywi closed this as completed in 49d4c89 Aug 4, 2023
Author

xiaovhua commented Aug 4, 2023

I get it! Thank you so much!
