Semi-SL Semantic Segmentation. Prototype View. #2156

Merged: 30 commits into openvinotoolkit:develop on May 24, 2023

kprokofi
Contributor

Summary

How to test

Checklist

  • I have added unit tests to cover my changes.
  • I have added integration tests to cover my changes.
  • I have added e2e tests for validation.
  • I have added the description of my changes into CHANGELOG in my target branch (e.g., CHANGELOG in develop).
  • I have updated the documentation in my target branch accordingly (e.g., documentation in develop).
  • I have linked related issues.

License

  • I submit my code changes under the same Apache License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below).
# Copyright (C) 2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

@github-actions github-actions bot added the ALGO Any changes in OTX Algo Tasks implementation label May 15, 2023
@kprokofi kprokofi added the ENHANCE Enhancement of existing features label May 15, 2023
@kprokofi kprokofi marked this pull request as ready for review May 16, 2023 21:24
@kprokofi kprokofi requested a review from a team as a code owner May 16, 2023 21:24
@kprokofi
Contributor Author

kprokofi commented May 16, 2023

This PR adds a new solution for the Semi-SL approach. It is implemented for the new SegNext models. In upcoming PRs, the documentation will be updated with validation metrics on several public datasets. Experiments on tuning some hyperparameters are also in progress in the background.
For now, these changes (the prototype-based approach) achieve the following results:

| model | cityscapes 1/16 | kitty_54 | VOC 1/16 | DISK 1/4 | city4 1/16 | voc_12 | Mean Dice gain |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ham_segnext_t: SUP | 55.93 | 62.35 | 73.82 | 86.87 | 68.3 | 68 | 0 |
| ham_segnext_t: MT | 59.2 | 66.68 | 76.14 | 87.4 | 68.9 | 69.78 | +2.14 |
| ham_segnext_t: Proto 0.1 | 60.2 | 67.1 | 77.2 | 87.6 | 69.21 | 70.4 | +2.74 |
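
A short check of how the "Mean Dice gain" column relates to the per-dataset scores. It assumes the gain is the difference between unweighted means over the six datasets, relative to the supervised (SUP) row; the thread does not state this explicitly, but the assumption reproduces the reported +2.14 and +2.74.

```python
# Dice scores copied from the ham_segnext_t rows, in column order:
# cityscapes 1/16, kitty_54, VOC 1/16, DISK 1/4, city4 1/16, voc_12
scores = {
    "SUP": [55.93, 62.35, 73.82, 86.87, 68.3, 68.0],
    "MT": [59.2, 66.68, 76.14, 87.4, 68.9, 69.78],
    "Proto 0.1": [60.2, 67.1, 77.2, 87.6, 69.21, 70.4],
}


def mean(values):
    """Unweighted arithmetic mean (assumed aggregation, not confirmed by the PR)."""
    return sum(values) / len(values)


baseline = mean(scores["SUP"])
# Gain of each row over the supervised baseline, rounded to two decimals.
gains = {name: round(mean(vals) - baseline, 2) for name, vals in scores.items()}
print(gains)
```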

Experiments with bigger models are in progress.
Some results for SegNext-s:

| model | cityscapes 1/16 | kitty_54 | VOC 1/16 | voc_12 | Mean Dice gain |
| --- | --- | --- | --- | --- | --- |
| ham_segnext_s: MT | 67.02 | 68.11 | 79.5 | 75.11 | 0 |
| ham_segnext_s: Proto 0.1 | 69.71 | 68.54 | 80.51 | 75 | +1% |

@github-actions github-actions bot added the TEST Any changes in tests label May 16, 2023
@JihwanEom
Contributor

Could you please clarify which paper you've implemented, and provide a link to it? Am I correct in understanding that your implementation is based on "Semi-supervised Semantic Segmentation with Prototype-based Consistency Regularization" (https://arxiv.org/pdf/2210.04388.pdf)?

If this is the case, would you consider adopting a step-by-step approach before fully integrating it?
image

As you might be aware, the Mean Teacher model and CutMix-seg are commonly used baselines in academia. It might be beneficial to first explore the performance and training time trade-off for CutMix-seg and then incrementally add Prototype view functionalities.

I would also suggest keeping changes to MeanTeacherSegmentor to a minimum by defining a separate class for the Prototype view, since MeanTeacherSegmentor could serve as a standard baseline architecture for semi-supervised semantic segmentation.

@kprokofi
Contributor Author

kprokofi commented May 17, 2023

@JihwanEom
I use this paper as the base idea, and this one as an implementation reference: https://arxiv.org/abs/2203.15102
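
For readers unfamiliar with the idea, here is a minimal sketch of prototype-based prediction: each class is represented by a prototype vector, and a pixel feature is assigned to the class whose prototype is most similar. This is a simplification under stated assumptions (one prototype per class, computed as the mean feature, cosine similarity), not the implementation in this PR or in either paper; `class_prototypes` and `prototype_predict` are hypothetical names.

```python
import numpy as np


def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class. features: (N, C), labels: (N,)."""
    protos = np.zeros((num_classes, features.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():  # classes absent from the batch keep a zero prototype
            protos[c] = features[mask].mean(axis=0)
    return protos


def prototype_predict(features, protos, eps=1e-8):
    """Assign each feature to the class with the highest cosine similarity."""
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + eps)
    p = protos / (np.linalg.norm(protos, axis=1, keepdims=True) + eps)
    return (f @ p.T).argmax(axis=1)


# Toy example: four labeled pixel features in a 2-D feature space.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
protos = class_prototypes(feats, labels, num_classes=2)
pred = prototype_predict(np.array([[0.8, 0.2], [0.2, 0.8]]), protos)
```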

Unfortunately, CutMix-seg performs poorly in OTX. I ran experiments with it, including different probabilities, as did Soobee, and we both saw accuracy degradation. CutOut helped, however, so in the final solution I replaced CutMix with CutOut.

What do you mean by step by step? I integrated the prototype network and slightly enhanced MT with CutOut and with filtering of high-entropy pixels. You can still use the standard MeanTeacher as before: just remove protohead from the config. The old models (HrNets) also use MT, and I didn't change their configs.
It is impossible to use a different class, because the base method is Mean Teacher.

Many of the changes in this PR are just from running "black" on some files.
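
As an illustration of filtering high-entropy pixels, the sketch below computes the per-pixel entropy of a softmax output and marks uncertain pixels with an ignore label so they drop out of the unsupervised loss. The threshold (a fraction of the maximum entropy `log(num_classes)`), the ignore value 255, and the function name are assumptions for illustration; the PR's actual criterion may differ.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed ignore label, common in segmentation pipelines


def filter_high_entropy(probs, max_entropy_ratio=0.5, eps=1e-8):
    """probs: (C, H, W) softmax output.

    Returns (H, W) pseudo-labels where pixels whose entropy exceeds
    max_entropy_ratio * log(C) are replaced by IGNORE_INDEX.
    """
    num_classes = probs.shape[0]
    entropy = -(probs * np.log(probs + eps)).sum(axis=0)
    threshold = max_entropy_ratio * np.log(num_classes)
    pseudo = probs.argmax(axis=0)
    pseudo[entropy > threshold] = IGNORE_INDEX
    return pseudo


# Two pixels, three classes: the first is confident, the second is not.
probs = np.array([[[0.98, 0.40]], [[0.01, 0.35]], [[0.01, 0.25]]])
filtered = filter_high_entropy(probs)
print(filtered)
```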

@JihwanEom
Contributor

JihwanEom commented May 17, 2023

Thank you for the detailed explanation. I agree that CutOut can bring promising performance improvements, but I believe that CutMix may also work in our situation. Have you investigated the performance tendencies for both the lite-hrnet and SegNext templates (and possibly ResNets)?

image

According to the CutMix-seg paper (https://arxiv.org/pdf/1906.01916.pdf), CutMix brings significant performance gains compared to CutOut even with only 100 labeled images. I don't think there is a significant difference in the environment between our situation and the one described in the paper.
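
To make the difference between the two augmentations concrete, here is a minimal NumPy sketch. CutOut erases a rectangle and excludes it from the loss, while CutMix pastes the same rectangle from a second image together with its (pseudo-)labels. The box sampling, the ignore value 255, and all function names are assumptions for illustration and are not taken from OTX or from the paper's code.

```python
import numpy as np


def random_box(h, w, ratio, rng):
    """Sample a box covering `ratio` of each side, fully inside the image."""
    bh, bw = int(h * ratio), int(w * ratio)
    top = int(rng.integers(0, h - bh + 1))
    left = int(rng.integers(0, w - bw + 1))
    return top, left, bh, bw


def cutout(image, label, box, ignore_index=255):
    """Erase the box and mark it as ignored so it does not contribute to the loss."""
    top, left, bh, bw = box
    image, label = image.copy(), label.copy()
    image[top:top + bh, left:left + bw] = 0
    label[top:top + bh, left:left + bw] = ignore_index
    return image, label


def cutmix(image_a, label_a, image_b, label_b, box):
    """Paste the box from sample B (pixels and labels) into sample A."""
    top, left, bh, bw = box
    image, label = image_a.copy(), label_a.copy()
    image[top:top + bh, left:left + bw] = image_b[top:top + bh, left:left + bw]
    label[top:top + bh, left:left + bw] = label_b[top:top + bh, left:left + bw]
    return image, label


rng = np.random.default_rng(0)
img_a, lbl_a = np.ones((8, 8, 3)), np.zeros((8, 8), dtype=np.int64)
img_b, lbl_b = np.full((8, 8, 3), 2.0), np.ones((8, 8), dtype=np.int64)
box = random_box(8, 8, ratio=0.5, rng=rng)
co_img, co_lbl = cutout(img_a, lbl_a, box)
cm_img, cm_lbl = cutmix(img_a, lbl_a, img_b, lbl_b, box)
```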

Even if CutMix does not work due to unknown issues and we use CutOut instead, we need to assess the net benefit of CutOut on its own, without combining it with the Prototype view. That was my intention in suggesting a step-by-step approach. Could you share the experiment results if you already have them?

You can continue using the standard MeanTeacher as usual, just remove protohead from the configuration. Also, the old models (HrNets) use MeanTeacher, and I haven't changed their configurations.
=> My suggestion was to consider having a new PrototypeViewSegmentor architecture inherit from MeanTeacherSegmentor. This would ensure maintainability and reproducibility, not only for MT but also for new models. I recommend creating a new file that defines PrototypeViewSegmentor and inherits from MeanTeacherSegmentor because, as you said, Mean Teacher is the base method for many Semi-SL semantic segmentation algorithms. This is just my personal opinion as an OTX developer, though; if anyone else has a good design idea, please share it.

@kprokofi
Contributor Author

kprokofi commented May 17, 2023

My suggestion was to consider inheriting MeanTeacherSegmentor in the new architecture of PrototypeViewSegmentor. This would ensure maintainability and reproducibility for not only MT and new models. I recommend creating a new file that defines PrototypeViewSegmentor and inheriting from MeanTeacherSegmentor, because it's a base method for many various algorithms for semi-sl on semantic segmentation as you said.

I considered your proposal and even tried it, but ran into a problem: I would have to copy all the code from the forward_train method. Why should we do that?
I moved everything related to prototypes into decode_proto_network, and there are new lines of code in the main forward method:
[images: the added lines of code]
I think it is ambiguous to create a separate Segmentor and copy almost all the code there.
We also have a second option: don't copy, but call the parent forward and then compute the prototype-based forward. In that case, however, I would have to call self.model_s.extract_feat twice, which means double inference.

I would rather leave it in the main MeanTeacher framework, since ProtoNetwork is an additional method rather than the main one.
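
The trade-off described above can be sketched with toy classes. The method names `extract_feat`, `forward_train`, `decode_proto_network`, and the `use_prototype_head` flag come from this thread; the class names, the stand-in backbone, and the placeholder "losses" are hypothetical and only serve to count feature-extraction calls.

```python
class Backbone:
    """Stand-in feature extractor that counts how often it runs."""

    def __init__(self):
        self.calls = 0

    def extract_feat(self, x):
        self.calls += 1
        return [v * 2 for v in x]


class MeanTeacherSketch:
    """Single forward pass with an optional prototype branch."""

    def __init__(self, backbone, use_prototype_head=False):
        self.backbone = backbone
        self.use_prototype_head = use_prototype_head

    def decode(self, feats):
        return sum(feats)  # placeholder for the decode-head loss

    def decode_proto_network(self, feats):
        return max(feats)  # placeholder for the prototype-head loss

    def forward_train(self, x):
        feats = self.backbone.extract_feat(x)
        losses = {"decode_loss": self.decode(feats)}
        if self.use_prototype_head:
            # The prototype branch reuses the features computed above.
            losses["proto_loss"] = self.decode_proto_network(feats)
        return losses


class PrototypeSubclassSketch(MeanTeacherSketch):
    """Subclass that reuses the parent forward without copying its body."""

    def forward_train(self, x):
        losses = super().forward_train(x)      # features are computed and discarded
        feats = self.backbone.extract_feat(x)  # second extraction: double inference
        losses["proto_loss"] = self.decode_proto_network(feats)
        return losses


flag_model = MeanTeacherSketch(Backbone(), use_prototype_head=True)
flag_losses = flag_model.forward_train([1, 2, 3])
sub_model = PrototypeSubclassSketch(Backbone())
sub_losses = sub_model.forward_train([1, 2, 3])
print(flag_model.backbone.calls, sub_model.backbone.calls)
```

Both variants return the same losses in this toy setting; the difference is only in how many times the backbone runs per training step.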

@github-actions github-actions bot added BUILD DEPENDENCY Any changes in any dependencies (new dep or its version) should be produced via Change Request on PM and removed BUILD labels May 17, 2023
@kprokofi
Contributor Author

I have some experiments comparing CutMix, CutOut, and the base algorithm:
[image: experiment results]

But I would like to run them once more with a different implementation of CutMix, with SegNext-s, and without early stopping.

Contributor

@supersoob supersoob left a comment

I think it's okay to keep use_prototype_head in the original MeanTeacherSegmentor and enable it with a parameter because, as in @kprokofi's current implementation, it is ultimately based on Mean Teacher. About CutMix: even though I checked that the transformed images looked right, I couldn't see any gain from it (only a 1-2% drop). During my investigation I had a few suspicions. First, a confidence threshold for the pseudo labels needed to be set. Second, inference needed to be done by the teacher model, which may generalize better than the student. Third, I suspected that using cross-entropy loss as the consistency loss was hurting, since it relies on predictions from a model trained on few labeled samples, which may not be as reliable as ground truth. I don't know exactly what made CutMix worse, but I hope these notes are useful for reference.
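
The three points above (a confidence threshold on pseudo labels, pseudo labels produced by the teacher, and cross-entropy used as the consistency loss) can be sketched as follows. This is a generic Mean Teacher illustration in NumPy, not the OTX code; the threshold 0.9, the momentum value, the ignore label 255, and all function names are assumptions.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed ignore label


def softmax(logits, axis=0):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def teacher_pseudo_labels(teacher_logits, conf_threshold=0.9):
    """(C, H, W) teacher logits -> (H, W) labels; low-confidence pixels are ignored."""
    probs = softmax(teacher_logits, axis=0)
    pseudo = probs.argmax(axis=0)
    pseudo[probs.max(axis=0) < conf_threshold] = IGNORE_INDEX
    return pseudo


def masked_cross_entropy(student_logits, target):
    """Cross-entropy of student predictions against pseudo labels, skipping ignored pixels."""
    log_probs = np.log(softmax(student_logits, axis=0) + 1e-8)
    valid = target != IGNORE_INDEX
    if not valid.any():
        return 0.0
    safe_target = np.where(valid, target, 0)  # any valid index; masked out below
    picked = np.take_along_axis(log_probs, safe_target[None], axis=0)[0]
    return float(-picked[valid].mean())


def ema_update(teacher_w, student_w, momentum=0.99):
    """Teacher weights follow an exponential moving average of the student weights."""
    return momentum * teacher_w + (1.0 - momentum) * student_w


# Two pixels, two classes: the teacher is confident only about the first pixel.
teacher_logits = np.array([[[5.0, 0.2]], [[0.0, 0.0]]])
pseudo = teacher_pseudo_labels(teacher_logits)
loss = masked_cross_entropy(np.zeros((2, 1, 2)), pseudo)
new_teacher = ema_update(np.array([1.0]), np.array([0.0]), momentum=0.9)
```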

@supersoob
Contributor

Could you add unit tests for prototype head and for changed mean teacher?

@eunwoosh
Contributor

LGTM, but as Soobee said, could you add unit tests?

@kprokofi
Contributor Author

@JihwanEom
Please find below the experiments with augmentations on a bigger model. Unfortunately, CutMix always performs worse in our OTX. I tried two different implementations, but the result is the same: CutOut looks better.

| Model | Cityscapes | VOC | Kitty_57 |
| --- | --- | --- | --- |
| SegNext-b: MT | 70.88 | 82.05 | 70.45 |
| SegNext-b: MT + pixel_filter + cutout | 71.87 | 82.35 | 73.67 |
| SegNext-b: MT + pixel_filter + cutmix | 70.43 | 81.80 | 69.44 |
| SegNext-b: MT + pixel_filter + cutout + ProtoNet | 72.10 | 82.74 | 74.31 |

@github-actions github-actions bot added the API Any changes in OTX API label May 23, 2023
@kprokofi
Contributor Author

I added unit tests and integration tests for Semi-SL, plus e2e tests for the new model (we need to start validating it on at least one template).
Could you take a look and merge if it looks good to you?

@JihwanEom
Contributor

Okay, thank you so much for the experiment results and the kind explanation. I understand.

@jaegukhyun
Contributor

Could you set milestone for this PR?

@kprokofi kprokofi added this to the 1.4.0 milestone May 24, 2023
@kprokofi
Contributor Author

Could you set milestone for this PR?

Done. Could you approve if it is OK? Let's finally merge this.
In the following PR I will update the documentation accordingly.

@kprokofi kprokofi merged commit e1004eb into openvinotoolkit:develop May 24, 2023
12 checks passed
Labels
ALGO Any changes in OTX Algo Tasks implementation API Any changes in OTX API DEPENDENCY Any changes in any dependencies (new dep or its version) should be produced via Change Request on PM ENHANCE Enhancement of existing features TEST Any changes in tests

5 participants