When could we expect to see code #1

Closed
bradezard131 opened this issue Apr 21, 2020 · 24 comments

@bradezard131
Contributor

Hey,

Super keen to see code for this. It's very interesting and I've started trying to reimplement it; however, the paper doesn't include all the details and says to reference the code. Any idea when we could expect to see it (so I don't keep coming back every day to check whether it's been updated)?

Cheers

@Peipeilvcm

+1

@jason718
Contributor

jason718 commented Apr 21, 2020

Thanks for your interest. Because of COVID-19, everything has unfortunately slowed down. We are working on it and plan to release the code this summer. Thanks for understanding.

@bradezard131
Contributor Author

@jason718 Any update? CVPR is now happening, and the paper is unfortunately scant on many implementation details (it refers readers to the released code, which has not been released).

@bradezard131
Contributor Author

Bump. Really hoping this isn't yet another unreproducible paper that leaves details to code that is never released.

@jason718
Contributor

Sorry for the late reply, and thanks for your interest. We are preparing it, and also incorporating a follow-up paper into the same repo. The current plan is to wait until the ECCV results come out, and then we will release the code for both papers.

@bradezard131
Contributor Author

In that case, would you mind sharing some details that are absent from the paper? I will list some questions below:

  1. What is the structure of the Concrete DropBlock? The paper says it is a conv residual block; is that a bottleneck as in ResNet-50? Also, a residual block will output HxHxC, where C is the number of input channels, but the paper says the output is HxH (i.e. HxHx1), so how do you do that reduction?
  2. How do you balance your different losses? I have already found out the hard way that Tang et al. and most works derived from theirs use F.binary_cross_entropy(midn_preds.sum(0), image_labels, reduction='mean') as their MIDN loss, even though the paper has a sum as in the original WSDDN. Do you do this as well, and is there any other balancing?

@jason718
Contributor

jason718 commented Jun 30, 2020

  1. It's a BasicBlock as in the ResNet models, where we change the 2nd conv layer to have output channel 1. One example is here: https://github.com/pytorch/vision/blob/7fd24918b2d0a41295600b819f329593d5a33a91/torchvision/models/resnet.py#L35

  2. We use BCE with mean reduction (see the sketch after this list). We also use the loss weights introduced in OICR.
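
For concreteness, here is a minimal sketch of that image-level loss in PyTorch, matching the `F.binary_cross_entropy(midn_preds.sum(0), image_labels, reduction='mean')` form quoted above. The clamp is my own addition for numerical safety, not something confirmed in this thread, and the tensor names are illustrative:

```python
import torch
import torch.nn.functional as F

def midn_image_loss(midn_preds: torch.Tensor, image_labels: torch.Tensor) -> torch.Tensor:
    """Image-level MIDN loss: sum per-proposal class scores over the
    proposal dimension, then BCE against the multi-hot image labels
    with mean reduction (the form discussed above).

    midn_preds:   (num_proposals, num_classes), softmax-normalized scores
    image_labels: (num_classes,), multi-hot {0, 1} image-level labels
    """
    # Aggregate proposal scores into one image-level prediction per class.
    image_preds = midn_preds.sum(dim=0)
    # The clamp is an extra safety net (an assumption, not from the thread):
    # numerical error can push the sum outside [0, 1], which BCE rejects.
    image_preds = image_preds.clamp(min=1e-6, max=1.0 - 1e-6)
    return F.binary_cross_entropy(image_preds, image_labels, reduction='mean')
```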

@bradezard131
Contributor Author

If you change the output of the second conv layer to 1 channel, how do you handle the residual connection? Since the input to the residual block is still HxHxC and the output is HxHx1, the only way to add them is to allow broadcasting, and then you still end up with HxHxC?

@bityangke

Hi, is your code implemented in PyTorch?

@jason718
Contributor

jason718 commented Jul 4, 2020

> If you change the output of the second conv layer to 1 channel, how do you handle the residual connection? Since the input to the residual block is still HxHxC and the output is HxHx1, the only way to add them is to allow broadcasting, and then you still end up with HxHxC?

A 1x1 conv is used on the residual connection, the same as in ResNet.
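
Putting the two answers together, here is a minimal sketch of such a block, assuming a BasicBlock layout with the second conv reduced to one output channel and a 1x1 conv on the shortcut. The normalization choice and whatever follows the block (e.g. the concrete/Gumbel masking step) are assumptions, not the authors' released code:

```python
import torch
import torch.nn as nn

class OneChannelBasicBlock(nn.Module):
    """BasicBlock-style residual block whose second conv maps to a single
    channel, so an HxWxC input yields an HxWx1 map. A 1x1 conv projects
    the shortcut to 1 channel, as in ResNet when shapes differ.
    Illustrative sketch only."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        # Second conv reduces to a single output channel (the HxWx1 map).
        self.conv2 = nn.Conv2d(in_channels, 1, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(1)
        # 1x1 conv on the residual connection so the shapes match.
        self.shortcut = nn.Conv2d(in_channels, 1, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Whatever comes next (e.g. the Concrete DropBlock masking step)
        # is omitted here.
        return out + self.shortcut(x)
```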

@jason718
Contributor

jason718 commented Jul 4, 2020

> Hi, is your code implemented in PyTorch?

Yes.

@youthHan

youthHan commented Jul 6, 2020

Hi, will the code be released? @jason718

@liuyuancv

+1

@SongYii

SongYii commented Aug 1, 2020

Hello, when will the code be released?

@jason718
Contributor

jason718 commented Aug 1, 2020

Hi @liuyuan11 @SongYii @youthHan, and many others,

We want to thank you for your attention to this work. We are sorry for the delay and are actively working on a high-quality code base.

The purpose of this repo was to create a placeholder so that everyone knows where the code will eventually appear. The delay is mostly due to:

  1. The original code was built on "Mask R-CNN Benchmark". We are working to switch to Detectron2 so that it is compatible with the latest supervised research.
  2. IP, legal approval and code review can take time.
  3. We have an upcoming ECCV paper highly related to the CVPR work. We plan to release a unified codebase that covers both.

We apologize for the imprecise descriptions in the initial paper, and have updated the arXiv paper with more implementation details. Feel free to raise any technical questions here or to email the authors. We are more than happy to answer.

@bradezard131
Contributor Author

bradezard131 commented Aug 5, 2020

You say in the paper that you take p percent of proposals, and later you say p=0.15. Can I confirm that this means you take 0.15% (approximately 3) of the regions per image? And is there anything else worth noting with regard to the pseudo-labelling algorithm? I have a working implementation of both OICR and PCL, but have not been able to replicate even your MIST w/o Regression result with any set of hyperparameters (the original OICR, the original PCL, the latest PCL from pcl.pytorch, or the new parameters you've listed).

@bityangke

@bradezard131 @jason718 I thought the "self-training with regression" comes from the main idea of the ICCV 2019 paper titled "Towards Precise End-to-End Weakly Supervised Object Detection Network". Please correct me if I am wrong.

@bradezard131
Contributor Author

@bityangke You are mistaken; they take a different approach. I am referring to MIST w/o Reg. from Table 5, which should be a normal OICR model but with the MIST algorithm (Algorithm 1).

@jason718
Contributor

jason718 commented Aug 5, 2020

@bradezard131 p=0.15 means the top 15 percent. This takes more than enough proposals as the initial pool, from which the pseudo-labels are then generated. As you said, setting p=0.15% would only pick ~3 RoIs, and then MIST w/o regression would be almost the same as OICR (top-1).

We apologize for the writing in the paper. Please try p=15% in your code base. I'm also curious why you cannot reproduce the results. If you still cannot solve it, could you start a new issue and share more information with us? From the information provided so far, I cannot tell what happened.
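
To make the top-15% reading concrete, here is a minimal sketch of the selection step for one positive class, in the spirit of the paper's Algorithm 1: keep the top p proportion of proposals by score as the pool, then run NMS within that pool so the pseudo-boxes are spatially diverse. Only p=0.15 (top 15%) is confirmed in this thread; the NMS threshold and all names are assumptions:

```python
import torch
from torchvision.ops import nms

def mist_topk_select(boxes: torch.Tensor, scores: torch.Tensor,
                     p: float = 0.15, nms_iou: float = 0.2) -> torch.Tensor:
    """Select pseudo-label boxes for one positive class (a sketch in the
    spirit of MIST's Algorithm 1; nms_iou is an assumed value).

    boxes:  (N, 4) proposal boxes in (x1, y1, x2, y2) format
    scores: (N,)   this class's score for each proposal
    Returns indices into `boxes` chosen as pseudo ground truth.
    """
    # Keep the top p proportion (here the top 15%) as the initial pool;
    # note this is p = 0.15, not 0.15%.
    k = max(1, int(p * scores.numel()))
    top_scores, top_idx = scores.topk(k)
    # NMS within the pool keeps a spatially diverse set of high scorers.
    keep = nms(boxes[top_idx], top_scores, nms_iou)
    return top_idx[keep]
```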

@doduythao

@jason718 Also, the paper is about WSOD, but in the Experiments - Datasets section I believe I only see fully-annotated datasets being used. Where are the image-level datasets described in detail?
Thank you

@jason718
Contributor

@doduythao All WSOD papers still use fully-annotated datasets (VOC, COCO), and I guess this is mainly for evaluation purposes; otherwise, it would be hard to compare with most detection works.

I am not aware of any datasets specifically collected for the WSOD task (image-level tags only for the training set, and bounding boxes for the test set). Feel free to share and suggest any that you know of.

@BlueBlueFF

Hi, will you release the code after ECCV? Thanks

@jason718
Contributor

@BlueBlueFF that's the plan.

@BlueBlueFF

@jason718 UFO2: A Unified Framework towards Omni-supervised Object Detection, and Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection. Nice work on WSOD! Will the code be released this month?
