R-CNN API design #28
Comments
Elegant? No. 😆 My initial thinking was similar to your description. I imagined a few subclasses of Keras’s model class, e.g. an RPN model, an RCNN model, and a MaskRCNN model. However, I think there’d be some benefit to something more modular. @mcquin has been thinking about this problem. I’d love to hear her thoughts. I’ve also considered reaching out to @KaimingHe, @rbgirshick, @pdollar, etc. and asking for feedback. I’d especially like to hear from @rbgirshick about his current thinking on RCNN implementation after writing and maintaining py-faster-rcnn for the past year or two. @JihongJu Are you planning on attending CVPR? @jhung0 and I will be there, so maybe we can talk about future plans (I assume some of the aforementioned RCNN authors are attending too). SciPy is another option (@mcquin is attending). Alternatively, we could schedule a Skype call for anybody who’s interested. When we (@jhung0, @mcquin, @AnneCarpenter, and I) first started discussing this, I wrote the following simple description in my notebook:
I still think this captures what I hope this becomes (rather than the application approach provided by py-faster-rcnn or yolo). I imagine us trying to stay state-of-the-art (inside the scope of RCNN) but enabling users to mix-and-match components (like a Mask-RCNN branch).
@0x00b1 Unfortunately, I am not attending CVPR this year. A Skype call would work best for me because I am now located in Europe.
I totally agree with the mix-and-match components pattern. That is also what I have in mind.
I think it is time to think about what the R-CNN API should look like. We discussed it a bit in #7, but that was more about the structure than the API.
From my understanding, an R-CNN framework should include a body, which predicts ROIs from images, and a couple of heads, which predict scores, bounding boxes (and masks).
The idea is to make it easy to attach different bodies (ResNets, VGG, FPN, etc.) and to choose whether to include or exclude the mask head.
Do you have an elegant way of doing this in mind? @0x00b1 @jhung0
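To make the body-plus-heads idea concrete, here is a rough, dependency-free sketch of how the composition could look. Everything in it is hypothetical and only for illustration: the names `RCNN`, `build_rcnn`, `BODIES`, `resnet_body`, `vgg_body`, `score_head`, `box_head`, and `mask_head` are not part of Keras or of this project, and the bodies and heads are stubs that return dummy values instead of running a real network.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

# All names below are hypothetical placeholders, not an existing API.
ROI = Tuple[int, int, int, int]          # (x1, y1, x2, y2)
Body = Callable[[Any], List[ROI]]        # image -> ROIs
Head = Callable[[List[ROI]], List[Any]]  # ROIs -> one prediction per ROI


def resnet_body(image: Any) -> List[ROI]:
    """Stand-in for a ResNet-based body; returns fixed dummy ROIs."""
    return [(0, 0, 10, 10), (5, 5, 20, 20)]


def vgg_body(image: Any) -> List[ROI]:
    """Stand-in for a VGG-based body; returns a fixed dummy ROI."""
    return [(1, 1, 8, 8)]


def score_head(rois: List[ROI]) -> List[Any]:
    """Stand-in classification head; one dummy score per ROI."""
    return [0.5 for _ in rois]


def box_head(rois: List[ROI]) -> List[Any]:
    """Stand-in box regression head; echoes the ROIs unchanged."""
    return list(rois)


def mask_head(rois: List[ROI]) -> List[Any]:
    """Stand-in mask head; one empty placeholder mask per ROI."""
    return [None for _ in rois]


# Registry of interchangeable bodies, keyed by name.
BODIES: Dict[str, Body] = {"resnet": resnet_body, "vgg": vgg_body}


@dataclass
class RCNN:
    """A body that proposes ROIs plus any number of named heads."""
    body: Body
    heads: Dict[str, Head] = field(default_factory=dict)

    def predict(self, image: Any) -> Dict[str, List[Any]]:
        rois = self.body(image)
        # Every head consumes the same ROIs produced by the body.
        return {name: head(rois) for name, head in self.heads.items()}


def build_rcnn(body: str = "resnet", include_mask: bool = False) -> RCNN:
    """Pick a body by name and optionally attach the mask head."""
    heads: Dict[str, Head] = {"scores": score_head, "boxes": box_head}
    if include_mask:
        heads["masks"] = mask_head
    return RCNN(body=BODIES[body], heads=heads)
```

With something along these lines, `build_rcnn("resnet")` would give a Faster R-CNN style model and `build_rcnn("vgg", include_mask=True)` a Mask R-CNN style one. In a real implementation the stubs would presumably be Keras layers or models sharing feature maps, which this sketch does not attempt to show.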