Clarification on Models #2

Closed
michael-camilleri opened this issue Mar 11, 2022 · 15 comments

Comments

@michael-camilleri

Am I right in understanding that currently, you only implement the STLT model, without any of the appearance features?

Is there a timeline for when the use of appearance features will be incorporated? I am particularly interested in either the PBF (hoping to use my own appearance features) or CACNF models (if it can be pretrained end-to-end).

Thanks

@gorjanradevski
Owner

Hi Michael! Thanks for your interest in our work.

You are correct: currently the only model implemented (with released weights) is STLT. Since I was too busy with CVPR/ECCV, I postponed the release of the other models :( I plan on releasing CACNF (and others) by the end of March, at the latest. Would that work for you?

As soon as I update the repo I'll make sure to ping you.

Cheers!

@michael-camilleri
Author

Thanks Gorjan

I will await release of the model then.
By way of clarification, I am interested in using the model to classify the behaviour of group-housed mice, so I would need to fine-tune the model (preferably end-to-end) on my data, with custom classes. Would this be possible? Or do you think the architecture is not amenable to such a scenario?

Thanks

@gorjanradevski
Owner

Hello Michael,

It is definitely possible. Note, however, that you would have to train an object detector separately, as the models take object detections as input (STLT), together with video appearance embeddings, i.e., RGB frames (CACNF). I can provide further suggestions if needed.

Best,
Gorjan

@michael-camilleri
Author

michael-camilleri commented Mar 14, 2022

Hi Gorjan

Thanks.
Indeed, I already have detections of the mice through an in-house trained object detector. I suspect I would be able to get appearance embeddings in some way through it.

With regard to the CACNF model, does it not train the appearance embeddings itself (as in, given raw frames and BBoxes)? In that case, it is possible to pre-train the backbone on my own dataset first (I have an associated dataset of detections).

Michael

@gorjanradevski
Owner

Hi Michael, CACNF trains the appearance embeddings in conjunction with the layout embeddings. The model uses a ResNet 3D (pre-trained on Kinetics, etc.) and fine-tunes the ResNet to obtain appearance embeddings. Does that answer your question?

@michael-camilleri
Author

Yes. Thanks

@michael-camilleri
Author

Hi Gorjan

Has there been any progress on the release of CACNF?

Thanks

@gorjanradevski
Owner

Hi Michael,

Slowly but surely I'm refactoring the code. In the past few days, I've released the Action Genome models, the appearance ResNet3D models (currently working on this), and refactored everything under a joint training and inference pipeline. The original code is ugly, with DRY violated all around... so it takes some time :( Plus, I'm also renaming modules in ALL of the trained checkpoints because of the bad naming I did while chasing the deadline... This usually takes place in a branch other than master, currently action_genome (have a look, fetch at your own risk).

I'm doing my best to finish by the end of the weekend. I'll keep you posted.

Best,
Gorjan

@michael-camilleri
Author

Gosh, you do have your work cut out for you. Thanks for taking up the mantle on this, I really appreciate it.

(and I fully understand about publication code being ugly... I've been there)

@gorjanradevski
Owner

Michael! I've just released CACNF trained on the Something-Else Compositional split (and some others), and significantly refactored the codebase. So far I've verified the inference results against the paper (Table 1) and tested the training of STLT on Action Genome, which works as expected. I haven't yet trained a new CACNF model with the refactored repo, but I assume it should work.

Make sure to update the datasets/models in case you downloaded something before, and let me know how it goes!

@michael-camilleri
Author

Hi Gorjan
Thanks for the above! I will have a look and see if I can get started.

@michael-camilleri
Author

Hi Gorjan
Could you clarify if you also changed the STLT model architecture/implementation? I seem unable to load previously trained models.

@gorjanradevski
Owner

The only thing changed in the STLT model architecture is that some of the modules are renamed. In particular, I think the only thing you need to change is to rename the backbone from stlt_backbone to backbone. Please post the error you're getting, and I'll have a look.
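
For readers with checkpoints saved before the refactor, the rename described above could be applied to the saved parameter names along these lines. This is only a sketch under assumptions: it treats the checkpoint as a flat mapping from parameter names to values (as a PyTorch state dict is), assumes stlt_backbone appears as the leading module name in each affected key, and uses made-up key names (frames_embed, prediction_head) purely for illustration. The helper rename_prefix is hypothetical and not part of the repository.

```python
from collections import OrderedDict
from typing import Any, Mapping


def rename_prefix(state_dict: Mapping[str, Any], old: str, new: str) -> "OrderedDict[str, Any]":
    """Return a copy of state_dict in which the leading module name `old` is replaced by `new`.

    Hypothetical helper: only keys equal to `old` or starting with `old + "."` are renamed.
    """
    renamed = OrderedDict()
    old_dot = old + "."
    for key, value in state_dict.items():
        if key == old or key.startswith(old_dot):
            key = new + key[len(old):]
        renamed[key] = value
    return renamed


# Toy stand-in for an old checkpoint. The key names are invented for this
# example; in a real checkpoint the values would be tensors.
old_checkpoint = OrderedDict([
    ("stlt_backbone.frames_embed.weight", [0.1, 0.2]),
    ("stlt_backbone.frames_embed.bias", [0.0]),
    ("prediction_head.weight", [0.3]),
])

new_checkpoint = rename_prefix(old_checkpoint, "stlt_backbone", "backbone")
print(list(new_checkpoint))
```

With PyTorch, the mapping would typically come from torch.load(path, map_location="cpu"), and the renamed result would be passed to the model's load_state_dict. Whether other modules were renamed as well is not stated in this thread, so any remaining mismatches would still need to be checked against the error message.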

@michael-camilleri
Author

michael-camilleri commented Apr 4, 2022 via email

@gorjanradevski
Owner

Yes, they are not used for Something-Something. The score_embedding is used for Action Genome, and I included them in the model definition in order not to have multiple implementations of STLT. Let me know in case you encounter any other issue.
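
The message this reply answers was sent by email and is not shown here, so the exact parameters in question are unknown. As a general aid when a checkpoint and a model definition disagree, a sketch like the following can list which parameter names appear on only one side. The function compare_keys and all key names below are hypothetical, chosen only to mirror the score_embedding module mentioned above.

```python
from typing import Dict, Iterable, List


def compare_keys(checkpoint_keys: Iterable[str], model_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Report parameter names present in only one of the two key sets."""
    ckpt, model = set(checkpoint_keys), set(model_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_in_checkpoint": sorted(model - ckpt),
        # Present in the checkpoint but unknown to the model.
        "unexpected_in_checkpoint": sorted(ckpt - model),
    }


# Invented key names: the model definition includes a score embedding,
# while the (hypothetical) older checkpoint does not.
model_keys = [
    "backbone.frames_embed.weight",
    "backbone.score_embedding.weight",
]
checkpoint_keys = [
    "backbone.frames_embed.weight",
]

report = compare_keys(checkpoint_keys, model_keys)
print(report)
```

In PyTorch, calling load_state_dict with strict=False returns the same kind of information as missing_keys and unexpected_keys, which is often enough to confirm that the only differences are parameters the chosen dataset does not use.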
