Clarification on Models #2
Comments
Hi Michael! Thanks for your interest in our work. You are correct: currently the only model implemented (with released weights) is STLT. Since I was too busy with CVPR/ECCV, I postponed the release of the other models :( I plan on releasing CACNF (and others) by the end of March at the latest. Would that work for you? As soon as I update the repo I'll make sure to ping you. Cheers! |
Thanks Gorjan, I will await the release of the model then. Thanks |
Hello Michael, It is definitely possible, however, note that you would have to separately train an object detector, as the models' inputs are object detections (STLT), together with video appearance embeddings, i.e., RGB frames (CACNF). I can provide further suggestions if needed. Best, |
Hi Gorjan, thanks. With regard to the CACNF model, does it not train the appearance embeddings itself (as in, given raw frames and BBoxes)? In this case, it is possible to pre-train the backbone on my own dataset first (I have an associated dataset of detections). Michael |
Hi Michael, CACNF trains the appearance embeddings in conjunction with the layout embeddings. The model uses a ResNet 3D (pre-trained on Kinetics, etc.) and fine-tunes the ResNet to obtain appearance embeddings. Does that answer your question? |
Yes. Thanks |
Hi Gorjan Has there been any progress on the release of CACNF? Thanks |
Hi Michael, Slowly but surely I'm refactoring the code. In the past few days, I've released the Action Genome models, the appearance ResNet3D models (currently working on this), and refactored everything under a joint training and inference pipeline. The original code is ugly, with DRY violated all around, so it takes some time :( Plus, I'm also renaming modules in ALL of the trained checkpoints because of the bad naming I did while chasing the deadline. This usually takes place in a branch other than master, currently action_genome (have a look, fetch at your own risk). I'm doing my best to finish by the end of the weekend. I'll keep you posted. Best, |
Gosh, you do have your work cut out for you. Thanks for taking up the mantle on this, I really appreciate it. (And I fully understand about publication code being ugly... I've been there.) |
Michael! I've just released CACNF trained on Something-Else Compositional split (and some others), and significantly refactored the codebase. Currently I've verified the inference results against the paper (Table 1) and tested the training of STLT on Action Genome, which works as expected. I haven't yet trained a new CACNF model with the refactored repo, but I assume it should work. Make sure to update the datasets/models in case you downloaded something before, and let me know how it goes! |
Hi Gorjan |
Hi Gorjan |
The only thing changed in the STLT model architecture is that some of the modules are renamed. In particular, I think the only thing you need to change is to rename the backbone from |
Hi Gorjan
Indeed, I managed to do just that (I just loaded the state_dict and renamed the keys). It is still complaining about missing score_embeddings, but it is working (it seems these are not used for the Something-Something dataset).
Thanks
The only thing changed in the STLT model architecture is that some of the modules are renamed. In particular, I think the only thing you need to change is to rename the backbone from stlt_backbone to backbone. Please post the error you're getting, and I'll have a look.
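For anyone hitting the same mismatch with an older checkpoint, the key renaming described above can be sketched roughly as follows. This is only a minimal illustration, not the repository's own code: the helper name `rename_prefix` and the toy keys (`layer.weight`, `classifier.weight`) are made up for the example, and plain lists stand in for tensors so the snippet runs without PyTorch. Only the `stlt_backbone` to `backbone` rename comes from the thread.

```python
from collections import OrderedDict


def rename_prefix(state_dict, old_prefix, new_prefix):
    """Return a copy of state_dict where keys starting with old_prefix
    have that prefix replaced by new_prefix (hypothetical helper)."""
    renamed = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        renamed[key] = value
    return renamed


# Toy stand-in for an old checkpoint; the key names are assumptions
# and real values would be tensors loaded from the checkpoint file.
old_checkpoint = OrderedDict([
    ("stlt_backbone.layer.weight", [0.1, 0.2]),
    ("stlt_backbone.layer.bias", [0.0]),
    ("classifier.weight", [0.3]),
])

new_checkpoint = rename_prefix(old_checkpoint, "stlt_backbone.", "backbone.")
print(list(new_checkpoint))
```

With a real PyTorch model, the renamed dictionary would then be passed to `model.load_state_dict(new_checkpoint, strict=False)`; with `strict=False` the call reports missing and unexpected keys instead of raising, which is one way a message about missing `score_embeddings` could surface while the rest of the weights still load.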
|
Yes, they are not used for Something-Something. The |
Am I right in understanding that currently, you only implement the STLT model, without any of the appearance features?
Is there a timeline for when the use of appearance features will be incorporated? I am particularly interested in either the PBF (hoping to use my own appearance features) or CACNF models (if it can be pretrained end-to-end).
Thanks