Clarification on Models #2

michael-camilleri · 2022-03-11T09:46:05Z

Am I right in understanding that currently, you only implement the STLT model, without any of the appearance features?

Is there a timeline for when the use of appearance features will be incorporated? I am particularly interested in either the PBF (hoping to use my own appearance features) or CACNF models (if it can be pretrained end-to-end).

Thanks

gorjanradevski · 2022-03-11T14:18:21Z

Hi Michael! Thanks for your interest in our work.

You are correct: currently the only model implemented (with released weights) is STLT. Since I was too busy with CVPR/ECCV, I postponed the release of the other models :( I plan on releasing CACNF (and others) by the end of March at the latest. Would that work for you?

As soon as I update the repo I'll make sure to ping you.

Cheers!

michael-camilleri · 2022-03-11T16:03:32Z

Thanks Gorjan

I will await release of the model then.
By way of clarification, I am interested in using the model to classify the behaviour of group-housed mice: so I would need to fine-tune the model (preferably end-to-end) on my data, with custom classes. Would this be possible? Or do you think the architecture is not amenable to such a scenario?

Thanks

gorjanradevski · 2022-03-14T10:02:23Z

Hello Michael,

It is definitely possible. However, note that you would have to train an object detector separately, as the models' inputs are object detections (STLT), together with video appearance embeddings, i.e., RGB frames (CACNF). I can provide further suggestions if needed.

Best,
Gorjan

michael-camilleri · 2022-03-14T10:42:41Z

Hi Gorjan

Thanks.
Indeed, I already have detections of the mice through an in-house trained object detector. I suspect I would be able to get appearance embeddings in some way through it.

With regard to the CACNF model, does it not train the appearance embeddings itself (as in, given raw frames and BBoxes)? In that case, it is possible to pre-train the backbone on my own dataset first (I have an associated dataset of detections).

Michael

gorjanradevski · 2022-03-23T21:42:24Z

Hi Michael, CACNF trains the appearance embeddings in conjunction with the layout embeddings. The model uses a ResNet 3D (pre-trained on Kinetics, etc.) and fine-tunes the ResNet to obtain appearance embeddings. Does that answer your question?

michael-camilleri · 2022-03-24T15:35:35Z

Yes. Thanks

michael-camilleri · 2022-03-30T15:47:10Z

Hi Gorjan

Has there been any progress on the release of CACNF?

Thanks

gorjanradevski · 2022-03-30T19:35:26Z

Hi Michael,

Slowly but surely I'm refactoring the code. In the past few days, I've released the Action Genome models, the appearance Resnet3D models (currently working on this), and refactored everything under a joint training and inference pipeline. The original code is ugly, DRY violated all around...so it takes some time :( Plus, I'm also renaming modules in ALL of the trained checkpoints because of the bad naming I did while chasing the deadline...This usually takes place in a branch other than master, currently action_genome (have a look, fetch at your own risk).

I'm doing my best to finish by the end of the weekend. I'll keep you posted.

Best,
Gorjan

michael-camilleri · 2022-03-31T06:47:01Z

Gosh, you do have your work cut out for you. Thanks for taking up the mantle on this, I really appreciate it.

(and I fully understand publication code being ugly... I've been there)

gorjanradevski · 2022-04-02T22:53:44Z

Michael! I've just released CACNF trained on Something-Else Compositional split (and some others), and significantly refactored the codebase. Currently I've verified the inference results against the paper (Table 1) and tested the training of STLT on Action Genome, which works as expected. I haven't yet trained a new CACNF model with the refactored repo, but I assume it should work.

Make sure to update the datasets/models in case you downloaded something before, and let me know how it goes!

michael-camilleri · 2022-04-04T08:59:38Z

Hi Gorjan
Thanks for the above! I will have a look and see if I can get started.

michael-camilleri · 2022-04-04T13:01:11Z

Hi Gorjan
Could you clarify if you also changed the STLT model architecture/implementation? I seem unable to load previously trained models.

gorjanradevski · 2022-04-04T14:29:18Z

The only thing changed in the STLT model architecture is that some of the modules are renamed. In particular, I think the only thing you need to change is to rename the backbone from stlt_backbone to backbone. Please post the error you're getting, and I'll have a look.

michael-camilleri · 2022-04-04T14:47:35Z

Hi Gorjan
Indeed, I managed to do just that (I just loaded the state_dict and renamed the keys). It is still complaining about missing score_embeddings, but it is working (it seems these are not used for the Something-Something dataset).

Thanks

gorjanradevski · 2022-04-04T14:50:57Z

Yes, they are not used for Something-Something. The score_embedding is used for Action Genome, and I included them in the model definition in order not to have multiple implementations of STLT. Let me know in case you encounter any other issue.
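For readers hitting the same loading problem, the key renaming discussed above can be sketched roughly as follows. This is a minimal illustration and not code from the repository: the helper `rename_prefix` and the toy key names are hypothetical, and it assumes the old module name appears as a leading `stlt_backbone.` prefix in the checkpoint keys, which may not match the real checkpoint layout (for example if the keys carry an extra wrapper prefix).

```python
from collections import OrderedDict


def rename_prefix(state_dict, old="stlt_backbone", new="backbone"):
    """Return a copy of state_dict with a leading module name renamed.

    Hypothetical helper: only keys that start with `old` are touched.
    """
    renamed = OrderedDict()
    for key, value in state_dict.items():
        if key == old or key.startswith(old + "."):
            key = new + key[len(old):]
        renamed[key] = value
    return renamed


# Toy stand-in for an old checkpoint; the key names are made up and
# real values would be tensors rather than lists.
old_checkpoint = OrderedDict([
    ("stlt_backbone.layer.weight", [0.1, 0.2]),
    ("stlt_backbone.layer.bias", [0.0]),
    ("classifier.weight", [0.3]),
])

new_checkpoint = rename_prefix(old_checkpoint)
print(list(new_checkpoint))
```

With PyTorch, passing the renamed dictionary to `model.load_state_dict(new_checkpoint, strict=False)` reports missing keys (such as the score embeddings mentioned above) in its return value instead of raising an error. Whether leaving those parameters at their initial values is acceptable depends on the dataset, as noted in the reply above.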

michael-camilleri closed this as completed Apr 5, 2022
