
Does iCAN result contain human-object interactiveness relation? #17

Closed
xiadingZ opened this issue Jul 7, 2020 · 6 comments

Comments

@xiadingZ

xiadingZ commented Jul 7, 2020

Does the iCAN detection result contain human-object interactiveness relations? If so, you wouldn't need to estimate which object a person interacts with,
and would only need to predict which part interacts with that object?

@xiadingZ xiadingZ changed the title Does iCAN result contains human-object relation? Does iCAN result contains human-object interactioness relation? Jul 7, 2020
@DirtyHarryLYL
Owner

Part-level interaction, e.g., recognizing which object a hand is interacting with and what actions it is performing, is separate from instance-level interaction detection.
In the part-level model, we also provide positive and negative training samples for the PaSta recognizer, similar to the instance-level model.
For example, given the same human-mug box pair, the instance model may infer that they are non-interactive, but the PaSta model may infer that the hand and head are interacting with the object and thus conclude the result "drink with mug". This brings in more flexibility: if the instance model makes a mistake, the PaSta model may not.
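A toy numeric illustration of that flexibility (not code from the repo; the names, numbers, and the simple averaging are all made up, and only stand in for the actual late fusion described later in this thread):

```python
# Toy illustration only: the instance model scores the human-mug pair as
# non-interactive, but the part-level (PaSta) model still fires on the
# hand/head states, so combining the two levels can recover "drink with mug".
instance_interactive = 0.12                  # instance model: looks non-interactive
pasta_scores = {
    'hand_hold_sth': 0.81,                   # part-level states for the same pair
    'head_drink_with_sth': 0.77,
}
part_evidence = sum(pasta_scores.values()) / len(pasta_scores)

# A simple average stands in for the real late fusion: strong part-level
# evidence can outweigh a weak instance-level score.
fused = 0.5 * instance_interactive + 0.5 * part_evidence
print(fused > 0.5)                           # True -> keep "drink with mug"
```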

@xiadingZ
Author

xiadingZ commented Jul 7, 2020

Do you use a pretrained instance-level interaction detector? Or do you train instance-level interaction detection first and then the part-level one? Or both simultaneously? In this repo, I only see part-level interaction detection using this:
vec_cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=label_vec, logits=cls_score_vec).
Or could you point out the code for instance-level interaction detection? Thanks very much.
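For context, a minimal TF 1.x sketch of such a multi-label part-level loss head is shown below; `fc_part`, `num_pasta_classes`, and the placeholder shapes are assumptions for illustration, not names taken from the repo.

```python
import tensorflow as tf  # TF 1.x style, matching the snippet above

num_pasta_classes = 76                                     # placeholder count, not from the repo
fc_part = tf.placeholder(tf.float32, [None, 1024])         # assumed pooled part features
label_vec = tf.placeholder(tf.float32, [None, num_pasta_classes])  # multi-hot PaSta labels

cls_score_vec = tf.layers.dense(fc_part, num_pasta_classes)        # raw logits
vec_cross_entropy = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=label_vec,
                                            logits=cls_score_vec))
```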

@DirtyHarryLYL
Owner

In training, we only train the part model. In inference, the part model gives the part-level prediction, which is then late-fused with the instance-level prediction from an off-the-shelf instance model like iCAN or TIN.

@DirtyHarryLYL
Owner

For example, the result of TIN is loaded from a pkl here:

score_H = pickle.load(open('TIN/score_H.pkl', 'rb'))
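The other instance-level streams would presumably be loaded the same way; only `TIN/score_H.pkl` is quoted above, so the other filenames in this sketch are assumptions by analogy and may differ in the actual repo.

```python
import pickle

with open('TIN/score_H.pkl', 'rb') as f:
    score_H = pickle.load(f)        # human-stream scores from TIN
with open('TIN/score_O.pkl', 'rb') as f:    # assumed filename
    score_O = pickle.load(f)        # object-stream scores
with open('TIN/score_sp.pkl', 'rb') as f:   # assumed filename
    score_sp = pickle.load(f)       # spatial-stream scores
```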

@xiadingZ
Author

xiadingZ commented Jul 8, 2020

In inference, for each human, do you iterate over all objects and late-fuse the part-object prediction score with the instance-level HOI prediction score? If one object-part score is higher than a threshold, you use this human-object pair to predict the action and compute the metric; if all object-part pair scores are lower than the threshold, you drop this human-object pair.
Is my understanding correct? And could you give the equations of the late fusion?

@Foruck
Collaborator

Foruck commented Jul 8, 2020

The equation is: sHO = (((sH + sO) * ssp + sP + sA + sL) * score_I[im_index[keys[obj_index]], x:y]) * hod.
sH, sO, and ssp are prediction probabilities from TIN; (sH + sO) * ssp is the instance-level score and (sP + sA + sL) is the part-level score. score_I is the image-level score. hod is the product of the human detection score and the object detection score (both from the object detector).
In inference, for each human, iterate over all objects and perform the above late fusion. If the human detection score or the object detection score is lower than a given threshold, the pair is dropped. If the interactiveness score (from TIN, saved in -Results/TIN/neg.pkl and -Results/TIN/pos.pkl) doesn't satisfy the NIS rule, the pair is also dropped.
