Vocab mismatch between checkpoint and paper #19

Closed
aluo-x opened this issue Oct 18, 2019 · 3 comments

aluo-x commented Oct 18, 2019

Was just running the checkpoints for COCO & VG.

For VG there are indeed 45 relationships plus an "in_image" relationship, which matches the paper on arXiv. However, for COCO there are four additional "touching" relationships, which brings the total number of non-"in_image" relationships to 10.

@jcjohnson could you potentially help clarify this question?

Collaborator

jcjohnson commented Oct 18, 2019

In some of my earlier experiments I tried some additional relationships on COCO, defined as follows (this replaces https://github.com/google/sg2im/blob/master/sg2im/data/coco.py#L337):

      touching = False
      if self.touching_relations:
        area_s = (sx1 - sx0) * (sy1 - sy0)
        area_o = (ox1 - ox0) * (oy1 - oy0)
        ix0, ix1 = max(sx0, ox0), min(sx1, ox1)
        iy0, iy1 = max(sy0, oy0), min(sy1, oy1)
        area_i = max(0, ix1 - ix0) * max(0, iy1 - iy0)
        iou = area_i / (area_s + area_o - area_i)
        touching = 0.1 < iou < 0.5

      if sx0 < ox0 and sx1 > ox1 and sy0 < oy0 and sy1 > oy1:
        p = 'surrounding'
      elif sx0 > ox0 and sx1 < ox1 and sy0 > oy0 and sy1 < oy1:
        p = 'inside'
      elif theta >= 3 * math.pi / 4 or theta <= -3 * math.pi / 4:
        p = 'right touching' if touching else 'left of'
      elif -3 * math.pi / 4 <= theta < -math.pi / 4:
        p = 'bottom touching' if touching else 'above'
      elif -math.pi / 4 <= theta < math.pi / 4:
        p = 'left touching' if touching else 'right of'
      elif math.pi / 4 <= theta < 3 * math.pi / 4:
        p = 'top touching' if touching else 'below'
      p = self.vocab['pred_name_to_idx'][p]
      triples.append([s, p, o])
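
For readers who want to try the rule outside the dataset class, the `touching` test above can be pulled out into a standalone function. This is only a sketch: the name `is_touching`, the `(x0, y0, x1, y1)` tuple layout, and the guard against a zero-area union are assumptions made for illustration and are not part of the snippet above.

```python
def is_touching(s_box, o_box, lo=0.1, hi=0.5):
    """Return True if the IoU of two boxes lies strictly between lo and hi.

    Boxes are assumed to be (x0, y0, x1, y1) tuples; this layout is an
    assumption for the sketch, not something confirmed by the thread.
    """
    sx0, sy0, sx1, sy1 = s_box
    ox0, oy0, ox1, oy1 = o_box
    area_s = (sx1 - sx0) * (sy1 - sy0)
    area_o = (ox1 - ox0) * (oy1 - oy0)
    # Intersection rectangle, clamped to zero when the boxes do not overlap.
    ix0, ix1 = max(sx0, ox0), min(sx1, ox1)
    iy0, iy1 = max(sy0, oy0), min(sy1, oy1)
    area_i = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    union = area_s + area_o - area_i
    if union <= 0:
        # Degenerate boxes: this guard is an addition, not in the snippet above.
        return False
    iou = area_i / union
    return lo < iou < hi


overlapping = is_touching((0.0, 0.0, 2.0, 2.0), (1.0, 0.0, 3.0, 2.0))  # IoU = 2/6
disjoint = is_touching((0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0))     # IoU = 0
identical = is_touching((0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0))    # IoU = 1
```

Only the partially overlapping pair falls inside the (0.1, 0.5) band; disjoint and identical boxes both fall outside it, so they keep the plain directional predicate.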

However, in the final models I didn't end up using these relationships. They are still present in the vocab of the pretrained models, but they were not used at all during training, so the embeddings associated with them in the released model weights will be random. Thus, if you try to pass a scene graph with one of these "touching" relationships, you will probably get garbage output from the model.
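
One way to guard against this is to check a scene graph for the unused predicates before passing it to the model. The following is a minimal sketch, not part of sg2im: the helper name `find_untrained_triples` and the `(subject, predicate, object)` name-triple format are hypothetical, and the predicate names are taken from the snippet above.

```python
# Predicate names as they appear in the snippet above; assumed to match
# the entries in the released checkpoint vocab.
UNTRAINED_PREDICATES = frozenset({
    'left touching',
    'right touching',
    'top touching',
    'bottom touching',
})


def find_untrained_triples(triples, untrained=UNTRAINED_PREDICATES):
    """Return the triples whose predicate was never seen during training.

    Each triple is assumed to be (subject, predicate, object), with the
    predicate given by name rather than by vocab index.
    """
    return [t for t in triples if t[1] in untrained]


# Hypothetical scene graph for illustration.
triples = [
    ('sheep', 'left of', 'tree'),
    ('sheep', 'left touching', 'grass'),
]
bad = find_untrained_triples(triples)
if bad:
    print('Triples with untrained predicates:', bad)
```

A caller could then drop the flagged triples or swap each one for its plain directional counterpart before running the model.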

Author

aluo-x commented Oct 18, 2019

Many thanks for the impressively quick reply! Really appreciate the clarification!

aluo-x closed this as completed Oct 18, 2019

Collaborator

jcjohnson commented

My reply times are usually bimodal: either I respond right away or it will fall out of my inbox and be forgotten forever!
