Evaluation metrics #4

Open
KyonP opened this issue Apr 21, 2022 · 0 comments

KyonP commented Apr 21, 2022

Hello, I hope your research is going well. 😀

I am trying to evaluate my model with the metrics you proposed.

I have read your paper, but I would like to ask you to double-check my understanding
(my results seem a bit odd and off the scale, which is why I'm asking 😢).

  1. I presume that the "Character F1" score corresponds to the "micro avg" F1 output of your eval_classifier.py script. Am I correct? (See the sketch after this list for how I am currently computing items 1–3.)
  2. Likewise, does "Frame accuracy" correspond to the "eval Image Exact Match Acc" output of eval_classifier.py?
  3. Are the reported BLEU-2 and BLEU-3 scores scaled by 100? I ran your translate.py script on my generated images and got scores of about 0.04, so I am wondering whether the reported numbers are multiplied by 100.
  4. Lastly, the R-precision evaluation procedure is unclear to me. Do I need to train your H-DAMSM code myself? If so, when is the right time to stop training and benchmark my model?
  5. For a fair comparison, would it be possible to provide your pretrained H-DAMSM weights?
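
To make questions 1–3 concrete, here is a minimal sketch of how I am currently computing these metrics. It uses standard scikit-learn / NLTK implementations and dummy data, so it is only my assumption of the protocol, not your actual eval_classifier.py / translate.py code:

```python
# Minimal sketch of my current understanding (NOT your actual scripts):
# "Character F1" as micro-averaged F1 over per-frame character labels,
# "Frame accuracy" as exact match over the full label set of each frame,
# and BLEU-2 / BLEU-3 scaled by 100.
import numpy as np
from sklearn.metrics import f1_score
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# Dummy character-presence matrices, shape (num_frames, num_characters)
y_true = np.array([[1, 0, 1], [0, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 0]])

char_f1 = f1_score(y_true, y_pred, average="micro")        # "Character F1"?
frame_acc = (y_true == y_pred).all(axis=1).mean()          # "Frame accuracy" (exact match)?

# BLEU on captions generated from my images vs. reference captions (dummy tokens)
references = [[["a", "girl", "plays", "outside"]]]
hypotheses = [["a", "girl", "plays"]]
smooth = SmoothingFunction().method1
bleu2 = corpus_bleu(references, hypotheses, weights=(0.5, 0.5), smoothing_function=smooth)
bleu3 = corpus_bleu(references, hypotheses, weights=(1/3, 1/3, 1/3), smoothing_function=smooth)

print(f"char F1: {char_f1:.4f}, frame acc: {frame_acc:.4f}")
print(f"BLEU-2: {bleu2 * 100:.2f}, BLEU-3: {bleu3 * 100:.2f}")  # reported x100?
```

Please let me know if any of these assumptions differ from what you actually did.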

I am currently stuck on the R-precision evaluation with H-DAMSM, so I was considering using the recent CLIP R-precision instead; however, I am opening this issue first because I want to avoid any fairness issues in the comparison.
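
For clarity, this is roughly the CLIP R-precision variant I had in mind (a sketch using the openai/CLIP package; `clip_r_precision` and its arguments are placeholder names of mine, not anything from your repo, and this is not the H-DAMSM-based protocol from your paper):

```python
# Rough sketch of CLIP R-precision (the alternative I was considering),
# NOT the H-DAMSM-based protocol from the paper. Assumes openai/CLIP is installed.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def clip_r_precision(image_paths, true_captions, distractors_per_image):
    """For each generated image, rank its ground-truth caption against a set of
    distractor captions; R-precision is the fraction of images whose true caption
    is ranked first."""
    hits = 0
    for path, caption, distractors in zip(image_paths, true_captions, distractors_per_image):
        image = preprocess(Image.open(path)).unsqueeze(0).to(device)
        texts = clip.tokenize([caption] + distractors).to(device)
        with torch.no_grad():
            image_feat = model.encode_image(image)
            text_feats = model.encode_text(texts)
        image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
        text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)
        sims = (image_feat @ text_feats.T).squeeze(0)   # cosine similarities
        hits += int(sims.argmax().item() == 0)          # index 0 is the true caption
    return hits / len(image_paths)
```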
