This repository has been archived by the owner on Dec 29, 2022. It is now read-only.

Multiple bugs for evaluating selfplay #9

Closed
HMJiangGatech opened this issue Jun 3, 2020 · 2 comments

@HMJiangGatech

  1. In the README, Section 5 (Scoring):
airdialogue score --pred_data ./data/out_dir/dev_selfplay_out.txt \
                  --true_data ./data/airdialogue/tokenized/dev.selfplay.eval.data \
                  --true_kb ./data/airdialogue/tokenized/dev.selfplay.eval.kb \
                  --task selfplay \
                  --output ./data/out_dir/dev_selfplay.json

This command passes the tokenized true_data and true_kb files.
However, according to
https://github.com/josephch405/airdialogue/blob/c74072f8667d92839dc39e98b386ce8e932c8c68/airdialogue/evaluator/evaluator_main.py#L240-L256
the evaluator actually expects JSON files.
Maybe change those two arguments to

                  --true_data ./data/airdialogue/json/dev_data.json \
                  --true_kb ./data/airdialogue/json/dev_kb.json \

?
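
One way to check which format a given file is in before passing it to the scorer is to try parsing its first few lines as JSON. This is only a sketch: it assumes the JSON files hold one object per line, and `looks_like_json_lines` is a hypothetical helper, not part of airdialogue.

```python
import json


def looks_like_json_lines(lines, max_lines=5):
    """Return True if the first few non-empty lines each parse as a JSON object.

    Hypothetical helper (not part of airdialogue); assumes one object per line.
    """
    checked = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            return False  # plain tokenized text ends up here
        if not isinstance(obj, dict):
            return False  # valid JSON, but not an object
        checked += 1
        if checked >= max_lines:
            break
    return checked > 0


# Usage: any iterable of lines works, including an open file, e.g.
# with open("./data/airdialogue/json/dev_data.json") as f:
#     print(looks_like_json_lines(f))
```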

  2. After fixing the previous bug, another one appears:

https://github.com/josephch405/airdialogue/blob/c74072f8667d92839dc39e98b386ce8e932c8c68/airdialogue/evaluator/evaluator_main.py#L247

It processes pred_json_obj['action'] with action_obj_to_str. However, this step has already been done when generating dev_selfplay_out.txt.

Maybe remove the action_obj_to_str call?
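
An alternative to removing the call outright would be to make the conversion idempotent, so it works whether or not the action has already been turned into a string. A minimal sketch under assumptions: `to_str` stands in for `action_obj_to_str`, whose real signature and output format are not shown here, and `fake_action_obj_to_str` is a made-up placeholder used only to make the example runnable.

```python
def normalize_action(action, to_str):
    """Return the action as a string, converting only if it is not one already."""
    if isinstance(action, str):
        return action  # already stringified upstream, leave untouched
    return to_str(action)


def fake_action_obj_to_str(action_obj):
    # Placeholder for illustration only; NOT the real action_obj_to_str.
    return " ".join(str(action_obj[key]) for key in sorted(action_obj))


# Both kinds of input come out as a string:
# normalize_action("already a string", fake_action_obj_to_str)
# normalize_action({"b": 2, "a": 1}, fake_action_obj_to_str)
```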

  3. After that, another one appears:
    https://github.com/josephch405/airdialogue/blob/c74072f8667d92839dc39e98b386ce8e932c8c68/airdialogue/evaluator/evaluator_main.py#L252

pred_json_obj is not compatible with json_obj_to_tokens, because pred_json_obj does not have the key dialogue. Instead, it has a key called utterance.

I can get the program to run by replacing that line with

pred_raw_text = pred_json_obj['utterance'].replace('<t1> ','').replace('<t2> ','').split(' ')

However, I think that may not be the optimal solution.
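
A slightly more robust variant of that one-liner, as a sketch: it strips the `<t1>`/`<t2>` turn markers with a regular expression and splits on any whitespace, so a marker at the end of the string or a doubled space does not leave stray or empty tokens. `utterance_to_tokens` is a hypothetical name, and whether the evaluator expects exactly this token format is an assumption.

```python
import re

# Matches the two turn markers that appear in the predicted utterance.
_TURN_TAG = re.compile(r"<t[12]>")


def utterance_to_tokens(utterance):
    """Drop <t1>/<t2> turn markers and split the remainder on whitespace."""
    return _TURN_TAG.sub(" ", utterance).split()


# Hypothetical use in place of the line above:
# pred_raw_text = utterance_to_tokens(pred_json_obj['utterance'])
```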

@josephch405
Contributor

This is addressed in a working version of a PR at the Airdialogue repository. I will close this once both are merged in with the README updates. At the moment we're basically doing what you mention in question 3, transforming between utterances and dialogues, but on the Airdialogue CLI side.

@josephch405
Contributor

Addressed in #10
