RULEDST evaluation #227

Open
IreneSucameli opened this issue Mar 1, 2022 · 9 comments
Labels
feature Feature to add in the future

Comments

@IreneSucameli

Hi, could you please provide more information on how the Rule DST module is evaluated?
Thanks

@IreneSucameli IreneSucameli added the feature Feature to add in the future label Mar 1, 2022
@zqwerty
Member

zqwerty commented Mar 10, 2022

We did not evaluate the rule DST on its own, since it takes dialog acts as input. If you want to compare rule DST with other DST models, you can use the golden dialog acts as input, or use an NLU model such as BERTNLU to parse both user and system acts.

@IreneSucameli
Author

I would like to use the output of BERTNLU as the input for the DST; however, it is not clear to me how to pass the data from one module to the other, and I haven't found any code for that in ConvLab so far.

Could you kindly link the ConvLab page where this is described, or give me more information about this process?

@zqwerty
Member

zqwerty commented Mar 21, 2022

You can refer to the Colab tutorial or the interface classes for NLU and DST. See PipelineAgent for how to build an agent from modules. Example usage:
https://github.com/thu-coai/ConvLab-2/blob/master/tests/test_BERTNLU-RuleDST-RulePolicy-TemplateNLG.py

@IreneSucameli
Author

Thank you for the info. However, the Colab tutorial covers an overall evaluation (nlu + dst + nlg).
What if I would like to evaluate nlu + dst only, to check whether the defined rules are OK or need improvement? Is that possible? Thanks again

@zqwerty
Member

zqwerty commented Mar 21, 2022

Sure. Just feed the output of NLU to DST:

    # excerpt from PipelineAgent; the opening `if` is implied by the `else` below
    if self.nlu is not None:
        self.input_action = self.nlu.predict(observation, context=[x[1] for x in self.history[:-1]])
    else:
        self.input_action = observation
    self.input_action = deepcopy(self.input_action)  # get rid of reference problem
    # get state
    if self.dst is not None:
        if self.name == 'sys':  # compare strings with ==, not `is`
            self.dst.state['user_action'] = self.input_action
        else:
            self.dst.state['system_action'] = self.input_action
        state = self.dst.update(self.input_action)
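
The hand-off in the snippet above can also be tried outside PipelineAgent. The following is only a minimal sketch: `ToyNLU`, `ToyDST`, and `track_turn` are hypothetical stand-ins written for illustration, not ConvLab-2 classes. They only mimic the `predict(...)` and `update(...)` calls and the `state['user_action']` field that appear in the snippet, and the `[intent, domain, slot, value]` act format and the `belief_state` layout are assumptions.

```python
from copy import deepcopy


class ToyNLU:
    """Hypothetical stand-in for an NLU module such as BERTNLU."""

    def predict(self, utterance, context=None):
        # Return dialog acts as [intent, domain, slot, value] lists (assumed format).
        acts = []
        if "cheap" in utterance.lower():
            acts.append(["Inform", "Hotel", "Price", "cheap"])
        return acts


class ToyDST:
    """Hypothetical stand-in for a rule-based tracker such as RuleDST."""

    def __init__(self):
        self.state = {"user_action": [], "belief_state": {}}

    def update(self, user_acts):
        # Single toy rule: every Inform act writes its value into the belief state.
        for intent, domain, slot, value in user_acts:
            if intent == "Inform":
                self.state["belief_state"].setdefault(domain, {})[slot] = value
        return self.state


def track_turn(nlu, dst, utterance, history):
    """Run one user turn through NLU, then DST, and return a copy of the state."""
    user_acts = deepcopy(nlu.predict(utterance, context=history))
    dst.state["user_action"] = user_acts
    return deepcopy(dst.update(user_acts))


nlu, dst = ToyNLU(), ToyDST()
state = track_turn(nlu, dst, "I need a cheap hotel", history=[])
print(state["belief_state"])
```

Swapping the toy classes for real modules should only require that they expose the same two methods; whether the tracker must be reset between dialogs depends on the module and is not covered here.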

@IreneSucameli
Author

From the code you posted, it doesn't seem that the module is evaluated with F1 scores or a similar measure... perhaps I don't understand your point...

@zqwerty
Member

zqwerty commented Mar 23, 2022

Sorry, I thought you needed instructions on how to pass the output of NLU to DST. If you want to evaluate NLU+DST, you can write a script to: 1) read the original data; 2) pass utterances to NLU to get the user dialog acts; 3) pass user dialog acts to RuleDST to get the predicted state; 4) compare predictions with references.
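
The four steps above could be sketched roughly as follows. This is not a ConvLab-2 script: `flatten`, `evaluate`, `toy_predict`, and the turn layout (`utterance`, `belief_state`) are hypothetical names chosen for illustration, and joint accuracy plus slot-level F1 are just one common way to do the comparison in step 4. In practice, `predict_state` would wrap the real NLU and RuleDST calls.

```python
def flatten(belief_state):
    """Turn {domain: {slot: value}} into a set of (domain, slot, value) triples."""
    return {
        (domain, slot, value)
        for domain, slots in belief_state.items()
        for slot, value in slots.items()
        if value  # ignore slots that are still empty
    }


def evaluate(turns, predict_state):
    """Compare predicted and reference states; return joint accuracy and slot P/R/F1."""
    tp = fp = fn = joint_correct = 0
    for turn in turns:
        pred = flatten(predict_state(turn["utterance"]))  # steps 2 and 3
        gold = flatten(turn["belief_state"])              # reference state
        tp += len(pred & gold)
        fp += len(pred - gold)
        fn += len(gold - pred)
        joint_correct += pred == gold                     # whole state must match
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    joint_acc = joint_correct / len(turns) if turns else 0.0
    return {"joint_acc": joint_acc, "precision": precision, "recall": recall, "f1": f1}


# Step 1 stand-in: two turns of one dialog with reference belief states (made-up data).
turns = [
    {"utterance": "cheap hotel please",
     "belief_state": {"Hotel": {"Price": "cheap"}}},
    {"utterance": "in the north",
     "belief_state": {"Hotel": {"Price": "cheap", "Area": "north"}}},
]

tracked = {}  # state accumulated across turns by the toy predictor


def toy_predict(utterance):
    """Hypothetical NLU+DST stand-in that only recognizes the word 'cheap'."""
    if "cheap" in utterance:
        tracked.setdefault("Hotel", {})["Price"] = "cheap"
    return tracked


scores = evaluate(turns, toy_predict)
print(scores)
```

Because the toy predictor never picks up the area, the second turn counts as a joint-accuracy miss and a slot-level false negative, which is the kind of signal that would point to a missing or faulty rule.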

@zqwerty
Member

zqwerty commented Mar 23, 2022

@IreneSucameli
Author

OK, thanks, I'll try it this way!
