Skip to content
This repository was archived by the owner on Aug 22, 2019. It is now read-only.

Conversation

@tmbo
Copy link
Member

@tmbo tmbo commented Sep 10, 2018

Proposed changes:

  • Don't run actions during evaluate
  • fixed export format to include the whole stories
  • use the name of the original story if possible
  • fixes #720

Status (please check what you already did):

  • made PR ready for code review
  • added some tests for the functionality
  • updated the documentation
  • updated the changelog

@akelad
Copy link
Contributor

akelad commented Sep 10, 2018

Something i noticed, if there's an OR in a story, the story title becomes a repetition of the story title, separated by a > (e.g. ## just newsletter > just newsletter) -- any way we can avoid this and just have the actual original title?

@akelad
Copy link
Contributor

akelad commented Sep 10, 2018

Also it prints out a failure for each OR possibility, one would be enough 😄

@tmbo
Copy link
Member Author

tmbo commented Sep 10, 2018

I've pushed a fix for the naming, would be great if you can try it out.

Concerning the number of failures: that is a tricky one and I don't think we can solve it as part of this one.

@akelad
Copy link
Contributor

akelad commented Sep 10, 2018

Haha yeah i thought that might be trickier. but yeah, will try out that fix!

@akelad
Copy link
Contributor

akelad commented Sep 10, 2018

just tested and the naming thing seems to be fixed :)

@akelad
Copy link
Contributor

akelad commented Sep 10, 2018

Actually I take back my comment about only printing out one OR possibility. Sometimes it is genuinely just the one intent that causes the problem and not the others

@tmbo tmbo requested a review from akelad September 10, 2018 16:49
@tmbo
Copy link
Member Author

tmbo commented Sep 10, 2018

ready for review ✅

@tmbo tmbo mentioned this pull request Sep 10, 2018
4 tasks
@Ghostvv Ghostvv self-requested a review September 12, 2018 08:24
@tmbo tmbo removed the request for review from akelad September 12, 2018 11:46
Copy link
Contributor

@Ghostvv Ghostvv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me except for inconsistent names (see comments): I think we should use everywhere actual or gold, prediction or pred. And name lists with plural ...s.

I ran it on paper bot - works fine, but additionally I would like to have the statement: number of correct stories out of

For some reason my logger statements start from :INFO:rasa_nlu.evaluate:..., why is it nlu?

preds_padding = (len(actions_between_utterances) -
len(last_prediction))
preds = []
gold = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd name it golds as in above method

preds.extend(last_prediction)
preds_padding = (len(actions_between_utterances) -
len(last_prediction))
preds = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd name it predictions here or preds above

actual_padding = (len(last_prediction) -
len(actions_between_utterances))
for tracker in tqdm(completed_trackers):
curr_gold, curr_predictions, predicted_tracker = \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like curr, can we rename it to current?

"{} but got {}.".format(p, a))


def align_lists(pred, actual):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's name it predictions, golds here, or preds, actuals, but then rename gold to actual everywhere

completed_trackers = evaluate._generate_trackers(
DEFAULT_STORIES_FILE, default_agent)

actual, preds, failed_stories = collect_story_predictions(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistent names here as wall

@tmbo
Copy link
Member Author

tmbo commented Sep 13, 2018

Great push for better names 👍

from rasa_core.interpreter import NaturalLanguageInterpreter
from rasa_core.trackers import DialogueStateTracker
from rasa_core.training.generator import TrainingDataGenerator
from rasa_nlu.evaluate import plot_confusion_matrix, log_evaluation_table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably should not import anything from rasa_nlu?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's actually fine, rasa core depends on nlu (we also use things in other places from nlu) - the only things we shouldn't use are parts that depend on optional dependencies (e.g. spacy).

@Ghostvv
Copy link
Contributor

Ghostvv commented Sep 13, 2018

I guess the last thing is: could you please add number of correct stories out of and then we are good to merge


for tracker in tqdm(completed_trackers):
curr_gold, curr_predictions, predicted_tracker = \
current_gold, current_predictions, predicted_tracker = \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be current_golds I guess, because it is a list here

golds.extend(current_gold)

if not curr_gold == curr_predictions:
if not current_gold == current_predictions:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current_golds

preds.extend(curr_predictions)
gold.extend(curr_gold)
predictions.extend(current_predictions)
golds.extend(current_gold)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current_golds

@tmbo
Copy link
Member Author

tmbo commented Sep 14, 2018

@Ghostvv all remarks have been addressed, this is ready to get merged if you give your ok

@tmbo
Copy link
Member Author

tmbo commented Sep 14, 2018

not sure what you mean, it is changed here: https://github.com/RasaHQ/rasa_core/pull/966/files#diff-4b73ffae216c7073d031c5bb15f53d11R157 is there any other location?

@Ghostvv
Copy link
Contributor

Ghostvv commented Sep 14, 2018

yes, for some reason, it was displayed incorrectly

@tmbo
Copy link
Member Author

tmbo commented Sep 14, 2018

so all good?

Copy link
Contributor

@Ghostvv Ghostvv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-wise looks good. All works, good to merge

@tmbo
Copy link
Member Author

tmbo commented Sep 14, 2018

That is printed now as well.

@tmbo tmbo merged commit 4da91e1 into master Sep 14, 2018
@tmbo tmbo deleted the proper-evaluate branch September 14, 2018 08:07
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants