Include story source filename in core output for failed stories #5496
Include the story source filename in the story name in the failed stories output to make failed stories easier to find (see RasaHQ#3419). The source filename is passed from the `StoryFileReader` down to each `StoryStep`. Besides the story block names, the source filename is included in the tracker events, which are used for outputting the failed stories. Because the story files are copied to a temporary folder, it is not possible to include the original full story path. Instead, only the file name is included. If a recursive folder structure contains story files with the same name, it can still be hard to find the problem file.
b8e2b73 to 8537252
Thanks for submitting a pull request 🚀 @degiz will take a look at it as soon as possible ✨
Thanks for the PR 👍
I've left a few comments. Also, I think the PR is missing a unit test that actually checks that the output for a failed story now includes the story source file name. 🙂
tests/core/test_data.py
assert data.get_source_file_name("") == ""
assert data.get_source_file_name("/tmp/stories.md") == "stories.md"
assert data.get_source_file_name("/tmp/123_stories.md") == "stories.md"
assert data.get_source_file_name("/tmp/123_my_stories.md") == "my_stories.md"
A few questions:
- Why do we remove the first `_` at all? I thought the idea of the PR was to keep the file name.
- What is the reason behind removing only the first `_`? For `old_123_my_stories.md` the result will be `123_my_stories.md`.
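To make the questions above concrete, here is a minimal sketch of a helper that would satisfy the four asserts under review. The actual `get_source_file_name` implementation is not shown in this thread, so the body below is an assumption reconstructed only from the asserts and the example in the comment.

```python
import os


def get_source_file_name(path: str) -> str:
    """Hypothetical reconstruction: drop everything up to the first underscore."""
    base = os.path.basename(path)
    # partition() splits on the FIRST "_" only; sep is "" if there is none.
    _, sep, rest = base.partition("_")
    return rest if sep else base


print(get_source_file_name("/tmp/123_my_stories.md"))
print(get_source_file_name("/tmp/old_123_my_stories.md"))
```

With this approach a file whose original name already contains an underscore and has no ID prefix, such as `my_stories.md`, would also lose its first segment, which is one more reason to keep the file name intact.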
The original stories are copied to a temporary directory and the file names get prefixed with a unique ID. That is why I stripped the file name: so that the original story file name was printed.
I agree to leave the file name intact so there is still a reference to the story (in the temporary directory) which failed e.g.
## happy path > /tmp/tmp73u056dx/4f8a5df888ec4e96bafbf9dad54c624d_stories2.md
I removed the function which stripped the file name and changed the code to pass the full file name instead of the stripped one.
Give me some time to write a unit test and to get it running. I haven't been able to run the tests in my environment yet, so any guidance would be helpful. I'm using a Docker environment.
* Add data types
* Use f-strings
* Story file is a copy in a temporary directory. For now leave file path intact till it is clear what needs to be included in failed_stories
9189ee9 to 74d3c5a
@degiz I just noticed two failing tests because of my changes:
Because I've made changes in the tracker's
Do you think my changes are valid, and do we need to change the test asserts?
Hey @cheemingli
So the idea of the change as I understand it is the following:
I think the test should try to train a story with an incorrect block, and check stdout/stderr for the messages.
I think it's fine to change the asserts in the mentioned test cases.
The file source was always included in the tracker's sender id. As a result, persisted stories ended up with 'different' trackers, because the sender id differed depending on the source. To keep the impact minimal, only failed stories are exported with the source of the story file. For this reason the tracker has been extended with an optional sender_source parameter. (cherry picked from commit 090e55134d9922d13ba626b1311eeecaa8e04915)
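As a rough illustration of the idea in this commit message, the sketch below keeps the source separate from the sender id and only adds it to the story header on request. `TrackerSketch`, `story_header` and `include_source` are hypothetical names made up for this example, not Rasa's tracker API; only the optional `sender_source` parameter and the `## name > path` header format come from this thread.

```python
from typing import Optional


class TrackerSketch:
    """Hypothetical stand-in for a tracker; NOT the real Rasa class."""

    def __init__(self, sender_id: str, sender_source: Optional[str] = None) -> None:
        # sender_id stays free of any file information, so persisted
        # trackers compare equal regardless of where the story came from.
        self.sender_id = sender_id
        self.sender_source = sender_source

    def story_header(self, include_source: bool = False) -> str:
        # Only callers that export failed stories would pass include_source=True.
        if include_source and self.sender_source:
            return f"## {self.sender_id} > {self.sender_source}"
        return f"## {self.sender_id}"


tracker = TrackerSketch("happy path", "/tmp/example/stories.md")
print(tracker.story_header())
print(tracker.story_header(include_source=True))
```

Keeping the source out of the sender id means existing asserts on the sender id would not need to change for the regular export path.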
@degiz I've pushed some new changes:
Thanks a lot for addressing the comments!
I've added two more minor things! Could you please also add a changelog entry to the ./changelog folder?
After that we'll be ready to merge 🚀
(cherry picked from commit 20ce257f21ec57ca388025511c2153a40e69694d)
Awesome job! 🚀 🚀
…-failed-stories # Conflicts: # tests/core/test_evaluation.py