Merge pull request #7021 from RasaHQ/docs-rasa-data-validate
Document `rasa data validate` in CLI docs
rasabot committed Oct 16, 2020
2 parents 0db9d09 + 6092632 commit 4823d4a
Showing 2 changed files with 55 additions and 25 deletions.
55 changes: 49 additions & 6 deletions docs/docs/command-line-interface.mdx
@@ -17,12 +17,13 @@ abstract: The command line interface (CLI) gives you easy-to-remember commands f
|`rasa shell` |Loads your trained model and lets you talk to your assistant on the command line. |
|`rasa run` |Starts a server with your trained model. |
|`rasa run actions` |Starts an action server using the Rasa SDK. |
|`rasa visualize` |Generates a visual representation of your stories. |
|`rasa test` |Tests a trained Rasa model on any files starting with `test_`. |
|`rasa data split nlu` |Performs an 80/20 split of your NLU training data. |
|`rasa data convert` |Converts training data between different formats. |
|`rasa export` |Exports conversations from a tracker store to an event broker. |
|`rasa x` |Launches Rasa X locally. |
|`rasa visualize` |Generates a visual representation of your stories. |
|`rasa test` |Tests a trained Rasa model on any files starting with `test_`. |
|`rasa data split nlu` |Performs an 80/20 split of your NLU training data. |
|`rasa data convert` |Converts training data between different formats. |
|`rasa data validate` |Checks the domain, NLU and conversation data for inconsistencies. |
|`rasa export` |Exports conversations from a tracker store to an event broker. |
|`rasa x` |Launches Rasa X locally. |
|`rasa -h` |Shows all available commands. |

## rasa init
@@ -302,6 +303,48 @@ You can specify the input file or directory, output directory with the following
rasa data convert nlg --help
```

## rasa data validate

You can check your domain, NLU data, or conversation data for mistakes and inconsistencies.
To validate your data, run this command:

```bash
rasa data validate
```

By default, the validator searches only for errors in the data, e.g. the same training
example being listed as an example for two intents.
To catch minor issues that don't prevent training a model but might indicate messy data
(e.g. unused intents), use the `--fail-on-warnings` flag.

You can also validate the story structure by running this command:

```bash
rasa data validate stories
```

This validator checks if you have any stories where different assistant actions follow from the same
dialogue history. Conflicts between stories will prevent a model from learning the correct
pattern for a dialogue.

If you have a [Memoization Policy](./policies.mdx#memoization-policy) in your
`config.yml` file, run the validator with the `--max-history` argument and provide the `max_history`
value set in `config.yml`. If you didn't set `max_history` in the config file, provide the default value of `5`.

:::caution check your story names
The `rasa data validate stories` command assumes that all your story names are unique!
:::

:::caution experimental feature
The `rasa data validate stories` command is an experimental feature. We introduce experimental
features to get feedback from our community, so we encourage you to try it out! However, the functionality
might be changed or removed in the future. If you have feedback (positive or negative), please share
it with us on the [Rasa Forum](https://forum.rasa.com/).
:::

You can use `rasa data validate` with additional arguments, e.g. to specify the location of your data and
domain files:

```text [rasa data validate --help]
```
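
The `--max-history` guidance for story validation can be scripted. The following is a minimal sketch, not part of the Rasa CLI: the helper name `max_history_for` is hypothetical, and it assumes `config.yml` contains a line of the form `max_history: <n>`. It takes the first such line it finds and falls back to the default of `5` when none is present. It also assumes that `rasa data validate stories` accepts `--max-history` as described in this section.

```bash
# Hypothetical helper: print the max_history value found in a config file,
# or the default of 5 if the file is missing or has no such setting.
max_history_for() {
  local config_file="$1" value=""
  if [ -f "$config_file" ]; then
    # Take the first "max_history: <n>" line; assumes a single relevant policy.
    value=$(sed -n 's/^[[:space:]]*max_history:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$config_file" | head -n 1)
  fi
  echo "${value:-5}"
}

# Print the command rather than running it, so the sketch works without Rasa installed.
echo "rasa data validate stories --max-history $(max_history_for config.yml)"
```

If several policies in your config set different `max_history` values, this sketch only sees the first one, so check that it picks the value belonging to the Memoization Policy.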

## rasa export

To export events from a tracker store using an event broker, run:
25 changes: 6 additions & 19 deletions docs/docs/setting-up-ci-cd.mdx
@@ -38,35 +38,22 @@ you can make a test run only if the pull request has a certain label (e.g. “NL

### Validating Data and Stories

Data validation verifies that there are no mistakes or
major inconsistencies in your domain, NLU data, or conversation data. To validate
your data, have your CI run this command:
Data validation verifies that there are no mistakes or major inconsistencies in your domain, NLU
data, or conversation data. To validate your data, have your CI run this command:

```bash
rasa data validate --fail-on-warnings --max-history <max_history>
```

By default the validator searches only for errors in the data (e.g. the same
example being listed as an example for two intents), but does not report other
minor issues (such as unused intents, responses that are not listed as
actions) that won't prevent training a model, but might indicate
messy data.
If you pass a `max_history` value to a Memoization policy in your `config.yml` file, provide the
same value in the above validator command. Otherwise, provide the default value of `5`.

If data validation results in errors, training a model will also fail, so it's
always good to run this check before training a model. By including the
`--fail-on-warnings` flag, this step will fail on warnings indicating more minor issues.

Data validation also includes story structure validation.
Story validation checks if you have any
stories where different bot actions follow from the same dialogue history.
Conflicts between stories will prevent a model from learning the correct
pattern for a dialogue. Set the `--max-history` parameter to the value of `max_history` for the
memoization policy in your `config.yml`. If you haven't set one, use the default of `5`.

:::caution check your story names
The `rasa data validate stories` command assumes that all your story names are unique!

:::
To read more about the validator and all of the available options, see [the documentation for
`rasa data validate`](./command-line-interface.mdx#rasa-data-validate).
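
As a rough sketch of how a CI step might wrap this check, the snippet below runs a command and passes its exit status through, so the job fails when validation fails. The function name `run_check` is hypothetical, and the sketch assumes that `rasa data validate` exits with a non-zero status when it reports errors (or warnings, with `--fail-on-warnings`).

```bash
# Hypothetical wrapper: run any command and report whether it succeeded,
# passing its exit status through so the CI job fails when the check fails.
run_check() {
  local status=0
  "$@" || status=$?
  if [ "$status" -eq 0 ]; then
    echo "data validation passed"
  else
    echo "data validation failed (exit status $status)" >&2
  fi
  return "$status"
}

# Example CI usage (requires Rasa; replace 5 with your max_history value):
# run_check rasa data validate --fail-on-warnings --max-history 5
```

Most CI systems already stop a job on a non-zero exit status, so the wrapper mainly adds a readable log line; calling the validator directly works as well.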

### Training a Model
