This repository has been archived by the owner on Aug 22, 2019. It is now read-only.

Merge c20abf8 into 8bb05aa
akelad committed Nov 9, 2018
2 parents 8bb05aa + c20abf8 commit c906ec9
Showing 26 changed files with 677 additions and 401 deletions.
7 changes: 7 additions & 0 deletions CHANGELOG.rst
@@ -52,6 +52,7 @@ Added
- add ``Form`` and ``FormValidation`` events
- add ``REQUESTED_SLOT`` constant
- add ability to read ``action_listen`` from stories
- added train/eval scripts to compare policies

Changed
-------
@@ -65,10 +66,16 @@ Changed
- forms were completely reworked, see changelog in ``rasa_core_sdk``
- state featurization if some form is active changed
- ``Domain`` raises ``InvalidDomain`` exception
- interactive learning is now started with ``rasa_core.train interactive``
- passing a policy config file to train a model is now required
- flags for output of evaluate script have been merged into a single flag ``--output``,
  where you provide a folder in which any output from the script should be stored

Removed
-------
- removed graphviz dependency
- policy config related flags in training script (see migration guide)


Fixed
-----
5 changes: 5 additions & 0 deletions data/test_config/max_hist_config.yml
@@ -0,0 +1,5 @@
policies:
- name: MemoizationPolicy
max_history: 5
- name: KerasPolicy
max_history: 5
3 changes: 3 additions & 0 deletions data/test_config/no_max_hist_config.yml
@@ -0,0 +1,3 @@
policies:
- name: MemoizationPolicy
- name: KerasPolicy
9 changes: 9 additions & 0 deletions default_config.yml
@@ -0,0 +1,9 @@
policies:
- name: KerasPolicy
epochs: 100
max_history: 5
- name: FallbackPolicy
fallback_action_name: 'action_default_fallback'
- name: MemoizationPolicy
max_history: 5
- name: FormPolicy
56 changes: 40 additions & 16 deletions docs/evaluation.rst
@@ -20,21 +20,21 @@ by using the evaluate script:
.. code-block:: bash
$ python -m rasa_core.evaluate -d models/dialogue \
-s test_stories.md -o matrix.pdf --failed failed_stories.md
-s test_stories.md -o results
This will print the failed stories to ``failed_stories.md``.
This will print the failed stories to ``results/failed_stories.md``.
We count any story as `failed` if at least one of the actions
was predicted incorrectly.
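The failure criterion above can be sketched in a few lines of plain Python (an illustrative sketch only, not Rasa Core's actual evaluation code; the story names and actions below are made up):

```python
# Sketch of the story-failure criterion: a story counts as failed if at
# least one of its actions was predicted incorrectly.
def story_failed(predicted_actions, actual_actions):
    """True if any predicted action differs from the actual one."""
    return any(p != a for p, a in zip(predicted_actions, actual_actions))

# (predicted, actual) action sequences per story -- purely illustrative data.
stories = {
    "greet_story": (["utter_greet", "action_listen"],
                    ["utter_greet", "action_listen"]),
    "goodbye_story": (["utter_greet", "utter_goodbye"],
                      ["utter_goodbye", "utter_goodbye"]),
}

failed = [name for name, (pred, actual) in stories.items()
          if story_failed(pred, actual)]
print(failed)  # ['goodbye_story']
```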

In addition, this will save a confusion matrix to a file called
``matrix.pdf``. The confusion matrix shows, for each action in your
domain, how often that action was predicted, and how often an
``results/story_confmat.pdf``. The confusion matrix shows, for each action in
your domain, how often that action was predicted, and how often an
incorrect action was predicted instead.
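The data behind such a confusion matrix can be sketched with a plain counter (illustrative only; the evaluate script itself renders the matrix as a PDF, and the action names below are made up):

```python
from collections import Counter

# Count (actual, predicted) action pairs -- the raw data of a confusion
# matrix. Diagonal entries (actual == predicted) are correct predictions;
# off-diagonal entries show which incorrect action was predicted instead.
actual    = ["utter_greet", "utter_goodbye", "utter_greet", "action_listen"]
predicted = ["utter_greet", "utter_greet",   "utter_greet", "action_listen"]

confusion = Counter(zip(actual, predicted))
for (a, p), count in sorted(confusion.items()):
    print(f"actual={a:<14} predicted={p:<14} count={count}")
```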

The full list of options for the script is:

.. program-output:: python -m rasa_core.evaluate -h
.. program-output:: python -m rasa_core.evaluate default -h

.. _end_to_end_evaluation:

@@ -77,7 +77,7 @@ the full end-to-end evaluation command is this:

.. code-block:: bash
$ python -m rasa_core.evaluate -d models/dialogue --nlu models/nlu/current \
$ python -m rasa_core.evaluate default -d models/dialogue --nlu models/nlu/current \
-s e2e_stories.md --e2e
.. note::
@@ -98,14 +98,40 @@ your bot, so you don't just want to throw some away to use as a test set.

Rasa Core has some scripts to help you choose and fine-tune your policy.
Once you are happy with it, you can then train your final policy on your
full data set. To do this, split your training data into multiple files
in a single directory. You can then use the ``train_paper`` script to
train multiple policies on the same data. You can choose one of the
files to be partially excluded. This means that Rasa Core will be
trained multiple times, with 0, 5, 25, 50, 70, 90, 95, and 100% of
the stories in that file removed from the training data. By evaluating
on the full set of stories, you can measure how well Rasa Core is
predicting the held-out stories.
full data set. To do this, you first have to train models for your different
policies. Create two (or more) policy config files of the policies you want to
compare (containing only one policy each), and then use the ``compare`` mode of
the train script to train your models:

.. code-block:: bash
$ python -m rasa_core.train compare -c policy_config1.yml policy_config2.yml \
-d domain.yml -s stories_folder -o comparison_models --runs 3 --percentages \
0 5 25 50 70 90 95
For each policy configuration provided, Rasa Core will be trained multiple times
with 0, 5, 25, 50, 70, 90 and 95% of your training stories excluded from the training
data. This is done for multiple runs, to ensure consistent results.
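To make the exclusion scheme concrete, here is a quick stdlib-only sketch of how those percentages translate into held-out story counts (the training-set size of 500 is an arbitrary example; the actual splitting happens inside the train script):

```python
# How many stories are excluded from training at each percentage step.
percentages = [0, 5, 25, 50, 70, 90, 95]
total_stories = 500  # arbitrary example size

for pct in percentages:
    excluded = int(total_stories * pct / 100)
    kept = total_stories - excluded
    print(f"{pct:>3}% excluded -> train on {kept} stories")
```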

Once this script has finished, you can now use the evaluate script in compare
mode to evaluate the models you just trained:

.. code-block:: bash
$ python -m rasa_core.evaluate compare -s stories_folder -d comparison_models \
-o comparison_results
This will evaluate each of the models on the training set, and plot some graphs
to show you which policy is best. By evaluating on the full set of stories, you
can measure how well Rasa Core is predicting the held-out stories.

If you're not sure which policies to compare, we'd recommend trying out the
``EmbeddingPolicy`` and the ``KerasPolicy`` to see which one works better for
you.

.. note::
This training process can take a long time, so we'd suggest letting it run
somewhere in the background where it can't be interrupted.


Evaluating stories over http
@@ -129,5 +155,3 @@ you may do so by adding the ``e2e=true`` query parameter:
$ curl --data-binary @eval_stories.md "localhost:5005/evaluate?e2e=true" | python -m json.tool
.. include:: feedback.inc


44 changes: 22 additions & 22 deletions docs/interactive_learning.rst
@@ -17,7 +17,7 @@ Some people call this `Software 2.0 <https://medium.com/@karpathy/software-2-0-a
Load up an existing bot
^^^^^^^^^^^^^^^^^^^^^^^

We have created some initial stories, and now want to improve our bot
by providing feedback on mistakes it makes.

Run the following command to start interactive learning:
@@ -27,15 +27,15 @@ Run the following command to start interactive learning:
python -m rasa_core_sdk.endpoint --actions actions&
python -m rasa_core.train \
--interactive -o models/dialogue \
interactive -o models/dialogue \
-d domain.yml -s stories.md \
--nlu models/current/nlu \
--endpoints endpoints.yml
The first command starts the action server (see :ref:`customactions`).

The second command starts the bot in interactive mode.
In interactive mode, the bot will ask you to confirm every prediction
made by NLU and Core before proceeding.
Here's an example:

@@ -62,14 +62,14 @@ Here's an example:
? The bot wants to run 'utter_greet', correct? (Y/n)
The chat history and slot values are printed to the screen, which
should be all the information you need to decide what the correct
next action is.

In this case, the bot chose the
right action (``utter_greet``), so we type ``y``.
Then we type ``y`` again, because ``action_listen`` is the correct
action after greeting. We continue this loop, chatting with the bot,
until the bot chooses the wrong action.

Providing feedback on errors
@@ -138,8 +138,8 @@ reviews!) so we select that action.

Now we can keep talking to the bot for as long as we like to create a longer
conversation. At any point you can press ``Ctrl-C`` and the bot will
provide you with exit options. You can write your newly-created stories and NLU
data to files. You can also go back a step if you made a mistake when providing
feedback.

Make sure to combine the dumped stories and NLU examples with your original
@@ -165,20 +165,20 @@ script.
Interactive Learning with Forms
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If you're using a FormAction, there are some additional things to keep in mind
when using interactive learning.

The ``form:`` prefix
~~~~~~~~~~~~~~~~~~~~

The form logic is described by your ``FormAction`` class, and not by the stories.
The machine learning policies should not have to learn this behavior, and should
not get confused if you later change your form action, for example by adding or
removing a required slot.
When you use interactive learning to generate stories containing a form,
the conversation steps handled by the form
get a :code:`form:` prefix. This tells Rasa Core to ignore these steps when training
your other policies. There is nothing special you have to do here; all of the form's
happy paths are still covered by the basic story given in :ref:`section_form_basics`.

Here is an example:
@@ -205,10 +205,10 @@ Here is an example:
Input validation
~~~~~~~~~~~~~~~~

Every time the user responds with something *other* than the requested slot or
any of the required slots,
you will be asked whether you want the form action to try and extract a slot
from the user's message when returning to the form. This is best explained with
an example:

.. code-block:: text
@@ -241,7 +241,7 @@ an example:
? Should 'restaurant_form' validate user input to fill the slot 'outdoor_seating'? (Y/n)
Here the user asked to stop the form, and the bot asks the user whether they're sure
they don't want to continue. The user says they want to continue (the ``/affirm`` intent).
Here ``outdoor_seating`` has a ``from_intent`` slot mapping (mapping
the ``/affirm`` intent to ``True``), so this user input could be used to fill
that slot. However, in this case the user is just responding to the
@@ -260,7 +260,7 @@ should not be validated. The bot will then continue to ask for the

**WARNING: FormPolicy predicted no form validation based on previous training
stories. Make sure to remove contradictory stories from training data**

Once you've removed that story, you can press enter and continue with
interactive learning.

23 changes: 20 additions & 3 deletions docs/migrations.rst
@@ -19,6 +19,26 @@ how you can migrate from one version to another.
before updating. Please make sure to
**retrain your models when switching to this version**.

Train script
~~~~~~~~~~~~

- You **must** pass a policy config flag with ``-c/--config`` now when training
  a model, see :ref:`policy_file`. There is a default config file ``default_config.yml``
  in the GitHub repo
- Interactive learning is now started with ``python -m rasa_core.train interactive``
rather than the ``--interactive`` flag
- All policy configuration related flags have been removed (``--epochs``,
  ``--max_history``, ``--validation_split``, ``--batch_size``, ``--nlu_threshold``,
  ``--core_threshold``, ``--fallback_action_name``); specify these in the policy
  config file instead, see :ref:`policy_file`

Evaluation script
~~~~~~~~~~~~~~~~~

- The ``--output`` flag now takes one argument: the name of the folder to which
  any files generated by the script should be written
- The ``--failed`` flag was removed; failed stories are now written to the
  ``--output`` folder

Forms
~~~~~

@@ -310,6 +330,3 @@ There have been some API changes to classes and methods:
.. include:: feedback.inc



24 changes: 13 additions & 11 deletions docs/policies.rst
@@ -18,7 +18,7 @@ You can run training from the command line like in the :ref:`quickstart`:
.. code-block:: bash
python -m rasa_core.train -d domain.yml -s data/stories.md \
-o models/current/dialogue --epochs 200
-o models/current/dialogue -c default_config.yml
Or by creating an agent and running the train method yourself:

@@ -66,8 +66,8 @@ One important hyperparameter for Rasa Core policies is the ``max_history``.
This controls how much dialogue history the model looks at to decide which
action to take next.

You can set the ``max_history`` using the training script's ``--history``
flag or by passing it to your policy's ``Featurizer``.
You can set the ``max_history`` by passing it to your policy's ``Featurizer``
in the policy configuration YAML file.
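Conceptually, ``max_history`` just bounds how many of the most recent dialogue turns are visible to the policy when it predicts the next action. A stdlib-only sketch of that windowing (illustrative, not Rasa Core's actual featurizer; the event names are made up):

```python
from collections import deque

MAX_HISTORY = 5

# A bounded deque keeps only the last MAX_HISTORY events, mimicking how a
# max_history featurizer windows the tracker state.
events = ["greet", "utter_greet", "inform", "utter_ask_cuisine",
          "inform", "utter_ask_price", "affirm", "utter_confirm"]

window = deque(maxlen=MAX_HISTORY)
for event in events:
    window.append(event)

# Only the five most recent events would influence the next prediction.
print(list(window))
```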

.. note::

@@ -102,7 +102,7 @@ slot. Slot information is always available for every featurizer.
Training Script Options
^^^^^^^^^^^^^^^^^^^^^^^

.. program-output:: python -m rasa_core.train -h
.. program-output:: python -m rasa_core.train default -h



@@ -122,15 +122,20 @@ highest confidence will be used.
Configuring polices using a configuration file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can set the policies you would like the Core model to use in a YAML file.
If you are using the training script, you must set the policies you would like
the Core model to use in a YAML file.

For example:

.. code-block:: yaml
policies:
- name: "KerasPolicy"
max_history: 5
featurizer:
- name: MaxHistoryTrackerFeaturizer
max_history: 5
state_featurizer:
- name: BinarySingleStateFeaturizer
- name: "MemoizationPolicy"
max_history: 5
- name: "FallbackPolicy"
@@ -141,9 +146,8 @@ For example:
arg1: "..."
Pass the YAML file's name to the train script using the ``--config``
argument (or just ``-c``). If no config.yaml is given, the policies
default to
``[KerasPolicy(), MemoizationPolicy(), FallbackPolicy(), FormPolicy()]``.
argument (or just ``-c``). There is a default config file you can use in the
GitHub repository called ``default_config.yml``.
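To illustrate how such a config maps onto policy objects, here is a stdlib-only sketch of a name-based registry (the classes and loader below are illustrative stand-ins, not Rasa Core's actual loading code):

```python
# Turn a parsed policy config (the list of dicts the YAML above would
# yield) into policy instances via a name registry. Stand-in classes only.
class MemoizationPolicy:
    def __init__(self, max_history=None):
        self.max_history = max_history

class FallbackPolicy:
    def __init__(self, nlu_threshold=0.3):
        self.nlu_threshold = nlu_threshold

REGISTRY = {"MemoizationPolicy": MemoizationPolicy,
            "FallbackPolicy": FallbackPolicy}

def load_policies(config):
    return [REGISTRY[entry["name"]](**{k: v for k, v in entry.items()
                                       if k != "name"})
            for entry in config]

parsed = [{"name": "MemoizationPolicy", "max_history": 5},
          {"name": "FallbackPolicy"}]
loaded = load_policies(parsed)
print([type(p).__name__ for p in loaded])  # ['MemoizationPolicy', 'FallbackPolicy']
```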

.. note::

@@ -377,5 +381,3 @@ It is recommended to use


.. include:: feedback.inc


2 changes: 1 addition & 1 deletion examples/formbot/Makefile
@@ -25,7 +25,7 @@ run:
make run-core

train-interactive:
python -m rasa_core.train --interactive -s data/stories.md -d domain.yml -o models/dialogue --debug --endpoints endpoints.yml
python -m rasa_core.train interactive -s data/stories.md -d domain.yml -o models/dialogue --debug --endpoints endpoints.yml

visualize:
python -m rasa_core.visualize -s data/stories.md -d domain.yml -o story_graph.png
2 changes: 1 addition & 1 deletion examples/moodbot/Makefile
@@ -11,7 +11,7 @@ train-nlu:
--data ./data/nlu.md --path models/ --project nlu

train-core:
python -m rasa_core.train -s data/stories.md -d domain.yml -o models/dialogue --epochs 300
python -m rasa_core.train -s data/stories.md -d domain.yml -o models/dialogue -c ../../default_config.yml

run-fb:
python -m rasa_core.run -d models/dialogue -u models/nlu/current -p 5002 -c facebook --credentials credentials.yml
8 changes: 3 additions & 5 deletions examples/restaurantbot/bot.py
@@ -25,14 +25,12 @@ def train_dialogue(domain_file="restaurant_domain.yml",
training_data_file="data/babi_stories.md"):
agent = Agent(domain_file,
policies=[MemoizationPolicy(max_history=3),
RestaurantPolicy()])
RestaurantPolicy(batch_size=100, epochs=400,
validation_split=0.2)])

training_data = agent.load_data(training_data_file)
agent.train(
training_data,
epochs=400,
batch_size=100,
validation_split=0.2
training_data
)

agent.persist(model_path)
