diff --git a/docs/community.rst b/docs/community.rst
index 35c964a9b4d4..cdeeb4f90039 100644
--- a/docs/community.rst
+++ b/docs/community.rst
@@ -1,7 +1,7 @@
 .. _section_community:
 
 Community Contributions
-============
+=======================
 
 .. note::
     This is an (incomplete) list of external resources created by the Rasa community.
@@ -11,7 +11,7 @@ Community Contributions
 
 Community Written Documentation
-^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 - A three part tutorial on using Rasa NLU in combination with Node-RED to
   create a basic chat bot and integrate it with Slack and Twilio.
diff --git a/docs/dataformat.rst b/docs/dataformat.rst
index a889e23d2a9b..d9e67039f9ae 100644
--- a/docs/dataformat.rst
+++ b/docs/dataformat.rst
@@ -163,7 +163,7 @@ Markdown Format
 Alternatively training data can be used in the following markdown format.
 Examples are listed using the unordered list syntax, e.g. minus ``-``, asterisk ``*``, or plus ``+``:
 
-.. code-block:: markdown
+.. code-block:: md
 
     ## intent:check_balance
    - what is my balance
@@ -193,6 +193,7 @@ Storing files with different file formats, i.e. mixing markdown and JSON, is cur
     Splitting the training data into multiple files currently only works for markdown and JSON data.
     For other file formats you have to use the single-file approach.
 
+.. _train_parameters:
 
 Train a Model
 ~~~~~~~~~~~~~
@@ -210,4 +211,4 @@ Here is a quick overview over the parameters you can pass to that script:
 The other ways to train a model are
 
 - training it using your own python code
-- training it using the HTTP api (:ref:`http`)
+- training it using the HTTP api (:ref:`section_http`)
diff --git a/docs/http.rst b/docs/http.rst
index b967989ccf29..5ed927ff72c0 100644
--- a/docs/http.rst
+++ b/docs/http.rst
@@ -64,16 +64,16 @@ By default, when the project is not specified in the query, the
 ``"default"`` one will be used.
 You can (should) specify the project you want to use in your query :
 
-.. code-block:: bash
+.. code-block:: console
 
-    $ curl -XPOST localhost:5000/parse -d '{"q":"hello there", "project": "my_restaurant_search_bot"}
+    $ curl -XPOST localhost:5000/parse -d '{"q":"hello there", "project": "my_restaurant_search_bot"}'
 
 By default the latest trained model for the project will be loaded.
 You can also query against a specific model for a project :
 
-.. code-block:: bash
+.. code-block:: console
 
-    $ curl -XPOST localhost:5000/parse -d '{"q":"hello there", "project": "my_restaurant_search_bot", "model": }
+    $ curl -XPOST localhost:5000/parse -d '{"q":"hello there", "project": "my_restaurant_search_bot", "model": ""}'
 
 
 ``POST /train``
@@ -198,7 +198,7 @@ This will return the default model configuration of the Rasa NLU instance.
     }
 
 ``DELETE /models``
-^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^
 
 This will unload a model from the server memory
@@ -230,22 +230,24 @@ So if you are serving multiple models in production, you want to serve these
 from the same process & avoid duplicating the memory load.
 
 .. note::
-Although this saves the backend from loading the same backend twice, it still needs to load one set of
+
+    Although this saves the backend from loading the same backend twice, it still needs to load one set of
     word vectors (which make up most of the memory consumption) per language and backend.
 
 As stated previously, Rasa NLU naturally handles serving multiple apps : by default the server will load all projects found
 under the ``path`` directory defined in the configuration.
 The file structure under ``path directory`` is as follows :
 
--
-  -
-    -
-    -
-    ...
-  -
-    -
-    ...
-...
+.. code-block:: text
+
+    -
+      -
+        -
+        -
+        ...
+      -
+        -
+        ...
+    ...
 
 So you can specify which one to use in your ``/parse`` requests:
@@ -267,13 +269,14 @@ You can also specify the model you want to use for a given project, the default
 
 If no project is to be found by the server under the ``path`` directory,
 a ``"default"`` one will be used, using a simple fallback model.
 
+.. _server_parameters:
 
 Server Parameters
 -----------------
 
 There are a number of parameters you can pass when running the server.
 
-.. code-block:: bash
+.. code-block:: console
 
     $ python -m rasa_nlu.server
diff --git a/docs/installation.rst b/docs/installation.rst
index 548d1e2fa113..d4aaab7f6009 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -105,6 +105,7 @@ The complete pipeline for mitie can be found here
 
     Training MITIE can be quite slow on datasets with more than a few intents. You can try
+
     - to use the sklearn + MITIE backend instead (which uses sklearn for the training) or
     - you can install `our mitie fork `_
diff --git a/docs/languages.rst b/docs/languages.rst
index 8a94827feb71..9ab467813413 100644
--- a/docs/languages.rst
+++ b/docs/languages.rst
@@ -51,7 +51,7 @@ MITIE
 ^^^^^
 
 1. Get a ~clean language corpus (a Wikipedia dump works) as a set of text files
-2. Build and run `MITIE wordrep tool `_ on your corpus. This can take several hours/days depending on your dataset and your workstation. You'll need something like 128GB of RAM for wordrep to run - yes that's alot: try to extend your swap.
+2. Build and run `MITIE Wordrep Tool`_ on your corpus. This can take several hours/days depending on your dataset and your workstation. You'll need something like 128GB of RAM for wordrep to run - yes, that's a lot: try to extend your swap.
 3. Set the path of your new ``total_word_feature_extractor.dat`` as value of the *mitie_file* parameter in ``config_mitie.json``
 
 .. _jieba:
@@ -66,7 +66,7 @@ from a Chinese corpus
 using the MITIE wordrep tools
 (takes 2-3 days for training).
 
 For training, please build the
-`MITIE Wordrep Tool `_.
+`MITIE Wordrep Tool`_.
 Note that Chinese corpus should be tokenized first before
 feeding into the tool for training. Close-domain corpus that best matches
 user case works best.
@@ -74,3 +74,5 @@ user case works best.
 A detailed instruction on how to train the model yourself can be found in
 A trained model from Chinese Wikipedia Dump and Baidu Baike can be
 `crownpku `_ 's `blogpost `_.
+
+.. _`MITIE Wordrep Tool`: https://github.com/mit-nlp/MITIE/tree/master/tools/wordrep
\ No newline at end of file
diff --git a/docs/migrations.rst b/docs/migrations.rst
index 6d0a7b973d80..5b6a7ff381e9 100644
--- a/docs/migrations.rst
+++ b/docs/migrations.rst
@@ -24,8 +24,9 @@ parameters. Example:
 
 All other parameters have either been moved to the scripts
-for training / serving models :ref:`scripts`, or put into the pipeline
-configuration (:ref:`pipeline`).
+for training (:ref:`train_parameters`), serving models
+(:ref:`server_parameters`), or put into the pipeline
+configuration (:ref:`section_pipeline`).
 
 persistors:
 ~~~~~~~~~~~
diff --git a/docs/pipeline.rst b/docs/pipeline.rst
index 2f1daa6bc606..64df31eec3bc 100644
--- a/docs/pipeline.rst
+++ b/docs/pipeline.rst
@@ -133,7 +133,7 @@ to use the components and configure them separately:
 
 tensorflow_embedding
-~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~
 
 to use it as a template:
@@ -158,7 +158,7 @@ To use the components and configure them separately:
 
     - name: "intent_classifier_tensorflow_embedding"
 
 Custom pipelines
-~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~
 
 Creating your own pipelines is possible by directly passing the names of the
 components to Rasa NLU in the ``pipeline`` configuration variable, e.g.
@@ -717,7 +717,6 @@ ner_duckling
 
 Duckling allows to recognize dates, numbers, distances and other structured entities
 and normalizes them (for a reference of all available entities
 see `the duckling documentation `_).
-
 The component recognizes the entity types defined by the
 :ref:`duckling dimensions configuration variable `.
 Please be aware that duckling tries to extract as many entity types as possible without providing a ranking.
 For example, if you specify both ``number`` and ``time`` as dimensions
 for the duckling component, the component will extract two entities:
 ``10`` as a number and
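Reviewer note, not part of the patch: the two curl fixes in ``docs/http.rst`` both repair malformed JSON bodies (a missing closing quote and a dangling ``"model":`` value). A minimal sketch of how such a ``/parse`` payload can be built and sanity-checked before sending; ``build_parse_payload`` is a hypothetical helper written for this illustration, not part of the Rasa NLU API.

```python
import json


def build_parse_payload(text, project=None, model=None):
    """Build the JSON body for a POST /parse request (hypothetical helper).

    "project" and "model" are optional: per the docs, the server falls back
    to the "default" project and to the latest trained model when omitted.
    """
    payload = {"q": text}
    if project is not None:
        payload["project"] = project
    if model is not None:
        payload["model"] = model
    return json.dumps(payload)


# The corrected example from docs/http.rst, built programmatically:
body = build_parse_payload("hello there", project="my_restaurant_search_bot")

# Round-tripping through json.loads rejects malformed bodies such as the
# missing closing quote fixed in this patch.
assert json.loads(body) == {"q": "hello there",
                            "project": "my_restaurant_search_bot"}
```

Serializing with ``json.dumps`` instead of hand-quoting the ``-d`` string is exactly what prevents the two quoting bugs this patch fixes.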