
Commit

fixed doc issues
tmbo committed Apr 17, 2018
1 parent cb1c284 commit 50dd257
Showing 7 changed files with 34 additions and 27 deletions.
4 changes: 2 additions & 2 deletions docs/community.rst
@@ -1,7 +1,7 @@
.. _section_community:

Community Contributions
-============
+=======================

.. note::
This is an (incomplete) list of external resources created by the Rasa community.
@@ -11,7 +11,7 @@ Community Contributions


Community Written Documentation
-^^^^^^^^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


- A three part tutorial on using Rasa NLU in combination with Node-RED to create a basic chat bot and integrate it with Slack and Twilio.
5 changes: 3 additions & 2 deletions docs/dataformat.rst
@@ -163,7 +163,7 @@ Markdown Format
Alternatively training data can be used in the following markdown format. Examples are listed using the unordered
list syntax, e.g. minus ``-``, asterisk ``*``, or plus ``+``:

-.. code-block:: markdown
+.. code-block:: md
## intent:check_balance
- what is my balance <!-- no entity -->
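The example block above is truncated in the diff. To make the markdown format concrete, here is a minimal stdlib-only sketch (a hypothetical helper, not part of Rasa NLU) that reads intent sections and their list-item examples:

```python
import re

def parse_md_training(text):
    """Parse '## intent:<name>' sections and their '- example' lines."""
    intents = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"##\s*intent:(\S+)", line)
        if m:
            current = m.group(1)
            intents[current] = []
        elif current and line and line[0] in "-*+":
            # strip the list marker and any trailing HTML comment
            example = re.sub(r"<!--.*?-->", "", line[1:]).strip()
            intents[current].append(example)
    return intents

sample = """\
## intent:check_balance
- what is my balance <!-- no entity -->
- how much money is on my account
"""
print(parse_md_training(sample))
```

Any of the three list markers (`-`, `*`, `+`) mentioned in the text is accepted by this sketch.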
@@ -193,6 +193,7 @@ Storing files with different file formats, i.e. mixing markdown and JSON, is cur
Splitting the training data into multiple files currently only works for markdown and JSON data.
For other file formats you have to use the single-file approach.
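Since markdown training data may be split across multiple files, a directory of such files can be gathered into one blob before use. This is a stdlib-only sketch with hypothetical names, not Rasa NLU's actual loader:

```python
import pathlib
import tempfile

def collect_md_training(data_dir):
    """Concatenate all markdown training files found in a directory."""
    parts = []
    for path in sorted(pathlib.Path(data_dir).glob("*.md")):
        parts.append(path.read_text(encoding="utf-8"))
    return "\n".join(parts)

# demo with a throwaway directory holding two split training files
with tempfile.TemporaryDirectory() as d:
    (pathlib.Path(d) / "greetings.md").write_text("## intent:greet\n- hello\n")
    (pathlib.Path(d) / "goodbyes.md").write_text("## intent:goodbye\n- bye\n")
    combined = collect_md_training(d)
print(combined)
```

Sorting the paths keeps the concatenation order deterministic across runs.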

+.. _train_parameters:

Train a Model
~~~~~~~~~~~~~
@@ -210,4 +211,4 @@ Here is a quick overview over the parameters you can pass to that script:
The other ways to train a model are

- training it using your own python code
-- training it using the HTTP api (:ref:`http`)
+- training it using the HTTP api (:ref:`section_http`)
35 changes: 19 additions & 16 deletions docs/http.rst
@@ -64,16 +64,16 @@ By default, when the project is not specified in the query, the
``"default"`` one will be used.
You can (should) specify the project you want to use in your query :

-.. code-block:: bash
+.. code-block:: console
-$ curl -XPOST localhost:5000/parse -d '{"q":"hello there", "project": "my_restaurant_search_bot"}
+$ curl -XPOST localhost:5000/parse -d '{"q":"hello there", "project": "my_restaurant_search_bot"}'
By default the latest trained model for the project will be loaded.
You can also query against a specific model for a project :

-.. code-block:: bash
+.. code-block:: console
-$ curl -XPOST localhost:5000/parse -d '{"q":"hello there", "project": "my_restaurant_search_bot", "model": <model_XXXXXX>}
+$ curl -XPOST localhost:5000/parse -d '{"q":"hello there", "project": "my_restaurant_search_bot", "model": "<model_XXXXXX>"}'
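The curl calls above translate to the standard library as follows. This sketch only constructs the request and assumes a server on ``localhost:5000``; nothing is sent:

```python
import json
import urllib.request

payload = {
    "q": "hello there",
    "project": "my_restaurant_search_bot",
    "model": "<model_XXXXXX>",  # optional: pin a specific trained model
}
req = urllib.request.Request(
    "http://localhost:5000/parse",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it once the server is running
print(req.get_method(), req.full_url)
```

Omitting the ``"model"`` key falls back to the latest trained model for the project, as described above.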
``POST /train``
@@ -198,7 +198,7 @@ This will return the default model configuration of the Rasa NLU instance.
}
``DELETE /models``
-^^^^^^^^^^^^^^^^^
+^^^^^^^^^^^^^^^^^^

This will unload a model from the server memory

@@ -230,22 +230,24 @@ So if you are serving multiple models in production, you want to serve these
from the same process & avoid duplicating the memory load.

.. note::
-Although this saves the backend from loading the same backend twice, it still needs to load one set of
+
+Although this saves the backend from loading the same backend twice, it still needs to load one set of
word vectors (which make up most of the memory consumption) per language and backend.

As stated previously, Rasa NLU naturally handles serving multiple apps : by default the server will load all projects found
under the ``path`` directory defined in the configuration. The file structure under ``path directory`` is as follows :

-- <path>
-  - <project_A>
-    - <model_XXXXXX>
-    - <model_XXXXXX>
-    ...
-  - <project_B>
-    - <model_XXXXXX>
-    ...
-  ...
+.. code-block:: text
+
+    - <path>
+      - <project_A>
+        - <model_XXXXXX>
+        - <model_XXXXXX>
+        ...
+      - <project_B>
+        - <model_XXXXXX>
+        ...
+      ...
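The server discovers this layout under ``path`` at startup. As a rough illustration (a hypothetical helper, not Rasa NLU's actual implementation, assuming model directories sort by their timestamp suffix), the scan could look like:

```python
import pathlib
import tempfile

def discover_projects(path):
    """Map each project directory to its models, newest first (lexicographic)."""
    projects = {}
    for project in sorted(p for p in pathlib.Path(path).iterdir() if p.is_dir()):
        models = sorted((m.name for m in project.iterdir() if m.is_dir()),
                        reverse=True)
        projects[project.name] = models
    return projects

# demo with a throwaway tree mirroring the layout above
with tempfile.TemporaryDirectory() as root:
    for proj, models in {"project_A": ["model_20180101", "model_20180401"],
                         "project_B": ["model_20180301"]}.items():
        for m in models:
            (pathlib.Path(root) / proj / m).mkdir(parents=True)
    found = discover_projects(root)
print(found)
```

The first entry per project plays the role of the "latest trained model" that a ``/parse`` request falls back to when no model is named.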
So you can specify which one to use in your ``/parse`` requests:

@@ -267,13 +269,14 @@ You can also specify the model you want to use for a given project, the default
If no project is to be found by the server under the ``path`` directory, a ``"default"`` one will be used, using a simple fallback model.

+.. _server_parameters:

Server Parameters
-----------------

There are a number of parameters you can pass when running the server.

-.. code-block:: bash
+.. code-block:: console
$ python -m rasa_nlu.server
1 change: 1 addition & 0 deletions docs/installation.rst
@@ -105,6 +105,7 @@ The complete pipeline for mitie can be found here

Training MITIE can be quite slow on datasets
with more than a few intents. You can try

- to use the sklearn + MITIE backend instead
(which uses sklearn for the training) or
- you can install `our mitie fork <https://github.com/tmbo/mitie>`_
6 changes: 4 additions & 2 deletions docs/languages.rst
@@ -51,7 +51,7 @@ MITIE
^^^^^

1. Get a ~clean language corpus (a Wikipedia dump works) as a set of text files
-2. Build and run `MITIE wordrep tool <https://github.com/mit-nlp/MITIE>`_ on your corpus. This can take several hours/days depending on your dataset and your workstation. You'll need something like 128GB of RAM for wordrep to run - yes that's alot: try to extend your swap.
+2. Build and run `MITIE Wordrep Tool`_ on your corpus. This can take several hours/days depending on your dataset and your workstation. You'll need something like 128GB of RAM for wordrep to run - yes that's alot: try to extend your swap.
3. Set the path of your new ``total_word_feature_extractor.dat`` as value of the *mitie_file* parameter in ``config_mitie.json``

.. _jieba:
Expand All @@ -66,11 +66,13 @@ from a Chinese corpus using the MITIE wordrep tools
(takes 2-3 days for training).

For training, please build the
-`MITIE Wordrep Tool <https://github.com/mit-nlp/MITIE/tree/master/tools/wordrep>`_.
+`MITIE Wordrep Tool`_.
Note that Chinese corpus should be tokenized first before feeding
into the tool for training. Close-domain corpus that best matches
user case works best.

-A detailed instruction on how to train the model yourself can be found in
+A trained model from Chinese Wikipedia Dump and Baidu Baike can be `crownpku <https://github.com/crownpku>`_ 's
`blogpost <http://www.crownpku.com/2017/07/27/%E7%94%A8Rasa_NLU%E6%9E%84%E5%BB%BA%E8%87%AA%E5%B7%B1%E7%9A%84%E4%B8%AD%E6%96%87NLU%E7%B3%BB%E7%BB%9F.html>`_.

+.. _`MITIE Wordrep Tool`: https://github.com/mit-nlp/MITIE/tree/master/tools/wordrep
5 changes: 3 additions & 2 deletions docs/migrations.rst
@@ -24,8 +24,9 @@ parameters. Example:
All other parameters have either been moved to the scripts
-for training / serving models :ref:`scripts`, or put into the pipeline
-configuration (:ref:`pipeline`).
+for training (:ref:`train_parameters`), serving models
+(:ref:`server_parameters`), or put into the pipeline
+configuration (:ref:`section_pipeline`).

persistors:
~~~~~~~~~~~
5 changes: 2 additions & 3 deletions docs/pipeline.rst
@@ -133,7 +133,7 @@ to use the components and configure them separately:
tensorflow_embedding
-~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~

to use it as a template:

@@ -158,7 +158,7 @@ To use the components and configure them separately:
- name: "intent_classifier_tensorflow_embedding"
Custom pipelines
-~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~

Creating your own pipelines is possible by directly passing the names of the
components to Rasa NLU in the ``pipeline`` configuration variable, e.g.
@@ -717,7 +717,6 @@ ner_duckling
Duckling allows to recognize dates, numbers, distances and other structured entities
and normalizes them (for a reference of all available entities
see `the duckling documentation <https://duckling.wit.ai/#getting-started>`_).
-The component recognizes the entity types defined by the :ref:`duckling dimensions configuration variable <section_configuration_duckling_dimensions>`.
Please be aware that duckling tries to extract as many entity types as possible without
providing a ranking. For example, if you specify both ``number`` and ``time`` as dimensions
for the duckling component, the component will extract two entities: ``10`` as a number and
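The dimensions behaviour described above (the example sentence is cut off in the diff) can be illustrated with a small stdlib-only sketch. The match shapes below are simplified stand-ins, not duckling's real output format:

```python
# simplified stand-ins for duckling matches on the text '10'
raw_matches = [
    {"dim": "number", "body": "10", "value": 10},
    {"dim": "time", "body": "10", "value": "10:00"},
    {"dim": "distance", "body": "10", "value": "10 km"},
]

def filter_by_dimensions(matches, dimensions):
    """Keep only matches whose dimension was configured; no ranking is applied."""
    return [m for m in matches if m["dim"] in dimensions]

entities = filter_by_dimensions(raw_matches, {"number", "time"})
print(entities)  # both the number and the time reading of '10' survive
```

With ``number`` and ``time`` configured, the single token yields two entities, which is exactly the unranked behaviour the paragraph warns about.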
