Merge pull request #88 from deepmipt/DeepPavlov-docs-patch-2
Update config.rst
p-rdx committed Apr 3, 2020
2 parents 87e3d5c + 7169463 commit 7dfcd08
Showing 1 changed file with 46 additions and 41 deletions.
87 changes: 46 additions & 41 deletions docs/source/config/config.rst
@@ -1,18 +1,19 @@
Agent Configuration
======================

You can provide pipeline and database configuration for agent with config files. Both ``json`` and ``yml`` formats are acceptable.
Configuration of pipeline and database for the **Agent** can be defined
in ``json`` or ``yml`` file.

**Config Description**

**Database**

Database configuration parameters are provided via ``db_conf`` file. Currently, agent runs on Mongo DB.
Database configuration parameters are provided via ``db_conf`` file. Currently, agent supports Mongo DB.

All default values are taken from `Mongo DB documentation <https://docs.mongodb.com/manual/>`__. Please refer to these docs if you need to
change anything.
All default values are taken from `Mongo DB documentation <https://docs.mongodb.com/manual/>`__.
Please refer to these docs if you need to change anything.

Sample database config:
Example of a database config:

.. code-block:: json
@@ -24,24 +25,26 @@ Sample database config:
}
* **env**
* If set to **false** (or not mentioned), exact parameters values will be used for db initialisation. Otherwise, agent will try to get an environmental variable by name, associated with parameter.
* If set to **false** (or not mentioned), specified parameters' values will be used for db initialisation. Otherwise, agent will try to get an environmental variable by name, associated with parameter.
* **host**
* A database host, or env variable, where database host name is stored
* A database host, or env variable, where database host name is stored.
* **port**
* A database port, or env variable, where database port is stored
* A database port, or env variable, where database port is stored.
* **name**
* An name of the database, or env variable, where name of the database is stored
* A name of the database, or env variable, where name of the database is stored.
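
The effect of the ``env`` switch can be illustrated with a minimal Python sketch. This is not the agent's actual loader: the function name ``resolve_db_conf`` and the sample values are hypothetical, and the sketch only assumes that, with ``env`` enabled, every parameter holds the name of an environment variable.

```python
import os


def resolve_db_conf(conf, environ=None):
    """Resolve host, port and name from a db_conf-like dict (hypothetical helper).

    If "env" is false or missing, the values are used as written.
    Otherwise each value is treated as the name of an environment variable.
    """
    environ = os.environ if environ is None else environ
    use_env = conf.get("env", False)

    resolved = {}
    for key in ("host", "port", "name"):
        value = conf[key]
        resolved[key] = environ[value] if use_env else value

    # Environment variables are always strings, so normalise the port.
    resolved["port"] = int(resolved["port"])
    return resolved
```

Passing ``environ`` explicitly keeps the sketch easy to try without touching the real process environment.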


**Pipeline**

Pipeline configuration parameters are provided via ``pipeline_conf`` file. There are two different sections in config, which are used to configure Connectors and Services
Pipeline configuration parameters are specified in ``pipeline_conf`` file.
There are two different sections in ``pipeline_conf`` to configure Connectors and Services.

**Services Config**

Service represents a single node of pipeline graph, or single step in processing of user message.
In ``pipeline_conf`` all services are grouped under "service" key.
Sample service config:
Service is a single node of pipeline graph, or a single step in processing of user message.
In ``pipeline_conf`` all services are grouped under ``service`` key.

Example of a service config:

.. code-block:: json
@@ -59,43 +62,45 @@ Sample service config:
}
* **group name**
* This is an optional key. If it is presented, you can mention services via their group name (in previous_services and required_previous_services)
* In case if `group name` is presented, the actual service name will be ``<group name>.<service label>``
* This is an optional key. If it is specified then services can be referenced by their `group name` in ``previous_services`` and ``required_previous_services``.
* If `group name` is specified then the service name is ``<group name>.<service label>``.
* **service_label**
* Label of the service. Used as unique service name, if service is not grouped
* Passed to state manager method, associated with service. So, service_label is saved in state
* Label of the service. Used as a unique service name, if service is not grouped.
* Passed to a state manager method, associated with the service. So, ``service_label`` is saved in state.
* **dialog_formatter**
* Function, which extracts all the needed information from dialog and generate a list of tasks for sending to services
* Can be configured in ``<python module name>:<function name>`` format
* Formatter can produce several tasks from one dialog (for instance, you want to annotate all hypotheses)
* Each task represents a single valid request payload, which can be processed by service without further formatting
* **response formatter**
* Function, which re-formats a service response in a way, which is suitable for saving in dialog state
* Can be configured in ``<python module name>:<function name>`` format
* Optional parameter. Exact service output will be sent to state manager method, if that parameter is not presented
* Generates list of tasks for services from a dialog state.
* Can be configured as ``<python module name>:<function name>``.
* Formatter can generate several tasks from the same dialog, for example, if you want to annotate all hypotheses.
* Each generated task corresponds to a single valid request payload to be processed by service without further formatting.
* **response_formatter**
* Maps a service response to the format of dialog state.
* Can be configured as ``<python module name>:<function name>``.
* Optional parameter. If not specified then unformatted service output is sent to state manager method.
* **connector**
* Function, which represents a connector to service. Can be configured here, or in Connectors
* You can link a connector from `connectors` section by typing ``connectors.<connector name>``
* Specifies a connector to a service. Can be configured here, or in `connectors` section.
* You can link a connector from `connectors` section by specifying ``connectors.<connector name>``.
* **previous_services**
* List of name of services, which should be completed (or skipped, or respond with an error) before sending data to current service
* Should contain either groups names or service names
* List of services to be executed (or skipped, or respond with an error) before sending data to the service.
* Should contain either group names or service names.
* **required_previous_services**
* List of names of services, which must be correctly completed before this service since their results are used in current service
* If at least one of the required_previous_services is skipped or finished with error, current service will be skipped to
* Should contain either groups names or service names
* List of services to be completed correctly before the service, because it depends on their output.
* If at least one of the ``required_previous_services`` is skipped or finished with an error, the service is not executed.
* Should contain either group names or service names.
* **state_manager_method**
* Name of the method of a StateManager class, which will be executed afterwards
* Name of a ``StateManager`` class method to be executed after the service response.
* **tags**
* Tags, associated with the service
* Currently, tags are used in order to separate a service with specific behaviour
* **selector** - this tag marks a skill selector service. It returns a list of skills, which are selected for further processing
* **timeout** - this tag marks a timeout service, which will engage if deadline timestamp is presented and processing time exceeds it
* **last_chance** - this tag marks a last chance service, which will engage if other services in pipeline have finished executing with an error, and further processing became impossible
* Tags, associated with the service to indicate a specific behaviour.
* **selector** - corresponds to skill selector service. This service returns a list of skills selected for response generation.
* **timeout** - corresponds to timeout service. This service is called when processing time exceeds specified limit.
* **last_chance** - corresponds to last chance service. This service is called if other services in pipeline have returned an error, and further processing is impossible.
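
The ``<python module name>:<function name>`` format used by the formatter keys can be sketched in Python as follows. This is an illustration under assumptions, not the agent's own code: ``load_formatter`` and ``hypotheses_formatter`` are hypothetical names, and the dialog layout (a dict with a ``hypotheses`` list whose items carry a ``text`` field) is assumed only for the example.

```python
import importlib


def load_formatter(reference):
    """Import the function named by a "<module>:<function>" string (hypothetical helper)."""
    module_name, _, function_name = reference.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


def hypotheses_formatter(dialog):
    """Hypothetical dialog formatter: one task per hypothesis.

    Each returned item is meant to be a complete request payload,
    so a service could process it without further formatting.
    """
    return [{"text": hypothesis["text"]} for hypothesis in dialog.get("hypotheses", [])]
```

A dialog with two hypotheses yields two tasks here, which matches the case where every hypothesis needs to be annotated separately.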

**Connectors config**

Connector represents a function, where tasks are sent in order to process. Can be implementation of some data transfer protocol or model implemented in python.
Since agent is based on asynchronous execution, and can be slowed down by blocking synchronous parts, it is strongly advised to implement computational heavy services separate from agent, and use some protocols (like http) for data transfer.
Connector represents a function, where tasks are sent in order to process.
Can be implementation of some data transfer protocol or model implemented in python.
Since agent is based on asynchronous execution, and can be slowed down by blocking synchronous parts,
it is strongly advised to implement computational heavy services separate from agent,
and use some protocols (like http) for data transfer.

There are several possibilities, to configure connector:

@@ -117,7 +122,7 @@ There are several possibilities, to configure connector:
* **url**
* Actual url, where an external service api is accessible. Should be in format ``http://<host>:<port>/<path>``
* **batch_size**
* Represents a maximum task count, which will be sent to a service in a batch. If not presented is interpreted as 1
* Represents a maximum task count, which will be sent to a service in a batch. If not specified is interpreted as 1
* If the value is 1, an `HTTPConnector <https://github.com/deepmipt/dp-agent/blob/master/deeppavlov_agent/core/connectors.py#L10>`__ class is used.
* If the value is more than one, agent will use `AioQueueConnector <https://github.com/deepmipt/dp-agent/blob/master/deeppavlov_agent/core/connectors.py#L32>`__. That connector sends data to an asyncio queue. At the same time, the `QueueListenerBatchifyer <https://github.com/deepmipt/dp-agent/blob/master/deeppavlov_agent/core/connectors.py#L40>`__ worker collects data from the queue, assembles batches and sends them to a service.
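
The queue-and-worker idea behind ``batch_size`` can be sketched with plain ``asyncio``. This is a simplified illustration, not the implementation of the connector classes linked above: ``batch_worker``, ``send_batch`` and ``demo`` are hypothetical names, and the real classes may differ in how they collect and dispatch batches.

```python
import asyncio


async def batch_worker(queue, send_batch, batch_size):
    """Collect up to batch_size queued tasks and send them together."""
    while True:
        # Wait until at least one task is available.
        batch = [await queue.get()]
        # Take whatever else is already queued, up to the batch limit.
        while len(batch) < batch_size and not queue.empty():
            batch.append(queue.get_nowait())
        await send_batch(batch)
        for _ in batch:
            queue.task_done()


async def demo():
    sent = []

    async def send_batch(batch):
        # Stand-in for a request to an external service.
        sent.append(list(batch))

    queue = asyncio.Queue()
    for task in range(5):
        queue.put_nowait(task)

    worker = asyncio.create_task(batch_worker(queue, send_batch, batch_size=2))
    await queue.join()   # wait until every queued task has been sent
    worker.cancel()      # the worker loops forever, so stop it explicitly
    return sent
```

Running ``asyncio.run(demo())`` groups the five queued tasks into batches of at most two items.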
