Merge branch 'master' into dev
p-rdx committed Apr 3, 2020
2 parents 8e97a3a + 7dfcd08 commit 8c006db
Showing 4 changed files with 70 additions and 50 deletions.
9 changes: 3 additions & 6 deletions deeppavlov_agent/core/state_schema.py
@@ -215,8 +215,8 @@ def to_dict(self):
             'human': self.human.to_dict(),
             'bot': self.bot.to_dict(),
             'channel_type': self.channel_type,
-            'date_start': self.date_start,
-            'date_finish': self.date_finish,
+            'date_start': str(self.date_start),
+            'date_finish': str(self.date_finish),
         }

     async def load_external_info(self, db):
@@ -273,10 +273,7 @@ async def get_by_id(cls, db, dialog_id):
         if dialog:
             human = await Human.get_by_id(db, dialog['_human_id'])
             dialog_obj = cls(actual=True, human=human, **dialog)
-            human_utterances = await HumanUtterance.get_many(db, dialog_obj._id)
-            bot_utterances = await BotUtterance.get_many(db, dialog_obj._id)
-            dialog_obj.utterances = sorted(chain(human_utterances, bot_utterances), key=lambda x: x._in_dialog_id)
-            dialog_obj.bot = await Bot.get_or_create(db, dialog_obj._bot_id)
+            await dialog_obj.load_external_info(db)
             return dialog_obj
         return None
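
Note on the `to_dict` change above: wrapping the dates in `str()` makes the resulting dictionary safe to pass to a JSON encoder. The sketch below illustrates the difference, assuming `date_start` holds a `datetime.datetime` (the diff does not show the field type, so this is an assumption) and using a stand-in `record` dictionary rather than the real `Dialog` class.

```python
import json
from datetime import datetime

# Stand-in for the dictionary produced by to_dict(); the datetime type is an assumption.
record = {"date_start": datetime(2020, 4, 3, 12, 0, 0)}

# The standard JSON encoder rejects raw datetime objects.
try:
    json.dumps(record)
    raw_ok = True
except TypeError:
    raw_ok = False

# Converting each value with str() first, as the commit does, serializes cleanly.
serialized = json.dumps({key: str(value) for key, value in record.items()})
```

The trade-off is that consumers receive a plain string and have to parse it back if they need date arithmetic.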

14 changes: 13 additions & 1 deletion deeppavlov_agent/parse_config.py
@@ -89,6 +89,13 @@ def make_connector(self, name: str, data: Dict):
         self.connectors[name] = connector

     def make_service(self, group: str, name: str, data: Dict):
+        def check_ext_module(class_name):
+            params = class_name.split(':')
+            formatter_class = None
+            if len(params) == 2:
+                formatter_class = getattr(self.get_external_module(params[0]), params[1])
+            return formatter_class
+
         connector_data = data.get('connector', None)
         service_name = ".".join([i for i in [group, name] if i])
         if 'workflow_formatter' in data and not data['workflow_formatter']:
@@ -117,15 +124,20 @@ def make_service(self, group: str, name: str, data: Dict):
         dialog_formatter_name = data.get('dialog_formatter', None)
         response_formatter_name = data.get('response_formatter', None)
         if dialog_formatter_name:
-            if dialog_formatter_name in dialog_formatter_name:
+            if dialog_formatter_name in all_formatters:
                 dialog_formatter = all_formatters[dialog_formatter_name]
             else:
+                dialog_formatter = check_ext_module(dialog_formatter_name)
+            if not dialog_formatter:
                 raise ValueError(f"formatter {dialog_formatter_name} doesn't exist (declared in {service_name})")
         if response_formatter_name:
             if response_formatter_name in all_formatters:
                 response_formatter = all_formatters[response_formatter_name]
             else:
+                response_formatter = check_ext_module(response_formatter_name)
+            if not response_formatter:
                 raise ValueError(f"formatter {response_formatter_name} doesn't exist (declared in {service_name})")
+
         names_previous_services = set()
         for sn in data.get('previous_services', set()):
             names_previous_services.update(self.services_names.get(sn, set()))
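
Note on the formatter lookup above: the removed condition tested the name against itself, which is always true for a string, so an unknown name reached the `all_formatters[...]` lookup and failed there instead of reaching the `ValueError`. The new code first checks the built-in table and then falls back to a `<python module name>:<function name>` reference. A minimal standalone sketch of that resolution order follows; `resolve_formatter` is a hypothetical helper, and it uses `importlib.import_module` as a stand-in for `self.get_external_module`, whose implementation is not shown in this diff.

```python
import importlib
import json


def resolve_formatter(spec, known_formatters):
    """Return a formatter for `spec`, or None if it cannot be resolved."""
    # 1. Built-in formatters are looked up by plain name.
    if spec in known_formatters:
        return known_formatters[spec]
    # 2. Otherwise expect "<module>:<function>"; anything else is unresolved.
    parts = spec.split(":")
    if len(parts) != 2:
        return None
    module = importlib.import_module(parts[0])  # assumption: stands in for get_external_module
    return getattr(module, parts[1])


resolved = resolve_formatter("json:dumps", {})
builtin = resolve_formatter("base", {"base": len})
missing = resolve_formatter("no_such_formatter", {})
```

As in the commit, a caller would treat a `None` result as "formatter doesn't exist" and raise an error that names the service.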
87 changes: 46 additions & 41 deletions docs/source/config/config.rst
@@ -1,17 +1,18 @@
 Agent Configuration
 ====================

-You can provide pipeline and database configuration for agent with config files. Both ``json`` and ``yml`` formats are acceptable.
+Configuration of pipeline and database for the **Agent** can be defined
+in ``json`` or ``yml`` file.

 Database Config Description
 ===========================

-Database configuration parameters are provided via ``db_conf`` file. Currently, agent runs on Mongo DB.
+Database configuration parameters are provided via ``db_conf`` file. Currently, agent supports Mongo DB.

-All default values are taken from `Mongo DB documentation <https://docs.mongodb.com/manual/>`__. Please refer to these docs if you need to
-change anything.
+All default values are taken from `Mongo DB documentation <https://docs.mongodb.com/manual/>`__.
+Please refer to these docs if you need to change anything.

-Sample database config:
+Example of a database config:

 .. code-block:: json
@@ -23,27 +24,29 @@ Sample database config:
 }
 * **env**
-* If set to **false** (or not mentioned), exact parameters values will be used for db initialisation. Otherwise, agent will try to get an environmental variable by name, associated with parameter.
+* If set to **false** (or not mentioned), specified parameters' values will be used for db initialisation. Otherwise, agent will try to get an environmental variable by name, associated with parameter.
 * **host**
-* A database host, or env variable, where database host name is stored
+* A database host, or env variable, where database host name is stored.
 * **port**
-* A database port, or env variable, where database port is stored
+* A database port, or env variable, where database port is stored.
 * **name**
-* An name of the database, or env variable, where name of the database is stored
+* An name of the database, or env variable, where name of the database is stored.


 Pipeline Config Description
 ===========================

-Pipeline configuration parameters are provided via ``pipeline_conf`` file. There are two different sections in config, which are used to configure Connectors and Services
+Pipeline configuration parameters are specified in ``pipeline_conf`` file.
+There are two different sections in ``pipeline_conf`` to configure Connectors and Services.


 **Services Config**
 -------------------

-Service represents a single node of pipeline graph, or single step in processing of user message.
-In ``pipeline_conf`` all services are grouped under "service" key.
-Sample service config:
+Service is a single node of pipeline graph, or a single step in processing of user message.
+In ``pipeline_conf`` all services are grouped under ``service`` key.
+
+Example of a service config:

 .. code-block:: json
@@ -61,45 +64,47 @@ Sample service config:
 }
 * **group name**
-* This is an optional key. If it is presented, you can mention services via their group name (in previous_services and required_previous_services)
-* In case if `group name` is presented, the actual service name will be ``<group name>.<service label>``
+* This is an optional key. If it is specified then services can be referenced by their `group name` in ``previous_services`` and ``required_previous_services``.
+* If `group name` is specified then the service name is ``<group name>.<service label>``.
 * **service_label**
-* Label of the service. Used as unique service name, if service is not grouped
-* Passed to state manager method, associated with service. So, service_label is saved in state
+* Label of the service. Used as a unique service name, if service is not grouped.
+* Passed to a state manager method, associated with the service. So,``service_label`` is saved in state.
 * **dialog_formatter**
-* Function, which extracts all the needed information from dialog and generate a list of tasks for sending to services
-* Can be configured in ``<python module name>:<function name>`` format
-* Formatter can produce several tasks from one dialog (for instance, you want to annotate all hypotheses)
-* Each task represents a single valid request payload, which can be processed by service without further formatting
-* **response formatter**
-* Function, which re-formats a service response in a way, which is suitable for saving in dialog state
-* Can be configured in ``<python module name>:<function name>`` format
-* Optional parameter. Exact service output will be sent to state manager method, if that parameter is not presented
+* Generates list of tasks for services from a dialog state.
+* Can be configured as ``<python module name>:<function name>``.
+* Formatter can generate several tasks from the same dialog, for example, if you want to annotate all hypotheses.
+* Each generated task corresponds to a single valid request payload to be processed by service without further formatting.
+* **response_formatter**
+* Maps a service response to the format of dialog state.
+* Can be configured as ``<python module name>:<function name>``.
+* Optional parameter. If not specified then unformatted service output is sent to state manager method.
 * **connector**
-* Function, which represents a connector to service. Can be configured here, or in Connectors
-* You can link a connector from `connectors` section by typing ``connectors.<connector name>``
+* Specifies a connector to a service. Can be configured here, or in `connectors` section.
+* You can link a connector from `connectors` section by specifying ``connectors.<connector name>``.
 * **previous_services**
-* List of name of services, which should be completed (or skipped, or respond with an error) before sending data to current service
-* Should contain either groups names or service names
+* List of services to be executed (or skipped, or respond with an error) before sending data to the service.
+* Should contain either group names or service names.
 * **required_previous_services**
-* List of names of services, which must be correctly completed before this service since their results are used in current service
-* If at least one of the required_previous_services is skipped or finished with error, current service will be skipped to
-* Should contain either groups names or service names
+* List of services to be completed correctly before the service, because it depends on their output.
+* If at least one of the ``required_previous_services`` is skipped or finished with an error, the service is not executed.
+* Should contain either group names or service names.
 * **state_manager_method**
-* Name of the method of a StateManager class, which will be executed afterwards
+* Name of a ``StateManager`` class method to be executed after the service response.
 * **tags**
-* Tags, associated with the service
-* Currently, tags are used in order to separate a service with specific behaviour
-* **selector** - this tag marks a skill selector service. It returns a list of skills, which are selected for further processing
-* **timeout** - this tag marks a timeout service, which will engage if deadline timestamp is presented and processing time exceeds it
-* **last_chance** - this tag marks a last chance service, which will engage if other services in pipeline have finished executing with an error, and further processing became impossible
+* Tags, associated with the service to indicate a specific behaviour.
+* **selector** - corresponds to skill selector service. This service returns a list of skills selected for response generation.
+* **timeout** - corresponds to timeout service. This service is called when processing time exceeds specified limit.
+* **last_chance** - corresponds to last chance service. This service is called if other services in pipeline have returned an error, and further processing is impossible.


 **Connectors config**
 ---------------------

-Connector represents a function, where tasks are sent in order to process. Can be implementation of some data transfer protocol or model implemented in python.
-Since agent is based on asynchronous execution, and can be slowed down by blocking synchronous parts, it is strongly advised to implement computational heavy services separate from agent, and use some protocols (like http) for data transfer.
+Connector represents a function, where tasks are sent in order to process.
+Can be implementation of some data transfer protocol or model implemented in python.
+Since agent is based on asynchronous execution, and can be slowed down by blocking synchronous parts,
+it is strongly advised to implement computational heavy services separate from agent,
+and use some protocols (like http) for data transfer.

 There are several possibilities, to configure connector:

@@ -121,7 +126,7 @@ There are several possibilities, to configure connector:
 * **url**
 * Actual url, where an external service api is accessible. Should be in format ``http://<host>:<port>/<path>``
 * **batch_size**
-* Represents a maximum task count, which will be sent to a service in a batch. If not presented is interpreted as 1
+* Represents a maximum task count, which will be sent to a service in a batch. If not specified is interpreted as 1
 * If the value is 1, an `HTTPConnector <https://github.com/deepmipt/dp-agent/blob/master/deeppavlov_agent/core/connectors.py#L10>`__ class is used.
 * If the value is more than one, agent will use `AioQueueConnector <https://github.com/deepmipt/dp-agent/blob/master/deeppavlov_agent/core/connectors.py#L32>`__. That connector sends data to asyncio queue. Same time, worker `QueueListenerBatchifyer <https://github.com/deepmipt/dp-agent/blob/master/deeppavlov_agent/core/connectors.py#L40>`__, which collects data from queue, assembles batches and sends them to a service.
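
Note on the `batch_size` description above: the queue-and-batch idea can be sketched in a few lines of `asyncio`. This is an illustration only, under the assumption that a worker waits for one task and then drains whatever else is already queued, up to `batch_size`; `collect_batch` and `demo` are hypothetical names and do not reproduce the agent's `AioQueueConnector` or `QueueListenerBatchifyer`.

```python
import asyncio


async def collect_batch(queue, batch_size):
    """Wait for one task, then take already-queued tasks until the batch is full."""
    batch = [await queue.get()]
    while len(batch) < batch_size:
        try:
            batch.append(queue.get_nowait())
        except asyncio.QueueEmpty:
            break  # nothing else is waiting, send a partial batch
    return batch


async def demo():
    queue = asyncio.Queue()
    for task in ["t1", "t2", "t3", "t4", "t5"]:
        queue.put_nowait(task)
    batches = []
    while not queue.empty():
        batches.append(await collect_batch(queue, batch_size=2))
    return batches


batches = asyncio.run(demo())
```

With `batch_size=1` every batch holds a single task, which matches the documented default of sending tasks one at a time.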

10 changes: 8 additions & 2 deletions setup.py
@@ -24,10 +24,16 @@ def read_requirements():

setuptools.setup(
name='deeppavlov_agent',
version='2.0.0',
version='2.0.1',
include_package_data=True,
description='An open source library, allowing you to create data processing systems based on a sequence graph, '
'alongside with saving sample processing results in database.',
long_description='An open source library, allowing you to create data processing systems based on a sequence '
'graph, alongside with saving sample processing results in database. '
'Possible application is chatbots or other NLP systems which combine multiple skills.',
keywords=['chatbots', 'microservices', 'dialog systems', 'NLP'],
packages=setuptools.find_packages(exclude=('docs',)),
python_requires='>=3.7',
url="https://https://github.com/deepmipt/dp-agent",
url="https://github.com/deepmipt/dp-agent",
**read_requirements()
)
