Agents endpoints #449
Conversation
@PHLF am I correct that you haven't made any changes to the flask endpoints yet? I was curious what it would look like to load a specific agent.
You're right: I haven't made any API changes yet. I'd like the actual work to be reviewed first before introducing any breaking changes. And yes, loading a specific model for your agent is something that would be great to introduce 😄
So without the ability to load a specific model, this would be a major breaking change for us. When we train a model we put it through an evaluation phase, and only after the evaluation is done does a user in our system choose whether or not to activate it. The activation is done by maintaining our own list of active models and simply routing requests appropriately. In the end we can hack around it either way by strictly interacting with agents the way we currently interact with models. Just food for thought. I'm swamped this week, but I may get myself or my team involved next week. Either way, we are watching with nervous anticipation.
You don't have to be nervous, as my incentive is not to break anything but to remove the need for the hacks you mentioned. By the way, it should be easy to extend the HTTP API to request a specific model for a specific agent. Still, this is just a PR with no guarantee of being integrated at all 😉
Looking good 👍 some minor remarks
_pytest/test_multitenancy.py (outdated)

```diff
@@ -119,19 +115,19 @@ def test_post_parse_invalid_model(client, response_test):
     assert response.json.get("error").startswith(response_test.expected_response["error"])

 if __name__ == '__main__':
```
The script should still have a main method to easily generate the models: these are models we don't want to retrain during the execution of the tests, but rather use to test whether we can still load previously trained models.
I call the train() function each time because I had an issue where the test models trained with Python 2 were not loadable with Python 3 and vice versa because of pickle (the issue was more specifically related to joblib).
ok 👍
rasa_nlu/data_router.py (outdated)

```diff
 @staticmethod
 def _latest_agent_model(agent_path):
     """Retrieves the latest trained model for an agent"""
     agent_models = {model[6:]: model for model in os.listdir(agent_path)}
```
We need to remove the magic number: e.g. define "model_" as a prefix constant somewhere and take its length here.
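A minimal sketch of what that suggestion could look like; the constant name, helper name, and return behaviour are illustrative, not the PR's actual code:

```python
import os

# Illustrative constant; where rasa_nlu actually defines it may differ.
MODEL_NAME_PREFIX = "model_"


def latest_agent_model(agent_path):
    """Return the most recently trained model directory under agent_path.

    Maps each model's timestamp suffix to its directory name, using the
    prefix constant's length instead of the magic number 6. Model names
    embed a sortable timestamp, so max() over the suffixes picks the
    latest one.
    """
    agent_models = {
        name[len(MODEL_NAME_PREFIX):]: name
        for name in os.listdir(agent_path)
        if name.startswith(MODEL_NAME_PREFIX)
    }
    if not agent_models:
        return None
    return agent_models[max(agent_models)]
```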
👍
rasa_nlu/data_router.py
Outdated
return { | ||
"trainings_under_this_process": num_trainings, | ||
"available_models": models, | ||
"available_models": agents, |
these aren't models, right? ;)
Right, this was left as-is to be a matter for discussion: what do we give to the end user?
- Just the list of endpoints
- The list of endpoints with the actual model
- The above, plus each endpoint's status ({ready, training, ...}); would that really be useful?
Can we actually get the information for the last option for all the endpoints? (If it's easy, that would be great; if not, it's not a big deal.)
What do you mean exactly? Are you talking about the endpoint status or the underlying model?
How I see things:
```json
{
    "agent A": {
        "status": "ready",
        "default_model": "model_xxxxxx",
        "available_models": [
            "model_xxxxxx",
            "model_xxxxxx"
        ]
    },
    "agent B": {
        "status": "training",
        "default_model": "model_xxxxxx",
        "available_models": [
            "model_xxxxxx",
            "model_xxxxxx"
        ]
    }
}
```
@wrathagom Would that be enough for your use case?
@tmbo Do you have any suggestion regarding some HTTP API changes: can I go for loading a specific model for a specific agent?
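The per-agent status shape sketched in that comment could be assembled from the router's in-memory state roughly like this; the function and attribute names are illustrative assumptions, not the PR's code:

```python
def build_status(projects):
    """Build the per-agent status payload proposed above.

    Assumes `projects` maps agent names to objects exposing
    `status`, `default_model`, and `available_models` attributes
    (hypothetical names for this sketch).
    """
    return {
        name: {
            "status": project.status,
            "default_model": project.default_model,
            "available_models": list(project.available_models),
        }
        for name, project in projects.items()
    }
```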
@PHLF sure, that sounds very useful. The only thing we need then is a way for the API user to know which model is currently loaded for an agent.
And there might be threading issues, right? So you could have loaded the recent model in one thread and an older one in another thread, which sounds dangerous.
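One common way to make that kind of lazy model loading thread-safe is double-checked locking around the model cache. This is an illustrative sketch of the pattern, not the PR's implementation; the class and loader are hypothetical:

```python
import threading


class LazyModels:
    """Illustrative thread-safe lazy loader for named models."""

    def __init__(self, loader):
        self._loader = loader          # callable: name -> loaded model
        self._models = {}
        self._lock = threading.Lock()

    def get(self, name):
        model = self._models.get(name)
        if model is None:
            with self._lock:
                # Re-check under the lock: another thread may have
                # loaded the same model while we were waiting.
                model = self._models.get(name)
                if model is None:
                    model = self._loader(name)
                    self._models[name] = model
        return model
```

The lock ensures each model is loaded exactly once even under concurrent requests, so two threads cannot end up with different copies for the same name.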
Checks are erroring: that's OK. It's still very much a work in progress, but at least it is advanced enough to let you give some feedback about the current code structure (among other things, I need to take care of multithreading issues).
Agent -> Projects? Sounds a bit strange to me. What is the motivation behind such a name?
So you might have multiple bots under development (say one restaurant bot and one customer service bot), and each has its own training data and model. So the project would be e.g. 'restaurants', and you can always call
Fair enough. Let's do some renaming then.
Should the logic behind the emulators be changed? They only allow querying a specific "project", therefore automatically querying the latest model (which is the behaviour of API.ai-like services).
I think sensible behaviour would be to either specify a project (and Rasa uses the latest model) or specify a specific model - @tmbo WDYT?
Yes, I agree, querying the latest is a sensible default. But being able to use a specific model is quite useful as well - we should do both.
Querying a specific model for a specific project already works, but I'm not sure it works with the emulators as well.
Sounds good. Why wouldn't the emulation work? I think they just post-process the pipeline result.
My bad, it's not related to the emulators but to the way some parameters are normalized. Does this need further changes to also normalize a model parameter?

```python
def normalise_request_json(self, data):
    # type: (Dict[Text, Any]) -> Dict[Text, Any]
    _data = {}
    _data["text"] = data["q"][0] if type(data["q"]) == list else data["q"]
    if not data.get("project"):
        _data["project"] = "default"
    elif type(data["project"]) == list:
        _data["project"] = data["project"][0]
    else:
        _data["project"] = data["project"]
    _data['time'] = data["time"] if "time" in data else None
    return _data
```
Yes, that's right, the parameter needs to be added there.
I made some changes (a readers-writer lock) to ease the development of further enhancements.
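A sketch of what adding the model parameter to that normalisation could look like. This is written as a standalone function (no `self`) and assumes the same one-element-list unwrapping as the snippet above; it is not the merged code:

```python
def normalise_request_json(data):
    """Normalise query parameters that may arrive as one-element lists."""

    def first(value):
        # Query strings often deliver each value as a one-element list,
        # so unwrap lists consistently for every parameter.
        return value[0] if isinstance(value, list) else value

    _data = {}
    _data["text"] = first(data["q"])
    _data["project"] = first(data.get("project") or "default")
    # The new part: normalise an optional "model" parameter too.
    _data["model"] = first(data["model"]) if data.get("model") else None
    _data["time"] = data.get("time")
    return _data
```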
rasa_nlu/project.py (outdated)

```diff
@@ -32,23 +34,32 @@ def __init__(self, config=None, component_builder=None, project=None):
     self._default_model = self._latest_project_model() or 'fallback'

 def parse(self, text, time=None, model=None):
     # Lazy model loading
     # Readers-writer lock simple double mutex implementation
     self._reader_lock.acquire()
```
Let's put the lock handling into a separate function.
rasa_nlu/project.py (outdated)

```diff
     self._lock.release()
     response = self._models[model].parse(text, time)

     self._reader_lock.acquire()
```
separate function
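The "readers-writer lock simple double mutex" mentioned in the diff can be factored out of parse() into a small helper, as the reviewer suggests. Below is an illustrative sketch of the classic two-mutex pattern using context managers, not the PR's actual implementation:

```python
import threading
from contextlib import contextmanager


class ReadersWriterLock:
    """Classic double-mutex readers-writer lock: many concurrent readers,
    while a writer gets exclusive access once no reader holds the lock."""

    def __init__(self):
        self._reader_count = 0
        self._counter_lock = threading.Lock()   # guards _reader_count
        self._resource_lock = threading.Lock()  # held by writer or first reader

    @contextmanager
    def read(self):
        with self._counter_lock:
            self._reader_count += 1
            if self._reader_count == 1:
                # First reader in blocks out writers.
                self._resource_lock.acquire()
        try:
            yield
        finally:
            with self._counter_lock:
                self._reader_count -= 1
                if self._reader_count == 0:
                    # Last reader out lets writers proceed.
                    self._resource_lock.release()

    @contextmanager
    def write(self):
        with self._resource_lock:
            yield
```

With a helper like this, parse() could wrap its read path in `with self._rw_lock.read():` instead of pairing acquire/release calls by hand, which removes the repeated lock handling the review comments point at.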
@PHLF do you think we can get this merged today?
I have to update the docs, but I think this is technically ready.
@amn41 @tmbo @wrathagom As I said, I think this is technically OK. I did a first pass on the docs. Let me know what you think.
Looks really good - you did a great job with this PR. Let's get it merged when the Travis build is finished. We'll tackle issues with the documentation along the way.
Only time will tell 😃 Next thing to add: an HTTP endpoint to unload models from memory. The server-side logic is in place; it just requires a new endpoint and a new route handler in the data router. Thanks Tom for your support.
Yes, you are right, I think that is another really good addition. But as I don't want to delay this PR any further (I know that you already had to resolve a lot of merge conflicts 😉), I'd rather see it added separately.
🎉
Proposed changes:
Status:
The idea behind this is to enhance how Rasa handles multiple apps, to be on par with services like api.ai. Indeed, when a client sends a training request like this:
127.0.0.1:5000/train?name=agent_A
you expect your agent (endpoint) to still be available for parsing requests, and to be updated with the newly trained model once it is available. So models would be a way to version your agent.

Original PR: #426