multitenancy support #36
Comments
Personally I prefer the idea of having several models served by one server, although I think the implications of that for spaCy and MITIE would have to be well investigated. So basically you would have a /models/flights or /models/restaurants, right? Each of them would then be able to answer queries at /models/flights/parse?q=flight from Tokyo to Munich
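The URL scheme proposed above can be sketched with plain standard-library code, independent of any web framework. The model names and parse functions here are illustrative stand-ins, not the rasa NLU API:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical per-tenant parse functions; in rasa NLU each would be a
# trained interpreter for that tenant's domain.
MODELS = {
    "flights": lambda q: {"model": "flights", "text": q},
    "restaurants": lambda q: {"model": "restaurants", "text": q},
}

def dispatch(url):
    """Route /models/<name>/parse?q=... to the matching model."""
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "models" or parts[2] != "parse":
        raise ValueError("expected a path of the form /models/<name>/parse")
    name = parts[1]
    if name not in MODELS:
        raise KeyError("no model named %r" % name)
    # parse_qs returns a list per key; take the first "q" value.
    query = parse_qs(parsed.query).get("q", [""])[0]
    return MODELS[name](query)
```

In a real server the same lookup would sit behind a framework route such as Flask's `/models/<name>/parse`, with `MODELS` populated from trained model files at start-up.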
Just an idea: use multiprocessing's Process class to spawn processes from a "main" process, then use its Queue class to implement a facade over spaCy, MITIE and anything else that has a heavy memory footprint or a long start-up time. That way those things only run in the "main" process, as long as they can be shared. EDIT: on second thought, it would be cleaner and just as performant to make spaCy and MITIE microservices themselves.
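The facade idea above can be sketched as a worker that loads the heavy resource exactly once and answers requests over queues. To keep the sketch runnable anywhere, it uses `multiprocessing.dummy` (thread-backed, but the same `Process` API); for real process isolation you would swap in `multiprocessing.Process` and `multiprocessing.Queue`, paying the cost of serialising messages. The class and callback names are illustrative:

```python
from multiprocessing.dummy import Process  # thread-backed, same API as multiprocessing.Process
from queue import Queue


class HeavyBackendFacade:
    """Keep a heavy resource (e.g. a spaCy or MITIE model) in one
    long-lived worker and serve callers through request/response queues."""

    def __init__(self, load_heavy_model):
        self.requests = Queue()
        self.responses = Queue()
        self.worker = Process(target=self._serve, args=(load_heavy_model,))
        self.worker.daemon = True
        self.worker.start()

    def _serve(self, load_heavy_model):
        model = load_heavy_model()  # the expensive load happens exactly once
        while True:
            text = self.requests.get()
            if text is None:  # shutdown sentinel
                break
            self.responses.put(model(text))

    def parse(self, text):
        self.requests.put(text)
        return self.responses.get()

    def close(self):
        self.requests.put(None)
```

With real processes the queue traffic adds pickling overhead per request, which is part of why the EDIT above suggests standalone microservices instead.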
Thanks for the input @baregawi. You guys are on AWS, right? So would you run this on Elastic Beanstalk or just a regular VPS?
For our main servers, we use Flask as our application framework and deploy through AWS CodeDeploy at the moment since our application is not simple enough for Heroku or Elastic Beanstalk. But I imagine that if we contribute models to rasa_nlu they will be our individual ML/NLP modules which are all Python classes at the moment. |
This suggestion also comes from @3x14159265.

Idea is to have rasa NLU provide multiple apps, e.g. have several models loaded into memory and serve requests based on them (routed by the URL).

The simplest approach is to start a separate server for each model, and use a supervisor. But each process will have word vectors loaded into memory, which means you can't fit very many on a server.
A better way would be to have several models loaded within one server, although I think only the spaCy backend would actually be able to share the large memory component between them. Would probably be doable to modify MITIE to support that as well.

To help plan this out, it would be really helpful if people wrote their intended deployment setup here, so we can discuss various trade-offs.
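The shared-memory-component idea can be sketched as follows: each per-tenant model stays small because it holds a reference to one shared vector table rather than its own copy. The class names and toy vectors are illustrative stand-ins for spaCy's shared vocab/word vectors, not real rasa NLU or spaCy API:

```python
class SharedVectors:
    """Stand-in for the large word-vector table, loaded once per server."""

    def __init__(self):
        # Toy data; in practice this is the multi-gigabyte component.
        self.vectors = {"tokyo": [0.1, 0.9], "munich": [0.8, 0.2]}


class TenantModel:
    """Per-tenant model: small tenant-specific state plus a *reference*
    to the shared vectors, so N tenants cost one vector table, not N."""

    def __init__(self, name, shared):
        self.name = name
        self.shared = shared

    def parse(self, text):
        tokens = text.lower().split()
        known = [t for t in tokens if t in self.shared.vectors]
        return {"model": self.name, "known_tokens": known}


# One shared component, many tenant models routed by name.
shared = SharedVectors()
models = {name: TenantModel(name, shared) for name in ("flights", "restaurants")}
```

This is the property that makes the single-server approach attractive over one-process-per-model: memory grows with the number of tenants' small models, not with the number of copies of the vector table.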