Multiple users, multiple conversations, multiple contexts #2942
Super cool, really amazing what you've done! You might need to create your own custom world to manage all this context. One really important point:
(that is, instead of using …)
Thank you! I'll try to create one agent at initialization and clone that "pure" model every time a new user starts a conversation. I may post something important in this topic, so I ask you not to close it.
Just skimmed your branch quickly, very cool. I think others would love a Telegram chat service being added to ParlAI upstream. Let me know if you're interested in generalizing some of your work.
Ok! But I am developing a commercial project, so there is a question of confidentiality. I created a "cache" dict and clone the agent if it has already been created:
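The cache-and-clone pattern described above might look roughly like this. This is a self-contained sketch: `StubAgent` and `load`-style helpers are stand-ins, not ParlAI's real classes (in ParlAI the `share()` / `create_agent_from_shared()` mechanism plays this role):

```python
# Sketch of a per-model agent cache: load each model once, then hand out
# lightweight clones that share the heavy weights. StubAgent is a stand-in
# for a real ParlAI agent.

class StubAgent:
    def __init__(self, model_file, weights=None):
        self.model_file = model_file
        # Pretend this object is the expensive model weights.
        self.weights = weights if weights is not None else object()

    def clone(self):
        # A clone reuses the same weights object instead of reloading them.
        return StubAgent(self.model_file, self.weights)

agent_cache = {}

def get_agent(model_file):
    """Return a fresh agent, loading the model only on the first request."""
    if model_file not in agent_cache:
        agent_cache[model_file] = StubAgent(model_file)  # expensive load
    return agent_cache[model_file].clone()               # cheap copy

a = get_agent("blender_90M")
b = get_agent("blender_90M")
```

The point of the pattern is that every user gets a distinct agent object, but the model weights live in memory exactly once.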
It works! Each new agent takes only 500 MB of RAM. But I am not sure: is this memory footprint reasonable for a 90M-parameter model, or is there still room for optimization? Everything is relative, and I do not know whether this footprint is good or bad. At the moment a server with 16 GB of RAM can hold about 25-30 contexts, which is good, but still not enough for a big project.
Wow, a new agent takes 500 MB? I would expect more like... 5 MB.
(Also, I would advise against BlenderBot for commercial purposes. There are a number of real issues in terms of safety, coherence, etc. It's very much a research model.)
Hmm, maybe the world takes 500 MB, not the agent? I have no idea. Do I understand correctly that the agent for the model is called …? Is that correct? Maybe I should override …
UPD: Indeed, the world hogs the memory, not the agent. I'm definitely sure; I did some tests. I tried to call …
But is there a way to reduce the world's memory footprint?
You might be able to lower memory usage by using quantization. We don't have this implemented (yet), but the PyTorch docs have a tutorial on it.
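For reference, the PyTorch dynamic-quantization tutorial mentioned above boils down to a single call. A minimal sketch on a toy model (the layer sizes here are made up and are not ParlAI's actual network):

```python
import torch
import torch.nn as nn

# Toy stand-in for a transformer's feed-forward layers.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 128))

# Convert Linear weights to int8; activations stay float and are
# quantized on the fly, so no calibration data is needed.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
y = quantized(x)  # same interface as the float model
```

Dynamic quantization shrinks the `Linear` weight storage roughly 4x (fp32 to int8), which is why it is a natural first thing to try when one agent costs hundreds of megabytes.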
So, at the moment, the final solution is to create one world, pass all of a user's messages as context, generate an answer, and fully reset the world. In the future I think this will be parallelized across multiple worlds, but for now I have problems turning async HTTP requests into synchronous chatting with the world.
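One standard way to bridge async HTTP handlers and a single synchronous world is a dedicated worker thread plus a queue: handlers enqueue `(request, future)` pairs, and the worker, which alone owns the model, replies through the future. This is a generic sketch, not code from the fork:

```python
import queue
import threading
from concurrent.futures import Future

requests = queue.Queue()

def model_worker(generate):
    """Single thread that owns the model/world and serializes all requests."""
    while True:
        item = requests.get()
        if item is None:          # shutdown sentinel
            break
        text, fut = item
        fut.set_result(generate(text))

def handle_request(text):
    """Called from any (async) handler; blocks until the worker answers."""
    fut = Future()
    requests.put((text, fut))
    return fut.result(timeout=30)

# Demo with a trivial "model" that just upper-cases the input.
worker = threading.Thread(target=model_worker, args=(lambda t: t.upper(),))
worker.start()
reply = handle_request("hello")
requests.put(None)   # stop the worker
worker.join()
```

The queue naturally serializes concurrent requests onto the one world, at the cost of queueing latency, which matches the trade-off described in the thread.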
Hello @AdamArutyunov, currently I am using … I found your fork and read that you have solved that problem in your fork. Can you let me know the command lines, please? Which terminal commands should I run in order to see your work? Best regards,
Hello! My final solution is to pass the message history to the agent, let it observe every message in the history, and reset the agent after every request. This trick lets me keep only one world and one agent; it increases queue time, but removes the memory limit, so with this solution I can serve an unlimited number of users. All changes are currently available in the parlai/chat_services/services/websocket folder. There is an API on Flask which can help you with the new abstraction layer. You can always view the differences between my fork and the original repo. However, I must warn you that this project is a commercial order and you cannot use my repo in your projects.
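The "one agent: observe the whole history, answer, then reset" trick can be sketched like this. `StubAgent` is a stand-in; in ParlAI the corresponding real calls are `agent.observe(...)`, `agent.act()`, and `agent.reset()`:

```python
class StubAgent:
    """Minimal stand-in for a ParlAI agent's observe/act/reset cycle."""
    def __init__(self):
        self.history = []

    def observe(self, msg):
        self.history.append(msg["text"])

    def act(self):
        # A real agent would condition generation on the whole history.
        return {"text": f"reply to: {self.history[-1]}"}

    def reset(self):
        self.history = []

agent = StubAgent()

def respond(message_history):
    """Replay the full per-user history into the shared agent, answer, reset."""
    for text in message_history:
        agent.observe({"text": text})
    reply = agent.act()["text"]
    agent.reset()   # the agent is now clean for the next user
    return reply

r = respond(["Hi!", "What's your name?"])
```

Because the agent is wiped after every request, per-user state lives entirely in the history the client resends, which is why memory no longer grows with the number of users.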
Hi @AdamArutyunov, could you let me know how you pass the message history to the agent? Thanks!
@nikhil-iyer-97 hello! Every agent implements …
@AdamArutyunov I see you have maintained the message history as a list of strings, which I guess helps you maintain the persona of the agent when you pass the history to it. What I could not find was how your model learns the user persona from the history every time you reset the agent. Please let me know about this. Thanks
@nikhil-iyer-97 to be honest, I did not understand what "learns the user persona" exactly means. The message history should be passed to the API with every request. For example, the client must do something like this: sending the message history with the last message: ["Hello!"] So, the "user persona" is determined by the entire previous message history.
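A client call might look roughly like the following. The endpoint and field names here (`user_id`, `message_history`) are illustrative assumptions, not the fork's actual API:

```python
import json

# Hypothetical request body: the full history is resent on every request,
# with the newest user message last and persona lines included up front.
payload = {
    "user_id": 42,
    "message_history": [
        "your persona: my name is Sarah",
        "Hello!",
        "Hi, I am Sarah! How are you?",
        "What do you like to do?",   # newest message
    ],
}
body = json.dumps(payload)
# A real client would then POST this, e.g.:
# requests.post("http://localhost:5000/chat", data=body)
```

Since the server keeps no per-user state between requests, everything the bot "remembers", including its persona, must travel in this payload.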
Thanks, I just wanted to know how your model or agent learns the user persona after you reset the agent (basically, how you make sure that when the user comes back online, they don't feel like they are talking to a new agent, but rather continue from where they left off). Since you reset it every time, I was confused.
So!
My task was to develop an interface (API) that allows many users to talk with ParlAI (Blender bot) and sets a unique persona for every user (so the bot talking to user 1 could be named Sarah and the bot talking to user 2 could be named Jessica). I did it, but there is a performance problem, and I do not know whether this solution is good from the point of view of project architecture. Here is how I did it.
1. Entry point
I decided to use the websockets chat service as the entry point to the bot. But I ran into a problem: on every new connection a new WebsocketAgent is created with a new random .sid, so the bot thinks it is a different person and sends the standard message ("welcome, type begin..."). So I did a small hack: I pass the user ID inside the message and set the Agent's .sid from it every time (and do not generate a random .sid in init). This is how it looks (socket.py):
This hack works properly. If you know a better, easy way to do this, please share it with me.
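The .sid hack could look something like this. This is a reconstruction from the description above, with stub classes, not the actual `socket.py` code:

```python
import json

class StubWebsocketAgent:
    """Stand-in for the chat service's WebsocketAgent."""
    def __init__(self):
        self.sid = None   # note: no random sid generated in __init__

def on_message(agent, raw):
    """Pin the agent's sid to the user id carried inside each message.

    The same user then keeps the same sid across reconnects, so the
    service does not treat every new websocket as a brand-new person.
    """
    data = json.loads(raw)
    agent.sid = data["user_id"]
    return data["text"]

agent = StubWebsocketAgent()
text = on_message(agent, '{"user_id": 7, "text": "Hello!"}')
```

The design cost of this trick is that identity now depends on the client honestly sending its user ID, so a real deployment would want the ID authenticated rather than taken from the message body.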
2. Config
I use `InteractiveWorld` from `parlai/tasks/blended_skill_talk`. Moreover, I copy-pasted `MessengerOverworld` and `MessengerBotChatOnboardWorld` from `parlai/chat_service/tasks/chatbot/worlds.py`. So, my websocket config looks like:
(There is some confusion at the end because the world creator does not recognize `model` and `model_file` in the `blender_90M` suboption, so I pasted them directly into `opt`.)
3. Worlds
I need to set all contexts manually, so in `ParlAI/parlai/tasks/blended_skill_talk/worlds.py`, in `_load_personas`, I inserted a line before the `return`:
To use `InteractiveWorld` with websockets, I needed to implement a static `generate_world` function:
Moreover, I needed to change the `parley` function in `ParlAI/parlai/tasks/interactive/worlds.py`. First of all, I create a `.first_time` bool attribute and do this:
Then we check whether "[DONE]" is in the user's message:
Important thing! We check whether "your persona:" is in the user's message, and if it is, we send a context message to the bot:
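The three described tweaks to `parley` can be reconstructed as a stub like this. The real method lives in `parlai/tasks/interactive/worlds.py` and differs in detail; this only shows the control flow implied above:

```python
class StubInteractiveWorld:
    """Stub world showing the three described tweaks to parley():
    a first_time flag, a [DONE] episode reset, and 'your persona:' context."""

    def __init__(self):
        self.first_time = True
        self.context = []
        self.episode_done = False

    def parley(self, user_text):
        if self.first_time:
            # One-time setup, e.g. loading personas, happens only here.
            self.first_time = False

        if "[DONE]" in user_text:
            self.episode_done = True     # end the episode on [DONE]
            return None

        if "your persona:" in user_text:
            # Treat the message as hidden context, not a normal user turn.
            self.context.append(user_text)
            return None

        return f"bot reply to: {user_text}"

w = StubInteractiveWorld()
w.parley("your persona: my name is Sarah")
reply = w.parley("Hello!")
done = w.parley("Bye [DONE]")
```

The key ordering is that "[DONE]" and "your persona:" messages short-circuit before generation, so context lines never produce a visible bot turn.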
So!
All of this works well, exactly as I intended, but there is a problem. Every time a new user writes to the bot, it creates a new Agent, a new task, and a new InteractiveWorld, and loads a new model; all of this takes about 12% of RAM (1.9 GB). So, seven users easily turn the server into a brick.
That is why I opened this issue. What is the best way to implement this feature using only one world?
I have an idea, but I do not know whether it is good or not. We keep using one Agent and one World, but along with the user ID we also pass all of the user's messages from the chat history and load them into the context. Then we generate an answer and clear the world with the `world.reset()` function.
Is this idea good? In this approach, how do I use `Overworld` and `OnboardWorld`?
You can see my forked repository: https://github.com/AdamArutyunov/ParlAI