
Multiple sessions on one model. #41

Closed
BruceKristelijn opened this issue Jun 26, 2023 · 9 comments

Comments

@BruceKristelijn

Hi there, I would love to have multiple sessions on the same model, but the sessions seem to remember new information given by the other chat sessions. In the docs and settings I couldn't find anything. I am curious if this is something I am doing wrong?

@martindevans
Member

At the moment the model weights and the context are all bound together in one object. You need to save and restore "states" to have two contexts in one set of weights. This is due to how llama.cpp itself used to work.
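Roughly, the workaround looks like this (a minimal sketch; I'm assuming a SaveState/LoadState(string) pair on the model object, and exact names vary between versions):

```csharp
// Sketch of the save/restore workaround: weights and context are bound
// together in one LLamaModel, so we swap conversation state in and out.
// SaveState/LoadState(string) and the constructor shape are assumptions
// here; check the API of the version you're on.
using LLama;
using LLama.Common;

using var model = new LLamaModel(new ModelParams("path/to/model.bin"));
var executor = new InteractiveExecutor(model);

// ... run conversation A, then park its state on disk:
model.SaveState("conversation-a.state");

// ... run conversation B (note it starts from whatever state the model is
// in, so save/load a "blank" state first if you need a clean slate), then:
model.SaveState("conversation-b.state");

// Later, resume conversation A by restoring its state:
model.LoadState("conversation-a.state");
```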

They've since made a change which splits model weights and model contexts into two separate things, so you can make multiple contexts from one set of shared weights. My PR (#64) partially addresses this by adding in support for the new loading system. Future PRs will modify the higher level APIs to use this.
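Once that lands, sharing one set of weights across two isolated sessions could look something like this (a sketch; names like LLamaWeights/CreateContext follow later releases and are assumptions here, not necessarily the exact API at the time of writing):

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams("path/to/model.bin") { ContextSize = 1024 };

// Load the (large) weights once...
using var weights = LLamaWeights.LoadFromFile(parameters);

// ...then create independent contexts that share them. Each context has
// its own KV cache, so one session cannot see the other's history.
using var contextA = weights.CreateContext(parameters);
using var contextB = weights.CreateContext(parameters);

var sessionA = new ChatSession(new InteractiveExecutor(contextA));
var sessionB = new ChatSession(new InteractiveExecutor(contextB));
```

Each context then pays its own KV-cache memory cost, but the multi-gigabyte weights are only loaded once.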

@BruceKristelijn
Author

Thanks for the response. I am trying it right now, but my responses seem to lose some context. I assumed that when a state is saved/loaded it retains the chat and prompt history, or am I mistaken?

@BruceKristelijn
Author

I tried including the chat history after loading the session again as well, but this seemed to reset the "memory" of the previous conversation too.
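For reference, roughly what I tried (a sketch; the ChatHistory/AuthorRole names and this ChatSession constructor are assumptions from newer builds and may not match the version in this thread):

```csharp
// Restore the low-level state first, then hand the text-level history back
// to the session. ChatHistory/AuthorRole and this ChatSession constructor
// are assumptions based on later releases.
using LLama;
using LLama.Common;

var parameters = new ModelParams("path/to/model.bin");
using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);
context.LoadState("conversation-a.state");

var history = new ChatHistory();
history.AddMessage(AuthorRole.User, "Hi, my name is Bruce.");
history.AddMessage(AuthorRole.Assistant, "Nice to meet you, Bruce!");

var session = new ChatSession(new InteractiveExecutor(context), history);
```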

@martindevans
Member

Is this testing all being done on top of my PR (#64), with master or with some other version?

@BruceKristelijn
Author

Not yet, this was my next course of action. I was hoping I understood the behaviour correctly first.

@martindevans
Member

I'm not too sure, sorry. I've been contributing PRs on some of the lower level bits of the stack, but not the "higher level" stuff yet. I do know there are a few layers which should all save and reload state together (executor, context, etc.), so maybe try tracing back through some of that to check it all looks reasonable.
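As a concrete example of what I mean by the layers, something like this (a sketch; SaveState/LoadState on both layers are assumptions based on my understanding of the newer API):

```csharp
// Sketch of saving each layer explicitly: the context (KV cache) and the
// executor (its own bookkeeping) each carry their own state, and they
// need to be saved and restored as a pair.
using LLama;
using LLama.Common;

var parameters = new ModelParams("path/to/model.bin");
using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);
var executor = new InteractiveExecutor(context);

// Save both layers together...
context.SaveState("ctx.state");
await executor.SaveState("executor.json");

// ...and reload both together later. Restoring only one of them is a
// plausible cause of "lost context" symptoms.
context.LoadState("ctx.state");
await executor.LoadState("executor.json");
```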

@BruceKristelijn
Author

BruceKristelijn commented Jul 30, 2023

Thanks, I just built your PR and it seems to work better without changing a lot of code, which is great! This might be the wrong place to ask, but I couldn't find it in the source code: do you know if LLamaSharp adds things like 'Assistant:' and 'User:' to the chat?

@martindevans
Member

martindevans commented Jul 30, 2023

As far as I know it does not, but that'd be in the higher level parts that I'm not too familiar with so I'm not too sure on that!
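If it turns out it doesn't, adding the tags yourself when building the prompt is easy enough. A rough sketch (the "User:"/"Assistant:" format here is just an illustration; the right tags depend on the model):

```csharp
using System.Collections.Generic;
using System.Text;

// Build a prompt from (role, text) turns, prefixing each line with its
// role tag and ending with a cue for the model to answer next.
string BuildPrompt(IEnumerable<(string Role, string Text)> turns)
{
    var sb = new StringBuilder();
    foreach (var (role, text) in turns)
        sb.AppendLine($"{role}: {text}");
    sb.Append("Assistant: ");
    return sb.ToString();
}

var prompt = BuildPrompt(new[]
{
    ("User", "Hi, my name is Bruce."),
    ("Assistant", "Nice to meet you, Bruce!"),
    ("User", "What's my name?"),
});
```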

@martindevans
Member

0.4.2 is out now. Does that resolve this issue?
