Ability to deploy to a serverless environment #205

Closed
moltar opened this issue Sep 28, 2023 · 10 comments

Comments

@moltar

moltar commented Sep 28, 2023

It is already possible to deploy a Go server, packaged as a Docker image, to AWS Lambda using an adapter. This could be hugely beneficial for occasional-usage scenarios.

The one thing I am concerned about, though, is state.

Does Zep maintain any in-memory state for a long time?

Or is it mainly an API layer between the services and the storage (Postgres)?

Will it commit, or drain, the state on SIGTERM?

@danielchalef
Member

The open source version of Zep does retain state, as task management for async embedding, summarization, etc. has not yet been moved into an external message queue. I'd recommend using ECS or EKS rather than Lambda.

@oesni

oesni commented Oct 4, 2023

So Zep itself is a stateful application? I want to deploy Zep to my k8s cluster, but I don't think it's safe to scale out if it's stateful. @danielchalef

@danielchalef
Member

danielchalef commented Oct 6, 2023

@oesni Zep can be scaled horizontally, but it can't be scaled in without first making sure that the queues on the affected Zep instance have drained. You could potentially taint the pod, removing it from the load balancer, wait for the queues to drain, and then delete the pod. This is being improved to use message queues.

@moltar
Author

moltar commented Oct 17, 2023

@danielchalef Does this also mean that load balancing multiple instances is currently not recommended, as they would have split loads? Or would they own the loads entirely, and there's no cross-talk required?

@danielchalef
Member

Instances own the load and there's no crosstalk.

@danielchalef
Member

Closing this: since #246, Zep no longer holds state.

@moltar
Author

moltar commented Nov 5, 2023

So running the API in a serverless environment is all clear, then. There's even an adapter available specifically for Go servers.

What about the jobs then? Would I need to execute the binary on schedule to process the jobs?

Is there a specific entry point or a flag that would run the workers without the http server?

Is there a way to obtain the queue size?

@danielchalef
Member

@moltar While Zep doesn't hold state anymore, it's still not designed for a serverless deployment. I'm unsure how it would perform in such an environment, particularly as there is some warm-up time required (<1 sec, but still meaningful). Deployment using Kubernetes now makes a lot more sense, since you can automatically scale in and out without concern for state.

We could potentially add some command-line flags that tell Zep to run only the API or the web UI, or only the TaskRouter. This may reduce startup time for API-only use. It would also mean you could run one or two persistent TaskRouter instances, which would execute tasks.
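
As a rough illustration of that idea, the sketch below uses Go's standard `flag` package to pick which components a process starts. Everything here is an assumption for discussion: the `-mode` flag, its values `all`, `api`, and `tasks`, and the `components` type do not exist in Zep, and the actual startup calls are replaced by print statements.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// components records which parts of the process should start.
// Hypothetical type, not part of Zep.
type components struct {
	API   bool
	Tasks bool
}

// parseMode maps a hypothetical -mode value to the components to run.
func parseMode(mode string) (components, error) {
	switch mode {
	case "all":
		return components{API: true, Tasks: true}, nil
	case "api":
		return components{API: true}, nil
	case "tasks":
		return components{Tasks: true}, nil
	default:
		return components{}, fmt.Errorf("unknown mode %q (want all, api, or tasks)", mode)
	}
}

func main() {
	mode := flag.String("mode", "all", "which components to run: all, api, or tasks")
	flag.Parse()

	c, err := parseMode(*mode)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}

	// Placeholders: a real binary would start the HTTP server and task workers here.
	if c.API {
		fmt.Println("would start the HTTP API")
	}
	if c.Tasks {
		fmt.Println("would start the task router")
	}
}
```

With a split like this, the API-only mode would be the candidate for a Lambda-style deployment, while one or two long-lived processes in the tasks mode would process the queue.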

Would love a contribution if the above might be helpful!

@moltar
Author

moltar commented Nov 5, 2023

The use case I was thinking of is occasional, low-volume use. It'd be more economical to run in a Lambda, even with the warm-up times being a bit high.

For anything serious, a proper container is of course better.

@danielchalef
Member

Cool. Let me know how it goes!

3 participants