[Ray Serve] Memory leak in 2.6 #38089
Comments
When did this behavior start to happen? E.g., is it between 2.6.0 and 2.6.1?
Between 2.5.1 and 2.6
After much profiling and pain, we discovered that the root cause is a bug in the Ray core streaming object ref generator code that causes the "end of stream" object to never be removed from the in-memory object store. Verified by:
@rkooo567 is taking it from here
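To make the failure mode described above concrete, here is a toy model of that class of leak. It is not Ray's implementation: `ToyObjectStore`, `produce`, `consume`, and `leaked_after` are hypothetical names invented for illustration. The sketch assumes each stream ends with a sentinel object stored alongside the data, and shows that if the consumer frees every data object but never the sentinel, one entry is left behind per stream (so, in a serving setting, per request).

```python
# Toy model only: NOT Ray's code. All names here are hypothetical.

END_OF_STREAM = object()  # sentinel marking the end of a stream


class ToyObjectStore:
    """A dict-backed stand-in for an in-memory object store."""

    def __init__(self):
        self._objects = {}
        self._next_id = 0

    def put(self, value):
        ref = self._next_id
        self._next_id += 1
        self._objects[ref] = value
        return ref

    def get(self, ref):
        return self._objects[ref]

    def free(self, ref):
        self._objects.pop(ref, None)

    def __len__(self):
        return len(self._objects)


def produce(store, items):
    """Store every item, then an end-of-stream sentinel; return the refs."""
    refs = [store.put(item) for item in items]
    refs.append(store.put(END_OF_STREAM))
    return refs


def consume(store, refs, free_sentinel):
    """Read the stream, freeing each data object after use.

    With free_sentinel=False the sentinel stays in the store, which is
    the leak pattern this sketch is meant to illustrate.
    """
    values = []
    for ref in refs:
        value = store.get(ref)
        if value is END_OF_STREAM:
            if free_sentinel:
                store.free(ref)
            break
        values.append(value)
        store.free(ref)
    return values


def leaked_after(num_streams, free_sentinel):
    """Return how many objects remain after consuming num_streams streams."""
    store = ToyObjectStore()
    for _ in range(num_streams):
        refs = produce(store, ["chunk-a", "chunk-b"])
        consume(store, refs, free_sentinel)
    return len(store)
```

In this model, `leaked_after(n, free_sentinel=False)` grows linearly with `n`, while `leaked_after(n, free_sentinel=True)` stays at zero, which matches the slow, steady growth you would expect from a per-request leak.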
Also verified the leak is not present with
For posterity, some flamegraphs taken over time as the leak occurred for the leaking (
I am investigating it now. Btw, how did you guys find the memory leak? Do you run release tests with memory usage now?
Re-opening until cherry picked
NVM it's already cherry-picked!
What happened + What you expected to happen
We see a memory leak in HTTPProxy on multiple Serve clusters in Anyscale Workspaces/Services. This regression was likely introduced in 2.6.
![Screenshot 2023-08-03 at 1 52 12 PM](https://private-user-images.githubusercontent.com/122416226/258234860-bf80d5c0-2e3b-4e3a-8339-eccd3485e792.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA1ODI0ODgsIm5iZiI6MTcyMDU4MjE4OCwicGF0aCI6Ii8xMjI0MTYyMjYvMjU4MjM0ODYwLWJmODBkNWMwLTJlM2ItNGUzYS04MzM5LWVjY2QzNDg1ZTc5Mi5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzEwJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcxMFQwMzI5NDhaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1iNWQzNGU0YWU3MDhlZTNkYzJkNGJlODA2OWNjNjYxODNhNmIyMmFmZWY0MjQ2ODE2OWJhYWY1YjdmOWJhNjRkJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.iMH69kwdZyl1EvbY6lYSNj9Nppi1PbnFNodW-1I1AM4)
Versions / Dependencies
Ray 2.6.1
Reproduction script
Run a serve deployment and send requests repeatedly
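The issue does not include an actual script, so the following is only a rough sketch of the "send requests repeatedly" half, using the Python standard library. It assumes a Serve deployment is already running and reachable over HTTP; the URL `http://localhost:8000/` in the usage comment is an assumption about the local setup, and `send_requests` is a hypothetical helper name. Memory usage of the proxy process would need to be watched separately while this runs.

```python
import urllib.error
import urllib.request


def send_requests(url, count, timeout_s=5.0):
    """Send `count` sequential GET requests to `url`.

    Returns the number of requests that completed with a 2xx status.
    Failed requests are counted as unsuccessful rather than raised, so a
    long-running loop is not interrupted by a single error.
    """
    ok = 0
    for _ in range(count):
        try:
            with urllib.request.urlopen(url, timeout=timeout_s) as resp:
                resp.read()  # drain the body so the connection is released
                if 200 <= resp.status < 300:
                    ok += 1
        except (urllib.error.URLError, OSError):
            # Covers refused connections, timeouts, and HTTP error statuses.
            pass
    return ok


# Example usage (assumes a deployment is listening at this address):
#     send_requests("http://localhost:8000/", 100_000)
```

Requests are sent one at a time to keep the sketch simple; a higher request rate would make a per-request leak show up sooner.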
Issue Severity
High: It blocks me from completing my task.