Memory issues due to requests not completing #124
An upstream factor appears to be that many of the resources take an extremely long time to read. They do finish, but each should complete in less than 10 seconds, and the completed reads show that some of these resources do not contain an extraordinary number of segments.
INFO:sliderule.sliderule:processed 2497 segments in ATL03_20181022195209_03690112_005_01.h5 (after 2900 seconds)
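Log lines like the one above can be scanned to find which resources exceeded the expected read time. The following is a minimal sketch, not part of the SlideRule client: the function name `find_slow_resources` and the regular expression are hypothetical, and the log format is assumed from the single example line quoted in this issue.

```python
import re

# Pattern inferred from the single log line quoted in this issue (an assumption,
# not a documented SlideRule log format).
LOG_PATTERN = re.compile(
    r"processed (?P<segments>\d+) segments in (?P<resource>\S+) "
    r"\(after (?P<seconds>\d+(?:\.\d+)?) seconds\)"
)

def find_slow_resources(log_lines, threshold_seconds=10.0):
    """Return (resource, segments, seconds) for reads slower than the threshold."""
    slow = []
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # skip lines that are not "processed ..." messages
        seconds = float(match.group("seconds"))
        if seconds > threshold_seconds:
            slow.append(
                (match.group("resource"), int(match.group("segments")), seconds)
            )
    return slow

sample = [
    "INFO:sliderule.sliderule:processed 2497 segments in "
    "ATL03_20181022195209_03690112_005_01.h5 (after 2900 seconds)",
]
print(find_slow_resources(sample))
```

The 10 second default mirrors the expectation stated above; pairing the elapsed time with the segment count makes it easy to see reads that are slow without being large.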
The long runs in the image above occur because the processing requests are still running, so their memory is not freed. The lines go back up when long processing requests finally finish and free their memory.
Big drops in available memory correspond to spikes in CPU usage on that same instance.
When I killed the client request, the server freed all of the memory held by the requests that were still being processed, and the available memory fully recovered.
Two changes were made to dramatically improve the situation:
When @SmithB was running YAPC processing requests on somewhat larger regions, the server's available memory plummeted but then either never came back or came back very slowly. As a result, the clients saw multiple heavy usage messages with retries. In addition, there were many cases where the servers never recovered and reset because they ran out of memory.
The initial version this was observed on was v1.4.0.
Here is a snippet that recreates the problem: