Memory leaks on queries kill performance tests #456
Comments
Thanks @elvis197686 for reporting
@elvis197686 a fix has just been pushed to the master branch, can you please verify?
Thanks @swilly22 - we can test it but it will be a lot easier to test if we can use a docker image - is the
Master is always autobuilt into edge (note commit id) from
Great, we'll give it a go
Hi @elvis197686, did you have a chance to test the latest master branch?
Yes, we have re-run our performance tests with a build from the master branch, and they did complete. However, the amount of memory used still went up significantly. We are going to do more tests tomorrow, but I suspect it won't take much more for the memory limit to be exceeded. BTW we have tried the
@elvis197686, would it be possible to get your testing script(s) or a list of queries you're using?
Yes, we will give more information to help reproduce the problem when we have completed the tests, but it might be a few days.
@swilly22 The test is the same as we wrote in the initial description: We did an initial ingestion of a linear DAG consisting of maybe 1000 nodes, then executed the same command for 30 minutes. It hasn't reached the end in any of our tests - in the last test we started getting the following errors about half way through:
And here is the graph showing the memory usage; the test ran from around 13:45 to 14:00.
Had the same issue on our prod and dev clusters. Read queries cause the amount of memory consumed to grow on the master node and on slave nodes.
Some parser tokens (strings provided in user queries) currently cause memory leaks, so this is an outstanding problem that becomes noticeable with high query volumes. We're replacing our current parser entirely (#454), which I expect should be completed in about 3 weeks. Those leaks should not be an issue after the replacement, so I think that trying to patch the current one wouldn't be an effective use of time!
Taking over from elvis197686, we can confirm v2 m6 still has the memory leak (ever climbing memory usage) but to a lesser extent than v1. This is using exactly the same details as elvis197686 has posted previously.
@elvis197686 major patches were made in #785, which is included in the latest M07 and M08 releases. Could you please validate this again, and if it's still leaking, send us the queries you run that are leaking?
Hi @K-Jo, replying on behalf of @jnsnjspr. We are still seeing the memory leak in M08. This is a sample of the queries: https://gist.githubusercontent.com/2ZZ/9c95d3cb423fb91b562a751c53a798b9/raw/3dd939df6ca9b51b51cf96264054695778941f78/redisgraph-memoryleak.log
Hey @2ZZ, thanks for the gist! That was very helpful. I converted it into a script to reproduce your findings, and also observed the memory leak - https://gist.github.com/jeffreylovitz/59c2ee0b5bcd1423e8c85571fb9522d4#file-a_results-md . I found that much of the memory could be reclaimed with the Redis command MEMORY PURGE, which indicates that jemalloc is storing some allocations for its own use rather than returning them to the system. This is fine provided that it does not become too overzealous - are you still able to trigger OOM errors on your server? Some of the memory loss can't be accounted for by jemalloc, so we'll continue looking into this. If you have the opportunity, let us know how
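For anyone else wanting to check this locally, here is a minimal sketch (assuming the redis-py package and a server on localhost:6379; host, port, and function names are placeholders, not part of the original scripts). MEMORY PURGE asks jemalloc to release retained dirty pages back to the OS, so comparing RSS before and after gives a rough split between allocator retention and a true leak:

```python
# Hedged sketch: compare RSS before and after MEMORY PURGE to estimate how
# much of the growth is jemalloc retaining pages rather than a genuine leak.
# Host/port are assumptions; requires the redis-py package at call time.

def reclaimed_bytes(before, after):
    """RSS bytes returned to the OS by a purge (never negative)."""
    return max(0, before - after)

def purge_and_report(host="localhost", port=6379):
    import redis  # imported lazily so reclaimed_bytes is usable without redis-py
    r = redis.Redis(host=host, port=port)
    before = r.info("memory")["used_memory_rss"]
    r.execute_command("MEMORY", "PURGE")
    after = r.info("memory")["used_memory_rss"]
    print(f"RSS before={before} after={after} "
          f"reclaimed={reclaimed_bytes(before, after)}")
```

If most of the growth is reclaimed by the purge, the pressure is allocator retention; whatever remains unexplained is the part worth chasing as a leak.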
Hi @jeffreylovitz
We will resume testing on this using the sample queries provided in #456 (comment). What are the details of the server configuration, especially with regard to clustering and replication?
We're doing this in Kubernetes, there is a single pod with 1 Redis container and 1 Sentinel container. Both running redislabs/redisgraph:1.99.7 Redis config:
Sentinel config
I can confirm the memory leak for our workflow as outlined by elvis197686 in the first post |
Thanks, @jesper-wh and all! |
Hello Redis,
Recently we have been running performance tests on RedisGraph; however, these tests have kept failing because the graph has run out of memory. It either crashes, or we get the error

redis.clients.jedis.exceptions.JedisDataException: OOM command not allowed when used memory > 'maxmemory'.

Our monitoring tools have shown us that this seems to be happening because memory is being allocated and never released.
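One way to observe the same pattern without an external monitoring stack (a hedged sketch, assuming the redis-py package and a server on localhost:6379; the duration, interval, and function names are illustrative) is to poll used_memory from INFO while the test runs and check that it only ever grows:

```python
# Hedged sketch: sample used_memory at a fixed interval while the load runs,
# then check whether the series ever drops. Host/port and the sampling
# interval are assumptions, not details from the original test setup.
import time

def is_monotonic_growth(samples, tolerance=0):
    """True if memory never drops by more than `tolerance` bytes between samples."""
    return all(b >= a - tolerance for a, b in zip(samples, samples[1:]))

def sample_used_memory(duration_s=60, interval_s=5, host="localhost", port=6379):
    import redis  # lazy import: the helper above is usable without redis-py
    r = redis.Redis(host=host, port=port)
    samples = []
    deadline = time.time() + duration_s
    while time.time() < deadline:
        samples.append(r.info("memory")["used_memory"])
        time.sleep(interval_s)
    return samples
```

A small tolerance is useful to avoid flagging ordinary allocator jitter as a release of memory.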
We conducted some experiments to see whether we could easily reproduce memory leaks, and we have.
We have created a medium-sized DB (around 1,000 objects) using a series of commands like the ones given below.
There are a few levels in the hierarchy, which (for illustration purposes) are called TOP, MIDDLE and BOTTOM.
Example insert commands:
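The example commands themselves did not survive in this transcript. The following hypothetical reconstruction is consistent with the schema described above (TOP, MIDDLE and BOTTOM nodes linked by HAS edges, with the properties referenced in the example query); all ids and property values here are illustrative, not the originals:

```
graph.query GRAPH_NAME "CREATE (:TOP {id: 'TOP_1'})"
graph.query GRAPH_NAME "CREATE (:MIDDLE {id: 'MIDDLE_1'})"
graph.query GRAPH_NAME "CREATE (:BOTTOM {id: 'BOTTOM_1', sort: 'BOTTOM_SORT_1', startDateTime: 1554000000, activeCount: 1, name: 'B1'})"
graph.query GRAPH_NAME "MATCH (t:TOP {id: 'TOP_1'}), (m:MIDDLE {id: 'MIDDLE_1'}) CREATE (t)-[:HAS]->(m)"
graph.query GRAPH_NAME "MATCH (m:MIDDLE {id: 'MIDDLE_1'}), (b:BOTTOM {id: 'BOTTOM_1'}) CREATE (m)-[:HAS]->(b)"
```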
We then executed a large number of queries (>50,000) of the same format but with different criteria.
The queries were executed in parallel (about 50 per second).
An example query is shown below - the "N" is replaced by the current UTC time (in seconds).
Example Query:
graph.query GRAPH_NAME "MATCH (:TOP {id: 'TOP_1'})-[:HAS]->(:MIDDLE)-[:HAS]->(e:BOTTOM) WHERE e.sort = 'BOTTOM_SORT_1' AND e.startDateTime > N AND e.activeCount > 0 RETURN e ORDER BY e.startDateTime, e.name SKIP 0 LIMIT 100"
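A minimal driver for the load described above might look like the following sketch (assuming the redis-py package; GRAPH_NAME, host, port, and the function names are placeholders, not our actual test harness):

```python
# Hedged sketch of the load described above: ~50 queries/second submitted in
# parallel, each with N replaced by the current UTC time in seconds.
import time
from concurrent.futures import ThreadPoolExecutor

# %-formatting is used deliberately: the query text contains literal braces.
QUERY_TEMPLATE = (
    "MATCH (:TOP {id: 'TOP_1'})-[:HAS]->(:MIDDLE)-[:HAS]->(e:BOTTOM) "
    "WHERE e.sort = 'BOTTOM_SORT_1' AND e.startDateTime > %d "
    "AND e.activeCount > 0 "
    "RETURN e ORDER BY e.startDateTime, e.name SKIP 0 LIMIT 100"
)

def build_query(utc_seconds):
    """Substitute N (the current UTC time in seconds) into the template."""
    return QUERY_TEMPLATE % utc_seconds

def run_load(total=50_000, per_second=50, host="localhost", port=6379):
    import redis  # lazy import: build_query is usable without redis-py
    r = redis.Redis(host=host, port=port)  # redis-py pools connections, so it is thread-safe
    with ThreadPoolExecutor(max_workers=per_second) as pool:
        for _ in range(total):
            pool.submit(r.execute_command, "GRAPH.QUERY", "GRAPH_NAME",
                        build_query(int(time.time())))
            time.sleep(1.0 / per_second)  # pace submissions at ~50/s
```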
By doing this we were able to make Redis Graph consistently run out of memory after about 20 minutes of these queries.
The graph below shows that the memory is allocated and never released.
Note that we have run this with sequential queries, and there did appear to be a memory leak but it wasn't as severe.
Memory Graph:
Version:
The latest RedisGraph Docker image (as of yesterday)