Possible memory leak in RMapCache with evictionScheduler workflow, Redisson 3.15.x #5158
Can you share Redisson logs?
Do you call RMapCache.destroy() if map isn't needed anymore?
Which read operations do you use?
RMapCache uses a lot of them.
RMapCache uses eval scripts which in turn modify data.
We don't have any error logs as of now, or any Redisson logs for that matter, because we are over-provisioned, but earlier we saw Redisson timeout errors when the application was not able to obtain connections. We do not call RMapCache.destroy() anywhere; I'll look into this, but the heap analysis shows the evictionScheduler is possibly the cause of the memory leak. As for reads, we have plain key-value access only, but internally it is probably getting converted to eval-based commands. How can we optimize the Redisson client for this use case, and is an upgrade recommended as a solution for any of the above problems?
You need to call RMapCache.destroy() if the map isn't needed anymore.
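For reference, a minimal sketch of that cleanup, assuming an already-configured RedissonClient and an illustrative map name:

```java
import org.redisson.api.RMapCache;

RMapCache<String, String> map = redisson.getMapCache("myCache");
// ... use the map ...

// destroy() stops the eviction task tied to this map instance; without it,
// the EvictionScheduler keeps a reference to the task even after the map
// object is no longer used.
map.destroy();
```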
Eval scripts can't be avoided.
Do you use an RLocalCachedMap object and not RMapCache?
Yes, RLocalCachedMap. There are 21 application instances connected to the Redis cluster.
How can I check this?
Edit:
I have 4 questions. This is somewhat like my current implementation for get and put.

put:

with something like this?

get:
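Since the original code blocks were not preserved above, here is a hypothetical sketch of the kind of per-call get/put pattern described later in the thread ("creating the map with every get/put call"); the map name, value types, and options are assumptions:

```java
import org.redisson.api.LocalCachedMapOptions;
import org.redisson.api.RLocalCachedMap;
import org.redisson.api.RedissonClient;

public class CacheAccess {
    private final RedissonClient redisson;

    public CacheAccess(RedissonClient redisson) {
        this.redisson = redisson;
    }

    // put: a map object is obtained on every call, mirroring the
    // per-call pattern described in this thread
    public void put(String key, String value) {
        RLocalCachedMap<String, String> map = redisson.getLocalCachedMap(
                "myCache", LocalCachedMapOptions.<String, String>defaults());
        map.put(key, value);
    }

    // get: same per-call pattern
    public String get(String key) {
        RLocalCachedMap<String, String> map = redisson.getLocalCachedMap(
                "myCache", LocalCachedMapOptions.<String, String>defaults());
        return map.get(key);
    }
}
```

Note that each getLocalCachedMap() call creates a new local-cache instance, which is the situation where destroy() matters once an instance is no longer needed.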
No. You can decrease
No. Metrics are available only in the PRO version.
Yes. This part should be improved.
MapCache get() commands modify the idleTimeout setting. This is why the master node is used. Try the getWithTTLOnly() or getAllWithTTLOnly() methods.
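A minimal sketch of that change, assuming a Redisson version that provides these methods (see the upgrade suggestion below) and an illustrative map name:

```java
import org.redisson.api.RMapCache;

RMapCache<String, String> map = redisson.getMapCache("myCache");

// get() updates the idle-timeout bookkeeping, so the command is routed
// to the master node:
String v1 = map.get("key");

// getWithTTLOnly() honours the TTL but skips the idleTimeout update, so
// with readMode = SLAVE the read can be served by a replica:
String v2 = map.getWithTTLOnly("key");
```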
Use
call
Can you share more details? getWithTTLOnly() always uses slaves if readMode = SLAVE.
Set MinimumIdleSize and PoolSize to the same value.
As an option you can use
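For illustration, a sketch of the cluster configuration being discussed, with a placeholder node address and illustrative values (24 matches the default idle size mentioned below):

```java
import org.redisson.Redisson;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;
import org.redisson.config.ReadMode;

Config config = new Config();
config.useClusterServers()
        .addNodeAddress("redis://127.0.0.1:6379") // placeholder address
        .setReadMode(ReadMode.SLAVE)
        // keep minimum idle size and pool size equal, as suggested above
        .setMasterConnectionMinimumIdleSize(24)
        .setMasterConnectionPoolSize(24)
        .setSlaveConnectionMinimumIdleSize(24)
        .setSlaveConnectionPoolSize(24);
RedissonClient redisson = Redisson.create(config);
```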
I'm sorry, I didn't fully understand the question. We are creating the map with every get/put call and using it there only (not reusing it anywhere else, I believe). I understand now that we can't delete it after every put, since that deletes the data in the cache.
Earlier we used cacheMap.get(), which made all calls go to the master nodes only. Now that we have moved to getWithTTLOnly(), we see that the read replicas are getting load, but most of it is still going to the primary nodes. One behaviour I observed during a load test: if we kept the load running, with all calls being served from cache, eventually the load moved to the read replicas. But in production we are seeing that replicas serve less than 10% of the traffic.
We have not explicitly set any value for these configurations, and the default value for MasterMinIdleSize and poolSize is 24 only. Should I set SlavePoolSize to 24 as well? I wonder how that would make a difference.
You can upgrade to version 3.23.0 and try my suggestion: #5158 (comment)
Yes, try it.
The write/read operations ratio might be different in production.
I'm not asking for metrics specifically, but for a way to see it programmatically, maybe, or through logs. The problem I see is that the wiki does not explain exactly what these settings do, in which cases we should increase them, and based on what. We're completely blind here; the only approach I can see is increasing the value a little each time and hoping that the error goes away...

What would be a reasonable value for these settings? Sorry about asking, but we're completely blind here, without knowing exactly what these settings do and what is currently being used...

In my case I use RLocalCachedMap; I don't see any getAllWithTTLOnly() method. This map type doesn't seem to have a timeout...
We are using Redisson client 3.15.x with AWS ElastiCache, and we are seeing a couple of issues with the Redisson client while using RMapCache.

We ran MONITOR on our cluster slave nodes and noticed that reads were being redirected to the primary node. We changed from .get() to .getWithTTLOnly(), which started sending read requests to the replica nodes, but the read request count is still higher on the primary node (90:10).

Any suggestions on these issues, and are some of them related? Is it a known issue, fixed in higher versions?