Garbage Collection crashes with timeout #991
Comments
How's the computer resource usage? Is anything saturated? I think the timeout is caused by Riak being slow under heavy load, which can be internal or external.
Hello,
As far as we can tell, the resources of the machine are not under stress during garbage collection. We are running RiakCS on a VM with 2 CPUs, 4G RAM, and 10G disk space. We tried increasing the RAM to 16G at one point, as well as monitoring I/O using
Thanks,
Raina && @ruthie
We had the same issue. We resolved it by setting the parameter +zdbbl 32768 in riak/vm.args.
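For anyone landing here later, a minimal sketch of what that change might look like in Riak's vm.args. The +zdbbl flag sets the Erlang distribution buffer busy limit and takes a value in kilobytes, so 32768 corresponds to 32 MB. The exact location of vm.args depends on how Riak was installed, and the comment line is only illustrative.

```
## Distribution buffer busy limit, in kilobytes (32768 KB = 32 MB)
+zdbbl 32768
```

Flags in vm.args are read when the Erlang VM starts, so the Riak node has to be restarted for the new value to take effect.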
I'm going to close this as I heard the problem was solved, but if there are any additional issues or questions, don't hesitate to add them here.
Hello,
We have been experimenting with various GC scenarios and configurations, and have run into the same crash repeatedly.
In all our experiments, we used s3cmd to interact with our 3-node RiakCS cluster. We have tried modulating these factors:
- gc_batch_size: Tried from the default value (1000) down to 1. It seemed like reducing the batch size improved our success rate, but it still didn't help with the larger GCs.
- gc_max_workers: Tried between 1 and 20. Reducing concurrency by setting it to 1, or by setting delete_concurrency to 1, did not seem to improve things.

We searched to see if our issue had been discovered before, and came across #949, #946, and #827. Since these were marked as fixed for RiakCS 1.5.1, we downloaded the new release and tried many of the same scenarios. Unfortunately, there appears to be no significant difference in GC between RiakCS 1.5.0 and 1.5.1.
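As a rough illustration of the settings listed above, this is a sketch of how they might appear in the riak_cs section of the Riak CS app.config. It assumes all three settings live in that section, as we understand to be the case for the 1.5.x series. The values are just one of the combinations we tried (the default batch size with concurrency turned down to 1), not a recommendation; the gist linked below has our actual config.

```erlang
%% Sketch of the riak_cs section of app.config; other settings omitted.
[
 {riak_cs, [
            %% Batch size used by garbage collection (default 1000)
            {gc_batch_size, 1000},
            %% Number of concurrent GC workers (we tried 1 through 20)
            {gc_max_workers, 1},
            %% Concurrency of block deletion (we tried 1)
            {delete_concurrency, 1}
           ]}
].
```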
https://gist.github.com/rmasand/4a4c0975ad5b494c1c90 contains our config and our logs.
We would appreciate either a recommendation of a workaround in 1.5.0, or suggestions as to why this still appears broken in 1.5.1.
Raina & @robdimsdale
Cloud Foundry Services