Suspected memory leak #7207
Do you use any dictionaries? We recently fixed a rare memory leak in them, but 19.9 doesn't include the fix: #6447
RSS = 1752780 KB, i.e. about 1.7 GB. It's totally fine, because the mark cache alone is 5 GB. (check with
@akuzm - no, we don't use dictionaries.
As I can guess, you have 16GB. That's hardly viable for production. You can reduce the mark cache size, and disable the uncompressed cache if it is enabled.
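A minimal sketch of the relevant server settings, assuming the stock `config.xml` layout; the values below are illustrative, not recommendations:

```xml
<!-- config.xml fragment; illustrative values, not recommendations -->
<yandex>
    <!-- Shrink the mark cache from its 5 GB default -->
    <mark_cache_size>536870912</mark_cache_size> <!-- 512 MiB -->
    <!-- Keep the uncompressed cache small (queries also need
         use_uncompressed_cache=1 in a user profile to use it at all) -->
    <uncompressed_cache_size>0</uncompressed_cache_size>
</yandex>
```

Whether the uncompressed cache is consulted per query is governed by the `use_uncompressed_cache` user setting, so disabling it there has the same practical effect.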
Thank you @den-crane, this is very helpful.
The result is 188 MB, which is strange: why would the mark cache grow to 800 MB if the total size of the marks is much smaller? Regarding not being viable for production - I guess it depends on how much data ClickHouse needs to manage. We have systems running with only 8 GB of memory and they work great - but of course their databases are relatively small (e.g. 20 GB).
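One way to arrive at a figure like the 188 MB above is to sum `marks_bytes` over active parts; a sketch, assuming the `system.parts` columns available in this era of ClickHouse:

```sql
-- Total on-disk size of marks across all active parts
SELECT formatReadableSize(sum(marks_bytes)) AS total_marks
FROM system.parts
WHERE active;
```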
Not sure; I suspect it can contain cached marks of inactive/merged parts. Those entries will eventually be pushed out by the LRU mechanism.
CH could run with 8 GB until it gets OOM-killed, because by design CH is super-eager for memory. Default buffer sizes are 1 MB. Each column on insert takes 2 MB of RAM. Each SELECT thread takes several 1 MB buffers to read and decompress.
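Since each INSERT column and each SELECT thread holds megabyte-sized buffers, the per-query footprint can be capped in a user profile; a sketch with illustrative values:

```xml
<!-- users.xml profile fragment; values are illustrative -->
<profiles>
    <default>
        <!-- Fewer threads per SELECT means fewer 1 MB read buffers -->
        <max_threads>2</max_threads>
        <!-- Hard cap on memory a single query may allocate (bytes) -->
        <max_memory_usage>1000000000</max_memory_usage>
        <!-- Don't populate the uncompressed cache from queries -->
        <use_uncompressed_cache>0</use_uncompressed_cache>
    </default>
</profiles>
```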
@fastest963 you definitely should turn off the uncompressed cache. There's nothing to discuss: your server has only 26GB and you wasted 10GB on the uncompressed cache. I think we should add to the CH documentation that the minimal requirement for production use is 32GB.
Queries. Any SELECT or INSERT is able to use 20-200GB of RAM easily.
@den-crane I don't think you should state 32GB as a requirement, since this would turn people away from ClickHouse when they don't have huge amounts of data. Think of startup companies who may know that eventually they'll have a lot of data, but currently they don't, and they also cannot afford to pay hundreds of dollars a month for large servers. As I wrote above, we have many ClickHouse instances running with 8GB (and they are not even the only process running on the machine!). They work great with modest data sizes - e.g. tens of gigabytes. What I would suggest is adding a section to the docs about running ClickHouse with limited memory - e.g. configuration examples for running a server with 8GB, 16GB and 32GB. This would help users determine what settings need to be tweaked, as currently the default settings are targeted at larger instances and bigger workloads.
Many users successfully run ClickHouse on servers with 4 GiB RAM, although this usage is indeed limited. |
My use case for ClickHouse is as a cache alternative to MySQL for interactive sorts and filters. Expected row counts are in the low millions, and data size in the single-digit GBs. CH being able to scale down to less memory is its primary selling point for me.
Until recently (version 20.1) we had unusual behaviour of the mark cache. It kept records not only up to
I'm closing this issue because it is resolved. For reference: it is possible to configure ClickHouse to run successfully even on a machine with 4 GiB RAM. Even configurations with 1 GiB RAM will work after careful tuning.
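A low-memory configuration along the lines discussed in this thread might combine the server-level settings as follows; the values below are a sketch for a ~4 GiB machine, not tested recommendations:

```xml
<!-- config.xml fragment for a small machine; illustrative values -->
<yandex>
    <mark_cache_size>268435456</mark_cache_size>          <!-- 256 MiB -->
    <uncompressed_cache_size>0</uncompressed_cache_size>  <!-- cache off -->
    <!-- Limit concurrency so per-query buffers don't stack up -->
    <max_concurrent_queries>4</max_concurrent_queries>
</yandex>
```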
Is there an example of what specific tuning is required to run with 8GB or 16GB? I've tried the values suggested in this thread and elsewhere (example) but have not been able to find a configuration that works for my scenario. This is the error I see (both when running queries and clickhouse-client):
Each time I restart ClickHouse with a new config, the
My use case is to run ClickHouse on a MacBook Pro for local development of application code (running unit tests, etc. that depend on ClickHouse). I can set up a cluster in the cloud for testing, but it seems unnecessary if it's possible to tune the settings to run locally.
We have ClickHouse servers that continually grow in RAM usage. Our setup is quite simple:
We run "ps" every hour and you can see that the RSS grows and grows:
When looking at all the jemalloc statistics in the system.asynchronous_metrics table, the values do not change over time: jemalloc.allocated remains at 1990 MB.
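The jemalloc counters mentioned above can be pulled directly from the metrics table; a sketch:

```sql
-- Allocator-level statistics (values are in bytes)
SELECT metric, formatReadableSize(value) AS size
FROM system.asynchronous_metrics
WHERE metric LIKE 'jemalloc%'
ORDER BY metric;
```

If RSS keeps growing while `jemalloc.allocated` stays flat, the gap presumably lies outside live jemalloc allocations - e.g. allocator arenas/fragmentation or memory obtained by other means - rather than in a leak of tracked allocations.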
Could it be related to the large number of databases on the server? We have another server that has only one database, and its memory use (RSS) does not grow. It fluctuates over time, but after growing it also shrinks back down.
Any assistance would be appreciated, as this issue is forcing our customers to restart their servers from time to time in order to keep memory usage down.