
Vttablet is blowing up the memory usage during a sysbench (oomkill) #5511

Closed
Smana opened this issue Dec 5, 2019 · 6 comments

Smana commented Dec 5, 2019


Overview of the Issue

After successfully installing Vitess with the Helm chart, we wanted to run a TPC-C benchmark using sysbench in order to validate that performance is acceptable.
However, during the prepare phase of the sysbench run, the vttablet container's memory usage grows continuously until the pod is OOM-killed and restarted.

Reproduction Steps

  1. Create a Kubernetes cluster (v1.13.11-gke.14).

  2. Configure Helm (v2.14.3) and init Tiller with the cluster-admin role.

  3. Install the etcd operator:

     helm install --name etcd-operator stable/etcd-operator

  4. Use this values.yaml and run (note: I created a storage class in order to use SSD storage):

     helm upgrade --install vitess-release-name -f values.yaml --namespace vitess ${VITESS_REPO}/helm/vitess

  5. From another server (a DB loader), clone PlanetScale's sysbench-tpcc repository:

     git clone https://github.com/planetscale/sysbench-tpcc
     cd sysbench-tpcc

  6. Run the sysbench prepare command (note: you will have to expose the vtgates; here I used a GCP internal load balancer, with the hostname vitess):

     ./tpcc.lua --mysql-host=vitess --mysql-port=3306 --mysql-db=my_db --time=300 --threads=32 --report-interval=1 --tables=10 --scale=10 --db-driver=mysql --use-fk=0 prepare

After a few seconds we can see that the vttablet memory usage reaches the limit and the pod is restarted.

[screenshot: vttablet container memory usage climbing to its limit]

I tried building a new vitess/lite image from the master branch in order to pick up #5444, but that didn't help.

Operating system and Environment details

Kubernetes cluster running on GKE version v1.13.11-gke.14
Helm version v2.14.3
Sysbench version 1.0.18

Log Fragments

On Slack, @makmanalp suggested that I provide dumps to get more information for debugging.
Please find them here


Smana commented Dec 6, 2019

Actually, after adding the flag queryserver-config-query-cache-size=1 to the vttablet container flags, I no longer have the issue (tested twice).
So it seems this will be solved by PR #5471.
I'm waiting for it to be merged in order to confirm.
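For reference, a hedged sketch of where such a flag could live in the chart's values.yaml. The extraFlags key and its placement are assumptions for illustration, not the chart's documented schema; check the values reference for your chart version:

```yaml
# Hypothetical values.yaml fragment for the Vitess Helm chart.
# The "extraFlags" key is an assumed name; verify against your
# chart version's documented values before using.
vttablet:
  extraFlags:
    # Shrink vttablet's query plan cache to a single entry.
    queryserver-config-query-cache-size: 1
```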


Smana commented Dec 6, 2019

Understood. Setting the above parameter configures the underlying MySQL with this value:

MySQL [dailymotion@replica]> show variables like '%query_cache_size%';                                  
+------------------+---------+
| Variable_name    | Value   |
+------------------+---------+
| query_cache_size | 1048576 |
+------------------+---------+
1 row in set (0.01 sec)

Now I have ever-growing memory usage on MySQL instead. I need to figure out why.


Smana commented Dec 6, 2019

Forget my previous message; these parameters are not related.

@morgo morgo added the Type: Bug label Dec 7, 2019

dkhenry commented Dec 10, 2019

@Smana, memory usage on MySQL is generally bounded by the innodb_buffer_pool_size setting. We have found that MySQL has been using more memory than we expect, and we attribute it to the allocator. As of #5444 you can now set the allocator. Try setting tcmalloc as the allocator; that is the one testing has shown us is the kindest to your memory.
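As an illustration of the buffer-pool bound mentioned above, a minimal my.cnf fragment. The sizes here are example values for illustration, not recommendations from this thread; tune them to your pod's memory limit:

```ini
# Example my.cnf fragment: cap InnoDB's main memory consumer.
# 1G is an illustrative value, not a recommendation.
[mysqld]
innodb_buffer_pool_size = 1G
innodb_buffer_pool_instances = 1
```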

@derekperkins

@dkhenry it's actually not that easy to use an alternate allocator, because we can't pass any flags to mysqlctld (#5466).


gedgar commented Sep 22, 2021

Passthrough support for LD_PRELOAD was added in #5730, which makes it simple to switch the allocator to tcmalloc.
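For anyone landing here later, a hedged sketch of what switching the allocator via LD_PRELOAD can look like in a container spec. The library path and the exact place to set the variable are assumptions; they vary by base image and chart version:

```yaml
# Hypothetical container-spec fragment: preload tcmalloc into mysqld.
# The .so path below is typical for Debian-based images but is an
# assumption; verify the actual path inside your image.
env:
  - name: LD_PRELOAD
    value: /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4
```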

Closing this issue for now, feel free to reopen if you are still having issues @Smana. Thanks!

@gedgar gedgar closed this as completed Sep 22, 2021