THREESCALE-7864 async-pool v0.3.12 #324

Merged: 1 commit merged into master from async-pool-0.3.12 on Oct 18, 2022

Conversation

@eguzki (Member) commented Oct 17, 2022

what

Upgrade async deps

  • async 1.24.2 -> 1.26.2
  • async-pool 0.2.0 -> 0.3.12

Fixes https://issues.redhat.com/browse/THREESCALE-7864
Fixes #308
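
In a Ruby project this kind of bump typically shows up as pinned gem versions. A hypothetical Gemfile sketch, for illustration only (the actual commit may express the constraint differently, e.g. only through Gemfile.lock resolution):

# hypothetical Gemfile entries, not the actual diff in this PR
gem 'async', '1.26.2'
gem 'async-pool', '0.3.12'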

Performed tests:

| image | async-pool version | Mem leak fixed | Connection dropping issue fixed |
| --- | --- | --- | --- |
| latest (2.12) | 0.2.0 | | ✔️ |
| master-c1889f94-async-pool-0.3.9 | 0.3.9 | ✔️ | |
| master-c1889f94-async-pool-0.3.12 | 0.3.12 | ✔️ | ✔️ |

How to check the memory leak

  • Deploy 3scale with the operator
  • Remove Redis virtual databases: change the backend-redis secret so that both the storage and queue URLs have the same value, "redis://backend-redis:6379", then roll out the listener
oc rollout latest deploymentconfig/backend-listener
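For reference, a patch along these lines performs the secret change (the REDIS_STORAGE_URL and REDIS_QUEUES_URL key names are assumed from the operator-managed backend-redis secret and may differ):
oc patch secret backend-redis --type merge -p '{"stringData":{"REDIS_STORAGE_URL":"redis://backend-redis:6379","REDIS_QUEUES_URL":"redis://backend-redis:6379"}}'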
  • Enable async mode on the backend worker
oc set env deploymentconfig/backend-worker CONFIG_REDIS_ASYNC=1
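Since the DeploymentConfig normally carries a ConfigChange trigger, setting the variable rolls out new worker pods; the resulting environment can be verified with:
oc set env deploymentconfig/backend-worker --list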
  • Provision 3scale using buddhi
bundle exec perftest-toolkit-buddhi --portal https://TOKEN@3scale-admin.example.com --profile backend --public-base-url "" --private-base-url "http://toystore.my-namespace.svc:80" -o /dev/stdout
  • Traffic generator (siege)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: siege
spec:
  replicas: 1
  selector:
    matchLabels:
      app: siege
  template:
    metadata:
      labels:
        app: siege
    spec:
      containers:
        - args:
            - --time=10H
            - --concurrent=16
            - --quiet
            - "http://backend-listener.my-namespace.svc/transactions/authrep.xml?service_token=2bf6c08b2f58bf513367edca811e24df52f7f5966b736186c1cc2fabeb6484ec&service_id=2555417932766&usage%5Bhits%5D=1&user_key=37002980fcb6bb2bbc9030ed5318f3d1&log%5Bcode%5D=200"
          image: ecliptik/docker-siege
          imagePullPolicy: IfNotPresent
          name: siege
      restartPolicy: Always
  • Monitor the backend worker job queue length: make sure to keep the queue length low (less than 10k, ideally less than 1k)
watch -n 10 "oc rsh $(kubectl get pods -l 'deploymentConfig=backend-redis' -o json | jq '.items[0].metadata.name' -r) /bin/sh -i -c 'redis-cli llen resque:queue:priority && redis-cli llen resque:queue:main'"
  • Prometheus query to monitor memory used by the worker
container_memory_working_set_bytes{cluster="", namespace="my-namespace", pod=~"backend-worker-.*", container="backend-worker", image!=""}
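As a rough alternative when Prometheus is not at hand, current memory usage can be spot-checked with (the pod label is assumed to match the DeploymentConfig name, as in the redis command above):
oc adm top pods -l deploymentConfig=backend-worker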
  • Customize the backend worker image using the APIManager CR
apiVersion: apps.3scale.net/v1alpha1
kind: APIManager
metadata:
  name: apimanager1
spec:
  [...]
  backend:
    cronSpec:
      replicas: 1
    image: quay.io/3scale/apisonator:master-c1889f94-async-pool-0.3.12
    listenerSpec:
      replicas: 1
    workerSpec:
      replicas: 1

Mem leak test results

[Graph: backend-worker memory usage over time for the three tested images]

As can be seen, the 2.12 image shows memory steadily increasing, while the other images show stable memory usage.

How to check the connection management issue

To verify the connection issue, explained here and here, a socat relay server was used as a proxy to monitor connection management.

  • Deploy 3scale as explained above, with async mode enabled in the backend.
  • Scale the worker deployed in the cluster down to 0 replicas using the APIManager CR
apiVersion: apps.3scale.net/v1alpha1
kind: APIManager
metadata:
  name: apimanager1
spec:
  [...]
  backend:
    cronSpec:
      replicas: 1
    image: quay.io/3scale/apisonator:master-c1889f94-async-pool-0.3.12
    listenerSpec:
      replicas: 1
    workerSpec:
      replicas: 0
  • Port-forward locally to the backend-redis service
kubectl port-forward service/backend-redis 6379
  • Run socat as a proxy on port 7379, logging connection activity (-d -d)
socat -d -d TCP-LISTEN:7379,fork TCP:127.0.0.1:6379
  • Run the backend worker locally
    • it needs a small patch:
--- a/bin/3scale_backend_worker
+++ b/bin/3scale_backend_worker
@@ -10,7 +10,7 @@ end
 options = {
   multiple: true,
   dir_mode: :normal,
-  dir: '/var/run/3scale'
+  dir: '/home/eguzki/tmp/3scale-backend-worker'
 }
$ cd apisonator
$ CONFIG_FILE=./openshift/3scale_backend.conf CONFIG_REDIS_ASYNC=1 CONFIG_WORKERS_LOG_FILE=/dev/stdout CONFIG_REDIS_PROXY="redis://127.0.0.1:7379" CONFIG_QUEUES_MASTER_NAME="redis://127.0.0.1:7379" RACK_ENV=production bundle exec 3scale_backend_worker run
  • The number of connections should be stable: there should not be continuous logs about connections being created and dropped. If the connection dropping issue occurs, socat shows clients dropping connections continuously and connections are not reused. Example of a socat log line showing a dropped connection:
2021/12/02 12:54:34 socat[861] N socket 1 (fd 6) is at EOF
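A quick way to confirm that the connection count stays flat, assuming the ss utility is available on the machine running socat:
watch -n 5 "ss -tn state established '( sport = :7379 or dport = :7379 )' | tail -n +2 | wc -l"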

@eguzki requested a review from slopezz October 18, 2022 09:34
@eguzki marked this pull request as ready for review October 18, 2022 10:05
@slopezz (Member) left a comment:

Excellent work @eguzki, let's merge it so that the productized image is created and can be tested in staging!

@eguzki merged commit 3ff294a into master on Oct 18, 2022
@bors (bot) deleted the async-pool-0.3.12 branch October 18, 2022 10:44