THREESCALE-10167 Fix cpu detection cgroupsv2 #1410

Merged
eguzki merged 3 commits into master from fix-cpu-detection-cgroupsv2
Sep 29, 2023

eguzki commented Sep 26, 2023

What

Support Cgroups v2 for auto-detection of cpus for worker processes (when APICAST_WORKERS env is not set)

Addresses https://issues.redhat.com/browse/THREESCALE-10167 for APIcast

Align with the implementation from 3scale/porta#3565

No tests were added. Stubbing io.open with the busted stub mechanism did not work, and the development environment user cannot write to files such as /sys/fs/cgroup/cpu.weight. The previous CPU detection implementation (#600) did not include tests either.
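One way to exercise both code paths without writing under /sys/fs/cgroup is to make the cgroup root a parameter and point it at a temporary directory. The sketch below is shell, not the Lua changed in this PR. The `read_cpu_cgroup` name is hypothetical, and treating the presence of `cgroup.controllers` as the marker for a cgroups v2 hierarchy is an assumption about how detection could work, not a statement of what environment.lua does.

```shell
#!/bin/sh
# Hypothetical helper: print the cgroup version and the raw CPU value found
# under a cgroup root directory (defaults to /sys/fs/cgroup).
read_cpu_cgroup() {
  root="${1:-/sys/fs/cgroup}"
  if [ -f "$root/cgroup.controllers" ] && [ -f "$root/cpu.weight" ]; then
    # Assumption: cgroup.controllers marks a unified (v2) hierarchy.
    echo "v2 $(cat "$root/cpu.weight")"
  elif [ -f "$root/cpu/cpu.shares" ]; then
    echo "v1 $(cat "$root/cpu/cpu.shares")"
  else
    echo "unknown"
    return 1
  fi
}
```

Calling `read_cpu_cgroup` with no argument reads the real hierarchy, while a test can pass a directory it has populated itself.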

Verification Steps

1) Development environment (docker compose)

Using docker 24.0.6

❯ docker --version
Docker version 24.0.6, build ed223bc
  • Checkout this branch
  • Run development env
make development
make dependencies
  • Create configuration file
cat <<EOF >config.json
{
  "services": [
    {
      "backend_version": "1",
      "proxy": {
        "hosts": [
          "example.com"
        ],
        "api_backend": "https://echo-api.3scale.net",
        "backend": {
          "endpoint": "http://127.0.0.1:8081",
          "host": "backend"
        },
        "proxy_rules": [
          {
            "http_method": "GET",
            "pattern": "/",
            "metric_system_name": "hits",
            "delta": 1,
            "parameters": [],
            "querystring_parameters": {}
          }
        ],
        "policy_chain": [
          {
            "name": "apicast.policy.apicast"
          }
        ]
      }
    }
  ]
}
EOF

Run APIcast without setting the APICAST_WORKERS env variable

APICAST_LOG_LEVEL=debug THREESCALE_CONFIG_FILE=config.json ./bin/apicast

The logs show that cgroups v2 is being used in the Docker environment.

2023/09/26 14:23:55 [debug] 62527#62527: *2 [lua] environment.lua:63: cpu_shares(): detecting cpus in Cgroups v2
2023/09/26 14:23:55 [debug] 62527#62527: *2 [lua] environment.lua:92: cpus(): cpu_shares = 3

Check the number of workers: the logs should show one line like the following per worker. In this example there are three, which matches the cpu_shares value computed above.

2023/09/26 14:35:41 [notice] 72304#72304: start worker process 72316
2023/09/26 14:35:41 [notice] 72304#72304: start worker process 72317
2023/09/26 14:35:41 [notice] 72304#72304: start worker process 72318
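The comparison between the two log excerpts can be scripted. This is a minimal sketch that assumes the APIcast output was captured to a file. The `count_started_workers` helper and the `apicast.log` file name are hypothetical, and the pattern is taken from the notice lines above. Note that the count would also include any workers respawned later in the same log.

```shell
#!/bin/sh
# Hypothetical helper: count "start worker process" notices in a captured log.
count_started_workers() {
  grep -c 'start worker process' "$1"
}

# Usage (assumes the APIcast output was redirected to apicast.log):
#   count_started_workers apicast.log
```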

2) Openshift 4.12 (using cgroups v1)

  • Checkout this branch
  • Build and push docker image to registry
make runtime-image IMAGE_NAME=quay.io/3scale/apicast:test-cpu-detection-cgroups-v2
docker push quay.io/3scale/apicast:test-cpu-detection-cgroups-v2
  • Deploy APIcast operator
  • Deploy APIcast instance from the image
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: apicast-config-1
  labels:
    apicast.apps.3scale.net/watched-by: apicast
type: Opaque
stringData:
  config.json: |  
    {
    "services": [
      {
        "backend_version": "1",
        "proxy": {
          "hosts": [
            "example.com"
          ],
          "api_backend": "https://echo-api.3scale.net",
          "backend": {
            "endpoint": "http://127.0.0.1:8081",
            "host": "backend"
          },
          "proxy_rules": [
            {
              "http_method": "GET",
              "pattern": "/",
              "metric_system_name": "hits",
              "delta": 1,
              "parameters": [],
              "querystring_parameters": {}
            }
          ],
          "policy_chain": [
            {
              "name": "apicast.policy.apicast"
            }
          ]
        }
      }
    ]
    }
---
apiVersion: apps.3scale.net/v1alpha1
kind: APIcast
metadata:
  name: apicast1
spec:
  embeddedConfigurationSecretRef:
    name: apicast-config-1
  image: quay.io/3scale/apicast:test-cpu-detection-cgroups-v2
  resources: {} 
  logLevel: debug
EOF
  • Check logs for cpu shares computation
kubectl logs $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name})  | grep environment

The logs show that the cgroups v1 computation is used:

2023/09/26 15:07:06 [debug] 14#14: *2 [lua] environment.lua:75: cpu_shares(): detecting cpus in Cgroups v1
2023/09/26 15:07:06 [debug] 14#14: *2 [lua] environment.lua:92: cpus(): cpu_shares = 1
2023/09/26 15:07:07 [debug] 14#14: environment.lua:75: cpu_shares(): detecting cpus in Cgroups v1
2023/09/26 15:07:07 [debug] 14#14: environment.lua:92: cpus(): cpu_shares = 1
2023/09/26 15:07:07 [notice] 14#14: [lua] environment.lua:238: add(): loading environment configuration: /opt/app-root/src/config/production.lua

The cpu.shares value is 2:

kubectl exec -it $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name}) -- cat /sys/fs/cgroup/cpu/cpu.shares

2

Check that the number of APIcast workers is 1:

kubectl exec -it  $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name}) -- bash -c "ls /proc | grep '[0-9]' | xargs -I {} sh -c '[ -f /proc/{}/cmdline ] && cat /proc/{}/cmdline && echo \"\"'" 2>/dev/null | strings | grep "nginx: worker process" | wc -l

1

Now request 1400m of CPU (1.4 cores):

kubectl patch apicast apicast1 --type=merge --patch '{"spec": {"resources": {"requests": {"cpu": "1400m"}}}}'

The new cpu.shares value is 1433:

kubectl exec -it $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name}) -- cat /sys/fs/cgroup/cpu/cpu.shares

1433

The logs show cpu_shares = 2:

❯ kubectl logs $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name}) | grep cpus
2023/09/26 15:08:34 [debug] 14#14: *2 [lua] environment.lua:75: cpu_shares(): detecting cpus in Cgroups v1
2023/09/26 15:08:34 [debug] 14#14: *2 [lua] environment.lua:92: cpus(): cpu_shares = 2
2023/09/26 15:08:34 [debug] 14#14: environment.lua:75: cpu_shares(): detecting cpus in Cgroups v1
2023/09/26 15:08:34 [debug] 14#14: environment.lua:92: cpus(): cpu_shares = 2

Check that the number of APIcast workers is 2:

kubectl exec -it  $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name}) -- bash -c "ls /proc | grep '[0-9]' | xargs -I {} sh -c '[ -f /proc/{}/cmdline ] && cat /proc/{}/cmdline && echo \"\"'" 2>/dev/null | strings | grep "nginx: worker process" | wc -l

2
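The numbers in this section are consistent with the arithmetic below. It is a sketch, not the Lua code in this PR. The millicores-to-shares conversion is the one Kubernetes is understood to apply for cgroups v1 (1024 shares per core, with a minimum of 2), and rounding up to a whole number of workers is an assumption inferred from the logged values. Both function names are hypothetical.

```shell
#!/bin/sh
# CPU request in millicores -> cgroups v1 cpu.shares (assumed kubelet rule).
millicores_to_shares() {
  shares=$(( $1 * 1024 / 1000 ))
  [ "$shares" -lt 2 ] && shares=2   # assumed minimum share value
  echo "$shares"
}

# cpu.shares -> worker count, rounding up to a whole CPU (assumed rule).
shares_to_workers() {
  echo $(( ($1 + 1023) / 1024 ))
}

millicores_to_shares 0       # 2    (resources: {})
shares_to_workers 2          # 1
millicores_to_shares 1400    # 1433
shares_to_workers 1433       # 2
```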

3) Openshift 4.14 (using cgroups v2)

  • Checkout this branch
  • Build and push docker image to registry
make runtime-image IMAGE_NAME=quay.io/3scale/apicast:test-cpu-detection-cgroups-v2
docker push quay.io/3scale/apicast:test-cpu-detection-cgroups-v2
  • Deploy APIcast operator
  • Deploy APIcast instance from the image
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: apicast-config-1
  labels:
    apicast.apps.3scale.net/watched-by: apicast
type: Opaque
stringData:
  config.json: |  
    {
    "services": [
      {
        "backend_version": "1",
        "proxy": {
          "hosts": [
            "example.com"
          ],
          "api_backend": "https://echo-api.3scale.net",
          "backend": {
            "endpoint": "http://127.0.0.1:8081",
            "host": "backend"
          },
          "proxy_rules": [
            {
              "http_method": "GET",
              "pattern": "/",
              "metric_system_name": "hits",
              "delta": 1,
              "parameters": [],
              "querystring_parameters": {}
            }
          ],
          "policy_chain": [
            {
              "name": "apicast.policy.apicast"
            }
          ]
        }
      }
    ]
    }
---
apiVersion: apps.3scale.net/v1alpha1
kind: APIcast
metadata:
  name: apicast1
spec:
  embeddedConfigurationSecretRef:
    name: apicast-config-1
  image: quay.io/3scale/apicast:test-cpu-detection-cgroups-v2
  resources: {} 
  logLevel: debug
EOF
  • Check logs for cpu shares computation
kubectl logs $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name})  | grep environment

The logs show that the cgroups v2 computation is used:

2023/09/26 13:41:48 [debug] 14#14: *2 [lua] environment.lua:63: cpu_shares(): detecting cpus in Cgroups v2
2023/09/26 13:41:48 [debug] 14#14: *2 [lua] environment.lua:92: cpus(): cpu_shares = 1
2023/09/26 13:41:48 [debug] 14#14: environment.lua:63: cpu_shares(): detecting cpus in Cgroups v2
2023/09/26 13:41:48 [debug] 14#14: environment.lua:92: cpus(): cpu_shares = 1
2023/09/26 13:41:48 [notice] 14#14: [lua] environment.lua:238: add(): loading environment configuration: /opt/app-root/src/config/production.lua

The cpu.weight value is 1:

kubectl exec -it $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name}) -- cat /sys/fs/cgroup/cpu.weight
1

Check that the number of APIcast workers is 1:

kubectl exec -it  $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name}) -- bash -c "ls /proc | grep '[0-9]' | xargs -I {} sh -c '[ -f /proc/{}/cmdline ] && cat /proc/{}/cmdline && echo \"\"'" 2>/dev/null | strings | grep "nginx: worker process" | wc -l

1

Now request 1400m of CPU (1.4 cores):

kubectl patch apicast apicast1 --type=merge --patch '{"spec": {"resources": {"requests": {"cpu": "1400m"}}}}'

The new cpu.weight value is 55:

kubectl exec -it $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name}) -- cat /sys/fs/cgroup/cpu.weight
55

The logs show cpu_shares = 2:

2023/09/26 13:46:12 [debug] 14#14: *2 [lua] environment.lua:63: cpu_shares(): detecting cpus in Cgroups v2
2023/09/26 13:46:12 [debug] 14#14: *2 [lua] environment.lua:92: cpus(): cpu_shares = 2

Check that the number of APIcast workers is 2:

kubectl exec -it  $(kubectl get pod -l deployment=apicast-apicast1 -o jsonpath={.items..metadata.name}) -- bash -c "ls /proc | grep '[0-9]' | xargs -I {} sh -c '[ -f /proc/{}/cmdline ] && cat /proc/{}/cmdline && echo \"\"'" 2>/dev/null | strings | grep "nginx: worker process" | wc -l

2
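The cpu.weight values in this section line up with the cgroups v1 shares from the OpenShift 4.12 run under the linear mapping Kubernetes is understood to use between the two ranges ([2, 262144] shares to [1, 10000] weight). The sketch below applies that mapping in both directions with integer arithmetic. It is an illustration with hypothetical function names, not the Lua implementation in this PR, which may round differently.

```shell
#!/bin/sh
# cgroups v1 cpu.shares -> cgroups v2 cpu.weight (assumed kubelet mapping).
shares_to_weight() { echo $(( 1 + ($1 - 2) * 9999 / 262142 )); }

# Inverse mapping: cpu.weight -> equivalent cpu.shares (integer division).
weight_to_shares() { echo $(( ($1 - 1) * 262142 / 9999 + 2 )); }

# Equivalent shares -> worker count, rounding up (assumed rule).
shares_to_workers() { echo $(( ($1 + 1023) / 1024 )); }

shares_to_weight 2                           # 1
shares_to_weight 1433                        # 55
weight_to_shares 55                          # 1417
shares_to_workers "$(weight_to_shares 55)"   # 2
shares_to_workers "$(weight_to_shares 1)"    # 1
```

The round trip is lossy (1433 shares become weight 55, which maps back to 1417 shares), but the resulting worker count is the same in this example.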

@eguzki eguzki requested a review from a team as a code owner September 26, 2023 13:55

eguzki commented Sep 26, 2023

The link-check tests are fixed in #1411.


@mayorova mayorova left a comment


I didn't actually run the code locally (I don't have my dev env set up), but the code looks good to me. The info in the description is also great and shows that this works as expected 👍

@eguzki eguzki merged commit cff87c9 into master Sep 29, 2023
11 checks passed
@eguzki eguzki deleted the fix-cpu-detection-cgroupsv2 branch September 29, 2023 10:30