leases.coordination.k8s.io retrieval error #224

Closed
prmmuthu opened this issue Apr 29, 2023 · 7 comments
Labels
bug Something isn't working

Comments


prmmuthu commented Apr 29, 2023

Brief summary

Hello All,

We have been using the k6 operator for the past 6 months, but now our tests are not being picked up by the operator and we get the error below, even after restarting the pods.

[screenshot: operator logs showing the leases.coordination.k8s.io leader election error]

k6-operator version or image

v0.0.8

K6 YAML

apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-test1-1682685331
  namespace: k6-operator-system
spec:
  parallelism: 1
  script:
    volumeClaim:
      name: k6-pvc
      file: 50.tar
  arguments: --tag test_id=50 --tag region=westeurope --tag provider=azure --tag platform_name=tse
    --tag product_id=test1 --tag env=prod --tag app=loadstar
  runner:
    image: crgazwe.azurecr.io/k6test
    imagePullSecrets:
    - name: k6-cred
    env:
    - name: K6_OUT
      value: xk6-prometheus-rw
    - name: K6_LOG_OUTPUT
      value: "loki=https://<...>:<...>@telemetry.pensieve.maersk-digital.net/loki/api/v1/push,label.app_name=50,label.region=westeurope,label.provider=azure,label.product_id=test1,label.env=prod,label.app=loadstar,allowedLabels=[region,provider,product_id,env,app,app_name]"
    - name: K6_PROMETHEUS_RW_SERVER_URL
      value: https://telemetry.pensieve.maersk-digital.net/api/v1/push
    - name: K6_PROMETHEUS_RW_TREND_STATS
      value: "avg,p(90),p(99),min,max"
    - name: K6_PROMETHEUS_RW_HEADERS_AUTHORIZATION
      value: Bearer <...>
    - name: K6_STAGES
      value: 2m:1
    resources:
      limits:
        cpu: 200m
        memory: 500Mi
      requests:
        cpu: 100m
        memory: 300Mi

Other environment details (if applicable)

No response

Steps to reproduce the problem

Can be reproduced using the same version.

Expected behaviour

The operator should elect a leader without error and start the k6 test.

Actual behaviour

Leader election is failing.

@prmmuthu prmmuthu added the bug Something isn't working label Apr 29, 2023
yorugac (Collaborator) commented May 3, 2023

Hi @prmmuthu, thanks for reporting. I believe this is due to an incomplete update of the operator. Specifically:

error retrieving resource lock k6-operator-system/fcdfce80.io: leases.coordination.k8s.io "fcdfce80.io" is forbidden: User "system:serviceaccount:k6-operator-system:k6-operator-controller" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "k6-operator-system"

^ this error can happen when one is using the controller-v0.0.9 image but with manifests from the older version. See this PR for details. I suggest running a full make deploy from the latest main (or the v0.0.9 tag) and checking if that helps.
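
Roughly, the redeploy looks like this (a sketch assuming the grafana/k6-operator repository layout and the default k6-operator-system namespace; adjust the checkout ref and registry to your setup):

git clone https://github.com/grafana/k6-operator.git
cd k6-operator
git checkout v0.0.9
make deploy    # re-applies CRDs, RBAC (including the leases rule), and the controller Deployment

# Optional sanity check that the service account from the error above can now read leases:
kubectl auth can-i get leases.coordination.k8s.io \
  -n k6-operator-system \
  --as=system:serviceaccount:k6-operator-system:k6-operator-controller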

To be clear, this error shouldn't happen with v0.0.8 at all.

(FYI, I've updated your initial message to remove Bearer and auth parts, just in case 😅)

@yorugac yorugac changed the title leaderelection.go:330] error retrieving resource lock k6-operator-system/fcdfce80.io leases.coordination.k8s.io retrieval error May 3, 2023
prmuthu commented May 4, 2023

Thanks @yorugac for your input. When I changed the image to controller-v0.0.9rc2, the error disappeared. We follow the FluxCD model for installing the k6 operator into our cluster. Is there a blog on a better way to install operators using FluxCD?

yorugac (Collaborator) commented May 5, 2023

> When I changed the image to controller-v0.0.9rc2, the error disappeared.

TBH, this sounds a bit strange: v0.0.9rc2 had broken Docker manifests due to a buildx issue, fixed in #190. And of course, the update of dependencies (the appearance of leases) didn't happen until v0.0.9... Did you try make deploy with the v0.0.9 image?

As for FluxCD, we plan to support bundle generation in #4. In the meantime, you can also try creating a bundle yourself and using it in your setup.
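
A rough sketch of a self-made bundle, assuming the standard kubebuilder-style config/ layout in this repository (paths and tool versions may differ in your checkout):

kustomize build config/default > k6-operator-bundle.yaml   # render CRDs, RBAC, and the controller into one manifest
kubectl apply -f k6-operator-bundle.yaml                   # or point a Flux Kustomization at the rendered file instead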

prmuthu commented May 8, 2023

> When I changed the image to controller-v0.0.9rc2, the error disappeared.
>
> TBH, this sounds a bit strange: v0.0.9rc2 had broken Docker manifests due to a buildx issue, fixed in #190. And of course, the update of dependencies (the appearance of leases) didn't happen until v0.0.9... Did you try make deploy with the v0.0.9 image?
>
> As for FluxCD, we plan to support bundle generation in #4. In the meantime, you can also try creating a bundle yourself and using it in your setup.

Thanks for the input; I will install the operator (v0.0.9) with the make deploy command. And thanks for considering our request regarding bundle generation.

prmuthu commented May 8, 2023

@yorugac I am unable to get an interactive shell into the manager container, but I can get into kube-rbac-proxy.

[screenshot: attempt to shell into the manager container]

yorugac (Collaborator) commented May 23, 2023

@prmmuthu has there been any update on this issue? Btw, you don't need an interactive shell to see the logs: the kubectl logs command can be used instead.
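
For example (the deployment and container names below are the usual defaults and may differ in your install):

kubectl get deploy -n k6-operator-system    # find the actual controller deployment name
kubectl logs -n k6-operator-system deploy/k6-operator-controller-manager -c manager --tail=100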

yorugac (Collaborator) commented Aug 17, 2023

This seems like a transient issue during an update of the operator, so I'm closing it. Feel free to re-open if need be.

@yorugac yorugac closed this as completed Aug 17, 2023