Manager Abnormal Exit #2687

Closed
crossoverJie opened this issue Feb 29, 2024 · 6 comments
Labels
bug Something isn't working needs triage

Comments

@crossoverJie
Contributor

Component(s)

instrumentation

What happened?

Description

When I activate auto-instrumentation injection for a Pod:

instrumentation.opentelemetry.io/inject-java: "true"

The opentelemetry-operator-controller-manager intermittently shuts down.

After that, the Pod was not injected successfully.

Steps to Reproduce

Expected Result

Automatic Injection Successful.

Actual Result

Automatic Injection Failed.

Kubernetes Version

1.22.5

Operator version

0.94.0

Collector version

0.92.0

Environment information

No response

Log output

{"level":"error","ts":"2024-02-29T10:53:21.922838487Z","logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: autoscaling/v2: the server could not find the requested resource","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1.1\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.1/pkg/internal/source/kind.go:68\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func2\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.29.1/pkg/util/wait/loop.go:87\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.29.1/pkg/util/wait/loop.go:88\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel\n\t/home/runner/go/pkg/mod/k8s.io/apimachinery@v0.29.1/pkg/util/wait/poll.go:33\nsigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind).Start.func1\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.1/pkg/internal/source/kind.go:56"}




{"level":"info","ts":"2024-02-29T10:54:32.053121059Z","msg":"Stopping and waiting for webhooks"}
{"level":"info","ts":"2024-02-29T10:54:32.05337108Z","logger":"controller-runtime.webhook","msg":"Shutting down webhook server with timeout of 1 minute"}
{"level":"info","ts":"2024-02-29T10:54:32.053428921Z","msg":"Stopping and waiting for HTTP servers"}
{"level":"info","ts":"2024-02-29T10:54:32.053449569Z","msg":"shutting down server","kind":"health probe","addr":"[::]:8081"}
{"level":"info","ts":"2024-02-29T10:54:32.053476625Z","logger":"controller-runtime.metrics","msg":"Shutting down metrics server with timeout of 1 minute"}
{"level":"info","ts":"2024-02-29T10:54:32.053498484Z","msg":"Wait completed, proceeding to shutdown the manager"}
{"level":"error","ts":"2024-02-29T10:54:32.053533684Z","logger":"setup","msg":"problem running manager","error":"failed to wait for opentelemetrycollector caches to sync: timed out waiting for cache to be synced for Kind *v2.HorizontalPodAutoscaler","stacktrace":"main.main\n\t/home/runner/work/opentelemetry-operator/opentelemetry-operator/main.go:319\nruntime.main\n\t/opt/hostedtoolcache/go/1.21.7/x64/src/runtime/proc.go:267"}
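The two error entries above name the same root problem: the API server does not serve `autoscaling/v2`, so the cache for `*v2.HorizontalPodAutoscaler` never syncs. A minimal sketch for pulling that detail out of the operator's JSON logs follows. The names `missing_apis`, `MISSING_API`, and `sample` are illustrative and not part of the operator, and the sample lines are shortened copies of the entries above (timestamps and stack traces dropped).

```python
import json
import re

# Matches the phrase used in the discovery error, e.g.
# "autoscaling/v2: the server could not find the requested resource".
MISSING_API = re.compile(
    r"([\w.\-]+/v\w+): the server could not find the requested resource"
)


def missing_apis(log_lines):
    """Return the API group/versions reported as missing in error-level entries."""
    found = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if not isinstance(entry, dict) or entry.get("level") != "error":
            continue
        for match in MISSING_API.finditer(str(entry.get("error", ""))):
            if match.group(1) not in found:
                found.append(match.group(1))
    return found


# Shortened copies of two entries from the log output above.
sample = [
    '{"level":"error","logger":"controller-runtime.source.EventHandler",'
    '"msg":"failed to get informer from cache",'
    '"error":"failed to get API group resources: unable to retrieve the '
    'complete list of server APIs: autoscaling/v2: the server could not '
    'find the requested resource"}',
    '{"level":"info","msg":"Stopping and waiting for webhooks"}',
]

print(missing_apis(sample))
```

In practice the input could be the saved output of `kubectl logs` for the manager Pod, read line by line.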

Additional context

No response

@crossoverJie crossoverJie added bug Something isn't working needs triage labels Feb 29, 2024
@iblancasa
Contributor

Could you provide more information about your environment?

Are you trying to modify a running pod or do you create the pod with the annotation?

the Pod was not injected successfully.

Do you see the environment variables at least?

@crossoverJie
Contributor Author

crossoverJie commented Mar 1, 2024

Thank you for your reply.

Could you provide more information about your environment?

Kubernetes version: 1.22.5; this is a serverless cluster.

Are you trying to modify a running pod or do you create the pod with the annotation?

image

Yeah, I created a Pod with the annotation.

Do you see the environment variables at least?

image

Automatic injection succeeds when the manager is running properly.

@crossoverJie
Contributor Author

After switching to a standard Kubernetes cluster, the problem persists and the error log is the same.

@iblancasa
Contributor

Would you be able to provide a reproducer?

@crossoverJie
Contributor Author

image

I installed the operator using the steps here:

  1. Install cert-manager.
  2. Install Operator: kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/latest/download/opentelemetry-operator.yaml
  3. When an event occurs, the log described above appears and the manager exits.
{"level":"info","ts":"2024-03-06T03:11:20.011803425Z","logger":"setup","msg":"the env var WATCH_NAMESPACE isn't set, watching all namespaces"}

I have not set a specific namespace. I will try restricting the operator to my test namespace and test again.

@crossoverJie
Contributor Author

Thank you for your reply.
It was a compatibility problem. Everything works normally after downgrading the operator to v0.85.
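For anyone hitting the same log: the cluster reported in this issue runs Kubernetes 1.22.5, while the error refers to `autoscaling/v2`, which to my knowledge is only served by Kubernetes 1.23 and later. A rough sketch of that check is below. The threshold is stated as an assumption in the code, and `parse_major_minor` and `serves_autoscaling_v2` are hypothetical helper names, not operator or Kubernetes APIs.

```python
import re

# Assumption: the autoscaling/v2 HorizontalPodAutoscaler API is served
# from Kubernetes 1.23 onward (older clusters only offer beta versions).
AUTOSCALING_V2_MIN = (1, 23)


def parse_major_minor(version):
    """Parse a version such as '1.22.5' or 'v1.22.5' into (1, 22)."""
    match = re.match(r"v?(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def serves_autoscaling_v2(version):
    """Return True if a cluster of this version is expected to serve autoscaling/v2."""
    return parse_major_minor(version) >= AUTOSCALING_V2_MIN


print(serves_autoscaling_v2("1.22.5"))  # the version reported in this issue
print(serves_autoscaling_v2("1.23.0"))
```

The authoritative answer for a given cluster is whatever `kubectl api-versions` lists. The version comparison is only a quick first guess.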
