
Collector core dumps when trying to use smartagent & k8s_cluster #623

Closed
govardha opened this issue Jan 5, 2023 · 4 comments
Labels
bug (Something isn't working), Splunk Observability (Issue related to Splunk Observability destination)

Comments

govardha commented Jan 5, 2023

Hello,

I am using the helm chart to deploy OTel to capture logs/metrics and send them to my Splunk Enterprise instance.

I created the following receivers and enabled the pipeline.

      receivers:
        k8s_cluster:
          collection_interval: 60s
          node_conditions_to_report: [Ready, MemoryPressure, DiskPressure, PIDPressure, NetworkUnavailable]
          allocatable_types_to_report: [cpu, memory, storage, ephemeral-storage]

        smartagent/kubernetes-cluster:
          type: kubernetes-cluster
          intervalSeconds: 10
          kubernetesAPI:
            authType: serviceAccount
          extraDimensions:
            metric_source: kube-state-metrics
      service:
        extensions:
        - health_check
        - memory_ballast
        pipelines:
          metrics:
            processors:
            - memory_limiter
            - batch
            - resource
            - resource/k8s_cluster
            - resource/add_foo1
            receivers:
            - smartagent/kubernetes-cluster
            - k8s_cluster
            

When I try to implement this, otelcol core dumps as shown below. Am I doing something wrong?

2023-01-05T22:55:45.313Z        info    service/pipelines.go:108        Receiver started.       {"kind": "receiver", "name": "prometheus/k8s_cluster_receiver", "pipeline": "metrics"}
2023-01-05T22:55:45.313Z        info    service/pipelines.go:104        Receiver is starting... {"kind": "receiver", "name": "smartagent/kubernetes-cluster", "pipeline": "metrics"}
2023-01-05T22:55:45.313Z        info    prometheusreceiver@v0.67.0/metrics_receiver.go:288      Starting scrape manager {"kind": "receiver", "name": "prometheus/k8s_cluster_receiver", "pipeline": "metrics"}
2023-01-05T22:55:45.313Z        info    service/service.go:114  Starting shutdown...
2023-01-05T22:55:45.313Z        info    healthcheck/handler.go:129      Health Check state change       {"kind": "extension", "name": "health_check", "status": "unavailable"}
2023-01-05T22:55:45.313Z        info    service/pipelines.go:120        Stopping receivers...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x25d8499]

goroutine 1 [running]:
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/k8sclusterreceiver.(*kubernetesReceiver).Shutdown(0xc0011cf160?, {0x0?, 0x0?})
        /builds/o11y-gdi/splunk-otel-collector-releaser/.go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/receiver/k8sclusterreceiver@v0.67.0/receiver.go:101 +0x19
go.opentelemetry.io/collector/service.(*builtPipelines).ShutdownAll(0xc0017f7ea0, {0x5a82650, 0xc00013a000})
        /builds/o11y-gdi/splunk-otel-collector-releaser/.go/pkg/mod/go.opentelemetry.io/collector@v0.67.0/service/pipelines.go:123 +0x499
go.opentelemetry.io/collector/service.(*service).Shutdown(0xc0017b7800, {0x5a82650, 0xc00013a000})
        /builds/o11y-gdi/splunk-otel-collector-releaser/.go/pkg/mod/go.opentelemetry.io/collector@v0.67.0/service/service.go:120 +0xd4
go.opentelemetry.io/collector/service.(*Collector).shutdownServiceAndTelemetry(0xc0011cf8c0, {0x5a82650?, 0xc00013a000?})
        /builds/o11y-gdi/splunk-otel-collector-releaser/.go/pkg/mod/go.opentelemetry.io/collector@v0.67.0/service/collector.go:259 +0x36
go.opentelemetry.io/collector/service.(*Collector).setupConfigurationComponents(0xc0011cf8c0, {0x5a82650, 0xc00013a000})
        /builds/o11y-gdi/splunk-otel-collector-releaser/.go/pkg/mod/go.opentelemetry.io/collector@v0.67.0/service/collector.go:164 +0x286
go.opentelemetry.io/collector/service.(*Collector).Run(0xc0011cf8c0, {0x5a82650, 0xc00013a000})
        /builds/o11y-gdi/splunk-otel-collector-releaser/.go/pkg/mod/go.opentelemetry.io/collector@v0.67.0/service/collector.go:188 +0x65
go.opentelemetry.io/collector/service.NewCommand.func1(0xc000644c00, {0x500c204?, 0x0?, 0x1?})
        /builds/o11y-gdi/splunk-otel-collector-releaser/.go/pkg/mod/go.opentelemetry.io/collector@v0.67.0/service/command.go:53 +0x479
github.com/spf13/cobra.(*Command).execute(0xc000644c00, {0xc000134010, 0x0, 0x1})
        /builds/o11y-gdi/splunk-otel-collector-releaser/.go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:916 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0xc000644c00)
        /builds/o11y-gdi/splunk-otel-collector-releaser/.go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:1044 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
        /builds/o11y-gdi/splunk-otel-collector-releaser/.go/pkg/mod/github.com/spf13/cobra@v1.6.1/command.go:968
main.runInteractive({{0xc00116bc80, 0xc001502810, 0xc00116be90, 0xc00116b8c0}, {{0x500b848, 0x7}, {0x0, 0x0}, {0x5a29908, 0x7}}, ...})
        /builds/o11y-gdi/splunk-otel-collector-releaser/cmd/otelcol/main.go:117 +0x5a
main.run(...)
        /builds/o11y-gdi/splunk-otel-collector-releaser/cmd/otelcol/main_others.go:26
main.main()
        /builds/o11y-gdi/splunk-otel-collector-releaser/cmd/otelcol/main.go:110 +0x131e
atoulme transferred this issue from signalfx/splunk-otel-collector Jan 5, 2023
atoulme added the bug label Jan 5, 2023

atoulme commented Jan 5, 2023

Can you try upgrading the Docker image to 0.68.0? I will also track down this panic upstream. Thanks for the report.
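
For reference, a minimal values.yaml override sketch for pinning the collector image in the Splunk OTel Collector helm chart, assuming the chart's standard image.otelcol keys (verify the key names and available tags against your chart version):

    image:
      otelcol:
        repository: quay.io/signalfx/splunk-otel-collector  # assumed default repository
        tag: 0.68.0                                          # pin the image to the suggested version

After applying the override with helm upgrade, the collector pods should restart on the newer image without any other chart changes.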


govardha commented Jan 6, 2023

Hello @atoulme

I upgraded to 0.68 (just the image, obviously, not the helm chart) and it no longer core dumps. I have other problems, but those are more about how I have configured the receiver and it not coming up.

Thanks for the assist.


jvoravong commented Jan 6, 2023

The root of this issue is likely related to open-telemetry/opentelemetry-collector#6507.

Keep us updated @govardha if this issue persists after upgrading to v0.68.0.

atoulme added the Splunk Observability label Apr 5, 2023

atoulme commented Apr 5, 2023

Closing for inactivity. Please reopen if the issue persists.

atoulme closed this as completed Apr 5, 2023