Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[otelcol] Add a custom zapcore.Core for confmap logging #10056

Merged
merged 13 commits into from
May 10, 2024

Conversation

TylerHelmuth
Copy link
Member

@TylerHelmuth TylerHelmuth commented Apr 30, 2024

Description

Provides a logger to confmap that buffers logs in memory until the primary logger can be used. Once the primary logger exists, places that used the original logger are given the updated Core.

If an error occurs that would shut down the collector before the primary logger could be created, the logs are written to stdout/err using a fallback logger.

Alternative to #10008

I've pushed the testing I did to show how the logger successfully updates. Before config resolution the debug log in confmap is not printed, but afterwards it is.

test config:

receivers:
  nop:

exporters:
  otlphttp:
    endpoint: http://0.0.0.0:4317
    headers:
      # Not set
      x-test: ${env:TEMP3}
  debug:
    # set to "detailed"
    verbosity: $TEMP

service:
  telemetry:
    logs:
      level: debug
  pipelines:
    traces:
      receivers:
        - nop
      exporters:
        - debug

image

Link to tracking issue

Related to #9162
Related to #5615

Testing

If we like this approach I'll add tests

Documentation

otelcol/collector.go Outdated Show resolved Hide resolved
Copy link

codecov bot commented Apr 30, 2024

Codecov Report

Attention: Patch coverage is 89.79592% with 10 lines in your changes are missing coverage. Please review.

Project coverage is 91.67%. Comparing base (c6b70a7) to head (c3b7488).
Report is 3 commits behind head on main.

Files Patch % Lines
otelcol/collector.go 72.22% 7 Missing and 3 partials ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##             main   #10056    +/-   ##
========================================
  Coverage   91.67%   91.67%            
========================================
  Files         362      367     +5     
  Lines       16754    16878   +124     
========================================
+ Hits        15359    15473   +114     
- Misses       1056     1063     +7     
- Partials      339      342     +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@TylerHelmuth TylerHelmuth marked this pull request as ready for review May 3, 2024 18:10
@TylerHelmuth TylerHelmuth requested a review from a team as a code owner May 3, 2024 18:10
@djaglowski
Copy link
Member

Generally looks good to me

otelcol/buffered_core.go Outdated Show resolved Hide resolved
otelcol/buffered_core.go Outdated Show resolved Hide resolved
otelcol/buffered_core.go Outdated Show resolved Hide resolved
otelcol/collector.go Outdated Show resolved Hide resolved
otelcol/collector.go Outdated Show resolved Hide resolved
otelcol/collector_core.go Outdated Show resolved Hide resolved
otelcol/collector_core.go Outdated Show resolved Hide resolved
codeboten pushed a commit that referenced this pull request May 6, 2024
#### Description

This is an RFC to help us decide how we want `otelcol` to provide a
logger before the primary logger is created. As we discuss I will update
the doc. Before this is merged we should have decided on a solution and
the Accepted Solution section must be updated.

Related to
#10056

#### Link to tracking issue
This unblocks:
- #9162
- #5615

---------

Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
@TylerHelmuth
Copy link
Member Author

@mx-psi I believe I've address all feedback

otelcol/collector.go Outdated Show resolved Hide resolved
@mx-psi
Copy link
Member

mx-psi commented May 8, 2024

@TylerHelmuth can you address the lint error?

Error: collector.go:289:10: indent-error-flow: if block ends with a return statement, so drop this else and outdent its block (revive)

I can merge after that

@mx-psi
Copy link
Member

mx-psi commented May 8, 2024

The Windows test failure seems to be legit:

    collector_windows_service_test.go:146: 
        	Error Trace:	D:/a/opentelemetry-collector/opentelemetry-collector/otelcol/collector_windows_service_test.go:146
        	Error:      	Should be true
        	Test:       	TestCollectorAsService/ConfigFileNotFound

It's failing here

require.True(t, eventTime.After(startTime.In(time.UTC)))

with this test

{
name: "ConfigFileNotFound",
configFile: filepath.Join(".", "non", "existent", "otel-config.yaml"),
expectStartFailure: true,
},

The Windows service uses a special zap core

func withWindowsCore(elog *eventlog.Log, serviceConfig **service.Config) func(zapcore.Core) zapcore.Core {

I am wondering if the fallback logger is somehow messing with this

@TylerHelmuth
Copy link
Member Author

Agreed that the failure is legit, looking into it. I'll be using CI as the runner lol

@TylerHelmuth
Copy link
Member Author

@pjanotti I need some help here, why can the windows service not handle instantiating a fallback logger before returning the error on start? If I use zap.NewNOP it works, but something about the builder causes problems.

@@ -202,7 +202,7 @@ func (w windowsEventLogCore) Sync() error {

func withWindowsCore(elog *eventlog.Log, serviceConfig **service.Config) func(zapcore.Core) zapcore.Core {
return func(core zapcore.Core) zapcore.Core {
if serviceConfig != nil {
if serviceConfig != nil && *serviceConfig != nil {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mx-psi fixed the windows test. The issue was that when the config could not resolve then the underlying serviceConfig was nil, but this check was only checking the pointer to the pointer not being nil. When we passed the options to the fallback logger we ended up in this method and tried to do (*serviceConfig).Telemetry.Logs.OutputPaths on nil. Added an extra nil check to make sure the underlying serviceConfig is also not nil.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting to the bottom of this Windows test, can we make you the official Windows reviewer? 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No thank you

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for digging into this, we'll officially name you the Windows guy in a separate PR 😄 🪟

@mx-psi mx-psi merged commit 634223b into open-telemetry:main May 10, 2024
48 of 49 checks passed
@TylerHelmuth TylerHelmuth deleted the confmap-logger-wrapper-idea branch May 10, 2024 13:02
andrzej-stencel pushed a commit to andrzej-stencel/opentelemetry-collector that referenced this pull request May 27, 2024
#### Description

This is an RFC to help us decide how we want `otelcol` to provide a
logger before the primary logger is created. As we discuss I will update
the doc. Before this is merged we should have decided on a solution and
the Accepted Solution section must be updated.

Related to
open-telemetry#10056

#### Link to tracking issue
This unblocks:
- open-telemetry#9162
- open-telemetry#5615

---------

Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
andrzej-stencel pushed a commit to andrzej-stencel/opentelemetry-collector that referenced this pull request May 27, 2024
…ry#10056)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

Provides a logger to confmap that buffers logs in memory until the
primary logger can be used. Once the primary logger exists, places that
used the original logger are given the updated Core.

If an error occurs that would shut down the collector before the primary
logger could be created, the logs are written to stdout/err using a
fallback logger.

Alternative to
open-telemetry#10008

I've pushed the testing I did to show how the logger successfully
updates. Before config resolution the debug log in confmap is not
printed, but afterwards it is.

test config:
```yaml
receivers:
  nop:

exporters:
  otlphttp:
    endpoint: http://0.0.0.0:4317
    headers:
      # Not set
      x-test: ${env:TEMP3}
  debug:
    # set to "detailed"
    verbosity: $TEMP

service:
  telemetry:
    logs:
      level: debug
  pipelines:
    traces:
      receivers:
        - nop
      exporters:
        - debug
```


![image](https://github.com/open-telemetry/opentelemetry-collector/assets/12352919/6a17993f-1f97-4c54-9165-5c34dd58d108)

<!-- Issue number if applicable -->
#### Link to tracking issue
Related to
open-telemetry#9162
Related to
open-telemetry#5615

<!--Describe what testing was performed and which tests were added.-->
#### Testing
If we like this approach I'll add tests

<!--Describe the documentation added.-->
#### Documentation

---------

Co-authored-by: Dan Jaglowski <jaglows3@gmail.com>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

6 participants