Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[otelcol] Allow confmap to write logs using configured logger #10008

Conversation

TylerHelmuth
Copy link
Member

@TylerHelmuth TylerHelmuth commented Apr 19, 2024

Description

This PR allows confmap to create logs, and then actually writes the logs out after the collector's real logger is instantiated.

Example of the logger in action

receivers:
  nop:

exporters:
  otlphttp:
    endpoint: http://0.0.0.0:4317
    headers:
      # Not set
      x-test: ${env:TEMP3}
  debug:
    # set to "detailed"
    verbosity: $TEMP

service:
  pipelines:
    traces:
      receivers:
        - nop
      exporters:
        - debug

image

Alternative to #10007

Link to tracking issue

Related to #9162
Related to #5615

Testing

If we like this approach I'll add tests

@TylerHelmuth TylerHelmuth requested a review from a team as a code owner April 19, 2024 20:41
@TylerHelmuth
Copy link
Member Author

A downside to this approach is that the logs come in a weird order. This is highlighted in the description's screenshot.

@TylerHelmuth TylerHelmuth added the Skip Changelog PRs that do not require a CHANGELOG.md entry label Apr 19, 2024
@evan-bradley
Copy link
Contributor

evan-bradley commented Apr 19, 2024

@TylerHelmuth I think this is a more scalable solution given that providers may want to do additional logging in the future (e.g. an HTTP provider that polls an endpoint can't reach the endpoint during runtime) and it would be nice if these logging settings are in line with the rest of the Collector.

Regarding the log ordering, this highlights the fact that, in my opinion, otelcol.Collector and service have a lot of weird overlap. We could realistically do various odd things to fix the ordering (e.g. passing the logs into the service), but I don't think it's such a big deal.

@mx-psi
Copy link
Member

mx-psi commented Apr 22, 2024

A downside to this approach is that the logs come in a weird order. This is highlighted in the description's screenshot.

I don't feel like this is a big deal. As part of #4970 we could explore moving the logger initialization outside of service, that would give us more flexibility with how to do this.

Note that service.Logger() has this comment:

This is a temporary API that may be removed soon after investigating how the collector should record different events.

I am fine with using it here for now, but I guess we'll have to work on refactoring this part at some point (it does feel a bit weird that that method exists).

cc @ankitpatel96, this is a rework of #9908

@TylerHelmuth
Copy link
Member Author

Another potential downside I need to investigate: if there is an error during confmap resolution will the logs still be written

@TylerHelmuth TylerHelmuth changed the title [otelcol] Write logs once we have a logger [otelcol] Allow confmap to write logs using configured logger Apr 22, 2024
@TylerHelmuth
Copy link
Member Author

TylerHelmuth commented Apr 22, 2024

if there is an error during confmap resolution will the logs still be written

The answer is no, the logs will not be written. So if we wanted to add debug logs to help users while troubleshooting configuration resolution that wouldn't work - they'd need to entirely depend on the errors returned.

With this solution any logs from confmap will only be written on a successful collector configuration resolution AND service creation.

@evan-bradley
Copy link
Contributor

With this solution any logs from confmap will only be written on a successful collector configuration resolution AND service creation.

Could we circumvent this by creating an "error only" logger in otelcol.Collector that replays the logs from the observer with a configuration similar to the one in #10007?

@mx-psi
Copy link
Member

mx-psi commented Apr 23, 2024

The answer is no, the logs will not be written. So if we wanted to add debug logs to help users while troubleshooting configuration resolution that wouldn't work - they'd need to entirely depend on the errors returned.

The alternative (without changing public API) would be to create the logger twice: once before creating the service to log these, and once inside the server.

I am okay with dealing with this problem later if we don't like that though, this is a net improvement IMO and we could merge after adding some tests

Copy link

codecov bot commented Apr 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.56%. Comparing base (31528ce) to head (8e9bce4).

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #10008   +/-   ##
=======================================
  Coverage   91.56%   91.56%           
=======================================
  Files         360      360           
  Lines       16698    16703    +5     
=======================================
+ Hits        15289    15294    +5     
  Misses       1073     1073           
  Partials      336      336           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

otelcol/collector_test.go Outdated Show resolved Hide resolved
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
@@ -202,6 +207,12 @@ func (col *Collector) setupConfigurationComponents(ctx context.Context) error {
return err
}

if col.ol != nil {
for _, log := range col.ol.All() {
col.service.Logger().Log(log.Level, log.Message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we should use the core directly here via its Check and Write methods. That way we would preserve the exact logged message, including the timestamp and stack trace. I think it can be done, but it's not trivial since there is some logic we would have to copy from here.

Do you think it's worth the effort?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would having out-of-order timestamps be a problem? I dont think stacktraces is a valid scenario bc currently if the confmap errors its logs will not get written.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think stacktraces is a valid scenario bc currently if the confmap errors its logs will not get written.

I don't get what you mean by this. What I meant is that instead of having otelcol@v0.98.0/collector:XX we would have the actual line where this log came from.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at #10056 this is actually necessary since we are not logging the fields (if you compare the screenshots, with this PR you don't know what env var is unset)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't like the idea of duplicating functionality that exists in zapcore already

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I can fix the field issue by passing in log.Context... as the last param.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't like the idea of duplicating functionality that exists in zapcore already

To be clear: it does not exist in zapcore, it exists in *zap.Logger

otelcol/collector.go Show resolved Hide resolved
@@ -15,6 +15,7 @@ import (

"go.uber.org/multierr"
"go.uber.org/zap"
"go.uber.org/zap/zaptest/observer"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bringing in a test import here seems a bit strange. from the observer docs:

It's useful for applications that want to unit test their log output without tying their tests to a particular output encoding.

I worry that this package may change in the future causing issue, but maybe it's not a problem?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're certainly using the package in a use case I think they didn't intend. I still think the simplicity of #10007 is most appropriate as it is the collector making logging decisions before being told how to log. It keeps the logs in proper order as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry that this package may change in the future causing issue, but maybe it's not a problem?

I don't think that should be a problem since (i) the fact that we use this package is an implementation detail and (ii) even if the package disappeared we could rebuild it ourselves (it does not depend on any internal bits of zap)

@TylerHelmuth
Copy link
Member Author

I think this is a more scalable solution given that providers may want to do additional logging in the future (e.g. an HTTP provider that polls an endpoint can't reach the endpoint during runtime) and it would be nice if these logging settings are in line with the rest of the Collector.

@evan-bradley to address this concern in #10007 could the confmap provider's logger be updated with a new logger once the actual one is instantiated?

@evan-bradley
Copy link
Contributor

evan-bradley commented Apr 29, 2024

@evan-bradley to address this concern in #10007 could the confmap provider's logger be updated with a new logger once the actual one is instantiated?

I would prefer not to mess around with the logger instance if we can get away with it. Additionally, that may be a challenge: once we pass the logger in and the providers store the reference, I don't think we have any real ability to update it.

If we want to avoid using the observer, I would say we should do one of these:

  1. Refactor the otelcol<->service interface to move config resolution responsibilities to the service. This needs a lot of discussion and may not be the right move, but I can bring this one up at the next SIG if there aren't any major objections to it.
  2. Have otelcol.Collector instantiate the logger and pass it to the service.

In both of these cases we would ideally find a way to delay actually outputting any logs until we've resolved a log level. Otherwise we would just have to use a default log level until we get the real one.

Actually, looking at this a second time, I think we may have to move away from the observer regardless if we want logs at runtime. It doesn't look like it has any methods that would support "streaming" the logs to a real logger.

@evan-bradley
Copy link
Contributor

once we pass the logger in and the providers store the reference, I don't think we have any real ability to update it.

We could get around this by passing a wrapper struct that allows us to update the instance reference so long as providers access the logger only through the wrapper. It would be nice to see if we could solve this some other way first.

@TylerHelmuth
Copy link
Member Author

@evan-bradley I don't think either option 1 or 2 will solve the problem of trying to have an the actual user-configured logger during config resolution since they don't solve the problem that the user-configured logger depends on config resolution.

@TylerHelmuth
Copy link
Member Author

We could get around this by passing a wrapper struct that allows us to update the instance reference so long as providers access the logger only through the wrapper. It would be nice to see if we could solve this some other way first.

This would work and would be enforceable if ProviderSettings.Logger used that wrapper instead of *zap.Logger. I believe we'd also need a new method on ConfigProvider that would allow updating the logger, or we'd need to keep around a reference to the original wrapper passed into ProviderSettings that could be updated.

@evan-bradley
Copy link
Contributor

@evan-bradley I don't think either option 1 or 2 will solve the problem of trying to have an the actual user-configured logger during config resolution since they don't solve the problem that the user-configured logger depends on config resolution.

Both of those options are intended to keep a single instance of the logger and just update it in-place after the config is resolved. I think you may be right that they don't solve the problem though, since it appears the zap APIs all return a new instance when you update a logger.

This would work and would be enforceable if ProviderSettings.Logger used that wrapper instead of *zap.Logger.

Providers could still store the *zap.Logger reference and just use that; any updates to the wrapper would then go unused. I think to make it fully enforceable we would need to fully wrap *zap.Logger and keep the instance reference hidden. Realistically we could also rely on documentation and just say to only store references to the wrapper since the instance may change.

@TylerHelmuth
Copy link
Member Author

I think to make it fully enforceable we would need to fully wrap *zap.Logger and keep the instance reference hidden.

Agreed, ProviderSettings would need to look something like

// ProviderSettings are the settings to initialize a Provider.

type OurCustomLoggingInterface {
  # expose the zap.Logger functions here and the ability to update the underlying zap logger.
}

type ProviderSettings struct {
	Logger OurCustomLoggingInterface
}

I don't like it.

@mx-psi
Copy link
Member

mx-psi commented Apr 30, 2024

Re: how to update the logger at runtime. I don't get why we need to wrap *zap.Logger into something else. Why not just build a custom zapcore.Core that we can update at runtime? It's an interface already so we can do whatever we want there, including replacing its internals after the fact, can't we?

@TylerHelmuth
Copy link
Member Author

Why not just build a custom zapcore.Core that we can update at runtime?

I can investigate this idea

@TylerHelmuth
Copy link
Member Author

Tried out the idea here: #10056

@@ -202,6 +207,12 @@ func (col *Collector) setupConfigurationComponents(ctx context.Context) error {
return err
}

if col.ol != nil {
for _, log := range col.ol.All() {
col.service.Logger().Log(log.Level, log.Message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at #10056 this is actually necessary since we are not logging the fields (if you compare the screenshots, with this PR you don't know what env var is unset)

@TylerHelmuth
Copy link
Member Author

Closing this for now in favor of #10056

mx-psi added a commit that referenced this pull request May 10, 2024
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

Provides a logger to confmap that buffers logs in memory until the
primary logger can be used. Once the primary logger exists, places that
used the original logger are given the updated Core.

If an error occurs that would shut down the collector before the primary
logger could be created, the logs are written to stdout/err using a
fallback logger.

Alternative to
#10008

I've pushed the testing I did to show how the logger successfully
updates. Before config resolution the debug log in confmap is not
printed, but afterwards it is.

test config:
```yaml
receivers:
  nop:

exporters:
  otlphttp:
    endpoint: http://0.0.0.0:4317
    headers:
      # Not set
      x-test: ${env:TEMP3}
  debug:
    # set to "detailed"
    verbosity: $TEMP

service:
  telemetry:
    logs:
      level: debug
  pipelines:
    traces:
      receivers:
        - nop
      exporters:
        - debug
```


![image](https://github.com/open-telemetry/opentelemetry-collector/assets/12352919/6a17993f-1f97-4c54-9165-5c34dd58d108)

<!-- Issue number if applicable -->
#### Link to tracking issue
Related to
#9162
Related to
#5615

<!--Describe what testing was performed and which tests were added.-->
#### Testing
If we like this approach I'll add tests

<!--Describe the documentation added.-->
#### Documentation

---------

Co-authored-by: Dan Jaglowski <jaglows3@gmail.com>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
andrzej-stencel pushed a commit to andrzej-stencel/opentelemetry-collector that referenced this pull request May 27, 2024
…ry#10056)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

Provides a logger to confmap that buffers logs in memory until the
primary logger can be used. Once the primary logger exists, places that
used the original logger are given the updated Core.

If an error occurs that would shut down the collector before the primary
logger could be created, the logs are written to stdout/err using a
fallback logger.

Alternative to
open-telemetry#10008

I've pushed the testing I did to show how the logger successfully
updates. Before config resolution the debug log in confmap is not
printed, but afterwards it is.

test config:
```yaml
receivers:
  nop:

exporters:
  otlphttp:
    endpoint: http://0.0.0.0:4317
    headers:
      # Not set
      x-test: ${env:TEMP3}
  debug:
    # set to "detailed"
    verbosity: $TEMP

service:
  telemetry:
    logs:
      level: debug
  pipelines:
    traces:
      receivers:
        - nop
      exporters:
        - debug
```


![image](https://github.com/open-telemetry/opentelemetry-collector/assets/12352919/6a17993f-1f97-4c54-9165-5c34dd58d108)

<!-- Issue number if applicable -->
#### Link to tracking issue
Related to
open-telemetry#9162
Related to
open-telemetry#5615

<!--Describe what testing was performed and which tests were added.-->
#### Testing
If we like this approach I'll add tests

<!--Describe the documentation added.-->
#### Documentation

---------

Co-authored-by: Dan Jaglowski <jaglows3@gmail.com>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Skip Changelog PRs that do not require a CHANGELOG.md entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants