Remove observability for internal resources #9633

rtribotte · 2023-01-02T14:14:20Z

What does this PR do?

This PR disables access logs, metrics, and tracing for internal routers and services.
It also introduces an addInternals option for AccessLogs, Metrics, and Tracing, to re-enable each of them for internal resources.
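
As a rough illustration of the behaviour described above, the gating could be sketched as follows. This is not Traefik's actual implementation: the function name `should_observe` and the `add_internals` parameter are hypothetical, and the sketch assumes internal resources are recognised by the `@internal` suffix on their name (as in `ping@internal` or `prometheus@internal`).

```python
def should_observe(resource_name: str, add_internals: bool = False) -> bool:
    """Decide whether access logs, metrics, or tracing apply to a resource.

    Hypothetical helper: internal resources (names ending in "@internal")
    are skipped unless the add_internals flag is set.
    """
    is_internal = resource_name.endswith("@internal")
    return add_internals or not is_internal


if __name__ == "__main__":
    for name in ("ping@internal", "prometheus@internal", "whoami@docker"):
        print(name, should_observe(name), should_observe(name, add_internals=True))
```

In this sketch each observability feature would call the helper with its own flag, which mirrors the idea of one addInternals option per feature.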

Motivation

Fixes #9170
Fixes #6861

More

Added/updated tests
Added/updated documentation

Additional Notes

This PR drops an access logs test in pkg/server/router because it was redundant with actual integration tests in place (and would have required some work to be adapted due to PR changes).

juliens

LGTM

netsandbox · 2023-02-13T16:36:29Z

I'm not sure if unconditionally dropping observability for all @internal resources is the best way, because maybe there are use cases where you want or need this information.

For example if you secure the Prometheus metrics with BasicAuth and want to see the remote IP for the HTTP 403 errors, you need the information for the prometheus@internal service in the access log.
For the same use case you may also need the Prometheus metrics for prometheus@internal to monitor the HTTP 4xx errors.

dhirschfeld · 2023-04-17T23:30:58Z

> I'm not sure if unconditionally dropping observability for all @internal resources is the best way, because maybe there are use cases where you want or need this information.

I'm just piping up as a user who'd really like the ability to turn off these logs as they just add noise and make debugging harder in the common case.

If you're trying to debug an issue with internal services, then I guess not having any logs is equally painful.

Perhaps the logging for internal services could be configurable so you could set it to e.g. only log errors rather than every successful ping request.
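
One possible shape for the configurable logging suggested above is a per-request filter that always logs regular traffic but logs internal resources only at or above a status threshold. This is a sketch under stated assumptions, not an existing Traefik option: `should_log` and `internal_min_status` are hypothetical names, and internal resources are again assumed to be identified by the `@internal` suffix.

```python
def should_log(resource_name: str, status_code: int,
               internal_min_status: int = 400) -> bool:
    """Return True if a request should be written to the access log.

    Hypothetical filter: non-internal resources are always logged;
    internal ones only when the response status reaches the threshold,
    so successful ping requests are dropped but errors are kept.
    """
    if not resource_name.endswith("@internal"):
        return True
    return status_code >= internal_min_status


if __name__ == "__main__":
    print(should_log("ping@internal", 200))        # successful health check
    print(should_log("prometheus@internal", 403))  # rejected metrics request
    print(should_log("whoami@docker", 200))        # regular traffic
```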

rtribotte · 2023-04-18T13:58:32Z

Hello @netsandbox @dhirschfeld,

Thanks for your feedback!

We did think about adding an option to get observability back for internal resources when opening the PR.
But at that time we also wanted to gather initial feedback and gauge interest before adding yet another option.

This makes total sense and we are going to rework the PR to add the option.

rtribotte · 2023-05-15T12:42:06Z

Hello @netsandbox @dhirschfeld,

We changed our minds about introducing the opt-in option to get back internal observability in this PR.
We want to address it in another PR, and to first have the opportunity to gather more feedback on the need, in order to determine the best approach for this option.
Would you be inclined to open a new issue to discuss it?

jpds · 2023-05-16T09:08:11Z

This should really be implemented with conditional logging as nginx does: https://nginx.org/en/docs/http/ngx_http_log_module.html and also be off by default.

Speaking as someone with a systems security background - there's another class of software out there that hides what it's doing from operators by default, and that's malware. Here are the Prometheus metrics for my internal resources for the past 5 days:

From an audit perspective - there's an obvious baseline here for what "normal" requests look like, but then there's a sudden increase in requests to a Traefik instance in the last hour - is that:

Me just running curl against an internal endpoint?
A confused internal user accessing the wrong port?
A misconfigured Prometheus instance?
An attacker probing whatever endpoints they can, either from a foothold they've gained in the internal network (if so, where did those requests come from?) or through a misconfigured firewall?

Because without logs/metrics - you cannot even reason about any of the above.

zetaab · 2023-11-07T16:42:33Z

@rtribotte @juliens what is the status of this PR? Could it be merged? As I see it, it would be nice to be able to ignore these internal services.

ldez · 2023-11-07T22:38:18Z

To be merged, a PR needs to be up to date, have passing CI, and have 3 approvals.

rtribotte · 2023-12-20T10:33:48Z

Hello,

We decided to move the target of this PR to the next milestone, as we changed our minds about leaving the opt-in option to a later iteration.
The next step is to introduce an option to control the observability of the internal resources.

justinabrahms · 2024-01-11T17:37:17Z

I need this change, as Traefik is producing too many useless spans (health checks) to be economical for us.

Aside from getting CI happy, what needs to be done before folks feel ready to approve? Is it at least directionally correct?

nmengin

LGTM 🎉

comminutus · 2024-01-30T16:05:11Z

Which version will this be included in, and how can one activate/use it?

rtribotte added the kind/enhancement, status/2-needs-review, area/middleware/metrics, area/middleware/tracing, and area/accesslogs labels Jan 2, 2023

traefiker added the size/M label Jan 2, 2023

traefiker added this to the 3.0 milestone Jan 2, 2023

rtribotte force-pushed the remove-internals-observability branch from dcdfaf4 to 7097e37 Compare January 2, 2023 15:37

rtribotte added the breaking label Jan 4, 2023

rtribotte force-pushed the remove-internals-observability branch from 3a904dd to 55018e9 Compare January 9, 2023 10:49

juliens approved these changes Jan 9, 2023

rtribotte force-pushed the remove-internals-observability branch from 6971539 to 8eb29ea Compare January 9, 2023 17:05

rtribotte added the bot/no-merge label Apr 18, 2023

rtribotte removed the bot/no-merge label May 15, 2023

rtribotte modified the milestones: 3.0, next Dec 20, 2023

rtribotte added the contributor/waiting-for-corrections label Dec 20, 2023

rtribotte changed the base branch from v3.0 to master December 20, 2023 13:02

nmengin assigned rtribotte Jan 15, 2024

rtribotte force-pushed the remove-internals-observability branch from 8eb29ea to 182caff Compare January 22, 2024 10:42

rtribotte changed the base branch from master to v3.0 January 22, 2024 10:42

nmengin added the priority/P1 label Jan 30, 2024

nmengin approved these changes Jan 30, 2024

nmengin added status/3-needs-merge and removed status/2-needs-review labels Jan 30, 2024

traefiker added the status/4-merge-in-progress label Jan 30, 2024

rtribotte added 14 commits January 30, 2024 15:02

feat: remove observability for internal resources

31a3e71

review: fix accesslog rotation integrations tests

bc88e78

review: remove unecessary checks on the provider name

cd446c0

review: add comment on tracing middleware wrapping

c838a6c

review: move entryPoint name field handler to observability chain

9a90c1c

review: add an option to enable internal resources observability

3ebe8b9

review: add distinct options to enable internal resources

0fc7a05

doc: review

17c9a27

doc: review

36fbba1

review

699758a

review

950585d

review

e23d2d9

review

a6bb2ce

review

512312b

traefiker force-pushed the remove-internals-observability branch from ceac983 to 512312b Compare January 30, 2024 15:02

traefiker added bot/need-human-merge and removed status/4-merge-in-progress labels Jan 30, 2024

nmengin removed the bot/need-human-merge label Jan 30, 2024

traefiker merged commit 8b77f0c into traefik:v3.0 Jan 30, 2024
22 checks passed

traefiker removed the status/3-needs-merge label Jan 30, 2024

traefiker mentioned this pull request Jan 30, 2024

Stop Tracing Internal Traffic (ping@internal, prometheus@internal, etc.) #9170

Closed

2 tasks

rtribotte mentioned this pull request Feb 1, 2024

Filtering out ping access logs #6861

Closed

rtribotte removed their assignment Feb 26, 2024

rtribotte deleted the remove-internals-observability branch March 15, 2024 14:27
