feat: add proxy for Prometheus remote write #2065

Merged 2 commits on Feb 3, 2022

@swiatekm swiatekm commented Feb 1, 2022

Description

Prometheus remote write uses a single persistent HTTP connection per target, which interacts poorly with the iptables-based TCP load balancing that K8s Services perform. A backend is picked once per connection, so every request on that connection lands on the same Pod. Use a real HTTP load balancer for this instead: nginx with a very simple configuration.
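
To illustrate why a persistent connection defeats connection-level balancing, here is a minimal simulation sketch. It is not code from this PR: the function names, the two-sender/three-backend numbers, and the round-robin selection are all assumptions made for illustration (iptables picks backends randomly rather than round-robin, but the pinning effect is the same).

```python
from itertools import cycle


def connection_level(senders: int, backends: int, requests_per_sender: int) -> dict:
    """L4-style balancing: the backend is chosen once, when the connection opens."""
    hits = {b: 0 for b in range(backends)}
    pick = cycle(range(backends))
    for _ in range(senders):
        backend = next(pick)  # decided at connect time only
        # Every later request reuses the same persistent connection.
        hits[backend] += requests_per_sender
    return hits


def request_level(senders: int, backends: int, requests_per_sender: int) -> dict:
    """L7-style balancing: the backend is chosen again for every HTTP request."""
    hits = {b: 0 for b in range(backends)}
    pick = cycle(range(backends))
    for _ in range(senders * requests_per_sender):
        hits[next(pick)] += 1
    return hits


if __name__ == "__main__":
    # Hypothetical setup: 2 Prometheus senders, 3 receiving Pods, 100 requests each.
    print("connection-level:", connection_level(2, 3, 100))
    print("request-level:   ", request_level(2, 3, 100))
```

With fewer long-lived connections than backends, the connection-level variant leaves one backend completely idle, while the request-level variant spreads the same traffic almost evenly.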

I manually ran integration tests and they passed. I don't want to add another test to run for all K8s versions, so I'll add the integration test after we redo the test matrix in the CI to allow tests to only run in a single K8s environment.


Checklist

Remove items which don't apply to your PR.

  • Changelog updated
  • Testing performed
    • Redeploy fluentd and fluentd-events pods
    • Confirm events, logs, and metrics are coming in

@github-actions bot added the documentation label Feb 1, 2022
Contributor

@pmalek-sumo pmalek-sumo left a comment

Please add a changelog entry for this

@sumo-drosiek
Contributor

Autoscaling will probably end up with poor load balancing, like we observe with the current architecture, but I think it will be an improvement anyway.

@swiatekm swiatekm force-pushed the feat/remote-write-proxy branch 2 times, most recently from f2fa6b1 to 61c1e63 Compare February 2, 2022 13:22
@swiatekm swiatekm marked this pull request as ready for review February 2, 2022 13:38
@swiatekm swiatekm requested a review from a team as a code owner February 2, 2022 13:38
@swiatekm
Author

swiatekm commented Feb 2, 2022

> Autoscaling will probably end up with poor load balancing, like we observe with the current architecture, but I think it will be an improvement anyway.

I'm not convinced anyone will actually need more than three nginx instances in practice; the three replicas are only here for availability. In one real-world deployment, a customer is pushing ~13M samples per minute through nginx, and the total CPU usage is around 500 mCPU.

nginx also uses so few resources doing this that I think the additional complexity of an HPA isn't worth it.
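
As a rough back-of-the-envelope check of the figures quoted above, the sketch below converts them into per-second and per-replica numbers. The inputs are the approximate values from the comment; the variable names and the assumption that load is spread evenly across the three replicas are mine, not from the PR.

```python
# Approximate figures quoted in the comment above.
SAMPLES_PER_MINUTE = 13_000_000  # ~13M samples per minute
TOTAL_CPU_MILLICORES = 500       # ~500 mCPU total across all nginx Pods
REPLICAS = 3                     # replica count mentioned in the comment

samples_per_second = SAMPLES_PER_MINUTE / 60
samples_per_second_per_millicore = samples_per_second / TOTAL_CPU_MILLICORES
# Assumes an even split between replicas, which real traffic may not give.
millicores_per_replica = TOTAL_CPU_MILLICORES / REPLICAS

if __name__ == "__main__":
    print(f"samples/s:             {samples_per_second:,.0f}")
    print(f"samples/s per mCPU:    {samples_per_second_per_millicore:.0f}")
    print(f"mCPU per replica:      {millicores_per_replica:.0f}")
```

Under these assumptions each replica would sit at well under a fifth of a CPU core, which is consistent with the argument that an HPA adds little here.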

@swiatekm swiatekm force-pushed the feat/remote-write-proxy branch 3 times, most recently from a1cc929 to e5d56de Compare February 3, 2022 12:21
Contributor

@perk-sumo perk-sumo left a comment

LGTM overall pending some comments :)

CHANGELOG.md: 2 outdated review threads (resolved)
Mikołaj Świątek added 2 commits February 3, 2022 15:41
Prometheus remote write uses a single persistent HTTP connection per
target, which interacts poorly with TCP load balancing with iptables
that K8s Services do. Use a real HTTP load balancer for this instead -
nginx with a very simple configuration.
@swiatekm swiatekm enabled auto-merge (rebase) February 3, 2022 14:43
@swiatekm swiatekm merged commit 7462575 into main Feb 3, 2022
@swiatekm swiatekm deleted the feat/remote-write-proxy branch February 3, 2022 14:45
@perk-sumo perk-sumo added this to the v2.5 milestone Feb 8, 2022
Labels: documentation
5 participants