vm_promscrape proxy_url use wrong Host header #2794

Closed
jtorkkel opened this issue Jun 28, 2022 · 6 comments
Labels
bug, vmagent

Comments

@jtorkkel

jtorkkel commented Jun 28, 2022

When vmagent scrapes a target through proxy_url, the Host header contains the proxy server address instead of the target service host name.
curl and Prometheus put the target service address in the Host header, which appears to be the correct behavior.

proxy_url is configured this way:

- job_name: docker
  proxy_url: http://ap-tools.xxx.net:3128
  static_configs:
    - targets: ['ap-dock.xxx.net:3333'] 
  #kubernetes_sd_configs

request from vm:

GET http://ap-dock.oneadr.net:3333/metrics HTTP/1.1
User-Agent: vm_promscrape
Host: ap-tools.xxx.net:3128
Accept: text/plain;version=0.0.4;q=1,*/*;q=0.1
X-Prometheus-Scrape-Timeout-Seconds: 30.000
Accept-Encoding: gzip

Request from Prometheus and from "curl -x http://ap-tools.xxx.net:3128 http://ap-dock.oneadr.net:3333/metrics":

GET http://ap-dock.oneadr.net:3333/metrics HTTP/1.1
User-Agent: vm_promscrape
Host: ap-tools.xxx.net:3128
Accept: text/plain;version=0.0.4;q=1,*/*;q=0.1
X-Prometheus-Scrape-Timeout-Seconds: 30.000
Accept-Encoding: gzip
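
For reference, the expected shape of a plain-HTTP request sent to a forward proxy can be sketched in Python. This is only an illustration, not vmagent or Prometheus code: `build_proxy_request` and the `example-scraper` user agent are hypothetical, and the host name is the one used in this report. The request line carries the absolute target URL, and the Host header carries the target's host and port.

```python
from urllib.parse import urlsplit


def build_proxy_request(target_url: str, user_agent: str = "example-scraper") -> str:
    """Build a plain-HTTP GET request as it would be sent to a forward proxy.

    The request line uses the absolute form of the target URL, and the Host
    header carries the target's host[:port], not the proxy's address.
    """
    parts = urlsplit(target_url)
    if parts.scheme != "http" or not parts.netloc:
        raise ValueError("expected an absolute http:// URL")
    lines = [
        f"GET {target_url} HTTP/1.1",
        f"Host: {parts.netloc}",  # target authority, e.g. ap-dock.oneadr.net:3333
        f"User-Agent: {user_agent}",
        "Accept-Encoding: gzip",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines)


request = build_proxy_request("http://ap-dock.oneadr.net:3333/metrics")
print(request)
```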

proxy_url allows scraping metrics through a proxy service. It is needed, for example, to scrape metrics from outside Kubernetes or Docker through a proxy installed inside the cluster.

Some proxies, such as NGINX, do not handle the vmagent request correctly by default (or require complex configuration). The Prometheus request is easier to handle because its Host header is already correct. Other proxies, such as Squid, can handle the vmagent request, apparently because they regenerate the Host header automatically.
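
The Squid behavior described above is consistent with RFC 7230, section 5.4, which asks a proxy that receives an absolute-form request target to ignore the received Host header and use the host from the request target instead. A minimal Python sketch of that rule follows. It is not Squid's or NGINX's implementation, and `effective_host` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def effective_host(request_target: str, host_header: str) -> str:
    """Return the host[:port] a proxy should forward the request to.

    For an absolute-form request target (http://host:port/path) the authority
    in the target wins and the received Host header is ignored. For an
    origin-form target (/path) the Host header is the only source.
    """
    parts = urlsplit(request_target)
    if parts.scheme and parts.netloc:
        return parts.netloc
    return host_header


# The request reported in this issue: absolute-form target, proxy address in Host.
print(effective_host("http://ap-dock.oneadr.net:3333/metrics", "ap-tools.xxx.net:3128"))
```

A proxy that routes on the Host header alone, without applying this rule, would see the proxy's own address and fail to find the target.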

To Reproduce
Add proxy_url to a scrape job and check the Host header value of the outgoing request.
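
One way to inspect the outgoing Host header without a real proxy is a small local listener that records what it receives. The Python sketch below is an assumption about how one might do this, not part of the original report: `RecordingProxy` and `seen` are hypothetical names, the listener does not forward anything, and the self-check at the end uses Python's own `http.client` with an explicit Host header only to show that the recorder works. To test vmagent, keep the server running and point proxy_url at the printed address.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen = []  # (request target, Host header) pairs observed by the fake proxy


class RecordingProxy(BaseHTTPRequestHandler):
    """Records the request target and Host header, then returns a stub body."""

    def do_GET(self):
        seen.append((self.path, self.headers.get("Host")))
        body = b"# recorded\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


# Port 0 lets the OS pick a free port.
server = HTTPServer(("127.0.0.1", 0), RecordingProxy)
threading.Thread(target=server.serve_forever, daemon=True).start()
proxy_port = server.server_address[1]
print(f"point proxy_url at http://127.0.0.1:{proxy_port}")

# Self-check: send a proxy-style request with the expected Host header.
conn = http.client.HTTPConnection("127.0.0.1", proxy_port, timeout=5)
conn.request(
    "GET",
    "http://ap-dock.xxx.net:3333/metrics",
    headers={"Host": "ap-dock.xxx.net:3333"},
)
conn.getresponse().read()
conn.close()

server.shutdown()
server.server_close()
print(seen)
```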

Expected behavior
The Host header of the request should contain the target service address.

Logs
Check if any warnings or errors were logged by VictoriaMetrics components
or components in communication with VictoriaMetrics (e.g. Prometheus, Grafana).

Version
The line returned when passing --version command line flag to the binary. For example:

$ ./victoria-metrics-prod --version
victoria-metrics-20220621-070434-tags-v1.78.0-0-g091408be6

Used command-line flags
--promscrape.config=prometheus.yml

YuriKravetc added the bug label Jun 28, 2022
@f41gh7
Contributor

f41gh7 commented Jun 30, 2022

As a workaround, try setting stream_parse: true for this target:


- job_name: docker
  proxy_url: http://ap-tools.xxx.net:3128/
  stream_parse: true
  static_configs:
    - targets: ['ap-dock.xxx.net:3333']
  #kubernetes_sd_configs

@valyala
Collaborator

valyala commented Jul 6, 2022

The issue should be fixed in commit f97355d. This commit will be included in the next release. In the meantime, it is possible to build vmagent from this commit according to these instructions and verify whether it properly sets the Host header when scraping targets via an HTTP proxy.

@jtorkkel
Author

I will validate once back from holiday.

@valyala
Collaborator

valyala commented Jul 14, 2022

@jtorkkel, if you run vmagent in Docker or Kubernetes, you can try the following vmagent image with the bugfix: docker.io/victoriametrics/vmagent:heads-master-0-gf9500abfe

@valyala
Collaborator

valyala commented Jul 14, 2022

FYI, vmagent should properly set the Host header when scraping targets via an HTTP proxy starting from v1.79.0. Closing the issue as fixed.

@jtorkkel, feel free to reopen this issue or create a new one if you experience the issue in v1.79.0.

valyala closed this as completed Jul 14, 2022
@jtorkkel
Author

jtorkkel commented Aug 8, 2022

Verified to work in the v1.79.0 single-node version.

valyala added a commit that referenced this issue Jan 30, 2024
…Client for scraping targets in non-streaming mode

While fasthttp.Client uses less CPU and RAM when scraping targets with small responses (up to 10K metrics),
it doesn't work well when scraping targets with big responses such as kube-state-metrics.
In this case it could use big amounts of additional memory comparing to net/http.Client,
since fasthttp.Client reads the full response in memory and then tries re-using the large buffer
for further scrapes.

Additionally, fasthttp.Client-based scraping had various issues with proxying, redirects
and scrape timeouts like the following ones:

- #1945
- #5425
- #2794
- #1017

This should help reducing memory usage for the case when target returns big response
and this response is scraped by fasthttp.Client at first before switching to stream parsing mode
for subsequent scrapes. Now the switch to stream parsing mode is performed on the first scrape
after reading the response body in memory and noticing that its size exceeds the value passed
to -promscrape.minResponseSizeForStreamParse command-line flag.
Updates #5567

Overrides #4931