Create tool to make it easy to share debug information #1551

Closed
grobie opened this Issue Apr 12, 2016 · 13 comments

7 participants
@grobie
Member

grobie commented Apr 12, 2016

A common request when debugging issues is to ask users for various pprof information, for example in #1549.

To make that process easier for users and less time-intensive for developers, it'd be great to have a promtool debug share-profile command to automate the process of:

  • getting a profile
  • creating SVG/whatever on it
  • optionally uploading and attaching it to a github issue.
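The first step in the list above could be sketched in Go roughly as follows. This is only an illustration under assumptions, not the eventual promtool implementation: the helper names profileURL and saveProfile, the binary name in the usage message, and the output file name heap.pb.gz are all hypothetical. It relies on the heap profile that Go's net/http/pprof handler serves under /debug/pprof/heap.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// profileURL builds the URL of a named pprof profile below a Prometheus
// base URL. (Hypothetical helper, not part of promtool.)
func profileURL(base, profile string) string {
	return strings.TrimRight(base, "/") + "/debug/pprof/" + profile
}

// saveProfile downloads one profile and writes it to path unchanged.
func saveProfile(client *http.Client, base, profile, path string) error {
	resp, err := client.Get(profileURL(base, profile))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("fetching %s: unexpected status %s", profile, resp.Status)
	}
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	if _, err := io.Copy(f, resp.Body); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

func main() {
	if len(os.Args) != 2 {
		// Assumed invocation; the real command name was still open.
		fmt.Fprintln(os.Stderr, "usage: share-profile <prometheus url>")
		return
	}
	if err := saveProfile(http.DefaultClient, os.Args[1], "heap", "heap.pb.gz"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("wrote heap.pb.gz")
}
```

The saved file could then be rendered with go tool pprof -svg heap.pb.gz > heap.svg, which needs Graphviz installed. Uploading to an issue is left out of the sketch.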
@juliusv

Member

juliusv commented Apr 12, 2016

I was always thinking of a web tool that we could offer where people could enter their Prometheus URL and it would pull interesting info out of it automatically via AJAX and store it as a permalink somehow. However, that would only work for API endpoints, because those have CORS enabled. Of course, we could add more info to those as well.

@grobie

Member Author

grobie commented Apr 12, 2016

That might be even easier. I think all information about the Prometheus server should be exposed via the API.

@grobie grobie changed the title Add debug tools to promtool Create tool to make it easy to share debug information Aug 18, 2016

@chyeh

Contributor

chyeh commented May 3, 2018

Hello, I'm new here! I would like to know if anyone is working on this issue. Any update?

@juliusv

Member

juliusv commented May 3, 2018

@chyeh Hi and welcome! I don't think anyone is currently working on this, but it would still be something that's nice to have.

@chyeh

Contributor

chyeh commented May 3, 2018

@juliusv thanks for your reply. After reading through this thread, I feel this issue needs some further discussion. My question is: what exactly is the debug information that should be exposed? Before implementing anything, I'd like to confirm the spec of the API with you.

@juliusv

Member

juliusv commented May 6, 2018

Here's an initial list of details that would be useful for debugging:

  • Information that can be gained via Prometheus's query API:
    • All current values of metrics about Prometheus itself (only works if Prom is scraping itself, otherwise we have to scrape its /metrics)
    • Derived expressions on those metrics like rate(prometheus_tsdb_head_samples_appended_total[1m]). Also requires Prom to scrape itself.
    • Information relating to target scrapes like scrape_duration_seconds, etc.
  • Information that can be retrieved via other API endpoints:
    • The Prometheus configuration (including rules), available via /api/v1/status/config and /api/v1/status/flags. Rules are not exposed yet via API, but should be.
    • Information about targets. Available via /api/v1/targets.
    • Information about connected Alertmanagers via /api/v1/alertmanagers.
  • Information available via /debug/pprof (usually obtained via go tool pprof):
    • CPU profile
    • Heap profile
    • goroutine dump
    • etc.

So, much of the info can be pulled from the APIs that have CORS, but we might also need to scrape /metrics on the Prometheus itself and fetch info from the /debug/pprof endpoints. Both will likely never be available over the main API. Even if a browser could contact /debug/pprof endpoints, it would be hard to interpret the data without running go tool pprof.

Given that, I think that a command-line tool is the best fit, since it does not have the limitations that a browser has in terms of CORS or being able to connect to a Prometheus server on another network, and it can run other tools like go tool pprof (if available) too.

Making it part of promtool like @grobie initially suggested makes sense then.
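As a rough illustration of what such a command-line collector might do, here is a minimal Go sketch that walks the endpoints named in the list above and runs one instant query. It is a sketch under assumptions: the helper names (queryURL, fileName, fetch), the output file names, and the choice of which pprof profiles to pull are hypothetical, and errors are only reported, not retried.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strings"
)

// endpoints lists paths mentioned in the discussion above.
var endpoints = []string{
	"/api/v1/status/config",
	"/api/v1/status/flags",
	"/api/v1/targets",
	"/api/v1/alertmanagers",
	"/metrics",
	"/debug/pprof/heap",
	"/debug/pprof/goroutine",
}

// queryURL builds an instant-query URL for the given PromQL expression.
func queryURL(base, expr string) string {
	return strings.TrimRight(base, "/") + "/api/v1/query?" + url.Values{"query": {expr}}.Encode()
}

// fileName flattens an endpoint path into a file name (hypothetical scheme).
func fileName(path string) string {
	return strings.ReplaceAll(strings.Trim(path, "/"), "/", "_")
}

// fetch returns the response body of a GET request, or an error on a non-200 status.
func fetch(client *http.Client, rawURL string) ([]byte, error) {
	resp, err := client.Get(rawURL)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("%s: unexpected status %s", rawURL, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// save fetches one URL and writes the body to a local file.
func save(rawURL, name string) {
	body, err := fetch(http.DefaultClient, rawURL)
	if err == nil {
		err = os.WriteFile(name, body, 0644)
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, "skipping:", err)
		return
	}
	fmt.Println("wrote", name)
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: collect <prometheus url>")
		return
	}
	base := strings.TrimRight(os.Args[1], "/")
	for _, ep := range endpoints {
		save(base+ep, fileName(ep))
	}
	// Only returns data if Prometheus scrapes itself, as noted above.
	save(queryURL(base, "rate(prometheus_tsdb_head_samples_appended_total[1m])"), "query_samples_appended.json")
}
```

Because this runs outside a browser, CORS does not apply, and the same loop can reach /metrics and /debug/pprof as well as the API.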

@chyeh

Contributor

chyeh commented Jun 3, 2018

@juliusv sorry for the late reply! After a month I'm finally able to switch back to this issue. I've pushed some commits to my own branch and I want to check whether they're what we need for this issue. In summary, I added two commands:

  • promtool debug metrics <prometheus url>: Prints the text from the /metrics page and saves it as metrics.txt.
  • promtool debug pprof <prometheus url>: Generates block.pb.gz, goroutine.pb.gz, heap.pb.gz,
    mutex.pb.gz and threadcreate.pb.gz, and also dumps the message of each profile in the CLI.

Does this look like what we need here? Let me know if you have any ideas. Thanks!

@juliusv

Member

juliusv commented Jun 3, 2018

@chyeh Hey, that looks pretty good already, without reviewing everything in detail yet! While I like the multi-command approach from a cleanliness perspective, I wonder if it should just be one big command that pulls all the debug information together. The main goal is for users to run only one easy step that gives us all the necessary info so that we can help them, so the simpler the better. So it would also be good if it created one large archive with everything in the end. What do you think? Maybe if some people are worried about including certain information (like /metrics), there could still be a command-line flag that disables some of the debug collectors later.
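One possible shape for such an opt-out flag, sketched in Go with the standard flag package. Everything here is hypothetical: the flag name -skip, the collector names, and the helper enabledCollectors are made up for illustration and are not taken from promtool.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// collectors is a hypothetical set of debug collectors.
var collectors = []string{"metrics", "pprof", "config", "targets"}

// enabledCollectors returns all collectors except those named in the
// comma-separated skip list. Unknown names in skip are ignored.
func enabledCollectors(all []string, skip string) []string {
	skipped := make(map[string]bool)
	for _, name := range strings.Split(skip, ",") {
		if name = strings.TrimSpace(name); name != "" {
			skipped[name] = true
		}
	}
	var enabled []string
	for _, name := range all {
		if !skipped[name] {
			enabled = append(enabled, name)
		}
	}
	return enabled
}

func main() {
	// Hypothetical flag: everything is collected unless explicitly skipped.
	skip := flag.String("skip", "", "comma-separated collectors to leave out, e.g. metrics")
	flag.Parse()
	for _, name := range enabledCollectors(collectors, *skip) {
		fmt.Println("would collect:", name)
	}
}
```

Defaulting to "collect everything" keeps the common case a single step, which matches the stated goal; users who worry about sharing /metrics would pass -skip=metrics.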

@chyeh

Contributor

chyeh commented Jun 3, 2018

How about the following design:

  • promtool debug: Create debug.tar.gz which includes all the files
  • promtool debug pprof: Create debug.tar.gz which includes 5 profiles.
  • promtool debug metrics: Create debug.tar.gz which includes 1 text file.

And I just realized that the /metrics path is configurable so it will need to read the metrics_path field from /api/v1/status/config first. For /debug/pprof, if it's configurable in the future, it will also need the same thing! I will get this done soon.

Also I'll think about how to add the tests and probably do some refactoring before submitting a PR.

@juliusv

Member

juliusv commented Jun 3, 2018

@chyeh Great!

> promtool debug: Create debug.tar.gz which includes all the files

Maybe promtool debug all or something like that would be more explicit. But once it's implemented it should be easy to change the exact command invocation, and promtool isn't covered under our major version stability guarantees, so we can still figure it out.

@chyeh

Contributor

chyeh commented Jun 9, 2018

@juliusv I just submitted PR #4247. Forget about the metrics_path I mentioned above; that was because I wasn't familiar with the configuration file.

@krasi-georgiev

Member

krasi-georgiev commented Jul 18, 2018

implemented in #4247

@lock


lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 22, 2019
