- Added dashboards for system and process data that was recently added to Puppet metrics collector.
- Added some new dashboards for ace and bolt data which was recently added to Puppet metrics collector.
- Added some new dashboards for deeper dives into some of the JVMs and puma services.
- Added a new orchestration services dashboard for ace, bolt and orch performance data.
- Updated spec test with the extra require that was added, and the new dashboards.
|
I think we need to "see" these :) |
|
http://10.234.1.30:3000/dashboards But don't mess anything up... I need to record a demo using that. :) |
|
Demo completed. Go nuts @tkishel ! |
|
These are looking great. The linking on the new dashboard is a bit odd. Some are linked and others are not. Should all of the deeper dives be linked? |
|
@jarretlavallee I tried to link the ones that were most likely to be the next step. For example, if you were in orch services, you would go to the ace, bolt, or orch deep dive, and would be unlikely to go to the puppetserver deep dive from there. |
So these all look great. Below are a few thoughts/questions I had while looking through them. Feel free to disregard any of the following.
- GC Stats: It may be nice to graph the `duration-ms` as a trend line. Maybe we should drop the Scavenge and graph MarkSweep count and duration?
- For things that will be common across multiple compilers, it may be worthwhile to have a sort order in `display`. That could help in scenarios where they have 20+ compilers and we want to see which line is on the top at a certain point.
- Orchestrator Services: Would it be worthwhile to add the file sync metrics? Since we now use a JRuby pool, it could help explain some patterns. Maybe some of the following on there?
  - deploy-queue.length
  - puppet-run-time
  - average-clone-time
  - average-fetch-time
These are pretty great. Thanks for building them.
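To make the sort-order suggestion concrete, here is a minimal sketch of ordering per-compiler series by their latest value so the busiest host is listed first. The host names and values are made up for illustration, and `sort_series` is a hypothetical helper, not something in the dashboards or the collector.

```python
# Hypothetical input: the most recent value of one metric, keyed by compiler host.
def sort_series(latest_by_host, descending=True):
    """Return (host, value) pairs ordered by value, breaking ties by host name."""
    sign = -1 if descending else 1
    return sorted(latest_by_host.items(), key=lambda kv: (sign * kv[1], kv[0]))


latest = {"compiler-03": 41.0, "compiler-01": 87.5, "compiler-02": 12.2}
ordered = sort_series(latest)

for host, value in ordered:
    print(f"{host}: {value}")
```

Because the ordering depends on each metric's own values, the same hosts can appear in a different order on every graph.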
|
@Sharpie Thoughts on these? |
During the demo, I thought ...
- Move the Archive File Sync out of General and Top and into the Deeper Dives group.
- Add "Bolt" to the (Bolt) Thread Pool chart title in the Orchestrator dashboard.
- Add the Ace Thread Pool chart to the Orchestrator dashboard.
- Deep dive: verify ace bolt thread pool charts (10 vs 100).
|
@jarretlavallee I updated this one GC graph: http://10.234.1.30:3000/d/kI83OvUWk/archive-puppetserver-jvm-performance But I couldn't figure out how to make a trend line. Googling implied I couldn't without munging the data in the db.

On the display sort: I can do that. I have no data to test with, but I think it would sort each graph independently, which means the order of servers would be different for each graph. Is that okay?

On the file sync stuff: I don't have any data from my runs, so it is hard for me to add those. Maybe that should be a separate PR. |
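For context on what a trend line would involve if it were computed outside the dashboard, here is a minimal sketch of an ordinary least-squares fit over (timestamp, `duration-ms`) samples. The sample values are invented and `linear_trend` is a hypothetical helper. This is not how the dashboards or the collector handle it.

```python
def linear_trend(points):
    """Fit y = slope * x + intercept by ordinary least squares.

    points is a list of (x, y) pairs, e.g. (seconds, duration-ms).
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up samples: one reading every 300 seconds.
samples = [(0, 100.0), (300, 110.0), (600, 120.0), (900, 130.0)]
slope, intercept = linear_trend(samples)
print(f"trend: {slope:.4f} ms per second, starting at {intercept:.1f} ms")
```

The fitted values would still have to be written back somewhere the dashboard can read, which matches the "munging the data in the db" concern.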
|
@tkishel On the orchestrator thread pool: Ace is already there. I had it use the right-side axis so that it could have a different range and be more visible. But maybe that made it hard to find? And what do you mean by verify? |
|
@RandellP That looks good. I made a quick right axis on the GC for the time. How does it look now? http://10.234.1.30:3000/d/kI83OvUWk/archive-puppetserver-jvm-performance?panelId=14&orgId=1&tab=general The other suggestions are fine to leave off for now. We can revisit in a follow-up PR if the need comes up. |
Removed Scavenge in favor of MarkSweep duration. Gave duration a right-side y-axis set to ms.
|
@jarretlavallee I just pushed up the changes for the gc-stats... I still need to test, but somehow my bolt installation on my laptop got updated and my plan that does the update doesn't work... so I will have to fix that later and test. Need Tom's answers anyway before we merge... :) |
These look good to me 👍
A couple things that could be added:
- Memory usage for Postgres (as a whole, sum PSS for all subprocesses).
- Puppet Server metrics from SERVER-1975
Both of those could be added in a subsequent changeset though as the Postgres stats may require updates to the collector and the Server metrics require the user to be running a version of PE released within the last 6 months.
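As a rough illustration of the "sum PSS for all subprocesses" idea, here is a minimal sketch that totals PSS across every process whose name starts with `postgres`. The record shape (`name`, `pss_kb`) and the sample values are assumptions made for this example. The collector's actual output format may differ.

```python
# Hypothetical per-process records; field names are assumed, not taken from the collector.
def postgres_pss_total(processes, prefix="postgres"):
    """Sum PSS (in kB) over all processes whose name starts with the prefix."""
    return sum(p["pss_kb"] for p in processes if p["name"].startswith(prefix))


sample_processes = [
    {"name": "postgres", "pss_kb": 1000},
    {"name": "postgres: checkpointer", "pss_kb": 200},
    {"name": "postgres: walwriter", "pss_kb": 50},
    {"name": "java", "pss_kb": 5000},
]
total_kb = postgres_pss_total(sample_processes)
print(f"Postgres PSS total: {total_kb} kB")
```

PSS divides shared pages among the processes that map them, so adding it up across subprocesses does not double-count shared memory the way summing RSS would.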