This repository has been archived by the owner on Jun 1, 2023. It is now read-only.

(SLV-788) New and updated dashboards #89

Merged
merged 3 commits into from
Feb 20, 2020

Conversation

Contributor

@RandellP RandellP commented Feb 7, 2020

Added dashboards for the system and process data that was recently added to Puppet metrics collector.
Added new dashboards for the ace and bolt data that was recently added to Puppet metrics collector.
Added new dashboards for deeper dives into some of the JVMs and Puma services.
Added a new orchestration services dashboard for ace, bolt, and orch performance data.

Updated the spec test for the extra require that was added and for the new dashboards.
@RandellP RandellP changed the title from "(SLV-788) New and updated dashboards" to "**Do Not Merge** (SLV-788) New and updated dashboards" Feb 10, 2020
@tkishel
Contributor

tkishel commented Feb 10, 2020

I think we need to "see" these :)

@RandellP
Contributor Author

http://10.234.1.30:3000/dashboards But don't mess anything up... I need to record a demo using that. :)

@johnduarte

Demo completed. Go nuts @tkishel !

@jarretlavallee
Contributor

These are looking great. The linking on the new dashboard is a bit odd. Some are linked and others are not. Should all of the deeper dives be linked?

@RandellP
Contributor Author

@jarretlavallee I tried to link the ones that were most likely to be the next step. For example, if you were in orch services, you would go to the ace, bolt, or orch deep dive; you would not be likely to go to the puppetserver deep dive from there.

Contributor

@jarretlavallee jarretlavallee left a comment

So these all look great. Below are a few thoughts/questions I had while looking through them. Feel free to disregard any of the following.

  • GC Stats: It may be nice to graph the duration-ms as a trend line. Maybe we should drop the scavenge and graph MarkSweep count and duration?
  • For things that will be common across multiple compilers, it may be worthwhile to have a sort order in display. That could help in scenarios where they have 20+ compilers and we want to see which line is on the top at a certain point.
  • Orchestrator Services: Would it be worthwhile to add the file sync metrics? Since we now use a JRuby pool, it could help explain some patterns. Maybe some of the following on there?
    • deploy-queue.length
    • puppet-run-time
    • average-clone-time
    • average-fetch-time

These are pretty great. Thanks for building them.

@jarretlavallee
Contributor

@Sharpie Thoughts on these?

Contributor

@tkishel tkishel left a comment

During the demo, I thought ...

  • Move the Archive File Sync out of General and Top and into the Deeper Dives group.
  • Add "Bolt" to the (Bolt) Thread Pool chart title in the Orchestrator dashboard.
  • Add the Ace Thread Pool chart to the Orchestrator dashboard.
  • Deep dive: verify ace bolt thread pool charts (10 vs 100).

https://tickets.puppetlabs.com/browse/SLV-793

@RandellP
Contributor Author

@jarretlavallee I updated this one GC graph... http://10.234.1.30:3000/d/kI83OvUWk/archive-puppetserver-jvm-performance But I couldn't figure out how to make a trend line. Googling implied I couldn't without munging the data in the db...
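
For reference, a trend line computed outside the dashboard might look like the following. This is only a sketch, assuming the duration-ms samples have already been exported as (time, value) pairs; `linear_trend` and the sample numbers are hypothetical and not part of this PR.

```python
from typing import Sequence, Tuple


def linear_trend(times: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (time, value) pairs of equal length")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("all timestamps are identical")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


# Hypothetical samples: seconds since the first sample, GC duration in ms.
sample_times = [0, 300, 600, 900]
sample_durations_ms = [120.0, 130.0, 140.0, 150.0]

slope, intercept = linear_trend(sample_times, sample_durations_ms)
# Points of the fitted line, one per original timestamp.
trend = [slope * t + intercept for t in sample_times]
```

The fitted points in `trend` would still have to be written back as their own series before a panel could draw them, which is the extra data handling mentioned above.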

On the display sort... I can do that. I have no data to test with, but I think that would mean each graph sorts on its own values, so the order of servers would be different for each graph. Is that okay?
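
To make the per-graph ordering concrete, here is a small sketch. It assumes each graph sorts its series by the most recent value; the compiler names, metric names, and numbers are made up for illustration.

```python
from typing import Dict, List


def order_by_latest(series: Dict[str, List[float]]) -> List[str]:
    """Return server names sorted by their most recent value, highest first."""
    return sorted(series, key=lambda name: series[name][-1], reverse=True)


# Hypothetical data for two graphs covering the same three compilers.
heap_used = {
    "compiler-a": [40.0, 55.0],
    "compiler-b": [60.0, 48.0],
    "compiler-c": [30.0, 70.0],
}
gc_duration_ms = {
    "compiler-a": [12.0, 9.0],
    "compiler-b": [15.0, 20.0],
    "compiler-c": [8.0, 11.0],
}

heap_order = order_by_latest(heap_used)
gc_order = order_by_latest(gc_duration_ms)
```

With this data the two graphs list the same compilers in different orders, so a given server would not sit in the same legend position across graphs.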

On the file sync stuff. I don't have any data from my runs, so it is hard for me to add those. Maybe that should be a separate PR.

@RandellP
Contributor Author

@tkishel On the orchestrator thread pool. Ace is already there... I had it use the right-side axis so that it could have a different range and be more visible. But maybe that made it hard to find? And what do you mean by verify?
I moved file sync to deeper dives and pushed that commit.

@jarretlavallee
Contributor

@RandellP That looks good. I made a quick right axis on the GC for the time. How does it look now? http://10.234.1.30:3000/d/kI83OvUWk/archive-puppetserver-jvm-performance?panelId=14&orgId=1&tab=general

The other suggestions are fine to leave off for now. We can revisit if the need comes up with a follow up PR.

Removed Scavenge in favor of MarkSweep duration. Gave duration a right-side y axis set to ms.
@RandellP
Contributor Author

@jarretlavallee I just pushed up the changes for the gc-stats... I still need to test, but somehow my bolt installation on my laptop got updated and my plan that does the update doesn't work... so I will have to fix that later and test. Need Tom's answers anyway before we merge... :)

Member

@Sharpie Sharpie left a comment

These look good to me 👍

A couple things that could be added:

  • Memory usage for Postgres (as a whole, sum PSS for all subprocesses).
  • Puppet Server metrics from SERVER-1975

Both of those could be added in a subsequent changeset though as the Postgres stats may require updates to the collector and the Server metrics require the user to be running a version of PE released within the last 6 months.
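
The "sum PSS for all subprocesses" idea can be sketched as below. This assumes per-process samples are available as a name plus a PSS value in kB; `ProcSample`, `total_pss_kb`, and the numbers are hypothetical, and the collector may not expose the data in this shape.

```python
from typing import Iterable, NamedTuple


class ProcSample(NamedTuple):
    """One per-process memory sample (hypothetical shape)."""
    name: str
    pss_kb: int


def total_pss_kb(samples: Iterable[ProcSample], process_name: str = "postgres") -> int:
    """Sum PSS across every sampled process with the given name."""
    return sum(s.pss_kb for s in samples if s.name == process_name)


# Hypothetical samples taken at one point in time.
samples = [
    ProcSample("postgres", 20480),
    ProcSample("postgres", 10240),
    ProcSample("java", 512000),
    ProcSample("postgres", 5120),
]

postgres_total_kb = total_pss_kb(samples)
```

Summing PSS rather than RSS avoids counting memory shared between the Postgres subprocesses more than once, which is why the total is a reasonable figure for the service as a whole.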

@jarretlavallee jarretlavallee merged commit 5975685 into master Feb 20, 2020
@johnduarte johnduarte deleted the SLV-788 branch February 21, 2020 18:08
6 participants