Add more detailed hardware metrics to the collection and output of validator-info #1669

Open
WadeBarnes opened this issue Mar 21, 2021 · 4 comments

@WadeBarnes
WadeBarnes commented Mar 21, 2021

The output of validator-info currently returns limited hardware metrics for a given node. To better facilitate node monitoring as well as continuous node compliance monitoring, validator-info should report a more complete set of hardware metrics including:

  • Hard Disk Stats
    • Total size (amount), available (amount), and used (amount and percent) for all volumes.
    • Total size (amount), available (amount), and used (amount and percent) for indy-node specifically.
  • Memory and CPU
    • The CPU and memory use of the node processes is already reported; however, the output should be extended with details on overall system memory and CPU usage and load.
  • Network Interface IP Binding
    • Information regarding NIC and IP address bindings.
    • To be used to ensure the node has been configured with separate IP addresses, bound to separate NICs, and assigned to different subnets.
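
As a rough illustration of the metrics listed above, here is a minimal sketch using only the Python standard library. It is not the validator-info implementation; the function names and dictionary keys are hypothetical, and the indy-node data path is deployment-specific, so it is passed in rather than assumed. System-wide memory figures are not available from the standard library and would need something like psutil, which is left out here.

```python
import ipaddress
import os
import shutil


def disk_stats(path="/"):
    """Total, used, and free bytes plus percent used for the volume holding `path`."""
    usage = shutil.disk_usage(path)
    percent = round(usage.used / usage.total * 100, 1) if usage.total else 0.0
    return {
        "path": path,
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "used_percent": percent,
    }


def system_load():
    """CPU count and 1/5/15-minute load averages (os.getloadavg is Unix only)."""
    one, five, fifteen = os.getloadavg()
    return {
        "cpu_count": os.cpu_count(),
        "load_1m": one,
        "load_5m": five,
        "load_15m": fifteen,
    }


def bindings_are_separated(bindings):
    """Check a list of (nic_name, "ip/prefix") pairs.

    Returns True only if every binding uses a different NIC, a different
    IP address, and a different subnet.
    """
    interfaces = [ipaddress.ip_interface(addr) for _, addr in bindings]
    nics = {nic for nic, _ in bindings}
    ips = {iface.ip for iface in interfaces}
    subnets = {iface.network for iface in interfaces}
    count = len(bindings)
    return len(nics) == count and len(ips) == count and len(subnets) == count
```

For example, `bindings_are_separated([("eth0", "10.0.1.5/24"), ("eth1", "10.0.2.5/24")])` passes the check, while two addresses in the same /24 would not. The interface names and addresses here are made up for illustration.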

All of this information should be included in the output of validator-info on the node itself and in the response to authenticated calls to the get-validator-info transaction. The results of the two currently appear to differ, with validator-info returning more information.

This is to address the HDD, memory, and CPU resource discussions here: hyperledger/indy-node-monitor#24 (comment)

Requirements:

@lohanspies

We should also potentially include RAID 1 detection as this is a technical policy requirement. Unless this will fall into the technical policy checks.

@WadeBarnes

> We should also potentially include RAID 1 detection as this is a technical policy requirement. Unless this will fall into the technical policy checks.

@lohanspies, Would you be able to provide a link to the associated document please?

@lohanspies

lohanspies commented Mar 24, 2021

https://sovrin.org/wp-content/uploads/Steward-Technical-and-Organizational-Policies-V2.pdf
Node Technical Policies number 7
"MUST have at least 1 TB, with the ability to grow to 2 TB, of reliable (e.g., RAIDed) disk space, with an adequately sized boot partition."
The policy doesn't specifically mention RAID as a requirement, though it is being checked here - https://github.com/sovrin-foundation/steward-tools/blob/4b746d7d3a3ccd5981c9c984df39dab44258c2fc/steward_tech_check.py#L101
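
For reference, one way RAID 1 could be detected on Linux is by parsing `/proc/mdstat`. The sketch below is only an assumption about how such a check might look, not how steward_tech_check.py does it; the function names are hypothetical. It only sees Linux software RAID (md) arrays, so hardware RAID controllers or ZFS/LVM mirrors would go undetected.

```python
def read_mdstat(path="/proc/mdstat"):
    """Return the contents of /proc/mdstat, or an empty string if unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


def raid1_arrays(mdstat_text):
    """Return the names of active md arrays whose level is raid1.

    Array lines look like: "md0 : active raid1 sdb1[1] sda1[0]".
    """
    arrays = []
    for line in mdstat_text.splitlines():
        parts = line.split()
        if (
            len(parts) >= 4
            and parts[0].startswith("md")
            and parts[1] == ":"
            and parts[2] == "active"
            and "raid1" in parts[3:]
        ):
            arrays.append(parts[0])
    return arrays
```

A node would then be flagged if `raid1_arrays(read_mdstat())` comes back empty, keeping in mind the limitations noted above.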

@WadeBarnes

> https://sovrin.org/wp-content/uploads/Steward-Technical-and-Organizational-Policies-V2.pdf
> Node Technical Policies number 7
> "MUST have at least 1 TB, with the ability to grow to 2 TB, of reliable (e.g., RAIDed) disk space, with an adequately sized boot partition."
> The policy doesn't specifically mention RAID as a requirement, though it is being checked here - https://github.com/sovrin-foundation/steward-tools/blob/4b746d7d3a3ccd5981c9c984df39dab44258c2fc/steward_tech_check.py#L101

Collection of that information would be covered under this ticket, which asks for the metrics collected by the script to be integrated into validator-info: #1670
