Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up nhc_hw_gather_data function for KNL #47

Closed
wants to merge 1 commit into from

Conversation

NateCrawford
Copy link

Reduces number of access to /proc/cpuinfo in the nhc_hw_gather_data function. For processors like Intel Phi KNL with 256+ threads, the function was taking over 40 seconds to return. Running on a normal file, the function should only take about 4 seconds. Pre-filtering with grep shortens the function runtime further to 1.5s on a 1.4GHz (64 core, 256 thread) KNL.

Reduces number of access to /proc/cpuinfo in the nhc_hw_gather_data function. For processors like Intel Phi KNL with 256+ threads, the function was taking over 40 seconds to return. Running on a normal file, the function should only take about 4 seconds. Pre-filtering with grep shortens the function runtime further to 1.5s on a 1.4GHz (64 core, 256 thread) KNL.
@Obihoernchen
Copy link

This one is great!
On a 4 socket intel system I get the following speedup:

# stock
$ time nhc

real	0m13,799s
user	0m2,123s
sys	0m11,618s

# patched
$ time nhc

real	0m2,098s
user	0m1,956s
sys	0m0,134s

@mej mej modified the milestones: 1.4.3 Release, 1.4.4 Release May 16, 2021
@strus38
Copy link

strus38 commented Jun 30, 2021

Hi
Why this PR is pending for so long?
Is there any particular issue we should be aware of before mergin this PR into a release?

mej added a commit that referenced this pull request Jan 31, 2023
This branch/changeset extensively refactors `lbnl_hw.nhc` as a
possible solution/improvement for the `/proc` file I/O problem (e.g.,

 - Split `nhc_hw_gather_data()` into distinct functions per section,
   allowing more fine-grained control over which `procfs` and/or
   `sysfs` files actually need to be read and parsed in the first
   place;
 - Convert the path+filename for each file being parsed into a
   configuration variable that users can customize as needed;
 - Alter the way NHC is pulling data from each file, by using a single
   `read` invocation, to avoid the
   [`lseek()`/rebuild problems described](#43 (comment))
   by @mattmix in #43.
 - Add support for using a cached copy of the original file (instead
   of reading directly from `/proc` or `/sys`) to avoid the resulting
   poor performance; and
 - Allow for the aforementioned configuration variables to specify a
   process substitution expression (see `bash(1)` for details) rather
   than a path+filename.  This, in turn, permits the user to, for
   example, `grep` out unused lines to minimize parsing time.  (For a
   specific example and rationale, see
   [this comment](#30 (comment))
   from @NateCrawford with relevant performance comparisons.)
 - Improve the unit tests for this module; now that the hardware
   checks are capable of reading from a user-defined location, the
   unit tests feed auto-generated data into the checks via the
    `/dev/stdin` "file" (really a shell variable).  Not only does this
   verify the new "user-specified custom data source" code, it also
   expands the "code coverage" of the unit tests.  The old tests just
   directly assigned the test data to the right NHC-internal
   variables; the new tests cover the parsing code as well, not just
   the checks themselves.

These changes are intended to fix #30, #39, #43, #47, and #118 as well
as some older LANL-internal issues with
[Trinity](https://www.lanl.gov/projects/trinity/specifications.php)
(our Haswell/KNL-based, nineteen-thousand-node HPE/Cray XC40).

And with respect to Trinity, I would be remiss were I to fail to
express my sincere thanks to @grahamvh, my colleague at @lanl and one
of the main sysadmins for that system, who helped me immensely in
brainstorming, devising potential solutions, testing, and providing
critical feedback en route toward finally getting this problem licked!

Feedback on this approach is much appreciated!
@mej
Copy link
Owner

mej commented Jun 6, 2023

Fixed via #121 instead; closing. NHC goes to great lengths to avoid spawning subshells or other processes wherever and whenever possible. The culprit turned out to be the way the kernel and Bash handle /proc filesystem reads, so in lieu of grep, a more efficient, one-shot read-and-store method was chosen instead.

@mej mej closed this Jun 6, 2023
@NateCrawford NateCrawford deleted the patch-1 branch June 2, 2024 23:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants