Encode help-*.txt file contents into C source code #13144

jsquyres · 2025-03-15T17:55:26Z

Per discussion from https://github.com/open-mpi/ompi/wiki/Meeting-2025-03-14, we would like to make relocating an Open MPI installation easier. Specifically, text help files are something that Open MPI currently has to find in the install tree filesystem at run-time. If, instead, we can slurp these text files into C code, then there's nothing to find at run time.

That being said, even with this change, developers will still maintain help messages in the various help_*.txt files around the code base. Maintaining descriptive, user-friendly help messages in text files (instead of manually hand-coding long strings in C) has proven to be quite useful. We do not want to lose this capability.

This commit adds a step during "make" (in opal/util), those help_*.txt files are encoded into an indexed array of C strings in opal/util/show_help_content.c.

opal_show_help*() then can look up the appropriate help strings via filename / topic tuples, just like it used to -- these strings now just happen to be in C variables instead of text files.

This work is intended for main / v6.0.x -- not for v5.0.x.

rhc54 · 2025-03-15T19:38:31Z

Suggestion: have you looked at the PMIx dictionary files (src/include/pmix_dictionary.[h,c])? I generate those (using contrib/construct_dictionary.py) from the include file from the standard, generating an array of structs using the string key as the index.

I was thinking of doing the same thing here. Let the name of the help file be the first field in the struct, then the topic be the second field, and then the text be the final field. This leaves the "show_help" signature the same so we don't have to edit and change all those usages. Now show_help itself just scans thru the array, checking the first field in each struct for the file name and the second for the topic - and then extracts the corresponding block.

Python isn't much different - just stored the results a little differently.

jsquyres · 2025-03-16T17:55:41Z

Suggestion: ...

Good idea; that's far better than what I did the first time. Let me update the PR to do that...

jsquyres · 2025-03-16T20:10:27Z

Let me update the PR to do that...

Done. Updated the description in the PR to reflect the new scheme.

rhc54 · 2025-03-16T20:50:33Z

This looks good to me - many thanks for doing it!!

jsquyres · 2025-03-18T14:59:09Z

Force pushed to fix an OPAL_DECLSPEC problem and some other minor touch-ups.

rhc54 · 2025-03-21T14:01:18Z

we'll be adding about ~350KB worth of strings to each process.

Ick - probably too large, IMO. Here's what I'm doing in PMIx/PRRTE as a compromise:

running "check-help-strings.pl" (from OMPI's contrib) to identify problems in the help filesystem. Just a quick scan on PMIx revealed lots of unused topics, some unused files(!), and a number of missing (but referenced) topics. Modifying it to identify duplicate topics as well.
Clean all those up
Identify a handful of topics that might relate to being unable to find the install location or initialize the library. Primary complaint is when we hit those errors and the user can't get a useful error output. Put those handful into memory
Leave the rest as a file lookup. Probably consolidate all current files into one pmix-helpfile.txt to reduce the search complexity.
Add running "check-help-strings.pl" to the CI with an error-out if we find unused topics/files, missing referenced topics, or duplicate topics

Will do the same to PRRTE. Little more complicated there as it uses the PMIx show_help subsystem, so I'll have to add a new PMIx function to allow PRRTE to load help topics/strings into the PMIx in-memory array for those few we need there.

If you feel like/want to convert check-help-strings.pl to Python, that's fine with me! I already extended your code to identify duplicate topics - haven't added the source code scan for the other functional pieces.

bwbarrett · 2025-03-21T15:33:02Z

For static linked applications, the 300 KiB is painful. For shared library installations, I'd argue that 300 KiB doesn't make a difference. We should definitely clean up the text files, but in the end, not sure it matters to customers. From a filesystem standpoint, we've just moved 300 KiB from a bunch of text files to the shared library. The total size to install a functioning libmpi.so doesn't really change. And since it's read-only text section in the shared library, it will only exist in memory once. Even on a Raspberry Pi, 300 KiB more memory is noise.

I guess I'm saying we should clean things up, but not go overboard. The estimate is also a little high, because I didn't filter out the copyright headers, and they wouldn't be copied. There's also some things we could theoretically do to claw back 50 KiB or so, like instead of slurping all the text files into the primary shared libraries, slurping the application help files (which are 3 of the 8 largest files) into the applications (prun, palloc, pctrl), which would move 10 KiB out of the various shared libraries.

There are also text files like opal/mca/accelerator/cuda/help-accelerator-cuda.txt that are very duplicative. I think we could pass in the function call name that failed and save a bunch of entries there.

ggouaillardet · 2025-03-21T15:37:58Z

My two cents: an other option to mitigate the memory overhead is to compress the data decompress it in memory only if needed.

bwbarrett · 2025-03-21T15:40:39Z

Compression seems like a lot of work, but also would help. tmp.tar contains all the .txt files for PMIx, PRRTE, OPAL, and OMPI.

% ls -lh tmp.tar 
-rw-r--r-- 1 bbarrett Domain Users 420K Mar 21 15:32 tmp.tar
[bbarrett@ip-10-0-1-96:~/projects/open-mpi]
% gzip tmp.tar 
[bbarrett@ip-10-0-1-96:~/projects/open-mpi]
% ls -l tmp.tar*
-rw-r--r-- 1 bbarrett Domain Users 77903 Mar 21 15:32 tmp.tar.gz

ggouaillardet · 2025-03-21T16:04:18Z

Agree, we should first assess if we really need it.
And if we do, use the compress framework, I suspect xz or even bzip2 with max compression can outperform default gzip.

rhc54 · 2025-03-21T23:44:54Z

I have no issue with putting these into memory. In truth, you should not be counting the PRRTE text as that is only in the PRRTE tools - the apps do not link against libprrte. I've spent some time cleaning up the PMIx help text and will finish that in the upcoming week. It's the lesser footprint of the two projects.

I'll still need to modify PMIx to allow PRRTE to load its help text into the PMIx array. I also need to do something to resolve the rst vs txt issue - need to stop having to do everything twice, but the prior way of trying to generate the txt from the rst wasn't working well enough. Might have a solution - remains to be tried.

I also need to do something to simplify the rst stuff. Really isn't working in terms of managing things right now - the OMPI schizo rst is out-of-date because it is unmaintained, I have to split everything into tiny individual rst files, mpirun's man page is out-of-date, etc. Feels like a lot of extra work on my side for no real benefit as nobody maintains the other end. Need to ponder it a bit, come up with a better strategy.

Bottom line is that this is turning into quite the job, probably going to take another 2-3 weeks to complete. I'm aiming it at PMIx v6 and PRRTE v4 as it will not be backward compatible on either side. We have a break anyway due to the LTO work, so this should be good timing.

jsquyres · 2025-03-23T18:27:33Z

FWIW, I did my own check of the main branch and I came up with a smaller number:

$ wc -c `find . -name help\*.txt | grep -v 3rd-party | grep -v contrib/` 
...
  87252 total

~87K, not 350K.

So where do we want to go with this? I'm not sure that 87K is worth using compression.

rhc54 · 2025-03-23T19:33:16Z

I would avoid compression as it creates dependencies (remember, lack of zlib is our highest show-help incidence) and really isn't worth it. Like I said, you can remove all of the PRRTE text from your count as that only applies to the daemons - and it is the largest amount as it includes all the help for map/rank/bind and other major topics. PMIx really doesn't have very much itself, and that's the portion you'd have in your library.

jsquyres · 2025-03-23T19:34:37Z

87k is just the opal, ompi, and oshmem directories.

rhc54 · 2025-03-23T21:15:44Z

Ah, I see - didn't grok the "-v" option behavior.

FWIW: PMIx has about 30k if you exclude the PMIx tools themselves.
PRRTE has about 130k is you exclude the tool cmd line help text - most of that is in text explaining proc placement.

opal/util/Makefile.am

opal/util/show_help.c

bwbarrett · 2025-03-24T22:50:54Z

Other than a couple nits around the build system, this all looks good to me.

jsquyres · 2025-03-25T00:27:20Z

Rebased
Updated per @bwbarrett's comments
Made all the remaining Makefile.am changes

Per discussion from https://github.com/open-mpi/ompi/wiki/Meeting-2025-03-14, we would like to make relocating an Open MPI installation easier. Specifically, text help files are something that Open MPI currently has to find in the install tree filesystem at run-time. If, instead, we can slurp these text files into C code, then there's nothing to find at run time. That being said, even with this change, developers will still maintain help messages in the various help_*.txt files around the code base. Maintaining descriptive, user-friendly help messages in text files (instead of manually hand-coding long strings in C) has proven to be quite useful. We do not want to lose this capability. This commit adds a step during "make" (in opal/util), those help_*.txt files are encoded into an indexed array of C strings in opal/util/show_help_content.c. opal_show_help*() then can look up the appropriate help strings via filename / topic tuples, just like it used to -- these strings now just happen to be in C variables instead of text files. This work is intended for main / v6.0.x -- not for v5.0.x. Signed-off-by: Jeff Squyres <jeff@squyres.com>

Commit a525e70 converted opal_show_[v]help into global function pointer variables that were re-assigned a few times during the lifecycle of an application. This allows calls top opal_show_[v]help() to emit error messages different ways at different times -- e.g., before various subsystems were initialized, after they were initialized, and after they were shutdown. Long ago, however, the show_help infrastructure was converted to use PMIx under the covers, and this switcharoo was no longer necessary. Specifically: show_help always just calls PMIx and PMIx does the Right Things during at different stages of the process lifecycle. This commit changes opal_show_[v]help back to be a regular function -- not a function pointer. Signed-off-by: Jeff Squyres <jeff@squyres.com>

github-actions bot added the Target: main label Mar 15, 2025

jsquyres added the Target: v6.0.x label Mar 15, 2025

jsquyres added this to the v6.0.0 milestone Mar 15, 2025

jsquyres force-pushed the pr/help-files-as-c-code branch from 30cbe0c to 1cf2f47 Compare March 15, 2025 18:09

github-actions bot removed the Target: v6.0.x label Mar 15, 2025

jsquyres force-pushed the pr/help-files-as-c-code branch from 1cf2f47 to 79d68d7 Compare March 16, 2025 20:07

jsquyres force-pushed the pr/help-files-as-c-code branch from 79d68d7 to 0d40934 Compare March 16, 2025 20:14

jsquyres force-pushed the pr/help-files-as-c-code branch from 9bd226b to 7e4828f Compare March 18, 2025 14:58

jsquyres marked this pull request as ready for review March 24, 2025 22:35

bwbarrett reviewed Mar 24, 2025

View reviewed changes

opal/util/Makefile.am Outdated Show resolved Hide resolved

opal/util/Makefile.am Outdated Show resolved Hide resolved

opal/util/show_help.c Outdated Show resolved Hide resolved

jsquyres force-pushed the pr/help-files-as-c-code branch from 7e4828f to 88a449a Compare March 25, 2025 00:25

jsquyres added 2 commits March 24, 2025 21:11

jsquyres force-pushed the pr/help-files-as-c-code branch from 88a449a to d1bf286 Compare March 25, 2025 01:11

bwbarrett approved these changes Mar 25, 2025

View reviewed changes

jsquyres merged commit 674f46c into open-mpi:main Mar 25, 2025
15 checks passed

jsquyres deleted the pr/help-files-as-c-code branch March 25, 2025 11:11

jsquyres mentioned this pull request Mar 25, 2025

Encoded show_help text causing compiler warnings #13167

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Encode help-*.txt file contents into C source code #13144

Encode help-*.txt file contents into C source code #13144

jsquyres commented Mar 15, 2025 •

edited

Loading

rhc54 commented Mar 15, 2025

jsquyres commented Mar 16, 2025

jsquyres commented Mar 16, 2025

rhc54 commented Mar 16, 2025

jsquyres commented Mar 18, 2025

rhc54 commented Mar 21, 2025

bwbarrett commented Mar 21, 2025

ggouaillardet commented Mar 21, 2025

bwbarrett commented Mar 21, 2025

ggouaillardet commented Mar 21, 2025

rhc54 commented Mar 21, 2025

jsquyres commented Mar 23, 2025

rhc54 commented Mar 23, 2025

jsquyres commented Mar 23, 2025

rhc54 commented Mar 23, 2025

bwbarrett commented Mar 24, 2025

jsquyres commented Mar 25, 2025

Encode help-*.txt file contents into C source code #13144

Encode help-*.txt file contents into C source code #13144

Conversation

jsquyres commented Mar 15, 2025 • edited Loading

rhc54 commented Mar 15, 2025

jsquyres commented Mar 16, 2025

jsquyres commented Mar 16, 2025

rhc54 commented Mar 16, 2025

jsquyres commented Mar 18, 2025

rhc54 commented Mar 21, 2025

bwbarrett commented Mar 21, 2025

ggouaillardet commented Mar 21, 2025

bwbarrett commented Mar 21, 2025

ggouaillardet commented Mar 21, 2025

rhc54 commented Mar 21, 2025

jsquyres commented Mar 23, 2025

rhc54 commented Mar 23, 2025

jsquyres commented Mar 23, 2025

rhc54 commented Mar 23, 2025

bwbarrett commented Mar 24, 2025

jsquyres commented Mar 25, 2025

jsquyres commented Mar 15, 2025 •

edited

Loading