Add proof-of-concept markdown-based documentation system #2304
adrianreber merged 15 commits into openhpc:4.x
Test Results: 27 files (-12), 27 suites (-12), 37s (-38s). Results for commit 2ceeb32, compared against base commit 0318e82. This pull request removes 36 tests.
Having issues with mdtoc - the Python one does not support the args used but the npm one is named
Using the Go one. Will provide an RPM.
Going through the code to understand it and think big picture. About to leave some comments on some of the things I saw, mostly confirming my understanding and possible improvements - realizing that this is a POC with generated code.
    comment = self._substitute_variables(line)
    output.write(f"{comment}\n")

def _substitute_variables(self, cmd: str) -> str:
Looks like this is unused and that the variable substitution is done by Jinja2? Is this understanding correct?
    except StopIteration:
        pass

def _process_refs(self, comment: str) -> str:
It looks like this is not used either - I assume it is meant to flatten markdown URLs in comments.
- Markdown for document formatting
- YAML configuration files for variables
- Jinja2 templating for includes and conditionals
- Jinja2 comments (`{# #}`) for script generation markers
Is this still true? It seems that you now use HTML comments <!-- -->, which I can understand, since the Jinja2 comments are consumed when producing the final .md file.
Perhaps another method would be to create a manifest of files (list) to "cat" together replacing the steps.md.j2 file - I can see a few downsides to this method also. This process could add the file markers.
After further investigation, HTML comments seem like the way to go. Claude recommends using the @ as a delimiter for directives, as in <!-- @ohpc_command -->
Overall it looks very cool and I believe this is the right direction. I think we should invest in this more and take the opportunity to rename the include files to something a bit more standard so it is easier to find things, along with other general cleanups. One thing I would like to remove is the sms prefix. As for using pandoc, we may want to look a bit harder for the specific renderer we use (mkdocs and others). There also seem to be some interesting tools for extracting and executing code from markdown.
Yes. I started to replace all calls to You seem to have a stronger opinion related to the file naming. Happy to follow your suggestions. Should we just merge something in a parallel documentation directory and then we can continue to work both on it? It might pollute the git history a bit, but would maybe be easier. Or we could also use a new branch and do the initial work there.
I was also unsure about the
Sure. We do not need to post-process the code blocks now, as Jinja2 makes all the substitutions early. It definitely could be simplified.
+1 on Agreed. I think your incremental approach to doing one thing at a time is the way to go, keeps the git changes easy to see/audit. Right now it seems like mostly a "bulk" copy of the docs (I typically do that) then updates from there.
Perhaps a new branch; we could squash commits if we wanted to make it cleaner when it is merged into 4.x. I think one person should do the renaming work, though. We would also probably need to hand things off between work/commits, because the probability of a conflict would be high unless we coordinate (over Slack). The only strong feeling I have is that the current naming is very bad, so something a bit more deliberate would be good; other than that I don't have strong feelings. Not sure how much near-term time I have to do it given the holidays here.
Agreed.
I spent a bit more time on this, and I think Python is the way to go. It would allow some subtle customizations (say our @ohpc directives) that would get bad real quick if we used standard tools. The code for finding it is simple and we just pull out all the extra/unused stuff from the current code.
@middelkoopt I created a branch for this work at https://github.com/openhpc/ohpc/tree/2025-12-01-markdown If you have any ideas or improvements please use that branch as basis.
A friendly reminder that this PR had no activity for 30 days. |
This commit introduces a new documentation system that replaces LaTeX with
Jinja2-templated Markdown as the primary source format. The system can
generate PDF (via Pandoc/XeLaTeX), HTML, and recipe.sh installation scripts
from a single markdown source.
Key components:
- Jinja2 templates (.md.j2) with YAML configuration for variable substitution
- Modular common templates in poc-markdown/common/ for reusable content
- parse_doc_md.py: Python parser to extract recipe scripts from HTML comments
- Makefile-based build system with dependencies tracking
- Custom Pandoc Lua filter for format-specific rendering
- LaTeX header templates for PDF customization
- Example implementation for Rocky 10/x86_64/Warewulf4/Slurm
Benefits over LaTeX:
- Simpler markdown syntax easier to read and maintain
- Single source generates multiple output formats (PDF, HTML, recipe.sh)
- Better separation of content and presentation
- Jinja2 variables reduce duplication across OS/arch variants
- HTML comments preserve recipe extraction while keeping content visible
Documentation structure:
- poc-markdown/common/: Reusable template components and utilities
- poc-markdown/rocky10/x86_64/warewulf4/slurm/: Example configuration
- poc-markdown/common/figures/: Shared PDF graphics
- poc-markdown/*.md: Documentation about the system itself
The ohpc_indent directive system has been removed in favor of simpler
hardcoded indentation for do-loops only, with all other code at column 0.
Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
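The extraction of recipe commands from HTML comments might look roughly like the sketch below. This is not the code from parse_doc_md.py. The directive shape `<!-- ohpc_command ... -->` follows the later commit messages in this PR, while the regular expression, the function name `extract_commands`, and the sample text are assumptions made for illustration.

```python
import re

# Assumed directive shape: <!-- ohpc_command <shell text> -->
DIRECTIVE = re.compile(r"<!--\s*ohpc_command\s+(.*?)\s*-->", re.DOTALL)


def extract_commands(markdown: str) -> list[str]:
    """Return the shell text of every ohpc_command directive, in document order."""
    return [match.group(1) for match in DIRECTIVE.finditer(markdown)]


# Hypothetical rendered markdown: prose stays visible, directives are hidden comments.
sample = """\
Some prose that only appears in the PDF and HTML output.

<!-- ohpc_command if [[ ${enable_ib} -eq 1 ]];then -->
<!-- ohpc_command fi -->
"""
commands = extract_commands(sample)
```

Because the directives are ordinary HTML comments, they survive Jinja2 rendering and stay invisible in the rendered PDF and HTML, which is the property the review discussion above relies on.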
Extends the proof-of-concept markdown-based documentation system to support
AlmaLinux 10, following the same structure as Rocky 10.
Changes:
- Created poc-markdown/almalinux10/x86_64/warewulf4/slurm/ directory structure
- Added config.yaml with AlmaLinux 10 specific configuration variables
- Copied template files (steps.md.j2, manifest.md.j2, Makefile) from Rocky 10
- Shared common templates in poc-markdown/common/ are reused via symlink
- Successfully generates PDF, HTML, and recipe.sh for AlmaLinux 10
Configuration differences from Rocky 10:
- baseOS: "AlmaLinux 10"
- OSRepo: "AlmaLinux_10"
- baseos: "almalinux-10"
- baseosshort: "almalinux10"
- osimage: "almalinux:10"
- os_name: "AlmaLinux 10"
All other settings (architecture, provisioner, resource manager, package
manager commands) remain identical to Rocky 10.
Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Consolidated duplicate variable definitions to use only lowercase variants
where applicable. This improves maintainability and reduces redundancy in
configuration files and Jinja2 templates.
Changes:
- Replaced {{ OHPC }} with {{ ohpc }} (value: "OpenHPC")
- Replaced {{ OSTree }} with {{ ostree }} (value: "EL_10")
- Replaced {{ OSTag }} with {{ ostag }} (value: "el10")
- Replaced {{ provheader }} with {{ provisioner }} (value: "Warewulf4")
- Replaced {{ Warewulf }} with {{ warewulf }} (value: "Warewulf")
- Replaced {{ InfiniBand }} with {{ infiniband }} (value: "InfiniBand")
- Replaced {{ Lustre }} with {{ lustre }} (value: "Lustre")
- Removed duplicate rms_short variable (keeping rmsshort)
Updated files:
- config.yaml files for Rocky 10 and AlmaLinux 10
- 12 common Jinja2 template files
- Generated outputs (PDF, HTML, recipe.sh) verified to work correctly
Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
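A rename like this could be scripted along the following lines. The old and new names come from the commit message above. The helper `rename_placeholders`, the regular expression, and the sample sentence are hypothetical, and the sketch only handles bare `{{ name }}` placeholders, not filters or expressions.

```python
import re

# Old -> new variable names, as listed in the commit message.
RENAMES = {
    "OHPC": "ohpc",
    "OSTree": "ostree",
    "OSTag": "ostag",
    "provheader": "provisioner",
    "Warewulf": "warewulf",
    "InfiniBand": "infiniband",
    "Lustre": "lustre",
}

# Matches a bare {{ name }} placeholder with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def rename_placeholders(template: str) -> str:
    """Rewrite known old placeholder names; unknown names pass through unchanged."""
    def repl(match: re.Match) -> str:
        name = match.group(1)
        return "{{ " + RENAMES.get(name, name) + " }}"
    return PLACEHOLDER.sub(repl, template)


before = "{{ OHPC }} on {{ OSTree }} uses {{ provheader }} and {{ rmsshort }}."
after = rename_placeholders(before)
```

Matching on the whole placeholder rather than the bare word keeps prose mentions such as "Warewulf" or "Lustre" untouched.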
Created almalinux_repos.md.j2 based on the original LaTeX source file to
provide AlmaLinux-specific repository information. Updated AlmaLinux 10
documentation to use this file instead of the Rocky-specific version.
Key differences from Rocky version:
- AlmaLinux repository URL: https://repo.almalinux.org/almalinux/10/
- References to "AlmaLinux Extras" instead of "Rocky Extras"
- Removed -y flag from first dnf install command (matching LaTeX source)
Changes:
- Created common/almalinux_repos.md.j2 from common/almalinux_repos.tex
- Updated almalinux10/x86_64/warewulf4/slurm/steps.md.j2 to include
  almalinux_repos.md.j2 instead of rocky_repos.md.j2
- Verified generated outputs (PDF, HTML, recipe.sh) contain correct
  AlmaLinux-specific content
Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Updated all POC documentation files to accurately reflect the current
implementation status and correct outdated information.
Key corrections:
- Script generation markers use HTML comments (<!-- -->), not Jinja2
comments ({# #})
- Updated file counts: 74 common templates (not 5)
- Updated output stats: steps.md is 2817 lines/122KB (not 239 lines/8.3KB)
- Added all generated outputs: PDF (259KB, 44 pages), HTML (222KB),
recipe.sh (517 lines)
- Corrected variable names to use lowercase variants (ohpc, ostree,
ostag) instead of uppercase duplicates
- Added comprehensive list of converted components (dev tools, compilers,
MPI, perf tools, libraries, network config, etc.)
- Updated infrastructure components: parse_doc_md.py, update-toc.sh,
header-includes.tex.j2, format-filters.lua, codeblock-styles.css
- Added AlmaLinux 10 recipe to file structure
- Corrected build output details and examples
Files updated:
- README.md: Script marker syntax, current status
- SUMMARY.md: Comprehensive component list, accurate file sizes
- COMPARISON.md: Variable name corrections, comment types
- QUICKSTART.md: Variable name examples
- INDEX.md: Complete current state, all outputs
Signed-off-by: Claude <noreply@anthropic.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
Since Jinja2 already processes all {{ variable }} patterns when
generating steps.md, the _substitute_variables() function in
parse_doc_md.py is redundant. All variable substitution happens during
the Jinja2 template processing phase, before parse_doc_md.py runs.
Removed:
- _substitute_variables() function definition (11 lines)
- All 9 call sites throughout the file
Verified that recipe.sh generation still works correctly with no
unprocessed Jinja2 variables remaining in the output.
Signed-off-by: Adrian Reber <areber@redhat.com>
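The verification mentioned above could be automated with a check along these lines. This is only a sketch: the function name `find_unrendered` and the sample strings are hypothetical, and the pattern looks only for `{{ ... }}` and `{% ... %}`, so it may miss other kinds of leftovers.

```python
import re

# A surviving {{ ... }} or {% ... %} suggests the template was not fully rendered.
# Plain bash expansions such as ${var} use a single brace and are not matched.
LEFTOVER = re.compile(r"\{\{.*?\}\}|\{%.*?%\}")


def find_unrendered(script: str) -> list[tuple[int, str]]:
    """Return (line number, matched text) for each suspected Jinja2 leftover."""
    hits = []
    for lineno, line in enumerate(script.splitlines(), start=1):
        for match in LEFTOVER.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Illustrative inputs, not taken from the real recipe.sh.
clean = 'dnf -y install ohpc-base\necho "${sms_name}"\n'
dirty = "dnf -y install {{ package }}\n"
```

An empty result for the generated recipe.sh would be consistent with all substitution having happened during the Jinja2 phase.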
…md.py
Extended the parse_doc_md.py script to automatically indent commands
within if/else/fi blocks when processing <!-- ohpc_command ... -->
directives.
Changes:
- Added regex patterns to detect if, elif, else, and fi statements
- Implemented _write_indented_command() method to handle block
indentation
- Track indentation level during processing
- Commands inside if/elif/else blocks are indented with 2 spaces
- Properly handle nested if blocks with incremental indentation
Example output:
    if [ ! -e ${inputFile} ];then
      echo "Error: Unable to access file"
      exit 1
    else
      . ${inputFile}
    fi
This makes the generated recipe.sh scripts more readable and follows
standard bash indentation conventions.
Signed-off-by: Adrian Reber <areber@redhat.com>
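The indentation tracking described in this commit can be sketched as follows. It is a simplified stand-in, not the actual `_write_indented_command()` method: the function name `indent_commands` and the three patterns are assumptions, and the sketch expects `if`, `elif`, `else`, and `fi` to each start their own command (a one-line `if ...; then ...; fi` is not handled).

```python
import re

OPEN = re.compile(r"^if\b")              # opens a block
MIDDLE = re.compile(r"^(elif|else)\b")   # stays at the level of the opening if
CLOSE = re.compile(r"^fi\b")             # closes a block


def indent_commands(commands, width=2):
    """Indent commands inside if/elif/else blocks by `width` spaces per nesting level."""
    level = 0
    out = []
    for cmd in commands:
        stripped = cmd.strip()
        if CLOSE.match(stripped):
            level = max(level - 1, 0)
            out.append(" " * (width * level) + stripped)
        elif MIDDLE.match(stripped):
            out.append(" " * (width * max(level - 1, 0)) + stripped)
        elif OPEN.match(stripped):
            out.append(" " * (width * level) + stripped)
            level += 1
        else:
            out.append(" " * (width * level) + stripped)
    return out


# The example from the commit message, fed in as flat commands.
flat = [
    "if [ ! -e ${inputFile} ];then",
    'echo "Error: Unable to access file"',
    "exit 1",
    "else",
    ". ${inputFile}",
    "fi",
]
indented = indent_commands(flat)
```

The word-boundary anchors keep commands such as `find` or `ifconfig` from being mistaken for block keywords.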
Since the _substitute_variables() function was removed (all variable
substitution now happens via Jinja2), the config file is no longer used by
parse_doc_md.py. Removed all config-related code and parameters.
Changes:
- Removed yaml import (no longer needed)
- Removed config_file parameter from __init__()
- Removed self.config initialization and YAML loading
- Removed --config argument from CLI parser
- Updated usage message to remove config reference
- Updated both Makefiles to remove --config parameter
The script now only requires the markdown file path as input. All variable
substitution is handled during the Jinja2 template processing phase, before
parse_doc_md.py processes the generated markdown.
Signed-off-by: Adrian Reber <areber@redhat.com>
Extended the indentation logic to apply to all commands from code
blocks ([sms]# prompts), not just <!-- ohpc_command ... --> directives.
This ensures that when an if statement is opened via ohpc_command, the
subsequent code block commands are properly indented.
Changes:
- Added indentation to regular SMS prompt commands
- Added indentation to plain comments
- Added indentation to HERE documents and their content
- Added indentation to line continuations and continued lines
- Added indentation to do loops and their body
Example output:
    if [[ ${enable_ib} -eq 1 ]];then
      dnf -y groupinstall "InfiniBand Support"
      # Load IB services
      udevadm trigger --type=devices --action=add
      systemctl restart rdma-load-modules@infiniband.service
    fi
All commands now respect the current indentation level tracked across
both ohpc_command directives and code block commands.
Signed-off-by: Adrian Reber <areber@redhat.com>
The --ci_run parameter and ohpc_ci_comment functionality have not been used
for a long time. Removed all related code to simplify the script.
Changes:
- Removed ci_run parameter from __init__()
- Removed ci_comment regex pattern
- Removed ci_comment handling code in _process_line()
- Removed --ci_run CLI argument from argument parser
- Updated usage message to remove ci_run reference
The script now has a simpler interface: just provide the markdown file path
as the only required argument.
Usage: parse_doc_md.py <markdown_file>
Signed-off-by: Adrian Reber <areber@redhat.com>
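With --config and --ci_run gone, the command-line interface reduces to a single positional argument. A minimal argparse sketch of such an interface is shown below. Only the usage line `parse_doc_md.py <markdown_file>` comes from the commit message; the description and help strings and the `build_parser` helper are assumptions.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts exactly one positional markdown file."""
    parser = argparse.ArgumentParser(
        prog="parse_doc_md.py",
        description="Extract a recipe script from rendered markdown.",
    )
    parser.add_argument(
        "markdown_file",
        type=Path,
        help="rendered markdown input, for example steps.md",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["steps.md"])
```

With this shape, passing a removed option such as `--config` makes argparse exit with a usage error instead of silently ignoring it.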
After generating recipe.sh, automatically run shellcheck if it is installed.
If shellcheck is not available, show an informational message instead of
failing the build.
Changes:
- Added shellcheck validation step to recipe target in both Makefiles
- Check for shellcheck availability with "command -v"
- Run shellcheck with "|| true" to prevent build failures
- Show informational message if shellcheck is not installed
This provides optional static analysis of the generated bash scripts without
requiring shellcheck as a mandatory dependency. Developers can benefit from
shellcheck warnings to improve script quality while the build continues
successfully regardless of shellcheck presence or findings.
Signed-off-by: Adrian Reber <areber@redhat.com>
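The Makefile implements this check in shell. The same "run the linter only if it is present, and never fail on findings" pattern is sketched here in Python for illustration. The function name `run_optional_linter` and the message text are hypothetical; `shutil.which` plays the role of `command -v`, and `check=False` plays the role of `|| true`.

```python
import shutil
import subprocess
from typing import Optional


def run_optional_linter(script_path: str, tool: str = "shellcheck") -> Optional[int]:
    """Run `tool` on script_path if it is installed; never raise on findings."""
    executable = shutil.which(tool)  # comparable to `command -v` in the Makefile
    if executable is None:
        print(f"{tool} is not installed, skipping static analysis")
        return None
    # check=False: a non-zero exit status is returned, not raised (like `|| true`).
    result = subprocess.run([executable, script_path], check=False)
    return result.returncode
```

A caller would invoke `run_optional_linter("recipe.sh")` after generating the script and treat both `None` and a non-zero return code as non-fatal.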
To avoid duplication, moved the Makefile to common/Makefile.common and
created symlinks from the recipe directories. This ensures all recipe
variants use the same build infrastructure.
Changes:
- Created common/Makefile.common (copied from rocky10 Makefile)
- Replaced rocky10/x86_64/warewulf4/slurm/Makefile with symlink to
  ../../../../common/Makefile.common
- Replaced almalinux10/x86_64/warewulf4/slurm/Makefile with symlink to
  ../../../../common/Makefile.common
Benefits:
- Single source of truth for build logic
- Easier maintenance (update once, applies everywhere)
- Consistent build behavior across all recipe variants
- Simpler to add new recipe variants
Both Rocky 10 and AlmaLinux 10 builds verified to work correctly with the
symlinked Makefile.
Signed-off-by: Adrian Reber <areber@redhat.com>
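Adding a further recipe variant then comes down to creating one relative symlink. The sketch below shows how the `../../../../common/Makefile.common` target falls out of the directory depth. It runs inside a temporary directory rather than the real repository, and the helper `link_makefile` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def link_makefile(root: Path, recipe_dir: str) -> str:
    """Create <recipe_dir>/Makefile as a relative symlink to common/Makefile.common."""
    recipe = root / recipe_dir
    recipe.mkdir(parents=True, exist_ok=True)
    target = os.path.relpath(root / "common" / "Makefile.common", start=recipe)
    link = recipe / "Makefile"
    if link.is_symlink() or link.exists():
        link.unlink()  # replace an existing per-recipe Makefile
    link.symlink_to(target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "common").mkdir()
    (root / "common" / "Makefile.common").write_text("all:\n")
    targets = [
        link_makefile(root, recipe)
        for recipe in (
            "rocky10/x86_64/warewulf4/slurm",
            "almalinux10/x86_64/warewulf4/slurm",
        )
    ]
    # Reading through the symlink confirms that it resolves to the shared file.
    resolved_ok = (root / "rocky10/x86_64/warewulf4/slurm/Makefile").read_text() == "all:\n"
```

Relative targets keep the links valid wherever the repository is checked out, which an absolute path would not.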
Added double quotes around variables to prevent word splitting and
globbing issues reported by shellcheck.
Changes in template files:
- install_header.md.j2: Quoted ${inputFile} references
- finalize_warewulf4_provisioning.md.j2: Quoted "$CHROOT" paths
- memlimits.md.j2: Quoted "$CHROOT" paths
- oneapi_mountpoint.md.j2: Quoted "$CHROOT" paths
- slurm_pam.md.j2: Quoted "$CHROOT" paths
- syslog.md.j2: Quoted "$CHROOT" paths (6 instances)
- install_slurm.md.j2: Quoted "${slurm_node_config}"
- mpi_slurm.md.j2: Removed raw anchor references (unrelated fix)
Changes in config files (Rocky 10 and AlmaLinux 10):
- chrootaddrepo: Added quotes around "$CHROOT"
- chrootclean: Added quotes around "$CHROOT"
- chrootinstall: Added quotes around "$CHROOT"
- groupchrootinstall: Added quotes around "$CHROOT"
- chrootupgrade: Added quotes around "$CHROOT"
This significantly reduces shellcheck warnings in generated recipe.sh
from 60+ down to 33 SC2086 warnings. The remaining warnings are for
other variables that will be addressed separately.
Signed-off-by: Adrian Reber <areber@redhat.com>
This commit addresses all shellcheck warnings in the generated recipe.sh
except SC2129 (which has been added to the exclusion list).
Fixed warnings:
- SC2181: Changed 'if [ $? -ne 0 ]' to direct exit code check
  'if ! dnf repolist | grep -q OpenHPC; then'
- SC2155: Split 'export CHROOT=$(...)' into separate declare and assign to
  avoid masking return values
- SC2164: Added '|| exit' to cd command for proper error handling
- SC2004: Removed unnecessary $ in arithmetic for loops (changed
  'i<$num_computes' to 'i<num_computes')
- SC2046: Quoted $(id -u munge) and $(id -g munge) to prevent word splitting
- SC2086: Double-quoted all variables to prevent globbing and word splitting
  throughout templates
Changes span multiple template files:
- steps.md.j2 (both rocky10 and almalinux10)
- warewulf4_mkchroot.md.j2
- clustershell.md.j2
- add_ww4_hosts_finalize.md.j2
- add_ww4_hosts_intro.md.j2
- conman.md.j2
- finalize_warewulf4_provisioning.md.j2
- genders.md.j2
- import_ww4_files.md.j2
- import_ww4_files_slurm.md.j2
- install_provisioning_warewulf4_intro.md.j2
- reset_computes.md.j2
- slurm_startup.md.j2
- warewulf4_setup_centos.md.j2
- Makefile.common (added SC2129 to exclusion list)
Signed-off-by: Claude Code <noreply@anthropic.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
…ants
Update docs.spec to build documentation from markdown sources for the two
variants that support the markdown approach (rocky10/x86_64/warewulf4/slurm
and almalinux10/x86_64/warewulf4/slurm). Other variants continue using LaTeX.
Changes:
- Add BuildRequires for markdown toolchain (pandoc, python3-jinja2,
  python3-pyyaml, mdtoc, texlive-xetex, etc.)
- Split build/install sections to handle markdown and LaTeX separately
- Add jinja2-render.py script as replacement for jinja2-cli (not available
  on RHEL 10)
- Update Makefile.common to use jinja2-render.py instead of jinja2
Generated with Claude Code (https://claude.ai/code)
Signed-off-by: Adrian Reber <areber@redhat.com>
b5306b4 to 2ceeb32
📦 Package Count Analysis Results
Environment: UBI 10 Container
🏭 Factory Repositories
Status: ✅ Factory repositories analysis completed successfully
Analysis performed by OpenHPC Package Count CI