Skip to content

[llmd] Add a stub for the project#6

Merged
kpouget merged 27 commits intoopenshift-psap:mainfrom
kpouget:llmd
Mar 31, 2026
Merged

[llmd] Add a stub for the project#6
kpouget merged 27 commits intoopenshift-psap:mainfrom
kpouget:llmd

Conversation

@kpouget
Copy link
Copy Markdown
Contributor

@kpouget kpouget commented Mar 26, 2026

Summary by CodeRabbit

  • New Features

    • New llm_d project with CLI (prepare, test, cleanup), presets, vaults, and orchestration config.
    • Durable YAML config system with presets/overrides and scoped artifact-directory management.
    • Improved process/run helpers and support for JSONPath/JWT runtime handling.
  • Bug Fixes / Changes

    • Removed legacy lightweight container definition.
    • CLI now lists only projects with orchestration; background logging/teardown behavior improved.
  • Documentation

    • Added llm_d README and quick-start guidance.

@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 26, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign albertoperdomo2 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 26, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Removes Containerfile.light; adds core library modules for config/env/run; introduces a new LLM‑D project scaffold (orchestration code, presets, vaults, config, README, Click CLIs); tweaks CI entrypoint and small test loop; and updates dependencies to include pyjwt and jsonpath_ng.

Changes

Cohort / File(s) Summary
Container Build Removal
Containerfile.light
Deleted the lightweight CentOS Stream 10 container definition and all its build steps.
Core libraries
projects/core/library/config.py, projects/core/library/env.py, projects/core/library/run.py
Added config management (YAML load, JSONPath get/set, presets, variable overrides, reference resolution, save/init), artifact-dir management (ARTIFACT_DIR, NextArtifactDir, TempArtifactDir), and process/subprocess helpers (signal handling, run wrapper, run_and_catch).
LLM‑D project scaffold (docs & config)
projects/llm_d/README.md, projects/llm_d/orchestration/config.yaml, projects/llm_d/orchestration/vaults.yaml
Added project README, orchestration configuration, and vault definitions for a new LLM‑D test-harness.
LLM‑D orchestration code & CLIs
projects/llm_d/orchestration/ci.py, projects/llm_d/orchestration/cli.py, projects/llm_d/orchestration/prepare_llmd.py, projects/llm_d/orchestration/test_llmd.py
New Click CLI entrypoints (prepare/test/pre_cleanup) and orchestration stubs (init/test/prepare/cleanup) that call shared core libs.
LLM‑D presets
projects/llm_d/orchestration/presets.d/presets.yaml, projects/llm_d/orchestration/presets.d/cks.yaml
Added presets including multi-preset support, pvc_rwx and llama-70b presets, and a cks preset that extends others, pins images, sets node selectors/affinities, and toggles capture flags.
CI & tests tweaks
projects/core/ci_entrypoint/prepare_ci.py, projects/core/ci_entrypoint/run_ci.py, projects/jump_ci/testing/test.py
Made tee outputs line-buffered and tee thread a daemon; send_notification early-returns for jump_ci; get_available_projects() now filters for directories containing orchestration/; initialized idx = 0 before an enumerate loop.
Dependencies / packaging
pyproject.toml, projects/core/image/pyproject-lightweight.toml
Replaced pinned jwt==1.3.1 with pyjwt and added jsonpath_ng to project and lightweight image dependencies.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant CLI as "CI CLI (ci/cli)"
  participant Env as "projects.core.library.env"
  participant Config as "projects.core.library.config"
  participant Prepare as "prepare_llmd / test_llmd"
  participant Run as "projects.core.library.run"

  rect rgba(200,200,255,0.5)
    User->>CLI: invoke prepare/test
    CLI->>Env: env.init(orchestration_dir)
    Env-->>CLI: ARTIFACT_DIR set
    CLI->>Config: config.init(orchestration_dir)
    Config-->>CLI: config loaded, presets & overrides applied
  end

  rect rgba(200,255,200,0.5)
    CLI->>Prepare: call prepare()/test()
    Prepare->>Run: run subprocesses/commands
    Run-->>Prepare: return results/logs
    Prepare-->>CLI: return exit code
    CLI->>User: process exits with code
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐇 I hopped through configs, presets, and runs,
ARTIFACT_DIR glowing beneath moonlit suns,
CLIs summoned, envs set, tiny scripts in flight,
llm-d seedlings tucked in YAML at night,
A carrot-coded patch — now off I run!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 26.53% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title '[llmd] Add a stub for the project' accurately describes the main change: introducing a new LLM-D project with initial scaffolding including configs, orchestration, and presets.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@kpouget
Copy link
Copy Markdown
Contributor Author

kpouget commented Mar 26, 2026

/test llmd-jump-ci cks
/cluster cks

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🧹 Nitpick comments (11)
projects/llm_d/orchestration/ci.py (1)

7-10: Remove unused imports.

subprocess, time, and Path are imported but never used.

♻️ Proposed fix
 import sys
-import subprocess
-import time
-from pathlib import Path
 import types
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/orchestration/ci.py` around lines 7 - 10, The file imports
unused modules: remove the unused imports "subprocess", "time", and "Path" from
the top import block (leave "sys" if it's used elsewhere) by updating the import
line(s) in projects/llm_d/orchestration/ci.py to only import modules actually
referenced; after removing them, run a quick search in the file for
"subprocess", "time", or "Path" to ensure nothing else relies on those symbols
and adjust or import them where truly needed.
projects/llm_d/orchestration/cli.py (2)

7-9: Remove unused imports.

subprocess, time, and Path are imported but never used.

♻️ Proposed fix
 import sys
-import subprocess
-import time
-from pathlib import Path
 import types
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/orchestration/cli.py` around lines 7 - 9, Remove the unused
imports in projects/llm_d/orchestration/cli.py: delete the top-level import
statements for subprocess, time, and Path since they are not referenced anywhere
in the module (check the module-level code and functions such as any CLI entry
points or helper functions in this file to confirm no usage). Ensure only needed
imports remain to satisfy linters and avoid unused-import warnings.

1-51: Consider consolidating ci.py and cli.py.

Both files define nearly identical Click CLIs with the same three subcommands (prepare, test, pre_cleanup) calling the same phase functions. The only difference is that ci.py has a broken verbose parameter and missing init() call.

Once ci.py is fixed, consider whether both entrypoints are needed or if they can be merged to reduce duplication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/orchestration/cli.py` around lines 1 - 51, There is duplicated
Click CLI logic between cli.py and ci.py: both define commands prepare, test,
and pre_cleanup that call prepare_llmd.prepare, test_llmd.test, and
prepare_llmd.cleanup; fix by either consolidating into a single entrypoint or
sync the two files so they match—specifically, in ci.py ensure the broken
verbose parameter is removed or wired correctly, add the missing
test_llmd.init() call (as done in cli.py), and then remove the redundant file or
have both import the shared CLI definitions (commands prepare, test,
pre_cleanup) from one module to avoid duplication.
projects/llm_d/orchestration/prepare_llmd.py (1)

4-4: Remove unused imports.

env and run are imported but not used in this stub.

♻️ Proposed fix
-from projects.core.library import env, config, run
+from projects.core.library import config
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/orchestration/prepare_llmd.py` at line 4, The import line in
prepare_llmd.py brings in env and run which are unused; remove those unused
symbols so only config (the used symbol) remains imported from
projects.core.library. Update the import statement that currently references
env, config, run to only import config to eliminate the unused-import warning
and keep the module minimal.
projects/core/library/run.py (2)

150-150: Typo: "occured" should be "occurred".

-            logging.warning(f"An exception occured while preparing the '{self.name}' Parallel execution ...")
+            logging.warning(f"An exception occurred while preparing the '{self.name}' Parallel execution ...")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/library/run.py` at line 150, Fix the typo in the log message
inside the Parallel execution preparation path: change the string passed to
logging.warning in the method referencing self.name from "An exception occured
while preparing the '{self.name}' Parallel execution ..." to use the correct
spelling "occurred" so the message reads "An exception occurred while preparing
the '{self.name}' Parallel execution ...".

1-6: Duplicate import of signal module.

The signal module is imported twice: on line 1 and again on line 6.

Proposed fix
-import sys, os, signal
+import sys, os
 import traceback
 import logging
 logging.getLogger().setLevel(logging.INFO)
 import json
 import signal
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/library/run.py` around lines 1 - 6, Remove the duplicate import
of the signal module in the top-level imports of run.py: keep the single "import
signal" (already present in the first line with "import sys, os, signal") and
delete the second "import signal" later in the diff so imports are not repeated;
ensure logging, traceback, json and other imports remain unchanged.
projects/core/library/config.py (3)

163-163: Consider renaming print parameter to avoid shadowing the builtin.

The print parameter on get_config() and set_config() shadows Python's built-in print function. While functional, this can cause confusion and issues if someone tries to use print() inside these methods. Consider using log or verbose instead.

Also applies to: 186-186

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/library/config.py` at line 163, The get_config and set_config
methods currently use a parameter named print which shadows Python's built-in;
rename that parameter to a non-conflicting name like verbose (or log) in both
method signatures (get_config and set_config) and update all internal uses and
any callers to use the new name, preserving the boolean semantics and default
value; ensure docstrings and tests/reference usages reflect the new parameter
name so no runtime references to the old print parameter remain.

163-173: Use exception chaining for clearer tracebacks.

When re-raising a different exception in the except block (line 173), use raise ... from ex to chain the exceptions and preserve context.

Proposed fix
         except IndexError as ex:
             if default_value != ...:
                 if warn:
                     logging.warning(f"get_config: {jsonpath} --> missing. Returning the default value: {default_value}")
                 return default_value
 
             logging.error(f"get_config: {jsonpath} --> {ex}")
-            raise KeyError(f"Key '{jsonpath}' not found in {self.config_path}")
+            raise KeyError(f"Key '{jsonpath}' not found in {self.config_path}") from ex
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/library/config.py` around lines 163 - 173, In get_config,
preserve the original IndexError context when re-raising by using exception
chaining: change the final raise KeyError in the except block so it uses "raise
KeyError(... ) from ex" (keep the existing message and the caught variable ex)
so the traceback shows the original IndexError; update the raise statement
inside the get_config method accordingly.

7-7: Unused import: subprocess.

The subprocess module is imported but never used in this file.

-import subprocess
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/library/config.py` at line 7, Remove the unused import of
subprocess from this module; locate the top-level import statement "import
subprocess" in projects/core/library/config.py (or any file-scope imports) and
delete it so the file no longer imports the unused subprocess symbol.
projects/llm_d/README.md (2)

13-13: Minor: Use hyphen in compound adjective.

"OpenShift AI focused" should be "OpenShift AI-focused" when used as a compound adjective before a noun.

-- **AI Platform Specificity**: OpenShift AI focused testing patterns
+- **AI Platform Specificity**: OpenShift AI-focused testing patterns
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/README.md` at line 13, Update the phrase "OpenShift AI
focused" in the README so the compound adjective is hyphenated as "OpenShift
AI-focused" wherever it appears (e.g., the line containing "**AI Platform
Specificity**: OpenShift AI focused testing patterns"); ensure the hyphenated
form is used when the phrase modifies a noun.

17-25: Project structure shows skeleton/ but this file is in llm_d/.

The directory name in the project structure diagram says skeleton/ but this README is located in projects/llm_d/. Consider updating to reflect the actual project name, or clarify that this is a template that users should customize.

-skeleton/
+llm_d/
 ├── orchestration/
 │   └── ci.py          # Main CI script with Click-based CLI
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/README.md` around lines 17 - 25, The project tree in the
README shows "skeleton/" which is inconsistent with this README being inside
projects/llm_d; update the diagram to use "llm_d/" (or clearly mark it as a
template) so it matches the actual project name and location: edit the README.md
content where the ASCII tree begins and replace "skeleton/" with "llm_d/" (or
add a parenthetical like "skeleton/ (rename to llm_d/)" if you intend it as a
template) to keep the documentation accurate; ensure the rest of the entries
(orchestration/, README.md, config.yaml, tests/, scripts/) remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@projects/core/library/config.py`:
- Around line 269-284: In __get_config_path, the check "if not config_file_src"
is invalid because pathlib.Path is always truthy; change the check to test the
file's existence (e.g., if not config_file_src.exists()) and raise the
ValueError if the source config file doesn't exist; ensure you reference the
variables config_file_src and config_path_final in the updated logic so the
function still returns (config_path_final, config_file_src) after copying or
reloading.
- Around line 157-161: apply_presets_from_project_args can raise TypeError
because ensure_core_fields sets self.config["project"]["args"] = None; change
apply_presets_from_project_args to safely handle a None or non-iterable value
from get_config("project.args", print=False) by retrieving the value into a
variable, treating None as an empty list (or returning early), and only
iterating when the value is iterable (e.g., list/tuple); reference the
apply_presets_from_project_args method and get_config call and ensure you do not
attempt to iterate over self.config["project"]["args"] when it's None.

In `@projects/core/library/env.py`:
- Around line 73-74: next_artifact_index() is racy when used by concurrent
callers (it just counts ARTIFACT_DIR contents), so update NextArtifactDir usage
or implement a safe allocation: either (A) require and document that callers
must pass a shared threading.Lock and a shared counter (counter_p) for parallel
execution, or (B) make next_artifact_index() atomic by attempting to create
candidate directories under ARTIFACT_DIR using an exclusive operation (e.g.,
loop incrementing an index and call os.mkdir / Path.mkdir with exist_ok=False or
use an OS-level atomic rename) until the create succeeds, or wrap the
count+create in a file lock around ARTIFACT_DIR; reference the functions
next_artifact_index() and NextArtifactDir and the ARTIFACT_DIR constant when
applying the change so parallel processes/threads cannot return duplicate
indices.
- Around line 58-68: In __exit__, the global declaration uses the wrong name;
change the declaration from global _ARTIFACT_DIR to global ARTIFACT_DIR so the
function references the actual module-level variable ARTIFACT_DIR when restoring
state; update the global statement in the __exit__ method accordingly (symbols:
__exit__, ARTIFACT_DIR, __set_artifact_dir, get_tls_artifact_dir).
- Around line 49-54: Implement a thread-local artifact-dir accessor and fix the
typo and race: add a module-level threading.local() (e.g., _TLS) and implement
get_tls_artifact_dir() to return getattr(_TLS, "artifact_dir", None); update
__set_artifact_dir(name) to set _TLS.artifact_dir = name and also assign the
module global ARTIFACT_DIR (fix the typo to use global ARTIFACT_DIR, not
_ARTIFACT_DIR). To avoid races when touching process-wide
os.environ["ARTIFACT_DIR"] inside TempArtifactDir.__enter__/__exit__, guard
those os.environ writes with a threading.Lock (or remove os.environ mutation and
rely on thread-local ARTIFACT_DIR), and ensure
TempArtifactDir.__enter__/__exit__ call get_tls_artifact_dir() and
__set_artifact_dir() so each thread has its own artifact dir without clobbering
others.

In `@projects/core/library/run.py`:
- Around line 147-173: The Parallel context in __exit__ uses
joblib.Parallel(backend="threading") but tasks call env.NextArtifactDir which
mutates global env.ARTIFACT_DIR and os.environ and is not thread-safe; change
the Parallel backend to a process-based backend (e.g., "multiprocessing" or
"loky") or ensure each task obtains its artifact dir without touching shared
globals (call env.NextArtifactDir inside each child process or refactor
NextArtifactDir to return a safe per-task path without setting globals). Also
replace "raise e" in the except block with a bare "raise" to preserve the
original traceback when re-raising exceptions from joblib.Parallel.
- Around line 69-80: In _dict_to_run_toolbox_args, use the serialized value
`val` instead of the raw `v` for dict/list branches (replace f"--{k}=\"{v}\""
with an arg that embeds `val`), and fix the single-quote escaping in the scalar
branch by replacing single quotes with a backslash-escaped single quote (use
replace("'", "\\'")) so the arg uses the escaped `val` rather than the original
`v`; ensure both branches build arg from `val` (e.g., arg = f"--{k}='{val}'" or
equivalent shell-safe quoting).

In `@projects/llm_d/orchestration/ci.py`:
- Around line 48-49: The CLI entrypoint misses initializing the test_llmd
environment before running the CLI, which causes prepare_llmd.prepare() calls to
config.project.get_config(...) to fail; fix by invoking test_llmd.init() (or the
appropriate init() function that sets up config.project) prior to calling cli()
so the configuration is initialized before any subcommand (e.g., prepare) runs.
- Around line 17-21: The cli function declares a verbose parameter but no Click
option provides it, causing a runtime error; either add a Click option decorator
(e.g., add `@click.option`('--verbose', '-v', is_flag=True, default=False,
help=...) directly above `@click.pass_context`) so Click will populate the verbose
argument, or remove the verbose parameter from the cli function signature and
any usage of it (and delete the corresponding flag references) if verbosity
isn't needed; update any downstream code that expects ctx.obj.verbose
accordingly.

In `@projects/llm_d/orchestration/config.yaml`:
- Line 166: Fix the typo in the comment for the config flag
"--max-model-len=40960": change "udpdate" to "update" so the comment reads "keep
in 5th position or update the presets"; update only the comment text near the
"--max-model-len=40960" entry.
- Around line 28-33: The entry for the "Node Feature Discovery" operator
(catalog: redhat-operators, operator: nfd, namespace: openshift-nfd) uses
deploy_cr: 1 which is an integer; change it to the boolean true to match other
entries and keep types consistent — update the deploy_cr value for this record
from 1 to true.

In `@projects/llm_d/orchestration/prepare_llmd.py`:
- Around line 6-13: In prepare() and cleanup() replace deprecated
logger.warn(...) with logger.warning(...), remove or keep the pass as needed,
and ensure each function returns an explicit integer exit code (e.g., return 0
for success, non-zero for failure) so callers can use sys.exit(exit_code);
specifically update the prepare() function (which reads
config.project.get_config("prepare.namespace.name")) to log with logger.warning
and return an int, and update cleanup() to use logger.warning and return an int.

In `@projects/llm_d/orchestration/test_llmd.py`:
- Around line 11-13: Replace the deprecated logger.warn call in the test
function with logger.warning and make the function return an explicit exit code
integer (e.g., 0 for success or non-zero for failure) so callers can use its
return value with sys.exit(); locate the test function and update the call to
logger.warning("Hello test") and add a final return statement like return 0 (or
appropriate non-zero on failure).

---

Nitpick comments:
In `@projects/core/library/config.py`:
- Line 163: The get_config and set_config methods currently use a parameter
named print which shadows Python's built-in; rename that parameter to a
non-conflicting name like verbose (or log) in both method signatures (get_config
and set_config) and update all internal uses and any callers to use the new
name, preserving the boolean semantics and default value; ensure docstrings and
tests/reference usages reflect the new parameter name so no runtime references
to the old print parameter remain.
- Around line 163-173: In get_config, preserve the original IndexError context
when re-raising by using exception chaining: change the final raise KeyError in
the except block so it uses "raise KeyError(... ) from ex" (keep the existing
message and the caught variable ex) so the traceback shows the original
IndexError; update the raise statement inside the get_config method accordingly.
- Line 7: Remove the unused import of subprocess from this module; locate the
top-level import statement "import subprocess" in
projects/core/library/config.py (or any file-scope imports) and delete it so the
file no longer imports the unused subprocess symbol.

In `@projects/core/library/run.py`:
- Line 150: Fix the typo in the log message inside the Parallel execution
preparation path: change the string passed to logging.warning in the method
referencing self.name from "An exception occured while preparing the
'{self.name}' Parallel execution ..." to use the correct spelling "occurred" so
the message reads "An exception occurred while preparing the '{self.name}'
Parallel execution ...".
- Around line 1-6: Remove the duplicate import of the signal module in the
top-level imports of run.py: keep the single "import signal" (already present in
the first line with "import sys, os, signal") and delete the second "import
signal" later in the diff so imports are not repeated; ensure logging,
traceback, json and other imports remain unchanged.

In `@projects/llm_d/orchestration/ci.py`:
- Around line 7-10: The file imports unused modules: remove the unused imports
"subprocess", "time", and "Path" from the top import block (leave "sys" if it's
used elsewhere) by updating the import line(s) in
projects/llm_d/orchestration/ci.py to only import modules actually referenced;
after removing them, run a quick search in the file for "subprocess", "time", or
"Path" to ensure nothing else relies on those symbols and adjust or import them
where truly needed.

In `@projects/llm_d/orchestration/cli.py`:
- Around line 7-9: Remove the unused imports in
projects/llm_d/orchestration/cli.py: delete the top-level import statements for
subprocess, time, and Path since they are not referenced anywhere in the module
(check the module-level code and functions such as any CLI entry points or
helper functions in this file to confirm no usage). Ensure only needed imports
remain to satisfy linters and avoid unused-import warnings.
- Around line 1-51: There is duplicated Click CLI logic between cli.py and
ci.py: both define commands prepare, test, and pre_cleanup that call
prepare_llmd.prepare, test_llmd.test, and prepare_llmd.cleanup; fix by either
consolidating into a single entrypoint or sync the two files so they
match—specifically, in ci.py ensure the broken verbose parameter is removed or
wired correctly, add the missing test_llmd.init() call (as done in cli.py), and
then remove the redundant file or have both import the shared CLI definitions
(commands prepare, test, pre_cleanup) from one module to avoid duplication.

In `@projects/llm_d/orchestration/prepare_llmd.py`:
- Line 4: The import line in prepare_llmd.py brings in env and run which are
unused; remove those unused symbols so only config (the used symbol) remains
imported from projects.core.library. Update the import statement that currently
references env, config, run to only import config to eliminate the unused-import
warning and keep the module minimal.

In `@projects/llm_d/README.md`:
- Line 13: Update the phrase "OpenShift AI focused" in the README so the
compound adjective is hyphenated as "OpenShift AI-focused" wherever it appears
(e.g., the line containing "**AI Platform Specificity**: OpenShift AI focused
testing patterns"); ensure the hyphenated form is used when the phrase modifies
a noun.
- Around line 17-25: The project tree in the README shows "skeleton/" which is
inconsistent with this README being inside projects/llm_d; update the diagram to
use "llm_d/" (or clearly mark it as a template) so it matches the actual project
name and location: edit the README.md content where the ASCII tree begins and
replace "skeleton/" with "llm_d/" (or add a parenthetical like "skeleton/
(rename to llm_d/)" if you intend it as a template) to keep the documentation
accurate; ensure the rest of the entries (orchestration/, README.md,
config.yaml, tests/, scripts/) remain unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1fa272a5-fcbe-4e4f-9b35-cc314c13ff1f

📥 Commits

Reviewing files that changed from the base of the PR and between 7839e2f and 240b2ec.

📒 Files selected for processing (13)
  • Containerfile.light
  • projects/core/library/config.py
  • projects/core/library/env.py
  • projects/core/library/run.py
  • projects/llm_d/README.md
  • projects/llm_d/orchestration/ci.py
  • projects/llm_d/orchestration/cli.py
  • projects/llm_d/orchestration/config.yaml
  • projects/llm_d/orchestration/prepare_llmd.py
  • projects/llm_d/orchestration/presets.d/cks.yaml
  • projects/llm_d/orchestration/presets.d/presets.yaml
  • projects/llm_d/orchestration/test_llmd.py
  • projects/llm_d/orchestration/vaults.yaml
💤 Files with no reviewable changes (1)
  • Containerfile.light

Comment thread projects/core/library/config.py
Comment thread projects/core/library/config.py
Comment thread projects/core/library/env.py Outdated
Comment thread projects/core/library/env.py Outdated
Comment thread projects/core/library/env.py
Comment thread projects/llm_d/orchestration/ci.py
Comment thread projects/llm_d/orchestration/config.yaml
Comment thread projects/llm_d/orchestration/config.yaml Outdated
Comment thread projects/llm_d/orchestration/prepare_llmd.py
Comment thread projects/llm_d/orchestration/test_llmd.py Outdated
@topsail-bot
Copy link
Copy Markdown

topsail-bot bot commented Mar 26, 2026

🔴 Test of 'jump_ci pre-cleanup' failed after 00 hours 00 minutes 13 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

cluster.name: cks
project.args: []
project.name: cks

Failure indicator: Empty. (See run.log)

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
projects/llm_d/orchestration/ci.py (1)

48-49: ⚠️ Potential issue | 🔴 Critical

Initialize runtime before invoking CLI.

At Line 49, cli() is called without test_llmd.init(). This can crash prepare when config access occurs before setup.

🐛 Proposed fix
 if __name__ == "__main__":
+    test_llmd.init()
     cli()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/orchestration/ci.py` around lines 48 - 49, Call
test_llmd.init() before invoking the CLI to ensure runtime/config is
initialized; modify the main entry so that before calling cli() you invoke
test_llmd.init() (or the appropriate initialization function used in this
module) to prevent prepare/config access from running before setup, referencing
the test_llmd.init() call and the existing cli() invocation to locate where to
insert the init call.
🧹 Nitpick comments (1)
projects/llm_d/orchestration/ci.py (1)

17-45: Avoid duplicating command wiring that already exists in projects/llm_d/orchestration/cli.py.

This file mirrors the same Click group/subcommands, which creates drift risk (the missing init regression came from that drift). Consider reusing a shared command group.

♻️ Refactor direction
-@click.group()
-@click.pass_context
-def cli(ctx):
-    """Skeleton Example Project CI Operations for TOPSAIL-NG."""
-    ctx.ensure_object(types.SimpleNamespace)
-
-@cli.command()
-@click.pass_context
-def prepare(ctx):
-    ...
-
-@cli.command()
-@click.pass_context
-def test(ctx):
-    ...
-
-@cli.command()
-@click.pass_context
-def pre_cleanup(ctx):
-    ...
+from projects.llm_d.orchestration.cli import cli
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/orchestration/ci.py` around lines 17 - 45, The file duplicates
the Click command group and commands already defined in
projects/llm_d/orchestration/cli.py (cli(), prepare(), test(), pre_cleanup()),
causing drift; fix it by removing the duplicated definitions and reusing the
shared command group from the existing module — import the Click group (or the
module exposing it) from projects/llm_d/orchestration/cli.py and register or
reference the existing prepare(), test(), and cleanup handlers
(prepare_llmd.prepare, test_llmd.test, prepare_llmd.cleanup) via that shared
group instead of redefining cli(), prepare(), test(), and pre_cleanup() here.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@projects/llm_d/orchestration/ci.py`:
- Around line 48-49: Call test_llmd.init() before invoking the CLI to ensure
runtime/config is initialized; modify the main entry so that before calling
cli() you invoke test_llmd.init() (or the appropriate initialization function
used in this module) to prevent prepare/config access from running before setup,
referencing the test_llmd.init() call and the existing cli() invocation to
locate where to insert the init call.

---

Nitpick comments:
In `@projects/llm_d/orchestration/ci.py`:
- Around line 17-45: The file duplicates the Click command group and commands
already defined in projects/llm_d/orchestration/cli.py (cli(), prepare(),
test(), pre_cleanup()), causing drift; fix it by removing the duplicated
definitions and reusing the shared command group from the existing module —
import the Click group (or the module exposing it) from
projects/llm_d/orchestration/cli.py and register or reference the existing
prepare(), test(), and cleanup handlers (prepare_llmd.prepare, test_llmd.test,
prepare_llmd.cleanup) via that shared group instead of redefining cli(),
prepare(), test(), and pre_cleanup() here.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c9b523de-cebd-4197-9133-d33a1b2bb0eb

📥 Commits

Reviewing files that changed from the base of the PR and between 240b2ec and 828aa65.

📒 Files selected for processing (2)
  • projects/jump_ci/testing/test.py
  • projects/llm_d/orchestration/ci.py
✅ Files skipped from review due to trivial changes (1)
  • projects/jump_ci/testing/test.py

@kpouget kpouget force-pushed the llmd branch 4 times, most recently from f8f21de to 61af970 Compare March 26, 2026 18:46
@kpouget
Copy link
Copy Markdown
Contributor Author

kpouget commented Mar 26, 2026

/test llmd-jump-ci cks
/cluster cks

@topsail-bot
Copy link
Copy Markdown

topsail-bot bot commented Mar 26, 2026

🔴 Test of 'jump_ci pre-cleanup' failed after 00 hours 10 minutes 14 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

cluster.name: cks
project.args: []
project.name: cks

Failure indicator: Empty. (See run.log)

@kpouget
Copy link
Copy Markdown
Contributor Author

kpouget commented Mar 26, 2026

/test llmd-jump-ci llm_d cks
/cluster cks

@topsail-bot
Copy link
Copy Markdown

topsail-bot bot commented Mar 26, 2026

🔴 Test of 'jump_ci pre-cleanup' failed after 00 hours 01 minutes 01 seconds 🔴

• Link to the test results.

• No reports index generated...

Test configuration:

cluster.name: cks
project.args:
- cks
project.name: llm_d

Failure indicator: Empty. (See run.log)

@kpouget
Copy link
Copy Markdown
Contributor Author

kpouget commented Mar 26, 2026

/test llmd-jump-ci llm_d cks
/cluster cks

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (7)
projects/core/library/run.py (2)

58-59: UTF-8 decode may raise on binary output.

If a command produces non-UTF-8 output (e.g., binary data), .decode("utf8") will raise UnicodeDecodeError. If this is intentional (callers should use decode_stdout=False for binary), consider documenting it. Otherwise, add errors="replace" or errors="backslashreplace" for resilience.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/library/run.py` around lines 58 - 59, The current decode calls
will raise UnicodeDecodeError for non-UTF-8 (binary) output; update the decoding
of proc.stdout and proc.stderr (the branches that check
capture_stdout/decode_stdout and capture_stderr/decode_stderr) to pass an errors
handler (e.g., errors="replace" or errors="backslashreplace") to .decode() so
binary or invalid sequences don’t raise, or alternately document that
decode_stdout/decode_stderr must be False for binary—choose the former for
resilience and apply it to both proc.stdout.decode(...) and
proc.stderr.decode(...).

45-47: Consider multi-line format if linter is enforced.

Ruff flagged E701 (multiple statements on one line) for these conditionals. If your CI enforces this rule, refactor to multi-line:

Multi-line format
-    if capture_stdout: args["stdout"] = subprocess.PIPE
-    if capture_stderr: args["stderr"] = subprocess.PIPE
-    if check: args["check"] = True
+    if capture_stdout:
+        args["stdout"] = subprocess.PIPE
+    if capture_stderr:
+        args["stderr"] = subprocess.PIPE
+    if check:
+        args["check"] = True
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/library/run.py` around lines 45 - 47, Refactor the single-line
conditionals in run.py so they use multi-line if blocks: replace the one-line
forms "if capture_stdout: args['stdout'] = subprocess.PIPE", "if capture_stderr:
args['stderr'] = subprocess.PIPE", and "if check: args['check'] = True" with
standard multi-line blocks (if ...: newline indented assignment) to satisfy Ruff
E701; locate these statements around the use of args, capture_stdout,
capture_stderr, check and subprocess.PIPE and update them accordingly.
projects/llm_d/README.md (1)

17-25: Add language specifier to fenced code block.

The code block showing the project structure lacks a language specifier. Adding one improves syntax highlighting and documentation quality.

📝 Proposed fix
-```
+```text
 skeleton/
 ├── orchestration/
 │   └── ci.py          # Main CI script with Click-based CLI
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/README.md` around lines 17 - 25, Update the fenced code block
in the README that shows the project skeleton to include a language specifier
(e.g., use ```text or ```bash) so syntax highlighting renders correctly; locate
the block in projects/llm_d/README.md (the triple-backtick block containing the
directory tree) and add the specifier immediately after the opening backticks.
projects/core/library/config.py (2)

1-3: Setting global logging level at import time may have side effects.

Line 2 sets the root logger level to INFO at module import time. This affects all loggers in the application and may override logging configuration set elsewhere. Consider moving this to an explicit initialization function or removing it if the caller is expected to configure logging.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/library/config.py` around lines 1 - 3, The module-level call
logging.getLogger().setLevel(logging.INFO) in config.py sets the root logger at
import time and may unintentionally override application logging; remove this
import-time side effect by deleting that line and instead provide an explicit
initializer such as an init_logging or configure_logging function (e.g., in
config module) that sets the root logger level when called, or leave
configuration entirely to the application caller; update any callers to invoke
the new init function where appropriate and keep the module imports free of
global logging configuration.

163-183: Consider renaming print parameter to avoid shadowing builtin.

The print parameter shadows Python's builtin print function. While this works because the function doesn't need the builtin, it can cause confusion and is flagged by linters. Consider renaming to log or should_print.

♻️ Proposed fix
-    def get_config(self, jsonpath, default_value=..., warn=True, print=True, handled_secretly=False):
+    def get_config(self, jsonpath, default_value=..., warn=True, log_value=True, handled_secretly=False):
         try:
             value = jsonpath_ng.parse(jsonpath).find(self.config)[0].value
         except IndexError as ex:
             if default_value != ...:
                 if warn:
                     logging.warning(f"get_config: {jsonpath} --> missing. Returning the default value: {default_value}")
                 return default_value
 
             logging.error(f"get_config: {jsonpath} --> {ex}")
-            raise KeyError(f"Key '{jsonpath}' not found in {self.config_path}")
+            raise KeyError(f"Key '{jsonpath}' not found in {self.config_path}") from ex
 
         if isinstance(value, str) and value.startswith("*$@"):
-            print = False
+            log_value = False
 
         value = self.resolve_reference(value, handled_secretly)
 
-        if print and not handled_secretly:
+        if log_value and not handled_secretly:
             logging.info(f"get_config: {jsonpath} --> {value}")
 
         return value

Note: This change would require updating all call sites that use print=False.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/library/config.py` around lines 163 - 183, Rename the boolean
parameter named print on the get_config method to avoid shadowing the built-in
print; for example change it to should_print (update the signature def
get_config(self, jsonpath, default_value=..., warn=True, should_print=True,
handled_secretly=False)), update all references inside the method (checks like
if should_print and not handled_secretly and any assignments that set
print=False), and update every call site that passes print=False/True to use the
new parameter name to preserve behavior; keep the other logic, including the
conditional that disables logging for secret values, unchanged.
projects/llm_d/orchestration/ci.py (1)

7-11: Unused imports can be removed.

subprocess, time, and Path are imported but not used in this file.

♻️ Proposed fix
 import sys
-import subprocess
-import time
-from pathlib import Path
 import types
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/orchestration/ci.py` around lines 7 - 11, Remove the unused
imports from the top of the file: delete subprocess, time, and Path from the
import list so only used modules (e.g., sys and types) remain; update the import
statement in projects/llm_d/orchestration/ci.py accordingly and run the
linter/pytest to confirm no missing references.
projects/core/ci_entrypoint/prepare_ci.py (1)

147-148: Daemon thread may truncate logs on unexpected exits.

Changing to daemon=True prevents the process from hanging if cleanup is missed, but daemon threads are terminated abruptly when the main thread exits. If sys.exit() is called without going through postchecks() or shutdown_dual_output(), buffered log data may be lost.

The existing cleanup paths (signal handlers, exception handler, postchecks) do call shutdown_dual_output(), so this is mostly mitigated. However, consider whether all exit paths are covered, particularly early exits before postchecks runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/ci_entrypoint/prepare_ci.py` around lines 147 - 148, The thread
started with threading.Thread(target=communicate, daemon=True) may be killed
abruptly and truncate buffered logs; change it to a non-daemon thread (remove
daemon=True) and ensure it is cleanly joined during shutdown: register an atexit
handler or call daemon_thread.join(timeout) from existing shutdown paths (signal
handlers, exception handler, and postchecks()) and/or invoke
shutdown_dual_output() before any early sys.exit(), so the communicate thread
has a chance to flush and exit cleanly; reference the thread creation
(communicate) and existing shutdown_dual_output()/postchecks() and update those
handlers to join the thread.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@projects/core/library/run.py`:
- Around line 1-7: Remove the duplicate import of the signal module in the
top-level imports of run.py: keep a single "import signal" and delete the
redundant one so only one import statement references the signal symbol; do not
change other imports (leave itertools as-is if intended for future use) and run
the linter to confirm no remaining duplicate-import warnings.

In `@projects/llm_d/orchestration/ci.py`:
- Around line 26-29: The CI wrapper functions prepare(), test(), and cleanup()
call helper modules but pass their return values directly to sys.exit even
though those helpers (e.g., prepare_llmd.prepare, test_llmd.test,
cleanup_llmd.cleanup) may return None; change each wrapper to capture the
helper's return value into a variable, coerce None to a non-zero exit code
(e.g., 1) or validate it's an int, and then call sys.exit(exit_code) so failures
aren't silently treated as success.

In `@projects/llm_d/README.md`:
- Around line 1-3: The README title currently reads "Skeleton Project"; update
the top-level header to reflect this repository's actual project name (e.g.,
change "# Skeleton Project" to "# LLM-D Project" or a similar project-specific
title) so the README matches the projects/llm_d identity.

---

Nitpick comments:
In `@projects/core/ci_entrypoint/prepare_ci.py`:
- Around line 147-148: The thread started with
threading.Thread(target=communicate, daemon=True) may be killed abruptly and
truncate buffered logs; change it to a non-daemon thread (remove daemon=True)
and ensure it is cleanly joined during shutdown: register an atexit handler or
call daemon_thread.join(timeout) from existing shutdown paths (signal handlers,
exception handler, and postchecks()) and/or invoke shutdown_dual_output() before
any early sys.exit(), so the communicate thread has a chance to flush and exit
cleanly; reference the thread creation (communicate) and existing
shutdown_dual_output()/postchecks() and update those handlers to join the
thread.

In `@projects/core/library/config.py`:
- Around line 1-3: The module-level call
logging.getLogger().setLevel(logging.INFO) in config.py sets the root logger at
import time and may unintentionally override application logging; remove this
import-time side effect by deleting that line and instead provide an explicit
initializer such as an init_logging or configure_logging function (e.g., in
config module) that sets the root logger level when called, or leave
configuration entirely to the application caller; update any callers to invoke
the new init function where appropriate and keep the module imports free of
global logging configuration.
- Around line 163-183: Rename the boolean parameter named print on the
get_config method to avoid shadowing the built-in print; for example change it
to should_print (update the signature def get_config(self, jsonpath,
default_value=..., warn=True, should_print=True, handled_secretly=False)),
update all references inside the method (checks like if should_print and not
handled_secretly and any assignments that set print=False), and update every
call site that passes print=False/True to use the new parameter name to preserve
behavior; keep the other logic, including the conditional that disables logging
for secret values, unchanged.

In `@projects/core/library/run.py`:
- Around line 58-59: The current decode calls will raise UnicodeDecodeError for
non-UTF-8 (binary) output; update the decoding of proc.stdout and proc.stderr
(the branches that check capture_stdout/decode_stdout and
capture_stderr/decode_stderr) to pass an errors handler (e.g., errors="replace"
or errors="backslashreplace") to .decode() so binary or invalid sequences don’t
raise, or alternately document that decode_stdout/decode_stderr must be False
for binary—choose the former for resilience and apply it to both
proc.stdout.decode(...) and proc.stderr.decode(...).
- Around line 45-47: Refactor the single-line conditionals in run.py so they use
multi-line if blocks: replace the one-line forms "if capture_stdout:
args['stdout'] = subprocess.PIPE", "if capture_stderr: args['stderr'] =
subprocess.PIPE", and "if check: args['check'] = True" with standard multi-line
blocks (if ...: newline indented assignment) to satisfy Ruff E701; locate these
statements around the use of args, capture_stdout, capture_stderr, check and
subprocess.PIPE and update them accordingly.

In `@projects/llm_d/orchestration/ci.py`:
- Around line 7-11: Remove the unused imports from the top of the file: delete
subprocess, time, and Path from the import list so only used modules (e.g., sys
and types) remain; update the import statement in
projects/llm_d/orchestration/ci.py accordingly and run the linter/pytest to
confirm no missing references.

In `@projects/llm_d/README.md`:
- Around line 17-25: Update the fenced code block in the README that shows the
project skeleton to include a language specifier (e.g., use ```text or ```bash)
so syntax highlighting renders correctly; locate the block in
projects/llm_d/README.md (the triple-backtick block containing the directory
tree) and add the specifier immediately after the opening backticks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c79eb4ea-7e0a-47e8-b1d2-8285517a77ca

📥 Commits

Reviewing files that changed from the base of the PR and between 828aa65 and 81abe78.

📒 Files selected for processing (16)
  • projects/core/ci_entrypoint/prepare_ci.py
  • projects/core/ci_entrypoint/run_ci.py
  • projects/core/library/config.py
  • projects/core/library/env.py
  • projects/core/library/run.py
  • projects/jump_ci/testing/test.py
  • projects/llm_d/README.md
  • projects/llm_d/orchestration/ci.py
  • projects/llm_d/orchestration/cli.py
  • projects/llm_d/orchestration/config.yaml
  • projects/llm_d/orchestration/prepare_llmd.py
  • projects/llm_d/orchestration/presets.d/cks.yaml
  • projects/llm_d/orchestration/presets.d/presets.yaml
  • projects/llm_d/orchestration/test_llmd.py
  • projects/llm_d/orchestration/vaults.yaml
  • pyproject.toml
✅ Files skipped from review due to trivial changes (7)
  • projects/jump_ci/testing/test.py
  • pyproject.toml
  • projects/llm_d/orchestration/prepare_llmd.py
  • projects/llm_d/orchestration/vaults.yaml
  • projects/llm_d/orchestration/presets.d/presets.yaml
  • projects/llm_d/orchestration/presets.d/cks.yaml
  • projects/llm_d/orchestration/config.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • projects/llm_d/orchestration/test_llmd.py
  • projects/llm_d/orchestration/cli.py

Comment on lines +1 to +7
import sys, os, signal
import traceback
import logging
logging.getLogger().setLevel(logging.INFO)
import json
import signal
import itertools
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Duplicate signal import.

signal is imported on both line 1 and line 6. Remove one of the duplicates.

Proposed fix
-import sys, os, signal
+import sys, os
 import traceback
 import logging
 logging.getLogger().setLevel(logging.INFO)
 import json
 import signal
-import itertools

Note: itertools also appears unused but wasn't flagged by the linter, so it may be reserved for future use.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import sys, os, signal
import traceback
import logging
logging.getLogger().setLevel(logging.INFO)
import json
import signal
import itertools
import sys, os
import traceback
import logging
logging.getLogger().setLevel(logging.INFO)
import json
import signal
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/core/library/run.py` around lines 1 - 7, Remove the duplicate import
of the signal module in the top-level imports of run.py: keep a single "import
signal" and delete the redundant one so only one import statement references the
signal symbol; do not change other imports (leave itertools as-is if intended
for future use) and run the linter to confirm no remaining duplicate-import
warnings.

Comment on lines +26 to +29
def prepare(ctx):
"""Prepare phase - Set up environment and dependencies."""
exit_code = prepare_llmd.prepare()
sys.exit(exit_code)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Exit codes will always be 0 - failures are silently masked.

The prepare_llmd.prepare() function lacks an explicit return statement, so it returns None. When passed to sys.exit(None), this is treated as exit code 0 (success). The same issue affects test() (lines 36-37) and cleanup() (lines 44-45).

This means CI will report success even when these operations fail internally.

🐛 Proposed fix: Ensure functions return explicit exit codes

In prepare_llmd.py:

 def prepare():
     ns = config.project.get_config("prepare.namespace.name")
     logger.warning(f"Hello prepare {ns}")
-    pass
+    return 0

 def cleanup():
     logger.warning("Hello cleanup")
-    pass
+    return 0

In test_llmd.py:

 def test():
     logger.warning("Hello test")
-    pass
+    return 0
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/orchestration/ci.py` around lines 26 - 29, The CI wrapper
functions prepare(), test(), and cleanup() call helper modules but pass their
return values directly to sys.exit even though those helpers (e.g.,
prepare_llmd.prepare, test_llmd.test, cleanup_llmd.cleanup) may return None;
change each wrapper to capture the helper's return value into a variable, coerce
None to a non-zero exit code (e.g., 1) or validate it's an int, and then call
sys.exit(exit_code) so failures aren't silently treated as success.

Comment thread projects/llm_d/README.md Outdated
Comment on lines +1 to +3
# Skeleton Project

This is a template/skeleton project that demonstrates how to create a new project within the **TOPSAIL-NG** test harness framework.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Update title to reflect the actual project.

The README title says "Skeleton Project" but this file is located in projects/llm_d/. Consider updating the title to "LLM-D Project" or similar to reflect the actual project.

📝 Proposed fix
-# Skeleton Project
+# LLM-D Project

-This is a template/skeleton project that demonstrates how to create a new project within the **TOPSAIL-NG** test harness framework.
+This project implements LLM-D testing within the **TOPSAIL-NG** test harness framework. It follows the standard project template structure.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Skeleton Project
This is a template/skeleton project that demonstrates how to create a new project within the **TOPSAIL-NG** test harness framework.
# LLM-D Project
This project implements LLM-D testing within the **TOPSAIL-NG** test harness framework. It follows the standard project template structure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@projects/llm_d/README.md` around lines 1 - 3, The README title currently
reads "Skeleton Project"; update the top-level header to reflect this
repository's actual project name (e.g., change "# Skeleton Project" to "# LLM-D
Project" or a similar project-specific title) so the README matches the
projects/llm_d identity.

@topsail-bot
Copy link
Copy Markdown

topsail-bot bot commented Mar 26, 2026

🟢 Test of 'jump_ci test' succeeded after 00 hours 00 minutes 54 seconds 🟢

• Link to the test results.

• No reports index generated...

Test configuration:

cluster.name: cks
project.args:
- cks
project.name: llm_d

@kpouget
Copy link
Copy Markdown
Contributor Author

kpouget commented Mar 26, 2026

/test llmd-jump-ci llm_d cks
/cluster cks

@topsail-bot
Copy link
Copy Markdown

topsail-bot bot commented Mar 26, 2026

🟢 Test of 'llm_d test' succeeded after 00 hours 00 minutes 00 seconds 🟢

• Link to the test results.

• No reports index generated...

Test configuration:

cluster.name: cks
project.args:
- cks
project.name: llm_d

@topsail-bot
Copy link
Copy Markdown

topsail-bot bot commented Mar 26, 2026

🟢 Test of 'jump_ci test' succeeded after 00 hours 00 minutes 57 seconds 🟢

• Link to the test results.

• No reports index generated...

Test configuration:

cluster.name: cks
project.args:
- cks
project.name: llm_d

@kpouget
Copy link
Copy Markdown
Contributor Author

kpouget commented Mar 26, 2026

/test llmd-jump-ci llm_d cks
/cluster cks

@topsail-bot
Copy link
Copy Markdown

topsail-bot bot commented Mar 26, 2026

🟢 Test of 'llm_d test' succeeded after 00 hours 00 minutes 00 seconds 🟢

• Link to the test results.

• No reports index generated...

Test configuration:

cluster.name: cks
project.args:
- cks
project.name: llm_d

@kpouget
Copy link
Copy Markdown
Contributor Author

kpouget commented Mar 27, 2026

/test llmd-jump-ci llm_d cks
/cluster cks

@topsail-bot
Copy link
Copy Markdown

topsail-bot bot commented Mar 27, 2026

🟢 Test of 'llm_d test' succeeded after 00 hours 00 minutes 00 seconds 🟢

• Link to the test results.

• No reports index generated...

Test configuration:

cluster.name: cks
project.args:
- cks
project.name: llm_d

@kpouget
Copy link
Copy Markdown
Contributor Author

kpouget commented Mar 31, 2026

/test llmd-jump-ci llm_d cks
/cluster cks
/only test

@topsail-bot
Copy link
Copy Markdown

topsail-bot bot commented Mar 31, 2026

🟢 Test of 'llm_d test' succeeded after 00 hours 00 minutes 06 seconds 🟢

• Link to the test results.

• No reports index generated...

Test configuration:

cluster.name: cks
exec_list._only_: true
exec_list.test: true
project.args:
- cks
project.name: llm_d

@kpouget
Copy link
Copy Markdown
Contributor Author

kpouget commented Mar 31, 2026

good, merging this as the first early version of the toolbox !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant