Skip to content

More container fixes, board selection#555

Merged
ramseymcgrath merged 9 commits intomainfrom
more-container-fixes
Feb 27, 2026
Merged

More container fixes, board selection#555
ramseymcgrath merged 9 commits intomainfrom
more-container-fixes

Conversation

@ramseymcgrath
Copy link
Copy Markdown
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings February 27, 2026 01:16
@roomote-v0
Copy link
Copy Markdown

roomote-v0 Bot commented Feb 27, 2026

Rooviewer Clock   See task

Reviewed commit c933018. No issues found. The changes include important bug fixes and code improvements:

  • Fixed vfio_helpers buffer padding to use full 40-byte length
  • Removed dead decode_table_size lambda that was incorrectly adding 1 to already-decoded table_size values
  • Improved FPGA part regex to handle suffixed parts (e.g., xczu3eg-sbva484-1-e)
  • Added proper argument validation for --vivado-timeout
  • Updated tests to reflect corrected behavior
Previous reviews

Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.

Comment thread src/utils/template_validator.py Fixed
@ramseymcgrath ramseymcgrath linked an issue Feb 27, 2026 that may be closed by this pull request
@github-actions
Copy link
Copy Markdown

E2E Integration Test Summary

Workflow Run: 837
Commit: af38e74
Branch: 555/merge
Triggered by: pull_request
Timestamp: Fri Feb 27 01:18:41 UTC 2026

Test Results

End-to-End Tests

E2E Tests: FAILED
⏸️ Security Scan: SKIPPED
⏸️ Performance Analysis: SKIPPED

Artifacts

  • Test reports and logs available in workflow artifacts
  • Artifacts retained for 7-30 days depending on type

@github-actions
Copy link
Copy Markdown

E2E Integration Test Summary

Workflow Run: 839
Commit: 0174b9a
Branch: 555/merge
Triggered by: pull_request
Timestamp: Fri Feb 27 01:20:57 UTC 2026

Test Results

End-to-End Tests

E2E Tests: FAILED
⏸️ Security Scan: SKIPPED
⏸️ Performance Analysis: SKIPPED

Artifacts

  • Test reports and logs available in workflow artifacts
  • Artifacts retained for 7-30 days depending on type

@roomote-v0
Copy link
Copy Markdown

roomote-v0 Bot commented Feb 27, 2026

Rooviewer Clock   See task

Review complete. The PR addresses container build issues from #548-#554 with security improvements, bug fixes, and code cleanup. No issues were identified that require changes.

Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR focuses on improving container/VFIO robustness and correctness across the build + orchestration flow, while tightening several PCI/MSI-X/config-space handling behaviors and updating related tests.

Changes:

  • Harden VFIO + container execution paths (ioctl constants, container mounts/permissions, safer temp script handling, safer subprocess usage, env var cleanup).
  • Fix/align PCI capability, MSI-X, and config-space parsing/validation behavior (table_size semantics, PM D3hot bit, extended config sizing, capability traversal guards).
  • Improve Vivado integration and board resolution (SystemVerilog header handling, board path discovery mapping, IP XCI speed-grade patch hook).

Reviewed changes

Copilot reviewed 33 out of 33 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
tests/test_unified_flow_orchestration.py Update env var assertions to verify cleanup/restoration after build.
tests/test_shell_unit.py Align test expectations with Shell.run() using shell=False.
tests/test_pci_capability_pruning.py Update expected PM capability word behavior after PME-bit clearing changes.
tests/test_orchestration_local.py Update env var assertions to verify cleanup/restoration after local templating.
tests/test_msix_bar_validator_comprehensive.py Align tests with decoded table_size semantics (no extra decode).
tests/test_device_config.py Align tests with validate_hex_id() always treating strings as hex for PCI IDs.
tests/test_config_space_manager.py Update tests for revision-id preservation and 4096-byte config parsing.
tests/test_build_additions.py Remove _recheck_vfio_bindings test corresponding to removed method.
src/vivado_handling/pcileech_build_integration.py Add XCI speed-grade patch hook; treat .svh as SystemVerilog design units in Vivado TCL.
src/vivado_handling/ip_lock_resolver.py Introduce patch_xci_speed_grade() helper for pre-build IP unlock workflow.
src/utils/template_validator.py Prefer authoritative board mapping via BoardDiscovery/RepoManager before path heuristics.
src/templating/template_renderer.py Replace floating log2 floor with integer arithmetic (bit_length).
src/templating/sv_context_builder.py Preserve class_code/revision_id from nested config_space when present.
src/templates/sv/pcileech_cfgspace.coe.j2 Adjust MSI-X support macro output behavior used for conditional rendering.
src/templates/sv/pcileech_bar_impl_device.sv.j2 Avoid multiple drivers by merging reset/write always blocks; improve 1-byte read packing.
src/shell.py Switch from shell=True to shell=False + shlex.split for safer command execution.
src/pci_capability/processor.py Correct standard capability ID handling; fix MSI control bit-masking; add circular-chain guard in removal traversal.
src/pci_capability/msix_bar_validator.py Treat table_size as already-decoded (no decode_table_size application).
src/pci_capability/constants.py Correct PM D3hot support bit constant (bit 9).
src/pci_capability/_pruning.py Preserve PM version/support fields while clearing PME bits; ensure D3hot set correctly.
src/device_clone/writemask_generator.py Fix extended-cap “next” pointer interpretation; add bounds check for capability offsets.
src/device_clone/msix_capability.py Clarify alignment semantics; fix indentation so validation errors print only when present.
src/device_clone/device_config.py Make PCI ID parsing consistent: non-0x strings are treated as hex.
src/device_clone/config_space_manager.py Preserve revision ID 0x00; parse full 4096-byte extended config space; avoid overwriting donor values.
src/cli/vfio_helpers.py Change ioctl argument handling for device name; adjust fd return handling and cleanup flow.
src/cli/vfio_constants.py Correct VFIO ioctl numbers to match kernel uapi definitions (use _IO where appropriate).
src/cli/container.py Harden container mount behavior (debugfs RO) and keep argument-vector execution.
src/build.py Improve CLI constraints for --vivado-jobs; adjust help text for timeout.
src/version.py Bump version metadata to 0.14.11.
pcileech.py Rework container-image rebuild policy, forward additional flags into container runs, add env var cleanup in local templating, harden remediation script creation, switch to importlib.metadata.
Containerfile Restrict sudo scope, validate package install during build, adjust PYTHONPATH to avoid import conflicts.
CHANGELOG.rst Add entries for versions 0.14.6–0.14.11.
.dockerignore Include packaging metadata files in build context by no longer ignoring them.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +117 to +163
def patch_xci_speed_grade(
ip_root: Path,
target_fpga_part: str,
logger=None,
prefix: str = "VIVADO",
) -> int:
"""Patch XCI files whose SPEEDGRADE doesn't match the target FPGA part.

Vivado locks IP cores when the XCI's SPEEDGRADE differs from the
project's target part. This pre-build step rewrites the value so
Vivado can open/regenerate the cores without manual intervention.

Returns the number of files patched.
"""
import re as _re

logger = logger or get_logger(__name__)

# Extract speed grade from part string (e.g. "xc7a100tfgg484-1" -> "-1")
match = _re.search(r"(-\d+)$", target_fpga_part)
if not match:
log_warning_safe(
logger,
safe_format(
"Cannot extract speed grade from part '{part}'",
part=target_fpga_part,
),
prefix=prefix,
)
return 0
target_grade = match.group(1)

ip_dirs = _discover_ip_dirs(ip_root)
patched = 0
for ip_dir in ip_dirs:
for xci_file in ip_dir.glob("*.xci"):
try:
text = xci_file.read_text(encoding="utf-8")
# Match "SPEEDGRADE": [ { "value": "-2" } ]
new_text, n = _re.subn(
r'("SPEEDGRADE"\s*:\s*\[\s*\{\s*"value"\s*:\s*)"(-?\d+)"',
rf'\1"{target_grade}"',
text,
)
if n > 0 and new_text != text:
xci_file.write_text(new_text, encoding="utf-8")
patched += 1
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

patch_xci_speed_grade() is new behavior that rewrites XCI files; there are existing unit tests for repair_ip_artifacts() in this module but none covering this new patching logic. Add unit tests that (a) verify SPEEDGRADE is patched when mismatched and (b) verify no change when already correct / when no speed grade can be parsed.

Copilot uses AI. Check for mistakes.
Comment on lines +102 to 105
# Validate table size -- msix.py already decodes the N-1 encoded
# value to N, so use it directly (do NOT apply decode_table_size).
table_size = msix_cap.get("table_size", 0)
if not (MSIX_MIN_TABLE_ENTRIES <= table_size <= MSIX_MAX_TABLE_ENTRIES):
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After switching to use the already-decoded table_size, the module-level decode_table_size helper is now unused, which is confusing (it suggests callers should still decode). Either remove it or add a brief comment clarifying it is legacy/unused to avoid accidental double-decoding later.

Copilot uses AI. Check for mistakes.
Comment thread src/build.py Outdated
type=int,
default=3600,
help="Timeout for Vivado operations in seconds (default: 3600)",
help="Timeout for Vivado operations in seconds (min 60, default: 3600)",
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The help text says --vivado-timeout has a minimum of 60 seconds, but argparse doesn’t enforce this (negative/very small values are accepted). Either enforce the minimum (e.g., via a custom type/validation) or adjust the help string to match actual behavior.

Suggested change
help="Timeout for Vivado operations in seconds (min 60, default: 3600)",
help="Timeout for Vivado operations in seconds (default: 3600)",

Copilot uses AI. Check for mistakes.
Comment thread pcileech.py
Comment on lines 918 to 930
def _ensure_container_image(runtime: str, logger, tag: str = "pcileech-fwgen") -> bool:
"""Ensure the container image exists; build it if missing."""
"""Always rebuild the container image to ensure latest code is used."""
try:
inspect = subprocess.run(
[runtime, "image", "inspect", tag],
capture_output=True,
text=True,
)
if inspect.returncode == 0:
# Image exists - warn user about potential staleness
age_days = _get_image_age_days(runtime, tag)
if age_days is not None and age_days > 7:
log_warning_safe(
logger,
safe_format(
"Container image '{tag}' is {days} days old. "
"If you experience issues, rebuild with: "
"{rt} system prune -a -f && {rt} build -t {tag} -f Containerfile .",
tag=tag,
days=age_days,
rt=runtime,
),
prefix="BUILD",
)
else:
log_info_safe(
logger,
safe_format(
"Using existing container image: {tag}",
tag=tag,
),
prefix="BUILD",
)
return True
# Build image
log_info_safe(
logger,
safe_format("Building container image: {tag}", tag=tag),
safe_format("Building container image: {tag} (always rebuilds for consistency)", tag=tag),
prefix="BUILD",
)
build = subprocess.run(
[runtime, "build", "-t", tag, "-f", "Containerfile", "."],
[runtime, "build", "--no-cache", "-t", tag, "-f", "Containerfile", "."],
cwd=str(project_root),
)
return build.returncode == 0
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always rebuilding the container image with --no-cache makes every container-based run pay the full build cost and can significantly slow down normal workflows/CI. Consider restoring the previous “build if missing (or stale)” behavior, or gating --no-cache behind an explicit flag/env var (e.g., --rebuild-image / PCILEECH_REBUILD_IMAGE=1).

Copilot uses AI. Check for mistakes.
Comment on lines 18 to 23
{# Uses set+caller pattern instead of macro output to return a true boolean #}
{% macro is_msix_supported() %}
{{ (msix_config is defined) and (msix_config and (msix_config.is_supported if hasattr(msix_config, 'is_supported') else (msix_config.num_vectors|default(0) > 0))) }}
{%- if (msix_config is defined) and msix_config and (msix_config.is_supported if hasattr(msix_config, 'is_supported') else (msix_config.num_vectors|default(0) > 0)) -%}
True
{%- endif -%}
{% endmacro %}
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims this macro uses a “set+caller pattern”, but the implementation just emits the literal text True (or nothing). Update the comment to accurately describe the approach (macro output used as a truthy/falsy string) to avoid misleading future edits.

Copilot uses AI. Check for mistakes.
Comment thread src/cli/vfio_helpers.py Outdated
Comment on lines +561 to +562
# Null-terminate and pad to fixed length for the ioctl
name_buf = name_bytes + b"\x00"
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name_buf is not actually padded to the fixed 40-byte buffer length expected by the VFIO_GROUP_GET_DEVICE_FD ioctl argument (comment says it is, but the code only appends a single NUL). Passing a shorter bytes object risks ioctl reading past the provided buffer and can produce EFAULT/undefined behavior. Pad the buffer to 40 bytes (NUL-terminated) before calling ioctl, and keep the existing max-length check.

Suggested change
# Null-terminate and pad to fixed length for the ioctl
name_buf = name_bytes + b"\x00"
# Null-terminate and pad to fixed 40-byte length for the ioctl
name_buf = name_bytes + b"\x00" * (40 - len(name_bytes))

Copilot uses AI. Check for mistakes.
Comment thread src/vivado_handling/ip_lock_resolver.py Outdated
Comment on lines +135 to +147
# Extract speed grade from part string (e.g. "xc7a100tfgg484-1" -> "-1")
match = _re.search(r"(-\d+)$", target_fpga_part)
if not match:
log_warning_safe(
logger,
safe_format(
"Cannot extract speed grade from part '{part}'",
part=target_fpga_part,
),
prefix=prefix,
)
return 0
target_grade = match.group(1)
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speed-grade extraction only matches parts ending with -<digits>; it will fail for valid Xilinx part strings like xczu3eg-sbva484-1-e that appear in BoardDiscovery.BOARD_CONFIGS, causing the patch step to be silently skipped. Update the regex/parsing to handle suffixes (e.g., -1-e, -2-i, etc.) and still extract the speed grade portion.

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown

E2E Integration Test Summary

Workflow Run: 841
Commit: 03cd8fc
Branch: 555/merge
Triggered by: pull_request
Timestamp: Fri Feb 27 01:28:22 UTC 2026

Test Results

End-to-End Tests

E2E Tests: FAILED
⏸️ Security Scan: SKIPPED
⏸️ Performance Analysis: SKIPPED

Artifacts

  • Test reports and logs available in workflow artifacts
  • Artifacts retained for 7-30 days depending on type

roomote-v0[bot]
roomote-v0 Bot previously approved these changes Feb 27, 2026
Copilot AI review requested due to automatic review settings February 27, 2026 01:34
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@ramseymcgrath ramseymcgrath merged commit 4347458 into main Feb 27, 2026
5 of 6 checks passed
@ramseymcgrath ramseymcgrath deleted the more-container-fixes branch February 27, 2026 01:36
@github-actions
Copy link
Copy Markdown

E2E Integration Test Summary

Workflow Run: 843
Commit: ef02e3a
Branch: 555/merge
Triggered by: pull_request
Timestamp: Fri Feb 27 01:37:21 UTC 2026

Test Results

End-to-End Tests

E2E Tests: FAILED
⏸️ Security Scan: SKIPPED
⏸️ Performance Analysis: SKIPPED

Artifacts

  • Test reports and logs available in workflow artifacts
  • Artifacts retained for 7-30 days depending on type

@github-actions
Copy link
Copy Markdown

E2E Integration Test Summary

Workflow Run: 845
Commit: 659bb53
Branch: 555/merge
Triggered by: pull_request
Timestamp: Fri Feb 27 01:40:44 UTC 2026

Test Results

End-to-End Tests

E2E Tests: FAILED
⏸️ Security Scan: SKIPPED
⏸️ Performance Analysis: SKIPPED

Artifacts

  • Test reports and logs available in workflow artifacts
  • Artifacts retained for 7-30 days depending on type

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants