
sdk%fix(codeql): deal with inconsistent caches, allocate half of available memory, fix CSV parsing bug, add --cache/-c #6

Merged
kwvg merged 6 commits into dashpay:develop from kwvg:cql_fix
Jun 7, 2026

Conversation

@kwvg
Collaborator

@kwvg kwvg commented Jun 6, 2026

Additional Information

Breaking Changes

None expected.

How Has This Been Tested?

./contrib/lint/all_lint.py

Checklist

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional tests
  • I have made corresponding changes to the documentation (note: N/A)
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

@kwvg kwvg added this to the 0.1 milestone Jun 6, 2026
@coderabbitai

coderabbitai Bot commented Jun 6, 2026

📝 Walkthrough

This PR centralizes workspace discovery and resource estimation in contrib/lint/common.py, refactors lint_codeql.py to manage database/results lifecycles, changes CodeQL CSV diagnostics parsing, updates CodeQL predicates, and migrates other lint scripts to use root_dir().
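
The CSV parsing change mentioned above can be illustrated with a minimal sketch. It assumes the headerless, nine-column layout that CodeQL documents for CSV output (name, description, severity, message, path, start line, start column, end line, end column). The function name `parse_rows` and the dictionary keys are hypothetical and are not taken from lint_codeql.py.

```python
import csv
import io

# Assumed column count for CodeQL's headerless CSV output.
EXPECTED_COLUMNS = 9


def parse_rows(text):
    """Parse headerless CSV text into finding dicts, rejecting malformed rows."""
    findings = []
    for lineno, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # csv.reader yields [] for blank lines
        if len(row) != EXPECTED_COLUMNS:
            raise ValueError(
                f"row {lineno}: expected {EXPECTED_COLUMNS} columns, got {len(row)}"
            )
        name, _desc, severity, message, path, start_line = row[:6]
        findings.append({
            "name": name,
            "severity": severity,
            "message": " ".join(message.split()),  # collapse runs of whitespace
            "path": path.lstrip("/"),              # make the path relative
            "line": int(start_line),
        })
    return findings


sample = '"Rule A","desc","warning","bad   thing","/src/lib.rs","3","1","3","9"\n'
print(parse_rows(sample))
```

With `csv.DictReader` and no explicit `fieldnames`, the first row is consumed as the header, so on headerless input the first finding would be lost. Reading rows positionally and checking their shape avoids that.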

Changes

Lint infrastructure refactoring with shared environment utilities

| Layer / File(s) | Summary |
| --- | --- |
| Shared environment utilities foundation (contrib/lint/common.py) | Adds root_dir() to locate the workspace root by scanning for Cargo workspace layout, usable_threads() to compute a conservative thread count, and usable_mem() plus _physical_ram_bytes() to detect physical RAM on Linux (via /proc/meminfo), macOS (via sysctl hw.memsize), and Windows (via PowerShell/WMI), raising RuntimeError when detection fails or platform is unsupported. |
| CodeQL script refactoring with new utilities (contrib/lint/lint_codeql.py) | Refactors imports and CLI, adds _workspace_dirs() context manager for persistent-vs-temporary db/results handling and _parse_args(argv); replaces CSV DictReader with headerless csv.reader and row-shape validation, normalizes messages and constructs uri paths, integrates root_dir(), usable_threads(), and usable_mem() into codeql database create/analyze flow, and enforces manifest postconditions (raising RuntimeError if the manifest is missing). |
| CodeQL query/source-line predicate updates (contrib/codeql/import.ql, contrib/codeql/lib/policy.qll) | Updates fileCfgLines/source-line predicate to materialize sourceLineContent and match #[cfg...] text; changes implementsSerdeTrait to scan source lines with regex matching and adds a fallback branch that matches explicit impl serde::<trait> for <Type> text. |
| Lightweight lint script migrations to root_dir() (contrib/lint/lint_javascript.py, contrib/lint/lint_python.py, contrib/lint/lint_semgrep.py) | Each script now calls root_dir() from contrib/lint/common.py instead of performing per-script upward searches (find_up/is_workspace_root/pyproject.toml/workspace Cargo.toml). |
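
As a rough illustration of the resource estimation summarized for contrib/lint/common.py, the sketch below parses `/proc/meminfo`-style text and halves the result, in line with the PR title's "allocate half of available memory". The helper names (`parse_meminfo_total`, `half_ram_mb`, `conservative_threads`) and the leave-one-core-free policy are assumptions for illustration; the PR's actual `usable_mem()` and `usable_threads()` are not shown in this conversation.

```python
import os


def parse_meminfo_total(text):
    """Return MemTotal from /proc/meminfo-style text, in bytes."""
    for line in text.splitlines():
        if line.startswith("MemTotal:"):
            # Line format: "MemTotal:       16777216 kB"
            return int(line.split()[1]) * 1024
    raise RuntimeError("MemTotal not found")


def half_ram_mb(total_bytes):
    """Half of physical RAM in MiB, never less than 1."""
    return max(1, total_bytes // 2 // (1024 * 1024))


def conservative_threads():
    """Leave one core free (hypothetical policy, not taken from the PR)."""
    return max(1, (os.cpu_count() or 1) - 1)


sample = "MemTotal:       16777216 kB\nMemFree:         1234567 kB\n"
print(half_ram_mb(parse_meminfo_total(sample)))  # 16 GiB total -> 8192
print(conservative_threads())
```

Values like these could then be handed to the CodeQL CLI through its `--ram` and `--threads` options; how lint_codeql.py actually passes them is not visible here.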
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 69.23% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The description is related to the changeset, referencing dependency status, testing, and checklist completion relevant to the CodeQL/lint runner updates. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | The title references fixing CodeQL issues and mentions key changes (cache handling, memory allocation, CSV parsing, and --cache option), which aligns well with the substantial refactoring across multiple lint files to centralize root_dir discovery and improve resource management. |


@github-actions

github-actions Bot commented Jun 6, 2026

Warning

This pull request may have conflicts, please coordinate with the authors of these pull requests.

Potential conflicts


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (4)
contrib/lint/common.py (1)

116-125: 💤 Low value

Add ValueError to the WMIC exception handling for consistency.

The macOS path (lines 98-100) catches ValueError from int() conversion, but the WMIC fallback does not. If WMIC produces malformed output, int() on line 123 would raise an uncaught ValueError instead of falling through to the RuntimeError on line 126.

Suggested fix
     try:
       out = subprocess.check_output(
         ["wmic", "computersystem", "get",  # noqa: S607
          "TotalPhysicalMemory", "/value"],
       ).decode()
       for line in out.splitlines():
         if line.startswith("TotalPhysicalMemory="):
           return int(line.split("=", 1)[1].strip())
-    except (FileNotFoundError, subprocess.CalledProcessError):
+    except (FileNotFoundError, subprocess.CalledProcessError, ValueError):
       pass
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@contrib/lint/common.py` around lines 116 - 125, The WMIC fallback block in
contrib/lint/common.py currently only catches FileNotFoundError and
subprocess.CalledProcessError but can raise ValueError when int() parsing fails;
update the except clause for the WMIC branch (the try around
subprocess.check_output and the loop that returns int(line.split(...))) to also
include ValueError so malformed WMIC output falls through to the final
RuntimeError, e.g., add ValueError to the exception tuple that currently
references FileNotFoundError and subprocess.CalledProcessError.
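
The failure mode described in this comment can be reproduced without running `wmic` at all. The sketch below separates parsing from the subprocess call; `parse_wmic_total` is a hypothetical helper, not a function from common.py, and it returns `None` instead of letting `ValueError` escape so that a caller can fall through to its final `RuntimeError`.

```python
def parse_wmic_total(out):
    """Return TotalPhysicalMemory in bytes, or None if absent or malformed."""
    for line in out.splitlines():
        if line.startswith("TotalPhysicalMemory="):
            try:
                return int(line.split("=", 1)[1].strip())
            except ValueError:
                # Malformed value: report "not detected" rather than crash.
                return None
    return None


print(parse_wmic_total("TotalPhysicalMemory=17179869184\r\n"))  # 17179869184
print(parse_wmic_total("TotalPhysicalMemory=n/a\r\n"))          # None
```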
contrib/lint/lint_codeql.py (3)

173-183: 💤 Low value

Clarify --skip-cache help text.

The help text "rebuild the database from scratch" could be misinterpreted as rebuilding the persistent cache. In reality, --skip-cache bypasses the cache entirely by using a temporary directory that is discarded afterward.

Consider rephrasing to "bypass cache using temporary database" or "skip persistent cache" for clarity.

♻️ Proposed help text clarification
   parser.add_argument(
     "-s",
     "--skip-cache",
     action="store_true",
-    help="rebuild the database from scratch",
+    help="bypass cache using a temporary database",
   )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@contrib/lint/lint_codeql.py` around lines 173 - 183, Update the help string
for the --skip-cache argument in _parse_args to clarify it does not rebuild the
persistent cache but instead bypasses it by using a temporary, discarded
database; change the help text to something like "bypass persistent cache and
use a temporary database (discarded after run)" so callers of _parse_args and
the parser.add_argument("--skip-cache") option reflect the correct behavior.

242-247: ⚡ Quick win

Consider cache directory migration or cleanup guidance.

The cache structure has changed to use .cache/db and .cache/results subdirectories (see lines 168-170). If developers have existing caches from before this PR, they may encounter issues with incompatible directory structures.

Consider adding a check to detect and clean up old cache formats, or document the need to manually clear old caches after this change.

🔄 Proposed cache cleanup check
   cache_dir = query_dir / ".cache"
+  # Clean up old cache format (pre-migration)
+  old_db_manifest = cache_dir / "codeql-database.yml"
+  if old_db_manifest.is_file():
+    print("Detected old cache format, cleaning up...", file=sys.stderr)
+    shutil.rmtree(cache_dir, ignore_errors=True)
 
   with _workspace_dirs(cache_dir, skip_cache=args.skip_cache) as (
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@contrib/lint/lint_codeql.py` around lines 242 - 247, Before calling
_workspace_dirs (where cache_dir is defined), add a check that detects legacy
cache layouts (e.g., .cache containing files/directories but missing the new
.cache/db and .cache/results subdirs) and either migrate or remove them when
args.skip_cache is false; implement this using the existing cache_dir variable
and args.skip_cache flag, performing safe migration of files into cache_dir/"db"
and cache_dir/"results" or deleting old entries and logging the action, then
proceed to call _workspace_dirs to yield active_db and results_dir.
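
One way the legacy-layout check described above might look is sketched here. It treats a cache directory as legacy when it has content but neither a `db` nor a `results` subdirectory. The function name `is_legacy_cache` and this exact rule are assumptions; a partially migrated cache (only one of the two subdirectories present) is treated as new-layout in this sketch.

```python
import tempfile
from pathlib import Path


def is_legacy_cache(cache_dir):
    """True if cache_dir has content but lacks the db/ and results/ subdirectories."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return False  # nothing cached yet, so nothing to migrate
    has_new_layout = (cache_dir / "db").is_dir() or (cache_dir / "results").is_dir()
    return any(cache_dir.iterdir()) and not has_new_layout


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / ".cache"
    cache.mkdir()
    (cache / "codeql-database.yml").write_text("legacy\n")
    print(is_legacy_cache(cache))  # True: content present, no db/ or results/
    (cache / "db").mkdir()
    print(is_legacy_cache(cache))  # False: new layout detected
```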

147-171: ⚡ Quick win

Consider logging cleanup failures for observability.

The cleanup at line 166 uses shutil.rmtree(..., ignore_errors=True), which silently suppresses errors like permission issues or file locks. While this prevents the script from crashing during cleanup, it may leave temporary directories behind without any indication.

Consider logging cleanup failures to stderr so administrators can detect and address accumulating temp directories.

♻️ Proposed enhancement for cleanup observability
     finally:
-      shutil.rmtree(tmp, ignore_errors=True)
+      try:
+        shutil.rmtree(tmp)
+      except OSError as e:
+        print(f"warning: cleanup failed for {tmp}: {e}", file=sys.stderr)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@contrib/lint/lint_codeql.py` around lines 147 - 171, The cleanup in
_workspace_dirs currently calls shutil.rmtree(tmp, ignore_errors=True) which
swallows failures; change this to attempt rmtree without ignore_errors and catch
exceptions, then log the failure (including the tmp path and exception details)
to stderr or the module logger so administrators can detect lingering temp dirs;
specifically update the finally block in _workspace_dirs to wrap
shutil.rmtree(tmp) in try/except and use logging.error or sys.stderr.write with
the path and exception information.
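
For context, a persistent-versus-temporary context manager with the suggested cleanup warning could be sketched as follows. This is not the PR's `_workspace_dirs`; the name `workspace_dirs`, the `use_cache` parameter, and the `codeql-` temp prefix are hypothetical.

```python
import contextlib
import shutil
import sys
import tempfile
from pathlib import Path


@contextlib.contextmanager
def workspace_dirs(cache_dir, use_cache):
    """Yield (db_dir, results_dir): persistent when use_cache, else temporary."""
    if use_cache:
        db, results = Path(cache_dir) / "db", Path(cache_dir) / "results"
        db.mkdir(parents=True, exist_ok=True)
        results.mkdir(parents=True, exist_ok=True)
        yield db, results
        return  # persistent directories are left in place
    tmp = Path(tempfile.mkdtemp(prefix="codeql-"))
    try:
        (tmp / "db").mkdir()
        (tmp / "results").mkdir()
        yield tmp / "db", tmp / "results"
    finally:
        try:
            shutil.rmtree(tmp)
        except OSError as exc:
            # Report the failure instead of silently leaving the directory behind.
            print(f"warning: cleanup failed for {tmp}: {exc}", file=sys.stderr)


with workspace_dirs("unused", use_cache=False) as (db, results):
    print(db.is_dir(), results.is_dir())
```

Catching `OSError` keeps the script from crashing during cleanup, as `ignore_errors=True` did, while still surfacing the path that was left behind.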

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a7e1c045-1c13-4153-9053-18e76646600c

📥 Commits

Reviewing files that changed from the base of the PR and between 4e93394 and ae2b110.

📒 Files selected for processing (5)
  • contrib/lint/common.py
  • contrib/lint/lint_codeql.py
  • contrib/lint/lint_javascript.py
  • contrib/lint/lint_python.py
  • contrib/lint/lint_semgrep.py

@kwvg
Collaborator Author

kwvg commented Jun 7, 2026

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 7, 2026

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
contrib/codeql/lib/policy.qll (1)

149-158: 💤 Low value

Minor: Regex may not match generic impl declarations.

The regex pattern on lines 156-157 uses \b word boundaries around the type name, which won't match generic types like impl serde::Serialize for Container<T> since < isn't a word character. This is likely acceptable since this is a fallback for extractor limitations, but be aware it may miss some generic manual impls.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@contrib/codeql/lib/policy.qll` around lines 149 - 158, The regex used in the
exists(...) clause (via sourceLineContent(...).regexpMatch(...)) uses \b around
t.getName().getText() which fails to match generic impls like "impl
serde::Serialize for Container<T>"; update the pattern to allow an optional
generic parameter or non-word boundary after the type name (e.g., permit '<' or
end/non-word boundary) so regexpMatch will match both plain and generic impls;
adjust the concatenation that builds the pattern using traitName and
t.getName().getText() accordingly so fileRelPath/sourceLineContent still find
manual impls for serde traits.
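
The effect of allowing an optional generic parameter list can be checked with a small Python sketch. The pattern in policy.qll is not shown in this conversation, so both patterns below are hypothetical. `re.fullmatch` is used on the assumption that CodeQL's `regexpMatch` must match the whole string. Note that `\b` by itself does match between `Container` and `<`; in this sketch it is what the pattern requires after the type name that decides whether a generic impl matches.

```python
import re


def impl_pattern(trait, type_name, allow_generics):
    """Build a whole-line pattern for `impl serde::<trait> for <Type>`."""
    generics = r"(?:<[^{]*>)?" if allow_generics else ""
    return (
        r"\s*impl(?:<[^>]*>)?\s+serde::" + re.escape(trait)
        + r"\s+for\s+" + re.escape(type_name) + r"\b" + generics + r"\s*\{?\s*"
    )


plain = "impl serde::Serialize for Plain {"
generic = "impl<T> serde::Serialize for Container<T> {"

strict = impl_pattern("Serialize", "Container", allow_generics=False)
relaxed = impl_pattern("Serialize", "Container", allow_generics=True)

# The word boundary alone is satisfied before '<'.
print(re.search(r"\bContainer\b", generic) is not None)  # True
# The strict whole-line pattern rejects the generic impl; the relaxed one accepts it.
print(re.fullmatch(strict, generic) is not None)   # False
print(re.fullmatch(relaxed, generic) is not None)  # True
print(re.fullmatch(impl_pattern("Serialize", "Plain", False), plain) is not None)  # True
```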
contrib/lint/lint_codeql.py (1)

154-164: 💤 Low value

PR title mentions --skip-cache/-s but implementation uses --cache/-c.

The PR title describes adding a --skip-cache flag, but the implementation uses --cache with opposite semantics (opt-in caching vs opt-out). The current implementation (opt-in caching) is a sensible default—just ensure the PR description is updated to reflect the actual behavior.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@contrib/lint/lint_codeql.py` around lines 154 - 164, The PR title and
description reference a `--skip-cache`/`-s` flag but the actual implementation
in the _parse_args function uses `--cache`/`-c` with opposite semantics (opt-in
caching). Since the implementation is correct and sensible, update the PR title
and description to accurately reflect the actual `--cache`/`-c` flag behavior
instead of the `--skip-cache` terminology mentioned in the PR metadata.
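
The opt-in semantics described here can be sketched with `argparse`. Only the flag names `--cache`/`-c` come from the PR; the function name `parse_args`, the description string, and the help text are assumptions.

```python
import argparse


def parse_args(argv):
    """Parse CLI arguments; caching is off unless --cache/-c is given."""
    parser = argparse.ArgumentParser(description="CodeQL lint runner (sketch)")
    parser.add_argument(
        "-c",
        "--cache",
        action="store_true",
        help="reuse a persistent database and results directory",
    )
    return parser.parse_args(argv)


print(parse_args([]).cache)      # False: temporary directories by default
print(parse_args(["-c"]).cache)  # True: persistent cache requested
```

With `action="store_true"` the default is `False`, so a plain invocation never touches a persistent cache, which matches the opt-in behavior the comment calls a sensible default.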

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 0b490ff3-13d0-4d2c-a25b-01b4f8612e1c

📥 Commits

Reviewing files that changed from the base of the PR and between ae2b110 and dfe3bb2.

📒 Files selected for processing (7)
  • contrib/codeql/import.ql
  • contrib/codeql/lib/policy.qll
  • contrib/lint/common.py
  • contrib/lint/lint_codeql.py
  • contrib/lint/lint_javascript.py
  • contrib/lint/lint_python.py
  • contrib/lint/lint_semgrep.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • contrib/lint/lint_semgrep.py
  • contrib/lint/lint_javascript.py
  • contrib/lint/common.py

@kwvg kwvg changed the title sdk%fix(codeql): deal with inconsistent caches, allocate half of available memory, fix CSV parsing bug, add --skip-cache/-s sdk%fix(codeql): deal with inconsistent caches, allocate half of available memory, fix CSV parsing bug, add --cache/-c Jun 7, 2026
@kwvg kwvg merged commit bb4707c into dashpay:develop Jun 7, 2026
5 checks passed