feat: cloud pricing, org hierarchy, GitHub ETL, extensions#15

Merged
jeremyeder merged 5 commits into main from feat/cloud-pricing-and-extensions
Apr 15, 2026

Conversation

@jeremyeder (Contributor)

Summary

  • Cloud pricing: scripts/fetch_pricing.py fetches from public AWS pricing bulk files (no auth) + hardcoded Anthropic published Claude model rates. Includes discount factor support and ROSA cluster cost estimation.
  • Org hierarchy tools: get_direct_reports, get_org_tree, get_management_chain — recursive CTE queries against the person table for traversing manager relationships.
  • GitHub ETL: scripts/fetch_github.py --org YOUR_ORG fetches repos, commits, PRs, issues, releases, code frequency. Includes 6 MCP tools for querying GitHub data.
  • Multi-DB attachment: Extensible pattern for attaching pricing.db, github.db (read-only, when present). Schema resources expose DDL + row counts.
  • Extension loader: extensions/ directory for local-only plugins not tracked upstream (e.g., org-specific integrations).
  • Security hooks: check-no-secrets.sh scans for API keys/tokens, check-protected-files.sh guards governance/ directory.
  • Catalog refresh: scripts/refresh_catalog.py --refresh fetches fresh data from configured sources.
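
The org-hierarchy tools above rely on recursive CTEs against the `person` table. As a rough sketch of that pattern (the table name comes from this PR; the column names, sample data, and function shape here are assumptions):

```python
import sqlite3

# Toy org chart: Dana manages Lee, who manages Sam.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO person VALUES (1, 'Dana', NULL), (2, 'Lee', 1), (3, 'Sam', 2);
""")

def get_management_chain(conn, person_id):
    """Walk manager_id links upward from a person to the org root."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(id, name, manager_id) AS (
            SELECT id, name, manager_id FROM person WHERE id = ?
            UNION ALL
            SELECT p.id, p.name, p.manager_id
            FROM person p JOIN chain c ON p.id = c.manager_id
        )
        SELECT name FROM chain
        """,
        (person_id,),
    ).fetchall()
    return [r[0] for r in rows]

print(get_management_chain(conn, 3))  # ['Sam', 'Lee', 'Dana']
```

`get_org_tree` and `get_direct_reports` would invert the join (`c.id = p.manager_id`) to walk downward instead of upward.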

Test plan

  • uv run scripts/fetch_pricing.py --claude-only adds Claude model pricing to pricing.db
  • uv run scripts/fetch_pricing.py --skip-rosa populates pricing.db with AWS public pricing
  • uv run scripts/fetch_github.py --org YOUR_ORG populates github.db
  • MCP server starts: uv run mcp_server.py
  • cloud_pricing_lookup tool returns results
  • github_org_summary tool returns results
  • get_org_tree / get_direct_reports / get_management_chain work
  • scripts/test.sh passes
  • No PII, org-specific data, or credentials in any committed file

🤖 Generated with Claude Code

Ambient Code Bot and others added 4 commits April 14, 2026 21:48
Adds scripts/fetch_pricing.py which fetches from public AWS pricing
bulk files (no auth) and hardcoded Anthropic published rates.
Includes discount factor support and ROSA cluster cost estimation.

Also adds:
- Multi-DB attachment pattern (pricing.db, extensible)
- Extension loader for local-only plugins
- Pricing MCP tools (cloud_pricing_lookup, rosa_cluster_costs)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…hain)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add check-no-secrets.sh: scans staged files for API keys, tokens,
  passwords, private keys, and bearer tokens
- Add check-protected-files.sh: blocks changes to governance/ paths
  (configurable via PROTECTED_PATHS env var, bypass with ALLOW_PROTECTED=1)
- Add refresh_catalog.py: scans data/, renames files to date-prefix
  convention, maintains DATA_CATALOG.yaml, and optionally fetches from
  Jira and web URLs via --refresh flag
- Update .pre-commit-config.yaml to use external hook scripts
- Update CLAUDE.md with refresh_catalog.py command

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 15, 2026

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: f112a782-957b-4c2e-9e43-48bf60e977fd

📥 Commits

Reviewing files that changed from the base of the PR and between 43dc833 and e38fed9.

📒 Files selected for processing (4)
  • mcp_server.py
  • scripts/fetch_github.py
  • scripts/fetch_pricing.py
  • scripts/refresh_catalog.py

Walkthrough

This pull request adds large data-fetching scripts (GitHub and cloud pricing), a catalog refresh utility, MCP server extensions exposing org, pricing, and GitHub query tools, pre-commit hook scripts to detect secrets and protect files, startup extension loading, and test script checks for optional DBs.

Changes

  • Commands & Docs (.claude/CLAUDE.md): Added three uv run commands for scripts/fetch_pricing.py, scripts/fetch_github.py --org YOUR_ORG, and scripts/refresh_catalog.py --refresh.
  • Git & Hooks Config (.gitignore, .pre-commit-config.yaml): Ignored extensions/; replaced the inline no-secrets hook with scripts/check-no-secrets.sh (language → script, removed exclude); added a protected-files local hook running scripts/check-protected-files.sh.
  • Pre-commit Hook Scripts (scripts/check-no-secrets.sh, scripts/check-protected-files.sh): New bash hooks: a secret-pattern grep scanner and a protected-paths blocker (defaults to governance/) with an ALLOW_PROTECTED bypass.
  • GitHub Data Ingestion (scripts/fetch_github.py): New large Python script that builds and maintains data/github.db via gh api calls: schema, views, throttling/retries, incremental/upsert logic, per-repo processing (repos, languages, contributors, commits, PRs/reviews, issues, releases, code frequency), and CLI options (--org, --repo, --since, --full, --skip-commits).
  • Cloud Pricing Ingestion (scripts/fetch_pricing.py): New Python script that builds and updates data/pricing.db from AWS bulk JSON, hardcoded Claude rates, and optional ROSA/oc discovery; creates the pricing schema and views, computes discounted/hourly/monthly costs, supports region/service filters and CLI flags, and upserts rows.
  • Catalog Refresh Utility (scripts/refresh_catalog.py): New script to rebuild data/DATA_CATALOG.yaml, optionally refresh upstream sources (jira, web-pdf, csv), detect content changes via hashes, atomically update files, normalize filenames to YYYY-MM-DD-kebab-case.ext, and preserve manual metadata.
  • MCP Server Extensions (mcp_server.py): Added conditional ATTACH of optional data/pricing.db and data/github.db; new MCP resources gps://pricing-schema and gps://github-schema; org-hierarchy tools (get_direct_reports, get_org_tree, get_management_chain); cloud pricing tools (cloud_pricing_lookup, rosa_cluster_costs); GitHub query tools (github_org_summary, search_github_*, github_code_stats); and dynamic loading of extensions/*.py.
  • Tests & CI Scripts (scripts/test.sh): When data/pricing.db or data/github.db exists, runs PRAGMA integrity_check and basic table-existence queries, failing on non-OK results; integrates with the existing test summary behavior.
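
The conditional ATTACH pattern for the optional databases can be sketched like this (the paths and alias names follow the PR description; the read-only URI form is an assumption about the implementation):

```python
import sqlite3
from pathlib import Path

def attach_optional_dbs(conn: sqlite3.Connection, data_dir: Path) -> list:
    """ATTACH pricing.db / github.db read-only when they exist."""
    attached = []
    for alias in ("pricing", "github"):
        db = data_dir / f"{alias}.db"
        if db.exists():
            # mode=ro keeps the optional DBs read-only; this requires a
            # URI-enabled connection (sqlite3.connect(..., uri=True))
            conn.execute(f"ATTACH DATABASE 'file:{db}?mode=ro' AS {alias}")
            attached.append(alias)
    return attached
```

Tools can then qualify queries as `SELECT ... FROM pricing.<table>` only when the alias was actually attached, which is what makes the pattern safely optional.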

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant MCP as MCP Server
  participant PDB as pricing.db
  participant GDB as github.db
  participant Ext as extensions/*.py

  Client->>MCP: request (e.g., cloud_pricing_lookup / github query / gps://pricing-schema)
  MCP-->>PDB: SELECT schema/queries (if attached)
  MCP-->>GDB: SELECT schema/queries (if attached)
  MCP->>Ext: Invoke extension hooks (if loaded)
  MCP-->>Client: JSON / DDL / error response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 3 passed

  • Title check — Passed. The PR title clearly summarizes the main features added: cloud pricing, org hierarchy tools, GitHub ETL, and extension support.
  • Description check — Passed. The PR description is comprehensive and directly related to the changeset, covering all major additions including pricing scripts, org tools, GitHub ETL, extension loader, security hooks, and catalog refresh.
  • Docstring coverage — Passed. Docstring coverage is 84.06%, above the required threshold of 80.00%.


# Handle collision: if the target already exists, skip this file
new_path = DATA_DIR / new_name
if new_path.exists() and new_path != filepath:
    print(f"  SKIP (target exists): {old_name} -> {new_name}")
    continue
renames[old_name] = new_name

if dry_run:
    print(f"  Would rename: {old_name} -> {new_name}")
else:
    filepath.rename(new_path)
    print(f"  Renamed: {old_name} -> {new_name}")

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
mcp_server.py (1)

1-1: ⚠️ Potential issue | 🟡 Minor

CI failure: Ruff format check failed.

The pipeline indicates this file needs reformatting. Run:

uv run --extra dev ruff format mcp_server.py
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@mcp_server.py` at line 1, The file contains a formatting mismatch (starts
with "#!/usr/bin/env python3"); run the ruff formatter to fix style issues by
executing the suggested command (uv run --extra dev ruff format mcp_server.py)
or apply equivalent ruff formatting so the file passes the Ruff format check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@mcp_server.py`:
- Around line 946-971: The repo search currently issues an unbounded query and
then does an N+1 per-repo language lookup (see the SELECT into repos and the
loop that queries github.gh_repo_language), so modify the function to
accept/validate a limit parameter and add a LIMIT clause to the main repo SQL,
and replace the per-repo language query with a single batched query: collect
repo_id values from the fetched repos, run one SELECT repo_id, language, bytes
FROM github.gh_repo_language WHERE repo_id IN (...) ORDER BY bytes DESC, then
build a langs_by_repo map and assign r["languages"] =
langs_by_repo.get(r["repo_id"], []) for each repo (use the same
params/placeholder style and _rows_to_dicts-like formatting for consistency).

In `@scripts/check-no-secrets.sh`:
- Around line 8-12: The script scripts/check-no-secrets.sh currently greps every
file from the for-loop (for f in "$@") and can produce false positives for known
safe files; add an exclusion check before running grep to skip filenames or
paths such as .env.example, .pre-commit-config.yaml, .gitignore and doc files
(e.g., *.md, *.mdx) so the grep is only run on relevant source/config files, or
alternatively re-introduce the exclude pattern into .pre-commit-config.yaml;
implement the skip by matching $f against an array or case pattern and continue
the loop if matched (preserving the existing echo "  $f" and found variable
behavior).

In `@scripts/fetch_github.py`:
- Around line 1-5: Run ruff format on the script to fix formatting issues
flagged by CI: execute "ruff format scripts/fetch_github.py" (or run your
project's formatter) to reformat the file header and body so it passes "ruff
format --check"; after formatting, verify the shebang and comment block remain
intact and commit the updated scripts/fetch_github.py.
- Around line 122-124: The 5xx detection loop is inefficient; replace the
any(str(c) in stderr for c in range(500, 600)) check with a direct pattern or
numeric extraction from stderr (e.g., use a regex like '\b5\d{2}\b' to detect
any 5xx status) or parse an HTTP/status code variable if available, then keep
the same backoff logic using attempt and time.sleep; update the conditional in
the retry block that references stderr and attempt so it reliably matches 5xx
responses without iterating the whole range.
- Around line 186-189: The current fragile concat (in the block checking `if
paginate and body.startswith("["):`) naively replaces "][` and `]\n[` which can
corrupt JSON when those sequences appear in string values; instead, split `body`
into separate JSON array tokens and parse each safely (e.g., locate balanced
brackets or split on `]\n[`/`][` then wrap/ensure valid arrays), json.loads each
array and extend a single result list, or switch to invoking GH with `--jq` to
produce newline-delimited objects and parse line-by-line—update the code that
mutates `body` to implement one of these safe merges using `json.loads` rather
than simple string replace.

In `@scripts/fetch_pricing.py`:
- Around line 1-7: CI failed ruff format check for the Python script starting
with "#!/usr/bin/env python3"; run the code formatter (ruff format) on that
script to apply the style fixes, re-check with `ruff format --check`, and commit
the formatted file so the pipeline passes.
- Around line 66-75: The CLAUDE_PRICING constant is hardcoded without a
verification date; update the declaration for CLAUDE_PRICING to include a
comment above it stating the date (e.g., "verified YYYY-MM-DD") and optionally
the Anthropic pricing URL or source; ensure the comment is placed immediately
above the CLAUDE_PRICING list so future maintainers can see when the values were
last confirmed.

In `@scripts/refresh_catalog.py`:
- Around line 1-5: CI is failing ruff format --check for the script; run the
formatter (ruff format) on the script (scripts/refresh_catalog.py) to fix
formatting issues, stage and commit the updated file so the formatted
shebang/comments and any minor whitespace changes pass the ruff check.
- Around line 127-138: get_birth_date currently uses macOS-only `stat -f %SB`
and should be made cross-platform: update get_birth_date to first try native
Python file metadata (Path(filepath).stat().st_birthtime when present), then if
that isn't available attempt GNU stat (`stat --format=%w <file>`) via
subprocess.run (check output is not "-" and parse YYYY-MM-DD), and only then
fall back to datetime.now().strftime("%Y-%m-%d"); use sys.platform or
shutil.which("stat") to decide behavior and adjust exception handling around
subprocess.run to return the fallback on error. Ensure you reference the
existing get_birth_date function and subprocess.run usage when making changes.

---

Outside diff comments:
In `@mcp_server.py`:
- Line 1: The file contains a formatting mismatch (starts with "#!/usr/bin/env
python3"); run the ruff formatter to fix style issues by executing the suggested
command (uv run --extra dev ruff format mcp_server.py) or apply equivalent ruff
formatting so the file passes the Ruff format check.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 496ed0af-6e7a-4ebb-8d81-be5c016a7af3

📥 Commits

Reviewing files that changed from the base of the PR and between 1a71b35 and 43dc833.

📒 Files selected for processing (10)
  • .claude/CLAUDE.md
  • .gitignore
  • .pre-commit-config.yaml
  • mcp_server.py
  • scripts/check-no-secrets.sh
  • scripts/check-protected-files.sh
  • scripts/fetch_github.py
  • scripts/fetch_pricing.py
  • scripts/refresh_catalog.py
  • scripts/test.sh

Comment thread scripts/check-no-secrets.sh
Comment on lines +8 to +12
for f in "$@"; do
if grep -qE '(API_TOKEN|API_SECRET|API_KEY|PASSWORD|PRIVATE_KEY|Bearer |client_secret|-----BEGIN .* KEY-----)' "$f" 2>/dev/null; then
echo " $f"
found=1
fi

🧹 Nitpick | 🔵 Trivial

Missing exclusion logic may cause false positives.

The previous hook excluded files like .env.example, .pre-commit-config.yaml, and .gitignore. This script scans all text files without exclusions, so commits touching documentation that mentions "API_KEY" or "PASSWORD" conceptually will be blocked.

Consider either:

  1. Adding an exclusion list within the script, or
  2. Re-adding the exclude pattern to .pre-commit-config.yaml
Example in-script exclusion
 found=0
 for f in "$@"; do
+    # Skip known safe files
+    case "$f" in
+        *.example|.pre-commit-config.yaml|.gitignore|*.md) continue ;;
+    esac
     if grep -qE '(API_TOKEN|API_SECRET|API_KEY|PASSWORD|PRIVATE_KEY|Bearer |client_secret|-----BEGIN .* KEY-----)' "$f" 2>/dev/null; then
📝 Committable suggestion

Suggested change
for f in "$@"; do
if grep -qE '(API_TOKEN|API_SECRET|API_KEY|PASSWORD|PRIVATE_KEY|Bearer |client_secret|-----BEGIN .* KEY-----)' "$f" 2>/dev/null; then
echo " $f"
found=1
fi
for f in "$@"; do
# Skip known safe files
case "$f" in
*.example|.pre-commit-config.yaml|.gitignore|*.md) continue ;;
esac
if grep -qE '(API_TOKEN|API_SECRET|API_KEY|PASSWORD|PRIVATE_KEY|Bearer |client_secret|-----BEGIN .* KEY-----)' "$f" 2>/dev/null; then
echo " $f"
found=1
fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/check-no-secrets.sh` around lines 8 - 12, The script
scripts/check-no-secrets.sh currently greps every file from the for-loop (for f
in "$@") and can produce false positives for known safe files; add an exclusion
check before running grep to skip filenames or paths such as .env.example,
.pre-commit-config.yaml, .gitignore and doc files (e.g., *.md, *.mdx) so the
grep is only run on relevant source/config files, or alternatively re-introduce
the exclude pattern into .pre-commit-config.yaml; implement the skip by matching
$f against an array or case pattern and continue the loop if matched (preserving
the existing echo "  $f" and found variable behavior).

Comment thread scripts/fetch_github.py
Comment on lines +122 to +124
if any(str(c) in stderr for c in range(500, 600)):
time.sleep(2 ** (attempt + 1))
continue

🧹 Nitpick | 🔵 Trivial

Inefficient 5xx error detection.

Iterating over range(500, 600) and checking string containment is awkward. Consider a regex or direct status code extraction.

-            if any(str(c) in stderr for c in range(500, 600)):
+            if re.search(r'\b5\d{2}\b', stderr):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/fetch_github.py` around lines 122 - 124, The 5xx detection loop is
inefficient; replace the any(str(c) in stderr for c in range(500, 600)) check
with a direct pattern or numeric extraction from stderr (e.g., use a regex like
'\b5\d{2}\b' to detect any 5xx status) or parse an HTTP/status code variable if
available, then keep the same backoff logic using attempt and time.sleep; update
the conditional in the retry block that references stderr and attempt so it
reliably matches 5xx responses without iterating the whole range.
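
The regex check the comment suggests can be folded into a small helper. This is a sketch (the function name and backoff schedule are assumptions, and note that `\b5\d{2}\b` can also match an unrelated number that happens to appear in the error text):

```python
import re

def backoff_seconds(stderr: str, attempt: int, max_attempts: int = 4):
    """Return the exponential-backoff delay for a 5xx error, else None.

    Caller does: delay = backoff_seconds(err, attempt); if delay: time.sleep(delay).
    """
    if attempt >= max_attempts:
        return None
    # \b5\d{2}\b matches any standalone 500-599 status code in stderr
    if re.search(r"\b5\d{2}\b", stderr):
        return 2 ** (attempt + 1)
    return None

print(backoff_seconds("HTTP 502: Bad Gateway", 0))  # 2
```

Returning the delay instead of sleeping inside the helper keeps it trivially testable and leaves the retry loop in control of timing.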

Comment thread scripts/fetch_github.py
Comment on lines +186 to +189
if paginate and body.startswith("["):
# gh --paginate concatenates arrays as ][
# or sometimes just newline-separated arrays
body = body.replace("]\n[", ",").replace("][", ",")

⚠️ Potential issue | 🟡 Minor

Fragile JSON array concatenation.

The gh --paginate output handling assumes ][ or ]\n[ never appears in JSON string values. While unlikely for GitHub API data, a malformed commit message or issue body containing these sequences could corrupt parsing.

A safer approach would be to use --jq to flatten at the source or to parse each JSON array separately and merge in Python:

Safer alternative
-        # Paginated output may be multiple JSON arrays concatenated
-        if paginate and body.startswith("["):
-            # gh --paginate concatenates arrays as ][
-            # or sometimes just newline-separated arrays
-            body = body.replace("]\n[", ",").replace("][", ",")
+        # Paginated output: multiple JSON arrays, parse each separately
+        if paginate and body.startswith("["):
+            import re
+            arrays = re.split(r'(?<=\])\s*(?=\[)', body)
+            merged = []
+            for arr_str in arrays:
+                try:
+                    merged.extend(json.loads(arr_str))
+                except json.JSONDecodeError:
+                    pass
+            return merged
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/fetch_github.py` around lines 186 - 189, The current fragile concat
(in the block checking `if paginate and body.startswith("["):`) naively replaces
"][` and `]\n[` which can corrupt JSON when those sequences appear in string
values; instead, split `body` into separate JSON array tokens and parse each
safely (e.g., locate balanced brackets or split on `]\n[`/`][` then wrap/ensure
valid arrays), json.loads each array and extend a single result list, or switch
to invoking GH with `--jq` to produce newline-delimited objects and parse
line-by-line—update the code that mutates `body` to implement one of these safe
merges using `json.loads` rather than simple string replace.
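
One genuinely string-safe variant uses `json.JSONDecoder.raw_decode` to walk the concatenated arrays one at a time, so a `][` inside a string value can never split a chunk. A sketch, not the PR's code:

```python
import json

def merge_paginated_arrays(body: str) -> list:
    """Parse `gh --paginate` output that may be several JSON arrays
    concatenated back to back, without fragile string replacement."""
    dec = json.JSONDecoder()
    merged, i, n = [], 0, len(body)
    while i < n:
        chunk, end = dec.raw_decode(body, i)    # parses exactly one array
        merged.extend(chunk)
        while end < n and body[end].isspace():  # skip the "]\n[" whitespace
            end += 1
        i = end
    return merged

print(merge_paginated_arrays('[{"msg": "a][b"}][1, 2]\n[3]'))
# [{'msg': 'a][b'}, 1, 2, 3]
```

Because the decoder consumes each array with a real parser, a commit message or issue body containing `][` survives intact, which is exactly the failure mode the comment describes.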

Comment thread scripts/fetch_pricing.py
Comment on lines +66 to +75
CLAUDE_PRICING = [
{"model": "claude-opus-4", "input_per_mtok": 15.00, "output_per_mtok": 75.00},
{"model": "claude-sonnet-4", "input_per_mtok": 3.00, "output_per_mtok": 15.00},
{"model": "claude-haiku-4", "input_per_mtok": 0.80, "output_per_mtok": 4.00},
{"model": "claude-3-5-sonnet", "input_per_mtok": 3.00, "output_per_mtok": 15.00},
{"model": "claude-3-5-haiku", "input_per_mtok": 0.80, "output_per_mtok": 4.00},
{"model": "claude-3-opus", "input_per_mtok": 15.00, "output_per_mtok": 75.00},
{"model": "claude-3-sonnet", "input_per_mtok": 3.00, "output_per_mtok": 15.00},
{"model": "claude-3-haiku", "input_per_mtok": 0.25, "output_per_mtok": 1.25},
]

🧹 Nitpick | 🔵 Trivial

Consider documenting the pricing source date.

The hardcoded Claude pricing should include a comment with the date it was last verified against Anthropic's pricing page, making it easier to track staleness.

 # Published Anthropic Claude model pricing (per million tokens, USD)
-# Source: https://www.anthropic.com/pricing — update when pricing changes
+# Source: https://www.anthropic.com/pricing
+# Last verified: 2026-04-XX — update when pricing changes
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/fetch_pricing.py` around lines 66 - 75, The CLAUDE_PRICING constant
is hardcoded without a verification date; update the declaration for
CLAUDE_PRICING to include a comment above it stating the date (e.g., "verified
YYYY-MM-DD") and optionally the Anthropic pricing URL or source; ensure the
comment is placed immediately above the CLAUDE_PRICING list so future
maintainers can see when the values were last confirmed.
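
As an illustration of how per-Mtok rates like these and the PR's discount-factor support might combine (the helper name and discount semantics here are assumptions, not the script's actual API):

```python
# Hypothetical cost helper over CLAUDE_PRICING-style rows.
def estimate_cost(row, input_tokens, output_tokens, discount=1.0):
    """USD cost for a token count, given per-million-token rates and an
    optional multiplicative discount factor (0.9 = 10% off)."""
    cost = (
        input_tokens / 1_000_000 * row["input_per_mtok"]
        + output_tokens / 1_000_000 * row["output_per_mtok"]
    )
    return round(cost * discount, 6)

row = {"model": "claude-sonnet-4", "input_per_mtok": 3.00, "output_per_mtok": 15.00}
# 1M input ($3) + 200k output ($3) = $6, then 10% discount
print(estimate_cost(row, 1_000_000, 200_000, discount=0.9))  # 5.4
```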

Comment thread scripts/refresh_catalog.py
Comment on lines +127 to +138
def get_birth_date(filepath: Path) -> str:
"""Get the file's birth date as YYYY-MM-DD using stat."""
try:
result = subprocess.run(
["stat", "-f", "%SB", "-t", "%Y-%m-%d", str(filepath)],
capture_output=True,
text=True,
check=True,
)
return result.stdout.strip()
except subprocess.CalledProcessError:
return datetime.now().strftime("%Y-%m-%d")

🧹 Nitpick | 🔵 Trivial

stat -f %SB is macOS-specific.

This syntax fails on Linux. The fallback to datetime.now() works, but Linux users won't get accurate birth dates. Consider detecting the platform:

Cross-platform approach
 def get_birth_date(filepath: Path) -> str:
     """Get the file's birth date as YYYY-MM-DD using stat."""
+    import platform
     try:
-        result = subprocess.run(
-            ["stat", "-f", "%SB", "-t", "%Y-%m-%d", str(filepath)],
-            capture_output=True,
-            text=True,
-            check=True,
-        )
-        return result.stdout.strip()
+        if platform.system() == "Darwin":
+            result = subprocess.run(
+                ["stat", "-f", "%SB", "-t", "%Y-%m-%d", str(filepath)],
+                capture_output=True, text=True, check=True,
+            )
+            return result.stdout.strip()
+        else:
+            # Linux: use mtime as fallback (birth time often unavailable)
+            mtime = filepath.stat().st_mtime
+            return datetime.fromtimestamp(mtime).strftime("%Y-%m-%d")
     except subprocess.CalledProcessError:
         return datetime.now().strftime("%Y-%m-%d")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/refresh_catalog.py` around lines 127 - 138, get_birth_date currently
uses macOS-only `stat -f %SB` and should be made cross-platform: update
get_birth_date to first try native Python file metadata
(Path(filepath).stat().st_birthtime when present), then if that isn't available
attempt GNU stat (`stat --format=%w <file>`) via subprocess.run (check output is
not "-" and parse YYYY-MM-DD), and only then fall back to
datetime.now().strftime("%Y-%m-%d"); use sys.platform or shutil.which("stat") to
decide behavior and adjust exception handling around subprocess.run to return
the fallback on error. Ensure you reference the existing get_birth_date function
and subprocess.run usage when making changes.
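
Putting the prompt's suggestion together, a cross-platform version might look like this sketch (note that GNU `stat --format=%w` prints `-` on filesystems that do not record birth time):

```python
import subprocess
from datetime import datetime
from pathlib import Path

def get_birth_date(filepath: Path) -> str:
    """Best-effort file birth date as YYYY-MM-DD, cross-platform."""
    st = filepath.stat()
    # macOS/BSD expose birth time directly on the stat result
    birth = getattr(st, "st_birthtime", None)
    if birth is not None:
        return datetime.fromtimestamp(birth).strftime("%Y-%m-%d")
    # GNU coreutils: %w prints the birth time, or "-" when unknown
    try:
        out = subprocess.run(
            ["stat", "--format=%w", str(filepath)],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
        if out and out != "-":
            return out[:10]  # "YYYY-MM-DD HH:MM:SS..." -> date only
    except (subprocess.CalledProcessError, FileNotFoundError):
        pass
    return datetime.now().strftime("%Y-%m-%d")
```

Trying `st_birthtime` first avoids the subprocess entirely on macOS, and the final fallback preserves the original behavior.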

- Sanitize exception output in Jira fetch to prevent credential leakage
  in tracebacks (CodeQL clear-text logging fix)
- Add presence-only credential status logging (set/not set, never values)
- Add limit parameter and LIMIT clause to search_github_repos
- Replace N+1 per-repo language queries with single batched query
- Auto-format all scripts with ruff format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jeremyeder jeremyeder merged commit 041b343 into main Apr 15, 2026
4 of 8 checks passed
@jeremyeder jeremyeder deleted the feat/cloud-pricing-and-extensions branch April 15, 2026 02:53
