Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds an AI-assisted evaluation summarization pipeline, including a GitHub Actions workflow, an OpenAI-compatible LLM client, a database model for storing narrative summaries, an evaluation processing module with async concurrency control, and transformation logic to integrate summaries into the database.
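As a rough illustration of the new storage piece mentioned above, a narrative-summary table might be declared along these lines. This is a sketch only; the column names and types are assumptions, not the actual schema in `ferry/database/models.py`.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class EvaluationNarrativeSummary(Base):
    """Illustrative columns only; the real schema lives in ferry/database/models.py."""

    __tablename__ = "evaluation_narrative_summaries"

    id = Column(Integer, primary_key=True)
    course_id = Column(Integer, ForeignKey("courses.course_id"), nullable=False)
    question_code = Column(String, nullable=False)
    summary = Column(Text, nullable=False)
```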
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant Main as main.py
    participant Crawl as start_crawl
    participant Summarize as summarize_evals
    participant Cache as Disk Cache
    participant LLMClient as LLMClient
    participant OpenAI as OpenAI API
    participant Database as Database

    User->>Main: Run with --summarize-evals
    Main->>Crawl: start_crawl(args)
    Crawl->>Database: Fetch/process course data
    Crawl-->>Main: Return seasons list
    Main->>Summarize: summarize_evals(seasons, api_key)

    loop For each season
        Summarize->>Cache: Load parsed_evaluations/{season}.json
        Summarize->>Cache: Load existing evaluation_summaries/{season}.json
        loop For each course with 3+ comments
            Summarize->>LLMClient: complete(narrative_comments)
            LLMClient->>OpenAI: POST /chat/completions
            OpenAI-->>LLMClient: AI-generated summary
            LLMClient-->>Summarize: Return summary text
        end
        Summarize->>Cache: Write evaluation_summaries/{season}.json
    end

    Summarize->>Database: Insert/update EvaluationNarrativeSummary records
    Database-->>Main: Summaries stored
```
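To make the concurrency model in the diagram concrete, here is a minimal sketch of a semaphore-bounded summarization loop. It assumes an `LLMClient` with an async `complete()` method as named above; the function and field names are otherwise illustrative, not the repository's actual code.

```python
import asyncio

MAX_CONCURRENT_REQUESTS = 10  # limit discussed in the review


async def summarize_all(llm, courses: list[dict]) -> list[dict]:
    """Summarize each course's comments with at most N requests in flight."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)

    async def summarize_one(course: dict) -> dict:
        async with semaphore:
            summary = await llm.complete(course["narrative_comments"])
        return {"crn": str(course["crn"]), "summary": summary}

    return list(await asyncio.gather(*(summarize_one(c) for c in courses)))
```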
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In @.github/workflows/summarize-evals.yml:
- Around line 29-32: Update the GitHub Actions step that currently uses
actions/setup-python@v4 to the recommended v5; locate the workflow step with the
name "Set up Python 3.12" and change the uses value from actions/setup-python@v4
to actions/setup-python@v5 while keeping the existing python-version: "3.12"
configuration intact.
- Around line 42-49: The "Summarize Evaluations" workflow step currently
interpolates `${{ inputs.seasons }}` directly into the shell which risks command
injection; change the step to set an environment variable (e.g., SEASONS) from
`${{ inputs.seasons }}` and then call python main.py with --seasons "$SEASONS"
(preserving other envs like OPENAI_API_KEY and SENTRY_URL), ensuring the value
is passed as a single quoted/escaped argument instead of being expanded into the
shell command.
- Around line 10-13: Add an explicit least-privilege permissions block to the
summarize_evals job to avoid default broad permissions; inside the job
definition (jobs -> summarize_evals) add a permissions mapping (for example at
minimum permissions: contents: read) or other exact scopes the job requires so
the workflow no longer relies on default permissions.
In `@ferry/summarize/summarize_evals.py`:
- Around line 83-94: The code assumes response.choices[0].message.content is
always a string and calls .strip(), which can raise AttributeError if content is
None; update the return logic in summarize_evals.py (the block using
openai_client.chat.completions.create and returning
response.choices[0].message.content.strip()) to safely handle None by checking
the value (e.g., assign content = response.choices[0].message.content or "" /
fallback text) and then call .strip() on the confirmed string, optionally
logging or raising a clear error if content is unexpectedly missing.
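For the `None`-content finding above, a minimal defensive pattern might look like the following. This is a sketch under stated assumptions, not the project's actual code: the function name, signature, and logging call are illustrative, and only the OpenAI SDK calls themselves are taken as given.

```python
import logging

from openai import AsyncOpenAI

OPENAI_MODEL = "gpt-4.1-mini-2025-04-14"  # model identifier taken from the review above


async def summarize_comments(client: AsyncOpenAI, prompt: str, crn: str) -> str:
    """Return the model's summary text, or "" when the response carries no content."""
    response = await client.chat.completions.create(
        model=OPENAI_MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    # message.content is Optional[str]: guard it before calling .strip()
    content = response.choices[0].message.content or ""
    if not content:
        logging.warning("Empty completion content for crn=%s", crn)
    return content.strip()
```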
🧹 Nitpick comments (3)
ferry/summarize/summarize_evals.py (1)
14-14: Unused import: `ujson`.

The `ujson` import appears unused in this module. JSON operations are handled by the imported `load_cache_json` and `save_cache_json` functions from `ferry.crawler.cache`.

♻️ Proposed fix

```diff
-import ujson
```

main.py (1)

136-142: Consider using `ValueError` instead of `assert` for runtime validation.

Using `assert` for input validation can be bypassed if Python runs with `-O` (optimizations enabled). A `ValueError` provides a cleaner user experience and is more robust.

♻️ Proposed fix

```diff
     if args.summarize_evals:
-        assert args.openai_api_key, "OpenAI API key is required for --summarize-evals"
+        if not args.openai_api_key:
+            raise ValueError("OpenAI API key is required for --summarize-evals")
         await summarize_evals(
             seasons=seasons,
             data_dir=args.data_dir,
             openai_api_key=args.openai_api_key,
         )
```

pyproject.toml (1)

30-30: Consider pinning the `openai` version for consistency and reproducibility.

All other dependencies use exact version pinning (e.g., `==4.13.3`), but `openai>=1.0.0` uses a minimum version constraint. This inconsistency could lead to non-reproducible builds across environments: if a future OpenAI SDK release introduces behavioral changes, different versions might be installed in different builds. Either pin to a specific version matching your requirements or document the intentional flexibility if loose pinning is necessary for compatibility.
Actionable comments posted: 2
🧹 Nitpick comments (1)
ferry/transform/import_evaluations.py (1)
53-104: Consider extracting the shared mapping logic.

The `season_crn_to_course_id` mapping (lines 65-75) duplicates the same logic from `match_evaluations_to_courses` (lines 21-31). Consider extracting this to a shared helper function to reduce duplication.

♻️ Example refactor

```diff
+def _build_season_crn_to_course_id(
+    listings: pd.DataFrame,
+) -> dict[str, dict[int, float]]:
+    """Build a nested dict mapping (season_code, crn) -> course_id."""
+    return (
+        listings[["season_code", "course_id", "crn"]]
+        .groupby("season_code")
+        .apply(
+            lambda x: x[["crn", "course_id"]]
+            .set_index("crn")["course_id"]
+            .astype(float)
+            .to_dict()
+        )
+        .to_dict()
+    )
```

Then use `_build_season_crn_to_course_id(listings)` in both `match_evaluations_to_courses` and `import_evaluation_summaries`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ferry/transform/import_evaluations.py` around lines 53 - 104, The code duplicates the season_crn_to_course_id construction inside import_evaluation_summaries; extract that logic into a shared helper named e.g. _build_season_crn_to_course_id(listings) that returns the same dict used now, and replace the inline construction in both import_evaluation_summaries and match_evaluations_to_courses with calls to _build_season_crn_to_course_id(listings), keeping the existing behavior and types (including the int casting and to_dict conversion).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ferry/summarize/summarize_evals.py`:
- Around line 151-155: The CRN may be an int in course_eval but
CourseSummary.crn is declared as str, so normalize CRNs to strings everywhere:
convert course_eval["crn"] to str when building the returned dict (the block
returning {"crn": course_eval["crn"], ...}) and also ensure any places that add
or compare CRNs (e.g., the membership check using already_done and the addition
where c["crn"] is used) convert values to str first; update instances involving
CourseSummary creation, the return value from summarize function, and the
membership logic that checks c["crn"] not in already_done to use str(c["crn"])
so types match consistently.
- Around line 28-34: Update the invalid model constant OPENAI_MODEL to the
correct production model name "gpt-4.1-mini-2025-04-14" (replace the current
"gpt-4.1-mini") so runtime API calls succeed, and remove the development cap by
setting MAX_COURSES_PER_SEASON to None (instead of 10) to avoid limiting
production summarization; change these two constants (OPENAI_MODEL and
MAX_COURSES_PER_SEASON) in summarize_evals.py accordingly.
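For the CRN-normalization prompt above, a minimal sketch of the idea follows. The `CourseSummary` name and the dict shapes mirror the review text; the helper functions themselves are illustrative and not the repository's actual code.

```python
from typing import Any, TypedDict


class CourseSummary(TypedDict):
    """Illustrative shape; the real CourseSummary declares crn as str."""

    crn: str
    narrative_summaries: list[dict[str, str]]


def normalize_crn(value: Any) -> str:
    """Coerce a CRN that may arrive as int or str to a canonical string."""
    return str(value)


def pending_courses(
    courses: list[dict[str, Any]], already_done: set[str]
) -> list[dict[str, Any]]:
    """Keep only courses whose normalized CRN has not been summarized yet."""
    return [c for c in courses if normalize_crn(c["crn"]) not in already_done]
```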
---
Nitpick comments:
In `@ferry/transform/import_evaluations.py`:
- Around line 53-104: The code duplicates the season_crn_to_course_id
construction inside import_evaluation_summaries; extract that logic into a
shared helper named e.g. _build_season_crn_to_course_id(listings) that returns
the same dict used now, and replace the inline construction in both
import_evaluation_summaries and match_evaluations_to_courses with calls to
_build_season_crn_to_course_id(listings), keeping the existing behavior and
types (including the int casting and to_dict conversion).
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- ferry/database/__init__.py
- ferry/database/models.py
- ferry/database/sync_db_courses_old.py
- ferry/database/sync_db_evals.py
- ferry/summarize/summarize_evals.py
- ferry/transform/__init__.py
- ferry/transform/import_evaluations.py
💤 Files with no reviewable changes (2)
- ferry/database/sync_db_courses_old.py
- ferry/database/sync_db_evals.py
♻️ Duplicate comments (2)
.github/workflows/summarize-evals.yml (2)
31-34: ⚠️ Potential issue | 🟠 Major

Upgrade `actions/setup-python` to `@v5`.

Line 32 still uses `actions/setup-python@v4`, which is flagged by actionlint as too old for current runners.

🔧 Proposed fix

```diff
       - name: Set up Python 3.12
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
         with:
           python-version: "3.12"
```

```bash
#!/bin/bash
set -euo pipefail
# Verify outdated setup-python usage in workflows
rg -n 'uses:\s*actions/setup-python@v4' .github/workflows
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/summarize-evals.yml around lines 31 - 34, Update the workflow step that sets up Python to use the latest setup-python action: replace the uses value "actions/setup-python@v4" with "actions/setup-python@v5" in the step named "Set up Python 3.12" so the workflow uses the supported action version; keep the python-version: "3.12" line unchanged.
44-51: ⚠️ Potential issue | 🟠 Major

Avoid direct shell interpolation of workflow input.

Line 50 passes `${{ inputs.seasons }}` directly into a shell command. This is injection-prone and can break argument boundaries.

🛡️ Proposed fix

```diff
       - name: Summarize Evaluations
         env:
           OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+          SEASONS: ${{ inputs.seasons }}
         run: |
           python main.py \
             -f config/release_summarize_evals.yml \
-            --seasons ${{ inputs.seasons }} \
+            --seasons "$SEASONS" \
             --sentry-url "${{ secrets.SENTRY_URL }}"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/summarize-evals.yml around lines 44 - 51, The workflow step named "Summarize Evaluations" is directly interpolating ${{ inputs.seasons }} into the shell command (python main.py ... --seasons ${{ inputs.seasons }}), which is injection-prone; move the input into an environment variable (e.g., add SEASONS: ${{ inputs.seasons }} alongside OPENAI_API_KEY) and update the python invocation to use the env var with proper quoting (--seasons "$SEASONS") so the --seasons argument is passed safely and preserves boundaries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In @.github/workflows/summarize-evals.yml:
- Around line 31-34: Update the workflow step that sets up Python to use the
latest setup-python action: replace the uses value "actions/setup-python@v4"
with "actions/setup-python@v5" in the step named "Set up Python 3.12" so the
workflow uses the supported action version; keep the python-version: "3.12" line
unchanged.
- Around line 44-51: The workflow step named "Summarize Evaluations" is directly
interpolating ${{ inputs.seasons }} into the shell command (python main.py ...
--seasons ${{ inputs.seasons }}), which is injection-prone; move the input into
an environment variable (e.g., add SEASONS: ${{ inputs.seasons }} alongside
OPENAI_API_KEY) and update the python invocation to use the env var with proper
quoting (--seasons "$SEASONS") so the --seasons argument is passed safely and
preserves boundaries.
♻️ Duplicate comments (1)
ferry/summarize/summarize_evals.py (1)
28-34: ⚠️ Potential issue | 🔴 Critical

Fix invalid model name and remove development cap before merging.

These issues were flagged in a previous review and remain unaddressed:

- Line 28: `"gpt-4.1-mini"` is not a valid OpenAI model name. Use `"gpt-4.1-mini-2025-04-14"` or another valid model identifier.
- Line 34: `MAX_COURSES_PER_SEASON = 10` is a development/testing artifact that will limit production summarization to only 10 courses per season. Set to `None` for production.

🔧 Proposed fix

```diff
 # OpenAI model to use for summarization.
-OPENAI_MODEL = "gpt-4.1-mini"
+OPENAI_MODEL = "gpt-4.1-mini-2025-04-14"
 # Maximum concurrent API requests to avoid rate-limit pressure.
 MAX_CONCURRENT_REQUESTS = 10
 # Cap courses per season for testing (set to None for no limit).
-MAX_COURSES_PER_SEASON = 10
+MAX_COURSES_PER_SEASON = None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ferry/summarize/summarize_evals.py` around lines 28 - 34, Update the invalid model constant OPENAI_MODEL to a valid model identifier (e.g., "gpt-4.1-mini-2025-04-14") so API calls use a supported model, and remove the development cap by setting MAX_COURSES_PER_SEASON to None (instead of 10) so production summarization isn’t limited; adjust these constants where they are defined (OPENAI_MODEL and MAX_COURSES_PER_SEASON) in summarize_evals.py.
🧹 Nitpick comments (2)
ferry/summarize/summarize_evals.py (2)
234-235: Verbose per-summary output may flood logs in production.

Line 235 prints each individual summary as it's generated. With many courses and multiple questions per course, this could produce excessive output. Consider reducing verbosity or making it conditional (e.g., via a `--verbose` flag or `logging.debug`).

♻️ Suggested refinement

```diff
         if result is not None:
             new_summaries.append(result)
-            for ns in result["narrative_summaries"]:
-                print(f"  [{result['season']}] crn={result['crn']} {ns['question_code']}: {ns['summary']}")
+            logging.debug(
+                "Summarised crn=%s: %d questions",
+                result["crn"],
+                len(result["narrative_summaries"]),
+            )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ferry/summarize/summarize_evals.py` around lines 234 - 235, The loop that prints each per-summary line (iterating over result["narrative_summaries"] and using print(f" [{result['season']}] crn={result['crn']} {ns['question_code']}: {ns['summary']}")) is too verbose for production; change it to honor a verbosity setting by replacing the print with a logging.debug (or logging.info) call and wire that to a command-line --verbose flag (or an existing verbosity/config variable) in summarize_evals.py so detailed per-summary lines are only emitted when verbose mode is enabled.
146-153: Consider narrowing the exception type.

Catching bare `Exception` (flagged by static analysis) is intentional here for resilience, but could mask unexpected errors. Consider catching `openai.OpenAIError` (or its subclasses) specifically, while letting truly unexpected exceptions propagate.

♻️ Optional refinement

```diff
+from openai import OpenAIError
+
 # ... in _summarize_course:
-    except Exception as exc:
+    except (OpenAIError, ValueError) as exc:
         logging.warning(
             "Failed to summarize %s/%s crn=%s: %s",
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ferry/summarize/summarize_evals.py` around lines 146 - 153, Narrow the except clause in the summarization try/except block to catch OpenAI-specific errors instead of a bare Exception: replace "except Exception as exc" with an exception handler for openai.OpenAIError (or its relevant subclasses) around the code that calls the OpenAI API, log the error with the same message including course_eval and question_code, and allow other unexpected exceptions to propagate (or re-raise them) so they aren't silently swallowed; locate the block referencing course_eval.get("season"), question_code, and course_eval.get("crn") in summarize_evals.py to make this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@ferry/summarize/summarize_evals.py`:
- Around line 28-34: Update the invalid model constant OPENAI_MODEL to a valid
model identifier (e.g., "gpt-4.1-mini-2025-04-14") so API calls use a supported
model, and remove the development cap by setting MAX_COURSES_PER_SEASON to None
(instead of 10) so production summarization isn’t limited; adjust these
constants where they are defined (OPENAI_MODEL and MAX_COURSES_PER_SEASON) in
summarize_evals.py.
---
Nitpick comments:
In `@ferry/summarize/summarize_evals.py`:
- Around line 234-235: The loop that prints each per-summary line (iterating
over result["narrative_summaries"] and using print(f" [{result['season']}]
crn={result['crn']} {ns['question_code']}: {ns['summary']}")) is too verbose for
production; change it to honor a verbosity setting by replacing the print with a
logging.debug (or logging.info) call and wire that to a command-line --verbose
flag (or an existing verbosity/config variable) in summarize_evals.py so
detailed per-summary lines are only emitted when verbose mode is enabled.
- Around line 146-153: Narrow the except clause in the summarization try/except
block to catch OpenAI-specific errors instead of a bare Exception: replace
"except Exception as exc" with an exception handler for openai.OpenAIError (or
its relevant subclasses) around the code that calls the OpenAI API, log the
error with the same message including course_eval and question_code, and allow
other unexpected exceptions to propagate (or re-raise them) so they aren't
silently swallowed; locate the block referencing course_eval.get("season"),
question_code, and course_eval.get("crn") in summarize_evals.py to make this
change.
Actionable comments posted: 1
🧹 Nitpick comments (3)
ferry/ai/client.py (1)
32-32: Unnecessary `super().__init__()` call.

`LLMClient` doesn't inherit from any custom class (only implicitly from `object`), so the `super().__init__()` call is unnecessary.

♻️ Suggested fix

```diff
     def __init__(
         self,
         api_key: str,
         *,
         base_url: str | None = None,
         model: str = DEFAULT_MODEL,
     ) -> None:
-        super().__init__()
         self._client = self._create_client(api_key=api_key, base_url=base_url)
         self.model = model
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@ferry/ai/client.py` at line 32, The constructor of LLMClient contains an unnecessary super().__init__() call since LLMClient does not inherit from any base class other than object; remove the super().__init__() invocation from LLMClient.__init__ to clean up the constructor and avoid redundant calls (search for the LLMClient class and its __init__ method to locate and delete the super().__init__() line).

ferry/summarize/summarize_evals.py (2)
99-118: Tasks are created but awaited sequentially, losing parallelism.

The code creates tasks with `asyncio.create_task()` but then awaits them one-by-one in a sequential loop. This pattern doesn't fully leverage async parallelism since each task completes before the next is checked. While the semaphore allows up to 10 concurrent requests, the sequential awaiting here means only one question per course is processed at a time.

Consider using `asyncio.gather()` to await all tasks concurrently:

♻️ Proposed refactor

```diff
     if not tasks:
         return None
 
-    for question_code, question_text, task in tasks:
-        try:
-            summary_text = await task
+    results = await asyncio.gather(*[t[2] for t in tasks], return_exceptions=True)
+
+    for (question_code, question_text, _), result in zip(tasks, results):
+        if isinstance(result, Exception):
+            logging.warning(
+                "Failed to summarize %s/%s crn=%s: %s",
+                course_eval.get("season"),
+                question_code,
+                course_eval.get("crn"),
+                result,
+            )
+            continue
+
+        if result:
             summaries.append(
                 {
                     "question_code": question_code,
                     "question_text": question_text,
-                    "summary": summary_text,
+                    "summary": result,
                 }
             )
-        except Exception as exc:
-            logging.warning(
-                "Failed to summarize %s/%s crn=%s: %s",
-                course_eval.get("season"),
-                question_code,
-                course_eval.get("crn"),
-                exc,
-            )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ferry/summarize/summarize_evals.py` around lines 99 - 118, The loop currently awaits each asyncio.create_task sequentially, losing parallelism; change the logic that creates tasks (tasks: list[tuple[str, str, asyncio.Task[str]]]) to await all tasks concurrently using asyncio.gather on the task objects (e.g., gather(*(t for _, _, t in tasks))), then iterate the gathered results and pair each summary result back to its question_code and question_text to build the return value; ensure you handle exceptions from gather (use return_exceptions=True or wrap handling) so a failing summary doesn't cancel others, and preserve the use of semaphore and _summarize_comments.
219-224: Courses are processed sequentially, limiting throughput.

The `for` loop with `await _summarize_course()` processes one course at a time. Combined with the sequential task awaiting inside `_summarize_course`, this significantly limits throughput despite having a semaphore for 10 concurrent requests.

Consider batching course processing to better utilize the semaphore:

♻️ Proposed approach

```python
# Process courses in batches to better utilize concurrency
async def process_batch(batch):
    return await asyncio.gather(
        *[_summarize_course(llm, c, semaphore) for c in batch],
        return_exceptions=True,
    )

batch_size = MAX_CONCURRENT_REQUESTS
for i in tqdm(range(0, len(to_process), batch_size), desc=f"Summarizing {season}"):
    batch = to_process[i:i + batch_size]
    results = await process_batch(batch)
    for result in results:
        if result is not None and not isinstance(result, Exception):
            new_summaries.append(result)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ferry/summarize/summarize_evals.py` around lines 219 - 224, The loop over to_process calls await _summarize_course sequentially, underutilizing the semaphore; change to batch-based concurrent execution by grouping to_process into chunks (size matching your semaphore/MAX_CONCURRENT_REQUESTS) and use asyncio.gather to run [_summarize_course(llm, c, semaphore) for c in batch] concurrently, then iterate gather results, skip Exceptions and None, and append valid results to new_summaries (ensuring you handle return_exceptions=True and filter exceptions); reference functions/variables: _summarize_course, semaphore, to_process, new_summaries, and the progress bar (tqdm) to show batch progress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@main.py`:
- Around line 136-144: Replace the runtime assertion with explicit validation:
in the block that runs summarize_evals (check for args.summarize_evals), replace
the line "assert args.openai_api_key, ..." with an if-statement that checks "if
not args.openai_api_key:" and then terminate with a clear error, e.g. raise
SystemExit("API key is required for --summarize-evals") or call sys.exit(1)
after printing to stderr; keep the call to summarize_evals (and its parameters:
seasons, data_dir=args.data_dir, api_key=args.openai_api_key,
model=args.llm_model or DEFAULT_MODEL, base_url=args.llm_base_url) unchanged.
---
Nitpick comments:
In `@ferry/ai/client.py`:
- Line 32: The constructor of LLMClient contains an unnecessary
super().__init__() call since LLMClient does not inherit from any base class
other than object; remove the super().__init__() invocation from
LLMClient.__init__ to clean up the constructor and avoid redundant calls (search
for the LLMClient class and its __init__ method to locate and delete the
super().__init__() line).
In `@ferry/summarize/summarize_evals.py`:
- Around line 99-118: The loop currently awaits each asyncio.create_task
sequentially, losing parallelism; change the logic that creates tasks (tasks:
list[tuple[str, str, asyncio.Task[str]]]) to await all tasks concurrently using
asyncio.gather on the task objects (e.g., gather(*(t for _, _, t in tasks))),
then iterate the gathered results and pair each summary result back to its
question_code and question_text to build the return value; ensure you handle
exceptions from gather (use return_exceptions=True or wrap handling) so a
failing summary doesn't cancel others, and preserve the use of semaphore and
_summarize_comments.
- Around line 219-224: The loop over to_process calls await _summarize_course
sequentially, underutilizing the semaphore; change to batch-based concurrent
execution by grouping to_process into chunks (size matching your
semaphore/MAX_CONCURRENT_REQUESTS) and use asyncio.gather to run
[_summarize_course(llm, c, semaphore) for c in batch] concurrently, then iterate
gather results, skip Exceptions and None, and append valid results to
new_summaries (ensuring you handle return_exceptions=True and filter
exceptions); reference functions/variables: _summarize_course, semaphore,
to_process, new_summaries, and the progress bar (tqdm) to show batch progress.
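Tying the `ferry/ai/client.py` findings above together, a minimal sketch of what such an OpenAI-compatible wrapper could look like follows. The class and method names (`LLMClient`, `complete`, `DEFAULT_MODEL`) come from the review; the body is illustrative, assumes the async OpenAI SDK, and is not the repository's actual implementation.

```python
from openai import AsyncOpenAI

DEFAULT_MODEL = "gpt-4.1-mini-2025-04-14"  # assumed default, mirroring the review


class LLMClient:
    """Thin async wrapper over an OpenAI-compatible chat-completions endpoint."""

    def __init__(
        self,
        api_key: str,
        *,
        base_url: str | None = None,
        model: str = DEFAULT_MODEL,
    ) -> None:
        self._client = AsyncOpenAI(api_key=api_key, base_url=base_url)
        self.model = model

    async def complete(self, prompt: str) -> str:
        response = await self._client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        # Guard against None content, per the finding on summarize_evals.py
        return (response.choices[0].message.content or "").strip()
```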
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- config/release_summarize_evals.yml
- ferry/ai/__init__.py
- ferry/ai/client.py
- ferry/args_parser.py
- ferry/summarize/__init__.py
- ferry/summarize/summarize_evals.py
- main.py
🚧 Files skipped from review as they are similar to previous changes (1)
- config/release_summarize_evals.yml
```python
if args.summarize_evals:
    assert args.openai_api_key, "API key is required for --summarize-evals"
    await summarize_evals(
        seasons=seasons,
        data_dir=args.data_dir,
        api_key=args.openai_api_key,
        model=args.llm_model or DEFAULT_MODEL,
        base_url=args.llm_base_url,
    )
```
Replace assert with explicit error handling for API key validation.
Using assert for runtime validation is problematic because assertions can be disabled when Python runs with optimization flags (-O or -OO). This would silently bypass the API key check in optimized builds.
🛡️ Proposed fix
```diff
 if args.summarize_evals:
-    assert args.openai_api_key, "API key is required for --summarize-evals"
+    if not args.openai_api_key:
+        raise SystemExit("API key is required for --summarize-evals")
     await summarize_evals(
         seasons=seasons,
         data_dir=args.data_dir,
         api_key=args.openai_api_key,
         model=args.llm_model or DEFAULT_MODEL,
         base_url=args.llm_base_url,
     )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@main.py` around lines 136 - 144, Replace the runtime assertion with explicit
validation: in the block that runs summarize_evals (check for
args.summarize_evals), replace the line "assert args.openai_api_key, ..." with
an if-statement that checks "if not args.openai_api_key:" and then terminate
with a clear error, e.g. raise SystemExit("API key is required for
--summarize-evals") or call sys.exit(1) after printing to stderr; keep the call
to summarize_evals (and its parameters: seasons, data_dir=args.data_dir,
api_key=args.openai_api_key, model=args.llm_model or DEFAULT_MODEL,
base_url=args.llm_base_url) unchanged.
Summary by CodeRabbit
Release Notes
New Features
Chores