# 20-add boost-usage-tracker app (#71)
Conversation
- Add DB update logic from file
- Apply multiprocessing to handle the different rate limits of the GitHub search API and content scraping
- Apply bulk processing by repo for database updates
Important: Review skipped. Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI. This status message can be disabled in the CodeRabbit configuration.
📝 Walkthrough

Adds a new Django app `boost_usage_tracker` with models, migrations, services, repo/code searchers, post-processing, management commands, CSV/JSON loaders, admin registration, tests, and docs to discover and record Boost C++ usage and yearly repo counts by language.
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant CLI as Management Cmd
participant RepoSearch as repo_searcher
participant CodeSearch as boost_searcher
participant GitHub as GitHub API
participant PostProc as post_process
participant DB as Database
CLI->>RepoSearch: search_repos_with_date_splitting(start,end,language,min_stars)
RepoSearch->>GitHub: REST /search/repositories (paged)
GitHub-->>RepoSearch: paginated repo results
RepoSearch->>RepoSearch: split ranges if total>1000
RepoSearch-->>CLI: list[RepoSearchResult]
CLI->>CodeSearch: search_boost_include_files_batch(repos)
loop per-repo (batched)
CodeSearch->>GitHub: GraphQL code search for includes
GitHub-->>CodeSearch: file paths
CodeSearch->>GitHub: fetch file content & commit date (REST/GraphQL)
GitHub-->>CodeSearch: file content + commit metadata
CodeSearch-->>CLI: FileSearchResult
end
CLI->>PostProc: process_single_repo(repo, file_results, last_commit_dt)
PostProc->>DB: resolve BoostFile matches (exact & suffix)
DB-->>PostProc: BoostFile or none
PostProc->>DB: create/update BoostExternalRepository
PostProc->>DB: bulk_create_or_update_boost_usage / mark_usages_excepted_bulk
PostProc->>DB: create BoostMissingHeaderTmp for unresolved headers
DB-->>PostProc: commit summaries
PostProc-->>CLI: processing outcome
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 18
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
discord_activity_tracker/migrations/0002_migrate_users_to_discord_profile.py (1)
14-25: ⚠️ Potential issue | 🟠 Major

Pin ORM operations to the migration database alias.

Lines 14-25 use the default manager for reads and writes inside `RunPython`, while line 40 uses `schema_editor.execute()` on the migration-specific connection. In multi-database setups, this creates inconsistent migration state: ORM operations read/write to the default database while the SQL remapping targets the migration database.

Add `db_alias = schema_editor.connection.alias` at the start of `migrate_users_forward()`, then apply `.using(db_alias)` to all ORM operations:

Proposed fix
```diff
 def migrate_users_forward(apps, schema_editor):
     """Create DiscordProfile for each DiscordUser, remap DiscordMessage.author_id."""
     DiscordUser = apps.get_model("discord_activity_tracker", "DiscordUser")
     BaseProfile = apps.get_model("cppa_user_tracker", "BaseProfile")
     DiscordProfile = apps.get_model("cppa_user_tracker", "DiscordProfile")
+    db_alias = schema_editor.connection.alias

     # Build mapping: old DiscordUser.pk → new DiscordProfile.pk
     pk_map = {}
-    for du in DiscordUser.objects.all():
+    for du in DiscordUser.objects.using(db_alias).all().iterator():
         # Create BaseProfile row first (multi-table inheritance)
-        bp = BaseProfile.objects.create(type="discord")
+        bp = BaseProfile.objects.using(db_alias).create(type="discord")
         # Create DiscordProfile row
-        DiscordProfile.objects.create(
+        DiscordProfile.objects.using(db_alias).create(
             baseprofile_ptr_id=bp.pk,
             discord_user_id=du.user_id,
             username=du.username,
             display_name=du.display_name,
             avatar_url=du.avatar_url,
             is_bot=du.is_bot,
         )
         pk_map[du.pk] = bp.pk
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@discord_activity_tracker/migrations/0002_migrate_users_to_discord_profile.py` around lines 14 - 25, In migrate_users_forward(), pin all ORM operations to the migration DB by first getting db_alias = schema_editor.connection.alias and then call .using(db_alias) on every ORM access and write (e.g., DiscordUser.objects.using(db_alias).all(), BaseProfile.objects.using(db_alias).create(...), DiscordProfile.objects.using(db_alias).create(...)); make sure any other ORM calls in that function also use .using(db_alias) so reads/writes target the same migration-specific connection as schema_editor.execute().
🧹 Nitpick comments (12)
boost_usage_tracker/update_boostusage_from_csv.py (3)
32-35: `_parse_datetime` is duplicated across CSV loaders.

An identical helper exists in `boost_usage_tracker/update_repository_from_csv.py` (lines 48-51). Consider extracting it into a shared utility (e.g., `boost_usage_tracker/utils.py`) to reduce duplication.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_boostusage_from_csv.py` around lines 32 - 35, The helper function _parse_datetime is duplicated in update_boostusage_from_csv.py and update_repository_from_csv.py; extract it into a shared module (e.g., create boost_usage_tracker/utils.py with the _parse_datetime implementation) and update both modules to import _parse_datetime from boost_usage_tracker.utils instead of defining it inline; ensure the signature and behavior (handling None/empty strings and calling parse_datetime) remain identical and run tests/imports to confirm no breaking changes.
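For illustration, the shared helper could be as small as this (a sketch; the module path `boost_usage_tracker/utils.py` is the reviewer's suggestion, and the wrapper around Django's `parse_datetime` is assumed to mirror the existing copies):

```python
# boost_usage_tracker/utils.py -- proposed shared home for the duplicated helper
from datetime import datetime
from typing import Optional

from django.utils.dateparse import parse_datetime


def _parse_datetime(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO-8601 datetime from a CSV cell; None/empty returns None."""
    if not value:
        return None
    return parse_datetime(value.strip())
```

Both loaders would then import it from one place, so the copies cannot drift.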
82-88: Silently skipped rows are invisible in the result dict.

When a row has an empty `owner`, `repo_name`, `file_path`, or `boost_header_name`, it's `continue`d without incrementing any counter or logging. The caller has no way to know rows were dropped. Consider adding a `skipped_empty` counter or a debug log.

Proposed fix

```diff
+            if not owner or not repo_name or not file_path or not boost_header_name:
+                result.setdefault("skipped_empty", 0)
+                result["skipped_empty"] += 1
                 continue
```

And initialise `"skipped_empty": 0` in the result dict on Line 68.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_boostusage_from_csv.py` around lines 82 - 88, The CSV loop silently drops rows when owner/repo_name/file_path/boost_header_name are empty; initialize a skipped counter in the result dict (add "skipped_empty": 0 to the result structure constructed where results are initialized) and increment result["skipped_empty"] inside the for row in reader block right before the continue; optionally add a debug log there mentioning the row index/contents to aid troubleshooting.
111-113: `BoostFile` lookup by `github_file__filename` may match across repos.

`GitHubFile.filename` is not globally unique — multiple repos can have a file with the same name. For Boost headers this is unlikely to cause issues in practice, but a more precise filter (e.g., constraining to the Boost library repo) would be safer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_boostusage_from_csv.py` around lines 111 - 113, The lookup for boost_header using BoostFile.objects.filter(github_file__filename=...) can return matches from other repos because GitHubFile.filename is not unique; update the query that sets boost_header to also filter by the Boost repo (e.g., add a condition on the related repo field such as github_file__repository__full_name or github_file__repo__full_name == "boostorg/boost" or the appropriate repo identifier) so the filter is constrained to the Boost library repository before calling .first().
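As a sketch of the constrained filter (the relation path `github_file__repository__full_name` and the `boostorg/boost` identifier come from the prompt above and are assumptions about the actual schema):

```python
# Constrain the header lookup to the Boost library repository so a
# same-named file in an unrelated repo can no longer match.
boost_header = BoostFile.objects.filter(
    github_file__filename=boost_header_name,
    github_file__repository__full_name="boostorg/boost",
).first()
```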
boost_usage_tracker/models.py (1)

78-78: `excepted_at` is a `DateField` while all other `*_at` fields are `DateTimeField`.

This loses time-of-day precision and is inconsistent with `created_at`, `updated_at`, and `last_commit_date`. If this is intentional (only the date matters), consider adding a brief comment to explain why. Otherwise, switch to `DateTimeField` for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/models.py` at line 78, The excepted_at field is defined as models.DateField which drops time precision and is inconsistent with other timestamp fields (e.g., created_at, updated_at, last_commit_date); change excepted_at to models.DateTimeField(null=True, blank=True) to preserve time-of-day data (and run the migration), or if the date-only semantics are intentional add a short comment above the excepted_at declaration explaining why it differs.
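If consistency wins, the change itself is a one-liner (sketch; the `null=True, blank=True` options follow the prompt above):

```python
# Align excepted_at with the other *_at timestamp fields.
excepted_at = models.DateTimeField(null=True, blank=True)
```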
boost_usage_tracker/update_githubfile_from_csv.py (2)

26-32: `_parse_bool` is duplicated across CSV loaders.

The identical `_parse_bool` helper exists in `update_repository_from_csv.py` (lines 40-46) and here. Consider extracting it into a shared utility module (e.g., a common `csv_helpers.py`) to avoid drift between copies.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_githubfile_from_csv.py` around lines 26 - 32, Duplicate helper _parse_bool found in update_githubfile_from_csv.py and update_repository_from_csv.py; extract it into a new shared module (e.g., csv_helpers.py) and replace both copies with imports. Create csv_helpers.py exporting _parse_bool (or parse_bool) with the same logic, update update_githubfile_from_csv.py and update_repository_from_csv.py to import the function from that module, and remove the duplicated definitions so both files use the single shared helper.
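A possible shape for the shared helper (a sketch only — the set of accepted truthy strings is an assumption; the two existing copies are the source of truth):

```python
# csv_helpers.py -- hypothetical shared module suggested above
_TRUTHY = {"1", "true", "yes", "y", "t"}


def _parse_bool(value: str | None) -> bool:
    """Interpret a CSV cell as a boolean; None/empty counts as False."""
    if value is None:
        return False
    return value.strip().lower() in _TRUTHY
```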
79-106: Silently skipping rows with missing required fields.

Lines 83-84 skip rows where `owner`, `repo_name`, or `file_path` are empty without any logging. The `skipped_no_repo` counter (line 90) is tracked for missing repos, but completely empty rows are invisible. A debug-level log or a counter in the result dict would aid troubleshooting malformed CSVs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_githubfile_from_csv.py` around lines 79 - 106, The loop reading CSV rows currently silently continues when owner, repo_name, or file_path are empty; update the for-row logic (the block using variables owner, repo_name, file_path and the result dict) to increment a new counter like result["skipped_missing_fields"] and emit a debug log (including the row index or the raw row dict) when any required field is missing, so malformed/empty rows are visible alongside the existing skipped_no_repo handling; keep the existing repo lookup and create_or_update_github_file flow unchanged.

boost_usage_tracker/tests/test_services.py (2)

269-269: Unused unpacked variables flagged by Ruff (RUF059).

Same pattern as in test_models.py — prefix unused unpacked variables with `_`:

- Line 269: `usage2` → `_usage2`
- Line 530: `tmp2` → `_tmp2`
- Line 579: `usage` → `_usage`
- Line 595: `tmp` → `_tmp`

Also applies to: 530-530, 579-579, 595-595
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/tests/test_services.py` at line 269, Replace the unused unpacked variables by prefixing them with an underscore where create_or_update_boost_usage and similar calls unpack tuples: change usage2 to _usage2 (the call to services.create_or_update_boost_usage), tmp2 to _tmp2, usage to _usage, and tmp to _tmp so the unused values are explicitly ignored and Ruff RUF059 is satisfied; ensure only the variables actually used remain without the underscore prefix.
474-493: `select_related` test doesn't verify query count.

The test docstring states it verifies `select_related` behavior, but merely accessing `usage.boost_header` and `usage.file_path` doesn't fail when `select_related` is absent — it just triggers lazy loads. Use `django.test.utils.CaptureQueriesContext` or pytest-django's `django_assert_num_queries` fixture to actually assert no additional queries are executed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/tests/test_services.py` around lines 474 - 493, The test test_get_active_usages_for_repo_select_related currently only accesses related fields but doesn't assert query count; update the test to wrap access to usage.boost_header and usage.file_path with a query-assertion context (e.g., django.test.utils.CaptureQueriesContext or pytest-django's django_assert_num_queries) and assert that zero additional queries are executed when calling services.get_active_usages_for_repo(ext_repo) and then accessing those related attributes; keep references to the test function name and the service function get_active_usages_for_repo so the change is scoped to this test.
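With pytest-django the assertion could look like this (a sketch; the `ext_repo` fixture and the expected count of exactly one query are illustrative assumptions):

```python
import pytest

from boost_usage_tracker import services


@pytest.mark.django_db
def test_get_active_usages_for_repo_select_related(django_assert_num_queries, ext_repo):
    # A single query should fetch usages plus their related rows; touching
    # the related attributes afterwards must not trigger extra queries.
    with django_assert_num_queries(1):
        usages = list(services.get_active_usages_for_repo(ext_repo))
        for usage in usages:
            _ = usage.boost_header
            _ = usage.file_path
```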
boost_usage_tracker/tests/test_models.py (1)

156-156: Unused unpacked variables flagged by Ruff (RUF059).

Multiple test functions unpack values they never use. Prefix them with `_` to satisfy the linter and signal intent:

- Line 156: `tmp` → `_tmp`
- Line 199: `usage2` → `_usage2`
- Lines 263/268: `tmp1`, `c1`, `tmp2`, `c2` → `_tmp1`, `_c1`, `_tmp2`, `_c2`
- Line 386: `tmp2` → `_tmp2`
- Lines 448/453: `c1`, `c2` → `_c1`, `_c2`
- Line 478: `tmp` → `_tmp`

Also applies to: 199-199, 263-268
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/tests/test_models.py` at line 156, Several tests unpack multiple values from calls like get_or_create_missing_header_usage and other helpers but leave some variables unused; rename those unused unpacked variables to have a leading underscore to satisfy Ruff (RUF059). For example, in the call to get_or_create_missing_header_usage change usage, tmp, _ = ... to usage, _tmp, _ = ... (or usage, _tmp, _ = get_or_create_missing_header_usage), and apply the same pattern for the other occurrences listed: change usage2 → _usage2; tmp1, c1, tmp2, c2 → _tmp1, _c1, _tmp2, _c2; tmp2 → _tmp2; c1, c2 → _c1, _c2; and tmp → _tmp. Locate these in the test functions surrounding the symbols get_or_create_missing_header_usage and similar unpacking sites and rename the unused variables accordingly.

boost_usage_tracker/post_process.py (2)

218-221: Broad `except Exception` swallows errors silently.

While re-raising `ConnectionException` and `RateLimitException` is correct, the catch-all on line 220 means programming bugs (e.g., `KeyError`, `AttributeError`) will only appear as warnings in the log, and `stats` will be returned with partial/zero data — potentially hiding data-loss issues. Consider logging at `error` level with `exc_info=True` so stack traces are captured.

Proposed improvement
```diff
     except (ConnectionException, RateLimitException):
         raise
     except Exception as e:  # pylint: disable=broad-exception-caught
-        logger.warning("Failed post-processing %s: %s", repo_full_name, e)
+        logger.error("Failed post-processing %s: %s", repo_full_name, e, exc_info=True)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/post_process.py` around lines 218 - 221, The broad except in post_process (catching Exception) currently logs only a warning and swallows programming errors; change the except Exception as e handler so it logs at error level with full stack trace (use logger.error(..., exc_info=True) and include repo_full_name in the message) and then re-raise the exception so bugs (KeyError/AttributeError etc.) aren't silently hidden; keep the existing except (ConnectionException, RateLimitException) re-raise behavior unchanged.
39-53: Suffix fallback issues N DB queries per unresolved header path.

`_resolve_boost_header` runs up to `len(parts)` queries per path. For deeply nested headers this adds up. Since this is the fallback after a bulk exact-match miss, the volume should be low, but consider adding a local cache across calls if the same unresolved path appears in multiple repos within a batch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/post_process.py` around lines 39 - 53, _resolve_boost_header performs up to O(len(parts)) DB queries per call for the same unresolved header; add a local cache to avoid repeated queries across calls by decorating _resolve_boost_header with an appropriate cache (e.g., functools.lru_cache(maxsize=4096)) or by introducing a module-level dict that maps header_path -> BoostFile|None, keyed by the header_path string; ensure the cache stores either the BoostFile instance or None and is consulted before running the query loop (and consider evicting/clearing the cache between batches if repository state changes are a concern).
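The dict-based variant might look like this (a sketch; the wrapper and reset function names are hypothetical):

```python
# Module-level cache mapping header_path -> BoostFile instance or None.
_header_cache = {}


def _resolve_boost_header_cached(header_path):
    if header_path not in _header_cache:
        _header_cache[header_path] = _resolve_boost_header(header_path)
    return _header_cache[header_path]


def reset_header_cache():
    """Clear between batches in case Boost header rows change."""
    _header_cache.clear()
```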
boost_usage_tracker/update_git_account.py (1)

26-59: Near-identical helper functions duplicated from `db_from_file.py`.

`_load_json_records_from_path` and `_load_all_json_records` are completely duplicated between files. `get_github_account_dir` is nearly identical except it uses `GITHUB_ACCOUNT_SUBDIR` (public) versus `_GITHUB_ACCOUNT_SUBDIR` (private). Consider extracting these shared helpers to a common module to eliminate duplication, or if `db_from_file.py` is the canonical location, import from there (noting the constant naming difference that would need alignment).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_git_account.py` around lines 26 - 59, The three helpers get_github_account_dir, _load_json_records_from_path, and _load_all_json_records are duplicated from db_from_file.py; extract them into a shared module (e.g., boost_usage_tracker.utils or boost_usage_tracker.file_utils) and have update_git_account.py import and use them instead of redefining; ensure the directory-constant naming is reconciled (either rename GITHUB_ACCOUNT_SUBDIR to match _GITHUB_ACCOUNT_SUBDIR or export the canonical constant from db_from_file.py and update callers) and keep the same function signatures for _load_json_records_from_path and _load_all_json_records so callers like get_github_account_dir continue to work.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.env.example:
- Around line 86-89: The "Celery" header was accidentally placed inside the "App
credentials" block and lost its surrounding separator bars; fix by restoring the
standard section header pattern: end the "App credentials" separator block, then
create a new standalone Celery section using the same "===...===" separator
lines above and below the line "# Celery (optional; for scheduled
run_all_collectors)". Ensure the comment text remains unchanged and follows the
exact pattern used by other sections so "Celery" is its own top-level section.
In `@boost_usage_tracker/boost_searcher.py`:
- Around line 265-276: _build_boost_include_query currently trims the repo list
to fit MAX_CODE_SEARCH_QUERY_LEN but the caller search_boost_include_files_batch
never retries the remaining repos, causing silent skips; change
search_boost_include_files_batch to iterate over the full repo_full_names in a
loop: repeatedly call _build_boost_include_query on the yet-unprocessed slice of
repo_full_names (e.g., repo_full_names[i:]), process the returned query and
repos, and advance i by len(repos) until all repos are handled; also handle the
edge case where a single repo produces a query longer than
MAX_CODE_SEARCH_QUERY_LEN by logging/warning and skipping that repo to avoid an
infinite loop.
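As an illustration of the loop this prompt describes (a sketch; `build_query` stands in for `_build_boost_include_query` and is assumed to return both the query and the repos it actually fit):

```python
import logging

logger = logging.getLogger(__name__)


def search_in_batches(repo_full_names, build_query, run_query, max_query_len):
    """Re-batch the unprocessed tail until every repo is handled."""
    results = []
    i = 0
    while i < len(repo_full_names):
        query, batched = build_query(repo_full_names[i:], max_query_len)
        if not batched:
            # A single repo alone exceeds max_query_len: warn and skip it
            # instead of looping forever.
            logger.warning("Skipping oversized repo name: %s", repo_full_names[i])
            i += 1
            continue
        results.extend(run_query(query))
        i += len(batched)
    return results
```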
In `@boost_usage_tracker/db_from_file.py`:
- Around line 96-104: The loop over records assumes each item is a dict and
calls rec.get, which will raise AttributeError for nulls/strings/numbers; before
accessing rec.get in the for rec in records loop, check that rec is a mapping
(e.g., isinstance(rec, dict)) and if not, logger.warning("Skipping non-dict
record: %r", rec) and continue; keep the existing raw_id/try-except conversion
logic (raw_id, gid, logger) unchanged so only non-dict entries are skipped
safely.
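A standalone sketch of that guard (the generator shape is illustrative; the real loop keeps its existing `raw_id` conversion logic):

```python
import logging

logger = logging.getLogger(__name__)


def iter_dict_records(records):
    """Yield only mapping records; warn and skip nulls, strings, and numbers."""
    for rec in records:
        if not isinstance(rec, dict):
            logger.warning("Skipping non-dict record: %r", rec)
            continue
        yield rec
```

The existing loop then becomes `for rec in iter_dict_records(records): ...`, and every `rec.get(...)` call is safe.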
In `@boost_usage_tracker/management/commands/run_boost_usage_tracker.py`:
- Around line 374-388: The date parsing for options["until"] and
options["since"] can raise ValueError and should not abort the command; wrap the
datetime.strptime calls for computing until and since in try/except ValueError
blocks (or use helper parsing) so that if parsing options["until"] fails you set
until = now, and if parsing options["since"] fails you set since = until -
timedelta(days=1); update the code that sets the until and since variables
(referencing now, until, since and options) to apply these fallbacks and keep
tzinfo=timezone.utc as before.
In
`@boost_usage_tracker/management/commands/run_update_created_repos_by_language.py`:
- Around line 77-93: The success message block that writes summary using
self.style.SUCCESS should switch to self.style.WARNING when the result indicates
errors; update the writer logic around the block that formats the result (the
call using self.stdout.write(... self.style.SUCCESS(...).format(...))) to choose
self.style.WARNING whenever result contains errors (e.g., non-empty
result.get("errors") or when earlier error paths were taken), and mirror the
same change at the similar summary write later (the other occurrence referenced
by the reviewer). Ensure you preserve the existing formatted fields (requested,
processed, missing, start, end, stars, rows, created, updated) but select
WARNING style conditionally instead of always using SUCCESS.
In `@boost_usage_tracker/management/commands/run_update_db.py`:
- Around line 128-142: The handle method currently returns integers to indicate
success/failure which Django ignores; change it to raise
django.core.management.base.CommandError when result contains errors: after
calling runner(source) and detecting result.get("errors"), aggregate the error
messages from result["errors"] and raise CommandError with that text instead of
writing errors and returning 1; keep the successful path to call
formatter(result) and write the success message via
self.stdout.write(self.style.SUCCESS(msg)) and then return None (or simply end)
so the command exits with success. Ensure you import CommandError and reference
the existing handle, TARGETS, runner, and formatter symbols when updating the
code.
In
`@boost_usage_tracker/migrations/0002_add_boost_missing_header_tmp_and_nullable_boost_header.py`:
- Around line 65-68: Typo in the index name: change "bost_missing_tmp_usage_id"
to "boost_missing_tmp_usage_id" both in the migration AddIndex entry that
targets model_name="boostmissingheadertmp" (the migrations.AddIndex(...
index=models.Index(... name="bost_missing_tmp_usage_id") ) and in the model
definition for BoostMissingHeaderTmp where the Index with name
"bost_missing_tmp_usage_id" is declared (likely in Meta.indexes); update both
occurrences so the DB schema and model metadata use
"boost_missing_tmp_usage_id".
In `@boost_usage_tracker/models.py`:
- Around line 127-129: The index name in the Index declaration is
misspelled—update the models.Index entry in the indexes list (the
models.Index(fields=["usage"], name="bost_missing_tmp_usage_id") line) to use
"boost_missing_tmp_usage_id" instead of "bost_missing_tmp_usage_id", and then
regenerate the migration so the corrected name is applied before any migration
is run.
In `@boost_usage_tracker/repo_searcher.py`:
- Around line 157-172: The code can raise UnboundLocalError because second_diff
is only assigned inside the else branch; ensure second_diff is defined before
it's referenced in the final condition. Edit the logic in repo_searcher.py
around the date-splitting block: initialize second_diff (e.g., second_diff = 0)
before the if/else, or restructure so the final condition checks the existence
of second_range/second_diff safely; update the variables used by the subsequent
calls to _process_date_range (and related variables first_range_1/2,
second_range_1/2) so they are always defined or guarded before use.
In `@boost_usage_tracker/services.py`:
- Around line 235-260: The loop currently appends every matched BoostUsage to
to_update even if no fields changed; change the logic in the loop that iterates
(bh_id, fp_id), usage in existing_map.items() to detect real changes before
appending: compute new values for last_commit_date and excepted_at (e.g.,
new_last = key_to_item[key][2], new_excepted = None) and only if
usage.last_commit_date != new_last or usage.excepted_at != new_excepted then set
the fields on usage, set usage.updated_at later, and append to_update; leave
unchanged rows out of to_update so BoostUsage.objects.bulk_update only updates
truly modified rows and updated_count = len(to_update).
- Around line 184-193: When an existing BoostUsage row is re-detected (i.e.
get_or_create returned an existing record via usage, _), the code updates
last_commit_date but doesn't clear the excepted_at flag so the usage can remain
incorrectly excepted; update the branch that handles existing records (where _
is False) to set usage.excepted_at = None (and any related excepted_* fields if
applicable) before saving and include "excepted_at" in the update_fields list
along with "last_commit_date" and "updated_at" so the excepted state is cleared
when a usage is reactivated.
In `@boost_usage_tracker/tests/test_update_created_repos_by_language.py`:
- Around line 14-24: The test
test_update_created_repos_by_language_requires_languages currently uses
patch.dict("os.environ", {}, clear=False) which does not remove
REPO_COUNT_LANGUAGES (LANGUAGES_ENV_KEY) from the environment; update the patch
to guarantee the env var is absent by either using patch.dict("os.environ", {},
clear=True) or explicitly setting the key to an empty value using the module
constant (e.g., patch.dict("os.environ", {LANGUAGES_ENV_KEY: ""}, clear=False)
after importing LANGUAGES_ENV_KEY), so that update_created_repos_by_language
falls back to no languages and the assertions reliably exercise the error path.
In `@boost_usage_tracker/update_created_repos_by_language.py`:
- Around line 58-77: The early error return blocks around parsing languages
(where source_csv, language_names, _parse_languages_csv, and LANGUAGES_ENV_KEY
are used) return a payload missing keys present in the success path; update
those error returns to include the full stable schema (e.g., include
languages_requested, start_year, end_year, years, created, updated,
rows_processed, and errors) so callers always receive the same shape; apply the
same fix to the other early-return blocks mentioned (the ones near the other
_parse_languages_csv/validation checks) to ensure consistent payload across all
paths.
In `@boost_usage_tracker/update_git_account.py`:
- Around line 116-118: The log message incorrectly says "Skipping" even though
get_or_create_github_account was already called and the record upserted; either
validate username before calling get_or_create_github_account (move the username
== "" check above the call that performs the upsert) or change the warning text
to reflect post-upsert state (e.g., "Warning: record has empty username for
owner: %s") in update_git_account.py, referencing the username and owner
variables and keeping created_count/updated_count logic intact.
- Around line 84-99: Loop handling in update_git_account.py assumes owner is a
dict and calls owner.get("id") which raises AttributeError when rec["owner"] is
null; add a guard after setting owner (in the loop over records using variables
rec and owner) to check if owner is None or not a mapping/dict, log a warning
like "Skipping record missing owner or github_account_id: %s" with rec or owner,
and continue before attempting owner.get("id"); keep the existing int conversion
to gid and the warnings for missing/invalid github_account_id_raw unchanged.
In `@boost_usage_tracker/update_repository_from_csv.py`:
- Around line 1-5: The module docstring in update_repository_from_csv.py lists
expected CSV columns but omits the "forks" column which the code later parses
(the variable/name "forks" is read from the CSV). Update the top-of-file
docstring to include "forks" among the CSV columns (e.g., add it to the
comma-separated list alongside owner, repo_name, stars, description,
repo_pushed_at, repo_created_at, repo_updated_at, boost_version,
is_boost_embedded, is_boost_used) so the docstring matches the actual parsing
logic.
In `@config/settings.py`:
- Line 130: In the list of workspace slugs in settings.py the two string
literals "boost_usage_tracker" and "cppa_slack_transcript_tracker" are adjacent
without a comma causing implicit concatenation; add the missing comma between
them so they are two separate entries, ensuring downstream uses like
WORKSPACE_DIR / "cppa_slack_transcript_tracker" / "chrome_profile" reference the
correct directory name.
In `@github_activity_tracker/models.py`:
- Around line 58-60: The fields year, all_repos, and significant_repos currently
allow negative values; update their definitions to enforce non-negative values
at DB and model level by replacing IntegerField with PositiveIntegerField (or
adding MinValueValidator(0) if you need zero allowed) and add a CheckConstraint
in the model Meta (e.g., "year >= 0", "all_repos >= 0", "significant_repos >=
0") to the model(s) that contain these fields (the occurrences around the year,
all_repos, significant_repos definitions at lines ~58-60 and the other similar
block at ~67-72) so the database rejects negative values as well as the ORM.
Ensure constraints have unique names and include migration generation.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (43)
- .env.example
- boost_usage_tracker/__init__.py
- boost_usage_tracker/admin.py
- boost_usage_tracker/apps.py
- boost_usage_tracker/boost_searcher.py
- boost_usage_tracker/db_from_file.py
- boost_usage_tracker/management/__init__.py
- boost_usage_tracker/management/commands/__init__.py
- boost_usage_tracker/management/commands/run_boost_usage_tracker.py
- boost_usage_tracker/management/commands/run_update_created_repos_by_language.py
- boost_usage_tracker/management/commands/run_update_db.py
- boost_usage_tracker/migrations/0001_initial.py
- boost_usage_tracker/migrations/0002_add_boost_missing_header_tmp_and_nullable_boost_header.py
- boost_usage_tracker/migrations/__init__.py
- boost_usage_tracker/models.py
- boost_usage_tracker/post_process.py
- boost_usage_tracker/repo_searcher.py
- boost_usage_tracker/services.py
- boost_usage_tracker/tests/__init__.py
- boost_usage_tracker/tests/fixtures.py
- boost_usage_tracker/tests/test_db_from_file.py
- boost_usage_tracker/tests/test_models.py
- boost_usage_tracker/tests/test_services.py
- boost_usage_tracker/tests/test_update_created_repos_by_language.py
- boost_usage_tracker/update_boostusage_from_csv.py
- boost_usage_tracker/update_created_repos_by_language.py
- boost_usage_tracker/update_git_account.py
- boost_usage_tracker/update_githubfile_from_csv.py
- boost_usage_tracker/update_repository_from_csv.py
- config/settings.py
- conftest.py
- discord_activity_tracker/migrations/0002_migrate_users_to_discord_profile.py
- docs/Contributing.md
- docs/Schema.md
- docs/Service_API.md
- docs/service_api/README.md
- docs/service_api/boost_usage_tracker.md
- github_activity_tracker/migrations/0002_createdreposbylanguage.py
- github_activity_tracker/models.py
- github_activity_tracker/services.py
- github_activity_tracker/tests/test_models.py
- github_activity_tracker/tests/test_services.py
- workflow/management/commands/run_all_collectors.py
@coderabbitai full review
✅ Actions performed: Full review triggered.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
discord_activity_tracker/migrations/0002_migrate_users_to_discord_profile.py (1)
14-40: ⚠️ Potential issue | 🟠 Major

Bind ORM operations to the migration database alias to prevent data inconsistency.

Lines 14, 16, and 18 execute ORM queries and writes without `schema_editor.connection.alias`. In multi-database or database-router setups, these operations default to the DEFAULT database instead of the migration's active database, while the raw SQL at line 40 correctly executes on the migration connection. This creates a race condition where new DiscordProfile records may be written to one database while the raw SQL update reads/writes from another, breaking data integrity.

Suggested fix
```diff
 def migrate_users_forward(apps, schema_editor):
     """Create DiscordProfile for each DiscordUser, remap DiscordMessage.author_id."""
     DiscordUser = apps.get_model("discord_activity_tracker", "DiscordUser")
     BaseProfile = apps.get_model("cppa_user_tracker", "BaseProfile")
     DiscordProfile = apps.get_model("cppa_user_tracker", "DiscordProfile")
+    db_alias = schema_editor.connection.alias

     # Build mapping: old DiscordUser.pk → new DiscordProfile.pk
     pk_map = {}
-    for du in DiscordUser.objects.all():
+    for du in DiscordUser.objects.using(db_alias).all():
         # Create BaseProfile row first (multi-table inheritance)
-        bp = BaseProfile.objects.create(type="discord")
+        bp = BaseProfile.objects.using(db_alias).create(type="discord")
         # Create DiscordProfile row
-        DiscordProfile.objects.create(
+        DiscordProfile.objects.using(db_alias).create(
             baseprofile_ptr_id=bp.pk,
             discord_user_id=du.user_id,
             username=du.username,
             display_name=du.display_name,
             avatar_url=du.avatar_url,
             is_bot=du.is_bot,
         )
         pk_map[du.pk] = bp.pk
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@discord_activity_tracker/migrations/0002_migrate_users_to_discord_profile.py` around lines 14 - 40, The ORM queries and saves must be bound to the migration DB alias: change DiscordUser.objects.all(), BaseProfile.objects.create(...) and DiscordProfile.objects.create(...) to use the migration connection alias (e.g. .using(schema_editor.connection.alias) on the queryset/create calls) so all reads/writes occur on schema_editor.connection.alias rather than the default DB; keep the raw SQL on schema_editor.execute as-is so both ORM operations and the SQL update target the same database.
♻️ Duplicate comments (14)
boost_usage_tracker/tests/test_update_created_repos_by_language.py (1)
14-24: ⚠️ Potential issue | 🟡 Minor

`patch.dict("os.environ", {}, clear=False)` is a no-op — `REPO_COUNT_LANGUAGES` is not cleared.

This was raised in a previous review and remains unresolved. If the env var is set in CI or via a `.env` loader, the test will silently exercise the wrong code path.

Proposed fix — pin the key to empty
```diff
-        with patch.dict("os.environ", {}, clear=False):
+        with patch.dict("os.environ", {"REPO_COUNT_LANGUAGES": ""}, clear=False):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/tests/test_update_created_repos_by_language.py` around lines 14 - 24, The test currently uses patch.dict("os.environ", {}, clear=False) which is a no-op; change the patch to explicitly set the REPO_COUNT_LANGUAGES key to an empty string so the code path that requires languages is exercised. Concretely, update the test to use patch.dict("os.environ", {"REPO_COUNT_LANGUAGES": ""}, clear=False) (or clear=True if you prefer to wipe all env vars) so update_created_repos_by_language is forced to rely on the provided languages_csv and return the expected error; keep references to update_created_repos_by_language and REPO_COUNT_LANGUAGES when making the change.

github_activity_tracker/models.py (1)

58-60: ⚠️ Potential issue | 🟠 Major

Missing DB-level non-negative constraints on `year`, `all_repos`, and `significant_repos`.

This was raised in a previous review and remains unresolved. Negative counts in aggregate metrics would silently corrupt reporting.
Proposed schema hardening
```diff
     class Meta:
         db_table = "github_activity_tracker_createdreposbylanguage"
         ordering = ["language", "year"]
         constraints = [
             models.UniqueConstraint(
                 fields=["language", "year"],
                 name="github_activity_tracker_created_lang_year_uniq",
-            )
+            ),
+            models.CheckConstraint(
+                check=models.Q(year__gte=0),
+                name="github_activity_tracker_created_lang_year_non_negative",
+            ),
+            models.CheckConstraint(
+                check=models.Q(all_repos__gte=0),
+                name="github_activity_tracker_created_lang_all_repos_non_negative",
+            ),
+            models.CheckConstraint(
+                check=models.Q(significant_repos__gte=0),
+                name="github_activity_tracker_created_lang_sig_repos_non_negative",
+            ),
         ]
```

A corresponding migration will also be needed.
Also applies to: 67-72
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@github_activity_tracker/models.py` around lines 58 - 60, Change the IntegerField definitions for year, all_repos, and significant_repos to non-negative types and add a DB-level CheckConstraint: use models.PositiveIntegerField (or IntegerField with MinValueValidator(0)) for year (keep db_index=True) and models.PositiveIntegerField(default=0) for all_repos and significant_repos, then add a Meta.constraints entry with CheckConstraint(expressions=Q(year__gte=0) & Q(all_repos__gte=0) & Q(significant_repos__gte=0), name='githubactivity_nonnegative_counts') to the model to enforce the rule at the DB level; apply the same changes to the other similar fields mentioned (lines 67-72) and generate the corresponding migration.

boost_usage_tracker/models.py (1)

127-129: ⚠️ Potential issue | 🟡 Minor

Fix typo in index name before migrations settle.

Line 128 has `bost_missing_tmp_usage_id`; this should be `boost_missing_tmp_usage_id`.

Suggested fix
```diff
         indexes = [
-            models.Index(fields=["usage"], name="bost_missing_tmp_usage_id"),
+            models.Index(fields=["usage"], name="boost_missing_tmp_usage_id"),
         ]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/models.py` around lines 127 - 129, The index name in the model Meta indexes array is misspelled: change the models.Index name value "bost_missing_tmp_usage_id" to the correct "boost_missing_tmp_usage_id" so the index name is accurate before migrations are finalized; update the indexes entry where models.Index(fields=["usage"], name="bost_missing_tmp_usage_id") is declared to use the corrected string.

config/settings.py (1)

130-130: ⚠️ Potential issue | 🔴 Critical

Critical: missing comma causes implicit string concatenation.

At Line 130, two literals are concatenated into one slug, so neither the `boost_usage_tracker` nor the `cppa_slack_transcript_tracker` workspace directory is created correctly.

Suggested fix
```diff
 _WORKSPACE_APP_SLUGS = (
     "github_activity_tracker",
     "boost_library_tracker",
-    "boost_usage_tracker" "cppa_slack_transcript_tracker",
+    "boost_usage_tracker",
+    "cppa_slack_transcript_tracker",
     "discord_activity_tracker",
     "shared",
 )
```

```bash
#!/bin/bash
# Verify the adjacent string literal pattern and impacted workspace slug block.
rg -n -C2 '"boost_usage_tracker"\s*"cppa_slack_transcript_tracker"|_WORKSPACE_APP_SLUGS|chrome_profile' config/settings.py
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/settings.py` at line 130, The list of workspace slugs contains adjacent string literals that are implicitly concatenated; add a missing comma between "boost_usage_tracker" and "cppa_slack_transcript_tracker" in the workspace slug block (the _WORKSPACE_APP_SLUGS / workspace slug list in config/settings.py) so each slug is its own entry and both directories are created correctly.

boost_usage_tracker/management/commands/run_update_created_repos_by_language.py (1)

77-95: ⚠️ Potential issue | 🟡 Minor

Use warning style for summary when errors are present.

Line 78 always formats the summary with `SUCCESS`, even when errors were printed and Line 95 returns `1`. This makes failed runs look successful in logs.

Suggested fix
```diff
+        summary_style = (
+            self.style.WARNING if result.get("errors") else self.style.SUCCESS
+        )  # pylint: disable=no-member
         self.stdout.write(
-            self.style.SUCCESS(  # pylint: disable=no-member
+            summary_style(
                 "languages_requested={requested} processed={processed} missing={missing} "
                 "years={start}-{end} stars_min={stars} rows_processed={rows} "
                 "created={created} updated={updated}".format(
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/management/commands/run_update_created_repos_by_language.py` around lines 77 - 95, The summary message always uses self.style.SUCCESS in the self.stdout.write call even when result.get("errors") is truthy; change the formatting to use a warning (e.g., self.style.WARNING) when result.get("errors") is present so the printed summary matches the exit code. Locate the self.stdout.write(...) block that formats requested/processed/missing/... using result.get(...) and conditionally choose style.SUCCESS when no errors and style.WARNING when result.get("errors") is truthy, keeping the same message content, then keep the existing return 1 if result.get("errors") else 0 behavior.

boost_usage_tracker/update_git_account.py (2)

84-90: ⚠️ Potential issue | 🟠 Major

Guard `owner` before dereferencing to avoid runtime crash.

At Line 88, `owner.get("id")` can raise `AttributeError` when `rec["owner"]` is `None` or non-object.

Proposed fix
```diff
     for rec in records:
         owner = rec
         if "owner" in rec:
             owner = rec.get("owner")
+        if not isinstance(owner, dict):
+            logger.warning(
+                "Skipping record missing owner or github_account_id: %s", rec
+            )
+            continue
         github_account_id_raw = owner.get("id")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_git_account.py` around lines 84 - 90, The loop dereferences owner with owner.get("id") which can raise AttributeError if owner is None or not a mapping; update the loop in the records processing (the for rec in records block) to first validate owner (the owner variable derived from rec or rec.get("owner")) is a dict-like object (e.g., check owner is not None and isinstance(owner, dict) or hasattr(owner, "get")) before calling owner.get("id"); if the owner is invalid, emit the existing logger.warning (or an enhanced message referencing owner) and continue to the next record so github_account_id_raw is never accessed on a non-object.
116-117: ⚠️ Potential issue | 🟡 Minor

Fix misleading log wording after successful upsert.

Line 117 says “Skipping”, but the account has already been created/updated on Lines 104-111.
Proposed fix
```diff
         if username == "":
-            logger.warning("Skipping record with empty username: %s", owner)
+            logger.warning("Upserted record with empty username: %s", owner)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_git_account.py` around lines 116 - 117, The log message "Skipping record with empty username: %s" is misleading because the account was already created/updated earlier; update the logger call in the block that checks username == "" to reflect that the upsert was completed for this owner (e.g., logger.info or logger.debug with text like "Account created/updated for owner with empty username: %s") and keep the referenced variables owner and username in the message so it clearly identifies the record; modify the logger.warning call to logger.info (or logger.debug) and change the message text accordingly in the same function where owner and username are used.

boost_usage_tracker/update_created_repos_by_language.py (1)

58-77: ⚠️ Potential issue | 🟡 Minor

Keep error returns schema-compatible with success return.

Current early returns omit keys present in normal output (`languages_requested`, `start_year`, `end_year`, etc.), which complicates command/reporting code.

Proposed fix pattern
```diff
+def _result_template(*, start_year: int, end_year: int, stars_min: int) -> dict[str, Any]:
+    return {
+        "languages_requested": [],
+        "languages_processed": [],
+        "languages_missing": [],
+        "start_year": start_year,
+        "end_year": end_year,
+        "stars_min": stars_min,
+        "created": 0,
+        "updated": 0,
+        "rows_processed": 0,
+        "errors": [],
+    }
@@
     if start_year > end_year:
-        return {
-            "created": 0,
-            "updated": 0,
-            "rows_processed": 0,
-            "errors": [
-                f"Invalid year range: start_year({start_year}) > end_year({end_year})"
-            ],
-        }
+        result = _result_template(
+            start_year=start_year,
+            end_year=end_year,
+            stars_min=stars_min,
+        )
+        result["errors"].append(
+            f"Invalid year range: start_year({start_year}) > end_year({end_year})"
+        )
+        return result
```

Also applies to: 95-100
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_created_repos_by_language.py` around lines 58 - 77, Early return payloads in update_created_repos_by_language.py omit fields present in successful results; make all early error returns include the full schema (e.g., languages_requested, start_year, end_year, created, updated, rows_processed, errors) so callers can rely on a consistent shape. Locate the two early-return blocks around the _parse_languages_csv call (and the other one at lines ~95-100) and update them to populate languages_requested (use language_names or source_csv), start_year and end_year (use existing start_year/end_year variables), plus created/updated/rows_processed set to 0, and append the error message into errors, matching the keys used by the normal success return.

boost_usage_tracker/update_repository_from_csv.py (1)

4-5: ⚠️ Potential issue | 🟡 Minor

Include `forks` in the documented CSV columns.

`forks` is parsed on Lines 108-109 but is still missing from the module docstring column list, so docs and behavior diverge.

Proposed fix
```diff
-CSV columns: owner, repo_name, stars, description, repo_pushed_at, repo_created_at,
+CSV columns: owner, repo_name, stars, forks, description, repo_pushed_at, repo_created_at,
 repo_updated_at, boost_version, is_boost_embedded, is_boost_used.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_repository_from_csv.py` around lines 4 - 5, The module docstring's CSV column list is missing "forks" even though the code parses a forks field (see the forks parsing around the existing parsing logic at lines ~108-109); update the top-of-file docstring CSV columns line to include "forks" so the documented columns match the actual parser behavior, i.e., add forks to the comma-separated list of columns in the module docstring.

boost_usage_tracker/management/commands/run_update_db.py (1)

135-142: ⚠️ Potential issue | 🟠 Major

Use `CommandError` for failure instead of returning integers from `handle()`.

Returning `1`/`0` here is not a reliable CLI failure signal; raise `CommandError` when `result["errors"]` is present.

Proposed fix
```diff
-from django.core.management.base import BaseCommand
+from django.core.management.base import BaseCommand, CommandError
@@
         if result.get("errors"):
             for err in result["errors"]:
                 self.stderr.write(self.style.ERROR(err))  # pylint: disable=no-member
-            return 1
+            raise CommandError(f"run_update_db failed for target={target}")
@@
         msg = formatter(result)
         self.stdout.write(self.style.SUCCESS(msg))  # pylint: disable=no-member
-        return 0
+        return None
```

Use this read-only check to verify Django’s behavior in your environment:
```bash
#!/bin/bash
python - <<'PY'
import inspect
from django.core.management.base import BaseCommand
print("=== BaseCommand.execute ===")
print(inspect.getsource(BaseCommand.execute))
print("\n=== BaseCommand.run_from_argv ===")
print(inspect.getsource(BaseCommand.run_from_argv))
PY
```

Expected result: `run_from_argv()` does not map `handle()` integer return values to process exit codes; non-zero exit comes from raised exceptions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/management/commands/run_update_db.py` around lines 135 - 142, Replace the integer return flow in handle() with a raised CommandError: when result.get("errors") is truthy, import CommandError from django.core.management and raise CommandError with a clear message (e.g. join result["errors"] or include relevant details) instead of writing to stderr and returning 1; keep the success path (use formatter(result), self.stdout.write(self.style.SUCCESS(msg))) and simply return None (no numeric return). Ensure you reference the same symbols: result, formatter, self.stderr/self.stdout, and raise CommandError.

boost_usage_tracker/migrations/0002_add_boost_missing_header_tmp_and_nullable_boost_header.py (1)

67-67: ⚠️ Potential issue | 🟡 Minor

Fix index name typo before schema lands.

`bost_missing_tmp_usage_id` appears misspelled and will persist in DB metadata.

Proposed fix
```diff
-            index=models.Index(fields=["usage"], name="bost_missing_tmp_usage_id"),
+            index=models.Index(fields=["usage"], name="boost_missing_tmp_usage_id"),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/migrations/0002_add_boost_missing_header_tmp_and_nullable_boost_header.py` at line 67, The index name in the migration's Index declaration is misspelled; update the Index(... name="bost_missing_tmp_usage_id") to the correct spelling (e.g., "boost_missing_tmp_usage_id") in the migration file so the database metadata uses the intended name; locate the Index call in the migration (the models.Index(fields=["usage"], name="bost_missing_tmp_usage_id") line) and change only the name string to the corrected identifier.boost_usage_tracker/services.py (2)
235-260: ⚠️ Potential issue | 🟠 Major

Bulk upsert currently updates rows even when nothing changed.

Line 245 appends all matched usages to `to_update`, so unchanged rows still get `bulk_update`d and `updated_count` is overstated.

🔧 Proposed fix
```diff
 for (bh_id, fp_id), usage in existing_map.items():
     key = (bh_id, fp_id)
     if key not in key_to_item:
         continue
     to_create_keys.discard(key)
     boost_header, file_path, last_commit_date = key_to_item[key]
+    changed = False
     if last_commit_date is not None and usage.last_commit_date != last_commit_date:
         usage.last_commit_date = last_commit_date
+        changed = True
     if usage.excepted_at is not None:
         usage.excepted_at = None
-    to_update.append(usage)
+        changed = True
+    if changed:
+        to_update.append(usage)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/services.py` around lines 235 - 260, The code currently appends every matched usage to to_update causing unchanged rows to be bulk_updated and inflating updated_count; change the logic in the loop that iterates existing_map items (referencing existing_map, key_to_item, usage, to_update) to compare the current fields (last_commit_date and excepted_at) against the values from key_to_item and only append usage to to_update when at least one of those fields actually changes, then set usage.updated_at for those changed records before calling BoostUsage.objects.bulk_update and compute updated_count as the length of the filtered to_update list.
184-193: ⚠️ Potential issue | 🟠 Major

Re-detected missing-header usage can remain incorrectly excepted.

At Line 190-Line 193, existing placeholder usages update `last_commit_date` but never clear `excepted_at`. Re-detected rows can stay inactive.

🔧 Proposed fix
```diff
-    usage, _ = BoostUsage.objects.get_or_create(
+    usage, created_usage = BoostUsage.objects.get_or_create(
         repo=repo,
         boost_header=None,
         file_path=file_path,
         defaults={"last_commit_date": last_commit_date},
     )
-    if not _ and last_commit_date and usage.last_commit_date != last_commit_date:
-        usage.last_commit_date = last_commit_date
-        usage.save(update_fields=["last_commit_date", "updated_at"])
+    if not created_usage:
+        changed = False
+        if last_commit_date is not None and usage.last_commit_date != last_commit_date:
+            usage.last_commit_date = last_commit_date
+            changed = True
+        if usage.excepted_at is not None:
+            usage.excepted_at = None
+            changed = True
+        if changed:
+            usage.save(update_fields=["last_commit_date", "excepted_at", "updated_at"])
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/services.py` around lines 184 - 193, When updating an existing BoostUsage after get_or_create (the variable usage returned from BoostUsage.objects.get_or_create), clear any stale exception state by setting usage.excepted_at = None before saving if you change last_commit_date; include "excepted_at" in the save(update_fields=[...]) call so the null is persisted. Locate the block that checks "if not _ and last_commit_date and usage.last_commit_date != last_commit_date" and update it to reset excepted_at and save with update_fields=["last_commit_date", "updated_at", "excepted_at"] (ensure excepted_at is set to None).

boost_usage_tracker/management/commands/run_boost_usage_tracker.py (1)
375-388: ⚠️ Potential issue | 🟠 Major

Invalid `--since`/`--until` input still hard-fails command execution.

Line 376 and Line 383 can raise `ValueError`, which aborts the command instead of falling back to defaults.

🔧 Proposed permissive parsing
```diff
         now = datetime.now(timezone.utc)
-        until = (
-            datetime.strptime(options["until"], "%Y-%m-%d").replace(
-                tzinfo=timezone.utc,
-            )
-            if options["until"]
-            else now
-        )
-        since = (
-            datetime.strptime(options["since"], "%Y-%m-%d").replace(
-                tzinfo=timezone.utc,
-            )
-            if options["since"]
-            else until - timedelta(days=1)
-        )
+        def _parse_ymd_or_none(value: str | None) -> datetime | None:
+            if not value:
+                return None
+            try:
+                return datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)
+            except ValueError:
+                return None
+
+        until = _parse_ymd_or_none(options["until"]) or now
+        since = _parse_ymd_or_none(options["since"]) or (until - timedelta(days=1))
```

Based on learnings, invalid date inputs in this repository’s management commands should fall back to default logic instead of raising.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/management/commands/run_boost_usage_tracker.py` around lines 375 - 388, The parsing of options["until"] and options["since"] using datetime.strptime can raise ValueError and currently aborts execution; wrap each datetime.strptime call that sets until and since in a try/except ValueError block so that on parse failure you fall back to the existing defaults (for until use now, for since use until - timedelta(days=1)), and optionally emit a warning/log about the invalid input; update the logic around the until and since assignments (the blocks that reference options["until"], options["since"], datetime.strptime, tzinfo=timezone.utc, and the default calculations) to implement this permissive parsing behavior.
🧹 Nitpick comments (5)
boost_usage_tracker/migrations/0001_initial.py (1)
40-40: Verify the field name `excepted_at` is intentional.

`excepted_at` is an unusual name; did you mean `extracted_at`, `exempted_at`, or something else? If it tracks when a record was marked as an exception/exclusion, a more descriptive name like `exempted_at` or `excluded_at` might reduce confusion.
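If the rename does turn out to be warranted, a follow-up migration is enough. Below is a minimal sketch, assuming the field lives on `BoostUsage`, the target name is `exempted_at`, and the app's `0002` migration is the latest dependency; all three are assumptions, not confirmed by this PR:

```python
# Hypothetical rename migration; model_name, the new field name, and the
# dependency below are illustrative assumptions, not part of this PR.
from django.db import migrations


class Migration(migrations.Migration):
    dependencies = [
        (
            "boost_usage_tracker",
            "0002_add_boost_missing_header_tmp_and_nullable_boost_header",
        ),
    ]

    operations = [
        # RenameField keeps the column's data; only the name changes.
        migrations.RenameField(
            model_name="boostusage",
            old_name="excepted_at",
            new_name="exempted_at",
        ),
    ]
```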
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/migrations/0001_initial.py` at line 40, The migration defines a DateField named excepted_at which is likely a typo and should be clarified/renamed (e.g., extracted_at, exempted_at, or excluded_at) — locate the field definition in the initial migration and in the corresponding model (search for excepted_at) and decide the intended name; if it's a typo, update the model field name to the correct one and create a new migration that renames the field (use a RenameField operation) rather than editing already-applied historic migrations, or if the project is pre-release and safe to rewrite, update the initial migration and all references to excepted_at to the chosen name (e.g., exempted_at) to keep naming consistent.

boost_usage_tracker/boost_searcher.py (1)
248-252: Silently swallowed exception hides commit-date fetch failures.

The bare `except Exception: pass` loses all diagnostic information. Even though `commit_date` is optional, logging at `debug` level (consistent with the rest of the file) would help troubleshoot issues.

♻️ Proposed fix
```diff
         except Exception:
-            pass
+            logger.debug(
+                "Failed to fetch commit date for %s/%s",
+                repo_full_name,
+                file_path,
+            )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/boost_searcher.py` around lines 248 - 252, The try/except block that parses commits to set commit_date currently swallows all errors (except Exception: pass); change it to catch Exception as e and emit a debug-level log (consistent with other uses in this module) with the error details (e or use exc_info/stack) so failures to parse commits/date are recorded; keep commit_date optional and unchanged on error but do not silently ignore exceptions—update the block around the commits parsing (where commits is inspected and datetime.fromisoformat(...) is called) to log the diagnostic information.

boost_usage_tracker/repo_searcher.py (1)
133-133: Remove commented-out `time.sleep` call.

This looks like a leftover from development. If the rate-limit handling in the client makes this unnecessary, remove the dead comment to avoid confusion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/repo_searcher.py` at line 133, Remove the leftover commented-out call "# time.sleep(SEARCH_DELAY)" from repo_searcher.py to eliminate dead code and confusion; locate the commented symbol time.sleep(SEARCH_DELAY) in the search loop (where SEARCH_DELAY is referenced) and delete the line, or if a deliberate placeholder is required, replace it with a clear TODO comment referencing SEARCH_DELAY and the surrounding search logic.

github_activity_tracker/services.py (1)
79-79: Use unpacking instead of list concatenation.

Ruff RUF005 flags `update_fields + ["updated_at"]` as less idiomatic.

♻️ Proposed fix
```diff
-        row.save(update_fields=update_fields + ["updated_at"])
+        row.save(update_fields=[*update_fields, "updated_at"])
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@github_activity_tracker/services.py` at line 79, Replace the list-concatenation passed to row.save with list unpacking: instead of using update_fields + ["updated_at"] when calling the save method on the row object, build the update_fields argument using unpacking so the call to row.save(update_fields=...) uses the expanded original update_fields plus "updated_at"; locate the save call on the variable named row (row.save(...)) in github_activity_tracker/services.py and update the argument accordingly.

boost_usage_tracker/tests/test_models.py (1)
156-156: Use `_` for intentionally unused tuple elements to satisfy lint cleanly.

Several unpacked values are not used; renaming to `_` (or `_tmp`, `_created`) keeps intent explicit and avoids RUF059 noise.

Example cleanup
```diff
-    usage, tmp, _ = services.get_or_create_missing_header_usage(
+    usage, _, _ = services.get_or_create_missing_header_usage(
@@
-    usage2, created2 = services.create_or_update_boost_usage(
+    _, created2 = services.create_or_update_boost_usage(
@@
-    usage1, tmp1, c1 = services.get_or_create_missing_header_usage(
+    usage1, _, _ = services.get_or_create_missing_header_usage(
```

Also applies to: 199-199, 263-263, 268-268, 386-386, 448-448, 453-453, 478-478
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/tests/test_models.py` at line 156, Several tests unpack tuple returns but leave some elements unused (e.g., the call to services.get_or_create_missing_header_usage), which triggers lint RUF059; update those unpackings to use `_` (or `_tmp`, `_created`) for intentionally unused elements so intent is clear and lint clean. Specifically, locate calls such as services.get_or_create_missing_header_usage and the other occurrences referenced (around the noted lines) and change the tuple unpack pattern from e.g. "usage, tmp, _" or "a, b, c" to use "_" for any value that isn’t referenced later in the test; keep the variable name for the used element(s) only and leave the rest as underscores.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@boost_usage_tracker/management/commands/run_boost_usage_tracker.py`:
- Around line 215-220: The global watermark last_commit_dt computed from
BoostUsage.objects.aggregate(Max("last_commit_date")) can skip valid updates in
slower repos; replace the global watermark with a per-repository watermark:
compute the max last_commit_date grouped by the repo identifier from BoostUsage
(falling back to CREATION_START_DEFAULT) and pass the repo-specific value into
monitor_content (or any function using last_commit_dt) so each repo uses its own
max last_commit_date instead of the single last_commit_dt; update any call sites
that reference last_commit_dt to accept a repo parameter or lookup the per-repo
watermark by repository id/name.
In `@boost_usage_tracker/tests/test_services.py`:
- Line 269: In tests/test_services.py replace intentionally unused unpacked
variables returned from services.create_or_update_boost_usage with underscore
placeholders (e.g., change "usage2, created2 =
services.create_or_update_boost_usage(...)" to "_, created2 =
services.create_or_update_boost_usage(...)" or "usageX, _ = ..." depending on
which value is unused) for each occurrence (the calls around the
create_or_update_boost_usage invocations flagged). This removes the unused-name
lint warnings (RUF059) while keeping the used return values intact.
In `@boost_usage_tracker/update_boostusage_from_csv.py`:
- Line 4: The CSV contract documents an excepted_at column but the import loop
in update_boostusage_from_csv.py never applies it; update the CSV row parsing to
extract the excepted_at field (parse as datetime or None), and pass that value
into the service call that creates/updates each record (e.g., the boost usage
import loop where owner, repo_name, file_path, boost_header_name, last_commit_ts
are mapped) so the service method (create_or_update_boost_usage /
BoostUsageService.*) receives and persists excepted_at instead of silently
dropping it; alternatively, if you intend to drop the column, remove excepted_at
from the declared CSV columns to keep the contract consistent.
In `@boost_usage_tracker/update_git_account.py`:
- Around line 142-143: The single-file branch doesn't handle JSON parsing or IO
errors like the directory branch does; wrap the call to
_load_json_records_from_path(path) (the block starting with "elif path.is_file()
and path.suffix.lower() == '.json':") in a try/except that catches
json.JSONDecodeError and OSError, log an explanatory error (including the
exception) and gracefully exit/return the same way directory-mode does so the
command doesn't crash on malformed or unreadable JSON.
---
Outside diff comments:
In
`@discord_activity_tracker/migrations/0002_migrate_users_to_discord_profile.py`:
- Around line 14-40: The ORM queries and saves must be bound to the migration DB
alias: change DiscordUser.objects.all(), BaseProfile.objects.create(...) and
DiscordProfile.objects.create(...) to use the migration connection alias (e.g.
.using(schema_editor.connection.alias) on the queryset/create calls) so all
reads/writes occur on schema_editor.connection.alias rather than the default DB;
keep the raw SQL on schema_editor.execute as-is so both ORM operations and the
SQL update target the same database.
---
Duplicate comments:
In `@boost_usage_tracker/management/commands/run_boost_usage_tracker.py`:
- Around line 375-388: The parsing of options["until"] and options["since"]
using datetime.strptime can raise ValueError and currently aborts execution;
wrap each datetime.strptime call that sets until and since in a try/except
ValueError block so that on parse failure you fall back to the existing defaults
(for until use now, for since use until - timedelta(days=1)), and optionally
emit a warning/log about the invalid input; update the logic around the until
and since assignments (the blocks that reference options["until"],
options["since"], datetime.strptime, tzinfo=timezone.utc, and the default
calculations) to implement this permissive parsing behavior.
In
`@boost_usage_tracker/management/commands/run_update_created_repos_by_language.py`:
- Around line 77-95: The summary message always uses self.style.SUCCESS in the
self.stdout.write call even when result.get("errors") is truthy; change the
formatting to use a warning (e.g., self.style.WARNING) when result.get("errors")
is present so the printed summary matches the exit code. Locate the
self.stdout.write(...) block that formats requested/processed/missing/... using
result.get(...) and conditionally choose style.SUCCESS when no errors and
style.WARNING when result.get("errors") is truthy, keeping the same message
content, then keep the existing return 1 if result.get("errors") else 0
behavior.
In `@boost_usage_tracker/management/commands/run_update_db.py`:
- Around line 135-142: Replace the integer return flow in handle() with a raised
CommandError: when result.get("errors") is truthy, import CommandError from
django.core.management and raise CommandError with a clear message (e.g. join
result["errors"] or include relevant details) instead of writing to stderr and
returning 1; keep the success path (use formatter(result),
self.stdout.write(self.style.SUCCESS(msg))) and simply return None (no numeric
return). Ensure you reference the same symbols: result, formatter,
self.stderr/self.stdout, and raise CommandError.
In
`@boost_usage_tracker/migrations/0002_add_boost_missing_header_tmp_and_nullable_boost_header.py`:
- Line 67: The index name in the migration's Index declaration is misspelled;
update the Index(... name="bost_missing_tmp_usage_id") to the correct spelling
(e.g., "boost_missing_tmp_usage_id") in the migration file so the database
metadata uses the intended name; locate the Index call in the migration (the
models.Index(fields=["usage"], name="bost_missing_tmp_usage_id") line) and
change only the name string to the corrected identifier.
In `@boost_usage_tracker/models.py`:
- Around line 127-129: The index name in the model Meta indexes array is
misspelled: change the models.Index name value "bost_missing_tmp_usage_id" to
the correct "boost_missing_tmp_usage_id" so the index name is accurate before
migrations are finalized; update the indexes entry where
models.Index(fields=["usage"], name="bost_missing_tmp_usage_id") is declared to
use the corrected string.
In `@boost_usage_tracker/services.py`:
- Around line 235-260: The code currently appends every matched usage to
to_update causing unchanged rows to be bulk_updated and inflating updated_count;
change the logic in the loop that iterates existing_map items (referencing
existing_map, key_to_item, usage, to_update) to compare the current fields
(last_commit_date and excepted_at) against the values from key_to_item and only
append usage to to_update when at least one of those fields actually changes,
then set usage.updated_at for those changed records before calling
BoostUsage.objects.bulk_update and compute updated_count as the length of the
filtered to_update list.
- Around line 184-193: When updating an existing BoostUsage after get_or_create
(the variable usage returned from BoostUsage.objects.get_or_create), clear any
stale exception state by setting usage.excepted_at = None before saving if you
change last_commit_date; include "excepted_at" in the save(update_fields=[...])
call so the null is persisted. Locate the block that checks "if not _ and
last_commit_date and usage.last_commit_date != last_commit_date" and update it
to reset excepted_at and save with update_fields=["last_commit_date",
"updated_at", "excepted_at"] (ensure excepted_at is set to None).
In `@boost_usage_tracker/tests/test_update_created_repos_by_language.py`:
- Around line 14-24: The test currently uses patch.dict("os.environ", {},
clear=False) which is a no-op; change the patch to explicitly set the
REPO_COUNT_LANGUAGES key to an empty string so the code path that requires
languages is exercised. Concretely, update the test to use
patch.dict("os.environ", {"REPO_COUNT_LANGUAGES": ""}, clear=False) (or
clear=True if you prefer to wipe all env vars) so
update_created_repos_by_language is forced to rely on the provided languages_csv
and return the expected error; keep references to
update_created_repos_by_language and REPO_COUNT_LANGUAGES when making the
change.
In `@boost_usage_tracker/update_created_repos_by_language.py`:
- Around line 58-77: Early return payloads in
update_created_repos_by_language.py omit fields present in successful results;
make all early error returns include the full schema (e.g., languages_requested,
start_year, end_year, created, updated, rows_processed, errors) so callers can
rely on a consistent shape. Locate the two early-return blocks around the
_parse_languages_csv call (and the other one at lines ~95-100) and update them
to populate languages_requested (use language_names or source_csv), start_year
and end_year (use existing start_year/end_year variables), plus
created/updated/rows_processed set to 0, and append the error message into
errors, matching the keys used by the normal success return.
In `@boost_usage_tracker/update_git_account.py`:
- Around line 84-90: The loop dereferences owner with owner.get("id") which can
raise AttributeError if owner is None or not a mapping; update the loop in the
records processing (the for rec in records block) to first validate owner (the
owner variable derived from rec or rec.get("owner")) is a dict-like object
(e.g., check owner is not None and isinstance(owner, dict) or hasattr(owner,
"get")) before calling owner.get("id"); if the owner is invalid, emit the
existing logger.warning (or an enhanced message referencing owner) and continue
to the next record so github_account_id_raw is never accessed on a non-object.
- Around line 116-117: The log message "Skipping record with empty username: %s"
is misleading because the account was already created/updated earlier; update
the logger call in the block that checks username == "" to reflect that the
upsert was completed for this owner (e.g., logger.info or logger.debug with text
like "Account created/updated for owner with empty username: %s") and keep the
referenced variables owner and username in the message so it clearly identifies
the record; modify the logger.warning call to logger.info (or logger.debug) and
change the message text accordingly in the same function where owner and
username are used.
In `@boost_usage_tracker/update_repository_from_csv.py`:
- Around line 4-5: The module docstring's CSV column list is missing "forks"
even though the code parses a forks field (see the forks parsing around the
existing parsing logic at lines ~108-109); update the top-of-file docstring CSV
columns line to include "forks" so the documented columns match the actual
parser behavior, i.e., add forks to the comma-separated list of columns in the
module docstring.
In `@config/settings.py`:
- Line 130: The list of workspace slugs contains adjacent string literals that
are implicitly concatenated; add a missing comma between "boost_usage_tracker"
and "cppa_slack_transcript_tracker" in the workspace slug block (the
_WORKSPACE_APP_SLUGS / workspace slug list in config/settings.py) so each slug
is its own entry and both directories are created correctly.
In `@github_activity_tracker/models.py`:
- Around line 58-60: Change the IntegerField definitions for year, all_repos,
and significant_repos to non-negative types and add a DB-level CheckConstraint:
use models.PositiveIntegerField (or IntegerField with MinValueValidator(0)) for
year (keep db_index=True) and models.PositiveIntegerField(default=0) for
all_repos and significant_repos, then add a Meta.constraints entry with
CheckConstraint(expressions=Q(year__gte=0) & Q(all_repos__gte=0) &
Q(significant_repos__gte=0), name='githubactivity_nonnegative_counts') to the
model to enforce the rule at the DB level; apply the same changes to the other
similar fields mentioned (lines 67-72) and generate the corresponding migration.
---
Nitpick comments:
In `@boost_usage_tracker/boost_searcher.py`:
- Around line 248-252: The try/except block that parses commits to set
commit_date currently swallows all errors (except Exception: pass); change it to
catch Exception as e and emit a debug-level log (consistent with other uses in
this module) with the error details (e or use exc_info/stack) so failures to
parse commits/date are recorded; keep commit_date optional and unchanged on
error but do not silently ignore exceptions—update the block around the commits
parsing (where commits is inspected and datetime.fromisoformat(...) is called)
to log the diagnostic information.
In `@boost_usage_tracker/migrations/0001_initial.py`:
- Line 40: The migration defines a DateField named excepted_at which is likely a
typo and should be clarified/renamed (e.g., extracted_at, exempted_at, or
excluded_at) — locate the field definition in the initial migration and in the
corresponding model (search for excepted_at) and decide the intended name; if
it's a typo, update the model field name to the correct one and create a new
migration that renames the field (use a RenameField operation) rather than
editing already-applied historic migrations, or if the project is pre-release
and safe to rewrite, update the initial migration and all references to
excepted_at to the chosen name (e.g., exempted_at) to keep naming consistent.
In `@boost_usage_tracker/repo_searcher.py`:
- Line 133: Remove the leftover commented-out call "# time.sleep(SEARCH_DELAY)"
from repo_searcher.py to eliminate dead code and confusion; locate the commented
symbol time.sleep(SEARCH_DELAY) in the search loop (where SEARCH_DELAY is
referenced) and delete the line, or if a deliberate placeholder is required,
replace it with a clear TODO comment referencing SEARCH_DELAY and the
surrounding search logic.
In `@boost_usage_tracker/tests/test_models.py`:
- Line 156: Several tests unpack tuple returns but leave some elements unused
(e.g., the call to services.get_or_create_missing_header_usage), which triggers
lint RUF059; update those unpackings to use `_` (or `_tmp`, `_created`) for
intentionally unused elements so intent is clear and lint clean. Specifically,
locate calls such as services.get_or_create_missing_header_usage and the other
occurrences referenced (around the noted lines) and change the tuple unpack
pattern from e.g. "usage, tmp, _" or "a, b, c" to use "_" for any value that
isn’t referenced later in the test; keep the variable name for the used
element(s) only and leave the rest as underscores.
In `@github_activity_tracker/services.py`:
- Line 79: Replace the list-concatenation passed to row.save with list
unpacking: instead of using update_fields + ["updated_at"] when calling the save
method on the row object, build the update_fields argument using unpacking so
the call to row.save(update_fields=...) uses the expanded original update_fields
plus "updated_at"; locate the save call on the variable named row
(row.save(...)) in github_activity_tracker/services.py and update the argument
accordingly.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (43)
- .env.example
- boost_usage_tracker/__init__.py
- boost_usage_tracker/admin.py
- boost_usage_tracker/apps.py
- boost_usage_tracker/boost_searcher.py
- boost_usage_tracker/db_from_file.py
- boost_usage_tracker/management/__init__.py
- boost_usage_tracker/management/commands/__init__.py
- boost_usage_tracker/management/commands/run_boost_usage_tracker.py
- boost_usage_tracker/management/commands/run_update_created_repos_by_language.py
- boost_usage_tracker/management/commands/run_update_db.py
- boost_usage_tracker/migrations/0001_initial.py
- boost_usage_tracker/migrations/0002_add_boost_missing_header_tmp_and_nullable_boost_header.py
- boost_usage_tracker/migrations/__init__.py
- boost_usage_tracker/models.py
- boost_usage_tracker/post_process.py
- boost_usage_tracker/repo_searcher.py
- boost_usage_tracker/services.py
- boost_usage_tracker/tests/__init__.py
- boost_usage_tracker/tests/fixtures.py
- boost_usage_tracker/tests/test_db_from_file.py
- boost_usage_tracker/tests/test_models.py
- boost_usage_tracker/tests/test_services.py
- boost_usage_tracker/tests/test_update_created_repos_by_language.py
- boost_usage_tracker/update_boostusage_from_csv.py
- boost_usage_tracker/update_created_repos_by_language.py
- boost_usage_tracker/update_git_account.py
- boost_usage_tracker/update_githubfile_from_csv.py
- boost_usage_tracker/update_repository_from_csv.py
- config/settings.py
- conftest.py
- discord_activity_tracker/migrations/0002_migrate_users_to_discord_profile.py
- docs/Contributing.md
- docs/Schema.md
- docs/Service_API.md
- docs/service_api/README.md
- docs/service_api/boost_usage_tracker.md
- github_activity_tracker/migrations/0002_createdreposbylanguage.py
- github_activity_tracker/models.py
- github_activity_tracker/services.py
- github_activity_tracker/tests/test_models.py
- github_activity_tracker/tests/test_services.py
- workflow/management/commands/run_all_collectors.py
Actionable comments posted: 2
♻️ Duplicate comments (4)
boost_usage_tracker/boost_searcher.py (1)
265-276: ⚠️ Potential issue | 🟠 Major

`_build_boost_include_query` lacks query-length validation, causing potential silent failures.

The function builds a query that can exceed GitHub's 255-character limit when multiple repos are provided but does not validate this. When the query exceeds the limit, the API call in `search_boost_include_files_batch` fails silently: the exception is caught (line 506) and an empty list is returned without indication that repos were skipped due to an invalid query.

Consider implementing query-length enforcement before sending to the API: either split repos into batches within `search_boost_include_files_batch` to ensure query length stays under 255 characters, or explicitly warn when truncation occurs.
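A minimal sketch of the batching approach, assuming the 255-character limit and the base query quoted in the AI prompt below; `build_batched_queries` is a hypothetical helper name, and any single `repo:` filter is assumed to fit within the limit on its own:

```python
# Sketch only: split repo filters across as many code-search queries as needed
# so each stays under the assumed 255-character limit.
MAX_CODE_SEARCH_QUERY_LEN = 255
BASE_QUERY = '"#include <boost/" language:C++'


def build_batched_queries(repo_full_names: list[str]) -> list[str]:
    queries: list[str] = []
    current = BASE_QUERY
    for name in repo_full_names:
        part = f" repo:{name}"
        if len(current) + len(part) > MAX_CODE_SEARCH_QUERY_LEN:
            # Flush the full query and start a fresh one for the remaining repos.
            queries.append(current)
            current = BASE_QUERY + part
        else:
            current += part
    if current != BASE_QUERY:
        queries.append(current)
    return queries
```

Callers would then issue one search request per returned query instead of a single over-long one.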
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/boost_searcher.py` around lines 265 - 276, The _build_boost_include_query can produce queries over GitHub's 255-character limit causing silent failures; update the logic so search_boost_include_files_batch never sends an over-long query: modify either _build_boost_include_query to accept a max_length and return one or multiple safe queries (or add batching in search_boost_include_files_batch using _chunked) by computing base_len = len('"#include <boost/" language:C++ ') and ensuring each appended " repo:NAME" segment keeps query length <255, splitting the repo_full_names into smaller chunks accordingly; also add a warning/log (processLogger or logger used in this module) whenever splitting occurs so callers know repos were batched.

boost_usage_tracker/services.py (1)
190-193: ⚠️ Potential issue | 🟠 Major

Re-detected missing-header usage is not reactivated.

For existing rows, Line 190-Line 193 refreshes `last_commit_date` but does not clear `excepted_at`, so a re-detected usage can remain incorrectly excepted.

Suggested fix
```diff
-    if not _ and last_commit_date and usage.last_commit_date != last_commit_date:
-        usage.last_commit_date = last_commit_date
-        usage.save(update_fields=["last_commit_date", "updated_at"])
+    if not _:
+        changed = False
+        if last_commit_date is not None and usage.last_commit_date != last_commit_date:
+            usage.last_commit_date = last_commit_date
+            changed = True
+        if usage.excepted_at is not None:
+            usage.excepted_at = None
+            changed = True
+        if changed:
+            usage.save(update_fields=["last_commit_date", "excepted_at", "updated_at"])
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/services.py` around lines 190 - 193, When refreshing an existing BoostMissingHeaderTmp usage's last_commit_date in the block that checks `if not _ and last_commit_date and usage.last_commit_date != last_commit_date`, also clear any previous exception by setting `usage.excepted_at = None` and include "excepted_at" in the `usage.save(update_fields=[...])` call so a re-detected missing-header is reactivated; update the `usage` object before saving and ensure `BoostMissingHeaderTmp`'s save updates both "last_commit_date" and "excepted_at".

boost_usage_tracker/update_created_repos_by_language.py (1)
58-65: ⚠️ Potential issue | 🟡 Minor

Keep error-path payload schema consistent with success-path schema.

Line 58 and Line 70 return dicts that omit keys returned elsewhere (`languages_requested`, `languages_processed`, `start_year`, `end_year`, `stars_min`, etc.). Reuse `_result_template(...)` for these early returns to keep a stable contract.

Suggested fix
```diff
     if start_year > end_year:
-        return {
-            "created": 0,
-            "updated": 0,
-            "rows_processed": 0,
-            "errors": [
-                f"Invalid year range: start_year({start_year}) > end_year({end_year})"
-            ],
-        }
+        result = _result_template(
+            start_year=start_year, end_year=end_year, stars_min=stars_min
+        )
+        result["errors"].append(
+            f"Invalid year range: start_year({start_year}) > end_year({end_year})"
+        )
+        return result
@@
     if not language_names:
-        return {
-            "created": 0,
-            "updated": 0,
-            "rows_processed": 0,
-            "errors": [
-                f"No languages provided. Set {LANGUAGES_ENV_KEY} in .env or pass --languages.",
-            ],
-        }
+        result = _result_template(
+            start_year=start_year, end_year=end_year, stars_min=stars_min
+        )
+        result["errors"].append(
+            f"No languages provided. Set {LANGUAGES_ENV_KEY} in .env or pass --languages."
+        )
+        return result
```

Also applies to: 70-77
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_created_repos_by_language.py` around lines 58 - 65, The early-return error payloads in update_created_repos_by_language.py return ad-hoc dicts that omit keys present in successful responses; replace those returns with the canonical template by calling _result_template(...) (passing start_year, end_year, languages_requested, languages_processed, stars_min, etc.) and then updating the returned dict's "created", "updated", "rows_processed" and "errors" fields (e.g., set created/updated/rows_processed to 0 and populate errors list). Do this for the invalid-year return in the function and the other early-return block around lines 70-77 so all branches return the same schema produced by _result_template.

boost_usage_tracker/update_git_account.py (1)
119-120: ⚠️ Potential issue | 🟡 Minor

Warning message is inaccurate for updated rows.

Line 120 logs `"Created record..."` even when the row was updated. Use neutral wording like `"Upserted record with empty username..."`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_git_account.py` around lines 119 - 120, The warning message incorrectly says "Created record..." even when the row was updated; change the log text to neutral wording like "Upserted record with empty username: %s" so it correctly reflects both creates and updates. Locate the logger.warning call that uses the variables username and owner (the line with if username == "": logger.warning(...)) and replace the message string only, keeping the same formatting and parameters (owner) and keeping the existing logger.warning invocation intact.
🧹 Nitpick comments (3)
boost_usage_tracker/update_repository_from_csv.py (1)
31-52: Consider extracting shared parsing helpers.

`_parse_int`, `_parse_bool`, and `_parse_datetime` are duplicated across `update_boostusage_from_csv.py`, `update_githubfile_from_csv.py`, and this file. A shared `csv_helpers.py` module could reduce duplication.
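A minimal sketch of what the shared module could look like, assuming the duplicated helpers share the usual optional-int/bool/ISO-timestamp semantics; the exact signatures in the PR are not reproduced in this review, so treat these as illustrative:

```python
# Hypothetical boost_usage_tracker/csv_helpers.py; semantics are assumed to
# match the three duplicated private helpers.
from datetime import datetime, timezone


def parse_int(value: str | None, default: int = 0) -> int:
    # Empty cells fall back to the default instead of raising.
    try:
        return int(value) if value not in (None, "") else default
    except ValueError:
        return default


def parse_bool(value: str | None) -> bool:
    return str(value).strip().lower() in {"1", "true", "yes"}


def parse_datetime(value: str | None) -> datetime | None:
    # Accepts ISO-8601 strings, including a trailing "Z"; naive values are
    # assumed to be UTC.
    if not value:
        return None
    try:
        dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)
```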
🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_repository_from_csv.py` around lines 31 - 52, Duplicate parsing helpers _parse_int, _parse_bool, and _parse_datetime are repeated across modules; extract them into a new shared module (e.g., csv_helpers.py) exporting these functions (remove or keep leading underscores as desired for public use), replace the local definitions in update_repository_from_csv.py, update_boostusage_from_csv.py, and update_githubfile_from_csv.py with imports from csv_helpers, and run tests/linters to ensure no unresolved references (search for _parse_int, _parse_bool, _parse_datetime to update all call sites).

boost_usage_tracker/boost_searcher.py (1)
243-252: Silent exception swallowing hides commit-date fetch failures.

The inner `try-except-pass` at Lines 251-252 silently discards errors when fetching the commit date via REST. While the file content is still returned, failures here (e.g., rate limits, auth issues) go unnoticed.

🔧 Add debug logging for visibility
```diff
         try:
             commits = client.rest_request(
                 f"/repos/{repo_full_name}/commits",
                 params={"path": file_path, "per_page": 1},
             )
             if isinstance(commits, list) and commits:
                 date_str = commits[0]["commit"]["committer"]["date"]
                 commit_date = datetime.fromisoformat(date_str.replace("Z", "+00:00"))
-        except Exception:
-            pass
+        except Exception as e:
+            logger.debug("Failed to fetch commit date for %s/%s: %s", repo_full_name, file_path, e)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/boost_searcher.py` around lines 243 - 252, The try/except around client.rest_request is swallowing errors silently; change the except block to catch Exception as e and log the failure (including repo_full_name and file_path) instead of passing so commit-date fetch failures are visible; keep the same behavior (don’t re-raise) but emit a debug/warning message via the module logger (or processLogger if used) mentioning client.rest_request, commits, and the exception to aid troubleshooting while preserving commit_date fallback logic.

boost_usage_tracker/update_created_repos_by_language.py (1)
30-36: Reuse one GitHub client per run instead of creating one per query.

Line 31 is on the inner hot path and instantiates a client for every count call. Passing a shared client through the loop will reduce overhead and simplify request-level diagnostics.
Also applies to: 123-124
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_created_repos_by_language.py` around lines 30 - 36, The helper _count_items_from_git currently creates a new GitHub client on each call which is expensive; refactor it to accept a client parameter (e.g., client: GitHubClient) or make it optional and only create one client at the outer scope, then pass that same client into every _count_items_from_git call from the main loop in update_created_repos_by_language (and the other call sites noted around the second use). Update the callers to instantiate get_github_client(use="scraping") once before the loop and pass it into _count_items_from_git (or rely on the provided client param) so the client is reused for all repository count queries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@boost_usage_tracker/db_from_file.py`:
- Around line 155-157: The single-file JSON branch calls
_load_json_records_from_path(path) without protection and can raise
json.JSONDecodeError or OSError; wrap that call in the same try/except pattern
used in directory mode so the function preserves its structured result contract
— catch json.JSONDecodeError and OSError (and any other errors directory mode
handles), log or capture the exception, and return the same error-shaped result
(instead of letting the exception propagate); i.e., replace the direct
assignment to records with a try: records = _load_json_records_from_path(path)
except <same exceptions as directory handling>: return the equivalent error
result used elsewhere in this module.
In `@boost_usage_tracker/update_git_account.py`:
- Around line 84-88: Guard against non-dict records before attempting key
access: inside the loop that iterates over records (for rec in records) check if
rec is a dict (e.g., if not isinstance(rec, dict): continue or log and continue)
before evaluating '"owner" in rec' and before calling rec.get("owner"); this
prevents TypeError when records contains non-dict items and preserves the
existing owner extraction logic using the owner variable.
---
Duplicate comments:
In `@boost_usage_tracker/boost_searcher.py`:
- Around line 265-276: The _build_boost_include_query can produce queries over
GitHub's 255-character limit causing silent failures; update the logic so
search_boost_include_files_batch never sends an over-long query: modify either
_build_boost_include_query to accept a max_length and return one or multiple
safe queries (or add batching in search_boost_include_files_batch using
_chunked) by computing base_len = len('"#include <boost/" language:C++ ') and
ensuring each appended " repo:NAME" segment keeps query length <255, splitting
the repo_full_names into smaller chunks accordingly; also add a warning/log
(processLogger or logger used in this module) whenever splitting occurs so
callers know repos were batched.
In `@boost_usage_tracker/services.py`:
- Around line 190-193: When refreshing an existing BoostMissingHeaderTmp usage's
last_commit_date in the block that checks `if not _ and last_commit_date and
usage.last_commit_date != last_commit_date`, also clear any previous exception
by setting `usage.excepted_at = None` and include "excepted_at" in the
`usage.save(update_fields=[...])` call so a re-detected missing-header is
reactivated; update the `usage` object before saving and ensure
`BoostMissingHeaderTmp`'s save updates both "last_commit_date" and
"excepted_at".
In `@boost_usage_tracker/update_created_repos_by_language.py`:
- Around line 58-65: The early-return error payloads in
update_created_repos_by_language.py return ad-hoc dicts that omit keys present
in successful responses; replace those returns with the canonical template by
calling _result_template(...) (passing start_year, end_year,
languages_requested, languages_processed, stars_min, etc.) and then updating the
returned dict's "created", "updated", "rows_processed" and "errors" fields
(e.g., set created/updated/rows_processed to 0 and populate errors list). Do
this for the invalid-year return in the function and the other early-return
block around lines 70-77 so all branches return the same schema produced by
_result_template.
In `@boost_usage_tracker/update_git_account.py`:
- Around line 119-120: The warning message incorrectly says "Created record..."
even when the row was updated; change the log text to neutral wording like
"Upserted record with empty username: %s" so it correctly reflects both creates
and updates. Locate the logger.warning call that uses the variables username and
owner (the line with if username == "": logger.warning(...)) and replace the
message string only, keeping the same formatting and parameters (owner) and
keeping the existing logger.warning invocation intact.
---
Nitpick comments:
In `@boost_usage_tracker/boost_searcher.py`:
- Around line 243-252: The try/except around client.rest_request is swallowing
errors silently; change the except block to catch Exception as e and log the
failure (including repo_full_name and file_path) instead of passing so
commit-date fetch failures are visible; keep the same behavior (don’t re-raise)
but emit a debug/warning message via the module logger (or processLogger if
used) mentioning client.rest_request, commits, and the exception to aid
troubleshooting while preserving commit_date fallback logic.
In `@boost_usage_tracker/update_created_repos_by_language.py`:
- Around line 30-36: The helper _count_items_from_git currently creates a new
GitHub client on each call which is expensive; refactor it to accept a client
parameter (e.g., client: GitHubClient) or make it optional and only create one
client at the outer scope, then pass that same client into every
_count_items_from_git call from the main loop in
update_created_repos_by_language (and the other call sites noted around the
second use). Update the callers to instantiate get_github_client(use="scraping")
once before the loop and pass it into _count_items_from_git (or rely on the
provided client param) so the client is reused for all repository count queries.
In `@boost_usage_tracker/update_repository_from_csv.py`:
- Around line 31-52: Duplicate parsing helpers _parse_int, _parse_bool, and
_parse_datetime are repeated across modules; extract them into a new shared
module (e.g., csv_helpers.py) exporting these functions (remove or keep leading
underscores as desired for public use), replace the local definitions in
update_repository_from_csv.py, update_boostusage_from_csv.py, and
update_githubfile_from_csv.py with imports from csv_helpers, and run
tests/linters to ensure no unresolved references (search for _parse_int,
_parse_bool, _parse_datetime to update all call sites).
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- boost_usage_tracker/boost_searcher.py
- boost_usage_tracker/db_from_file.py
- boost_usage_tracker/management/commands/run_boost_usage_tracker.py
- boost_usage_tracker/management/commands/run_update_created_repos_by_language.py
- boost_usage_tracker/management/commands/run_update_db.py
- boost_usage_tracker/migrations/0002_add_boost_missing_header_tmp_and_nullable_boost_header.py
- boost_usage_tracker/models.py
- boost_usage_tracker/repo_searcher.py
- boost_usage_tracker/services.py
- boost_usage_tracker/tests/test_services.py
- boost_usage_tracker/tests/test_update_created_repos_by_language.py
- boost_usage_tracker/update_boostusage_from_csv.py
- boost_usage_tracker/update_created_repos_by_language.py
- boost_usage_tracker/update_git_account.py
- boost_usage_tracker/update_repository_from_csv.py
- config/settings.py
- github_activity_tracker/models.py
🚧 Files skipped from review as they are similar to previous changes (2)
- config/settings.py
- boost_usage_tracker/tests/test_services.py
@coderabbitai review
✅ Actions performed

Review triggered.
♻️ Duplicate comments (2)
boost_usage_tracker/boost_searcher.py (1)
265-271: ⚠️ Potential issue | 🟡 Minor

Query length limit (`MAX_CODE_SEARCH_QUERY_LEN`) is defined but not enforced.

`_build_boost_include_query` constructs queries without checking `MAX_CODE_SEARCH_QUERY_LEN`. If the combined `repo:` filters exceed 255 characters, the GitHub API may reject the query. Consider truncating the repo list to fit within the limit and returning remaining repos for subsequent batches.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/boost_searcher.py` around lines 265 - 271, _build_boost_include_query currently concatenates all repo filters into a single query without enforcing MAX_CODE_SEARCH_QUERY_LEN; change it to respect the MAX_CODE_SEARCH_QUERY_LEN constant by computing allowed space after base = '"#include <boost/" language:C++ ' and iterating through parts = [f"repo:{name}"...] appending parts until adding the next part would exceed MAX_CODE_SEARCH_QUERY_LEN, then return the built query and the list of remaining repo names for batching; update the function to use the same return signature (query, remaining_repos) and ensure callers handle remaining_repos for subsequent queries.

boost_usage_tracker/services.py (1)
184-197: ⚠️ Potential issue | 🟠 Major

Missing-header usage re-detection doesn't clear `excepted_at`.

When a missing-header usage is re-detected, Lines 190-192 update `last_commit_date` but don't clear `excepted_at`. This inconsistency with `create_or_update_boost_usage` (Lines 147-151) means re-detected missing-header usages remain incorrectly marked as excepted.

🔧 Proposed fix
```diff
-    if not _ and last_commit_date and usage.last_commit_date != last_commit_date:
-        usage.last_commit_date = last_commit_date
-        usage.save(update_fields=["last_commit_date", "updated_at"])
+    if not _:
+        changed = False
+        if last_commit_date and usage.last_commit_date != last_commit_date:
+            usage.last_commit_date = last_commit_date
+            changed = True
+        if usage.excepted_at is not None:
+            usage.excepted_at = None
+            changed = True
+        if changed:
+            usage.save(update_fields=["last_commit_date", "excepted_at", "updated_at"])
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/services.py` around lines 184 - 197, The re-detection path that updates last_commit_date for an existing BoostUsage (the block using BoostUsage.objects.get_or_create and checking "if not _ and last_commit_date and usage.last_commit_date != last_commit_date") must also clear the excepted_at timestamp so re-detected missing-header usages are no longer marked excepted; modify that branch in services.py to set usage.excepted_at = None and include "excepted_at" in the usage.save(update_fields=[...]) call (in addition to "last_commit_date" and "updated_at") so the change is persisted, mirroring the behavior in create_or_update_boost_usage.
🧹 Nitpick comments (5)
boost_usage_tracker/update_git_account.py (1)
26-31: Consider extracting shared JSON loading utilities to reduce duplication.

These helper functions (`get_github_account_dir`, `_load_json_records_from_path`, `_load_all_json_records`) are nearly identical to those in `boost_usage_tracker/db_from_file.py` (lines 30-63). Consider consolidating them into a shared utility module to improve maintainability.

Also applies to: 33-59
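A minimal sketch of the consolidation, assuming both call sites want list-of-record-dict semantics for either a single `.json` file or a directory of them; `load_json_records` and the module name are illustrative, not existing code:

```python
# Hypothetical shared helper (e.g., boost_usage_tracker/json_io.py); the
# behavior is an assumed common denominator of the two near-identical loaders.
import json
from pathlib import Path


def load_json_records(path: Path) -> list[dict]:
    """Load records from one JSON file, or from every *.json file in a directory."""
    if path.is_dir():
        records: list[dict] = []
        for child in sorted(path.glob("*.json")):
            records.extend(load_json_records(child))
        return records
    data = json.loads(path.read_text(encoding="utf-8"))
    # Accept either a top-level list of records or a single record object.
    return data if isinstance(data, list) else [data]
```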
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_git_account.py` around lines 26 - 31, Duplicate JSON/dir helper logic in get_github_account_dir, _load_json_records_from_path, and _load_all_json_records should be factored into a shared utility module: extract the implementations into a new helper (e.g., json_io or file_utils) that exposes functions like ensure_workspace_subdir(path_name) and load_json_records(path) and then update get_github_account_dir to call the shared ensure_workspace_subdir and replace local _load_json_records_from_path/_load_all_json_records usages with imports from the new module (adjust signatures to accept the same inputs used by get_github_account_dir and any callers).

boost_usage_tracker/boost_searcher.py (1)
251-252: Log commit date fetch failures instead of silently passing.

The `try-except-pass` block silently swallows errors when fetching commit dates. Consider logging at debug level for observability.

♻️ Proposed fix
```diff
         try:
             commits = client.rest_request(
                 f"/repos/{repo_full_name}/commits",
                 params={"path": file_path, "per_page": 1},
             )
             if isinstance(commits, list) and commits:
                 date_str = commits[0]["commit"]["committer"]["date"]
                 commit_date = datetime.fromisoformat(date_str.replace("Z", "+00:00"))
-        except Exception:
-            pass
+        except Exception as e:
+            logger.debug("Failed to fetch commit date for %s/%s: %s", repo_full_name, file_path, e)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/boost_searcher.py` around lines 251 - 252, Replace the silent "except Exception: pass" that swallows errors when fetching commit dates with an explicit debug-level log: catch the exception as e (e.g., "except Exception as e") and call the module or instance logger (e.g., logger.debug or self.logger.debug) with a clear message including identifying context (commit id/repo/file) and the exception (e.g., "Failed to fetch commit date for %s: %s", <identifier>, e) so failures are observable while preserving existing flow.

boost_usage_tracker/services.py (1)
116-116: Consider using unpacking for list construction.

Per Ruff RUF005, `[*update_fields, "updated_at"]` is preferred over concatenation.

```diff
 if update_fields:
-    ext_repo.save(update_fields=update_fields + ["updated_at"])
+    ext_repo.save(update_fields=[*update_fields, "updated_at"])
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/services.py` at line 116, Replace the list concatenation passed to ext_repo.save with list unpacking per Ruff RUF005: when calling ext_repo.save(...), construct the fields argument using [*update_fields, "updated_at"] instead of update_fields + ["updated_at"]; update the call site (ext_repo.save) where update_fields is used to pass the unpacked list.

boost_usage_tracker/update_created_repos_by_language.py (1)
30-36: Client is instantiated on every API call.

`_count_items_from_git` creates a new `GitHubAPIClient` on each invocation. Since the main loop calls this twice per language/year combination (Lines 123-124), this creates many redundant client instances.

♻️ Proposed refactor
```diff
-def _count_items_from_git(query: str) -> int:
-    client = get_github_client(use="scraping")
+def _count_items_from_git(client, query: str) -> int:
     data = client.rest_request(
         "/search/repositories",
         params={"q": query, "per_page": 1},
     )
     return int(data.get("total_count", 0))
```

Then in the main function, create the client once:
```diff
+    client = get_github_client(use="scraping")
     for language_name in ordered_language_names:
         ...
         for year in range(start_year, end_year + 1):
             ...
             try:
-                all_repos = _count_items_from_git(q_all)
-                significant_repos = _count_items_from_git(q_sig)
+                all_repos = _count_items_from_git(client, q_all)
+                significant_repos = _count_items_from_git(client, q_sig)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/update_created_repos_by_language.py` around lines 30 - 36, _count_items_from_git currently calls get_github_client inside the function, causing a new GitHubAPIClient to be created on every invocation; change this by creating the client once and reusing it: modify _count_items_from_git to accept a client parameter (e.g., client: GitHubAPIClient) and remove the internal get_github_client call, then in the main loop that iterates languages/years (the caller that invokes _count_items_from_git twice per combo) instantiate client = get_github_client(use="scraping") once and pass that client into both calls; alternatively, create a module-level reusable client via get_github_client and have _count_items_from_git use that, ensuring only one client is created for all queries.

boost_usage_tracker/management/commands/run_boost_usage_tracker.py (1)
108-110: Use `logger.exception` for rate limit/connection errors.

`logger.error` at Line 109 loses the stack trace. Using `logger.exception` provides better debugging context when these errors occur.

```diff
 except (ConnectionException, RateLimitException) as e:
-    logger.error("Rate limit / connection error during batch search: %s", e)
+    logger.exception("Rate limit / connection error during batch search")
     raise
```

The same applies to Lines 153-155.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/management/commands/run_boost_usage_tracker.py` around lines 108 - 110, Replace the logger.error calls in the exception handlers that catch ConnectionException and RateLimitException with logger.exception so the stack trace is preserved; specifically, in the except block matching "except (ConnectionException, RateLimitException) as e" (the block that currently calls logger.error("Rate limit / connection error during batch search: %s", e) and re-raises) and the similar except block later around lines 153-155, change the call to logger.exception with the same message and continue re-raising the exception.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@boost_usage_tracker/boost_searcher.py`:
- Around line 265-271: _build_boost_include_query currently concatenates all
repo filters into a single query without enforcing MAX_CODE_SEARCH_QUERY_LEN;
change it to respect the MAX_CODE_SEARCH_QUERY_LEN constant by computing allowed
space after base = '"#include <boost/" language:C++ ' and iterating through
parts = [f"repo:{name}"...] appending parts until adding the next part would
exceed MAX_CODE_SEARCH_QUERY_LEN, then return the built query and the list of
remaining repo names for batching; update the function to use the same return
signature (query, remaining_repos) and ensure callers handle remaining_repos for
subsequent queries.
In `@boost_usage_tracker/services.py`:
- Around line 184-197: The re-detection path that updates last_commit_date for
an existing BoostUsage (the block using BoostUsage.objects.get_or_create and
checking "if not _ and last_commit_date and usage.last_commit_date !=
last_commit_date") must also clear the excepted_at timestamp so re-detected
missing-header usages are no longer marked excepted; modify that branch in
services.py to set usage.excepted_at = None and include "excepted_at" in the
usage.save(update_fields=[...]) call (in addition to "last_commit_date" and
"updated_at") so the change is persisted, mirroring the behavior in
create_or_update_boost_usage.
---
Nitpick comments:
In `@boost_usage_tracker/boost_searcher.py`:
- Around line 251-252: Replace the silent "except Exception: pass" that swallows
errors when fetching commit dates with an explicit debug-level log: catch the
exception as e (e.g., "except Exception as e") and call the module or instance
logger (e.g., logger.debug or self.logger.debug) with a clear message including
identifying context (commit id/repo/file) and the exception (e.g., "Failed to
fetch commit date for %s: %s", <identifier>, e) so failures are observable while
preserving existing flow.
In `@boost_usage_tracker/management/commands/run_boost_usage_tracker.py`:
- Around line 108-110: Replace the logger.error calls in the exception handlers
that catch ConnectionException and RateLimitException with logger.exception so
the stack trace is preserved; specifically, in the except block matching "except
(ConnectionException, RateLimitException) as e" (the block that currently calls
logger.error("Rate limit / connection error during batch search: %s", e) and
re-raises) and the similar except block later around lines 153-155, change the
call to logger.exception with the same message and continue re-raising the
exception.
In `@boost_usage_tracker/services.py`:
- Line 116: Replace the list concatenation passed to ext_repo.save with list
unpacking per Ruff RUF005: when calling ext_repo.save(...), construct the fields
argument using [*update_fields, "updated_at"] instead of update_fields +
["updated_at"]; update the call site (ext_repo.save) where update_fields is used
to pass the unpacked list.
In `@boost_usage_tracker/update_created_repos_by_language.py`:
- Around line 30-36: _count_items_from_git currently calls get_github_client
inside the function, causing a new GitHubAPIClient to be created on every
invocation; change this by creating the client once and reusing it: modify
_count_items_from_git to accept a client parameter (e.g., client:
GitHubAPIClient) and remove the internal get_github_client call, then in the
main loop that iterates languages/years (the caller that invokes
_count_items_from_git twice per combo) instantiate client =
get_github_client(use="scraping") once and pass that client into both calls;
alternatively, create a module-level reusable client via get_github_client and
have _count_items_from_git use that, ensuring only one client is created for all
queries. A sketch of the client-reuse version appears after this list.
In `@boost_usage_tracker/update_git_account.py`:
- Around line 26-31: Duplicate JSON/dir helper logic in get_github_account_dir,
_load_json_records_from_path, and _load_all_json_records should be factored into
a shared utility module: extract the implementations into a new helper (e.g.,
json_io or file_utils) that exposes functions like
ensure_workspace_subdir(path_name) and load_json_records(path) and then update
get_github_account_dir to call the shared ensure_workspace_subdir and replace
local _load_json_records_from_path/_load_all_json_records usages with imports
from the new module (adjust signatures to accept the same inputs used by
get_github_account_dir and any callers). A sketch of such a shared module follows this list.
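
As rough illustrations of the nitpick items above, first the debug-logged best-effort fetch from boost_searcher.py; `_commit_date_or_none` and the client method it calls are hypothetical stand-ins for the real code.

```python
import logging

logger = logging.getLogger(__name__)


def _commit_date_or_none(client, repo_name: str, path: str):
    """Best-effort commit-date lookup that logs failures instead of hiding them."""
    try:
        return client.get_last_commit_date(repo_name, path)  # hypothetical call
    except Exception as e:  # noqa: BLE001 - deliberately broad, best-effort
        logger.debug("Failed to fetch commit date for %s:%s: %s", repo_name, path, e)
        return None
```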
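Next, the client-reuse refactor for update_created_repos_by_language.py, assuming `get_github_client(use="scraping")` (the PR's factory) returns a reusable client; the `search_repository_count` method name and the query strings are illustrative.

```python
def _count_items_from_git(client, query: str) -> int:
    # The client is injected; no per-call get_github_client() construction.
    return client.search_repository_count(query)  # hypothetical method name


def update_created_repos_by_language(languages, years):
    client = get_github_client(use="scraping")  # one client for every query
    for language in languages:
        for year in years:
            base_q = f"language:{language} created:{year}-01-01..{year}-12-31"
            total = _count_items_from_git(client, base_q)
            starred = _count_items_from_git(client, base_q + " stars:>10")
            # ... persist total/starred for (language, year) ...
```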
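Finally, one possible shape for the shared helper module suggested in the update_git_account.py item; the module name `file_utils` and these signatures are assumptions drawn from the prompt's suggestions.

```python
# file_utils.py - candidate shared home for the duplicated JSON/dir helpers
import json
from pathlib import Path


def ensure_workspace_subdir(root: Path, path_name: str) -> Path:
    """Create (if needed) and return a subdirectory under the workspace root."""
    subdir = root / path_name
    subdir.mkdir(parents=True, exist_ok=True)
    return subdir


def load_json_records(path: Path) -> list[dict]:
    """Load a list of JSON records from one file, tolerating a missing file."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        data = json.load(fh)
    return data if isinstance(data, list) else [data]


def load_all_json_records(directory: Path) -> list[dict]:
    """Concatenate records from every *.json file in a directory."""
    records: list[dict] = []
    for file in sorted(directory.glob("*.json")):
        records.extend(load_json_records(file))
    return records
```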
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
- boost_usage_tracker/boost_searcher.py
- boost_usage_tracker/db_from_file.py
- boost_usage_tracker/management/commands/run_boost_usage_tracker.py
- boost_usage_tracker/management/commands/run_update_created_repos_by_language.py
- boost_usage_tracker/management/commands/run_update_db.py
- boost_usage_tracker/migrations/0002_add_boost_missing_header_tmp_and_nullable_boost_header.py
- boost_usage_tracker/models.py
- boost_usage_tracker/repo_searcher.py
- boost_usage_tracker/services.py
- boost_usage_tracker/tests/test_services.py
- boost_usage_tracker/tests/test_update_created_repos_by_language.py
- boost_usage_tracker/update_boostusage_from_csv.py
- boost_usage_tracker/update_created_repos_by_language.py
- boost_usage_tracker/update_git_account.py
- boost_usage_tracker/update_repository_from_csv.py
- config/settings.py
- github_activity_tracker/models.py
✅ Files skipped from review due to trivial changes (1)
- boost_usage_tracker/tests/test_services.py
🚧 Files skipped from review as they are similar to previous changes (2)
- config/settings.py
- boost_usage_tracker/update_repository_from_csv.py
@coderabbitai resume
✅ Actions performed

Reviews resumed.
🧹 Nitpick comments (3)
boost_usage_tracker/tests/test_update_created_repos_by_language.py (1)
32-35: Consider prefixing the unused `client` parameter with an underscore to satisfy the linter.

The static analysis tool flags unused `client` arguments. Using `_client` or `_` makes the intent explicit.

🔧 Proposed fix
```diff
-    def fake_count(client, query: str) -> int:
+    def fake_count(_client, query: str) -> int:
         if "stars:>10" in query:
             return 12
         return 120
```

And for the lambda on line 69:

```diff
-    side_effect=lambda client, q: 130 if "stars:>10" not in q else 13,
+    side_effect=lambda _client, q: 130 if "stars:>10" not in q else 13,
```

Also applies to: 69-69
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@boost_usage_tracker/tests/test_update_created_repos_by_language.py` around lines 32-35: rename the unused function parameter and lambda parameter to indicate intentionally unused variables: change fake_count's signature from def fake_count(client, query: str) to def fake_count(_client, query: str) (or use _), and update the anonymous lambda used later (the lambda accepting a client) to use _client or _ as its parameter name; this satisfies the linter while preserving behavior.

boost_usage_tracker/services.py (1)
228-234: Consider filtering existing usages by the keys being processed.

Lines 228-234 load all `BoostUsage` records for the repo with a non-null `boost_header`, but only a subset matching `key_to_item` keys is processed. For repositories with many historical usages, this could load significantly more data than needed.

♻️ Proposed optimization
```diff
     # Build map (boost_header_id, file_path_id) -> usage for existing rows
+    # Filter to only the keys we care about
+    bh_ids = {k[0] for k in key_to_item.keys()}
+    fp_ids = {k[1] for k in key_to_item.keys()}
     existing_map = {
         (u.boost_header_id, u.file_path_id): u
         for u in BoostUsage.objects.filter(
             repo=repo,
             boost_header__isnull=False,
+            boost_header_id__in=bh_ids,
+            file_path_id__in=fp_ids,
         ).select_related("boost_header", "file_path")
     }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@boost_usage_tracker/services.py` around lines 228-234: the current creation of existing_map loads all BoostUsage for the repo with a non-null boost_header; restrict the query to only the (boost_header_id, file_path_id) pairs you are about to process (the keys in key_to_item) to avoid loading unrelated historical usages. Build the set/list of key pairs from key_to_item and use that to filter the BoostUsage queryset (e.g., construct a combined Q for each (boost_header_id, file_path_id) pair or an equivalent filter) while keeping select_related("boost_header", "file_path"), then build existing_map from that filtered queryset.

boost_usage_tracker/management/commands/run_boost_usage_tracker.py (1)
285-292: Consider updating `updated_at` when bulk-updating star counts.

The bulk update for changed star counts only updates the `stars` field. If `GitHubRepository` has an `updated_at` field that should reflect modification time, it won't be updated here.

♻️ Proposed fix
```diff
 if stars_to_update and not dry_run:
     repos_to_refresh = list(
         GitHubRepository.objects.filter(pk__in=stars_to_update.keys())
     )
+    from django.utils import timezone
+    now = timezone.now()
     for repo_obj in repos_to_refresh:
         repo_obj.stars = stars_to_update[repo_obj.pk]
-    GitHubRepository.objects.bulk_update(repos_to_refresh, ["stars"])
+        repo_obj.updated_at = now
+    GitHubRepository.objects.bulk_update(repos_to_refresh, ["stars", "updated_at"])
     logger.info("Bulk-updated stars for %d repos", len(repos_to_refresh))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@boost_usage_tracker/management/commands/run_boost_usage_tracker.py` around lines 285 - 292, The bulk update only writes the stars field so GitHubRepository.updated_at isn't changed; when preparing repos_to_refresh (from stars_to_update and GitHubRepository.objects.filter(...)) set each repo_obj.updated_at = timezone.now() (or the appropriate timestamp) and include "updated_at" in the fields list passed to GitHubRepository.objects.bulk_update(repos_to_refresh, ["stars", "updated_at"]) so the modification time is persisted; ensure you import django.utils.timezone if needed and only set updated_at when not a dry_run.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@boost_usage_tracker/management/commands/run_boost_usage_tracker.py`:
- Around line 285-292: The bulk update only writes the stars field so
GitHubRepository.updated_at isn't changed; when preparing repos_to_refresh (from
stars_to_update and GitHubRepository.objects.filter(...)) set each
repo_obj.updated_at = timezone.now() (or the appropriate timestamp) and include
"updated_at" in the fields list passed to
GitHubRepository.objects.bulk_update(repos_to_refresh, ["stars", "updated_at"])
so the modification time is persisted; ensure you import django.utils.timezone
if needed and only set updated_at when not a dry_run.
In `@boost_usage_tracker/services.py`:
- Around line 228-234: The current creation of existing_map in
boost_usage_tracker/services.py loads all BoostUsage for the repo with a
non-null boost_header; restrict the query to only the (boost_header_id,
file_path_id) pairs you are about to process (the keys in key_to_item) to avoid
loading unrelated historical usages. Build the set/list of key pairs from
key_to_item and use that to filter the BoostUsage queryset (e.g., construct a
combined Q for each (boost_header_id, file_path_id) pair or an equivalent
filter) while keeping select_related("boost_header", "file_path"), then build
existing_map from that filtered queryset.
In `@boost_usage_tracker/tests/test_update_created_repos_by_language.py`:
- Around line 32-35: Rename the unused function parameter and lambda parameter
to indicate intentionally unused variables: change fake_count's signature from
def fake_count(client, query: str) to def fake_count(_client, query: str) (or
use _), and update the anonymous lambda used later (the lambda accepting a
client) to use _client or _ as its parameter name; this satisfies the linter
while preserving behavior.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- boost_usage_tracker/boost_searcher.py
- boost_usage_tracker/management/commands/run_boost_usage_tracker.py
- boost_usage_tracker/services.py
- boost_usage_tracker/tests/test_update_created_repos_by_language.py
- boost_usage_tracker/update_created_repos_by_language.py
Summary by CodeRabbit
- New Features
- Admin
- Models / Migrations
- Services
- Tests
- Documentation