Feature: fetching full details of prs and issues - #77#78
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
github_activity_tracker/sync/raw_source.py (1)
61-64: Preserve existing detail fields when incoming payload is partial.Current logic replaces
issue_info/pr_infowholesale fromnew; partial updates can drop previously stored detail fields. Consider merging existing detail with incoming detail.♻️ Suggested merge behavior
def _merge_issue_json(existing: dict, new: dict) -> dict: @@ - new_detail = new.get("issue_info") - if new_detail is None: - new_detail = {k: v for k, v in new.items() if k != "comments"} + incoming_detail = new.get("issue_info") + if incoming_detail is None: + incoming_detail = {k: v for k, v in new.items() if k != "comments"} + existing_detail = existing.get("issue_info") + if not isinstance(existing_detail, dict): + existing_detail = {k: v for k, v in existing.items() if k != "comments"} + if not isinstance(incoming_detail, dict): + incoming_detail = {} + new_detail = {**existing_detail, **incoming_detail} @@ def _merge_pr_json(existing: dict, new: dict) -> dict: @@ - new_detail = new.get("pr_info") - if new_detail is None: - new_detail = {k: v for k, v in new.items() if k not in ("comments", "reviews")} + incoming_detail = new.get("pr_info") + if incoming_detail is None: + incoming_detail = { + k: v for k, v in new.items() if k not in ("comments", "reviews") + } + existing_detail = existing.get("pr_info") + if not isinstance(existing_detail, dict): + existing_detail = { + k: v for k, v in existing.items() if k not in ("comments", "reviews") + } + if not isinstance(incoming_detail, dict): + incoming_detail = {} + new_detail = {**existing_detail, **incoming_detail}Also applies to: 76-79
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@github_activity_tracker/sync/raw_source.py` around lines 61 - 64, The code replaces detail objects (issue_info/pr_info) wholesale, losing prior fields on partial updates; change the logic that sets new_detail (and the analogous pr_info block) to merge incoming detail onto the existing detail: read existing_detail = existing.get("issue_info") (or "pr_info"), create merged = {**existing_detail, **incoming_detail} so undefined incoming keys keep existing values, and then use merged for further processing while preserving existing comments handling (existing.get("comments") or []). Update both the block that sets new_detail and the similar block at the pr_info lines to perform this merge using the variables new_detail and existing.github_activity_tracker/sync/utils.py (1)
31-35: Add type guards before flattening nested payloads.
dict(data["issue_info"])/dict(data["pr_info"])can raise on malformed persisted input. A smallisinstance(..., dict)guard makes this path safer.🛡️ Suggested hardening
def normalize_issue_json(data: dict[str, Any]) -> dict[str, Any]: @@ - if "issue_info" in data: - out = dict(data["issue_info"]) + issue_info = data.get("issue_info") + if isinstance(issue_info, dict): + out = dict(issue_info) out["comments"] = data.get("comments", out.get("comments", [])) return out return data @@ def normalize_pr_json(data: dict[str, Any]) -> dict[str, Any]: @@ - if "pr_info" in data: - out = dict(data["pr_info"]) + pr_info = data.get("pr_info") + if isinstance(pr_info, dict): + out = dict(pr_info) out["comments"] = data.get("comments", out.get("comments", [])) out["reviews"] = data.get("reviews", out.get("reviews", [])) return out return dataAlso applies to: 43-48
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@github_activity_tracker/sync/utils.py` around lines 31 - 35, Check that nested payloads are dicts before converting them: when handling data with "issue_info" or "pr_info" keys, use isinstance(data.get("issue_info"), dict) / isinstance(data.get("pr_info"), dict) to guard the dict(...) conversion and only flatten into out = dict(...) if the check passes; otherwise skip flattening and return data (or return a safe default), and still normalize comments via out["comments"] = data.get("comments", out.get("comments", [])) only when out is a dict. Update the branches that create out (the blocks referencing "issue_info", "pr_info", and the out variable) to perform these type guards.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@github_activity_tracker/sync/raw_source.py`:
- Around line 61-64: The code replaces detail objects (issue_info/pr_info)
wholesale, losing prior fields on partial updates; change the logic that sets
new_detail (and the analogous pr_info block) to merge incoming detail onto the
existing detail: read existing_detail = existing.get("issue_info") (or
"pr_info"), create merged = {**existing_detail, **incoming_detail} so undefined
incoming keys keep existing values, and then use merged for further processing
while preserving existing comments handling (existing.get("comments") or []).
Update both the block that sets new_detail and the similar block at the pr_info
lines to perform this merge using the variables new_detail and existing.
In `@github_activity_tracker/sync/utils.py`:
- Around line 31-35: Check that nested payloads are dicts before converting
them: when handling data with "issue_info" or "pr_info" keys, use
isinstance(data.get("issue_info"), dict) / isinstance(data.get("pr_info"), dict)
to guard the dict(...) conversion and only flatten into out = dict(...) if the
check passes; otherwise skip flattening and return data (or return a safe
default), and still normalize comments via out["comments"] =
data.get("comments", out.get("comments", [])) only when out is a dict. Update
the branches that create out (the blocks referencing "issue_info", "pr_info",
and the out variable) to perform these type guards.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
github_activity_tracker/fetcher.pygithub_activity_tracker/sync/issues.pygithub_activity_tracker/sync/pull_requests.pygithub_activity_tracker/sync/raw_source.pygithub_activity_tracker/sync/utils.pygithub_activity_tracker/tests/test_fetcher.pygithub_activity_tracker/tests/test_sync_raw_source.py
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@github_activity_tracker/fetcher.py`:
- Around line 269-276: The try/except around the detailed fetch using
client.rest_request (the block that assigns full_issue to issue) is catching all
Exception and swallowing transport errors; change the except to only catch
requests.exceptions.HTTPError (i.e. except requests.exceptions.HTTPError as e)
so RateLimitException and ConnectionException from client.rest_request can
propagate to the sync-layer handlers, and apply the same change to the analogous
try/except at the other site (the second block around line 435 that currently
uses except Exception). Add the requests import if missing and keep the existing
logger.debug call for HTTPError.
In `@github_activity_tracker/sync/utils.py`:
- Around line 31-35: The normalization currently copies nested "issue_info" (and
the similar block at the other location) but leaves "comments" and "reviews" as
null if present, which can break downstream "for ... in ..." loops; update the
logic in the block handling data["issue_info"] (and the similar branch at lines
43-48) to coerce comments and reviews into lists by replacing None/null with an
empty list (e.g., set out["comments"] = data.get("comments", out.get("comments",
[])) and ensure that value is converted to a list when it's None, and do the
same for out["reviews"]) so iteration is always safe.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
github_activity_tracker/fetcher.pygithub_activity_tracker/sync/issues.pygithub_activity_tracker/sync/pull_requests.pygithub_activity_tracker/sync/raw_source.pygithub_activity_tracker/sync/utils.pygithub_activity_tracker/tests/test_fetcher.pygithub_activity_tracker/tests/test_sync_raw_source.py
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
github_activity_tracker/fetcher.py (1)
421-423: Inconsistent exception handling for date parsing.PR date parsing catches broad
Exceptionwhile the equivalent issue date parsing at lines 245-247 catches only(ValueError, TypeError). Consider aligning for consistency:♻️ Suggested alignment
- except Exception as e: + except (ValueError, TypeError) as e: logger.debug("Failed to parse PR date '%s': %s", updated_str, e) continue🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@github_activity_tracker/fetcher.py` around lines 421 - 423, The PR date parsing currently uses a broad except Exception (in the block that logs "Failed to parse PR date '%s': %s" with variables updated_str and e) which is inconsistent with the issue date parsing that only catches (ValueError, TypeError); change the PR date parsing except clause to except (ValueError, TypeError) to match behavior, keep the existing logger.debug call and continue behavior, and ensure you reference the same variables (updated_str and e) in the log message.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@github_activity_tracker/sync/utils.py`:
- Around line 26-36: The flat-path of normalize_issue_json currently returns the
input dict unchanged, so a flat payload with "comments": null will propagate
null and break downstream iteration; update normalize_issue_json so both
branches return a dict copy with "comments" coerced to a list: in the
no-"issue_info" branch clone data (e.g., out = dict(data)), read raw_comments =
out.get("comments", []) and set out["comments"] = raw_comments if
isinstance(raw_comments, list) else [], then return out (leave the existing
"issue_info" branch behavior unchanged).
- Around line 39-51: normalize_pr_json currently returns the input dict
unchanged when "pr_info" is absent, which means flat inputs with null/None
comments or reviews are not coerced to lists; update normalize_pr_json to ensure
the returned dict always has "comments" and "reviews" keys containing lists
(coerce None or non-list values to []), mirroring the existing branch that
handles "pr_info" (use the same logic for raw_comments/raw_reviews and
out["comments"]/out["reviews"] when returning the original data).
---
Nitpick comments:
In `@github_activity_tracker/fetcher.py`:
- Around line 421-423: The PR date parsing currently uses a broad except
Exception (in the block that logs "Failed to parse PR date '%s': %s" with
variables updated_str and e) which is inconsistent with the issue date parsing
that only catches (ValueError, TypeError); change the PR date parsing except
clause to except (ValueError, TypeError) to match behavior, keep the existing
logger.debug call and continue behavior, and ensure you reference the same
variables (updated_str and e) in the log message.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
github_activity_tracker/fetcher.pygithub_activity_tracker/sync/issues.pygithub_activity_tracker/sync/pull_requests.pygithub_activity_tracker/sync/raw_source.pygithub_activity_tracker/sync/utils.pygithub_activity_tracker/tests/test_fetcher.pygithub_activity_tracker/tests/test_sync_raw_source.py
Summary by CodeRabbit
Release Notes
Bug Fixes
Refactor