Feature: fetching full details of prs and issues - #77 by snowfox1003 · Pull Request #78 · cppalliance/boost-data-collector

snowfox1003 · 2026-02-27T21:44:59Z

Summary by CodeRabbit

Release Notes

Bug Fixes
- Improved error handling during GitHub data fetching and processing for issues and pull requests.
Refactor
- Reorganized internal data handling to enhance robustness and maintainability.
- Enhanced API data retrieval to capture complete details for issues and pull requests.

coderabbitai · 2026-02-27T21:45:11Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

✅ Review completed - (🔄 Check again to review again)

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (2)

github_activity_tracker/sync/raw_source.py (1)

61-64: Preserve existing detail fields when incoming payload is partial.

Current logic replaces issue_info/pr_info wholesale from new; partial updates can drop previously stored detail fields. Consider merging existing detail with incoming detail.

♻️ Suggested merge behavior

 def _merge_issue_json(existing: dict, new: dict) -> dict:
@@
-    new_detail = new.get("issue_info")
-    if new_detail is None:
-        new_detail = {k: v for k, v in new.items() if k != "comments"}
+    incoming_detail = new.get("issue_info")
+    if incoming_detail is None:
+        incoming_detail = {k: v for k, v in new.items() if k != "comments"}
+    existing_detail = existing.get("issue_info")
+    if not isinstance(existing_detail, dict):
+        existing_detail = {k: v for k, v in existing.items() if k != "comments"}
+    if not isinstance(incoming_detail, dict):
+        incoming_detail = {}
+    new_detail = {**existing_detail, **incoming_detail}
@@
 def _merge_pr_json(existing: dict, new: dict) -> dict:
@@
-    new_detail = new.get("pr_info")
-    if new_detail is None:
-        new_detail = {k: v for k, v in new.items() if k not in ("comments", "reviews")}
+    incoming_detail = new.get("pr_info")
+    if incoming_detail is None:
+        incoming_detail = {
+            k: v for k, v in new.items() if k not in ("comments", "reviews")
+        }
+    existing_detail = existing.get("pr_info")
+    if not isinstance(existing_detail, dict):
+        existing_detail = {
+            k: v for k, v in existing.items() if k not in ("comments", "reviews")
+        }
+    if not isinstance(incoming_detail, dict):
+        incoming_detail = {}
+    new_detail = {**existing_detail, **incoming_detail}

Also applies to: 76-79

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@github_activity_tracker/sync/raw_source.py` around lines 61 - 64, The code
replaces detail objects (issue_info/pr_info) wholesale, losing prior fields on
partial updates; change the logic that sets new_detail (and the analogous
pr_info block) to merge incoming detail onto the existing detail: read
existing_detail = existing.get("issue_info") (or "pr_info"), create merged =
{**existing_detail, **incoming_detail} so undefined incoming keys keep existing
values, and then use merged for further processing while preserving existing
comments handling (existing.get("comments") or []). Update both the block that
sets new_detail and the similar block at the pr_info lines to perform this merge
using the variables new_detail and existing.

github_activity_tracker/sync/utils.py (1)

31-35: Add type guards before flattening nested payloads.

dict(data["issue_info"]) / dict(data["pr_info"]) can raise on malformed persisted input. A small isinstance(..., dict) guard makes this path safer.

🛡️ Suggested hardening

 def normalize_issue_json(data: dict[str, Any]) -> dict[str, Any]:
@@
-    if "issue_info" in data:
-        out = dict(data["issue_info"])
+    issue_info = data.get("issue_info")
+    if isinstance(issue_info, dict):
+        out = dict(issue_info)
         out["comments"] = data.get("comments", out.get("comments", []))
         return out
     return data
@@
 def normalize_pr_json(data: dict[str, Any]) -> dict[str, Any]:
@@
-    if "pr_info" in data:
-        out = dict(data["pr_info"])
+    pr_info = data.get("pr_info")
+    if isinstance(pr_info, dict):
+        out = dict(pr_info)
         out["comments"] = data.get("comments", out.get("comments", []))
         out["reviews"] = data.get("reviews", out.get("reviews", []))
         return out
     return data

Also applies to: 43-48

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@github_activity_tracker/sync/utils.py` around lines 31 - 35, Check that
nested payloads are dicts before converting them: when handling data with
"issue_info" or "pr_info" keys, use isinstance(data.get("issue_info"), dict) /
isinstance(data.get("pr_info"), dict) to guard the dict(...) conversion and only
flatten into out = dict(...) if the check passes; otherwise skip flattening and
return data (or return a safe default), and still normalize comments via
out["comments"] = data.get("comments", out.get("comments", [])) only when out is
a dict. Update the branches that create out (the blocks referencing
"issue_info", "pr_info", and the out variable) to perform these type guards.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@github_activity_tracker/sync/raw_source.py`:
- Around line 61-64: The code replaces detail objects (issue_info/pr_info)
wholesale, losing prior fields on partial updates; change the logic that sets
new_detail (and the analogous pr_info block) to merge incoming detail onto the
existing detail: read existing_detail = existing.get("issue_info") (or
"pr_info"), create merged = {**existing_detail, **incoming_detail} so undefined
incoming keys keep existing values, and then use merged for further processing
while preserving existing comments handling (existing.get("comments") or []).
Update both the block that sets new_detail and the similar block at the pr_info
lines to perform this merge using the variables new_detail and existing.

In `@github_activity_tracker/sync/utils.py`:
- Around line 31-35: Check that nested payloads are dicts before converting
them: when handling data with "issue_info" or "pr_info" keys, use
isinstance(data.get("issue_info"), dict) / isinstance(data.get("pr_info"), dict)
to guard the dict(...) conversion and only flatten into out = dict(...) if the
check passes; otherwise skip flattening and return data (or return a safe
default), and still normalize comments via out["comments"] =
data.get("comments", out.get("comments", [])) only when out is a dict. Update
the branches that create out (the blocks referencing "issue_info", "pr_info",
and the out variable) to perform these type guards.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7323ef8 and 8bdf1d1.

📒 Files selected for processing (7)

github_activity_tracker/fetcher.py
github_activity_tracker/sync/issues.py
github_activity_tracker/sync/pull_requests.py
github_activity_tracker/sync/raw_source.py
github_activity_tracker/sync/utils.py
github_activity_tracker/tests/test_fetcher.py
github_activity_tracker/tests/test_sync_raw_source.py

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@github_activity_tracker/fetcher.py`:
- Around line 269-276: The try/except around the detailed fetch using
client.rest_request (the block that assigns full_issue to issue) is catching all
Exception and swallowing transport errors; change the except to only catch
requests.exceptions.HTTPError (i.e. except requests.exceptions.HTTPError as e)
so RateLimitException and ConnectionException from client.rest_request can
propagate to the sync-layer handlers, and apply the same change to the analogous
try/except at the other site (the second block around line 435 that currently
uses except Exception). Add the requests import if missing and keep the existing
logger.debug call for HTTPError.

In `@github_activity_tracker/sync/utils.py`:
- Around line 31-35: The normalization currently copies nested "issue_info" (and
the similar block at the other location) but leaves "comments" and "reviews" as
null if present, which can break downstream "for ... in ..." loops; update the
logic in the block handling data["issue_info"] (and the similar branch at lines
43-48) to coerce comments and reviews into lists by replacing None/null with an
empty list (e.g., set out["comments"] = data.get("comments", out.get("comments",
[])) and ensure that value is converted to a list when it's None, and do the
same for out["reviews"]) so iteration is always safe.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7323ef8 and ca8ec27.

📒 Files selected for processing (7)

github_activity_tracker/fetcher.py
github_activity_tracker/sync/issues.py
github_activity_tracker/sync/pull_requests.py
github_activity_tracker/sync/raw_source.py
github_activity_tracker/sync/utils.py
github_activity_tracker/tests/test_fetcher.py
github_activity_tracker/tests/test_sync_raw_source.py

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

github_activity_tracker/fetcher.py (1)

421-423: Inconsistent exception handling for date parsing.

PR date parsing catches broad Exception while the equivalent issue date parsing at lines 245-247 catches only (ValueError, TypeError). Consider aligning for consistency:

♻️ Suggested alignment

-                except Exception as e:
+                except (ValueError, TypeError) as e:
                     logger.debug("Failed to parse PR date '%s': %s", updated_str, e)
                     continue

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@github_activity_tracker/fetcher.py` around lines 421 - 423, The PR date
parsing currently uses a broad except Exception (in the block that logs "Failed
to parse PR date '%s': %s" with variables updated_str and e) which is
inconsistent with the issue date parsing that only catches (ValueError,
TypeError); change the PR date parsing except clause to except (ValueError,
TypeError) to match behavior, keep the existing logger.debug call and continue
behavior, and ensure you reference the same variables (updated_str and e) in the
log message.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@github_activity_tracker/sync/utils.py`:
- Around line 26-36: The flat-path of normalize_issue_json currently returns the
input dict unchanged, so a flat payload with "comments": null will propagate
null and break downstream iteration; update normalize_issue_json so both
branches return a dict copy with "comments" coerced to a list: in the
no-"issue_info" branch clone data (e.g., out = dict(data)), read raw_comments =
out.get("comments", []) and set out["comments"] = raw_comments if
isinstance(raw_comments, list) else [], then return out (leave the existing
"issue_info" branch behavior unchanged).
- Around line 39-51: normalize_pr_json currently returns the input dict
unchanged when "pr_info" is absent, which means flat inputs with null/None
comments or reviews are not coerced to lists; update normalize_pr_json to ensure
the returned dict always has "comments" and "reviews" keys containing lists
(coerce None or non-list values to []), mirroring the existing branch that
handles "pr_info" (use the same logic for raw_comments/raw_reviews and
out["comments"]/out["reviews"] when returning the original data).

---

Nitpick comments:
In `@github_activity_tracker/fetcher.py`:
- Around line 421-423: The PR date parsing currently uses a broad except
Exception (in the block that logs "Failed to parse PR date '%s': %s" with
variables updated_str and e) which is inconsistent with the issue date parsing
that only catches (ValueError, TypeError); change the PR date parsing except
clause to except (ValueError, TypeError) to match behavior, keep the existing
logger.debug call and continue behavior, and ensure you reference the same
variables (updated_str and e) in the log message.

ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7323ef8 and 5760a6d.

📒 Files selected for processing (7)

github_activity_tracker/fetcher.py
github_activity_tracker/sync/issues.py
github_activity_tracker/sync/pull_requests.py
github_activity_tracker/sync/raw_source.py
github_activity_tracker/sync/utils.py
github_activity_tracker/tests/test_fetcher.py
github_activity_tracker/tests/test_sync_raw_source.py

jonathanMLDev

lgtm

snowfox1003 added 2 commits February 27, 2026 16:26

Feature: fetching full details of prs and issues

fab55e8

Fix: lint/format error - cppalliance#77

8bdf1d1

snowfox1003 self-assigned this Feb 27, 2026

coderabbitai Bot reviewed Feb 27, 2026

View reviewed changes

full issue/PR detail fetch and nested raw save format - cppalliance#77

ca8ec27

coderabbitai Bot reviewed Feb 27, 2026

View reviewed changes

Comment thread github_activity_tracker/fetcher.py

Comment thread github_activity_tracker/sync/utils.py

Fix type error.

5760a6d

coderabbitai Bot reviewed Feb 27, 2026

View reviewed changes

Comment thread github_activity_tracker/sync/utils.py

Comment thread github_activity_tracker/sync/utils.py

snowfox1003 mentioned this pull request Feb 27, 2026

Make workflow discover collector commands so adding or reordering apps requires no code changes #76

Closed

snowfox1003 requested a review from jonathanMLDev February 27, 2026 22:59

snowfox1003 mentioned this pull request Feb 27, 2026

Implement full issue and PR detail fetch in github_activity_tracker #77

Closed

jonathanMLDev approved these changes Mar 9, 2026

View reviewed changes

snowfox1003 merged commit bfd9f1c into cppalliance:develop Mar 9, 2026
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature: fetching full details of prs and issues - #77#78

Feature: fetching full details of prs and issues - #77#78
snowfox1003 merged 4 commits into
cppalliance:developfrom
snowfox1003:dev-77

snowfox1003 commented Feb 27, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Feb 27, 2026 •

edited

Loading

Review skipped

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

jonathanMLDev left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

snowfox1003 commented Feb 27, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jonathanMLDev left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

snowfox1003 commented Feb 27, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Feb 27, 2026 •

edited

Loading