-
Notifications
You must be signed in to change notification settings - Fork 0
Feature/170 add logic for fetching sub issues from external repositories #175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/170 add logic for fetching sub issues from external repositories #175
Conversation
- cross-repo sub-issues are now visible in release notes.
WalkthroughRefactors repository handling to use a home_repository mapping in MinedData, adds cross-repo resolution and flags across factories and records, converts several helpers to static methods, changes commit record keys to SHA-only, adds deduplication and repository lookup helpers in miner/factories, and updates tests and logging. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant User
participant Generator
participant Miner
participant Factory as RecordFactory
participant GH as GitHub API
note right of Miner: Cross-repo-aware mining & de-duplication
User->>Generator: generate()
Generator->>Miner: mine_data()
Miner->>GH: get_repository(home_full_name)
GH-->>Miner: Repository (home)
Miner->>GH: fetch PRs/issues/commits for home
Miner->>Miner: __filter_duplicated_issues(data)
Miner-->>Generator: MinedData (home_repository + repo map)
Generator->>Factory: build records(home_repository, data)
Factory->>Factory: parse PRs/issues/commits
alt PR links external issue
Factory->>GH: get_repository(parent_full_name)
GH-->>Factory: Repository (parent) or None
Factory->>MinedData: add_repository(parent_repo) [if found]
Factory->>Factory: mark sub-issue is_cross_repo=true
end
Factory-->>Generator: Records (commit keys by SHA)
Generator-->>User: release notes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (3 warnings)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used🧠 Learnings (2)📚 Learning: 2025-09-28T14:13:38.954Z
Applied to files:
📚 Learning: 2025-09-28T13:20:42.353Z
Applied to files:
🧬 Code graph analysis (1)release_notes_generator/record/factory/default_record_factory.py (3)
🪛 Ruff (0.13.1)release_notes_generator/record/factory/default_record_factory.py137-137: Use Replace with (TRY400) 🔇 Additional comments (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (15)
release_notes_generator/model/sub_issue_record.py (1)
26-34
: Remove unused SubIssue import or reintroduce the typed override.With the override commented out,
SubIssue
isn’t referenced; keep it consistent:
- Either delete
SubIssue
import and update the docstring (no specialization remains), or- Restore the override returning
SubIssue | Issue
if downstream relies onSubIssue
members.-from github.Issue import SubIssue, Issue +from github.Issue import Issue @@ -class SubIssueRecord(IssueRecord): - """ - Represents a sub-issue record in the release notes. - Inherits from IssueRecord and specializes behavior for sub-issues. - """ +class SubIssueRecord(IssueRecord): + """ + Represents a sub-issue record in the release notes. + Currently no additional specialization over IssueRecord. + """tests/release_notes/record/factory/test_issue_hierarchy_record_factory.py (1)
63-64
: Stabilize commit-key assertions and DRY the tests
- Add a guard that the old namespaced key is absent to prevent regressions.
- Extract the repeated expected-keys set into a helper/fixture to avoid duplication.
Example tweaks:
-assert {'org/repo#121', ..., "merge_commit_sha_direct"}.issubset(result.keys()) +expected = {'org/repo#121', 'org/repo#301', 'org/repo#302', 'org/repo#303', + 'org/repo#304', 'org/repo#123', 'org/repo#124', 'merge_commit_sha_direct'} +assert expected.issubset(result.keys()) +assert "org/repo@merge_commit_sha_direct" not in result # legacy key must not appearIf feasible, also add a parametric test that injects multiple direct commits to ensure each SHA appears as a top-level key exactly once.
Also applies to: 72-73, 133-134, 142-143, 196-197, 205-206, 259-260, 268-269
release_notes_generator/model/hierarchy_issue_record.py (2)
63-75
: Confirm counting semantics for cross‑repo sub-itemsThe method now counts each cross-repo sub-issue/sub-hierarchy-issue as +1, ignoring their internal PR counts. If the metric is “change increments” this is fine; if it’s intended to be “total PRs”, this undercounts.
- Clarify the contract in the docstring (is this “increments” vs “PRs”?).
- Optionally rename or add a separate method (e.g., change_increments_count) to avoid ambiguity.
94-94
: Minor logging polish
- Use consistent placeholders (%s) for numbers to keep style uniform.
- Logs are fine at debug; no functional change.
-logger.debug("Rendering sub-issue row for issue #%d", sub_issue.issue.number) +logger.debug("Rendering sub-issue row for issue #%s", sub_issue.issue.number)Also applies to: 130-133, 139-147
release_notes_generator/model/mined_data.py (1)
55-60
: Avoid eager f-string evaluation in debug logUse parameterized logging to defer formatting.
-logger.debug(f"Added repository {repository.full_name} to mined data.") +logger.debug("Added repository %s to mined data.", repository.full_name)release_notes_generator/generator.py (2)
99-99
: Prefer explicit guard over assert for runtime safetyAssertions can be stripped with -O; fail fast with a clear log/return instead.
-assert data_filtered_by_release.home_repository is not None, "Repository must not be None" +if data_filtered_by_release.home_repository is None: + logger.error("Home repository is None; aborting release notes generation.") + return None
111-118
: Widen return type annotation and fix docstringFactory may return IssueHierarchyRecordFactory; annotate with the common base interface.
-from release_notes_generator.record.factory.default_record_factory import DefaultRecordFactory +from release_notes_generator.record.factory.default_record_factory import DefaultRecordFactory +from release_notes_generator.record.factory.record_factory import RecordFactory ... -@staticmethod -def _get_record_factory(github: Github, home_repository: Repository) -> DefaultRecordFactory: +@staticmethod +def _get_record_factory(github: Github, home_repository: Repository) -> RecordFactory: @@ - Returns: - DefaultRecordFactory: An instance of either IssueHierarchyRecordFactory or RecordFactory. + Returns: + RecordFactory: Either IssueHierarchyRecordFactory or DefaultRecordFactory.release_notes_generator/model/record.py (1)
39-39
: Enforce boolean in is_cross_repo.setter
Add a type check in the setter (lines 54–71) to raise a TypeError ifvalue
is not a bool; all current writes go through this setter in the factory at creation, so it won’t break existing flows.release_notes_generator/filter.py (1)
126-140
: Static helpers: call explicitly as static for clarity.Methods don’t use self, good. Consider invoking via the class (e.g., FilterByRelease._filter_issues_default(data)) to make static dispatch explicit.
release_notes_generator/record/factory/default_record_factory.py (2)
90-101
: Repository resolution helper looks good.get_repository uses safe calls and logs missing repos. Consider adding a small TTL cache if this gets hot, but fine as-is.
170-170
: Improve cross-repo parsing from PR body.extract_issue_numbers_from_body currently prefixes the passed repository and won’t catch “owner/repo#123” mentions. Consider enhancing the regex to optionally capture owner/repo (and return fully qualified IDs), then call it without forcing home repo context.
I can propose a regex update in pull_request_utils if helpful.
release_notes_generator/miner.py (4)
81-93
: Avoid fetching all commits; filter by release “since” after it’s derivedCurrently all commits are fetched, which can be large. Move commit retrieval after
data.since
is determined (via_get_issues
) and passsince
toget_commits
.- self._get_issues(data) - - # pulls and commits, and then reduce them by the latest release since time - data.pull_requests = list(self._safe_call(repo.get_pulls)(state=PullRequestRecord.PR_STATE_CLOSED)) - data.commits = list(self._safe_call(repo.get_commits)()) + self._get_issues(data) + + # pulls and commits, reduced by the latest release "since" time when available + data.pull_requests = list(self._safe_call(repo.get_pulls)(state=PullRequestRecord.PR_STATE_CLOSED)) + since = data.since if data.release else None + data.commits = list(self._safe_call(repo.get_commits)(since=since))
151-151
: Remove redundant assert on non-optional property
home_repository
is not Optional and would raiseKeyError
earlier if missing. The assert is ineffective and the message is misleading.- assert data.home_repository is not None, "Repository must not be None"
190-214
: Correct semver type annotations to VersionInfo
semver.Version
doesn’t exist;VersionInfo
is returned bysemver.VersionInfo.parse
. Update annotations to avoid mypy ignores and operator type warnings.- latest_version: Optional[semver.Version] = None + latest_version: Optional[semver.VersionInfo] = None @@ - current_version: Optional[semver.Version] = semver.VersionInfo.parse(version_str) + current_version: semver.VersionInfo = semver.VersionInfo.parse(version_str) @@ - if latest_version is None or current_version > latest_version: # type: ignore[operator] + if latest_version is None or current_version > latest_version:
215-235
: Make issue/PR de-dup robust for cross-repo scenariosDe-duping by PR numbers alone can drop legitimate external issues when numbers collide across repos. Filter out PR-like entries in the issues list by checking the
pull_request
field instead; this is repo-agnostic and matches the API’s semantics.- pr_numbers = {pr.number for pr in data.pull_requests} - filtered_issues = [issue for issue in data.issues if issue.number not in pr_numbers] + # GitHub returns PRs in the issues feed; exclude those by checking the presence of the PR marker + filtered_issues = [issue for issue in data.issues if getattr(issue, "pull_request", None) is None]Please confirm PyGithub’s
Issue
objects here expose.pull_request
(or.raw_data.get("pull_request")
) so we can rely on this check.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
release_notes_generator/filter.py
(3 hunks)release_notes_generator/generator.py
(2 hunks)release_notes_generator/miner.py
(5 hunks)release_notes_generator/model/hierarchy_issue_record.py
(3 hunks)release_notes_generator/model/issue_record.py
(1 hunks)release_notes_generator/model/mined_data.py
(1 hunks)release_notes_generator/model/record.py
(2 hunks)release_notes_generator/model/sub_issue_record.py
(1 hunks)release_notes_generator/record/factory/default_record_factory.py
(6 hunks)release_notes_generator/record/factory/issue_hierarchy_record_factory.py
(4 hunks)tests/release_notes/record/factory/test_default_record_factory.py
(4 hunks)tests/release_notes/record/factory/test_issue_hierarchy_record_factory.py
(8 hunks)tests/test_filter.py
(3 hunks)tests/test_miner.py
(6 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-28T14:13:38.954Z
Learnt from: miroslavpojer
PR: AbsaOSS/generate-release-notes#173
File: release_notes_generator/record/factory/default_record_factory.py:109-126
Timestamp: 2025-09-28T14:13:38.954Z
Learning: In release_notes_generator/record/factory/default_record_factory.py, there is a cross-repo fetch bug where parent_issue_id in format "owner/repo#number" is fetched from the wrong repository because the code only queries data.repository instead of parsing the owner/repo from the ID and using the GitHub client to fetch from the correct repository.
Applied to files:
tests/release_notes/record/factory/test_default_record_factory.py
release_notes_generator/record/factory/default_record_factory.py
release_notes_generator/record/factory/issue_hierarchy_record_factory.py
🧬 Code graph analysis (12)
tests/test_miner.py (3)
release_notes_generator/miner.py (1)
get_latest_release
(102-140)release_notes_generator/model/mined_data.py (1)
home_repository
(51-53)tests/conftest.py (1)
mock_git_releases
(110-119)
tests/release_notes/record/factory/test_default_record_factory.py (2)
release_notes_generator/model/commit_record.py (2)
CommitRecord
(13-78)commit
(45-50)release_notes_generator/model/mined_data.py (1)
home_repository
(51-53)
release_notes_generator/model/hierarchy_issue_record.py (2)
release_notes_generator/model/record.py (5)
is_cross_repo
(55-61)is_cross_repo
(64-70)contains_change_increment
(140-146)to_chapter_row
(128-137)is_open
(109-114)release_notes_generator/model/issue_record.py (5)
pull_requests_count
(223-229)issue
(63-68)contains_change_increment
(125-129)to_chapter_row
(100-123)is_open
(51-52)
tests/release_notes/record/factory/test_issue_hierarchy_record_factory.py (1)
release_notes_generator/model/commit_record.py (1)
CommitRecord
(13-78)
release_notes_generator/record/factory/default_record_factory.py (3)
release_notes_generator/miner.py (1)
get_repository
(63-74)release_notes_generator/model/mined_data.py (2)
get_repository
(61-65)home_repository
(51-53)release_notes_generator/utils/pull_request_utils.py (1)
extract_issue_numbers_from_body
(32-52)
release_notes_generator/miner.py (5)
release_notes_generator/model/mined_data.py (3)
get_repository
(61-65)MinedData
(38-74)home_repository
(51-53)release_notes_generator/record/factory/default_record_factory.py (1)
get_repository
(90-101)release_notes_generator/action_inputs.py (1)
get_github_repository
(113-117)release_notes_generator/model/pull_request_record.py (1)
PullRequestRecord
(15-293)release_notes_generator/model/issue_record.py (1)
IssueRecord
(16-334)
release_notes_generator/generator.py (2)
release_notes_generator/model/mined_data.py (1)
home_repository
(51-53)release_notes_generator/record/factory/default_record_factory.py (1)
DefaultRecordFactory
(46-232)
release_notes_generator/record/factory/issue_hierarchy_record_factory.py (3)
release_notes_generator/model/mined_data.py (3)
MinedData
(38-74)home_repository
(51-53)get_repository
(61-65)release_notes_generator/model/record.py (3)
get_labels
(149-154)is_cross_repo
(55-61)is_cross_repo
(64-70)release_notes_generator/utils/pull_request_utils.py (1)
extract_issue_numbers_from_body
(32-52)
release_notes_generator/model/mined_data.py (2)
release_notes_generator/miner.py (1)
get_repository
(63-74)release_notes_generator/record/factory/default_record_factory.py (1)
get_repository
(90-101)
release_notes_generator/model/issue_record.py (1)
release_notes_generator/model/record.py (2)
is_cross_repo
(55-61)is_cross_repo
(64-70)
tests/test_filter.py (1)
release_notes_generator/model/mined_data.py (1)
home_repository
(51-53)
release_notes_generator/filter.py (1)
release_notes_generator/model/mined_data.py (2)
MinedData
(38-74)home_repository
(51-53)
🪛 Ruff (0.13.1)
release_notes_generator/miner.py
83-83: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (13)
tests/test_filter.py (1)
21-21
: Import looks fine.Importing
Repository
for stricter mocks is reasonable.tests/release_notes/record/factory/test_default_record_factory.py (1)
245-245
: LGTM on commit keying by SHA.Assertions against
'ghi789'
andCommitRecord
align withCommitRecord.record_id = sha
.Also applies to: 250-250, 268-268
release_notes_generator/model/issue_record.py (1)
125-129
: is_cross_repo is already properly decorated
Theis_cross_repo
getter (with@property
) and corresponding setter (@is_cross_repo.setter
) are already present—no changes needed.tests/test_miner.py (1)
77-77
: Approve changes MinedData.home_repository is implemented as a @Property returning a Repository, so calls like data.home_repository are correct.release_notes_generator/model/mined_data.py (1)
41-49
: Guardget_repository
in default_record_factory
In release_notes_generator/record/factory/default_record_factory.py (lines 134–136),parent_repository = data.get_repository(...)
may be None; add anif parent_repository:
check around the call toparent_repository.get_issue
to prevent anAttributeError
.⛔ Skipped due to learnings
Learnt from: miroslavpojer PR: AbsaOSS/generate-release-notes#173 File: release_notes_generator/record/factory/default_record_factory.py:109-126 Timestamp: 2025-09-28T14:13:38.954Z Learning: In release_notes_generator/record/factory/default_record_factory.py, there is a cross-repo fetch bug where parent_issue_id in format "owner/repo#number" is fetched from the wrong repository because the code only queries data.repository instead of parsing the owner/repo from the ID and using the GitHub client to fetch from the correct repository.
release_notes_generator/filter.py (1)
64-64
: Correct switch to home_repository.Using data.home_repository to seed MinedData aligns with the multi-repo model. LGTM.
release_notes_generator/record/factory/default_record_factory.py (2)
156-157
: Fallback PR record creation LGTM.Stand-alone PR path correctly preserves labels and skip flag.
88-89
: No prefixed commit keys found — safe to switch. Verified that no code references “owner/repo#SHA” patterns.release_notes_generator/record/factory/issue_hierarchy_record_factory.py (4)
139-141
: Signature extension is fine.Optional target_repository enables correct context when needed. LGTM.
147-151
: Correct repo context for body parsing.Using pr_repo is an improvement. Note the extractor won’t detect “owner/repo#123” — see my note in DefaultRecordFactory for a regex enhancement.
333-334
: Helpful debug logs.Nice to log successful reattachment of sub-issues and sub-hierarchy issues.
Also applies to: 339-340
304-308
: Define missing@property
foris_cross_repo
and usesplit("#", 1)
- Add a
@property
-decorated getter foris_cross_repo
inrelease_notes_generator/model/record.py
before the existing setter.- Replace
iid.split("#")
withiid.split("#", 1)
to limit to the first split.⛔ Skipped due to learnings
Learnt from: miroslavpojer PR: AbsaOSS/generate-release-notes#173 File: release_notes_generator/record/factory/default_record_factory.py:109-126 Timestamp: 2025-09-28T14:13:38.954Z Learning: In release_notes_generator/record/factory/default_record_factory.py, there is a cross-repo fetch bug where parent_issue_id in format "owner/repo#number" is fetched from the wrong repository because the code only queries data.repository instead of parsing the owner/repo from the ID and using the GitHub client to fetch from the correct repository.
release_notes_generator/miner.py (1)
166-172
: LGTM on switching to home_repository for issue queriesUsing
data.home_repository.get_issues(...)
aligns with the new multi-repo model. No action required.
def get_repository(self, full_name: str) -> Optional[Repository]: | ||
""" | ||
Retrieves the specified GitHub repository. | ||
Returns: | ||
Optional[Repository]: The GitHub repository if found, None otherwise. | ||
""" | ||
repo: Repository = self._safe_call(self.github_instance.get_repo)(full_name) | ||
if repo is None: | ||
logger.error("Repository not found: %s", full_name) | ||
return None | ||
return repo | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix variable typing; consider deduping helper to avoid divergence
- The local variable may be None from
_safe_call
, but it's annotated asRepository
. UseOptional[Repository]
to match usage. - You now have two similar helpers (this one and record/factory/default_record_factory.get_repository). Consider consolidating into a single utility to keep behavior (logging, retries) consistent. Based on relevant snippets.
- repo: Repository = self._safe_call(self.github_instance.get_repo)(full_name)
+ repo: Optional[Repository] = self._safe_call(self.github_instance.get_repo)(full_name)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def get_repository(self, full_name: str) -> Optional[Repository]: | |
""" | |
Retrieves the specified GitHub repository. | |
Returns: | |
Optional[Repository]: The GitHub repository if found, None otherwise. | |
""" | |
repo: Repository = self._safe_call(self.github_instance.get_repo)(full_name) | |
if repo is None: | |
logger.error("Repository not found: %s", full_name) | |
return None | |
return repo | |
def get_repository(self, full_name: str) -> Optional[Repository]: | |
""" | |
Retrieves the specified GitHub repository. | |
Returns: | |
Optional[Repository]: The GitHub repository if found, None otherwise. | |
""" | |
repo: Optional[Repository] = self._safe_call(self.github_instance.get_repo)(full_name) | |
if repo is None: | |
logger.error("Repository not found: %s", full_name) | |
return None | |
return repo |
🤖 Prompt for AI Agents
In release_notes_generator/miner.py around lines 63 to 75, the local variable
repo is currently typed as Repository but can be None from _safe_call; change
its annotation to Optional[Repository] (and adjust any local uses if a type
checker complains) so the signature and implementation are consistent; then
factor this get_repository implementation into a single shared utility (e.g.,
release_notes_generator/utils/github.py or a common helpers module) and update
the other similar helper (record/factory/default_record_factory.get_repository)
to call that shared function so logging, retry behavior and error handling are
centralized and consistent across callers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the current implementation of get_repository in miner.py to confirm the fix
cat release_notes_generator/miner.py | sed -n '63,75p'
# Also check if there are still duplicate get_repository methods in different files
rg -n "def get_repository" --type=py
Length of output: 824
@miroslavpojer Thank you for working on the fix! I can see you've updated the return type annotation to Optional[Repository]
, which is great. However, there's still one small detail that needs attention:
On line 72, the local variable is still typed as Repository
but should be Optional[Repository]
to be consistent with what _safe_call
can return:
# Current (line 72):
repo: Repository = self._safe_call(self.github_instance.get_repo)(full_name)
# Should be:
repo: Optional[Repository] = self._safe_call(self.github_instance.get_repo)(full_name)
Also, I noticed there are now 3 duplicate get_repository
methods across the codebase:
release_notes_generator/miner.py:63
release_notes_generator/record/factory/default_record_factory.py:90
release_notes_generator/model/mined_data.py:61
Consider consolidating these into a shared utility to maintain consistent behavior and reduce code duplication.
🐰 Getting close to the finish line!
🧠 Learnings used
Learnt from: miroslavpojer
PR: AbsaOSS/generate-release-notes#173
File: release_notes_generator/record/factory/default_record_factory.py:109-126
Timestamp: 2025-09-28T14:13:38.954Z
Learning: In release_notes_generator/record/factory/default_record_factory.py, there is a cross-repo fetch bug where parent_issue_id in format "owner/repo#number" is fetched from the wrong repository because the code only queries data.repository instead of parsing the owner/repo from the ID and using the GitHub client to fetch from the correct repository.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On line 72 I already did what was a source of review comment.
for that 3 duplicates I have created TODO in my tech debt list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the current implementation around line 72 in miner.py to verify the fix
sed -n '70,75p' release_notes_generator/miner.py
Length of output: 228
@miroslavpojer You're absolutely right! I can confirm that line 70 (formerly 72) now correctly shows:
repo: Optional[Repository] = self._safe_call(self.github_instance.get_repo)(full_name)
The variable typing issue has been properly resolved. Thank you for addressing the immediate concern and for adding the duplicate methods to your technical debt list for future cleanup.
🧠 Learnings used
Learnt from: miroslavpojer
PR: AbsaOSS/generate-release-notes#173
File: release_notes_generator/record/factory/default_record_factory.py:109-126
Timestamp: 2025-09-28T14:13:38.954Z
Learning: In release_notes_generator/record/factory/default_record_factory.py, there is a cross-repo fetch bug where parent_issue_id in format "owner/repo#number" is fetched from the wrong repository because the code only queries data.repository instead of parsing the owner/repo from the ID and using the GitHub client to fetch from the correct repository.
If you found this review helpful, would you consider giving us a shout-out on X?
Thank you for using CodeRabbit!
release_notes_generator/record/factory/default_record_factory.py
Outdated
Show resolved
Hide resolved
release_notes_generator/record/factory/issue_hierarchy_record_factory.py
Outdated
Show resolved
Hide resolved
data.home_repository.return_value = request.getfixturevalue("mock_repo") | ||
data.issues = [] # No issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Fix mocking of home_repository to match property semantics.
Use direct attribute assignment if home_repository
is a property.
- data.home_repository.return_value = request.getfixturevalue("mock_repo")
+ data.home_repository = request.getfixturevalue("mock_repo")
Scan for similar patterns:
Also applies to: 397-398
🏁 Script executed:
#!/bin/bash
rg -n 'home_repository\.return_value' tests
Length of output: 475
🏁 Script executed:
#!/bin/bash
# Locate all definitions of home_repository
rg -nP 'def\s+home_repository' -g '*.py'
# Check for property decorator on home_repository
rg -nP '@property.*\n\s*def\s+home_repository' -g '*.py'
Length of output: 91
🏁 Script executed:
sed -n '35,70p' release_notes_generator/model/mined_data.py
Length of output: 1355
Update tests to assign the mock repo into the backing map instead of mocking the read-only property.
- data.home_repository.return_value = request.getfixturevalue("mock_repo")
+ data._repositories[data._home_repository_full_name] = request.getfixturevalue("mock_repo")
Apply to both occurrences (lines 355–356 and 397–398).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
data.home_repository.return_value = request.getfixturevalue("mock_repo") | |
data.issues = [] # No issues | |
data._repositories[data._home_repository_full_name] = request.getfixturevalue("mock_repo") | |
data.issues = [] # No issues |
🤖 Prompt for AI Agents
In tests/release_notes/record/factory/test_default_record_factory.py around
lines 355–356 and 397–398, don't mock the read-only home_repository property;
instead put the mock repository into the object's backing map that the property
reads from (for example assign the mock into the internal repos map such as
data._repos['home'] = request.getfixturevalue("mock_repo") or the actual backing
attribute used by the class), then set data.issues = [] as before; apply the
same change at both occurrences so the property returns the mock via its real
getter rather than being patched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line is not needed to be in test. Removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@miroslavpojer Thank you for confirming the fix! You're absolutely right - removing that line entirely is the cleanest solution if the test doesn't need to mock the home_repository
property. This eliminates the incorrect attempt to use .return_value
on a read-only property.
🧠 Learnings used
Learnt from: miroslavpojer
PR: AbsaOSS/generate-release-notes#173
File: tests/release_notes/record/factory/test_default_record_factory.py:0-0
Timestamp: 2025-09-28T14:09:54.315Z
Learning: In tests/release_notes/record/factory/test_default_record_factory.py, the mock_get_issues_for_pr_no_issues helper function parameter should remain as `str` type (not changed to `int`) because changing it to `int` causes test failures, even though the parameter is unused and flagged by linters.
Learnt from: miroslavpojer
PR: AbsaOSS/generate-release-notes#173
File: release_notes_generator/record/factory/default_record_factory.py:109-126
Timestamp: 2025-09-28T14:13:38.954Z
Learning: In release_notes_generator/record/factory/default_record_factory.py, there is a cross-repo fetch bug where parent_issue_id in format "owner/repo#number" is fetched from the wrong repository because the code only queries data.repository instead of parsing the owner/repo from the ID and using the GitHub client to fetch from the correct repository.
If you found this review helpful, would you consider giving us a shout-out on X?
Thank you for using CodeRabbit!
tests/test_filter.py
Outdated
data.home_repository.return_value = MagicMock(spec=Repository) | ||
data.since = datetime(2023, 1, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Mocking home_repository as callable will break if it’s a property.
If MinedData.home_repository
is a property (recommended), set the attribute directly, not .return_value
.
- data.home_repository.return_value = MagicMock(spec=Repository)
+ data.home_repository = MagicMock(spec=Repository)
Repeat the same change in the second test setup. To find other spots:
Also applies to: 58-59
🏁 Script executed:
#!/bin/bash
rg -n 'home_repository\.return_value' tests
Length of output: 475
🏁 Script executed:
#!/bin/bash
# Find MinedData class
rg -nP 'class\s+MinedData' -C5
# Find home_repository definition
rg -nP 'def\s+home_repository' -C5
Length of output: 3167
Use direct assignment for home_repository
property mocks
MinedData.home_repository
is a read-only @property
; setting .return_value
won’t work. Replace in tests:
- data.home_repository.return_value = MagicMock(spec=Repository)
+ data.home_repository = MagicMock(spec=Repository)
Apply at tests/test_filter.py
lines 33–34 and 58–59, and at tests/release_notes/record/factory/test_default_record_factory.py
lines 355–356 and 397–398. To locate all occurrences:
rg -n 'home_repository\.return_value' tests
🤖 Prompt for AI Agents
In tests/test_filter.py around lines 33–34, the test is trying to set a
return_value on MinedData.home_repository which is a read-only @property;
instead assign the MagicMock directly to the attribute (e.g.,
data.home_repository = MagicMock(spec=Repository)) and keep the since assignment
as-is; apply the same direct-assignment fix for home_repository at
tests/test_filter.py lines 58–59 and in
tests/release_notes/record/factory/test_default_record_factory.py at lines
355–356 and 397–398 (search for "home_repository.return_value" to update all
occurrences).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@miroslavpojer Thank you for fixing the home_repository
mocking issue! The tests should now work correctly with the property-based implementation.
If you found this review helpful, would you consider giving us a shout-out on X?
Thank you for using CodeRabbit!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (12)
tests/test_filter.py (1)
58-59
: Same property-mock fix applied here; consider two small test hardening tweaks.
- Make datetimes timezone-aware (UTC) to avoid naive/aware comparison issues if production uses aware datetimes.
- Relax logger assertions to avoid brittle index-based checks (use
assert_any_call
or substring matching).Example (logger check):
mock_log_debug.assert_any_call('Count of issues reduced from %d to %d', 2, 1)release_notes_generator/miner.py (3)
81-93
: Bound data volume: use since for commits; avoid materializing all PRs
- Commits: pass since to limit API reads.
- PRs: PyGithub can’t filter by since; iterate the PaginatedList and break when updated_at < data.since instead of list(...) which loads all pages.
Apply for commits:
- data.pull_requests = list(self._safe_call(repo.get_pulls)(state=PullRequestRecord.PR_STATE_CLOSED)) - data.commits = list(self._safe_call(repo.get_commits)()) + data.pull_requests = list(self._safe_call(repo.get_pulls)(state=PullRequestRecord.PR_STATE_CLOSED)) + # Reduce API calls by honoring the derived window + data.commits = list(self._safe_call(repo.get_commits)(since=data.since))For PRs, consider iterating lazily and breaking on date; happy to propose a small helper if you want.
151-151
: Redundant asserthome_repository is guaranteed by MinedData’s ctor/backing map; the assert adds little signal. Optional: replace with a defensive ValueError if you need a guard.
190-206
: Fix semver typing; simplify and keep trace handling idiomaticUse VersionInfo for types; you can drop type: ignore.
- latest_version: Optional[semver.Version] = None + latest_version: Optional[semver.VersionInfo] = None @@ - current_version: Optional[semver.Version] = semver.VersionInfo.parse(version_str) + current_version: semver.VersionInfo = semver.VersionInfo.parse(version_str) @@ - if latest_version is None or current_version > latest_version: # type: ignore[operator] + if latest_version is None or current_version > latest_version: latest_version = current_version rls = releaseAlso, consider using logger.exception in exception handlers to avoid manual traceback formatting, matching the style used elsewhere.
release_notes_generator/record/factory/issue_hierarchy_record_factory.py (4)
139-147
: Safer label fetch; honor rate‑limits consistentlyWrap pull.get_labels with the shared safe_call to match the rest of the code.
- pull_labels = [label.name for label in pull.get_labels()] + _labels = self._safe_call(pull.get_labels)() or [] + pull_labels = [label.name for label in _labels]
147-151
: Body extraction uses PR repo only; add support for owner/repo#NPassing pr_repo is correct, but extract_issue_numbers_from_body currently ignores cross‑repo syntax in PR bodies (e.g., other/repo#123). Extend the regex to return fully‑qualified IDs when present; default to pr_repo when only “#N” is used.
Proposed change in utils/pull_request_utils.py:
-def extract_issue_numbers_from_body(pr: PullRequest, repository: Repository) -> set[str]: +def extract_issue_numbers_from_body(pr: PullRequest, repository: Repository) -> set[str]: @@ - regex_pattern = re.compile(r"([Cc]los(e|es|ed)|[Ff]ix(es|ed)?|[Rr]esolv(e|es|ed))\s*#\s*([0-9]+)") + # Matches: + # Close|Fix|Resolve owner/repo#123 OR Close|Fix|Resolve #123 + regex_pattern = re.compile( + r"([Cc]los(e|es|ed)|[Ff]ix(es|ed)?|[Rr]esolv(e|es|ed))" + r"\s*(?:(?P<full_repo>[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+)\s*#\s*(?P<num1>\d+)|#\s*(?P<num2>\d+))" + ) @@ - issue_matches = regex_pattern.findall(pr.body if pr.body else "") - - # Extract the issue numbers from the matches - issue_numbers = {int(match[-1]) for match in issue_matches} - issue_ids = {f"{repository.full_name}#{number}" for number in issue_numbers} + matches = regex_pattern.finditer(pr.body or "") + issue_ids = set() + for m in matches: + if m.group("full_repo") and m.group("num1"): + issue_ids.add(f"{m.group('full_repo')}#{int(m.group('num1'))}") + elif m.group("num2"): + issue_ids.add(f"{repository.full_name}#{int(m.group('num2'))}") return issue_idsBased on learnings
163-176
: Cross‑repo fetch path: solid; minor logging tweakFlow correctly resolves repo → caches into MinedData → fetches issue by int id. Use logger.exception in the ValueError handler to capture context/stack per code‑style hints.
- except ValueError: - logger.error("Invalid issue id: %s", issue_id) + except ValueError: + logger.exception("Invalid issue id: %s", issue_id)
337-351
: Re‑register loop: good guardrails; minor nitThe protective pop avoids recursion loops. Optional: fix comment typos (“location”, “possible”) for clarity; no behavior change.
release_notes_generator/record/factory/default_record_factory.py (4)
112-116
: Avoid duplicate label fetches; pass labels intoregister_pull_request
.This saves an extra API call and reduces rate‑limit pressure.
- def register_pull_request(pr: PullRequest, skip_rec: bool) -> None: - l_pid = self.get_id(pr) - l_pull_labels = [label.name for label in pr.get_labels()] + def register_pull_request(pr: PullRequest, skip_rec: bool, pull_labels: list[str]) -> None: + l_pid = self.get_id(pr) attached_any = False detected_issues = extract_issue_numbers_from_body(pr, repository=data.home_repository) @@ - logger.debug("Registering pull number: %s, title : %s", pid, pull.title) - register_pull_request(pull, skip_record) + logger.debug("Registering pull number: %s, title : %s", pid, pull.title) + register_pull_request(pull, skip_record, pull_labels)After this change, also update the fallback creation below to use the passed
pull_labels
(see next comment).Also applies to: 172-185
116-116
: Body parsing missesowner/repo#123
references; extend the regex.Update
extract_issue_numbers_from_body
to capture optionalowner/repo
before#num
. This enables cross‑repo detection from PR bodies, not only API links.Patch (release_notes_generator/utils/pull_request_utils.py):
def extract_issue_numbers_from_body(pr: PullRequest, repository: Repository) -> set[str]: - """ - Extracts the numbers of the issues mentioned in the body of the pull request. - - Parameters: - pr (PullRequest): The pull request to extract numbers from. - - Returns: - Set of issue numbers mentioned in the pull request body. - """ - # Regex pattern to match issue numbers following keywords like "Close", "Fix", "Resolve" - regex_pattern = re.compile(r"([Cc]los(e|es|ed)|[Ff]ix(es|ed)?|[Rr]esolv(e|es|ed))\s*#\s*([0-9]+)") - - # Extract all issue numbers from the PR body - issue_matches = regex_pattern.findall(pr.body if pr.body else "") - - # Extract the issue numbers from the matches - issue_numbers = {int(match[-1]) for match in issue_matches} - issue_ids = {f"{repository.full_name}#{number}" for number in issue_numbers} - - return issue_ids + """ + Extracts issue IDs referenced in the PR body, supporting: + - "Close/Fix/Resolve #123" (scoped to the provided repository) + - "Close/Fix/Resolve owner/repo#123" (cross-repo) + """ + pattern = re.compile( + r"(?ix)\b(?:close[sd]?|fix(?:es|ed)?|resolv(?:e|es|ed))\s+" + r"(?:(?P<repo>[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+)\s*#)?\s*(?P<num>\d+)" + ) + ids: set[str] = set() + body = pr.body or "" + for m in pattern.finditer(body): + repo_full = m.group('repo') or repository.full_name + num = int(m.group('num')) + ids.add(f"{repo_full}#{num}") + return idsAlso applies to: 178-178
133-147
: Minor logging/doc tweak: uselogger.exception
in except; update stale comment.
- The “same repository” note is outdated now that cross‑repo is supported.
- Per TRY400, log the caught
ValueError
with stack for easier triage.- # dev note: here we expect that PR links to an issue in the same repository !!! + # PR may link issues across repositories; resolve and fetch dynamically. @@ - except ValueError: - logger.error("Invalid parent issue id: %s", parent_issue_id) + except ValueError: + logger.exception("Invalid parent issue id: %s", parent_issue_id) continue
164-165
: Align with label‑passing refactor.Use the labels already fetched by the caller to avoid re‑calling
.get_labels()
:- self._records[l_pid] = PullRequestRecord(pr, l_pull_labels, skip=skip_rec) + self._records[l_pid] = PullRequestRecord(pr, pull_labels, skip=skip_rec)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
release_notes_generator/miner.py
(5 hunks)release_notes_generator/model/sub_issue_record.py
(0 hunks)release_notes_generator/record/factory/default_record_factory.py
(6 hunks)release_notes_generator/record/factory/issue_hierarchy_record_factory.py
(4 hunks)tests/release_notes/record/factory/test_default_record_factory.py
(2 hunks)tests/test_filter.py
(3 hunks)
💤 Files with no reviewable changes (1)
- release_notes_generator/model/sub_issue_record.py
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-09-28T14:13:38.954Z
Learnt from: miroslavpojer
PR: AbsaOSS/generate-release-notes#173
File: release_notes_generator/record/factory/default_record_factory.py:109-126
Timestamp: 2025-09-28T14:13:38.954Z
Learning: In release_notes_generator/record/factory/default_record_factory.py, there is a cross-repo fetch bug where parent_issue_id in format "owner/repo#number" is fetched from the wrong repository because the code only queries data.repository instead of parsing the owner/repo from the ID and using the GitHub client to fetch from the correct repository.
Applied to files:
release_notes_generator/record/factory/issue_hierarchy_record_factory.py
tests/release_notes/record/factory/test_default_record_factory.py
release_notes_generator/record/factory/default_record_factory.py
release_notes_generator/miner.py
📚 Learning: 2025-09-28T14:09:54.315Z
Learnt from: miroslavpojer
PR: AbsaOSS/generate-release-notes#173
File: tests/release_notes/record/factory/test_default_record_factory.py:0-0
Timestamp: 2025-09-28T14:09:54.315Z
Learning: In tests/release_notes/record/factory/test_default_record_factory.py, the mock_get_issues_for_pr_no_issues helper function parameter should remain as `str` type (not changed to `int`) because changing it to `int` causes test failures, even though the parameter is unused and flagged by linters.
Applied to files:
tests/release_notes/record/factory/test_default_record_factory.py
📚 Learning: 2025-09-28T13:20:42.353Z
Learnt from: miroslavpojer
PR: AbsaOSS/generate-release-notes#173
File: release_notes_generator/record/factory/default_record_factory.py:115-117
Timestamp: 2025-09-28T13:20:42.353Z
Learning: In the DefaultRecordFactory class, the Github instance passed to __init__ is not stored as a class attribute, only used to create a rate limiter. For cross-repo issue handling, the Github instance needs to be stored to enable fetching issues from different repositories via github.get_repo().
Applied to files:
release_notes_generator/record/factory/default_record_factory.py
🧬 Code graph analysis (5)
release_notes_generator/record/factory/issue_hierarchy_record_factory.py (5)
release_notes_generator/model/mined_data.py (4)
MinedData
(38-74)home_repository
(51-53)get_repository
(61-65)add_repository
(55-59)release_notes_generator/model/record.py (2)
is_cross_repo
(55-61)is_cross_repo
(64-70)release_notes_generator/utils/pull_request_utils.py (2)
get_issues_for_pr
(56-82)extract_issue_numbers_from_body
(32-52)release_notes_generator/miner.py (1)
get_repository
(63-74)release_notes_generator/record/factory/default_record_factory.py (1)
get_repository
(90-101)
tests/test_filter.py (1)
release_notes_generator/model/mined_data.py (1)
home_repository
(51-53)
tests/release_notes/record/factory/test_default_record_factory.py (1)
release_notes_generator/model/commit_record.py (2)
CommitRecord
(13-78)commit
(45-50)
release_notes_generator/record/factory/default_record_factory.py (3)
release_notes_generator/miner.py (1)
get_repository
(63-74)release_notes_generator/model/mined_data.py (3)
get_repository
(61-65)home_repository
(51-53)add_repository
(55-59)release_notes_generator/utils/pull_request_utils.py (1)
extract_issue_numbers_from_body
(32-52)
release_notes_generator/miner.py (3)
release_notes_generator/model/mined_data.py (3)
get_repository
(61-65)MinedData
(38-74)home_repository
(51-53)release_notes_generator/action_inputs.py (1)
get_github_repository
(113-117)release_notes_generator/model/pull_request_record.py (1)
PullRequestRecord
(15-293)
🪛 Ruff (0.13.1)
release_notes_generator/record/factory/issue_hierarchy_record_factory.py
167-167: Use logging.exception
instead of logging.error
Replace with exception
(TRY400)
release_notes_generator/record/factory/default_record_factory.py
137-137: Use logging.exception
instead of logging.error
Replace with exception
(TRY400)
release_notes_generator/miner.py
83-83: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (8)
tests/test_filter.py (2)
21-21
: Repository import looks correct.Import path matches PyGithub’s structure; good for using
spec=Repository
. Based on learnings.
33-34
: Approve changes:home_repository
is now assigned directly and no.return_value
usages remain.release_notes_generator/miner.py (2)
63-75
: Helper looks good; avoid divergence across classesNice Optional typing and error log on miss. Consider centralizing this helper (shared util) so DefaultRecordFactory and others don’t drift in behavior over time. Based on learnings
215-216
: Static de-dupe helper: LGTMClear and side‑effect confined to data. No concerns.
release_notes_generator/record/factory/issue_hierarchy_record_factory.py (1)
315-319
: Cross‑repo flagging: LGTMMarking SubIssueRecord as cross‑repo when repo != home is the right signal for downstream rendering.
tests/release_notes/record/factory/test_default_record_factory.py (1)
244-245
: All legacy commit key references have been updated
No occurrences ofowner/repo@sha
remain in tests.release_notes_generator/record/factory/default_record_factory.py (2)
51-56
: Good: keep a reference to the Github client for cross‑repo fetches.This unblocks repository resolution via
get_repo(...)
. Nice.Based on learnings
87-89
: Commit IDs switched to SHA-only — no remaining callers expectrepo@sha
. Verified codebase-wide: no f-strings,.format
calls, or other patterns buildingrepo@sha
remain.
Release Notes:
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests
Closes #170