Conversation
## Walkthrough
The code updates webhook server logic to fetch GitHub and Cloudflare IP allowlists asynchronously using an `httpx.AsyncClient`. It introduces new async functions for allowlist retrieval, updates the application lifespan context manager to handle async client lifecycle, unifies logging via a module-level logger, and modifies webhook payload parsing to use `json.loads` on raw bytes. Additionally, multiple internal classes are refactored to remove implicit `self.pull_request` usage, replacing it with explicit `PullRequest` parameters in methods, converting several methods to async, and removing the `NoPullRequestError` exception class and its handling.
## Changes
| File(s) | Change Summary |
|------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `webhook_server/app.py` | Replaces synchronous allowlist HTTP requests with async functions using `httpx.AsyncClient`; updates lifespan context manager for async fetching and client cleanup; unifies logging; changes webhook payload parsing to use `json.loads` on raw body bytes; removes handling of `NoPullRequestError` exception from webhook processing. |
| `webhook_server/libs/check_run_handler.py` | Removes `self.pull_request` attribute; methods now require explicit `pull_request` parameter; converts `all_required_status_checks` from cached property to method with `pull_request` argument; updates calls accordingly; converts `process_pull_request_check_run_webhook_data` to async; updates status-setting methods to async using `asyncio.to_thread`. |
| `webhook_server/libs/exceptions.py` | Removes `NoPullRequestError` exception class; adds new `NoApiTokenError` exception class. |
| `webhook_server/libs/github_api.py` | Refactors `GithubWebhook` to remove instance attributes for pull request and GitHub API; methods now take explicit `pull_request` parameters; `_get_pull_request` renamed and returns `None` instead of raising; `process` method rewritten to async with explicit pull request handling; updates method signatures and internal calls accordingly; removes exception-based control flow for missing pull requests; removes OWNERS-related methods and logic. |
| `webhook_server/libs/issue_comment_handler.py` | Refactors `IssueCommentHandler` methods to accept explicit `pull_request` parameter instead of using `self.pull_request`; converts some methods to async; updates all internal calls to use passed `pull_request`; replaces ThreadPoolExecutor concurrency with asyncio tasks and gather for retest commands; constructor now accepts `OwnersFileHandler`. |
| `webhook_server/libs/labels_handler.py` | Refactors `LabelsHandler` methods to accept explicit `pull_request` parameter instead of using `self.pull_request`; converts label addition/removal methods to async; updates logging to use passed `pull_request`; constructor now accepts `OwnersFileHandler`; no logic changes. |
| `webhook_server/libs/pull_request_handler.py` | Refactors `PullRequestHandler` to remove `self.pull_request`; methods now accept explicit `pull_request` parameters; converts `process_pull_request_webhook_data` and other internal methods to async; replaces ThreadPoolExecutor concurrency with asyncio; updates all internal calls to use passed `pull_request`; constructor now accepts `OwnersFileHandler`; preserves logic and exception handling. |
| `webhook_server/libs/pull_request_review_handler.py` | Converts `process_pull_request_review_webhook_data` to async and requires explicit `pull_request` parameter; updates internal calls to pass `pull_request`; constructor now accepts `OwnersFileHandler`; removes reliance on `github_webhook` attributes. |
| `webhook_server/libs/push_handler.py` | Converts `process_push_webhook_data` and `upload_to_pypi` methods to async; uses async context manager and awaits command executions; internal logic unchanged. |
| `webhook_server/libs/runner_handler.py` | Removes `self.pull_request`; methods and context managers now accept explicit `pull_request` parameter; removes import and handling of `NoPullRequestError`; updates internal calls to use passed `pull_request`; converts methods and context manager to async; adds async branch existence check; constructor now accepts optional `OwnersFileHandler`; preserves logic but changes control flow to explicit parameter passing and async command execution. |
| `webhook_server/libs/owners_files_handler.py` | Adds new `OwnersFileHandler` class managing OWNERS files, approvers, reviewers, and user validation asynchronously; loads changed files and OWNERS data; supports assigning reviewers; caches repository collaborators, contributors, and maintainers; integrates with pull requests for authorization and reviewer assignment. |
| `webhook_server/tests/conftest.py` | Adds `Label` class and expands `PullRequest` test stub with additions, deletions, labels, and base ref; adds stub methods for issue comment and review request creation; defines pytest fixture for `pull_request`; changes import order to load `OwnersFileHandler` before `GithubWebhook`. |
| `webhook_server/tests/test_add_reviewer_action.py` | Updates test functions to accept `pull_request` and `owners_file_handler` fixtures and pass them explicitly to `_add_reviewer_by_user_comment`. |
| `webhook_server/tests/test_pull_request_owners.py` | Updates test functions to accept `pull_request` and `owners_file_handler` fixtures; converts tests to async; adds `changed_files` fixture; updates parametrization to use `changed_files`; updates calls to await async methods on `owners_file_handler`. |
| `webhook_server/tests/test_pull_request_size.py` | Removes local `PullRequest` and `Label` classes; imports `PullRequest` from `conftest`; changes test to pass explicit `pull_request` argument to `LabelsHandler.get_size`; includes `owners_file_handler` fixture. |
| `webhook_server/tests/test_prepare_retest_wellcome_comment.py` | Adds `owners_file_handler` fixture parameter and passes it to `PullRequestHandler` constructor in test setup. |
| `.gitignore` | Adds `webhook_server/tests/manifests/logs` directory to ignore list. |
| `webhook_server/utils/helpers.py` | Ensures the log directory exists before creating log files by checking and creating the "logs" directory under the configured data directory; converts `run_command` to async using `asyncio.create_subprocess_exec` with awaited output handling; tightens return types and adds handling for `BadCredentialsException` in `get_api_with_highest_rate_limit`; raises `NoApiTokenError` if no valid API/token found; sets default masking of sensitive info in logger. |
| `entrypoint.py` | Imports `asyncio` and runs `repository_and_webhook_settings` asynchronously using `asyncio.run()`; replaces print and synchronous call with direct `uvicorn.run()` invocation. |
| `webhook_server/utils/github_repository_and_webhook_settings.py` | Converts `repository_and_webhook_settings` function to async and awaits the call to `set_repositories_settings`. |
| `webhook_server/utils/github_repository_settings.py` | Converts `set_repositories_settings` to async and awaits `run_command`; removes `github_api` parameter from `set_branch_protection` and its calls. |
| `Dockerfile` | Removes copying and execution permission setting of `entrypoint.sh`; replaces entrypoint with direct `uv run entrypoint.py` command. |
| `entrypoint.sh` | Deleted. Removed Bash entrypoint script that launched the server via a dynamically generated uvicorn command. |
| `pyproject.toml` | Adds required dependency `asyncstdlib>=3.13.1`; adds optional test dependency group with `pytest-asyncio>=0.26.0`. |
| `tox.toml` | Modifies `unittests` environment command to include `--extra tests` argument in `uv run` invocation. |
## Possibly related PRs
- myk-org/github-webhook-server#750: Both PRs modify the `process` method in `webhook_server/libs/github_api.py` to be asynchronous and update its usage.
- myk-org/github-webhook-server#756: Both PRs modify `process_webhook` in `webhook_server/app.py` to improve async handling and background task usage.
- myk-org/github-webhook-server#744: Both PRs modify allowlist fetching functions in `webhook_server/app.py` but differ in sync vs async implementation and lifecycle management.
## Suggested labels
`size/L`, `verified`
## Suggested reviewers
- rnetserNote ⚡️ AI Code Reviews for VS Code, Cursor, WindsurfCodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback. Note ⚡️ Faster reviews with cachingCodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms (13)
🔇 Additional comments (6)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
Report bugs in Issues The following are automatically added:
Available user actions:
PR will be approved when the following conditions are met:
Approvers and Reviewers
Supported /retest check runs
Supported labels
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (5)
webhook_server/app.py (5)
135-154: Deduplicate & optionally fetch in parallel
networkscan contain duplicates if the two providers return overlapping CIDRs.
Building asetavoids redundant membership checks on each request:- networks: list[ipaddress._BaseNetwork] = [] + networks: set[ipaddress._BaseNetwork] = set()
- Fetching each provider sequentially adds ~200-400 ms to startup.
When both flags are enabled you can cut that in half:import asyncio tasks = [] if verify_cloudflare_ips: tasks.append(get_cloudflare_allowlist()) if verify_github_ips: tasks.append(get_github_allowlist()) for cidr_list in await asyncio.gather(*tasks, return_exceptions=False): networks.update(ipaddress.ip_network(cidr) for cidr in cidr_list)This keeps startup snappy and memory-efficient.
157-162: Potential double-close of the sharedAsyncClient
await _lifespan_http_client.aclose()is called both in the normal shutdown path (afteryield) and in theexceptblock.
If an exception happens after the first close, the second call is redundant and may trigger a warning.You can guard it:
if _lifespan_http_client and not _lifespan_http_client.is_closed: await _lifespan_http_client.aclose()or rely on a
try/finallypattern instead of duplicating the call.
196-198: Avoid re-loading configuration on every request
Config(logger=LOGGER)is instantiated for every webhook hit, even though the values are immutable at runtime.
Cacheroot_configduring startup and inject it via dependency or closure to reduce latency and GC pressure.# inside lifespan app.state.root_config = root_config # in handler root_config = request.app.state.root_config
215-216: Reuse the global logger instead of creating a new one per request
get_logger_with_paramsis invoked on each request, producing many logger instances and repeating handler configuration.
PreferLOGGER.bind(repository_name=...)(struct-log style) orLOGGER.getChild(repo)to keep a single hierarchy and consistent formatting.
This also avoids the slight overhead of repeatedly walking the logging tree.
88-101: Remove dead commented-out codeThe legacy synchronous implementations and their old usage examples create noise and hinder readability. Since the async variants are stable, please delete the commented blocks instead of keeping them in the source.
Also applies to: 165-181
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
webhook_server/app.py(6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: pre-commit
- GitHub Check: build-container
- GitHub Check: python-module-install
- GitHub Check: tox
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (3)
webhook_server/app.py (3)
38-50: Add explicit timeouts & reuse parsed IPs to harden allow-list fetcherThe helper looks good, but two robustness tweaks are advisable:
- Pass a per-request timeout so a mis-behaving endpoint doesn’t block startup forever:
- response = await _lifespan_http_client.get("https://api.github.com/meta") + response = await _lifespan_http_client.get( + "https://api.github.com/meta", + timeout=httpx.Timeout(5.0), + )
- The GitHub API already returns CIDR strings. Consider validating them here and returning
set[ipaddress._BaseNetwork]to avoid the later double iteration & potentialValueErrorinside the lifespan loop.These fixes localise failure handling and make the later startup code simpler.
53-65: Deduplicate & validate Cloudflare CIDRs before returningSimilar to the GitHub helper, the Cloudflare function can:
- Wrap each CIDR with
ipaddress.ip_network(cidr)inside the helper, catching invalid entries once.- Return a set to avoid duplicates – Cloudflare occasionally overlaps IPv4 ranges between services.
- return result.get("ipv4_cidrs", []) + result.get("ipv6_cidrs", []) + cidrs = result.get("ipv4_cidrs", []) + result.get("ipv6_cidrs", []) + networks: set[ipaddress._BaseNetwork] = set() + for cidr in cidrs: + try: + networks.add(ipaddress.ip_network(cidr)) + except ValueError: + LOGGER.warning(f"Skipping invalid CIDR from Cloudflare response: {cidr}") + return list(networks)This guarantees the caller only receives validated networks.
210-215: Preferrequest.json()or decode bytes once to avoid double parsing & large payload memoryCalling
await request.body()loads the entire request into memory and keeps a second copy when you pass the same bytes tojson.loads.request.json()streams & decodes in one step.If you need the raw bytes for signature verification, decode once:
payload_body = await request.body() # verify_signature(payload_body, ...) try: hook_data = json.loads(payload_body.decode("utf-8")) except json.JSONDecodeError as exc: ...This removes an implicit re-decode done inside
json.loadsforbytesand makes the charset explicit.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
webhook_server/app.py(6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: pre-commit
- GitHub Check: build-container
- GitHub Check: python-module-install
- GitHub Check: tox
There was a problem hiding this comment.
Actionable comments posted: 4
🔭 Outside diff range comments (5)
webhook_server/libs/push_handler.py (1)
20-32: 💡 Verification agent🧩 Analysis chain
Async wrapper still executes heavy, blocking work on the event-loop thread
process_push_webhook_datais nowasync, yet it directly invokes two CPU / I/O-heavy synchronous helpers (upload_to_pypi,_run_build_container).
When this coroutine is awaited inside an async web-request handler it will block the loop, wiping out the scalability gains we try to achieve with the refactor.Suggested quick win – move the work to a thread pool with
asyncio.to_thread(or refactor helpers to be truly async):@@ -import re +import asyncio +import re @@ - if self.github_webhook.pypi: - self.logger.info(f"{self.log_prefix} Processing upload to pypi for tag: {tag_name}") - self.upload_to_pypi(tag_name=tag_name) + if self.github_webhook.pypi: + self.logger.info(f"{self.log_prefix} Processing upload to pypi for tag: {tag_name}") + await asyncio.to_thread(self.upload_to_pypi, tag_name=tag_name) @@ - if self.github_webhook.build_and_push_container and self.github_webhook.container_release: - self.logger.info(f"{self.log_prefix} Processing build and push container for tag: {tag_name}") - self.runner_handler._run_build_container(push=True, set_check=False, tag=tag_name) + if self.github_webhook.build_and_push_container and self.github_webhook.container_release: + self.logger.info(f"{self.log_prefix} Processing build and push container for tag: {tag_name}") + await asyncio.to_thread( + self.runner_handler._run_build_container, + push=True, + set_check=False, + tag=tag_name, + )This prevents the event loop from being blocked while keeping the public async API intact.
If you prefer a full async rewrite of the helpers, we can tackle that separately.
🏁 Script executed:
#!/bin/bash # Check if upload_to_pypi is async or sync rg -n "async def upload_to_pypi" -C2 rg -n "def upload_to_pypi" -C2 # Check if _run_build_container is async or sync rg -n "async def _run_build_container" -C2 rg -n "def _run_build_container" -C2Length of output: 976
Offload Blocking Helpers to Thread Pool
Both
upload_to_pypiand_run_build_containerare synchronous methods (see definitions inpush_handler.py:33andrunner_handler.py:234), so calling them directly inside the asyncprocess_push_webhook_datawill block the event loop. Offload these calls withasyncio.to_threadto prevent blocking:Locations to update:
- webhook_server/libs/push_handler.py:
process_push_webhook_data(lines 20–32)- webhook_server/libs/push_handler.py:
def upload_to_pypi(line 33)- webhook_server/libs/runner_handler.py:
def _run_build_container(line 234)Suggested diff:
@@ webhook_server/libs/push_handler.py -import re +import asyncio +import re @@ - if self.github_webhook.pypi: - self.logger.info(f"{self.log_prefix} Processing upload to pypi for tag: {tag_name}") - self.upload_to_pypi(tag_name=tag_name) + if self.github_webhook.pypi: + self.logger.info(f"{self.log_prefix} Processing upload to pypi for tag: {tag_name}") + await asyncio.to_thread(self.upload_to_pypi, tag_name=tag_name) @@ - if self.github_webhook.build_and_push_container and self.github_webhook.container_release: - self.logger.info(f"{self.log_prefix} Processing build and push container for tag: {tag_name}") - self.runner_handler._run_build_container(push=True, set_check=False, tag=tag_name) + if self.github_webhook.build_and_push_container and self.github_webhook.container_release: + self.logger.info(f"{self.log_prefix} Processing build and push container for tag: {tag_name}") + await asyncio.to_thread( + self.runner_handler._run_build_container, + push=True, + set_check=False, + tag=tag_name, + )This change preserves your async API while preventing the event loop from being blocked.
webhook_server/libs/pull_request_review_handler.py (1)
15-32: 💡 Verification agent🧩 Analysis chain
asyncfunction contains only synchronous calls – confirm/await or revert
process_pull_request_review_webhook_datais declaredasync, but all internal calls (manage_reviewed_by_label,label_by_user_comment) are invoked withoutawait.If these helper methods were converted to
async, we mustawaitthem or nothing will execute.
If they stayed synchronous, keeping this wrapperasyncadds unnecessary complexity.Please verify the helpers’ signatures and either:
- add
awaitin front of each async helper,- or drop the
asynckeyword from this method.
🏁 Script executed:
#!/bin/bash # Show the class and method context in pull_request_review_handler.py echo "=== pull_request_review_handler.py (lines 1-200) ===" sed -n '1,200p' webhook_server/libs/pull_request_review_handler.py # Find where manage_reviewed_by_label is defined echo "=== Definition of manage_reviewed_by_label ===" rg "def manage_reviewed_by_label" -n . # Find any async definitions of helpers invoked in this method echo "=== Any async helper definitions ===" rg -nE "async def .*manage_reviewed_by_label|async def .*label_by_user_comment" -n .Length of output: 2187
🏁 Script executed:
#!/bin/bash set -e # Display the implementation of manage_reviewed_by_label for context echo "=== manage_reviewed_by_label implementation ===" sed -n '150,260p' webhook_server/libs/labels_handler.py # Locate and display label_by_user_comment definition echo "=== label_by_user_comment definition ===" rg --context 5 "def label_by_user_comment" -n webhook_server/libs/labels_handler.py # Find where process_pull_request_review_webhook_data is invoked elsewhere echo "=== References to process_pull_request_review_webhook_data ===" rg -n "process_pull_request_review_webhook_data" -n .Length of output: 4374
Remove unnecessary
asyncfromprocess_pull_request_review_webhook_dataand update its call siteBoth
manage_reviewed_by_labelandlabel_by_user_comment(and their underlying_add_label/_remove_label) are plain, synchronous methods. There’s noawaitinsideprocess_pull_request_review_webhook_data, so it doesn’t need to be anasynccoroutine.Please update accordingly:
- In
webhook_server/libs/pull_request_review_handler.py(around line 15), change- async def process_pull_request_review_webhook_data(self, pull_request: PullRequest) -> None: + def process_pull_request_review_webhook_data(self, pull_request: PullRequest) -> None:- In
webhook_server/libs/github_api.py(around line 144), remove theawaitwhen calling it:- return await PullRequestReviewHandler(github_webhook=self) - .process_pull_request_review_webhook_data(pull_request) + return PullRequestReviewHandler(github_webhook=self) + .process_pull_request_review_webhook_data(pull_request)This will eliminate the unnecessary coroutine overhead and satisfy linters complaining about “async function contains no await.”
webhook_server/libs/runner_handler.py (1)
96-104: 🛠️ Refactor suggestionParameter shadowing may hide the caller-supplied
pull_requestInside
_prepare_cloned_repo_diryou re-assign the namepull_requestusing the walrus operator:if pull_request := self.github_webhook._get_pull_request():If the caller passed an explicit
pull_request, that value is silently overwritten, which is surprising and could lead to inconsistent logging or merge behaviour.- if pull_request := self.github_webhook._get_pull_request(): + _pr = pull_request or self.github_webhook._get_pull_request() + if _pr: rc, out, err = run_command( - command=f"{git_cmd} checkout origin/pr/{pull_request.number}", + command=f"{git_cmd} checkout origin/pr/{_pr.number}",This keeps the caller’s intent intact.
webhook_server/libs/labels_handler.py (1)
84-98:⚠️ Potential issueFix missing
pull_requestargument inwait_for_labelto prevent runtimeTypeError.
wait_for_label()now callslabel_exists_in_pull_request, which requires thepull_requestargument.
Because the method itself no longer receives that object,TimeoutSamplerwill invokelabel_exists_in_pull_request(label=…)and blow up with:TypeError: label_exists_in_pull_request() missing 1 required positional argument: 'pull_request'The two callers (
_remove_label,_add_label) also omit the object.
Patch:- def wait_for_label(self, label: str, exists: bool) -> bool: + def wait_for_label(self, *, pull_request: PullRequest, label: str, exists: bool) -> bool: @@ - func=self.label_exists_in_pull_request, - label=label, + func=self.label_exists_in_pull_request, + pull_request=pull_request, + label=label, @@ - return self.wait_for_label(label=label, exists=False) + return self.wait_for_label(pull_request=pull_request, label=label, exists=False) @@ - self.wait_for_label(label=label, exists=True) + self.wait_for_label(pull_request=pull_request, label=label, exists=True)This preserves the new explicit-context style and restores functionality.
Also applies to: 38-45, 51-58, 80-83
webhook_server/libs/pull_request_handler.py (1)
471-486:⚠️ Potential issue
_process_verified_for_update_or_new_pull_requestuses staleself.pull_requestSimilar to the previous comment, the method receives
pull_requestbut internally usesself.pull_request, which is often unset.- self.labels_handler._add_label(pull_request=self.pull_request, label=VERIFIED_LABEL_STR) + self.labels_handler._add_label(pull_request=pull_request, label=VERIFIED_LABEL_STR) ... - self.labels_handler._remove_label(pull_request=pull_request, label=VERIFIED_LABEL_STR) + self.labels_handler._remove_label(pull_request=pull_request, label=VERIFIED_LABEL_STR)Leaving it unfixed silently skips (or crashes) label management for verified workflows.
🧹 Nitpick comments (10)
webhook_server/libs/pull_request_review_handler.py (1)
17-23: Replace stray triple-quoted string with a commentThe multiline string inside the
ifblock is not a docstring and is discarded at runtime.
Use a regular#comment instead to avoid confusion and minor memory overhead.- """ - Available actions: - commented - approved - changes_requested - """ + # Available actions: + # commented + # approved + # changes_requestedwebhook_server/libs/runner_handler.py (3)
375-394: Error message wording & list join formattingTwo small issues hamper readability:
- Grammar – “must starts” → “must start”.
": ,".join()producesfoo: ,bar– probably meant", ".join().- if any([title.startswith(f"{_name}:") for _name in allowed_names]): + if any(title.startswith(f"{name}:") for name in allowed_names): @@ - output["text"] = f"Pull request title must starts with allowed title: {': ,'.join(allowed_names)}" + output["text"] = ( + "Pull-request title must start with one of: " + f"{', '.join(allowed_names)}" + )
396-420: Variable reuse obscures intent in_is_user_valid_to_run_commandsUsing
commentboth inside the list-comprehension and the subsequent loop obscures that we’re dealing with comment bodies, not objects.- comments_from_approvers = [ - comment.body - for comment in pull_request.get_issue_comments() - if comment.user.login in allowed_user_to_approve - ] - for comment in comments_from_approvers: - if allow_user_comment in comment: + comment_bodies = [ + c.body + for c in pull_request.get_issue_comments() + if c.user.login in allowed_user_to_approve + ] + for body in comment_bodies: + if allow_user_comment in body: return TrueClearer naming improves maintainability and avoids shadowing.
457-521:cherry_pick: user-visible error handling ok, but consider returning statusThe method writes an issue comment on failure but returns nothing.
Returning a boolean or raising a dedicated exception would let callers act programmatically (e.g., set check-run status).webhook_server/libs/labels_handler.py (1)
214-225: Typo – rename method towip_or_hold_labels_exists
wip_or_hold_lables_existsis misspelled. A public API with a typo reduces readability and discoverability.- def wip_or_hold_lables_exists(self, labels: list[str]) -> str: + def wip_or_hold_labels_exists(self, labels: list[str]) -> str:webhook_server/libs/check_run_handler.py (1)
216-229: Parameter shadowing makeslast_commit_check_runsargument useless
required_check_in_progress()acceptslast_commit_check_runsbut
immediately overwrites it with:last_commit_check_runs = list(self.github_webhook.last_commit.get_check_runs())Either drop the parameter or respect the supplied list; otherwise callers
may assume their pre-filtered list is being used.webhook_server/libs/issue_comment_handler.py (2)
45-49: Log prints boolean instead of action stringif comment_action := self.hook_data["action"] in ("edited", "deleted"): self.logger.debug(f"... action is {comment_action}")
comment_actionbecomesTrue/False, so the log message is misleading.
A clearer pattern:-action = self.hook_data["action"] -if action in ("edited", "deleted"): - self.logger.debug(f"... action is {action}") +action = self.hook_data["action"] +if action in ("edited", "deleted"): + self.logger.debug(f"... action is {action}")
101-107: Blocking GitHub API call inside async context – consider off-thread execution
create_comment_reaction()invokes the synchronous PyGithub call
pull_request.get_issue_comment(...)from withinasync def user_commands.
Under heavy load this can block the event loop.Consider delegating to a thread pool:
loop = asyncio.get_running_loop() _comment = await loop.run_in_executor(None, pull_request.get_issue_comment, issue_comment_id) await loop.run_in_executor(None, _comment.create_reaction, reaction)This keeps the coroutine non-blocking.
webhook_server/libs/github_api.py (2)
286-288: Inefficient retrieval of the last commit
list(pull_request.get_commits())[-1]loads all commits into memory only to access the tail element.
Use the GitHub API’sget_commits()iterator withreversed=True(or slice) to fetch a single item:-def _get_last_commit(self, pull_request: PullRequest) -> Commit: - return list(pull_request.get_commits())[-1] +def _get_last_commit(self, pull_request: PullRequest) -> Commit: + # Fetch only the most recent commit to reduce latency / memory + return next(iter(pull_request.get_commits().reversed))This significantly reduces API latency for large PRs.
501-515: Bulk-create reviewer requests instead of per-user loop
pull_request.create_review_requestaccepts a list of logins, so we can issue one API call instead ofO(n)calls.-_to_add: list[str] = list(set(self.all_pull_request_reviewers)) -... -for reviewer in _to_add: - if reviewer != pull_request.user.login: - try: - pull_request.create_review_request([reviewer]) -... +to_request = [r for r in set(self.all_pull_request_reviewers) if r != pull_request.user.login] +if to_request: + try: + pull_request.create_review_request(reviewers=to_request) + except GithubException as ex: + self.logger.debug(f"{self.log_prefix} Failed to add reviewers {to_request}. {ex}") + pull_request.create_issue_comment(f"Cannot add reviewers {', '.join(to_request)}. {ex}")Fewer HTTP round-trips → faster & safer (GitHub rate limits apply per request).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
webhook_server/app.py(6 hunks)webhook_server/libs/check_run_handler.py(5 hunks)webhook_server/libs/exceptions.py(0 hunks)webhook_server/libs/github_api.py(12 hunks)webhook_server/libs/issue_comment_handler.py(8 hunks)webhook_server/libs/labels_handler.py(8 hunks)webhook_server/libs/pull_request_handler.py(19 hunks)webhook_server/libs/pull_request_review_handler.py(4 hunks)webhook_server/libs/push_handler.py(1 hunks)webhook_server/libs/runner_handler.py(15 hunks)
💤 Files with no reviewable changes (1)
- webhook_server/libs/exceptions.py
🚧 Files skipped from review as they are similar to previous changes (1)
- webhook_server/app.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
webhook_server/libs/issue_comment_handler.py (7)
webhook_server/tests/test_add_reviewer_action.py (3)
PullRequest(19-27)create_issue_comment(23-24)create_review_request(26-27)webhook_server/utils/constants.py (1)
REACTIONS(72-80)webhook_server/libs/github_api.py (1)
assign_reviewers(501-514)webhook_server/libs/pull_request_handler.py (1)
check_if_can_be_merged(499-580)webhook_server/libs/labels_handler.py (3)
_remove_label(38-49)_add_label(51-82)label_by_user_comment(137-159)webhook_server/libs/check_run_handler.py (2)
set_verify_check_queued(55-56)set_verify_check_success(58-59)webhook_server/libs/runner_handler.py (2)
cherry_pick(457-520)_is_user_valid_to_run_commands(396-421)
|
/test build-container |
There was a problem hiding this comment.
Actionable comments posted: 1
🔭 Outside diff range comments (2)
webhook_server/libs/pull_request_handler.py (2)
268-276: 🛠️ Refactor suggestionBlocking
time.sleepinside async-driven flow stalls the event-loop
label_all_opened_pull_requests_merge_state_after_mergedis invoked from the async
process_pull_request_webhook_data, yet it performstime.sleep(30), blocking the
entire asyncio loop for half a minute.Convert the function to async and use
await asyncio.sleep()(or schedule it in a
separate executor) to keep the server responsive.
469-484:⚠️ Potential issue
_process_verified_for_update_or_new_pull_requeststill referencesself.pull_requestSimilar to the previous issue, using the removed attribute will break the
verified-label logic.- self.labels_handler._add_label(pull_request=self.pull_request, label=VERIFIED_LABEL_STR) + self.labels_handler._add_label(pull_request=pull_request, label=VERIFIED_LABEL_STR)Search & replace all remnants of
self.pull_requestto avoid future regressions.
♻️ Duplicate comments (2)
webhook_server/libs/pull_request_handler.py (2)
348-357:⚠️ Potential issue
assign_reviewers()is invoked without the mandatorypull_requestparameter
GithubWebhook.assign_reviewerswas refactored to acceptpull_request: PullRequest, yet the thread-pool submission still calls it with no arguments.
At runtime this will raiseTypeError: assign_reviewers() missing 1 required positional argument.-prepare_pull_futures.append(executor.submit(self.github_webhook.assign_reviewers)) +prepare_pull_futures.append( + executor.submit(self.github_webhook.assign_reviewers, pull_request) +)Without this fix, all opened / synchronised PRs will crash during initialisation and none of the remaining futures will execute.
453-468:⚠️ Potential issueIncorrect use of
self.pull_requestinsidelabel_pull_request_by_merge_stateThe method correctly receives
pull_request, but still forwardsself.pull_request
to theLabelsHandler. In most call-sitesself.pull_requestis unset, so every
label operation will crash.- if merge_state == "behind": - self.labels_handler._add_label(pull_request=self.pull_request, label=NEEDS_REBASE_LABEL_STR) + if merge_state == "behind": + self.labels_handler._add_label(pull_request=pull_request, label=NEEDS_REBASE_LABEL_STR) ... - self.labels_handler._remove_label(pull_request=self.pull_request, label=NEEDS_REBASE_LABEL_STR) + self.labels_handler._remove_label(pull_request=pull_request, label=NEEDS_REBASE_LABEL_STR) ... - self.labels_handler._add_label(pull_request=self.pull_request, label=HAS_CONFLICTS_LABEL_STR) + self.labels_handler._add_label(pull_request=pull_request, label=HAS_CONFLICTS_LABEL_STR) ... - self.labels_handler._remove_label(pull_request=self.pull_request, label=HAS_CONFLICTS_LABEL_STR) + self.labels_handler._remove_label(pull_request=pull_request, label=HAS_CONFLICTS_LABEL_STR)Apply the same fix anywhere else
self.pull_requestsneaks back in.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
webhook_server/libs/pull_request_handler.py(17 hunks)webhook_server/libs/runner_handler.py(25 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- webhook_server/libs/runner_handler.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: python-module-install
- GitHub Check: tox
- GitHub Check: pre-commit
|
/retest build-container |
|
/retest build-container |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
webhook_server/libs/runner_handler.py (4)
134-137: Possible unnecessary checkYou're checking for
pull_request and successbut it appears that this code would only be reached if the previous code at line 125 successfully setpull_request, making this check redundant.- if pull_request and success: + if success: # pull_request is already verified to be truthy in the parent block
38-39: Consider adding type annotations for the return valueThe function returns a tuple, but the type annotation for the generator doesn't match what is actually returned from the
yieldstatement.Update the type annotation to match the actual return value:
def _prepare_cloned_repo_dir( self, clone_repo_dir: str, pull_request: PullRequest | None = None, is_merged: bool = False, checkout: str = "", tag_name: str = "", -) -> Generator[tuple[bool, Any, Any], None, None]: +) -> Generator[tuple[bool, str, str], None, None]:
247-252: Simplify the conditional checkThe conditional statement can be simplified for better readability.
- if ( - reviewed_user - and pull_request - and not self._is_user_valid_to_run_commands(reviewed_user=reviewed_user, pull_request=pull_request) - ): + if reviewed_user and pull_request and not self._is_user_valid_to_run_commands(reviewed_user=reviewed_user, pull_request=pull_request):
168-171: Consider making the pull_request parameter non-optionalSince the method seems to require a PullRequest object to function correctly (it's used to determine
base.refon line 180), consider making it a required parameter without a default value.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
webhook_server/libs/github_api.py(12 hunks)webhook_server/libs/issue_comment_handler.py(6 hunks)webhook_server/libs/pull_request_handler.py(16 hunks)webhook_server/libs/push_handler.py(2 hunks)webhook_server/libs/runner_handler.py(25 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- webhook_server/libs/push_handler.py
- webhook_server/libs/issue_comment_handler.py
- webhook_server/libs/github_api.py
- webhook_server/libs/pull_request_handler.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: tox
- GitHub Check: pre-commit
- GitHub Check: python-module-install
- GitHub Check: can-be-merged
🔇 Additional comments (4)
webhook_server/libs/runner_handler.py (4)
10-10: Good import additionAdding the explicit import of
PullRequestis a good choice when updating the code to use the type as a parameter.
234-243: Consistent typing for pull_request parameterWhile most methods use
pull_request: PullRequestas their parameter type, this method usespull_request: PullRequest | None = None. Ensure this inconsistency is intentional based on the function's requirements.This appears to be intentionally different since there are checks for
pull_requestbeing None throughout the function, and it can be called without a pull request in certain contexts.
262-264: Good parameter forwardingCorrectly updating method calls to forward the pull_request parameter rather than relying on an instance attribute. This is a good refactoring that makes data flow more explicit.
1-522: Overall refactoring direction looks goodThe changes consistently move away from using
self.pull_requestas an instance attribute to requiring an explicitPullRequestparameter. This makes the code more maintainable by:
- Making dependencies explicit
- Avoiding hidden state
- Improving testability
This aligns with good software engineering practices.
|
/retest build-container |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
webhook_server/tests/test_add_reviewer_action.py (1)
19-28: Consider removing redundant PullRequest classThis file defines its own PullRequest class (lines 19-28) but also uses the imported fixture. Since the code now uses the fixture from conftest.py, this local class definition could be removed to avoid confusion.
-class PullRequest: - def __init__(self): - pass - - def create_issue_comment(self, _): - return - - def create_review_request(self, _): - return
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
webhook_server/libs/github_api.py(12 hunks)webhook_server/libs/issue_comment_handler.py(6 hunks)webhook_server/libs/pull_request_handler.py(16 hunks)webhook_server/libs/runner_handler.py(24 hunks)webhook_server/tests/conftest.py(2 hunks)webhook_server/tests/test_add_reviewer_action.py(1 hunks)webhook_server/tests/test_pull_request_owners.py(1 hunks)webhook_server/tests/test_pull_request_size.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- webhook_server/libs/github_api.py
- webhook_server/libs/runner_handler.py
- webhook_server/libs/pull_request_handler.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
webhook_server/tests/test_pull_request_size.py (2)
webhook_server/tests/conftest.py (2)
PullRequest(103-118)pull_request(122-123)webhook_server/libs/labels_handler.py (1)
get_size(100-114)
webhook_server/tests/conftest.py (1)
webhook_server/tests/test_add_reviewer_action.py (3)
PullRequest(19-27)create_issue_comment(23-24)create_review_request(26-27)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: tox
- GitHub Check: build-container
- GitHub Check: python-module-install
- GitHub Check: pre-commit
🔇 Additional comments (25)
webhook_server/tests/conftest.py (2)
98-118: Good implementation of Label class and PullRequest enhancementsThe addition of the Label class and enhancements to the PullRequest class properly support the refactoring to explicit parameter passing. This centralized test fixture approach improves test consistency across the codebase.
121-123: Good implementation of pull_request fixtureThe pull_request fixture provides a clean way to create and share PullRequest instances across tests, which aligns well with the refactoring to explicit parameter passing throughout the codebase.
webhook_server/tests/test_pull_request_owners.py (1)
106-110: Good adaptation of test signature and method callThe test has been properly updated to include the pull_request parameter and pass it explicitly to the get_all_repository_approvers_and_reviewers method, which aligns with the codebase refactoring to explicit parameter passing.
webhook_server/tests/test_pull_request_size.py (2)
4-4: Good import of shared PullRequest classImporting the PullRequest class from conftest.py eliminates duplicate code and promotes consistency across tests.
21-23: Good implementation of explicit pull_request usageThe test now correctly creates a PullRequest instance with specific additions and deletions and passes it explicitly to the get_size method, aligning with the refactoring approach across the codebase.
webhook_server/tests/test_add_reviewer_action.py (2)
30-35: Good adaptation of test and method callThe test function signature and method call have been properly updated to include and pass the pull_request parameter explicitly, which aligns with the codebase refactoring.
39-44: Good adaptation of negative test caseThe negative test case has also been properly updated to include and pass the pull_request parameter explicitly, maintaining consistency with the refactoring approach.
webhook_server/libs/issue_comment_handler.py (18)
6-7: Import added for PullRequest type.This is a good addition for properly typing the parameter in the refactored methods.
45-46: Converted to async function with explicit PullRequest parameter.This change aligns with the PR objective to use more async code, while also making the pull request dependency explicit rather than relying on instance state.
55-55: Now references pull_request parameter.Updated to use the explicit parameter instead of an implied instance attribute.
62-64: Using await with explicit parameter passing.Correctly awaits the now async method and passes the pull_request parameter explicitly.
69-71: Method converted to async with explicit PullRequest parameter.Good refactoring to make the pull request dependency explicit and support asynchronous operations.
101-101: Now calls method directly on pull_request parameter.This properly uses the explicitly passed parameter instead of an instance attribute.
104-106: Updated create_comment_reaction call with explicit parameter.This change maintains consistency with the new pattern of explicit parameter passing.
109-109: Updated method calls with explicit pull_request parameter.All method calls consistently pass the pull_request parameter explicitly, maintaining the new pattern throughout the code.
Also applies to: 112-113, 115-116, 118-119, 121-122
125-131: Updated runner_handler.run_build_container with pull_request parameter.This maintains consistency with the pattern of explicit parameter passing.
138-146: Updated label handling with explicit parameter passing.All calls to label handlers and pull request edit methods now use the explicitly passed parameter, maintaining consistent code style.
153-159: Updated label operations with explicit parameter passing and await.All label operations now use the explicit parameter, and the check_if_can_be_merged call correctly uses await since it's now an async method.
176-178: Updated create_comment_reaction method signature.This method now properly accepts an explicit pull_request parameter instead of using an instance attribute.
180-192: Updated _add_reviewer_by_user_comment with explicit parameter.The method now accepts and uses the PullRequest parameter directly for all operations.
193-230: Updated process_cherry_pick_command with explicit parameter.The method correctly uses the passed pull_request parameter for all operations, maintaining consistency with the new pattern.
231-234: Updated process_retest_command with runner validation using explicit parameter.The method now passes the pull_request parameter to the user validation check.
240-246: Mapping to handler functions maintained correctly.All the handler functions referenced in the mapping are likely updated in their respective files to accept a pull_request parameter as indicated by the PR summary.
281-285: Updated thread pool execution with pull_request parameter.The executor now correctly passes the pull_request parameter to the mapped functions when submitting them for execution.
45-289: Overall async implementation is well-executed.The changes consistently implement the PR objective of using more async code. Methods that interact with external services or perform potentially blocking operations have been properly converted to async, with await keywords added at all call sites.
The refactoring to pass PullRequest explicitly is also thorough and consistently applied throughout the file, improving code clarity by making dependencies explicit rather than relying on instance state.
|
/lgtm |
4 similar comments
|
/lgtm |
|
/lgtm |
|
/lgtm |
|
/lgtm |
There was a problem hiding this comment.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
webhook_server/tests/test_pull_request_owners.py (1)
212-226: 🛠️ Refactor suggestionMissing setting of parent_committer attribute.
In this test function,
process_github_webhook.parent_committeris not set, unlike in thetest_check_pr_approvedfunction. This inconsistency could lead to test failures if the code depends on this attribute.pull_request_handler = PullRequestHandler( github_webhook=process_github_webhook, owners_file_handler=owners_file_handler, ) + process_github_webhook.parent_committer = "test" check_if_pr_approved = await pull_request_handler._check_if_pr_approved(
♻️ Duplicate comments (8)
webhook_server/app.py (1)
195-195:⚠️ Potential issuePotential KeyError when repository key is missing
This line will raise a KeyError if GitHub sends an event without a repository field (e.g., ping or org hooks).
- logger = get_logger_with_params(name="main", repository_name=hook_data["repository"]["name"]) + repository_name = hook_data.get("repository", {}).get("name", "unknown-repo") + logger = get_logger_with_params(name="main", repository_name=repository_name)webhook_server/tests/conftest.py (1)
103-107: 🛠️ Refactor suggestionMethods should be marked as async to match production code.
These stub methods (
create_issue_comment,create_review_request,get_files) should be marked as async to properly mimic the asynchronous nature of the refactored production code. This will prevent potential test failures when these methods are awaited.- def create_issue_comment(self, *args, **kwargs): ... + async def create_issue_comment(self, *args, **kwargs): ... - def create_review_request(self, *args, **kwargs): ... + async def create_review_request(self, *args, **kwargs): ... - def get_files(self): ... + async def get_files(self): ...webhook_server/libs/owners_files_handler.py (1)
212-261:⚠️ Potential issuePotential issue with awaiting
valid_users_to_run_commandscached property multiple times.The method awaits the
valid_users_to_run_commandscached property twice. The first time it's awaited, the result is cached, but subsequent attempts to await it will result in a TypeError since the cached value is a list, not an awaitable.This can lead to runtime errors. Fix by saving the result after the first await:
- if reviewed_user not in await self.valid_users_to_run_commands: + valid_users = await self.valid_users_to_run_commands + if reviewed_user not in valid_users: ... - self.logger.debug(f"{self.log_prefix} {reviewed_user} is not in {await self.valid_users_to_run_commands}") + self.logger.debug(f"{self.log_prefix} {reviewed_user} is not in {valid_users}")webhook_server/libs/pull_request_handler.py (5)
74-76:⚠️ Potential issueFix error logging to show the actual failed task.
When logging exceptions from async tasks, you're printing the entire
resultslist instead of the individualresultthat failed. This makes debugging difficult.- if isinstance(result, Exception): - self.logger.error(f"{self.log_prefix} Async task failed: {results}") + if isinstance(result, Exception): + self.logger.error(f"{self.log_prefix} Async task failed: {result}")
270-270:⚠️ Potential issueMake repository.get_pulls() non-blocking.
The direct call to
repository.get_pulls()in an async method will block the event loop when it performs network I/O operations.- for pull_request in self.repository.get_pulls(state="open"): + for pull_request in await asyncio.to_thread(self.repository.get_pulls, state="open"):
526-526:⚠️ Potential issueRemove incorrect await on synchronous method.
The
pull_request_labels_names()method is synchronous and returns a list, not an awaitable coroutine. Awaiting it will raise a TypeError at runtime.- _labels = await self.labels_handler.pull_request_labels_names(pull_request=pull_request) + _labels = self.labels_handler.pull_request_labels_names(pull_request=pull_request)
336-345: 🛠️ Refactor suggestionAdd error handling for GitHub API calls.
The async GitHub API calls in
close_issue_for_merged_or_closed_prcould fail with no error handling, potentially causing unhandled exceptions.async def close_issue_for_merged_or_closed_pr(self, pull_request: PullRequest, hook_action: str) -> None: - for issue in await asyncio.to_thread(self.repository.get_issues): - if issue.body == self._generate_issue_body(pull_request=pull_request): - self.logger.info(f"{self.log_prefix} Closing issue {issue.title} for PR: {pull_request.title}") - await asyncio.to_thread( - issue.create_comment, - f"{self.log_prefix} Closing issue for PR: {pull_request.title}.\nPR was {hook_action}.", - ) - await asyncio.to_thread(issue.edit, state="closed") - - break + try: + for issue in await asyncio.to_thread(self.repository.get_issues): + if issue.body == self._generate_issue_body(pull_request=pull_request): + self.logger.info(f"{self.log_prefix} Closing issue {issue.title} for PR: {pull_request.title}") + try: + await asyncio.to_thread( + issue.create_comment, + f"{self.log_prefix} Closing issue for PR: {pull_request.title}.\nPR was {hook_action}.", + ) + await asyncio.to_thread(issue.edit, state="closed") + break + except Exception as e: + self.logger.error(f"{self.log_prefix} Failed to close issue: {e}") + except Exception as e: + self.logger.error(f"{self.log_prefix} Failed to get issues: {e}")
487-497:⚠️ Potential issueMake add_to_assignees non-blocking in async method.
The direct calls to
pull_request.add_to_assignees()will block the event loop when they perform network I/O operations.async def add_pull_request_owner_as_assingee(self, pull_request: PullRequest) -> None: try: self.logger.info(f"{self.log_prefix} Adding PR owner as assignee") - pull_request.add_to_assignees(pull_request.user.login) + await asyncio.to_thread(pull_request.add_to_assignees, pull_request.user.login) except Exception as exp: self.logger.debug(f"{self.log_prefix} Exception while adding PR owner as assignee: {exp}") if self.github_webhook.root_approvers: self.logger.debug(f"{self.log_prefix} Falling back to first approver as assignee") - pull_request.add_to_assignees(self.github_webhook.root_approvers[0]) + await asyncio.to_thread(pull_request.add_to_assignees, self.github_webhook.root_approvers[0])
🧹 Nitpick comments (7)
webhook_server/libs/owners_files_handler.py (2)
21-32: Method name has a typo and lacks proper initialization safeguards.The method name is spelled correctly as "initialize" but lacks safeguards to enforce initialization before use of related attributes. While there is an
_ensure_initializedmethod, a more robust design would also include a factory method.Consider adding an async factory method to ensure initialization is always performed:
@classmethod async def create(cls, github_webhook: Any, pull_request: PullRequest) -> "OwnersFileHandler": """Create and initialize an OwnersFileHandler instance.""" instance = cls(github_webhook=github_webhook) await instance.initialize(pull_request=pull_request) return instanceThis would allow callers to use a simpler pattern:
owners_handler = await OwnersFileHandler.create(github_webhook=webhook, pull_request=pr)
173-211: Complex logic for determining which OWNERS files apply.The
owners_data_for_changed_filesmethod has complex logic to determine which OWNERS files apply to changed folders. Consider adding more detailed comments to explain the logic, especially around therequire_root_approversflag and the nested conditions.Add more detailed comments explaining the logic:
async def owners_data_for_changed_files(self) -> dict[str, dict[str, Any]]: self._ensure_initialized() data: dict[str, dict[str, Any]] = {} + # Extract all parent directories from changed files changed_folders = {Path(cf).parent for cf in self.changed_files} changed_folder_match: list[Path] = [] require_root_approvers: bool | None = None + # First pass: find all OWNERS files that are in or parent to changed folders for owners_dir, owners_data in self.all_repository_approvers_and_reviewers.items(): if owners_dir == ".": continuewebhook_server/tests/test_pull_request_owners.py (1)
135-158: Misspelled variable name in assertion.The variable name
owners_data_chaged_files_expectedis misspelled (should be "changed" not "chaged"). This doesn't affect functionality but makes the code less readable.- owners_data_chaged_files_expected = { + owners_data_changed_files_expected = { "folder5": { "approvers": ["folder5_approver1", "folder5_approver2"], "reviewers": ["folder5_reviewer1", "folder5_reviewer2"], "root-approvers": False, }, # ... other folders } - assert owners_data_changed_files_result == owners_data_chaged_files_expected + assert owners_data_changed_files_result == owners_data_changed_files_expectedwebhook_server/libs/pull_request_handler.py (1)
351-358: Simplify task append by avoiding double unpacking.The current approach of using
**to pass a dictionary of arguments makes the code less readable and error-prone. Direct argument passing is clearer.- tasks.append( - self.labels_handler._add_label( - **{ - "pull_request": pull_request, - "label": f"{BRANCH_LABEL_PREFIX}{pull_request.base.ref}", - }, - ) - ) + tasks.append( + self.labels_handler._add_label( + pull_request=pull_request, + label=f"{BRANCH_LABEL_PREFIX}{pull_request.base.ref}" + ) + )webhook_server/libs/runner_handler.py (3)
150-158: Consider making these utility methods staticThe utility methods is_podman_bug and fix_podman_bug don't use instance variables and could be made static.
- def is_podman_bug(self, err: str) -> bool: + @staticmethod + def is_podman_bug(err: str) -> bool: _err = "Error: current system boot ID differs from cached boot ID; an unhandled reboot has occurred" return _err in err.strip() - def fix_podman_bug(self) -> None: + @staticmethod + def fix_podman_bug() -> None: - self.logger.debug(f"{self.log_prefix} Fixing podman bug") + # Consider passing the logger or using a class or module-level logger + print("Fixing podman bug") shutil.rmtree("/tmp/storage-run-1000/containers", ignore_errors=True) shutil.rmtree("/tmp/storage-run-1000/libpod/tmp", ignore_errors=True)
165-167: Update self reference in is_podman_bug callIf you implement the suggested staticmethod change, update this call accordingly.
- if self.is_podman_bug(err=err): + if RunnerHandler.is_podman_bug(err=err): - self.fix_podman_bug() + RunnerHandler.fix_podman_bug()
286-290: Use keyword arguments for better readabilityFor consistency with other calls in the codebase, consider explicitly naming all parameters for better readability.
- async with self._prepare_cloned_repo_dir( - pull_request=pull_request, - is_merged=is_merged, - tag_name=tag, - clone_repo_dir=clone_repo_dir, - ) as _res: + async with self._prepare_cloned_repo_dir( + clone_repo_dir=clone_repo_dir, + pull_request=pull_request, + is_merged=is_merged, + tag_name=tag, + ) as _res:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (27)
.gitignore(1 hunks)Dockerfile(2 hunks)entrypoint.py(2 hunks)entrypoint.sh(0 hunks)pyproject.toml(2 hunks)tox.toml(1 hunks)webhook_server/app.py(5 hunks)webhook_server/libs/check_run_handler.py(6 hunks)webhook_server/libs/config.py(1 hunks)webhook_server/libs/exceptions.py(1 hunks)webhook_server/libs/github_api.py(8 hunks)webhook_server/libs/issue_comment_handler.py(5 hunks)webhook_server/libs/labels_handler.py(7 hunks)webhook_server/libs/owners_files_handler.py(1 hunks)webhook_server/libs/pull_request_handler.py(17 hunks)webhook_server/libs/pull_request_review_handler.py(2 hunks)webhook_server/libs/push_handler.py(3 hunks)webhook_server/libs/runner_handler.py(14 hunks)webhook_server/tests/conftest.py(4 hunks)webhook_server/tests/test_add_reviewer_action.py(2 hunks)webhook_server/tests/test_prepare_retest_wellcome_comment.py(1 hunks)webhook_server/tests/test_pull_request_owners.py(28 hunks)webhook_server/tests/test_pull_request_size.py(2 hunks)webhook_server/utils/github_repository_and_webhook_settings.py(2 hunks)webhook_server/utils/github_repository_settings.py(4 hunks)webhook_server/utils/helpers.py(8 hunks)webhook_server/utils/webhook.py(1 hunks)
💤 Files with no reviewable changes (1)
- entrypoint.sh
✅ Files skipped from review due to trivial changes (1)
- .gitignore
🚧 Files skipped from review as they are similar to previous changes (15)
- entrypoint.py
- webhook_server/tests/test_pull_request_size.py
- webhook_server/libs/exceptions.py
- pyproject.toml
- tox.toml
- Dockerfile
- webhook_server/tests/test_prepare_retest_wellcome_comment.py
- webhook_server/libs/pull_request_review_handler.py
- webhook_server/utils/github_repository_and_webhook_settings.py
- webhook_server/tests/test_add_reviewer_action.py
- webhook_server/libs/push_handler.py
- webhook_server/utils/github_repository_settings.py
- webhook_server/libs/issue_comment_handler.py
- webhook_server/libs/labels_handler.py
- webhook_server/libs/check_run_handler.py
🧰 Additional context used
🧬 Code Graph Analysis (6)
webhook_server/utils/webhook.py (1)
webhook_server/utils/helpers.py (1)
get_github_repo_api(73-77)
webhook_server/libs/config.py (1)
webhook_server/utils/helpers.py (1)
get_github_repo_api(73-77)
webhook_server/app.py (4)
webhook_server/libs/exceptions.py (1)
RepositoryNotFoundError(1-2)webhook_server/libs/github_api.py (1)
GithubWebhook(45-374)webhook_server/utils/helpers.py (1)
get_logger_with_params(22-56)webhook_server/libs/config.py (2)
Config(11-87)root_data(33-39)
webhook_server/utils/helpers.py (2)
webhook_server/libs/exceptions.py (1)
NoApiTokenError(11-12)webhook_server/libs/config.py (2)
Config(11-87)get_value(67-87)
webhook_server/tests/test_pull_request_owners.py (3)
webhook_server/tests/conftest.py (6)
get_contents(50-87)owners_file_handler(140-141)process_github_webhook(135-136)pull_request(111-112)Repository(42-87)github_webhook(116-131)webhook_server/libs/owners_files_handler.py (4)
get_all_repository_approvers_and_reviewers(87-125)owners_data_for_changed_files(173-210)get_all_pull_request_approvers(151-160)get_all_pull_request_reviewers(162-171)webhook_server/libs/pull_request_handler.py (1)
_check_if_pr_approved(583-647)
webhook_server/libs/pull_request_handler.py (7)
webhook_server/tests/test_add_reviewer_action.py (2)
PullRequest(21-29)create_issue_comment(25-26)webhook_server/tests/conftest.py (6)
PullRequest(95-107)github_webhook(116-131)owners_file_handler(140-141)pull_request(111-112)create_issue_comment(103-103)base(100-101)webhook_server/libs/check_run_handler.py (13)
set_verify_check_success(58-59)set_verify_check_queued(55-56)set_merge_check_queued(91-92)set_run_tox_check_queued(61-65)set_run_pre_commit_check_queued(76-80)set_python_module_install_queued(118-122)set_container_build_queued(103-107)set_conventional_title_queued(137-138)set_merge_check_in_progress(94-95)required_check_in_progress(294-312)required_check_failed_or_no_status(224-258)set_merge_check_success(97-98)set_merge_check_failure(100-101)webhook_server/libs/labels_handler.py (4)
_add_label(56-87)_remove_label(43-54)add_size_label(116-135)pull_request_labels_names(39-41)webhook_server/libs/owners_files_handler.py (5)
OwnersFileHandler(14-301)assign_reviewers(212-230)root_approvers(46-51)root_reviewers(38-43)owners_data_for_changed_files(173-210)webhook_server/libs/runner_handler.py (7)
cherry_pick(406-472)run_build_container(237-345)run_podman_command(159-169)run_tox(171-206)run_pre_commit(208-235)run_install_python_module(347-380)run_conventional_title_check(382-401)webhook_server/libs/github_api.py (1)
container_repository_and_tag(319-341)
⏰ Context from checks skipped due to timeout of 90000ms (9)
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: can-be-merged
- GitHub Check: python-module-install
- GitHub Check: pre-commit
- GitHub Check: build-container
- GitHub Check: tox
🔇 Additional comments (42)
webhook_server/utils/webhook.py (1)
30-30: Parameter name update is consistent with function definitionThe parameter name change from
github_apitogithub_app_apialigns with the function signature inwebhook_server/utils/helpers.py. This ensures consistency across the codebase.webhook_server/libs/config.py (1)
50-50: Parameter name update is consistent with function definitionThe parameter name change from
github_apitogithub_app_apialigns with the function signature inwebhook_server/utils/helpers.py. This ensures consistency across the codebase.webhook_server/app.py (5)
4-11: Added JSON import and simplified exceptions importThe switch to synchronous
json.loads()for parsing request bodies is appropriate and reflects the removal ofNoPullRequestErrorfrom the exceptions module.
29-37: Improved logger and HTTP client initializationGood change to use a module-level logger for consistency and to initialize the HTTP client in the lifespan context manager rather than at import time. This addresses cross-event-loop issues identified in previous reviews.
39-73: New async functions for fetching IP allowlists with proper error handlingThe implementation of dedicated async functions for fetching GitHub and Cloudflare IP allowlists is well done. These functions now properly raise exceptions instead of silently returning empty lists, which addresses a previous review concern about failing fast when allowlists can't be fetched.
112-161: Improved lifespan context manager with proper resource managementThe lifespan context manager now correctly initializes the HTTP client, performs async operations, and uses a finally block to ensure proper cleanup. The IP allowlist validation has been improved to handle invalid CIDRs individually.
However, there's still a potential issue with the ALLOWED_IPS handling:
if networks: ALLOWED_IPS = tuple(networks) LOGGER.info(f"IP allowlist initialized successfully with {len(ALLOWED_IPS)} networks.") +else: + ALLOWED_IPS = () # Clear any previous value + LOGGER.info("IP allowlist disabled or empty – all sources accepted.")Without this change, if no networks are loaded (because verification flags are disabled or both endpoints fail), ALLOWED_IPS might retain a stale value from a previous run, as mentioned in past reviews.
189-191: Switch to synchronous JSON parsingUsing
json.loads()on the raw bytes is more direct thanawait request.json(), which is good. Make sure the payload is properly decoded if necessary.webhook_server/utils/helpers.py (4)
22-56: Improved logger configuration with better defaultsGood improvements to the logging setup:
- Setting
mask_sensitivetoTrueby default enhances security- Creating the logs directory if it doesn't exist ensures logging won't fail
- Reducing the max log file size from 50MB to 10MB is more manageable
73-77: Enhanced repository API access with debug loggingAdded debug logging improves observability when accessing repository APIs. The parameter name change also provides consistency with usage elsewhere in the codebase.
80-135: Converted run_command to async implementationSuccessfully converted the command execution from synchronous to asynchronous using
asyncio.create_subprocess_exec. This is a significant improvement that will prevent blocking I/O operations from affecting server performance.
147-200: Enhanced API token validation and error handlingExcellent improvements to the GitHub API token validation:
- Skipping APIs with rate limit of 60 (indicating invalid tokens)
- Better error handling when retrieving user logins
- Raising a proper exception when no valid API/token combination is found
webhook_server/tests/conftest.py (5)
8-10: Import order matters for environment variable setup.Setting the environment variable
WEBHOOK_SERVER_DATA_DIRbefore importingGithubWebhookis important because it ensures that the environment variable is available when theGithubWebhookclass is initialized. Good implementation.
67-73: Good implementation of parameterizable test fixtures.The
changed_filesfixture nicely handles both parameterized test cases and default values, providing flexibility for various test scenarios.
129-131: Add wait forasyncio.to_threadcall.The test needs to properly await the asynchronous GitHub API call. The current implementation correctly awaits the result.
90-93: Great implementation of test data class.The
Labelclass is properly implemented with a name attribute to simulate GitHub's API objects in tests.
115-132: Nice separation of fixtures for better test modularity.The separation of
github_webhookinto multiple fixtures (process_github_webhookandowners_file_handler) improves test modularity and follows good testing practices.webhook_server/libs/owners_files_handler.py (4)
33-36: Good validation of initialization requirement.The
_ensure_initializedmethod properly checks for required attributes and raises a clear error message when initialization hasn't been performed.
87-126: Efficient async implementation of OWNERS file retrieval.The implementation efficiently creates tasks for fetching OWNERS file content asynchronously and uses
asyncio.gatherto run them in parallel, which is a good practice. The method also includes proper error handling and validation.
262-275: Good use of asyncio.functools.cached_property for expensive operations.Using
functools.cached_propertyfor expensive operations like retrieving users with permission to run commands is a good practice, as it prevents redundant API calls.
53-55: Properly wrapping synchronous code with asyncio.to_thread.The
list_changed_filesmethod correctly wraps the synchronouspull_request.get_filescall withasyncio.to_thread, ensuring it doesn't block the event loop.webhook_server/tests/test_pull_request_owners.py (3)
125-132: Good conversion to async test pattern.The test has been properly converted to use async/await pattern with the
@pytest.mark.asynciodecorator. It correctly awaits the asynchronousget_all_repository_approvers_and_reviewersmethod.
8-16: Good approach to defining test data.The
ALL_CHANGED_FILESlist is a clear and concise way to define test data for multiple test scenarios.
96-123: Properly setting up test data for reviewers and approvers.The fixtures for reviewers and approvers are well-structured and provide comprehensive test data.
webhook_server/libs/github_api.py (5)
108-126: Good asynchronous handling of pull request processing flow.The method properly checks if the pull request is a draft and only proceeds with processing for non-draft PRs. It also correctly awaits asynchronous operations like getting the last commit.
131-165: Consistent initialization of OwnersFileHandler.Each event handler branch correctly initializes the OwnersFileHandler and awaits its initialization before passing it to the respective handler. This ensures consistent behavior across different event types.
279-295: Properly wrapping GitHub API calls with asyncio.to_thread.All GitHub API calls are correctly wrapped with
asyncio.to_threadto prevent blocking the event loop. This is a significant improvement for the server's scalability and responsiveness.
306-309: Good conversion of _get_last_commit to async.The method is properly converted to async and correctly wraps the synchronous
pull_request.get_commitscall withasyncio.to_thread.
319-342: Function now handles missing pull_request parameter.The
container_repository_and_tagmethod has been improved to handle the case whenpull_requestis None, returning None early instead of causing an error.webhook_server/libs/pull_request_handler.py (6)
25-51: Effective dependency injection with OwnersFileHandler.The refactoring to use
OwnersFileHandleras an explicit dependency is a good approach. This improves testability and makes dependencies clearer.
53-157: Good async conversion of the main webhook handler.Converting
process_pull_request_webhook_datato async with proper awaiting of dependent tasks is well-implemented. Usingasyncio.gatherwithreturn_exceptions=Trueis the right pattern for concurrent task execution with error handling.
259-273: Good conversion from time.sleep to asyncio.sleep.Correctly using
asyncio.sleepinstead oftime.sleepprevents blocking the event loop during the waiting period, allowing other coroutines to run.
313-315: Properly non-blocking GitHub API call.Good use of
asyncio.to_threadto wrap the synchronouscreate_issue_commentmethod, preventing it from blocking the event loop.
419-423: Properly non-blocking auto-merge enabling.Good use of
asyncio.to_threadto wrap the synchronousenable_automergemethod, preventing it from blocking the event loop.
347-383: Excellent async task orchestration.The
process_opened_or_synchronize_pull_requestmethod is well-refactored to useasyncio.gatherfor concurrent execution of multiple tasks, with proper error handling for individual tasks.webhook_server/libs/runner_handler.py (8)
1-4: Great job on the async conversion!The imports are correctly updated to include asyncio and the proper typing for async generators, which aligns well with the PR objective to "Use more async code".
25-27: The constructor now properly accepts OwnersFileHandlerGood implementation of dependency injection for the OwnersFileHandler, making testing easier and decoupling components.
35-43: Correctly converted to async context managerThe function has been properly converted to an async context manager with the correct return type (AsyncGenerator). This is a good pattern for resource management in async code.
128-144: Fixed walrus operator usageYou've correctly addressed a previous review comment by using
_pull_requestinstead ofpull_requestfor the walrus operator, avoiding parameter shadowing.
159-170: Properly converted run_podman_command to asyncThe run_podman_command method has been correctly converted to async, including proper awaiting of the run_command function and recursive call.
250-258: Nice addition of async user validationGood implementation of async user validation through the OwnersFileHandler to determine if a user is allowed to run commands.
313-346: Good use of asyncio.to_thread for GitHub API callsYou've correctly wrapped the GitHub API call to create_issue_comment with asyncio.to_thread to prevent blocking the event loop. This addresses a previous review comment.
403-404: Well-designed async branch existence checkThe new is_branch_exists method properly uses asyncio.to_thread to wrap the synchronous GitHub API call, maintaining the async flow throughout the code.
| apis_and_tokens = get_apis_and_tokes_from_config(config=self.config) | ||
| self.auto_verified_and_merged_users.extend([_api[0].get_user().login for _api in apis_and_tokens]) | ||
| for _api, _ in apis_and_tokens: | ||
| if _api.rate_limiting[-1] == 60: | ||
| self.logger.warning( | ||
| f"{self.log_prefix} API has rate limit set to 60 which indicates an invalid token, skipping" | ||
| ) | ||
| continue | ||
|
|
||
| self.auto_verified_and_merged_users.append(_api.get_user().login) | ||
|
|
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Check for valid API token is incomplete.
The method checks if the API rate limit is 60 (indicating an invalid token) but doesn't validate the token in other ways. Consider adding more robust token validation.
🏁 Script executed:
#!/bin/bash
# Check if there are other validation methods for GitHub API tokens
rg -A 3 -B 3 "rate_limiting" --type pyLength of output: 1179
Add robust GitHub API token validation
The current check only skips tokens whose rate_limiting[-1] is 60, but there are other failure modes (expired credentials, insufficient scopes, network issues). We should explicitly validate each token by making a lightweight API call and handling authentication errors.
Suggested changes:
- In
webhook_server/libs/github_api.py(and similarly inwebhook_server/utils/helpers.py), wrap the token check in atry/exceptblock:- Call
_api.get_user()(or the/rate_limitendpoint) inside thetry. - Catch authentication exceptions (e.g.
github.GithubException.BadCredentialsException) and log a warning. - Only append
loginon successful validation.
- Call
- Example diff for
add_api_users_to_auto_verified_and_merged_users:for _api, _ in apis_and_tokens:
-
if _api.rate_limiting[-1] == 60: -
self.logger.warning( -
f"{self.log_prefix} API has rate limit set to 60 which indicates an invalid token, skipping" -
) -
continue -
self.auto_verified_and_merged_users.append(_api.get_user().login)
-
try: -
user = _api.get_user() # will raise if token is bad -
except github.GithubException as e: -
self.logger.warning( -
f"{self.log_prefix} Invalid or expired token ({e.status}), skipping" -
) -
continue -
self.auto_verified_and_merged_users.append(user.login)
- Optionally, perform a scopes check via
user.raw_data.get("scopes")or a/rate_limitcall to ensure the token has the required permissions before use.
These changes will catch invalid, expired, or under-scoped tokens more reliably.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| apis_and_tokens = get_apis_and_tokes_from_config(config=self.config) | |
| self.auto_verified_and_merged_users.extend([_api[0].get_user().login for _api in apis_and_tokens]) | |
| for _api, _ in apis_and_tokens: | |
| if _api.rate_limiting[-1] == 60: | |
| self.logger.warning( | |
| f"{self.log_prefix} API has rate limit set to 60 which indicates an invalid token, skipping" | |
| ) | |
| continue | |
| self.auto_verified_and_merged_users.append(_api.get_user().login) | |
| apis_and_tokens = get_apis_and_tokes_from_config(config=self.config) | |
| for _api, _ in apis_and_tokens: | |
| try: | |
| user = _api.get_user() # will raise if token is bad | |
| except github.GithubException as e: | |
| self.logger.warning( | |
| f"{self.log_prefix} Invalid or expired token ({e.status}), skipping" | |
| ) | |
| continue | |
| self.auto_verified_and_merged_users.append(user.login) |
🤖 Prompt for AI Agents
In webhook_server/libs/github_api.py around lines 168 to 177, the current token
validation only checks if the API rate limit is 60, which is insufficient. To
fix this, wrap the call to _api.get_user() in a try/except block that catches
authentication exceptions like github.GithubException.BadCredentialsException.
Log a warning when such exceptions occur and skip appending the user login for
invalid tokens. Optionally, add a scopes check by inspecting user.raw_data or
calling the /rate_limit endpoint to ensure the token has the necessary
permissions before appending the login.
|
/approve |
|
/verified |
|
New container for ghcr.io/myk-org/github-webhook-server:latest published |
Summary by CodeRabbit