Skip to content

Comments

Use more async code#778

Merged
myakove merged 2 commits intomainfrom
more-async
May 18, 2025
Merged

Use more async code#778
myakove merged 2 commits intomainfrom
more-async

Conversation

@myakove
Copy link
Collaborator

@myakove myakove commented May 12, 2025

Summary by CodeRabbit

  • Improvements
    • Enhanced performance and reliability by switching to asynchronous fetching of allowlists for improved responsiveness.
    • Improved error handling and unified logging for greater consistency and clearer diagnostics.
    • Ensured proper resource cleanup during application startup and shutdown for increased stability.
    • Refactored core handlers to explicitly pass pull request objects, improving clarity and concurrency handling.
    • Converted key webhook processing methods to asynchronous functions for better scalability.
    • Simplified error flow by removing obsolete exception handling related to pull requests.
    • Updated webhook payload parsing to improve robustness and logging consistency.
    • Made container build process asynchronous for improved efficiency.
    • Added safeguards to ensure log directories exist before logging to prevent runtime issues.
    • Enhanced test fixtures and updated tests to support explicit pull request parameter usage.
    • Replaced synchronous subprocess calls with asynchronous execution for non-blocking operations.
    • Updated configuration and setup scripts to run asynchronously for improved startup performance.
    • Modernized container startup by replacing shell script entrypoint with direct Python invocation.
    • Improved concurrency by replacing thread pools with asyncio tasks for webhook command processing.
    • Reduced maximum log file size and standardized sensitive data masking in logs.
    • Added new dependencies and test configurations to support asynchronous testing.
    • Introduced asynchronous management of OWNERS files and reviewer assignments for pull requests.
    • Streamlined GitHub webhook event processing with explicit pull request handling and removal of legacy OWNERS logic.

@coderabbitai
Copy link

coderabbitai bot commented May 12, 2025

## Walkthrough

The code updates webhook server logic to fetch GitHub and Cloudflare IP allowlists asynchronously using an `httpx.AsyncClient`. It introduces new async functions for allowlist retrieval, updates the application lifespan context manager to handle async client lifecycle, unifies logging via a module-level logger, and modifies webhook payload parsing to use `json.loads` on raw bytes. Additionally, multiple internal classes are refactored to remove implicit `self.pull_request` usage, replacing it with explicit `PullRequest` parameters in methods, converting several methods to async, and removing the `NoPullRequestError` exception class and its handling.

## Changes

| File(s)                                         | Change Summary                                                                                                                                                                                                                                                                                                                                                                                         |
|------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `webhook_server/app.py`                         | Replaces synchronous allowlist HTTP requests with async functions using `httpx.AsyncClient`; updates lifespan context manager for async fetching and client cleanup; unifies logging; changes webhook payload parsing to use `json.loads` on raw body bytes; removes handling of `NoPullRequestError` exception from webhook processing.                                                                                                           |
| `webhook_server/libs/check_run_handler.py`     | Removes `self.pull_request` attribute; methods now require explicit `pull_request` parameter; converts `all_required_status_checks` from cached property to method with `pull_request` argument; updates calls accordingly; converts `process_pull_request_check_run_webhook_data` to async; updates status-setting methods to async using `asyncio.to_thread`.                                                                                                         |
| `webhook_server/libs/exceptions.py`             | Removes `NoPullRequestError` exception class; adds new `NoApiTokenError` exception class.                                                                                                                                                                                                                                                                                                               |
| `webhook_server/libs/github_api.py`             | Refactors `GithubWebhook` to remove instance attributes for pull request and GitHub API; methods now take explicit `pull_request` parameters; `_get_pull_request` renamed and returns `None` instead of raising; `process` method rewritten to async with explicit pull request handling; updates method signatures and internal calls accordingly; removes exception-based control flow for missing pull requests; removes OWNERS-related methods and logic.                                                                                                           |
| `webhook_server/libs/issue_comment_handler.py`  | Refactors `IssueCommentHandler` methods to accept explicit `pull_request` parameter instead of using `self.pull_request`; converts some methods to async; updates all internal calls to use passed `pull_request`; replaces ThreadPoolExecutor concurrency with asyncio tasks and gather for retest commands; constructor now accepts `OwnersFileHandler`.                                                                                                                                                                                     |
| `webhook_server/libs/labels_handler.py`         | Refactors `LabelsHandler` methods to accept explicit `pull_request` parameter instead of using `self.pull_request`; converts label addition/removal methods to async; updates logging to use passed `pull_request`; constructor now accepts `OwnersFileHandler`; no logic changes.                                                                                                                                                                                                                      |
| `webhook_server/libs/pull_request_handler.py`   | Refactors `PullRequestHandler` to remove `self.pull_request`; methods now accept explicit `pull_request` parameters; converts `process_pull_request_webhook_data` and other internal methods to async; replaces ThreadPoolExecutor concurrency with asyncio; updates all internal calls to use passed `pull_request`; constructor now accepts `OwnersFileHandler`; preserves logic and exception handling.                                                                                                                                      |
| `webhook_server/libs/pull_request_review_handler.py` | Converts `process_pull_request_review_webhook_data` to async and requires explicit `pull_request` parameter; updates internal calls to pass `pull_request`; constructor now accepts `OwnersFileHandler`; removes reliance on `github_webhook` attributes.                                                                                                                                                                                        |
| `webhook_server/libs/push_handler.py`            | Converts `process_push_webhook_data` and `upload_to_pypi` methods to async; uses async context manager and awaits command executions; internal logic unchanged.                                                                                                                                                                                                                                                                                                                         |
| `webhook_server/libs/runner_handler.py`          | Removes `self.pull_request`; methods and context managers now accept explicit `pull_request` parameter; removes import and handling of `NoPullRequestError`; updates internal calls to use passed `pull_request`; converts methods and context manager to async; adds async branch existence check; constructor now accepts optional `OwnersFileHandler`; preserves logic but changes control flow to explicit parameter passing and async command execution.                                                                                                                               |
| `webhook_server/libs/owners_files_handler.py`    | Adds new `OwnersFileHandler` class managing OWNERS files, approvers, reviewers, and user validation asynchronously; loads changed files and OWNERS data; supports assigning reviewers; caches repository collaborators, contributors, and maintainers; integrates with pull requests for authorization and reviewer assignment.                                                                                                                                |
| `webhook_server/tests/conftest.py`                | Adds `Label` class and expands `PullRequest` test stub with additions, deletions, labels, and base ref; adds stub methods for issue comment and review request creation; defines pytest fixture for `pull_request`; changes import order to load `OwnersFileHandler` before `GithubWebhook`.                                                                                                                                                                                    |
| `webhook_server/tests/test_add_reviewer_action.py` | Updates test functions to accept `pull_request` and `owners_file_handler` fixtures and pass them explicitly to `_add_reviewer_by_user_comment`.                                                                                                                                                                                                                                                                                      |
| `webhook_server/tests/test_pull_request_owners.py` | Updates test functions to accept `pull_request` and `owners_file_handler` fixtures; converts tests to async; adds `changed_files` fixture; updates parametrization to use `changed_files`; updates calls to await async methods on `owners_file_handler`.                                                                                                                                                                                                                                                                          |
| `webhook_server/tests/test_pull_request_size.py`  | Removes local `PullRequest` and `Label` classes; imports `PullRequest` from `conftest`; changes test to pass explicit `pull_request` argument to `LabelsHandler.get_size`; includes `owners_file_handler` fixture.                                                                                                                                                                                                                               |
| `webhook_server/tests/test_prepare_retest_wellcome_comment.py` | Adds `owners_file_handler` fixture parameter and passes it to `PullRequestHandler` constructor in test setup.                                                                                                                                                                                                                                                                                             |
| `.gitignore`                                      | Adds `webhook_server/tests/manifests/logs` directory to ignore list.                                                                                                                                                                                                                                                                                                                                    |
| `webhook_server/utils/helpers.py`                 | Ensures the log directory exists before creating log files by checking and creating the "logs" directory under the configured data directory; converts `run_command` to async using `asyncio.create_subprocess_exec` with awaited output handling; tightens return types and adds handling for `BadCredentialsException` in `get_api_with_highest_rate_limit`; raises `NoApiTokenError` if no valid API/token found; sets default masking of sensitive info in logger.                                                                                                                                                    |
| `entrypoint.py`                                   | Imports `asyncio` and runs `repository_and_webhook_settings` asynchronously using `asyncio.run()`; replaces print and synchronous call with direct `uvicorn.run()` invocation.                                                                                                                                                                                                                                                                                                      |
| `webhook_server/utils/github_repository_and_webhook_settings.py` | Converts `repository_and_webhook_settings` function to async and awaits the call to `set_repositories_settings`.                                                                                                                                                                                                                                                                                        |
| `webhook_server/utils/github_repository_settings.py` | Converts `set_repositories_settings` to async and awaits `run_command`; removes `github_api` parameter from `set_branch_protection` and its calls.                                                                                                                                                                                                                                                     |
| `Dockerfile`                                      | Removes copying and execution permission setting of `entrypoint.sh`; replaces entrypoint with direct `uv run entrypoint.py` command.                                                                                                                                                                                                                                                                   |
| `entrypoint.sh`                                   | Deleted. Removed Bash entrypoint script that launched the server via a dynamically generated uvicorn command.                                                                                                                                                                                                                                                                                          |
| `pyproject.toml`                                  | Adds required dependency `asyncstdlib>=3.13.1`; adds optional test dependency group with `pytest-asyncio>=0.26.0`.                                                                                                                                                                                                                                                                                       |
| `tox.toml`                                        | Modifies `unittests` environment command to include `--extra tests` argument in `uv run` invocation.                                                                                                                                                                                                                                                                                                     |

## Possibly related PRs

- myk-org/github-webhook-server#750: Both PRs modify the `process` method in `webhook_server/libs/github_api.py` to be asynchronous and update its usage.
- myk-org/github-webhook-server#756: Both PRs modify `process_webhook` in `webhook_server/app.py` to improve async handling and background task usage.
- myk-org/github-webhook-server#744: Both PRs modify allowlist fetching functions in `webhook_server/app.py` but differ in sync vs async implementation and lifecycle management.

## Suggested labels

`size/L`, `verified`

## Suggested reviewers

- rnetser

Note

⚡️ AI Code Reviews for VS Code, Cursor, Windsurf

CodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback.
Learn more here.


Note

⚡️ Faster reviews with caching

CodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure Review - Disable Cache at either the organization or repository level. If you prefer to disable all data retention across your organization, simply turn off the Data Retention setting under your Organization Settings.
Enjoy the performance boost—your workflow just got faster.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 7684bf0 and 73195e8.

📒 Files selected for processing (1)
  • webhook_server/tests/conftest.py (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: build-container
  • GitHub Check: tox
  • GitHub Check: pre-commit
  • GitHub Check: python-module-install
🔇 Additional comments (6)
webhook_server/tests/conftest.py (6)

8-11: Import order change looks good

The reorganization of imports with OwnersFileHandler before GithubWebhook and setting the environment variable before import reflects the new dependency structure properly.


50-50: Added ref parameter aligns with GitHub API

The addition of the ref parameter to the get_contents method signature correctly aligns the mock with the GitHub API, which requires this parameter for getting repository content by reference.


90-93: Good addition of Label test stub

This simple Label class provides the necessary functionality for simulating GitHub labels in tests.


111-113: Good practice using a dedicated pull_request fixture

Creating a separate fixture for the PullRequest instance improves test modularity and follows the single responsibility principle.


115-132: Well-structured fixture for the new architecture

The expanded fixture correctly handles both the GithubWebhook and OwnersFileHandler instances, properly mocking all necessary dependencies for testing.


135-142: Good fixture composition pattern

Breaking down the main fixture into component-specific fixtures improves test clarity while maintaining backward compatibility.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@myakove-bot
Copy link
Collaborator

Report bugs in Issues

The following are automatically added:

  • Add reviewers from OWNER file (in the root of the repository) under reviewers section.
  • Set PR size label.
  • New issue is created for the PR. (Closed when PR is merged/closed)
  • Run pre-commit if .pre-commit-config.yaml exists in the repo.

Available user actions:

  • To mark PR as WIP comment /wip to the PR, To remove it from the PR comment /wip cancel to the PR.
  • To block merging of PR comment /hold, To un-block merging of PR comment /hold cancel.
  • To mark PR as verified comment /verified to the PR, to un-verify comment /verified cancel to the PR.
    verified label removed on each new commit push.
  • To cherry pick a merged PR comment /cherry-pick <target branch to cherry-pick to> in the PR.
    • Multiple target branches can be cherry-picked, separated by spaces. (/cherry-pick branch1 branch2)
    • Cherry-pick will be started when PR is merged
  • To build and push container image command /build-and-push-container in the PR (tag will be the PR number).
    • You can add extra args to the Podman build command
      • Example: /build-and-push-container --build-arg OPENSHIFT_PYTHON_WRAPPER_COMMIT=<commit_hash>
  • To add a label by comment use /<label name>, to remove, use /<label name> cancel
  • To assign reviewers based on OWNERS file use /assign-reviewers
  • To check if PR can be merged use /check-can-merge
  • to assign reviewer to PR use /assign-reviewer @<reviewer>

PR will be approved when the following conditions are met:

  • /approve from one of the approvers.
  • Minimum number of required /lgtm (1) is met.
Approvers and Reviewers
  • Approvers:

    • myakove
    • rnetser
  • Reviewers:

    • myakove
    • rnetser
Supported /retest check runs
  • /retest tox: Retest tox
  • /retest build-container: Retest build-container
  • /retest python-module-install: Retest python-module-install
  • /retest pre-commit: Retest pre-commit
  • /retest all: Retest all
Supported labels
  • hold
  • verified
  • wip
  • lgtm
  • approve

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
webhook_server/app.py (5)

135-154: Deduplicate & optionally fetch in parallel

  1. networks can contain duplicates if the two providers return overlapping CIDRs.
    Building a set avoids redundant membership checks on each request:
-        networks: list[ipaddress._BaseNetwork] = []
+        networks: set[ipaddress._BaseNetwork] = set()
  1. Fetching each provider sequentially adds ~200-400 ms to startup.
    When both flags are enabled you can cut that in half:
import asyncio

tasks = []
if verify_cloudflare_ips:
    tasks.append(get_cloudflare_allowlist())
if verify_github_ips:
    tasks.append(get_github_allowlist())

for cidr_list in await asyncio.gather(*tasks, return_exceptions=False):
    networks.update(ipaddress.ip_network(cidr) for cidr in cidr_list)

This keeps startup snappy and memory-efficient.


157-162: Potential double-close of the shared AsyncClient

await _lifespan_http_client.aclose() is called both in the normal shutdown path (after yield) and in the except block.
If an exception happens after the first close, the second call is redundant and may trigger a warning.

You can guard it:

if _lifespan_http_client and not _lifespan_http_client.is_closed:
    await _lifespan_http_client.aclose()

or rely on a try/finally pattern instead of duplicating the call.


196-198: Avoid re-loading configuration on every request

Config(logger=LOGGER) is instantiated for every webhook hit, even though the values are immutable at runtime.
Cache root_config during startup and inject it via dependency or closure to reduce latency and GC pressure.

# inside lifespan
app.state.root_config = root_config

# in handler
root_config = request.app.state.root_config

215-216: Reuse the global logger instead of creating a new one per request

get_logger_with_params is invoked on each request, producing many logger instances and repeating handler configuration.
Prefer LOGGER.bind(repository_name=...) (struct-log style) or LOGGER.getChild(repo) to keep a single hierarchy and consistent formatting.
This also avoids the slight overhead of repeatedly walking the logging tree.


88-101: Remove dead commented-out code

The legacy synchronous implementations and their old usage examples create noise and hinder readability. Since the async variants are stable, please delete the commented blocks instead of keeping them in the source.

Also applies to: 165-181

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 68f5d0a and 01b67bb.

📒 Files selected for processing (1)
  • webhook_server/app.py (6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: pre-commit
  • GitHub Check: build-container
  • GitHub Check: python-module-install
  • GitHub Check: tox

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (3)
webhook_server/app.py (3)

38-50: Add explicit timeouts & reuse parsed IPs to harden allow-list fetcher

The helper looks good, but two robustness tweaks are advisable:

  1. Pass a per-request timeout so a mis-behaving endpoint doesn’t block startup forever:
-        response = await _lifespan_http_client.get("https://api.github.com/meta")
+        response = await _lifespan_http_client.get(
+            "https://api.github.com/meta",
+            timeout=httpx.Timeout(5.0),
+        )
  1. The GitHub API already returns CIDR strings. Consider validating them here and returning set[ipaddress._BaseNetwork] to avoid the later double iteration & potential ValueError inside the lifespan loop.

These fixes localise failure handling and make the later startup code simpler.


53-65: Deduplicate & validate Cloudflare CIDRs before returning

Similar to the GitHub helper, the Cloudflare function can:

  • Wrap each CIDR with ipaddress.ip_network(cidr) inside the helper, catching invalid entries once.
  • Return a set to avoid duplicates – Cloudflare occasionally overlaps IPv4 ranges between services.
-        return result.get("ipv4_cidrs", []) + result.get("ipv6_cidrs", [])
+        cidrs = result.get("ipv4_cidrs", []) + result.get("ipv6_cidrs", [])
+        networks: set[ipaddress._BaseNetwork] = set()
+        for cidr in cidrs:
+            try:
+                networks.add(ipaddress.ip_network(cidr))
+            except ValueError:
+                LOGGER.warning(f"Skipping invalid CIDR from Cloudflare response: {cidr}")
+        return list(networks)

This guarantees the caller only receives validated networks.


210-215: Prefer request.json() or decode bytes once to avoid double parsing & large payload memory

Calling await request.body() loads the entire request into memory and keeps a second copy when you pass the same bytes to json.loads. request.json() streams & decodes in one step.

If you need the raw bytes for signature verification, decode once:

payload_body = await request.body()

# verify_signature(payload_body, ...)

try:
    hook_data = json.loads(payload_body.decode("utf-8"))
except json.JSONDecodeError as exc:
    ...

This removes an implicit re-decode done inside json.loads for bytes and makes the charset explicit.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 68f5d0a and 01b67bb.

📒 Files selected for processing (1)
  • webhook_server/app.py (6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: pre-commit
  • GitHub Check: build-container
  • GitHub Check: python-module-install
  • GitHub Check: tox

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🔭 Outside diff range comments (5)
webhook_server/libs/push_handler.py (1)

20-32: 💡 Verification agent

🧩 Analysis chain

Async wrapper still executes heavy, blocking work on the event-loop thread

process_push_webhook_data is now async, yet it directly invokes two CPU / I/O-heavy synchronous helpers (upload_to_pypi, _run_build_container).
When this coroutine is awaited inside an async web-request handler it will block the loop, wiping out the scalability gains we try to achieve with the refactor.

Suggested quick win – move the work to a thread pool with asyncio.to_thread (or refactor helpers to be truly async):

@@
-import re
+import asyncio
+import re
@@
-            if self.github_webhook.pypi:
-                self.logger.info(f"{self.log_prefix} Processing upload to pypi for tag: {tag_name}")
-                self.upload_to_pypi(tag_name=tag_name)
+            if self.github_webhook.pypi:
+                self.logger.info(f"{self.log_prefix} Processing upload to pypi for tag: {tag_name}")
+                await asyncio.to_thread(self.upload_to_pypi, tag_name=tag_name)
@@
-            if self.github_webhook.build_and_push_container and self.github_webhook.container_release:
-                self.logger.info(f"{self.log_prefix} Processing build and push container for tag: {tag_name}")
-                self.runner_handler._run_build_container(push=True, set_check=False, tag=tag_name)
+            if self.github_webhook.build_and_push_container and self.github_webhook.container_release:
+                self.logger.info(f"{self.log_prefix} Processing build and push container for tag: {tag_name}")
+                await asyncio.to_thread(
+                    self.runner_handler._run_build_container,
+                    push=True,
+                    set_check=False,
+                    tag=tag_name,
+                )

This prevents the event loop from being blocked while keeping the public async API intact.
If you prefer a full async rewrite of the helpers, we can tackle that separately.


🏁 Script executed:

#!/bin/bash
# Check if upload_to_pypi is async or sync
rg -n "async def upload_to_pypi" -C2
rg -n "def upload_to_pypi" -C2

# Check if _run_build_container is async or sync
rg -n "async def _run_build_container" -C2
rg -n "def _run_build_container" -C2

Length of output: 976


Offload Blocking Helpers to Thread Pool

Both upload_to_pypi and _run_build_container are synchronous methods (see definitions in push_handler.py:33 and runner_handler.py:234), so calling them directly inside the async process_push_webhook_data will block the event loop. Offload these calls with asyncio.to_thread to prevent blocking:

Locations to update:

  • webhook_server/libs/push_handler.py: process_push_webhook_data (lines 20–32)
  • webhook_server/libs/push_handler.py: def upload_to_pypi (line 33)
  • webhook_server/libs/runner_handler.py: def _run_build_container (line 234)

Suggested diff:

@@ webhook_server/libs/push_handler.py
-import re
+import asyncio
+import re
@@
-            if self.github_webhook.pypi:
-                self.logger.info(f"{self.log_prefix} Processing upload to pypi for tag: {tag_name}")
-                self.upload_to_pypi(tag_name=tag_name)
+            if self.github_webhook.pypi:
+                self.logger.info(f"{self.log_prefix} Processing upload to pypi for tag: {tag_name}")
+                await asyncio.to_thread(self.upload_to_pypi, tag_name=tag_name)
@@
-            if self.github_webhook.build_and_push_container and self.github_webhook.container_release:
-                self.logger.info(f"{self.log_prefix} Processing build and push container for tag: {tag_name}")
-                self.runner_handler._run_build_container(push=True, set_check=False, tag=tag_name)
+            if self.github_webhook.build_and_push_container and self.github_webhook.container_release:
+                self.logger.info(f"{self.log_prefix} Processing build and push container for tag: {tag_name}")
+                await asyncio.to_thread(
+                    self.runner_handler._run_build_container,
+                    push=True,
+                    set_check=False,
+                    tag=tag_name,
+                )

This change preserves your async API while preventing the event loop from being blocked.

webhook_server/libs/pull_request_review_handler.py (1)

15-32: 💡 Verification agent

🧩 Analysis chain

async function contains only synchronous calls – confirm/await or revert

process_pull_request_review_webhook_data is declared async, but all internal calls (manage_reviewed_by_label, label_by_user_comment) are invoked without await.

If these helper methods were converted to async, we must await them or nothing will execute.
If they stayed synchronous, keeping this wrapper async adds unnecessary complexity.

Please verify the helpers’ signatures and either:

  1. add await in front of each async helper,
  2. or drop the async keyword from this method.

🏁 Script executed:

#!/bin/bash
# Show the class and method context in pull_request_review_handler.py
echo "=== pull_request_review_handler.py (lines 1-200) ==="
sed -n '1,200p' webhook_server/libs/pull_request_review_handler.py

# Find where manage_reviewed_by_label is defined
echo "=== Definition of manage_reviewed_by_label ==="
rg "def manage_reviewed_by_label" -n .

# Find any async definitions of helpers invoked in this method
echo "=== Any async helper definitions ==="
rg -nE "async def .*manage_reviewed_by_label|async def .*label_by_user_comment" -n .

Length of output: 2187


🏁 Script executed:

#!/bin/bash
set -e

# Display the implementation of manage_reviewed_by_label for context
echo "=== manage_reviewed_by_label implementation ==="
sed -n '150,260p' webhook_server/libs/labels_handler.py

# Locate and display label_by_user_comment definition
echo "=== label_by_user_comment definition ==="
rg --context 5 "def label_by_user_comment" -n webhook_server/libs/labels_handler.py

# Find where process_pull_request_review_webhook_data is invoked elsewhere
echo "=== References to process_pull_request_review_webhook_data ==="
rg -n "process_pull_request_review_webhook_data" -n .

Length of output: 4374


Remove unnecessary async from process_pull_request_review_webhook_data and update its call site

Both manage_reviewed_by_label and label_by_user_comment (and their underlying _add_label/_remove_label) are plain, synchronous methods. There’s no await inside process_pull_request_review_webhook_data, so it doesn’t need to be an async coroutine.

Please update accordingly:

  • In webhook_server/libs/pull_request_review_handler.py (around line 15), change
    -    async def process_pull_request_review_webhook_data(self, pull_request: PullRequest) -> None:
    +    def process_pull_request_review_webhook_data(self, pull_request: PullRequest) -> None:
  • In webhook_server/libs/github_api.py (around line 144), remove the await when calling it:
    -    return await PullRequestReviewHandler(github_webhook=self)
    -        .process_pull_request_review_webhook_data(pull_request)
    +    return PullRequestReviewHandler(github_webhook=self)
    +        .process_pull_request_review_webhook_data(pull_request)

This will eliminate the unnecessary coroutine overhead and satisfy linters complaining about “async function contains no await.”

webhook_server/libs/runner_handler.py (1)

96-104: 🛠️ Refactor suggestion

Parameter shadowing may hide the caller-supplied pull_request

Inside _prepare_cloned_repo_dir you re-assign the name pull_request using the walrus operator:

if pull_request := self.github_webhook._get_pull_request():

If the caller passed an explicit pull_request, that value is silently overwritten, which is surprising and could lead to inconsistent logging or merge behaviour.

-                if pull_request := self.github_webhook._get_pull_request():
+                _pr = pull_request or self.github_webhook._get_pull_request()
+                if _pr:
                     rc, out, err = run_command(
-                        command=f"{git_cmd} checkout origin/pr/{pull_request.number}",
+                        command=f"{git_cmd} checkout origin/pr/{_pr.number}",

This keeps the caller’s intent intact.

webhook_server/libs/labels_handler.py (1)

84-98: ⚠️ Potential issue

Fix missing pull_request argument in wait_for_label to prevent runtime TypeError.

wait_for_label() now calls label_exists_in_pull_request, which requires the pull_request argument.
Because the method itself no longer receives that object, TimeoutSampler will invoke label_exists_in_pull_request(label=…) and blow up with:

TypeError: label_exists_in_pull_request() missing 1 required positional argument: 'pull_request'

The two callers (_remove_label, _add_label) also omit the object.
Patch:

-    def wait_for_label(self, label: str, exists: bool) -> bool:
+    def wait_for_label(self, *, pull_request: PullRequest, label: str, exists: bool) -> bool:
@@
-                func=self.label_exists_in_pull_request,
-                label=label,
+                func=self.label_exists_in_pull_request,
+                pull_request=pull_request,
+                label=label,
@@
-            return self.wait_for_label(label=label, exists=False)
+            return self.wait_for_label(pull_request=pull_request, label=label, exists=False)
@@
-        self.wait_for_label(label=label, exists=True)
+        self.wait_for_label(pull_request=pull_request, label=label, exists=True)

This preserves the new explicit-context style and restores functionality.

Also applies to: 38-45, 51-58, 80-83

webhook_server/libs/pull_request_handler.py (1)

471-486: ⚠️ Potential issue

_process_verified_for_update_or_new_pull_request uses stale self.pull_request

Similar to the previous comment, the method receives pull_request but internally uses self.pull_request, which is often unset.

-    self.labels_handler._add_label(pull_request=self.pull_request, label=VERIFIED_LABEL_STR)
+    self.labels_handler._add_label(pull_request=pull_request, label=VERIFIED_LABEL_STR)
...
-    self.labels_handler._remove_label(pull_request=pull_request, label=VERIFIED_LABEL_STR)
+    self.labels_handler._remove_label(pull_request=pull_request, label=VERIFIED_LABEL_STR)

Leaving it unfixed silently skips (or crashes) label management for verified workflows.

🧹 Nitpick comments (10)
webhook_server/libs/pull_request_review_handler.py (1)

17-23: Replace stray triple-quoted string with a comment

The multiline string inside the if block is not a docstring and is discarded at runtime.
Use a regular # comment instead to avoid confusion and minor memory overhead.

-            """
-            Available actions:
-                commented
-                approved
-                changes_requested
-            """
+            # Available actions:
+            #   commented
+            #   approved
+            #   changes_requested
webhook_server/libs/runner_handler.py (3)

375-394: Error message wording & list join formatting

Two small issues hamper readability:

  1. Grammar – “must starts” → “must start”.
  2. ": ,".join() produces foo: ,bar – probably meant ", ".join().
-        if any([title.startswith(f"{_name}:") for _name in allowed_names]):
+        if any(title.startswith(f"{name}:") for name in allowed_names):
@@
-            output["text"] = f"Pull request title must starts with allowed title: {': ,'.join(allowed_names)}"
+            output["text"] = (
+                "Pull-request title must start with one of: "
+                f"{', '.join(allowed_names)}"
+            )

396-420: Variable reuse obscures intent in _is_user_valid_to_run_commands

Using comment both inside the list-comprehension and the subsequent loop obscures that we’re dealing with comment bodies, not objects.

-        comments_from_approvers = [
-            comment.body
-            for comment in pull_request.get_issue_comments()
-            if comment.user.login in allowed_user_to_approve
-        ]
-        for comment in comments_from_approvers:
-            if allow_user_comment in comment:
+        comment_bodies = [
+            c.body
+            for c in pull_request.get_issue_comments()
+            if c.user.login in allowed_user_to_approve
+        ]
+        for body in comment_bodies:
+            if allow_user_comment in body:
                 return True

Clearer naming improves maintainability and avoids shadowing.


457-521: cherry_pick: user-visible error handling ok, but consider returning status

The method writes an issue comment on failure but returns nothing.
Returning a boolean or raising a dedicated exception would let callers act programmatically (e.g., set check-run status).

webhook_server/libs/labels_handler.py (1)

214-225: Typo – rename method to wip_or_hold_labels_exists

wip_or_hold_lables_exists is misspelled. A public API with a typo reduces readability and discoverability.

-    def wip_or_hold_lables_exists(self, labels: list[str]) -> str:
+    def wip_or_hold_labels_exists(self, labels: list[str]) -> str:
webhook_server/libs/check_run_handler.py (1)

216-229: Parameter shadowing makes last_commit_check_runs argument useless

required_check_in_progress() accepts last_commit_check_runs but
immediately overwrites it with:

last_commit_check_runs = list(self.github_webhook.last_commit.get_check_runs())

Either drop the parameter or respect the supplied list; otherwise callers
may assume their pre-filtered list is being used.

webhook_server/libs/issue_comment_handler.py (2)

45-49: Log prints boolean instead of action string

if comment_action := self.hook_data["action"] in ("edited", "deleted"):
    self.logger.debug(f"... action is {comment_action}")

comment_action becomes True/False, so the log message is misleading.
A clearer pattern:

-action = self.hook_data["action"]
-if action in ("edited", "deleted"):
-    self.logger.debug(f"... action is {action}")
+action = self.hook_data["action"]
+if action in ("edited", "deleted"):
+    self.logger.debug(f"... action is {action}")

101-107: Blocking GitHub API call inside async context – consider off-thread execution

create_comment_reaction() invokes the synchronous PyGithub call
pull_request.get_issue_comment(...) from within async def user_commands.
Under heavy load this can block the event loop.

Consider delegating to a thread pool:

loop = asyncio.get_running_loop()
_comment = await loop.run_in_executor(None, pull_request.get_issue_comment, issue_comment_id)
await loop.run_in_executor(None, _comment.create_reaction, reaction)

This keeps the coroutine non-blocking.

webhook_server/libs/github_api.py (2)

286-288: Inefficient retrieval of the last commit

list(pull_request.get_commits())[-1] loads all commits into memory only to access the tail element.
Use the GitHub API’s get_commits() iterator with reversed=True (or slice) to fetch a single item:

-def _get_last_commit(self, pull_request: PullRequest) -> Commit:
-    return list(pull_request.get_commits())[-1]
+def _get_last_commit(self, pull_request: PullRequest) -> Commit:
+    # Fetch only the most recent commit to reduce latency / memory
+    return next(iter(pull_request.get_commits().reversed))

This significantly reduces API latency for large PRs.


501-515: Bulk-create reviewer requests instead of per-user loop

pull_request.create_review_request accepts a list of logins, so we can issue one API call instead of O(n) calls.

-_to_add: list[str] = list(set(self.all_pull_request_reviewers))
-...
-for reviewer in _to_add:
-    if reviewer != pull_request.user.login:
-        try:
-            pull_request.create_review_request([reviewer])
-...
+to_request = [r for r in set(self.all_pull_request_reviewers) if r != pull_request.user.login]
+if to_request:
+    try:
+        pull_request.create_review_request(reviewers=to_request)
+    except GithubException as ex:
+        self.logger.debug(f"{self.log_prefix} Failed to add reviewers {to_request}. {ex}")
+        pull_request.create_issue_comment(f"Cannot add reviewers {', '.join(to_request)}. {ex}")

Fewer HTTP round-trips → faster & safer (GitHub rate limits apply per request).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 01b67bb and bbb4097.

📒 Files selected for processing (10)
  • webhook_server/app.py (6 hunks)
  • webhook_server/libs/check_run_handler.py (5 hunks)
  • webhook_server/libs/exceptions.py (0 hunks)
  • webhook_server/libs/github_api.py (12 hunks)
  • webhook_server/libs/issue_comment_handler.py (8 hunks)
  • webhook_server/libs/labels_handler.py (8 hunks)
  • webhook_server/libs/pull_request_handler.py (19 hunks)
  • webhook_server/libs/pull_request_review_handler.py (4 hunks)
  • webhook_server/libs/push_handler.py (1 hunks)
  • webhook_server/libs/runner_handler.py (15 hunks)
💤 Files with no reviewable changes (1)
  • webhook_server/libs/exceptions.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • webhook_server/app.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
webhook_server/libs/issue_comment_handler.py (7)
webhook_server/tests/test_add_reviewer_action.py (3)
  • PullRequest (19-27)
  • create_issue_comment (23-24)
  • create_review_request (26-27)
webhook_server/utils/constants.py (1)
  • REACTIONS (72-80)
webhook_server/libs/github_api.py (1)
  • assign_reviewers (501-514)
webhook_server/libs/pull_request_handler.py (1)
  • check_if_can_be_merged (499-580)
webhook_server/libs/labels_handler.py (3)
  • _remove_label (38-49)
  • _add_label (51-82)
  • label_by_user_comment (137-159)
webhook_server/libs/check_run_handler.py (2)
  • set_verify_check_queued (55-56)
  • set_verify_check_success (58-59)
webhook_server/libs/runner_handler.py (2)
  • cherry_pick (457-520)
  • _is_user_valid_to_run_commands (396-421)

@myakove
Copy link
Collaborator Author

myakove commented May 12, 2025

/test build-container

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (2)
webhook_server/libs/pull_request_handler.py (2)

268-276: 🛠️ Refactor suggestion

Blocking time.sleep inside async-driven flow stalls the event-loop

label_all_opened_pull_requests_merge_state_after_merged is invoked from the async
process_pull_request_webhook_data, yet it performs time.sleep(30), blocking the
entire asyncio loop for half a minute.

Convert the function to async and use await asyncio.sleep() (or schedule it in a
separate executor) to keep the server responsive.


469-484: ⚠️ Potential issue

_process_verified_for_update_or_new_pull_request still references self.pull_request

Similar to the previous issue, using the removed attribute will break the
verified-label logic.

-    self.labels_handler._add_label(pull_request=self.pull_request, label=VERIFIED_LABEL_STR)
+    self.labels_handler._add_label(pull_request=pull_request, label=VERIFIED_LABEL_STR)

Search & replace all remnants of self.pull_request to avoid future regressions.

♻️ Duplicate comments (2)
webhook_server/libs/pull_request_handler.py (2)

348-357: ⚠️ Potential issue

assign_reviewers() is invoked without the mandatory pull_request parameter

GithubWebhook.assign_reviewers was refactored to accept pull_request: PullRequest, yet the thread-pool submission still calls it with no arguments.
At runtime this will raise TypeError: assign_reviewers() missing 1 required positional argument.

-prepare_pull_futures.append(executor.submit(self.github_webhook.assign_reviewers))
+prepare_pull_futures.append(
+    executor.submit(self.github_webhook.assign_reviewers, pull_request)
+)

Without this fix, all opened / synchronised PRs will crash during initialisation and none of the remaining futures will execute.


453-468: ⚠️ Potential issue

Incorrect use of self.pull_request inside label_pull_request_by_merge_state

The method correctly receives pull_request, but still forwards self.pull_request
to the LabelsHandler. In most call-sites self.pull_request is unset, so every
label operation will crash.

-    if merge_state == "behind":
-        self.labels_handler._add_label(pull_request=self.pull_request, label=NEEDS_REBASE_LABEL_STR)
+    if merge_state == "behind":
+        self.labels_handler._add_label(pull_request=pull_request, label=NEEDS_REBASE_LABEL_STR)
 ...
-        self.labels_handler._remove_label(pull_request=self.pull_request, label=NEEDS_REBASE_LABEL_STR)
+        self.labels_handler._remove_label(pull_request=pull_request, label=NEEDS_REBASE_LABEL_STR)
 ...
-        self.labels_handler._add_label(pull_request=self.pull_request, label=HAS_CONFLICTS_LABEL_STR)
+        self.labels_handler._add_label(pull_request=pull_request, label=HAS_CONFLICTS_LABEL_STR)
 ...
-        self.labels_handler._remove_label(pull_request=self.pull_request, label=HAS_CONFLICTS_LABEL_STR)
+        self.labels_handler._remove_label(pull_request=pull_request, label=HAS_CONFLICTS_LABEL_STR)

Apply the same fix anywhere else self.pull_request sneaks back in.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9a255b6 and 788b431.

📒 Files selected for processing (2)
  • webhook_server/libs/pull_request_handler.py (17 hunks)
  • webhook_server/libs/runner_handler.py (25 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • webhook_server/libs/runner_handler.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: python-module-install
  • GitHub Check: tox
  • GitHub Check: pre-commit

@myakove
Copy link
Collaborator Author

myakove commented May 12, 2025

/retest build-container

@myakove
Copy link
Collaborator Author

myakove commented May 12, 2025

/retest build-container

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
webhook_server/libs/runner_handler.py (4)

134-137: Possible unnecessary check

You're checking for pull_request and success but it appears that this code would only be reached if the previous code at line 125 successfully set pull_request, making this check redundant.

- if pull_request and success:
+ if success:  # pull_request is already verified to be truthy in the parent block

38-39: Consider adding type annotations for the return value

The function returns a tuple, but the type annotation for the generator doesn't match what is actually returned from the yield statement.

Update the type annotation to match the actual return value:

def _prepare_cloned_repo_dir(
    self,
    clone_repo_dir: str,
    pull_request: PullRequest | None = None,
    is_merged: bool = False,
    checkout: str = "",
    tag_name: str = "",
-) -> Generator[tuple[bool, Any, Any], None, None]:
+) -> Generator[tuple[bool, str, str], None, None]:

247-252: Simplify the conditional check

The conditional statement can be simplified for better readability.

- if (
-     reviewed_user
-     and pull_request
-     and not self._is_user_valid_to_run_commands(reviewed_user=reviewed_user, pull_request=pull_request)
- ):
+ if reviewed_user and pull_request and not self._is_user_valid_to_run_commands(reviewed_user=reviewed_user, pull_request=pull_request):

168-171: Consider making the pull_request parameter non-optional

Since the method seems to require a PullRequest object to function correctly (it's used to determine base.ref on line 180), consider making it a required parameter without a default value.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 788b431 and 138f32e.

📒 Files selected for processing (5)
  • webhook_server/libs/github_api.py (12 hunks)
  • webhook_server/libs/issue_comment_handler.py (6 hunks)
  • webhook_server/libs/pull_request_handler.py (16 hunks)
  • webhook_server/libs/push_handler.py (2 hunks)
  • webhook_server/libs/runner_handler.py (25 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • webhook_server/libs/push_handler.py
  • webhook_server/libs/issue_comment_handler.py
  • webhook_server/libs/github_api.py
  • webhook_server/libs/pull_request_handler.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: tox
  • GitHub Check: pre-commit
  • GitHub Check: python-module-install
  • GitHub Check: can-be-merged
🔇 Additional comments (4)
webhook_server/libs/runner_handler.py (4)

10-10: Good import addition

Adding the explicit import of PullRequest is a good choice when updating the code to use the type as a parameter.


234-243: Consistent typing for pull_request parameter

While most methods use pull_request: PullRequest as their parameter type, this method uses pull_request: PullRequest | None = None. Ensure this inconsistency is intentional based on the function's requirements.

This appears to be intentionally different since there are checks for pull_request being None throughout the function, and it can be called without a pull request in certain contexts.


262-264: Good parameter forwarding

Correctly updating method calls to forward the pull_request parameter rather than relying on an instance attribute. This is a good refactoring that makes data flow more explicit.


1-522: Overall refactoring direction looks good

The changes consistently move away from using self.pull_request as an instance attribute to requiring an explicit PullRequest parameter. This makes the code more maintainable by:

  1. Making dependencies explicit
  2. Avoiding hidden state
  3. Improving testability

This aligns with good software engineering practices.

@myakove
Copy link
Collaborator Author

myakove commented May 12, 2025

/retest build-container

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
webhook_server/tests/test_add_reviewer_action.py (1)

19-28: Consider removing redundant PullRequest class

This file defines its own PullRequest class (lines 19-28) but also uses the imported fixture. Since the code now uses the fixture from conftest.py, this local class definition could be removed to avoid confusion.

-class PullRequest:
-    def __init__(self):
-        pass
-
-    def create_issue_comment(self, _):
-        return
-
-    def create_review_request(self, _):
-        return
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 138f32e and 046d5df.

📒 Files selected for processing (8)
  • webhook_server/libs/github_api.py (12 hunks)
  • webhook_server/libs/issue_comment_handler.py (6 hunks)
  • webhook_server/libs/pull_request_handler.py (16 hunks)
  • webhook_server/libs/runner_handler.py (24 hunks)
  • webhook_server/tests/conftest.py (2 hunks)
  • webhook_server/tests/test_add_reviewer_action.py (1 hunks)
  • webhook_server/tests/test_pull_request_owners.py (1 hunks)
  • webhook_server/tests/test_pull_request_size.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • webhook_server/libs/github_api.py
  • webhook_server/libs/runner_handler.py
  • webhook_server/libs/pull_request_handler.py
🧰 Additional context used
🧬 Code Graph Analysis (2)
webhook_server/tests/test_pull_request_size.py (2)
webhook_server/tests/conftest.py (2)
  • PullRequest (103-118)
  • pull_request (122-123)
webhook_server/libs/labels_handler.py (1)
  • get_size (100-114)
webhook_server/tests/conftest.py (1)
webhook_server/tests/test_add_reviewer_action.py (3)
  • PullRequest (19-27)
  • create_issue_comment (23-24)
  • create_review_request (26-27)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: tox
  • GitHub Check: build-container
  • GitHub Check: python-module-install
  • GitHub Check: pre-commit
🔇 Additional comments (25)
webhook_server/tests/conftest.py (2)

98-118: Good implementation of Label class and PullRequest enhancements

The addition of the Label class and enhancements to the PullRequest class properly support the refactoring to explicit parameter passing. This centralized test fixture approach improves test consistency across the codebase.


121-123: Good implementation of pull_request fixture

The pull_request fixture provides a clean way to create and share PullRequest instances across tests, which aligns well with the refactoring to explicit parameter passing throughout the codebase.

webhook_server/tests/test_pull_request_owners.py (1)

106-110: Good adaptation of test signature and method call

The test has been properly updated to include the pull_request parameter and pass it explicitly to the get_all_repository_approvers_and_reviewers method, which aligns with the codebase refactoring to explicit parameter passing.

webhook_server/tests/test_pull_request_size.py (2)

4-4: Good import of shared PullRequest class

Importing the PullRequest class from conftest.py eliminates duplicate code and promotes consistency across tests.


21-23: Good implementation of explicit pull_request usage

The test now correctly creates a PullRequest instance with specific additions and deletions and passes it explicitly to the get_size method, aligning with the refactoring approach across the codebase.

webhook_server/tests/test_add_reviewer_action.py (2)

30-35: Good adaptation of test and method call

The test function signature and method call have been properly updated to include and pass the pull_request parameter explicitly, which aligns with the codebase refactoring.


39-44: Good adaptation of negative test case

The negative test case has also been properly updated to include and pass the pull_request parameter explicitly, maintaining consistency with the refactoring approach.

webhook_server/libs/issue_comment_handler.py (18)

6-7: Import added for PullRequest type.

This is a good addition for properly typing the parameter in the refactored methods.


45-46: Converted to async function with explicit PullRequest parameter.

This change aligns with the PR objective to use more async code, while also making the pull request dependency explicit rather than relying on instance state.


55-55: Now references pull_request parameter.

Updated to use the explicit parameter instead of an implied instance attribute.


62-64: Using await with explicit parameter passing.

Correctly awaits the now async method and passes the pull_request parameter explicitly.


69-71: Method converted to async with explicit PullRequest parameter.

Good refactoring to make the pull request dependency explicit and support asynchronous operations.


101-101: Now calls method directly on pull_request parameter.

This properly uses the explicitly passed parameter instead of an instance attribute.


104-106: Updated create_comment_reaction call with explicit parameter.

This change maintains consistency with the new pattern of explicit parameter passing.


109-109: Updated method calls with explicit pull_request parameter.

All method calls consistently pass the pull_request parameter explicitly, maintaining the new pattern throughout the code.

Also applies to: 112-113, 115-116, 118-119, 121-122


125-131: Updated runner_handler.run_build_container with pull_request parameter.

This maintains consistency with the pattern of explicit parameter passing.


138-146: Updated label handling with explicit parameter passing.

All calls to label handlers and pull request edit methods now use the explicitly passed parameter, maintaining consistent code style.


153-159: Updated label operations with explicit parameter passing and await.

All label operations now use the explicit parameter, and the check_if_can_be_merged call correctly uses await since it's now an async method.


176-178: Updated create_comment_reaction method signature.

This method now properly accepts an explicit pull_request parameter instead of using an instance attribute.


180-192: Updated _add_reviewer_by_user_comment with explicit parameter.

The method now accepts and uses the PullRequest parameter directly for all operations.


193-230: Updated process_cherry_pick_command with explicit parameter.

The method correctly uses the passed pull_request parameter for all operations, maintaining consistency with the new pattern.


231-234: Updated process_retest_command with runner validation using explicit parameter.

The method now passes the pull_request parameter to the user validation check.


240-246: Mapping to handler functions maintained correctly.

All the handler functions referenced in the mapping are likely updated in their respective files to accept a pull_request parameter as indicated by the PR summary.


281-285: Updated thread pool execution with pull_request parameter.

The executor now correctly passes the pull_request parameter to the mapped functions when submitting them for execution.


45-289: Overall async implementation is well-executed.

The changes consistently implement the PR objective of using more async code. Methods that interact with external services or perform potentially blocking operations have been properly converted to async, with await keywords added at all call sites.

The refactoring to pass PullRequest explicitly is also thorough and consistently applied throughout the file, improving code clarity by making dependencies explicit rather than relying on instance state.

Copy link
Collaborator

@rnetser rnetser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sss

@rnetser
Copy link
Collaborator

rnetser commented May 18, 2025

/lgtm

4 similar comments
@myakove
Copy link
Collaborator Author

myakove commented May 18, 2025

/lgtm

@myakove
Copy link
Collaborator Author

myakove commented May 18, 2025

/lgtm

@myakove
Copy link
Collaborator Author

myakove commented May 18, 2025

/lgtm

@myakove
Copy link
Collaborator Author

myakove commented May 18, 2025

/lgtm

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
webhook_server/tests/test_pull_request_owners.py (1)

212-226: 🛠️ Refactor suggestion

Missing setting of parent_committer attribute.

In this test function, process_github_webhook.parent_committer is not set, unlike in the test_check_pr_approved function. This inconsistency could lead to test failures if the code depends on this attribute.

    pull_request_handler = PullRequestHandler(
        github_webhook=process_github_webhook,
        owners_file_handler=owners_file_handler,
    )
+   process_github_webhook.parent_committer = "test"
    check_if_pr_approved = await pull_request_handler._check_if_pr_approved(
♻️ Duplicate comments (8)
webhook_server/app.py (1)

195-195: ⚠️ Potential issue

Potential KeyError when repository key is missing

This line will raise a KeyError if GitHub sends an event without a repository field (e.g., ping or org hooks).

-    logger = get_logger_with_params(name="main", repository_name=hook_data["repository"]["name"])
+    repository_name = hook_data.get("repository", {}).get("name", "unknown-repo")
+    logger = get_logger_with_params(name="main", repository_name=repository_name)
webhook_server/tests/conftest.py (1)

103-107: 🛠️ Refactor suggestion

Methods should be marked as async to match production code.

These stub methods (create_issue_comment, create_review_request, get_files) should be marked as async to properly mimic the asynchronous nature of the refactored production code. This will prevent potential test failures when these methods are awaited.

-    def create_issue_comment(self, *args, **kwargs): ...
+    async def create_issue_comment(self, *args, **kwargs): ...

-    def create_review_request(self, *args, **kwargs): ...
+    async def create_review_request(self, *args, **kwargs): ...

-    def get_files(self): ...
+    async def get_files(self): ...
webhook_server/libs/owners_files_handler.py (1)

212-261: ⚠️ Potential issue

Potential issue with awaiting valid_users_to_run_commands cached property multiple times.

The method awaits the valid_users_to_run_commands cached property twice. The first time it's awaited, the result is cached, but subsequent attempts to await it will result in a TypeError since the cached value is a list, not an awaitable.

This can lead to runtime errors. Fix by saving the result after the first await:

-        if reviewed_user not in await self.valid_users_to_run_commands:
+        valid_users = await self.valid_users_to_run_commands
+        if reviewed_user not in valid_users:
...
-            self.logger.debug(f"{self.log_prefix} {reviewed_user} is not in {await self.valid_users_to_run_commands}")
+            self.logger.debug(f"{self.log_prefix} {reviewed_user} is not in {valid_users}")
webhook_server/libs/pull_request_handler.py (5)

74-76: ⚠️ Potential issue

Fix error logging to show the actual failed task.

When logging exceptions from async tasks, you're printing the entire results list instead of the individual result that failed. This makes debugging difficult.

-                if isinstance(result, Exception):
-                    self.logger.error(f"{self.log_prefix} Async task failed: {results}")
+                if isinstance(result, Exception):
+                    self.logger.error(f"{self.log_prefix} Async task failed: {result}")

270-270: ⚠️ Potential issue

Make repository.get_pulls() non-blocking.

The direct call to repository.get_pulls() in an async method will block the event loop when it performs network I/O operations.

-        for pull_request in self.repository.get_pulls(state="open"):
+        for pull_request in await asyncio.to_thread(self.repository.get_pulls, state="open"):

526-526: ⚠️ Potential issue

Remove incorrect await on synchronous method.

The pull_request_labels_names() method is synchronous and returns a list, not an awaitable coroutine. Awaiting it will raise a TypeError at runtime.

-            _labels = await self.labels_handler.pull_request_labels_names(pull_request=pull_request)
+            _labels = self.labels_handler.pull_request_labels_names(pull_request=pull_request)

336-345: 🛠️ Refactor suggestion

Add error handling for GitHub API calls.

The async GitHub API calls in close_issue_for_merged_or_closed_pr could fail with no error handling, potentially causing unhandled exceptions.

    async def close_issue_for_merged_or_closed_pr(self, pull_request: PullRequest, hook_action: str) -> None:
-        for issue in await asyncio.to_thread(self.repository.get_issues):
-            if issue.body == self._generate_issue_body(pull_request=pull_request):
-                self.logger.info(f"{self.log_prefix} Closing issue {issue.title} for PR: {pull_request.title}")
-                await asyncio.to_thread(
-                    issue.create_comment,
-                    f"{self.log_prefix} Closing issue for PR: {pull_request.title}.\nPR was {hook_action}.",
-                )
-                await asyncio.to_thread(issue.edit, state="closed")
-
-                break
+        try:
+            for issue in await asyncio.to_thread(self.repository.get_issues):
+                if issue.body == self._generate_issue_body(pull_request=pull_request):
+                    self.logger.info(f"{self.log_prefix} Closing issue {issue.title} for PR: {pull_request.title}")
+                    try:
+                        await asyncio.to_thread(
+                            issue.create_comment,
+                            f"{self.log_prefix} Closing issue for PR: {pull_request.title}.\nPR was {hook_action}.",
+                        )
+                        await asyncio.to_thread(issue.edit, state="closed")
+                        break
+                    except Exception as e:
+                        self.logger.error(f"{self.log_prefix} Failed to close issue: {e}")
+        except Exception as e:
+            self.logger.error(f"{self.log_prefix} Failed to get issues: {e}")

487-497: ⚠️ Potential issue

Make add_to_assignees non-blocking in async method.

The direct calls to pull_request.add_to_assignees() will block the event loop when they perform network I/O operations.

    async def add_pull_request_owner_as_assingee(self, pull_request: PullRequest) -> None:
        try:
            self.logger.info(f"{self.log_prefix} Adding PR owner as assignee")
-            pull_request.add_to_assignees(pull_request.user.login)
+            await asyncio.to_thread(pull_request.add_to_assignees, pull_request.user.login)
        except Exception as exp:
            self.logger.debug(f"{self.log_prefix} Exception while adding PR owner as assignee: {exp}")

            if self.github_webhook.root_approvers:
                self.logger.debug(f"{self.log_prefix} Falling back to first approver as assignee")
-                pull_request.add_to_assignees(self.github_webhook.root_approvers[0])
+                await asyncio.to_thread(pull_request.add_to_assignees, self.github_webhook.root_approvers[0])
🧹 Nitpick comments (7)
webhook_server/libs/owners_files_handler.py (2)

21-32: Method name has a typo and lacks proper initialization safeguards.

The method name is spelled correctly as "initialize" but lacks safeguards to enforce initialization before use of related attributes. While there is an _ensure_initialized method, a more robust design would also include a factory method.

Consider adding an async factory method to ensure initialization is always performed:

@classmethod
async def create(cls, github_webhook: Any, pull_request: PullRequest) -> "OwnersFileHandler":
    """Create and initialize an OwnersFileHandler instance."""
    instance = cls(github_webhook=github_webhook)
    await instance.initialize(pull_request=pull_request)
    return instance

This would allow callers to use a simpler pattern:

owners_handler = await OwnersFileHandler.create(github_webhook=webhook, pull_request=pr)

173-211: Complex logic for determining which OWNERS files apply.

The owners_data_for_changed_files method has complex logic to determine which OWNERS files apply to changed folders. Consider adding more detailed comments to explain the logic, especially around the require_root_approvers flag and the nested conditions.

Add more detailed comments explaining the logic:

 async def owners_data_for_changed_files(self) -> dict[str, dict[str, Any]]:
     self._ensure_initialized()
 
     data: dict[str, dict[str, Any]] = {}
 
+    # Extract all parent directories from changed files
     changed_folders = {Path(cf).parent for cf in self.changed_files}
 
     changed_folder_match: list[Path] = []
 
     require_root_approvers: bool | None = None
 
+    # First pass: find all OWNERS files that are in or parent to changed folders
     for owners_dir, owners_data in self.all_repository_approvers_and_reviewers.items():
         if owners_dir == ".":
             continue
webhook_server/tests/test_pull_request_owners.py (1)

135-158: Misspelled variable name in assertion.

The variable name owners_data_chaged_files_expected is misspelled (should be "changed" not "chaged"). This doesn't affect functionality but makes the code less readable.

-    owners_data_chaged_files_expected = {
+    owners_data_changed_files_expected = {
         "folder5": {
             "approvers": ["folder5_approver1", "folder5_approver2"],
             "reviewers": ["folder5_reviewer1", "folder5_reviewer2"],
             "root-approvers": False,
         },
         # ... other folders
     }

-    assert owners_data_changed_files_result == owners_data_chaged_files_expected
+    assert owners_data_changed_files_result == owners_data_changed_files_expected
webhook_server/libs/pull_request_handler.py (1)

351-358: Simplify task append by avoiding double unpacking.

The current approach of using ** to pass a dictionary of arguments makes the code less readable and error-prone. Direct argument passing is clearer.

-        tasks.append(
-            self.labels_handler._add_label(
-                **{
-                    "pull_request": pull_request,
-                    "label": f"{BRANCH_LABEL_PREFIX}{pull_request.base.ref}",
-                },
-            )
-        )
+        tasks.append(
+            self.labels_handler._add_label(
+                pull_request=pull_request,
+                label=f"{BRANCH_LABEL_PREFIX}{pull_request.base.ref}"
+            )
+        )
webhook_server/libs/runner_handler.py (3)

150-158: Consider making these utility methods static

The utility methods is_podman_bug and fix_podman_bug don't use instance variables and could be made static.

-    def is_podman_bug(self, err: str) -> bool:
+    @staticmethod
+    def is_podman_bug(err: str) -> bool:
        _err = "Error: current system boot ID differs from cached boot ID; an unhandled reboot has occurred"
        return _err in err.strip()

-    def fix_podman_bug(self) -> None:
+    @staticmethod
+    def fix_podman_bug() -> None:
-        self.logger.debug(f"{self.log_prefix} Fixing podman bug")
+        # Consider passing the logger or using a class or module-level logger
+        print("Fixing podman bug")
        shutil.rmtree("/tmp/storage-run-1000/containers", ignore_errors=True)
        shutil.rmtree("/tmp/storage-run-1000/libpod/tmp", ignore_errors=True)

165-167: Update self reference in is_podman_bug call

If you implement the suggested staticmethod change, update this call accordingly.

-        if self.is_podman_bug(err=err):
+        if RunnerHandler.is_podman_bug(err=err):
-            self.fix_podman_bug()
+            RunnerHandler.fix_podman_bug()

286-290: Use keyword arguments for better readability

For consistency with other calls in the codebase, consider explicitly naming all parameters for better readability.

-        async with self._prepare_cloned_repo_dir(
-            pull_request=pull_request,
-            is_merged=is_merged,
-            tag_name=tag,
-            clone_repo_dir=clone_repo_dir,
-        ) as _res:
+        async with self._prepare_cloned_repo_dir(
+            clone_repo_dir=clone_repo_dir,
+            pull_request=pull_request,
+            is_merged=is_merged,
+            tag_name=tag,
+        ) as _res:
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between c44074c and 7684bf0.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (27)
  • .gitignore (1 hunks)
  • Dockerfile (2 hunks)
  • entrypoint.py (2 hunks)
  • entrypoint.sh (0 hunks)
  • pyproject.toml (2 hunks)
  • tox.toml (1 hunks)
  • webhook_server/app.py (5 hunks)
  • webhook_server/libs/check_run_handler.py (6 hunks)
  • webhook_server/libs/config.py (1 hunks)
  • webhook_server/libs/exceptions.py (1 hunks)
  • webhook_server/libs/github_api.py (8 hunks)
  • webhook_server/libs/issue_comment_handler.py (5 hunks)
  • webhook_server/libs/labels_handler.py (7 hunks)
  • webhook_server/libs/owners_files_handler.py (1 hunks)
  • webhook_server/libs/pull_request_handler.py (17 hunks)
  • webhook_server/libs/pull_request_review_handler.py (2 hunks)
  • webhook_server/libs/push_handler.py (3 hunks)
  • webhook_server/libs/runner_handler.py (14 hunks)
  • webhook_server/tests/conftest.py (4 hunks)
  • webhook_server/tests/test_add_reviewer_action.py (2 hunks)
  • webhook_server/tests/test_prepare_retest_wellcome_comment.py (1 hunks)
  • webhook_server/tests/test_pull_request_owners.py (28 hunks)
  • webhook_server/tests/test_pull_request_size.py (2 hunks)
  • webhook_server/utils/github_repository_and_webhook_settings.py (2 hunks)
  • webhook_server/utils/github_repository_settings.py (4 hunks)
  • webhook_server/utils/helpers.py (8 hunks)
  • webhook_server/utils/webhook.py (1 hunks)
💤 Files with no reviewable changes (1)
  • entrypoint.sh
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (15)
  • entrypoint.py
  • webhook_server/tests/test_pull_request_size.py
  • webhook_server/libs/exceptions.py
  • pyproject.toml
  • tox.toml
  • Dockerfile
  • webhook_server/tests/test_prepare_retest_wellcome_comment.py
  • webhook_server/libs/pull_request_review_handler.py
  • webhook_server/utils/github_repository_and_webhook_settings.py
  • webhook_server/tests/test_add_reviewer_action.py
  • webhook_server/libs/push_handler.py
  • webhook_server/utils/github_repository_settings.py
  • webhook_server/libs/issue_comment_handler.py
  • webhook_server/libs/labels_handler.py
  • webhook_server/libs/check_run_handler.py
🧰 Additional context used
🧬 Code Graph Analysis (6)
webhook_server/utils/webhook.py (1)
webhook_server/utils/helpers.py (1)
  • get_github_repo_api (73-77)
webhook_server/libs/config.py (1)
webhook_server/utils/helpers.py (1)
  • get_github_repo_api (73-77)
webhook_server/app.py (4)
webhook_server/libs/exceptions.py (1)
  • RepositoryNotFoundError (1-2)
webhook_server/libs/github_api.py (1)
  • GithubWebhook (45-374)
webhook_server/utils/helpers.py (1)
  • get_logger_with_params (22-56)
webhook_server/libs/config.py (2)
  • Config (11-87)
  • root_data (33-39)
webhook_server/utils/helpers.py (2)
webhook_server/libs/exceptions.py (1)
  • NoApiTokenError (11-12)
webhook_server/libs/config.py (2)
  • Config (11-87)
  • get_value (67-87)
webhook_server/tests/test_pull_request_owners.py (3)
webhook_server/tests/conftest.py (6)
  • get_contents (50-87)
  • owners_file_handler (140-141)
  • process_github_webhook (135-136)
  • pull_request (111-112)
  • Repository (42-87)
  • github_webhook (116-131)
webhook_server/libs/owners_files_handler.py (4)
  • get_all_repository_approvers_and_reviewers (87-125)
  • owners_data_for_changed_files (173-210)
  • get_all_pull_request_approvers (151-160)
  • get_all_pull_request_reviewers (162-171)
webhook_server/libs/pull_request_handler.py (1)
  • _check_if_pr_approved (583-647)
webhook_server/libs/pull_request_handler.py (7)
webhook_server/tests/test_add_reviewer_action.py (2)
  • PullRequest (21-29)
  • create_issue_comment (25-26)
webhook_server/tests/conftest.py (6)
  • PullRequest (95-107)
  • github_webhook (116-131)
  • owners_file_handler (140-141)
  • pull_request (111-112)
  • create_issue_comment (103-103)
  • base (100-101)
webhook_server/libs/check_run_handler.py (13)
  • set_verify_check_success (58-59)
  • set_verify_check_queued (55-56)
  • set_merge_check_queued (91-92)
  • set_run_tox_check_queued (61-65)
  • set_run_pre_commit_check_queued (76-80)
  • set_python_module_install_queued (118-122)
  • set_container_build_queued (103-107)
  • set_conventional_title_queued (137-138)
  • set_merge_check_in_progress (94-95)
  • required_check_in_progress (294-312)
  • required_check_failed_or_no_status (224-258)
  • set_merge_check_success (97-98)
  • set_merge_check_failure (100-101)
webhook_server/libs/labels_handler.py (4)
  • _add_label (56-87)
  • _remove_label (43-54)
  • add_size_label (116-135)
  • pull_request_labels_names (39-41)
webhook_server/libs/owners_files_handler.py (5)
  • OwnersFileHandler (14-301)
  • assign_reviewers (212-230)
  • root_approvers (46-51)
  • root_reviewers (38-43)
  • owners_data_for_changed_files (173-210)
webhook_server/libs/runner_handler.py (7)
  • cherry_pick (406-472)
  • run_build_container (237-345)
  • run_podman_command (159-169)
  • run_tox (171-206)
  • run_pre_commit (208-235)
  • run_install_python_module (347-380)
  • run_conventional_title_check (382-401)
webhook_server/libs/github_api.py (1)
  • container_repository_and_tag (319-341)
⏰ Context from checks skipped due to timeout of 90000ms (9)
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: can-be-merged
  • GitHub Check: python-module-install
  • GitHub Check: pre-commit
  • GitHub Check: build-container
  • GitHub Check: tox
🔇 Additional comments (42)
webhook_server/utils/webhook.py (1)

30-30: Parameter name update is consistent with function definition

The parameter name change from github_api to github_app_api aligns with the function signature in webhook_server/utils/helpers.py. This ensures consistency across the codebase.

webhook_server/libs/config.py (1)

50-50: Parameter name update is consistent with function definition

The parameter name change from github_api to github_app_api aligns with the function signature in webhook_server/utils/helpers.py. This ensures consistency across the codebase.

webhook_server/app.py (5)

4-11: Added JSON import and simplified exceptions import

The switch to synchronous json.loads() for parsing request bodies is appropriate and reflects the removal of NoPullRequestError from the exceptions module.


29-37: Improved logger and HTTP client initialization

Good change to use a module-level logger for consistency and to initialize the HTTP client in the lifespan context manager rather than at import time. This addresses cross-event-loop issues identified in previous reviews.


39-73: New async functions for fetching IP allowlists with proper error handling

The implementation of dedicated async functions for fetching GitHub and Cloudflare IP allowlists is well done. These functions now properly raise exceptions instead of silently returning empty lists, which addresses a previous review concern about failing fast when allowlists can't be fetched.


112-161: Improved lifespan context manager with proper resource management

The lifespan context manager now correctly initializes the HTTP client, performs async operations, and uses a finally block to ensure proper cleanup. The IP allowlist validation has been improved to handle invalid CIDRs individually.

However, there's still a potential issue with the ALLOWED_IPS handling:

 if networks:
     ALLOWED_IPS = tuple(networks)
     LOGGER.info(f"IP allowlist initialized successfully with {len(ALLOWED_IPS)} networks.")
+else:
+    ALLOWED_IPS = ()  # Clear any previous value
+    LOGGER.info("IP allowlist disabled or empty – all sources accepted.")

Without this change, if no networks are loaded (because verification flags are disabled or both endpoints fail), ALLOWED_IPS might retain a stale value from a previous run, as mentioned in past reviews.


189-191: Switch to synchronous JSON parsing

Using json.loads() on the raw bytes is more direct than await request.json(), which is good. Make sure the payload is properly decoded if necessary.

webhook_server/utils/helpers.py (4)

22-56: Improved logger configuration with better defaults

Good improvements to the logging setup:

  1. Setting mask_sensitive to True by default enhances security
  2. Creating the logs directory if it doesn't exist ensures logging won't fail
  3. Reducing the max log file size from 50MB to 10MB is more manageable

73-77: Enhanced repository API access with debug logging

Added debug logging improves observability when accessing repository APIs. The parameter name change also provides consistency with usage elsewhere in the codebase.


80-135: Converted run_command to async implementation

Successfully converted the command execution from synchronous to asynchronous using asyncio.create_subprocess_exec. This is a significant improvement that will prevent blocking I/O operations from affecting server performance.


147-200: Enhanced API token validation and error handling

Excellent improvements to the GitHub API token validation:

  1. Skipping APIs with rate limit of 60 (indicating invalid tokens)
  2. Better error handling when retrieving user logins
  3. Raising a proper exception when no valid API/token combination is found
webhook_server/tests/conftest.py (5)

8-10: Import order matters for environment variable setup.

Setting the environment variable WEBHOOK_SERVER_DATA_DIR before importing GithubWebhook is important because it ensures that the environment variable is available when the GithubWebhook class is initialized. Good implementation.


67-73: Good implementation of parameterizable test fixtures.

The changed_files fixture nicely handles both parameterized test cases and default values, providing flexibility for various test scenarios.


129-131: Add wait for asyncio.to_thread call.

The test needs to properly await the asynchronous GitHub API call. The current implementation correctly awaits the result.


90-93: Great implementation of test data class.

The Label class is properly implemented with a name attribute to simulate GitHub's API objects in tests.


115-132: Nice separation of fixtures for better test modularity.

The separation of github_webhook into multiple fixtures (process_github_webhook and owners_file_handler) improves test modularity and follows good testing practices.

webhook_server/libs/owners_files_handler.py (4)

33-36: Good validation of initialization requirement.

The _ensure_initialized method properly checks for required attributes and raises a clear error message when initialization hasn't been performed.


87-126: Efficient async implementation of OWNERS file retrieval.

The implementation efficiently creates tasks for fetching OWNERS file content asynchronously and uses asyncio.gather to run them in parallel, which is a good practice. The method also includes proper error handling and validation.


262-275: Good use of asyncio.functools.cached_property for expensive operations.

Using functools.cached_property for expensive operations like retrieving users with permission to run commands is a good practice, as it prevents redundant API calls.


53-55: Properly wrapping synchronous code with asyncio.to_thread.

The list_changed_files method correctly wraps the synchronous pull_request.get_files call with asyncio.to_thread, ensuring it doesn't block the event loop.

webhook_server/tests/test_pull_request_owners.py (3)

125-132: Good conversion to async test pattern.

The test has been properly converted to use async/await pattern with the @pytest.mark.asyncio decorator. It correctly awaits the asynchronous get_all_repository_approvers_and_reviewers method.


8-16: Good approach to defining test data.

The ALL_CHANGED_FILES list is a clear and concise way to define test data for multiple test scenarios.


96-123: Properly setting up test data for reviewers and approvers.

The fixtures for reviewers and approvers are well-structured and provide comprehensive test data.

webhook_server/libs/github_api.py (5)

108-126: Good asynchronous handling of pull request processing flow.

The method properly checks if the pull request is a draft and only proceeds with processing for non-draft PRs. It also correctly awaits asynchronous operations like getting the last commit.


131-165: Consistent initialization of OwnersFileHandler.

Each event handler branch correctly initializes the OwnersFileHandler and awaits its initialization before passing it to the respective handler. This ensures consistent behavior across different event types.


279-295: Properly wrapping GitHub API calls with asyncio.to_thread.

All GitHub API calls are correctly wrapped with asyncio.to_thread to prevent blocking the event loop. This is a significant improvement for the server's scalability and responsiveness.


306-309: Good conversion of _get_last_commit to async.

The method is properly converted to async and correctly wraps the synchronous pull_request.get_commits call with asyncio.to_thread.


319-342: Function now handles missing pull_request parameter.

The container_repository_and_tag method has been improved to handle the case when pull_request is None, returning None early instead of causing an error.

webhook_server/libs/pull_request_handler.py (6)

25-51: Effective dependency injection with OwnersFileHandler.

The refactoring to use OwnersFileHandler as an explicit dependency is a good approach. This improves testability and makes dependencies clearer.


53-157: Good async conversion of the main webhook handler.

Converting process_pull_request_webhook_data to async with proper awaiting of dependent tasks is well-implemented. Using asyncio.gather with return_exceptions=True is the right pattern for concurrent task execution with error handling.


259-273: Good conversion from time.sleep to asyncio.sleep.

Correctly using asyncio.sleep instead of time.sleep prevents blocking the event loop during the waiting period, allowing other coroutines to run.


313-315: Properly non-blocking GitHub API call.

Good use of asyncio.to_thread to wrap the synchronous create_issue_comment method, preventing it from blocking the event loop.


419-423: Properly non-blocking auto-merge enabling.

Good use of asyncio.to_thread to wrap the synchronous enable_automerge method, preventing it from blocking the event loop.


347-383: Excellent async task orchestration.

The process_opened_or_synchronize_pull_request method is well-refactored to use asyncio.gather for concurrent execution of multiple tasks, with proper error handling for individual tasks.

webhook_server/libs/runner_handler.py (8)

1-4: Great job on the async conversion!

The imports are correctly updated to include asyncio and the proper typing for async generators, which aligns well with the PR objective to "Use more async code".


25-27: The constructor now properly accepts OwnersFileHandler

Good implementation of dependency injection for the OwnersFileHandler, making testing easier and decoupling components.


35-43: Correctly converted to async context manager

The function has been properly converted to an async context manager with the correct return type (AsyncGenerator). This is a good pattern for resource management in async code.


128-144: Fixed walrus operator usage

You've correctly addressed a previous review comment by using _pull_request instead of pull_request for the walrus operator, avoiding parameter shadowing.


159-170: Properly converted run_podman_command to async

The run_podman_command method has been correctly converted to async, including proper awaiting of the run_command function and recursive call.


250-258: Nice addition of async user validation

Good implementation of async user validation through the OwnersFileHandler to determine if a user is allowed to run commands.


313-346: Good use of asyncio.to_thread for GitHub API calls

You've correctly wrapped the GitHub API call to create_issue_comment with asyncio.to_thread to prevent blocking the event loop. This addresses a previous review comment.


403-404: Well-designed async branch existence check

The new is_branch_exists method properly uses asyncio.to_thread to wrap the synchronous GitHub API call, maintaining the async flow throughout the code.

Comment on lines 168 to 177
apis_and_tokens = get_apis_and_tokes_from_config(config=self.config)
self.auto_verified_and_merged_users.extend([_api[0].get_user().login for _api in apis_and_tokens])
for _api, _ in apis_and_tokens:
if _api.rate_limiting[-1] == 60:
self.logger.warning(
f"{self.log_prefix} API has rate limit set to 60 which indicates an invalid token, skipping"
)
continue

self.auto_verified_and_merged_users.append(_api.get_user().login)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Check for valid API token is incomplete.

The method checks if the API rate limit is 60 (indicating an invalid token) but doesn't validate the token in other ways. Consider adding more robust token validation.


🏁 Script executed:

#!/bin/bash
# Check if there are other validation methods for GitHub API tokens
rg -A 3 -B 3 "rate_limiting" --type py

Length of output: 1179


Add robust GitHub API token validation

The current check only skips tokens whose rate_limiting[-1] is 60, but there are other failure modes (expired credentials, insufficient scopes, network issues). We should explicitly validate each token by making a lightweight API call and handling authentication errors.

Suggested changes:

  • In webhook_server/libs/github_api.py (and similarly in webhook_server/utils/helpers.py), wrap the token check in a try/except block:
    • Call _api.get_user() (or the /rate_limit endpoint) inside the try.
    • Catch authentication exceptions (e.g. github.GithubException.BadCredentialsException) and log a warning.
    • Only append login on successful validation.
  • Example diff for add_api_users_to_auto_verified_and_merged_users:
         for _api, _ in apis_and_tokens:
  •       if _api.rate_limiting[-1] == 60:
    
  •           self.logger.warning(
    
  •               f"{self.log_prefix} API has rate limit set to 60 which indicates an invalid token, skipping"
    
  •           )
    
  •           continue
    
  •       self.auto_verified_and_merged_users.append(_api.get_user().login)
    
  •       try:
    
  •           user = _api.get_user()   # will raise if token is bad
    
  •       except github.GithubException as e:
    
  •           self.logger.warning(
    
  •               f"{self.log_prefix} Invalid or expired token ({e.status}), skipping"
    
  •           )
    
  •           continue
    
  •       self.auto_verified_and_merged_users.append(user.login)
    
  • Optionally, perform a scopes check via user.raw_data.get("scopes") or a /rate_limit call to ensure the token has the required permissions before use.

These changes will catch invalid, expired, or under-scoped tokens more reliably.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
apis_and_tokens = get_apis_and_tokes_from_config(config=self.config)
self.auto_verified_and_merged_users.extend([_api[0].get_user().login for _api in apis_and_tokens])
for _api, _ in apis_and_tokens:
if _api.rate_limiting[-1] == 60:
self.logger.warning(
f"{self.log_prefix} API has rate limit set to 60 which indicates an invalid token, skipping"
)
continue
self.auto_verified_and_merged_users.append(_api.get_user().login)
apis_and_tokens = get_apis_and_tokes_from_config(config=self.config)
for _api, _ in apis_and_tokens:
try:
user = _api.get_user() # will raise if token is bad
except github.GithubException as e:
self.logger.warning(
f"{self.log_prefix} Invalid or expired token ({e.status}), skipping"
)
continue
self.auto_verified_and_merged_users.append(user.login)
🤖 Prompt for AI Agents
In webhook_server/libs/github_api.py around lines 168 to 177, the current token
validation only checks if the API rate limit is 60, which is insufficient. To
fix this, wrap the call to _api.get_user() in a try/except block that catches
authentication exceptions like github.GithubException.BadCredentialsException.
Log a warning when such exceptions occur and skip appending the user login for
invalid tokens. Optionally, add a scopes check by inspecting user.raw_data or
calling the /rate_limit endpoint to ensure the token has the necessary
permissions before appending the login.

@myakove
Copy link
Collaborator Author

myakove commented May 18, 2025

/approve

@myakove
Copy link
Collaborator Author

myakove commented May 18, 2025

/verified

@myakove myakove merged commit c631e25 into main May 18, 2025
8 of 9 checks passed
@myakove myakove deleted the more-async branch May 18, 2025 12:33
@myakove-bot
Copy link
Collaborator

New container for ghcr.io/myk-org/github-webhook-server:latest published

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants