perf: optimize check_run repository cloning#930

Merged
myakove merged 4 commits into main from feature/optimize-checkrun-cloning on Nov 20, 2025

@myakove myakove commented Nov 20, 2025

Skip repository cloning for check_run webhooks that don't need it:

  • Skip when action != "completed" (~75% of check_run webhooks)
  • Skip can-be-merged checks with non-success conclusion (~15-20% more)

Benefits:

  • 90-95% reduction in unnecessary repository cloning for check_run events
  • Faster webhook processing (seconds saved per skipped clone)
  • Reduced disk I/O, network I/O, and server load
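
The 90-95% figure appears to be the sum of the two skip estimates above. The sketch below only checks that arithmetic. It assumes the second estimate is a share of all check_run webhooks rather than of the remaining ones, and the percentages are the estimates quoted in this description, not measurements.

```python
from fractions import Fraction

# Estimated share of check_run webhooks whose action is not "completed".
not_completed = Fraction(75, 100)
# Estimated extra share: can-be-merged checks with a non-success conclusion.
can_be_merged_low = Fraction(15, 100)
can_be_merged_high = Fraction(20, 100)


def combined_skip_share(first: Fraction, second: Fraction) -> Fraction:
    """Add two disjoint shares of the same total."""
    return first + second


low = combined_skip_share(not_completed, can_be_merged_low)
high = combined_skip_share(not_completed, can_be_merged_high)
print(f"skipped clones: {float(low):.0%} to {float(high):.0%}")
```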

Implementation:

  • Moved clone operation into check_run event handler
  • Added early exit checks before cloning
  • Other event types (issue_comment, pull_request, etc.) unchanged
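
The early exit checks can be sketched as a single predicate over the webhook payload. This is a minimal illustration, not the project's code: `needs_clone` is a hypothetical helper, and the two constant values are assumptions about what `CAN_BE_MERGED_STR` and `SUCCESS_STR` hold.

```python
from typing import Any

# Assumed values; the project's real constants may differ.
CAN_BE_MERGED_STR = "can-be-merged"
SUCCESS_STR = "success"


def needs_clone(hook_data: dict[str, Any]) -> bool:
    """Return True only when a check_run webhook still needs the repository."""
    # First guard: anything other than a completed check run is skipped.
    if hook_data.get("action") != "completed":
        return False
    check_run = hook_data.get("check_run", {})
    # Second guard: a can-be-merged run that did not succeed is skipped.
    if (
        check_run.get("name") == CAN_BE_MERGED_STR
        and check_run.get("conclusion") != SUCCESS_STR
    ):
        return False
    return True


print(needs_clone({"action": "created", "check_run": {"name": "tox"}}))
print(needs_clone({"action": "completed", "check_run": {"name": "tox", "conclusion": "failure"}}))
```

Keeping the decision in one pure function makes it easy to test without touching git or the GitHub API.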

Tests: All 67 check_run handler tests pass

Summary by CodeRabbit

  • Performance Improvements

    • Deferred repository cloning so work is performed only when needed, reducing unnecessary overhead.
  • Bug Fixes

    • Better check-run handling: skip processing for non-applicable actions/conclusions and ensure completion metrics are recorded consistently.
  • Documentation

    • Added comprehensive internal guidance covering API conventions, architecture patterns, testing, and operational practices.
  • Tests

    • Added tests validating cloning behavior and check-run event handling (including completed-action scenarios).

Add comprehensive tests to verify repository cloning optimization:
- test_check_run_action_not_completed_skips_clone
- test_can_be_merged_non_success_skips_clone
- test_check_run_completed_normal_clones_repository
- test_can_be_merged_success_clones_repository

Fix test_process_check_run_event:
- Add missing "action": "completed" field to check_run payload
- Required for optimization that checks action before cloning

All 72 tests pass

Add comprehensive documentation for the check_run cloning optimization
under "Critical Implementation Patterns" section.

Documents:
- Early exit conditions (action != completed, can-be-merged non-success)
- Implementation pattern with code example
- Benefits (90-95% reduction in cloning, faster processing, reduced resources)
- Test coverage reference

coderabbitai bot commented Nov 20, 2025

Walkthrough

Adds CLAUDE.md internal API documentation, changes check_run processing to early-skip non-completed or non-success can-be-merged checks and defer repository cloning until required, and expands tests to verify cloning behavior (with a duplicated test class present).

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation: CLAUDE.md | New internal API guidance covering API philosophy, defensive-programming rules, fail-fast patterns, non-blocking GitHub API usage, testing, logging/exception patterns, security, and operational guidance with examples. |
| Check Run Processing Logic: webhook_server/libs/github_api.py | Import SUCCESS_STR; add guards to skip processing for non-completed check_run actions and for CAN_BE_MERGED checks with non-success conclusions; defer cloning until after early-skip checks and clone only when processing proceeds; extend logging and token-metrics emission. |
| Check Run Handler Tests: webhook_server/tests/test_check_run_handler.py | Adds TestCheckRunRepositoryCloning tests verifying when _clone_repository is invoked or skipped based on action/conclusion; updated imports (Headers, GithubWebhook). Note: the test class appears duplicated in the file. |
| GitHub API Tests: webhook_server/tests/test_github_api.py | Adds "action": "completed" to check_run event test data in test_process_check_run_event. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Areas needing extra attention:
    • webhook_server/libs/github_api.py — verify all early-skip branches maintain correct metric/logging and that cloning is only invoked when repository data is required.
    • webhook_server/tests/test_check_run_handler.py — remove or confirm intentional duplication of TestCheckRunRepositoryCloning.
    • CLAUDE.md — ensure documentation accurately reflects implemented behavior (especially cloning and defensive-check guidance) and does not conflict with code.

Suggested labels

size/XL

Suggested reviewers

  • rnetser

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'perf: optimize check_run repository cloning' directly and concisely summarizes the main change: optimizing repository cloning for check_run events through early-exit checks. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%. |

@myakove-bot

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Pre-commit Checks: pre-commit runs automatically if .pre-commit-config.yaml exists
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest python-module-install - Test Python package installation
  • /retest pre-commit - Run pre-commit hooks and checks
  • /retest conventional-title - Validate commit message format
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 1 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No WIP, hold, or conflict labels
  5. Verified: PR must be marked as verified (if verification is enabled)

📊 Review Process

Approvers and Reviewers

Approvers:

  • myakove
  • rnetser

Reviewers:

  • myakove
  • rnetser
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
  • automerge

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is automatically removed on each new commit
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
CLAUDE.md (1)

1103-1117: Remove duplicate “Adding a New Handler” heading

There are two “### Adding a New Handler” sections with nearly identical step lists. This trips MD024 and makes the doc slightly confusing.

Suggest keeping a single section and merging any extra details there.

webhook_server/libs/github_api.py (1)

461-577: Check_run cloning deferral and early‑exit guards look correct

The new logic:

  • Skips the shared _clone_repository() call when github_event == "check_run".
  • In the check_run branch, returns early (no clone, no handlers) when:
    • action != "completed", or
    • check_run.name == CAN_BE_MERGED_STR and conclusion != SUCCESS_STR.
  • Only after those checks does it log and call _clone_repository(pull_request=pull_request), then run OwnersFileHandler + CheckRunHandler and, for non‑CAN_BE_MERGED_STR checks, PullRequestHandler.check_if_can_be_merged.

This preserves previous behavior (no work for non‑completed or failing can‑be‑merged runs) while eliminating almost all unnecessary clones, and keeps token‑metrics logging consistent.

One potential follow‑up, given the repo’s focus on minimizing API calls: you could move the action / can‑be‑merged guards to just after event_log and before get_pull_request(), so skip cases avoid the extra PR + last‑commit lookups entirely, at the cost of slightly less PR‑annotated logging.
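
A minimal sketch of that ordering, under stated assumptions: `WebhookSketch`, `skip_reason`, and the counters are hypothetical stand-ins rather than the project's API, and the constant values are assumed. It only illustrates that guards evaluated from the payload alone can run before any pull request lookup.

```python
import asyncio
from typing import Any, Optional

CAN_BE_MERGED_STR = "can-be-merged"  # assumed value of the project's constant
SUCCESS_STR = "success"  # assumed value of the project's constant


class WebhookSketch:
    """Hypothetical stand-in for the webhook processor, not the project's class."""

    def __init__(self, hook_data: dict[str, Any]) -> None:
        self.hook_data = hook_data
        self.pr_lookups = 0  # counts simulated GitHub API round trips
        self.clones = 0  # counts simulated repository clones

    def skip_reason(self) -> Optional[str]:
        """Decide from the payload alone whether the event can be dropped."""
        if self.hook_data.get("action") != "completed":
            return "action is not completed"
        check_run = self.hook_data.get("check_run", {})
        if (
            check_run.get("name") == CAN_BE_MERGED_STR
            and check_run.get("conclusion") != SUCCESS_STR
        ):
            return "can-be-merged did not succeed"
        return None

    async def get_pull_request(self) -> dict[str, int]:
        self.pr_lookups += 1
        return {"number": 930}

    async def clone_repository(self, pull_request: dict[str, int]) -> None:
        self.clones += 1

    async def process_check_run(self) -> Optional[str]:
        reason = self.skip_reason()
        if reason is not None:
            return reason  # skipped: no PR lookup and no clone
        pull_request = await self.get_pull_request()
        await self.clone_repository(pull_request)
        return None


skipped = WebhookSketch({"action": "created", "check_run": {"name": "tox"}})
print(asyncio.run(skipped.process_check_run()), skipped.pr_lookups, skipped.clones)
```

The trade-off named above still applies: a skipped event logged this way carries no pull request context.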

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 586f4eb and 6914037.

📒 Files selected for processing (4)
  • CLAUDE.md (1 hunks)
  • webhook_server/libs/github_api.py (3 hunks)
  • webhook_server/tests/test_check_run_handler.py (2 hunks)
  • webhook_server/tests/test_github_api.py (1 hunks)
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 0
File: :0-0
Timestamp: 2025-10-28T16:09:08.689Z
Learning: For this repository, prioritize speed and minimizing API calls in reviews and suggestions: reuse webhook payload data, batch GraphQL queries, cache IDs (labels/users), and avoid N+1 patterns.
📚 Learning: 2025-05-13T12:06:27.297Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 778
File: webhook_server/libs/pull_request_handler.py:327-330
Timestamp: 2025-05-13T12:06:27.297Z
Learning: In the GitHub webhook server, synchronous GitHub API calls (like create_issue_comment, add_to_assignees, etc.) in async methods should be awaited using asyncio.to_thread or loop.run_in_executor to prevent blocking the event loop.

Applied to files:

  • CLAUDE.md
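
As an illustration of this learning, the sketch below offloads a blocking method call and a blocking property read with `asyncio.to_thread`. `FakePullRequest` is a hypothetical stand-in for a synchronous client object; the sleeps simulate network latency and are not taken from the project.

```python
import asyncio
import time


class FakePullRequest:
    """Hypothetical stand-in for a blocking, synchronous API object."""

    def create_issue_comment(self, body: str) -> str:
        time.sleep(0.05)  # simulated network round trip
        return f"commented: {body}"

    @property
    def title(self) -> str:
        time.sleep(0.05)  # simulated lazy-loaded attribute fetch
        return "perf: optimize check_run repository cloning"


async def comment_and_read_title(pr: FakePullRequest) -> tuple[str, str]:
    # Method call: pass the callable and its arguments to the worker thread.
    comment = await asyncio.to_thread(pr.create_issue_comment, "hello")
    # Property access: wrap it in a lambda so the read happens off the loop.
    title = await asyncio.to_thread(lambda: pr.title)
    return comment, title


result = asyncio.run(comment_and_read_title(FakePullRequest()))
print(result)
```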
📚 Learning: 2024-10-29T10:42:50.163Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 612
File: webhook_server_container/libs/github_api.py:925-926
Timestamp: 2024-10-29T10:42:50.163Z
Learning: In `webhook_server_container/libs/github_api.py`, the method `self._keep_approved_by_approvers_after_rebase()` must be called after removing labels when synchronizing a pull request. Therefore, it should be placed outside the `ThreadPoolExecutor` to ensure it runs sequentially after label removal.

Applied to files:

  • CLAUDE.md
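
The ordering constraint in this learning can be sketched with the standard library alone. The function names below are hypothetical placeholders, not the project's methods; the point is only that work placed after the `with` block runs once every pooled task has finished.

```python
from concurrent.futures import ThreadPoolExecutor


def remove_label(labels: set[str], name: str) -> str:
    """Hypothetical label removal; returns the name it removed."""
    labels.discard(name)
    return name


def keep_approved_after_rebase(labels: set[str], log: list) -> None:
    """Hypothetical follow-up step that must see the final label set."""
    log.append(("keep_approved", sorted(labels)))


def synchronize(labels: set[str], to_remove: list[str]) -> list:
    log: list = []
    with ThreadPoolExecutor() as executor:
        # Removals run concurrently; map yields results in input order.
        for removed in executor.map(lambda n: remove_label(labels, n), to_remove):
            log.append(("removed", removed))
    # Leaving the with-block joins all workers, so every removal is done here.
    keep_approved_after_rebase(labels, log)
    return log


events = synchronize({"lgtm", "verified", "size/XL"}, ["lgtm", "verified"])
print(events)
```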
📚 Learning: 2025-10-28T16:09:08.689Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 0
File: :0-0
Timestamp: 2025-10-28T16:09:08.689Z
Learning: For this repository, prioritize speed and minimizing API calls in reviews and suggestions: reuse webhook payload data, batch GraphQL queries, cache IDs (labels/users), and avoid N+1 patterns.

Applied to files:

  • CLAUDE.md
📚 Learning: 2025-10-28T13:04:00.466Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 878
File: webhook_server/libs/handlers/runner_handler.py:491-571
Timestamp: 2025-10-28T13:04:00.466Z
Learning: In webhook_server/libs/handlers/runner_handler.py, the run_build_container method is designed with the pattern that push=True is always called with set_check=False in production code, so no check-run status needs to be finalized after push operations.

Applied to files:

  • webhook_server/tests/test_check_run_handler.py
📚 Learning: 2024-10-29T08:09:57.157Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 612
File: webhook_server_container/libs/github_api.py:2089-2100
Timestamp: 2024-10-29T08:09:57.157Z
Learning: In `webhook_server_container/libs/github_api.py`, when the function `_keep_approved_by_approvers_after_rebase` is called, existing approval labels have already been cleared after pushing new changes, so there's no need to check for existing approvals within this function.

Applied to files:

  • webhook_server/libs/github_api.py
📚 Learning: 2024-10-08T09:19:56.185Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 586
File: webhook_server_container/libs/github_api.py:1947-1956
Timestamp: 2024-10-08T09:19:56.185Z
Learning: In `webhook_server_container/libs/github_api.py`, the indentation style used in the `set_pull_request_automerge` method is acceptable as per the project's coding standards.

Applied to files:

  • webhook_server/libs/github_api.py
📚 Learning: 2024-10-14T14:13:21.316Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 588
File: webhook_server_container/libs/github_api.py:1632-1637
Timestamp: 2024-10-14T14:13:21.316Z
Learning: In the `ProcessGithubWehook` class in `webhook_server_container/libs/github_api.py`, avoid using environment variables to pass tokens because multiple commands with multiple tokens can run at the same time.

Applied to files:

  • webhook_server/libs/github_api.py
📚 Learning: 2025-10-30T00:18:06.176Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 878
File: webhook_server/libs/github_api.py:111-118
Timestamp: 2025-10-30T00:18:06.176Z
Learning: In webhook_server/libs/github_api.py, when creating temporary directories or performing operations that need repository names, prefer using self.repository_name (from webhook payload, always available) over dereferencing self.repository.name or self.repository_by_github_app.name, which may be None. This avoids AttributeError and keeps the code simple and reliable.

Applied to files:

  • webhook_server/libs/github_api.py
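
A small sketch of this learning, assuming a payload shaped like `{"repository": {"name": ...}}`. `WebhookSketch` and `make_clone_dir` are hypothetical names; only `repository_name` and `repository` come from the learning above.

```python
import os
import tempfile
from typing import Any, Optional


class WebhookSketch:
    """Hypothetical stand-in; only illustrates where the name comes from."""

    def __init__(self, hook_data: dict[str, Any], repository: Optional[Any] = None) -> None:
        # Always present in the webhook payload, so safe to read eagerly.
        self.repository_name: str = hook_data["repository"]["name"]
        # API-backed object that may be None if the lookup failed.
        self.repository = repository

    def make_clone_dir(self) -> str:
        # Uses the payload name, so this works even when self.repository is None.
        return tempfile.mkdtemp(prefix=f"{self.repository_name}-")


webhook = WebhookSketch({"repository": {"name": "github-webhook-server"}})
clone_dir = webhook.make_clone_dir()
print(os.path.basename(clone_dir).startswith("github-webhook-server-"))
os.rmdir(clone_dir)  # clean up the empty temporary directory
```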
🧬 Code graph analysis (2)
webhook_server/tests/test_check_run_handler.py (4)
webhook_server/libs/github_api.py (2)
  • GithubWebhook (77-848)
  • process (368-623)
webhook_server/libs/handlers/check_run_handler.py (1)
  • process_pull_request_check_run_webhook_data (48-133)
webhook_server/libs/handlers/owners_files_handler.py (1)
  • initialize (30-56)
webhook_server/libs/handlers/pull_request_handler.py (1)
  • check_if_can_be_merged (928-1047)
webhook_server/libs/github_api.py (1)
webhook_server/utils/helpers.py (1)
  • format_task_fields (135-154)
🪛 LanguageTool
CLAUDE.md

[uncategorized] ~543-~543: The official name of this software platform is spelled with a capital “H”.
Context: ...- All handlers follow a common pattern: __init__(github_webhook, ...) → `process_event(event_d...

(GITHUB)


[uncategorized] ~826-~826: The official name of this software platform is spelled with a capital “H”.
Context: ...ooks that don't need it. Location: webhook_server/libs/github_api.py lines 534-570 **Early exit con...

(GITHUB)


[style] ~836-~836: ‘without success’ might be wordy. Consider a shorter alternative.
Context: ...utral, skipped` - Cannot automerge without success conclusion Implementation pattern:...

(EN_WORDINESS_PREMIUM_WITHOUT_SUCCESS)


[typographical] ~991-~991: Consider using an en dash here instead of a hyphen.
Context: ...ry method/property access can block for 100ms-2 seconds - Blocking = frozen server ...

(QB_NEW_EN_DASH_RULE_EN)


[uncategorized] ~1105-~1105: The official name of this software platform is spelled with a capital “H”.
Context: ...ook_server/libs/handlers/2. Implementinit(self, github_webhook, ...)andprocess_event(event...

(GITHUB)


[uncategorized] ~1106-~1106: The official name of this software platform is spelled with a capital “H”.
Context: ... and process_event(event_data) 3. Use self.github_webhook.unified_api for GitHub operati...

(GITHUB)


[uncategorized] ~1113-~1113: The official name of this software platform is spelled with a capital “H”.
Context: ...ook_server/libs/handlers/2. Implementinit(self, github_webhook, ...)andprocess_event(event...

(GITHUB)


[uncategorized] ~1114-~1114: The official name of this software platform is spelled with a capital “H”.
Context: ... and process_event(event_data) 3. Use self.github_webhook for GitHub operations 4. Add t...

(GITHUB)

🪛 markdownlint-cli2 (0.18.1)
CLAUDE.md

519-519: Multiple headings with the same content

(MD024, no-duplicate-heading)


525-525: Multiple headings with the same content

(MD024, no-duplicate-heading)


839-839: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


1110-1110: Multiple headings with the same content

(MD024, no-duplicate-heading)

🔇 Additional comments (2)
webhook_server/tests/test_github_api.py (1)

727-734: Check_run test correctly updated for action="completed" guard

Including "action": "completed" in check_run_data keeps this test aligned with the new check_run early‑exit logic in GithubWebhook.process() and ensures the “happy path” still exercises cloning + handlers.

Looks good as is.

webhook_server/tests/test_check_run_handler.py (1)

6-8: New check_run cloning tests accurately lock in desired behavior

The TestCheckRunRepositoryCloning suite cleanly verifies:

  • No clone for non‑completed actions and for can-be-merged with non‑success conclusions.
  • A single _clone_repository(pull_request=...) call for completed non‑can-be-merged checks and for can-be-merged with success.
  • That these paths still go through the expected handlers and return None.

Mock setup mirrors existing GithubWebhook tests and avoids extra API/git calls, keeping the tests fast while tightly constraining the new optimization.
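
The shape of such tests can be sketched with `unittest.mock.AsyncMock`. This is not the project's test suite: `CheckRunProcessor` is a hypothetical stand-in for `GithubWebhook`, the constant values are assumed, and the placeholder pull request is invented for illustration.

```python
import asyncio
from typing import Any
from unittest.mock import AsyncMock

CAN_BE_MERGED_STR = "can-be-merged"  # assumed value
SUCCESS_STR = "success"  # assumed value


class CheckRunProcessor:
    """Hypothetical stand-in for the class under test."""

    def __init__(self, hook_data: dict[str, Any]) -> None:
        self.hook_data = hook_data
        self._clone_repository = AsyncMock()  # records awaits, does no git work

    async def process(self) -> None:
        if self.hook_data.get("action") != "completed":
            return None
        check_run = self.hook_data.get("check_run", {})
        if (
            check_run.get("name") == CAN_BE_MERGED_STR
            and check_run.get("conclusion") != SUCCESS_STR
        ):
            return None
        await self._clone_repository(pull_request="pr-placeholder")
        return None


def test_action_not_completed_skips_clone() -> None:
    processor = CheckRunProcessor({"action": "created", "check_run": {"name": "tox"}})
    assert asyncio.run(processor.process()) is None
    processor._clone_repository.assert_not_called()


def test_completed_normal_check_clones_once() -> None:
    processor = CheckRunProcessor(
        {"action": "completed", "check_run": {"name": "tox", "conclusion": "failure"}}
    )
    assert asyncio.run(processor.process()) is None
    processor._clone_repository.assert_awaited_once_with(pull_request="pr-placeholder")


test_action_not_completed_skips_clone()
test_completed_normal_check_clones_once()
```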

Looks solid.

Also applies to: 983-1322

- Align PyGithub usage guidance with non-blocking pattern using asyncio.to_thread()
- Remove duplicate 'Adding a New Handler' section
- Address CodeRabbit AI review comments (HIGH and LOW priority)
myakove commented Nov 20, 2025

/approve
/verified

@myakove myakove merged commit 18fc33b into main Nov 20, 2025
7 of 9 checks passed
@myakove myakove deleted the feature/optimize-checkrun-cloning branch November 20, 2025 21:45
@myakove-bot

New container for ghcr.io/myk-org/github-webhook-server:latest published

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
CLAUDE.md (1)

843-843: Add blank line before fenced code block per Markdown formatting rules.

Line 843 is missing a blank line before the code block (MD031 violation). Insert a blank line between **Implementation pattern:** and the opening triple backticks.

 **Implementation pattern:**
+
 ```python
 elif self.github_event == "check_run":
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6914037 and 2c18d9b.

📒 Files selected for processing (1)
  • CLAUDE.md (1 hunks)
🪛 LanguageTool
CLAUDE.md

[uncategorized] ~547-~547: The official name of this software platform is spelled with a capital “H”.
Context: ...- All handlers follow a common pattern: __init__(github_webhook, ...) → `process_event(event_d...

(GITHUB)


[uncategorized] ~830-~830: The official name of this software platform is spelled with a capital “H”.
Context: ...ooks that don't need it. Location: webhook_server/libs/github_api.py lines 534-570 **Early exit con...

(GITHUB)


[style] ~840-~840: ‘without success’ might be wordy. Consider a shorter alternative.
Context: ...utral, skipped` - Cannot automerge without success conclusion Implementation pattern:...

(EN_WORDINESS_PREMIUM_WITHOUT_SUCCESS)


[typographical] ~995-~995: Consider using an en dash here instead of a hyphen.
Context: ...ry method/property access can block for 100ms-2 seconds - Blocking = frozen server ...

(QB_NEW_EN_DASH_RULE_EN)


[uncategorized] ~1109-~1109: The official name of this software platform is spelled with a capital “H”.
Context: ...ook_server/libs/handlers/2. Implementinit(self, github_webhook, ...)andprocess_event(event...

(GITHUB)


[uncategorized] ~1110-~1110: The official name of this software platform is spelled with a capital “H”.
Context: ... and process_event(event_data) 3. Use self.github_webhook.unified_api for GitHub operati...

(GITHUB)

🪛 markdownlint-cli2 (0.18.1)
CLAUDE.md

523-523: Multiple headings with the same content

(MD024, no-duplicate-heading)


529-529: Multiple headings with the same content

(MD024, no-duplicate-heading)


843-843: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🔇 Additional comments (2)
CLAUDE.md (2)

1-100: Excellent comprehensive internal API documentation.

The backward compatibility policy, anti-defensive programming philosophy, and decision tree provide clear guidance that will help developers write safer, more performant code. The fail-fast principle section is particularly well-articulated with concrete before/after examples.


421-436: Documentation and dependency version verification confirmed—no issues found.

PyGithub v2.4.0 includes the .close() method, which aligns with the pyproject.toml specification of pygithub>=2.4.0. The CLAUDE.md guidance correctly and consistently emphasizes wrapping all PyGithub operations (methods, properties, and lazy-loaded attributes) with asyncio.to_thread() across all documented sections (lines 421–436, 651–715, and 932–1000). The code examples are accurate and follow async best practices for non-blocking operations in FastAPI.

Comment on lines +826 to +875
### Repository Cloning Optimization for check_run Events

**Optimization implemented:** Repository cloning is skipped for check_run webhooks that don't need it.

**Location:** `webhook_server/libs/github_api.py` lines 534-570

**Early exit conditions (no clone needed):**
1. **Action != "completed"** (~75% of check_run webhooks)
   - Actions: `queued`, `in_progress`, `created`, `requested`
   - These webhooks are informational only, no processing needed

2. **Can-be-merged with non-success conclusion** (~15-20% of remaining webhooks)
   - Check name: `can-be-merged`
   - Conclusions: `failure`, `cancelled`, `timed_out`, `action_required`, `neutral`, `skipped`
   - Cannot automerge without success conclusion

**Implementation pattern:**
```python
elif self.github_event == "check_run":
    # Check if we need to process this check_run
    action = self.hook_data.get("action", "")
    if action != "completed":
        # Log and return early (no clone)
        return None

    # Check if this is can-be-merged with non-success conclusion
    check_run_name = self.hook_data.get("check_run", {}).get("name", "")
    check_run_conclusion = self.hook_data.get("check_run", {}).get("conclusion", "")

    if check_run_name == CAN_BE_MERGED_STR and check_run_conclusion != SUCCESS_STR:
        # Log and return early (no clone)
        return None

    # Only clone when actually needed
    await self._clone_repository(pull_request=pull_request)
    # ... rest of processing
```

**Benefits:**
- **90-95% reduction** in unnecessary repository cloning for check_run events
- **Faster webhook processing** - saves 5-30 seconds per skipped clone (depending on repo size)
- **Reduced resource usage** - less disk I/O, network I/O, and CPU usage
- **Lower server load** - especially during high webhook volume periods

**Other event types unchanged:**
- `issue_comment` - still clones before processing
- `pull_request` - still clones before processing
- `pull_request_review` - still clones before processing

**Tests:** `webhook_server/tests/test_check_run_handler.py` - `TestCheckRunRepositoryCloning` class

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Repository cloning optimization for check_run events is well-documented.

The implementation details match PR objectives: skip ~75% of webhooks when action != "completed", and additional ~15-20% for can-be-merged checks with non-success conclusions. The pattern and early-exit logic are clearly explained. References to webhook_server/libs/github_api.py lines 534-570 are helpful.

Verify that the skip conditions in the actual implementation match this documentation:


🏁 Script executed:

#!/bin/bash
# Verify check_run cloning optimization matches documentation
rg -B5 -A20 'action != "completed"' webhook_server/libs/github_api.py
rg -B3 -A10 'CAN_BE_MERGED_STR.*SUCCESS_STR' webhook_server/libs/github_api.py

Length of output: 2471


I need to verify the valid check_run actions more carefully, as the documentation lists some values that don't match GitHub's webhook specification.

Based on my verification, the documentation contains a factual error about which actions are being skipped. The implementation code itself is correct and properly handles the skip conditions.

Correct the documentation: check_run webhook action types are misidentified.

Repository and organization webhooks only receive payloads for the created and completed event types for check_run events. The documentation incorrectly lists "queued" and "in_progress" as webhook actions—these are check run status values (queued, in_progress, requested, waiting, pending, or completed), not action types.

The implementation correctly checks action != "completed", which properly handles webhook events. However, the documentation's claimed ~75% reduction is based on a false premise since webhooks only send "created" and "completed" actions to repository webhooks (not "queued" and "in_progress"). The actual optimization primarily benefits the second condition (can-be-merged with non-success conclusion).

Update the documentation to:

  • Remove the erroneous actions: "queued", "in_progress", "requested"
  • Clarify: For repository webhooks, only "created" and "completed" actions are received
  • Revise the percentage estimate since the ~75% skipping doesn't apply to standard webhooks
  • Note that the primary optimization is for can-be-merged checks with non-success conclusions
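
The distinction drawn here, between the top-level webhook `action` and the check run's own `status`, can be illustrated with two hand-written payloads. The `summarize` helper is hypothetical, and the payloads are trimmed examples rather than captured webhooks.

```python
from typing import Any


def summarize(hook_data: dict[str, Any]) -> dict[str, Any]:
    """Separate the webhook action from the check run's status and conclusion."""
    check_run = hook_data.get("check_run", {})
    return {
        "action": hook_data.get("action"),  # why the webhook was sent
        "status": check_run.get("status"),  # where the check run is in its lifecycle
        "conclusion": check_run.get("conclusion"),  # outcome, set once finished
        # The guard in the PR compares the action, never the status.
        "passes_action_guard": hook_data.get("action") == "completed",
    }


created = {
    "action": "created",
    "check_run": {"name": "tox", "status": "in_progress", "conclusion": None},
}
completed = {
    "action": "completed",
    "check_run": {"name": "tox", "status": "completed", "conclusion": "success"},
}

print(summarize(created))
print(summarize(completed))
```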

🤖 Prompt for AI Agents
In CLAUDE.md around lines 826 to 875, the check_run actions list and the ~75%
skip estimate are incorrect: remove "queued", "in_progress", "requested" (these
are status values, not webhook actions), update the text to state that
repository/organization webhooks only receive "created" and "completed" actions
for check_run, revise or remove the ~75% skip estimate (adjust wording to avoid
the misleading percentage), and emphasize that the main optimization is skipping
clones for can-be-merged checks with non-success conclusions while retaining the
existing code behavior that checks action != "completed".
