[Feature]: Fetch GitHub PR Data and Introduce Agent-Based PR Summaries #5

@xueyulinn

Problem

The backend can already receive GitHub webhooks and post a fixed comment, but it does not yet fetch pull request details from GitHub or generate a meaningful PR summary. Without GitHub PR data integration, the system cannot understand the PR context well enough to produce useful summaries. Without an initial agent integration, the summary logic also remains limited to static or hardcoded output.

Proposed Solution

Integrate GitHub APIs for retrieving pull request-related information and introduce an initial agent-based summarization flow.

Initial scope:

  • Fetch core PR data from GitHub after receiving a supported pull_request webhook
  • Retrieve the key context needed for summarization, such as:
    • PR title and body
    • changed files
    • diff or patch metadata
    • repository and author context as needed
  • Define a structured internal DTO or domain model for PR summary input
  • Introduce an initial agent step that consumes the collected PR context and produces a concise summary
  • Post the generated summary back to the PR as a GitHub comment
  • Add logging and basic error handling for GitHub API calls and agent execution
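The data-collection part of this scope can be sketched as follows. This is a minimal illustration, not the project's actual model: the names `ChangedFile`, `PullRequestContext`, `build_context`, and `fetch_context` are hypothetical, and Python is used only for illustration. The sketch assumes the GitHub REST endpoints `GET /repos/{owner}/{repo}/pulls/{number}` and `GET /repos/{owner}/{repo}/pulls/{number}/files`, and it reads only the first page of changed files.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple

API_ROOT = "https://api.github.com"


@dataclass(frozen=True)
class ChangedFile:
    filename: str
    status: str            # e.g. "added", "modified", "removed"
    additions: int
    deletions: int
    patch: Optional[str]   # absent for binary files and very large diffs


@dataclass(frozen=True)
class PullRequestContext:
    """Hypothetical DTO carrying everything the summary step needs."""
    repo_full_name: str
    number: int
    title: str
    body: str
    author: str
    files: Tuple[ChangedFile, ...]


def build_context(pr: Dict[str, Any], files: List[Dict[str, Any]]) -> PullRequestContext:
    """Map raw GitHub JSON (pull request + file list) onto the DTO."""
    changed = tuple(
        ChangedFile(
            filename=f["filename"],
            status=f["status"],
            additions=f["additions"],
            deletions=f["deletions"],
            patch=f.get("patch"),
        )
        for f in files
    )
    return PullRequestContext(
        repo_full_name=pr["base"]["repo"]["full_name"],
        number=pr["number"],
        title=pr["title"],
        body=pr.get("body") or "",  # GitHub returns null for an empty description
        author=pr["user"]["login"],
        files=changed,
    )


def _get_json(url: str, token: str) -> Any:
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def fetch_context(repo_full_name: str, number: int, token: str) -> PullRequestContext:
    """Fetch PR details and changed files; pagination past 100 files is not handled."""
    base = f"{API_ROOT}/repos/{repo_full_name}/pulls/{number}"
    pr = _get_json(base, token)
    files = _get_json(f"{base}/files?per_page=100", token)
    return build_context(pr, files)
```

Keeping `build_context` separate from the network calls means the mapping can be unit-tested with recorded JSON, which fits the Phase 1 goal of validating the integration path before any agent is involved.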

Recommended rollout:

  • Phase 1: Fetch PR data and validate the end-to-end integration path
  • Phase 2: Introduce a basic agent-generated summary with a constrained prompt and output format
  • Phase 3: Improve summary quality, formatting, and failure handling
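For Phase 2, one possible shape for a "constrained prompt and output format" is sketched below. Everything here is an assumption made for illustration: the `Intent:` / `Changes:` layout, the size limits, and the function names `build_prompt` and `parse_summary` are hypothetical, and the call to the agent itself is deliberately left out.

```python
from typing import List, Optional, Tuple

MAX_PATCH_CHARS = 1500  # assumed per-file budget to keep the prompt bounded
MAX_BULLETS = 5         # assumed cap on the number of listed changes

PROMPT_TEMPLATE = """You summarize GitHub pull requests for reviewers.
Reply in exactly this format and nothing else:
Intent: <one sentence>
Changes:
- <up to {max_bullets} bullets, one change each>

PR title: {title}
PR description: {body}
Changed files:
{files}
"""


def build_prompt(title: str, body: str, files: List[Tuple[str, Optional[str]]]) -> str:
    """Render the prompt from PR text and (filename, patch) pairs, truncating patches."""
    sections = []
    for filename, patch in files:
        snippet = (patch or "(no textual diff available)")[:MAX_PATCH_CHARS]
        sections.append(f"--- {filename}\n{snippet}")
    return PROMPT_TEMPLATE.format(
        max_bullets=MAX_BULLETS,
        title=title,
        body=body or "(empty)",
        files="\n".join(sections),
    )


def parse_summary(raw: str) -> Tuple[str, List[str]]:
    """Validate the agent reply against the expected layout; raise ValueError otherwise."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    if len(lines) < 3 or not lines[0].startswith("Intent:") or lines[1] != "Changes:":
        raise ValueError("agent output does not match the expected format")
    intent = lines[0][len("Intent:"):].strip()
    bullets = [line[2:].strip() for line in lines[2:] if line.startswith("- ")]
    if not intent or not bullets:
        raise ValueError("agent output is missing the intent or the change list")
    return intent, bullets[:MAX_BULLETS]
```

Rejecting malformed replies instead of posting them keeps the first integration narrow: a reply that fails validation can be logged and dropped, and Phase 3 can later decide whether to retry or fall back.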

Use Case

When a pull request is opened or updated, the system should automatically gather relevant PR information from GitHub, generate a concise summary, and post that summary as a comment on the PR. This helps reviewers quickly understand the purpose and scope of the change.
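Which webhook deliveries count as "opened or updated" is not specified here. A minimal sketch, assuming the `pull_request` actions `opened`, `reopened`, and `synchronize` (the action GitHub sends when new commits are pushed to the PR branch) are the supported ones; `should_summarize` and `target_of` are hypothetical helper names:

```python
from typing import Any, Dict, Tuple

# Assumed set of supported actions; adjust once the real scope is decided.
SUPPORTED_ACTIONS = frozenset({"opened", "reopened", "synchronize"})


def should_summarize(event_name: str, payload: Dict[str, Any]) -> bool:
    """event_name is the value of the X-GitHub-Event header of the delivery."""
    return event_name == "pull_request" and payload.get("action") in SUPPORTED_ACTIONS


def target_of(payload: Dict[str, Any]) -> Tuple[str, int]:
    """Extract ("owner/repo", PR number) from a pull_request webhook payload."""
    return payload["repository"]["full_name"], payload["pull_request"]["number"]
```

Deliveries that fail `should_summarize` can simply be acknowledged and ignored, so unrelated events never reach the GitHub API or the agent.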

Expected User Experience

After a supported PR event is received, the pull request should receive a summary comment within a short time. The summary should describe the main intent of the PR and the most important changes in a clear, compact format.
Failures should be visible in logs without blocking webhook receipt.
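One way to keep failures visible in logs without blocking webhook receipt is to acknowledge the delivery immediately and run the summary pipeline on a background worker. The sketch below assumes a thread pool is acceptable for this; `accept_webhook`, `run_safely`, and the `job` callable (standing in for fetch, summarize, and comment) are hypothetical names.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Tuple

logger = logging.getLogger("pr_summary")
_executor = ThreadPoolExecutor(max_workers=2)


def run_safely(job: Callable[[str, int], None], repo: str, number: int) -> bool:
    """Run the pipeline; log any failure with a traceback instead of propagating it."""
    try:
        job(repo, number)
        return True
    except Exception:
        logger.exception("PR summary failed for %s#%d", repo, number)
        return False


def accept_webhook(job: Callable[[str, int], None], repo: str, number: int) -> Tuple[int, Future]:
    """Schedule the job and return HTTP 202 right away, regardless of the job's outcome."""
    future = _executor.submit(run_safely, job, repo, number)
    return 202, future
```

A web handler would return the status code to GitHub and discard the future; the future is returned here only so the behavior can be checked in tests.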

Alternatives Considered

  • Continue posting fixed comments without fetching PR details
    This verifies API connectivity but does not provide meaningful review value.
  • Build a rule-based summary generator without using an agent
    This is simpler, but summary quality and adaptability will likely be limited.
  • Introduce the agent before integrating GitHub PR data retrieval
    This would make debugging harder because data collection and agent behavior would be coupled too early.

Priority

Medium

Additional Context

This feature should be the next step after the current GitHub webhook and fixed-comment flow. The main goal is to move from infrastructure validation to useful PR-aware summarization, while keeping the first agent integration narrow and controlled.

Labels

enhancement (New feature or request)
