# Unit 4

## GitHub Webhook Integration

# 🚀 Building the Code Review Assistant: GitHub Webhook Integration

Welcome back! So far, you've learned how to set up a **FastAPI backend**, manage changesets, and handle errors and validation in your **Code Review Assistant** project. In this lesson, we'll take the next step by connecting your application to **GitHub** using **webhooks**.

A **webhook** is a way for one application to send **real-time data** to another application when a specific event happens. In our case, we want GitHub to notify our Code Review Assistant whenever someone opens or updates a pull request. This allows us to automatically start a code review process as soon as new code is submitted.

By the end of this lesson, you'll know how to receive GitHub webhook events, verify their authenticity, extract pull request data, and trigger your code review workflow automatically.

-----

## Recall: FastAPI Endpoints and Request Handling

Before we dive in, let's quickly remind ourselves how FastAPI handles HTTP requests. In previous lessons, you learned how to define endpoints using FastAPI and how to read data from incoming requests.

For example, to create a **`POST`** endpoint and read the request body, you might write:

```python
from fastapi import FastAPI, Request
app = FastAPI()

@app.post("/example")
async def example_endpoint(request: Request):
    data = await request.json()
    return {"received": data}
```

Here, `@app.post("/example")` defines a **`POST`** endpoint. The `request: Request` parameter allows you to access the raw request, and `await request.json()` reads the JSON body sent by the client.

This pattern is the foundation for receiving webhook events, which are just HTTP **`POST`** requests sent by GitHub to your application.
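
To make the idea concrete, a webhook delivery can be pictured as a set of headers plus a JSON body. The sketch below builds such a delivery locally. `build_delivery` is a hypothetical helper, and the payload is a trimmed-down illustration rather than the full schema GitHub sends. `X-GitHub-Event` and `X-GitHub-Delivery` are headers GitHub includes with each delivery.

```python
import json
import uuid


def build_delivery(event_name: str, payload: dict) -> tuple[dict, bytes]:
    """Build headers and a JSON body resembling a webhook delivery (illustrative only)."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-GitHub-Event": event_name,            # event type, e.g. "pull_request"
        "X-GitHub-Delivery": str(uuid.uuid4()),  # unique ID for this delivery
    }
    return headers, body


headers, body = build_delivery("pull_request", {"action": "opened", "number": 1})
print(headers["X-GitHub-Event"])   # pull_request
print(json.loads(body)["action"])  # opened
```

A helper like this is handy later for exercising your endpoint with realistic requests before pointing a real repository at it.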

-----

## Setting Up a GitHub Webhook Endpoint in FastAPI

Let's start by creating a FastAPI endpoint that can receive webhook events from GitHub.

First, we need to define a **`POST`** endpoint. Since GitHub sends the webhook data as a raw payload, we'll read the body as **bytes**:

```python
from fastapi import APIRouter, Request

router = APIRouter()

@router.post("/webhook/github")
async def github_webhook(request: Request):
    payload = await request.body()
    # We will process the payload in the next steps
    return {"status": "received"}
```

  * `@router.post("/webhook/github")` creates a new **`POST`** endpoint at `/webhook/github`.
  * `payload = await request.body()` reads the raw bytes sent by GitHub. This is critical because we may need the **exact bytes** for signature verification.
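
The reason the exact bytes matter can be seen without a server. In this small sketch (the secret and payload are made-up values), parsing the JSON and serializing it again produces different bytes, so an HMAC computed over the result no longer matches one computed over the original body.

```python
import hashlib
import hmac
import json

secret = b"example-secret"  # made-up value for illustration
raw_body = b'{"action":"opened","number":1}'  # compact JSON, as it might arrive

# Round-trip through a dict: json.dumps inserts spaces after ':' and ',' by default
reserialized = json.dumps(json.loads(raw_body)).encode("utf-8")


def digest(data: bytes) -> str:
    """HMAC-SHA256 of the data, hex-encoded."""
    return hmac.new(secret, data, hashlib.sha256).hexdigest()


print(raw_body == reserialized)                  # False: the bytes differ
print(digest(raw_body) == digest(reserialized))  # False: so do the digests
```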

At this point, your application can receive webhook events from GitHub. However, we need to make sure these events are actually coming from GitHub and not from someone else.

-----

## Verifying GitHub Webhook Signatures

Security is paramount. GitHub allows you to set a **secret** when configuring a webhook. When GitHub sends a webhook event, it includes a **signature** in the headers. Your application should verify this signature to ensure the request is genuine.

Here's how you can verify the signature using **HMAC** and **SHA-256**:

```python
import hmac
import hashlib
from fastapi import Header, HTTPException

def verify_github_signature(payload: bytes, signature: str, secret: str) -> bool:
    if not signature or not signature.startswith('sha256='):
        return False
        
    expected = 'sha256=' + hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    
    return hmac.compare_digest(expected, signature)
```

  * `payload` is the raw request body.
  * `signature` is the value from the **`X-Hub-Signature-256`** header.
  * `secret` is the webhook secret you set in GitHub.
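
A quick way to check the function is to sign a payload yourself, the same way the sender would, and confirm that verification succeeds for the original body and fails after any change. The sketch below repeats `verify_github_signature` so it runs on its own. `sign_payload` and the secret are hypothetical and meant for local testing only.

```python
import hashlib
import hmac


def verify_github_signature(payload: bytes, signature: str, secret: str) -> bool:
    if not signature or not signature.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)


def sign_payload(payload: bytes, secret: str) -> str:
    """Hypothetical test helper: build a header value in the sha256=<hex> format."""
    return "sha256=" + hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()


secret = "test-secret"  # made-up value
body = b'{"action": "opened"}'
signature = sign_payload(body, secret)

print(verify_github_signature(body, signature, secret))         # True
print(verify_github_signature(body + b" ", signature, secret))  # False: body changed
print(verify_github_signature(body, None, secret))              # False: header missing
```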

To use this in your endpoint:

```python
@router.post("/webhook/github")
async def github_webhook(
    request: Request,
    x_hub_signature_256: str = Header(None)):
    
    payload = await request.body()
    GITHUB_SECRET = "your-secret"  # Replace with your actual secret
    
    if not verify_github_signature(payload, x_hub_signature_256, GITHUB_SECRET):
        raise HTTPException(status_code=401, detail="Invalid signature")
        
    # Continue processing the payload
    return {"status": "received"}
```

If the signature is invalid, the endpoint returns a **401 Unauthorized** error. This helps protect your application from unwanted or malicious requests.
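
Hardcoding the secret, as in the placeholder above, is fine for a walkthrough but not for a deployed service. One common alternative is to read it from the environment. The variable name `GITHUB_WEBHOOK_SECRET` and the helper `load_webhook_secret` below are assumptions made for this sketch, not names used elsewhere in the project.

```python
import os


def load_webhook_secret(env_var: str = "GITHUB_WEBHOOK_SECRET") -> str:
    """Read the webhook secret from the environment, failing loudly if it is missing."""
    secret = os.environ.get(env_var, "")
    if not secret:
        raise RuntimeError(f"{env_var} is not set; cannot verify webhook signatures")
    return secret


# Set here only so the sketch runs on its own; in practice the deployment provides it
os.environ["GITHUB_WEBHOOK_SECRET"] = "example-secret"
print(load_webhook_secret())  # example-secret
```

Failing at startup when the secret is missing is usually preferable to silently accepting unsigned requests.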

-----

## Extracting and Parsing Pull Request Data

Once you've verified the webhook, you need to extract the pull request information from the payload. GitHub sends the data as JSON, so you can parse it like this:

```python
import json

@router.post("/webhook/github")
async def github_webhook(
    request: Request,
    x_hub_signature_256: str = Header(None)):
    
    payload = await request.body()
    # Assume signature is already verified
    
    event = json.loads(payload)
    
    if event.get('action') in ['opened', 'synchronize']:
        pr = event['pull_request']
        title = pr['title']
        description = pr.get('body') or ''  # body is null when the PR has no description
        author = pr['user']['login']
        diff_url = pr['diff_url']
        
        print("Pull Request Title:", title)
        print("Description:", description)
        print("Author:", author)
        print("Diff URL:", diff_url)
        
    return {"status": "received"}
```

  * `event = json.loads(payload)` parses the JSON payload.
  * We check if the action is **`'opened'`** or **`'synchronize'`**, which are the events we care about for new or updated pull requests.
  * We extract the pull request title, description, author, and the URL to the diff.

**Example Output:**

```
Pull Request Title: Add new feature
Description: This PR adds a new feature to the application
Author: developer123
Diff URL: https://github.com/repo/pull/123.diff
```

This information is essential for tracking changes and starting the review process.
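
The extraction logic can also be pulled into a plain function, which makes it easy to test without a running server. The sketch below is one possible shape: `PullRequestInfo` and `extract_pr_info` are hypothetical names, and the sample payload contains only the fields used in this lesson. It also guards against a `body` that is `null`, which GitHub can send for a pull request with no description.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequestInfo:
    title: str
    description: str
    author: str
    diff_url: str


def extract_pr_info(payload: bytes) -> Optional[PullRequestInfo]:
    """Return PR details for 'opened'/'synchronize' events, or None for anything else."""
    event = json.loads(payload)
    if event.get("action") not in ("opened", "synchronize"):
        return None
    pr = event["pull_request"]
    return PullRequestInfo(
        title=pr["title"],
        description=pr.get("body") or "",  # body may be missing or null
        author=pr["user"]["login"],
        diff_url=pr["diff_url"],
    )


sample = json.dumps({
    "action": "opened",
    "pull_request": {
        "title": "Add new feature",
        "body": None,
        "user": {"login": "developer123"},
        "diff_url": "https://github.com/repo/pull/123.diff",
    },
}).encode("utf-8")

print(extract_pr_info(sample))
print(extract_pr_info(b'{"action": "closed"}'))  # None
```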

-----

## Storing Changesets and Scheduling Reviews

Now that you have the pull request data, you need to save it in your database and schedule a code review. In previous lessons, you learned how to use **SQLAlchemy** models for changesets. Here's how you might use them in this context:

```python
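import json

from fastapi import APIRouter, BackgroundTasks, Depends, Header, Request
from sqlalchemy.orm import Session

router = APIRouter()

def process_pr_review(changeset_id: int, pr_number: int, repo_name: str):
    """Placeholder for the background review job"""
    print(f"Processing review for changeset {changeset_id}, PR {pr_number} in {repo_name}")

# Dummy session and models for demonstration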
def get_session():
    class DummySession:
        def add(self, obj): pass
        def flush(self): pass
        def commit(self): pass
        def close(self): pass
    return DummySession()

class Changeset:
    def __init__(self, title, description, author, status='pending'):
        self.title = title
        self.description = description
        self.author = author
        self.status = status
        self.id = 1  # Dummy ID

@router.post("/webhook/github")
async def github_webhook(
    request: Request,
    background_tasks: BackgroundTasks,
    x_hub_signature_256: str = Header(None),
    db: Session = Depends(get_session)):
    
    payload = await request.body()
    # Assume signature is already verified
    
    event = json.loads(payload)
    
    if event.get('action') in ['opened', 'synchronize']:
        pr = event['pull_request']
        
        changeset = Changeset(
            title=pr['title'],
            description=pr.get('body') or '',  # body is null when the PR has no description
            author=pr['user']['login'],
            status='pending'
        )
        
        db.add(changeset)
        db.flush()
        db.commit()
        
        # Schedule background review
        background_tasks.add_task(
            process_pr_review,
            changeset.id,
            pr['number'],
            event['repository']['full_name']
        )
        
    return {"status": "received"}
```

  * We create a new **`Changeset`** object with the pull request data.
  * We add it and **commit** it to the database.
  * We use **`background_tasks.add_task()`** to schedule the review process without blocking the webhook response.

This approach ensures that your application quickly responds to GitHub and then processes the review in the background.
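
The order `add`, `flush`, `commit` matters because the background task needs `changeset.id`. With a real SQLAlchemy session, an autogenerated primary key is populated once the pending insert has been flushed to the database (a commit also flushes, so the explicit call mainly makes the step visible). The in-memory stand-in below imitates just that behavior so the flow can be exercised without a database. `InMemorySession` and `Record` are hypothetical and are not part of SQLAlchemy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    title: str
    id: Optional[int] = None  # unset until the session flushes


class InMemorySession:
    """Toy session: assigns IDs on flush and stores rows on commit."""

    def __init__(self) -> None:
        self._pending: list[Record] = []
        self.rows: list[Record] = []
        self._next_id = 1

    def add(self, obj: Record) -> None:
        self._pending.append(obj)

    def flush(self) -> None:
        # Give every pending object an ID, as a database insert would
        for obj in self._pending:
            if obj.id is None:
                obj.id = self._next_id
                self._next_id += 1

    def commit(self) -> None:
        self.flush()
        self.rows.extend(self._pending)
        self._pending.clear()


db = InMemorySession()
record = Record(title="Add new feature")
db.add(record)
print(record.id)     # None: nothing has been flushed yet
db.flush()
print(record.id)     # 1: the ID is now available
db.commit()
print(len(db.rows))  # 1
```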

-----

## Summary and Practice Preview

In this lesson, you learned how to connect your Code Review Assistant to GitHub using webhooks. You saw how to:

  * Set up a FastAPI endpoint to receive webhook events.
  * **Verify webhook signatures** for security.
  * Parse pull request data from the webhook payload.
  * Store changeset information and schedule **background review tasks**.

These steps allow your application to automatically react to new or updated pull requests, making your code review process faster and more reliable.

In the next practice exercises, you'll get hands-on experience with each of these steps. You will implement and test webhook handling, signature verification, and changeset storage. This will help you solidify your understanding and prepare you for more advanced integrations in the future.

## Building Your First Webhook Endpoint

Now that you understand how FastAPI handles requests and how webhooks work, it's time to build your first webhook endpoint! You'll create a basic endpoint that can receive and respond to GitHub webhook events.

Your goal is to complete the `github_webhook` function by implementing these key steps:

  * Read the incoming request body as raw bytes.
  * Parse the JSON payload from GitHub.
  * Check whether the webhook action is `opened` or `synchronize`.
  * Return the appropriate response message.

When GitHub sends a webhook for a new or updated pull request, your endpoint should respond with "PR event received." For any other type of event, it should respond with "Event ignored."

This foundation will prepare you for more advanced webhook features, such as signature verification and data processing, that come next.

```python
from fastapi import APIRouter, Request
import json
import hmac
import hashlib
import re

router = APIRouter()

class Changeset:
    def __init__(self, title, description, author, status='pending'):
        self.title = title
        self.description = description
        self.author = author
        self.status = status
        self.id = 1  # Dummy ID for testing

def get_session():
    """Return a dummy session for testing"""
    class DummySession:
        def add(self, obj): pass
        def flush(self): pass
        def commit(self): pass
        def close(self): pass
    return DummySession()

def verify_github_signature(payload: bytes, signature: str, secret: str) -> bool:
    """Verify GitHub webhook signature"""
    if not signature or not signature.startswith('sha256='):
        return False

    expected = 'sha256=' + hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()

    return hmac.compare_digest(expected, signature)

def parse_pr_diff(diff_content: str) -> dict:
    """Parse PR diff content and extract file changes"""
    files = {}
    current_file = None
    
    for line in diff_content.split('\n'):
        if line.startswith('diff --git'):
            # Extract file path from diff header
            match = re.search(r'diff --git a/(.*?) b/(.*?)$', line)
            if match:
                current_file = match.group(1)
                files[current_file] = []
        elif current_file and (line.startswith('+') or line.startswith('-')):
            files[current_file].append(line)
    
    return files

def process_pr_review(changeset_id: int, pr_number: int, repo_name: str):
    """Process PR review in background"""
    print(f"Processing review for changeset {changeset_id}, PR {pr_number} in {repo_name}")

@router.post("/webhook/github")
async def github_webhook(request: Request):
    """Handle GitHub pull request webhooks"""
    # TODO: Read the request body as bytes
    
    # TODO: Parse the payload as JSON
    
    # TODO: Check if the action is 'opened' or 'synchronize'
    # TODO: Return {"message": "PR event received"} for these actions
    # TODO: Return {"message": "Event ignored"} for other actions

```

Here is the completed `github_webhook` function. It reads the raw request body, parses the JSON payload, and checks for the relevant 'opened' or 'synchronize' actions to return the appropriate message.

```python
from fastapi import APIRouter, Request
import json
import hmac
import hashlib
import re
from fastapi.responses import JSONResponse

router = APIRouter()

class Changeset:
    def __init__(self, title, description, author, status='pending'):
        self.title = title
        self.description = description
        self.author = author
        self.status = status
        self.id = 1  # Dummy ID for testing

def get_session():
    """Return a dummy session for testing"""
    class DummySession:
        def add(self, obj): pass
        def flush(self): pass
        def commit(self): pass
        def close(self): pass
    return DummySession()

def verify_github_signature(payload: bytes, signature: str, secret: str) -> bool:
    """Verify GitHub webhook signature"""
    if not signature or not signature.startswith('sha256='):
        return False

    expected = 'sha256=' + hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()

    return hmac.compare_digest(expected, signature)

def parse_pr_diff(diff_content: str) -> dict:
    """Parse PR diff content and extract file changes"""
    files = {}
    current_file = None
    
    for line in diff_content.split('\n'):
        if line.startswith('diff --git'):
            # Extract file path from diff header
            match = re.search(r'diff --git a/(.*?) b/(.*?)$', line)
            if match:
                current_file = match.group(1)
                files[current_file] = []
        elif current_file and (line.startswith('+') or line.startswith('-')):
            files[current_file].append(line)
    
    return files

def process_pr_review(changeset_id: int, pr_number: int, repo_name: str):
    """Process PR review in background"""
    print(f"Processing review for changeset {changeset_id}, PR {pr_number} in {repo_name}")

@router.post("/webhook/github")
async def github_webhook(request: Request):
    """Handle GitHub pull request webhooks"""
    # ✅ Read the request body as bytes
    payload = await request.body()
    
    # ✅ Parse the payload as JSON
    try:
        event = json.loads(payload)
    except json.JSONDecodeError:
        return JSONResponse(status_code=400, content={"message": "Invalid JSON payload"})
    
    # Extract the action (None for payloads that do not define one)
    action = event.get('action')
    
    # ✅ Check if the action is 'opened' or 'synchronize'
    if action in ['opened', 'synchronize']:
        # ✅ Return {"message": "PR event received"} for these actions
        return {"message": "PR event received"}
    else:
        # ✅ Return {"message": "Event ignored"} for other actions
        # This covers actions like 'closed', 'assigned', or 'labeled', and payloads with no action
        return {"message": "Event ignored"}

```
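
One way to check this logic without starting a server is to mirror it in a plain function and call it with sample payloads. `classify_event` below is a hypothetical helper written for that purpose. It follows the same steps as the endpoint and returns the messages the exercise asks for.

```python
import json


def classify_event(payload: bytes) -> dict:
    """Mirror the endpoint's decision logic for a raw webhook body."""
    try:
        event = json.loads(payload)
    except json.JSONDecodeError:
        return {"message": "Invalid JSON payload"}

    # Only JSON objects can carry an 'action' key
    if isinstance(event, dict) and event.get("action") in ("opened", "synchronize"):
        return {"message": "PR event received"}
    return {"message": "Event ignored"}


print(classify_event(b'{"action": "opened"}'))  # {'message': 'PR event received'}
print(classify_event(b'{"action": "closed"}'))  # {'message': 'Event ignored'}
print(classify_event(b'not json'))              # {'message': 'Invalid JSON payload'}
```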

## Debugging Webhook Security Verification

## Completing Webhook Payload Data Extraction

## Storing Webhook Data in Database

## Completing the Webhook to Review Pipeline