# 🛠️ Week 11-12 · Notebook 05 · CI/CD with GitHub Actions for the Manufacturing Copilot

This notebook details how to build a robust Continuous Integration and Continuous Deployment (CI/CD) pipeline using GitHub Actions. This pipeline will automate the testing, security scanning, and deployment of our Manufacturing Copilot to staging and production environments.


## 🎯 Learning Objectives

- **Design Multi-Environment Workflows:** Create separate GitHub Actions workflows for Continuous Integration (`ci.yml`) and Deployment (`deploy.yml`).
- **Implement Quality Gates:** Integrate automated checks into the CI pipeline, including linting, unit tests, CodeQL static security analysis, and custom "prompt linting."
- **Automate Staging Deployment:** Configure the pipeline to automatically deploy the application to a staging environment after a pull request is merged.
- **Gate Production Deployment:** Implement a manual approval step in the workflow to ensure that a human reviews and authorizes any deployment to the production environment.
- **Collect Audit Evidence:** Create a script to automatically gather and archive release artifacts (like test results and scan reports) for compliance and auditing purposes.


## 🧩 Scenario: A Safe and Auditable Path to Production

The development team for the Manufacturing Copilot is ready to automate their release process. The company's release policy is strict and designed to prevent outages and ensure compliance.

**The Policy:**
1.  **Pull Request (PR) Checks:** No code can be merged into the `main` branch unless it passes all automated quality and security checks.
2.  **Automatic Staging Deployment:** Once a PR is merged to `main`, the new version must be automatically deployed to a `staging` environment for final testing.
3.  **Manual Production Approval:** Deployment to `production` is not automatic. A designated approver (e.g., the Head of Maintenance) must give explicit approval within the GitHub Actions workflow. This approval must be linked to a change management ticket.
4.  **Evidence Archiving:** All artifacts related to a release (test results, scan reports, approver's name) must be archived for future audits.
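
To make the policy concrete, the first three rules can be read as preconditions for a production deployment. The sketch below is illustrative only: `ReleaseCandidate` and `production_blockers` are hypothetical names, not part of the Copilot codebase, and the real enforcement happens in GitHub (branch protection and environment rules), not in Python.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReleaseCandidate:
    """Hypothetical snapshot of a release's state (field names are illustrative)."""
    ci_checks_passed: bool
    deployed_to_staging: bool
    approver: Optional[str] = None
    change_ticket_url: Optional[str] = None


def production_blockers(rc: ReleaseCandidate) -> list[str]:
    """Return the policy rules that still block a production deployment."""
    blockers = []
    if not rc.ci_checks_passed:
        blockers.append("PR checks have not passed")
    if not rc.deployed_to_staging:
        blockers.append("not deployed to staging")
    if not rc.approver:
        blockers.append("no production approver recorded")
    if not rc.change_ticket_url:
        blockers.append("approval not linked to a change ticket")
    return blockers


if __name__ == "__main__":
    candidate = ReleaseCandidate(ci_checks_passed=True, deployed_to_staging=True)
    for reason in production_blockers(candidate):
        print(f"Blocked: {reason}")
```

Rule 4 (evidence archiving) is deliberately not a blocker here, because it happens after the production deployment rather than before it.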


## 🧱 CI/CD Workflow Structure

Our pipeline will be a sequence of jobs, each with a specific responsibility, flowing from code check-in to production deployment.

`[Code Push]` -> `[CI Checks & CodeQL Scan]` -> `(Merge to Main)` -> `[Build & Push Image]` -> `[Deploy to Staging]` -> `[Manual Approval]` -> `[Deploy to Production]` -> `[Archive Evidence]`

We will implement this using two separate workflow files in the `.github/workflows/` directory:
1.  `ci.yml`: Runs on every pull request.
2.  `deploy.yml`: Runs on every push to the `main` branch.


### 🔧 `ci.yml` (The Quality Gate)

This workflow runs on every pull request targeting the `main` branch. It acts as a "quality gate," preventing low-quality or insecure code from being merged. We'll use a combination of tools to ensure code correctness, style, and security.

**Key Features:**
- **Path Filtering:** The workflow only runs when relevant files (`app/`, `tests/`, `prompts/`) are changed, saving CI minutes.
- **Dependency Caching:** Caches Poetry dependencies to speed up subsequent runs.
- **Linting & Formatting:** Uses `ruff` for high-speed linting and format checking.
- **Unit Testing:** Runs the `pytest` suite.
- **Prompt Linting:** Runs the custom `tools/prompt_lint.py` script against the `prompts/` directory.
- **Static Analysis:** Integrates **CodeQL** to find potential security vulnerabilities in the Python code and report them as code scanning alerts.

Below is the complete workflow file that should be placed in `.github/workflows/ci.yml`.

```yaml
# .github/workflows/ci.yml

name: CI Quality Gate

on:
  pull_request:
    branches: [ main ]
    paths:
      - 'app/**'
      - 'tests/**'
      - 'prompts/**'
      - '.github/workflows/ci.yml'

jobs:
  quality-checks:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

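      - name: Install Poetry
        # Poetry must be installed before setup-python so that `cache: 'poetry'` can find it.
        run: pipx install poetry

      - name: Set up Python with Poetry caching
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
          cache: 'poetry'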

      - name: Install dependencies
        run: poetry install --no-root

      - name: Run linter and code formatter check
        run: |
          poetry run ruff check .
          poetry run ruff format --check .

      - name: Run unit tests
        run: poetry run pytest

      - name: Custom prompt linter
        # Lints the prompt files in the 'prompts/' directory.
        # If the directory doesn't exist, the step logs a message and skips linting instead of failing.
        run: |
          if [ -d "prompts" ]; then
            poetry run python tools/prompt_lint.py prompts/
          else
            echo "No 'prompts' directory found, skipping prompt linting."
          fi


  security-scan:
    runs-on: ubuntu-latest
    needs: quality-checks
    permissions:
      contents: read # for actions/checkout
      security-events: write # for github/codeql-action/upload-sarif
      actions: read # for github/codeql-action/init to get workflow details
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Initialize CodeQL
        uses: github/codeql-action/init@v3
        with:
          languages: python
          # Queries can be customized for more specific security checks
          # queries: +security-extended, +security-and-quality

      - name: Autobuild
        # Attempts to build any compiled languages. For Python, this is a no-op.
        uses: github/codeql-action/autobuild@v3

      - name: Perform CodeQL Analysis
        uses: github/codeql-action/analyze@v3
        with:
          category: "/language:python"

```
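
To build intuition for the `paths:` filter, here is a rough Python approximation of the decision GitHub makes: the workflow runs if at least one changed file matches at least one pattern. This is a simplified sketch, not GitHub's actual glob engine. It only handles the two pattern shapes used in `ci.yml` (`dir/**` and an exact file path), and `ci_would_run` is a hypothetical helper name.

```python
# Mirrors the `paths:` list in ci.yml.
CI_PATH_FILTERS = ["app/**", "tests/**", "prompts/**", ".github/workflows/ci.yml"]


def matches_filter(changed_file: str, pattern: str) -> bool:
    """Approximate one filter: 'dir/**' matches anything under dir/, otherwise exact match."""
    if pattern.endswith("/**"):
        prefix = pattern[:-2]  # "app/**" -> "app/" (keeps the trailing slash)
        return changed_file.startswith(prefix)
    return changed_file == pattern


def ci_would_run(changed_files: list[str]) -> bool:
    """True if any changed file matches any filter, which is when the workflow triggers."""
    return any(
        matches_filter(changed_file, pattern)
        for changed_file in changed_files
        for pattern in CI_PATH_FILTERS
    )


if __name__ == "__main__":
    print(ci_would_run(["app/main.py"]))             # a file under app/
    print(ci_would_run(["README.md"]))               # documentation-only change
    print(ci_would_run(["application/helper.py"]))   # similar name, different directory
```

Note that the trailing slash in the prefix matters: without it, a directory such as `application/` would be mistaken for `app/`.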

### 🚀 `deploy.yml` (The Path to Production)

This workflow is triggered by every push to `main` (typically a merged PR) and can also be started manually through `workflow_dispatch`. It handles building the container image, pushing it to a registry, and deploying to different environments. It includes the critical manual approval step for production.

**Key Features:**
- **Permissions:** The workflow is granted specific permissions (`id-token: write`) to support passwordless authentication to Google Cloud using Workload Identity Federation.
- **Reusable Build:** The container image is built once and referenced in subsequent deployment jobs using `needs.build-and-push-image.outputs.image_uri`.
- **GitHub Environments:** Uses GitHub's "Environments" feature to manage environment-specific variables and protection rules (like manual approval).
- **Manual Approval Gate:** The `trstringer/manual-approval` action pauses the workflow by opening a GitHub issue and waiting until a designated approver responds on that issue with an approval keyword. This is a crucial control for production deployments.
- **Smoke Tests:** After each deployment, a simple `curl` command acts as a smoke test to ensure the service is responsive.

Below is the complete workflow file that should be placed in `.github/workflows/deploy.yml`.

```yaml
# .github/workflows/deploy.yml

name: Deploy to Staging and Production

on:
  push:
    branches: [ main ]
  workflow_dispatch:

# Workflow-level permissions for GITHUB_TOKEN; any scope not listed here is set to none.
permissions:
  contents: write # To write release notes or create PRs if needed
  pull-requests: write
  issues: write # Required by trstringer/manual-approval, which opens an approval issue
  id-token: write # Required for Workload Identity Federation to GCP

jobs:
  build-and-push-image:
    runs-on: ubuntu-latest
    outputs:
      image_uri: ${{ steps.push.outputs.image_uri }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Log in to Google Artifact Registry
        # This uses a service account key stored as a secret.
        # In a real-world scenario, prefer Workload Identity Federation here as well.
        uses: docker/login-action@v3
        with:
          registry: ${{ vars.GCP_ARTIFACT_REGISTRY_LOCATION }}-docker.pkg.dev
          username: _json_key
          password: ${{ secrets.GCP_SA_KEY }}

      - name: Build and push Docker image
        id: build
        uses: docker/build-push-action@v5
        with:
          context: ./app # We are building the Dockerfile inside the 'app' directory
          push: true
          tags: ${{ vars.GCP_ARTIFACT_REGISTRY_LOCATION }}-docker.pkg.dev/${{ vars.GCP_PROJECT_ID }}/${{ vars.GCP_ARTIFACT_REGISTRY_REPO }}/manufacturing-copilot:${{ github.sha }}
          labels: "sha=${{ github.sha }}"
          cache-from: type=gha
          cache-to: type=gha,mode=max

      - name: Set image URI output
        id: push
        run: |
          echo "image_uri=${{ vars.GCP_ARTIFACT_REGISTRY_LOCATION }}-docker.pkg.dev/${{ vars.GCP_PROJECT_ID }}/${{ vars.GCP_ARTIFACT_REGISTRY_REPO }}/manufacturing-copilot:${{ github.sha }}" >> "$GITHUB_OUTPUT"

  deploy-to-staging:
    runs-on: ubuntu-latest
    needs: build-and-push-image
    environment:
      name: staging
      url: ${{ steps.deploy.outputs.url }} # The URL will be displayed in the GitHub UI
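    steps:
      - name: Checkout repository
        # google-github-actions/auth expects actions/checkout to run before it
        uses: actions/checkout@v4

      - name: Authenticate to Google Cloud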
        uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: ${{ secrets.GCP_WIF_PROVIDER }}
          service_account: ${{ secrets.GCP_STAGING_SA }}

      - name: Deploy to Cloud Run (Staging)
        id: deploy
        uses: google-github-actions/deploy-cloudrun@v2
        with:
          service: manufacturing-copilot-staging
          project_id: ${{ vars.GCP_PROJECT_ID }}
          region: ${{ vars.GCP_REGION }}
          image: ${{ needs.build-and-push-image.outputs.image_uri }}
          env_vars: |
            ENVIRONMENT=staging
            LOG_LEVEL=DEBUG

      - name: Run smoke tests on staging
        run: |
          echo "Running smoke tests against ${{ steps.deploy.outputs.url }}"
          sleep 10 # Give the service a moment to start
          curl -f ${{ steps.deploy.outputs.url }}/health

  gate-to-production:
    runs-on: ubuntu-latest
    needs: deploy-to-staging
    environment:
      name: production # This environment must be configured with a "Required reviewer"
    steps:
      - name: Manual Approval Gate
        uses: trstringer/manual-approval@v1
        with:
          secret: ${{ secrets.GITHUB_TOKEN }}
          approvers: ${{ vars.PRODUCTION_APPROVERS }} # Stored as a repository variable
          minimum-approvals: 1
          # Reference the change ticket in the approval issue (URL stored as a repository variable)
          issue-body: "Change ticket: ${{ vars.JIRA_CHANGE_TICKET_URL }}"

  deploy-to-production:
    runs-on: ubuntu-latest
    needs: gate-to-production
    environment:
      name: production
      url: ${{ steps.deploy.outputs.url }}
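    steps:
      - name: Checkout repository
        # google-github-actions/auth expects actions/checkout to run before it
        uses: actions/checkout@v4

      - name: Authenticate to Google Cloud (Production)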
        uses: google-github-actions/auth@v2
        with:
          workload_identity_provider: ${{ secrets.GCP_WIF_PROVIDER }}
          service_account: ${{ secrets.GCP_PROD_SA }}

      - name: Deploy to Cloud Run (Production)
        id: deploy
        uses: google-github-actions/deploy-cloudrun@v2
        with:
          service: manufacturing-copilot-prod
          project_id: ${{ vars.GCP_PROJECT_ID }}
          region: ${{ vars.GCP_REGION }}
          image: ${{ needs.build-and-push-image.outputs.image_uri }}
          env_vars: |
            ENVIRONMENT=production
            LOG_LEVEL=INFO

      - name: Archive Release Evidence
        # This script would gather artifacts and create an audit record.
        # Note: github.actor is the user who triggered the run, not necessarily the approver.
        run: |
          echo "Archiving evidence for SHA ${{ github.sha }} (run triggered by ${{ github.actor }})"
          # poetry run python tools/archive_release.py --sha ${{ github.sha }} --approver <approver-login> --run-id ${{ github.run_id }}
          
      - name: Run smoke tests on production
        run: |
          echo "Running smoke tests against ${{ steps.deploy.outputs.url }}"
          sleep 10
          curl -f ${{ steps.deploy.outputs.url }}/health
```
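
The fixed `sleep 10` followed by a single `curl` can fail on a slow cold start even when the deployment is fine. One alternative is to poll the health endpoint a few times before giving up. The sketch below uses only the standard library; `http_ok` and `wait_until_healthy` are hypothetical helper names, and the `/health` path is taken from the workflow above.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout_seconds: float = 5.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return 200 <= response.status < 300
    except OSError as exc:  # covers URLError, HTTPError (4xx/5xx), and timeouts
        print(f"Health probe failed: {exc}")
        return False


def wait_until_healthy(probe: Callable[[], bool], attempts: int = 5,
                       delay_seconds: float = 3.0) -> bool:
    """Call `probe` until it succeeds or `attempts` runs out, sleeping between tries."""
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            time.sleep(delay_seconds)
    return False


if __name__ == "__main__":
    import sys

    # Usage: python smoke_test.py https://<service-url>
    service_url = sys.argv[1].rstrip("/")
    healthy = wait_until_healthy(lambda: http_ok(f"{service_url}/health"))
    sys.exit(0 if healthy else 1)
```

Keeping the retry loop separate from the HTTP call makes the loop easy to unit test with a fake probe, without needing a live service.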

```python
# --- tools/prompt_lint.py ---
# A custom script to enforce quality standards on our LLM prompts.

import sys
from pathlib import Path

import yaml  # Provided by the PyYAML package


def lint_prompts(directory: Path) -> int:
    """
    Lints all YAML prompt files in a directory for specific quality rules.
    Returns the number of errors found.
    """
    # rglob() on a missing path simply yields nothing, so check explicitly.
    if not directory.is_dir():
        print(f"Directory not found: {directory}")
        return 1

    prompt_files = sorted(directory.rglob("*.yaml"))
    if not prompt_files:
        print(f"No prompt files found in {directory} to lint.")
        # Return 0 because no files means no errors.
        return 0

    print(f"Linting {len(prompt_files)} prompt files in '{directory}'...")
    error_count = 0

    for path in prompt_files:
        try:
            with open(path, "r", encoding="utf-8") as f:
                prompt_data = yaml.safe_load(f)

            # An empty file parses to None and a list-only file to a list;
            # neither can satisfy the key-based rules below.
            if not isinstance(prompt_data, dict):
                print(f"❌ ERROR: {path} - Top level must be a YAML mapping.")
                error_count += 1
                continue

            # Rule 1: Check for a 'version' key
            if "version" not in prompt_data:
                print(f"❌ ERROR: {path} - Missing 'version' key.")
                error_count += 1

            # Rule 2: Check for a 'description' key
            if "description" not in prompt_data:
                print(f"❌ ERROR: {path} - Missing 'description' key.")
                error_count += 1

            # Rule 3: Check that description is not empty (a null value counts as empty)
            elif not str(prompt_data["description"] or "").strip():
                print(f"❌ ERROR: {path} - 'description' cannot be empty.")
                error_count += 1

            # Rule 4: Check for a 'template' key
            if "template" not in prompt_data:
                print(f"❌ ERROR: {path} - Missing 'template' key.")
                error_count += 1

            # Rule 5: Check for placeholders like 'TODO' or 'FIXME' in the template
            template = str(prompt_data.get("template") or "")
            if "TODO" in template or "FIXME" in template:
                print(f"❌ ERROR: {path} - Found placeholder 'TODO' or 'FIXME' in template.")
                error_count += 1

        except yaml.YAMLError as e:
            print(f"❌ ERROR: Could not parse {path} - {e}")
            error_count += 1
        except Exception as e:
            print(f"❌ ERROR: An unexpected error occurred with {path} - {e}")
            error_count += 1

    if error_count == 0:
        print("✅ All prompts passed linting.")
    else:
        print(f"\nFound {error_count} errors in prompts.")

    return error_count


if __name__ == "__main__":
    if len(sys.argv) > 1:
        prompt_dir = Path(sys.argv[1])
        errors = lint_prompts(prompt_dir)
        sys.exit(1 if errors > 0 else 0)
    else:
        print("Usage: python tools/prompt_lint.py <directory_to_lint>")
        sys.exit(1)
```
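
Because the linter is itself a quality gate, it is worth testing. One possible approach, sketched here under the assumption that you are free to restructure the script, is to separate the five rules from the file handling so they can be exercised on plain dictionaries. `check_prompt` is a hypothetical helper, not part of the script above, and the sample prompt content is invented for illustration.

```python
def check_prompt(data: object) -> list[str]:
    """Apply the five lint rules to one parsed prompt and return the error messages."""
    if not isinstance(data, dict):
        return ["prompt file must contain a YAML mapping"]

    errors = []
    if "version" not in data:
        errors.append("missing 'version' key")
    if "description" not in data:
        errors.append("missing 'description' key")
    elif not str(data["description"] or "").strip():
        errors.append("'description' cannot be empty")
    if "template" not in data:
        errors.append("missing 'template' key")

    template = str(data.get("template") or "")
    if "TODO" in template or "FIXME" in template:
        errors.append("placeholder 'TODO' or 'FIXME' in template")
    return errors


if __name__ == "__main__":
    valid_prompt = {
        "version": 1,
        "description": "Summarize a maintenance log entry",
        "template": "Summarize the following log entry: {log_entry}",
    }
    draft_prompt = {"version": 1, "description": "  ", "template": "TODO write me"}

    print(check_prompt(valid_prompt))
    print(check_prompt(draft_prompt))
```

With this split, the rules can be covered by ordinary `pytest` tests in `tests/`, so a regression in the linter is caught by the same CI job that runs it.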


## 🧾 Evidence Collection for Audits

For compliance, we need to prove that our release process was followed. The `archive_release.py` script gathers this proof into a single, auditable artifact.

| Artifact / Evidence | Source in GitHub Actions | Purpose |
| --- | --- | --- |
| **Code Version (SHA)** | `github.sha` context variable. | Uniquely identifies the exact code version that was deployed. |
| **Test Results** | `pytest` output, uploaded as a workflow artifact (requires an upload step, which `ci.yml` above does not yet include). | Proof that all unit and smoke tests passed. |
| **Vulnerability Scan Report** | `Trivy` output, uploaded as a workflow artifact (requires a Trivy scan step, which the workflows above do not yet include). | Proof that the container image was scanned for known vulnerabilities. |
| **Approver Identity** | The reviewer recorded by the production approval step. Note that `github.actor` identifies who triggered the run, which may be a different person. | Records who authorized the production deployment. |
| **Deployment Timestamp** | The timestamp of the workflow run. | Records when the deployment occurred. |
| **Link to Change Ticket** | The change ticket URL (`vars.JIRA_CHANGE_TICKET_URL`) attached to the manual approval step. | Connects the deployment back to the formal change management process. |
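
Capturing the approver is the least obvious row in this table. When approval is enforced through a GitHub Environment with required reviewers, one option is to read the run's review history from the GitHub REST API. The sketch below assumes the endpoint `GET /repos/{owner}/{repo}/actions/runs/{run_id}/approvals` and a response shaped as a list of reviews with `state`, `environments`, and `user.login` fields; please verify both against the current API documentation. `fetch_reviews` and `extract_approvers` are hypothetical helper names. Approvals given through an issue comment (as with `trstringer/manual-approval`) are not covered by this endpoint.

```python
import json
import urllib.request


def fetch_reviews(repository: str, run_id: str, token: str) -> list[dict]:
    """Fetch the deployment review history for one workflow run (assumed endpoint)."""
    url = f"https://api.github.com/repos/{repository}/actions/runs/{run_id}/approvals"
    request = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def extract_approvers(reviews: list[dict], environment: str = "production") -> list[str]:
    """Return the logins of users who approved a deployment to `environment`."""
    approvers = []
    for review in reviews:
        environment_names = {env.get("name") for env in review.get("environments", [])}
        if review.get("state") == "approved" and environment in environment_names:
            login = (review.get("user") or {}).get("login")
            if login and login not in approvers:
                approvers.append(login)
    return approvers


if __name__ == "__main__":
    # Assumed response shape, trimmed to the fields used above.
    sample_reviews = [
        {"state": "approved", "environments": [{"name": "production"}],
         "user": {"login": "head-of-maintenance"}},
        {"state": "rejected", "environments": [{"name": "production"}],
         "user": {"login": "another-reviewer"}},
    ]
    print(extract_approvers(sample_reviews))
```

The extracted login could then be passed to `archive_release.py` as `--approver` in place of `github.actor`.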


```python
# --- tools/archive_release.py ---
import argparse
import json
import os
from datetime import datetime, timezone
from pathlib import Path


def archive_release_evidence(git_sha: str, approver: str, workflow_run_id: str) -> Path:
    """
    Gathers release evidence into a single JSON file for auditing.
    In a real workflow, it would also download artifacts like test reports.
    Returns the path of the evidence file that was written.
    """
    # Take one timezone-aware timestamp so the release ID and recorded time always agree.
    now = datetime.now(timezone.utc)

    # GITHUB_REPOSITORY ("owner/repo") is set by GitHub Actions; the fallback is a placeholder.
    repository = os.environ.get("GITHUB_REPOSITORY", "your-org/your-repo")
    artifacts_url = f"https://github.com/{repository}/actions/runs/{workflow_run_id}#artifacts"

    evidence_payload = {
        "release_id": f"REL-{now.strftime('%Y%m%d')}-{git_sha[:7]}",
        "deployment_timestamp_utc": now.isoformat(),
        "code_version_sha": git_sha,
        "production_approver": approver,
        "github_workflow_run_id": workflow_run_id,
        "artifact_links": {
            "test_results": artifacts_url,
            "vulnerability_scan": artifacts_url,
            "sbom": artifacts_url,
        },
        "status": "SUCCESS",
    }

    # Create a directory to store evidence artifacts
    evidence_dir = Path("release_evidence")
    evidence_dir.mkdir(exist_ok=True)

    file_path = evidence_dir / f"{evidence_payload['release_id']}.json"
    with open(file_path, "w", encoding="utf-8") as f:
        json.dump(evidence_payload, f, indent=4)

    print(f"Successfully archived release evidence to: {file_path}")
    return file_path


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Archive release evidence.")
    parser.add_argument("--sha", required=True, help="Git SHA of the release.")
    parser.add_argument("--approver", required=True, help="GitHub username of the approver.")
    parser.add_argument("--run-id", required=True, help="GitHub Actions workflow run ID.")

    args = parser.parse_args()

    archive_release_evidence(
        git_sha=args.sha,
        approver=args.approver,
        workflow_run_id=args.run_id,
    )
```
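
An evidence file is only useful if it is complete. As a possible follow-up check, the sketch below loads an archived JSON file and reports any field that is missing or empty. The field names mirror the payload written by `archive_release.py`; `missing_evidence_fields` is a hypothetical helper, and the check assumes the file contains a JSON object.

```python
import json
from pathlib import Path

# Field names taken from the payload written by archive_release.py.
REQUIRED_FIELDS = {
    "release_id",
    "deployment_timestamp_utc",
    "code_version_sha",
    "production_approver",
    "github_workflow_run_id",
    "artifact_links",
    "status",
}


def missing_evidence_fields(path: Path) -> list[str]:
    """Return the required fields that are absent or empty in an evidence file."""
    payload = json.loads(path.read_text(encoding="utf-8"))
    return sorted(field for field in REQUIRED_FIELDS if not payload.get(field))


if __name__ == "__main__":
    import sys

    # Usage: python check_evidence.py release_evidence/<release-id>.json
    missing = missing_evidence_fields(Path(sys.argv[1]))
    if missing:
        print(f"Incomplete evidence, missing: {', '.join(missing)}")
        sys.exit(1)
    print("Evidence file is complete.")
```

Running a check like this as the last step of the deployment job would fail the run when, for example, the approver field was left blank.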


## 🧪 Lab Assignment: Build Your CI/CD Pipeline

1.  **Create the Workflow Files:**
    -   In your project's root directory, create a `.github/workflows` folder.
    -   Inside this folder, create `ci.yml` and `deploy.yml` and copy the code from this notebook into them.

2.  **Configure Branch Protection:**
    -   Go to your GitHub repository's settings > Branches.
    -   Add a branch protection rule for your `main` branch.
    -   Enable "Require status checks to pass before merging."
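    -   Select the `quality-checks` and `security-scan` jobs from your `ci.yml` workflow. Because `ci.yml` uses path filtering, a pull request that touches none of the filtered paths skips the workflow and leaves these required checks pending, which blocks the merge.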
    -   (Optional) Enable "Require a pull request before merging."

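3.  **Add Repository Secrets and Variables:**
    -   Go to Settings > Secrets and variables > Actions.
    -   Add the secrets used by `deploy.yml`: `GCP_WIF_PROVIDER`, `GCP_STAGING_SA`, `GCP_PROD_SA`, and `GCP_SA_KEY`. You will need to create the underlying resources in your cloud provider first.
    -   Add the variables it reads: `GCP_PROJECT_ID`, `GCP_REGION`, `GCP_ARTIFACT_REGISTRY_LOCATION`, `GCP_ARTIFACT_REGISTRY_REPO`, `PRODUCTION_APPROVERS`, and `JIRA_CHANGE_TICKET_URL`.
    -   Under Settings > Environments, create the `staging` and `production` environments and add a required reviewer to `production`.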

4.  **Test the Pipeline:**
    -   Create a new branch and make a small change to a file under `app/`, `tests/`, or `prompts/` (e.g., add a comment to a Python file under `app/`). Changes outside these paths do not trigger `ci.yml`.
    -   Push the branch and open a pull request. You should see the `ci.yml` workflow start automatically.
    -   After the CI checks pass, merge the pull request. This should trigger the `deploy.yml` workflow, starting with the image build and followed by the deployment to staging.


## ✅ Checklist for this Notebook

- [X] A `ci.yml` workflow is designed to act as a quality gate for all pull requests.
- [X] A `deploy.yml` workflow is designed to handle multi-environment deployments.
- [X] A manual approval step is included to gate deployments to the production environment.
- [X] A custom prompt linting script is created to enforce standards on LLM prompts.
- [X] An evidence archiving script is created to collect auditable proof of the release process.
- [ ] **TODO:** Complete the Lab Assignment to implement and test these workflows in your own repository.


## 📚 References and Further Reading

-   [GitHub Actions Documentation](https://docs.github.com/en/actions)
-   [Using Environments for Deployment](https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment)
-   [Google GitHub Actions: Deploy to Cloud Run](https://github.com/google-github-actions/deploy-cloudrun)
-   [Securing Deployments with Manual Approvals](https://docs.github.com/en/actions/managing-workflow-runs/reviewing-deployments)
-   [Workload Identity Federation for GCP](https://cloud.google.com/iam/docs/workload-identity-federation) - The recommended way to authenticate to GCP from GitHub Actions.
