naorpeled/pull-request

Pull Request.dev - OSS Contribution Chatbot

A chatbot that helps developers find good first issues in open source projects, powered by the goodfirstissues dataset.

Features

  • Chat-based interface for finding open source issues
  • Language filtering - Select your preferred programming languages
  • LLM-powered recommendations - Get personalized issue suggestions
  • Privacy-first - Chat messages are not stored in our database
  • Consent-gated analytics - Mixpanel and Google Analytics (opt-in only)

Tech Stack

  • Frontend: Next.js 16 (App Router), React 19, TypeScript
  • Backend: Next.js API Routes, Vercel AI SDK
  • Storage: Redis (via official redis package)
  • LLM: OpenAI GPT-4o-mini (via Vercel AI SDK)
  • Analytics: Mixpanel, Google Analytics 4
  • Deployment: Vercel

Setup

Prerequisites

  • Node.js 20+ (required for Next.js 16)
  • pnpm (install via npm install -g pnpm or see pnpm.io)
  • Vercel account
  • OpenAI API key
  • Redis instance (e.g., Upstash Redis, Redis Cloud, or self-hosted)
  • (Optional) Mixpanel account and Google Analytics 4 property

Installation

  1. Clone the repository:
git clone <repository-url>
cd pull-request-minimal
  2. Install dependencies:
pnpm install
  3. Set up environment variables:

Create a .env.local file with the following variables:

# OpenAI API Key (required)
OPENAI_API_KEY=your_openai_api_key

# Redis (required)
REDIS_URL=redis://default:password@host:port
# Or for Redis with TLS:
# REDIS_URL=rediss://default:password@host:port

# Cron Secret for dataset refresh (required for production)
CRON_SECRET=your_random_secret_string

# Analytics (optional - only if you want analytics)
NEXT_PUBLIC_GA_MEASUREMENT_ID=G-XXXXXXXXXX
NEXT_PUBLIC_MIXPANEL_TOKEN=your_mixpanel_token
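
A missing required variable is easier to diagnose at startup than at the first failed request. The following is a minimal sketch of such a check, not code from this repository: the variable names come from the list above, while `missingEnvVars` and its `isProduction` flag are hypothetical.

```typescript
// Hypothetical startup check; only the variable names come from this README.
type Env = Record<string, string | undefined>;

const REQUIRED_ALWAYS = ["OPENAI_API_KEY", "REDIS_URL"];
const REQUIRED_IN_PRODUCTION = ["CRON_SECRET"];

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Env, isProduction: boolean): string[] {
  const required = isProduction
    ? [...REQUIRED_ALWAYS, ...REQUIRED_IN_PRODUCTION]
    : REQUIRED_ALWAYS;
  return required.filter((name) => {
    const value = env[name];
    // Treat an empty or whitespace-only value the same as a missing one.
    return value === undefined || value.trim() === "";
  });
}
```

It could be called with `process.env` when the server boots, failing fast if the returned list is not empty.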

Getting Redis Connection URL

You can use any Redis provider:

  1. Upstash Redis (recommended for Vercel deployments):

    • Go to Upstash Console
    • Create a new Redis database
    • Copy the REDIS_URL from the dashboard
  2. Redis Cloud:

    • Sign up at Redis Cloud
    • Create a database
    • Copy the connection URL
  3. Self-hosted Redis:

    • Format: redis://[username]:[password]@[host]:[port]
    • For TLS: rediss://[username]:[password]@[host]:[port]
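
The two URL formats differ only in the scheme, so a connection string can be inspected before it is handed to the Redis client. This sketch uses the standard `URL` class; `describeRedisUrl` and `RedisTarget` are hypothetical names, and the fallback to port 6379 is an assumption based on the conventional Redis port.

```typescript
// Hypothetical helper for inspecting a REDIS_URL value; not part of the repository.
interface RedisTarget {
  tls: boolean;
  host: string;
  port: number;
  username: string;
}

function describeRedisUrl(redisUrl: string): RedisTarget {
  const url = new URL(redisUrl);
  if (url.protocol !== "redis:" && url.protocol !== "rediss:") {
    throw new Error(`Unsupported scheme: ${url.protocol}`);
  }
  return {
    // "rediss://" (double s) signals a TLS connection.
    tls: url.protocol === "rediss:",
    host: url.hostname,
    // Assumption: fall back to the conventional Redis port when none is given.
    port: url.port === "" ? 6379 : Number(url.port),
    username: decodeURIComponent(url.username),
  };
}
```

The password is deliberately left out of the result so the object is safe to log.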

Initial Dataset Load

Before using the chatbot, you need to load the dataset:

  1. Manual trigger (for development):
curl -X POST http://localhost:3000/api/admin/refresh-dataset \
  -H "Authorization: Bearer your_cron_secret"
  2. Vercel Cron (for production):
    • Add a cron job in vercel.json (see below)
    • Or use Vercel Cron Jobs in the dashboard

Running Locally

pnpm dev

Open http://localhost:3000 in your browser.

Deployment

Deploy to Vercel

  1. Push your code to GitHub/GitLab/Bitbucket
  2. Import the project in Vercel
  3. Add all environment variables in Vercel dashboard
  4. Deploy

Vercel Cron Configuration

Create a vercel.json file in the root directory:

{
  "crons": [
    {
      "path": "/api/admin/refresh-dataset",
      "schedule": "0 2 * * *"
    }
  ]
}

This will refresh the dataset daily at 2 AM UTC.
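
The schedule string follows the standard five-field cron layout: minute, hour, day of month, month, day of week. As an illustration only, a small hypothetical helper (`splitCron`, not part of this project) can label the fields so a schedule is easy to sanity-check before committing it:

```typescript
// Hypothetical helper that labels the five fields of a cron expression.
interface CronFields {
  minute: string;
  hour: string;
  dayOfMonth: string;
  month: string;
  dayOfWeek: string;
}

function splitCron(expression: string): CronFields {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== 5) {
    throw new Error(`Expected 5 fields, got ${parts.length}`);
  }
  // The length check above guarantees exactly five entries.
  const [minute, hour, dayOfMonth, month, dayOfWeek] =
    parts as [string, string, string, string, string];
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}
```

For `"0 2 * * *"` this yields minute `0` and hour `2`, with the remaining fields as wildcards.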

Alternatively, you can set up cron jobs in the Vercel dashboard:

  • Go to your project → Settings → Cron Jobs
  • Add a new cron job:
    • Path: /api/admin/refresh-dataset
    • Schedule: 0 2 * * * (daily at 2 AM UTC)

Important: Make sure to set the CRON_SECRET environment variable in Vercel. The cron job will automatically include the x-vercel-cron header, but you can also protect it with a Bearer token.

API Endpoints

POST /api/chat

Chat endpoint that returns streaming LLM responses with issue recommendations.

Request:

{
  "message": "Find me Python issues",
  "preferredLanguages": ["Python"]
}

Response: Server-Sent Events (SSE) stream with:

  • LLM text chunks
  • Final recommendedIssues array with issue details
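
A client has to reassemble SSE events from network chunks that may end mid-event. The sketch below shows generic SSE framing only: the exact payloads this endpoint emits depend on the Vercel AI SDK and are not specified here, so `parseSseChunk` is a hypothetical helper that returns raw `data:` payloads without interpreting them.

```typescript
// Hypothetical SSE framing helper; payload contents are not interpreted.
interface SseParseResult {
  events: string[]; // data payloads of the complete events found
  rest: string;     // trailing partial event to prepend to the next chunk
}

function parseSseChunk(buffer: string): SseParseResult {
  // Events are separated by a blank line; normalize CRLF first.
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  // The last block has not been terminated yet, so carry it over.
  const rest = blocks.pop() ?? "";
  const events: string[] = [];
  for (const block of blocks) {
    const dataLines = block
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      // Drop the field name and the single optional space after the colon.
      .map((line) => line.slice(5).replace(/^ /, ""));
    if (dataLines.length > 0) {
      // Multiple data lines in one event are joined with newlines.
      events.push(dataLines.join("\n"));
    }
  }
  return { events, rest };
}
```

A caller would keep `rest`, append the next decoded chunk to it, and call the function again until the stream closes.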

GET /api/preferences

Get user preferences (languages, consent status).

Response:

{
  "preferredLanguages": ["Python", "JavaScript"],
  "consent": true
}

PUT /api/preferences

Update user preferences.

Request:

{
  "preferredLanguages": ["Python"],
  "consent": true
}
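
Since the request body arrives as untyped JSON, it is worth validating before it is stored. This is a sketch under assumptions: the field names match the example above, but `parsePreferences` and its trimming and de-duplication rules are hypothetical, not the repository's actual behavior.

```typescript
// Hypothetical validator for the preferences payload shown above.
interface Preferences {
  preferredLanguages: string[];
  consent: boolean;
}

function parsePreferences(input: unknown): Preferences | null {
  if (typeof input !== "object" || input === null) return null;
  const candidate = input as Record<string, unknown>;
  const languages = candidate.preferredLanguages;
  const consent = candidate.consent;
  if (!Array.isArray(languages)) return null;
  if (!languages.every((item) => typeof item === "string")) return null;
  if (typeof consent !== "boolean") return null;
  // Assumption: trim entries, drop blanks, and remove duplicates.
  const trimmed = languages
    .map((item: string) => item.trim())
    .filter((item) => item !== "");
  return { preferredLanguages: Array.from(new Set(trimmed)), consent };
}
```

A handler could respond with HTTP 400 whenever the function returns `null`.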

POST /api/admin/refresh-dataset

Refresh the dataset from the goodfirstissues repository.

Headers:

  • Authorization: Bearer <CRON_SECRET> (or x-vercel-cron header for Vercel Cron)

Response:

{
  "success": true,
  "issueCount": 1234,
  "languageCount": 45
}
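
The Bearer check on this endpoint can be sketched as a pure function. This is an illustration, not the repository's handler: `isAuthorizedRefresh` is a hypothetical name, it covers only the `Authorization` header path (not the `x-vercel-cron` alternative), and the choice to reject all requests when no secret is configured is an assumption.

```typescript
// Hypothetical Bearer-token check for the refresh endpoint.
function isAuthorizedRefresh(
  authorizationHeader: string | undefined,
  cronSecret: string | undefined,
): boolean {
  // Assumption: with no secret configured, refuse rather than allow open access.
  if (!cronSecret || !authorizationHeader) return false;
  const expected = `Bearer ${cronSecret}`;
  if (authorizationHeader.length !== expected.length) return false;
  // Compare every character so timing does not reveal how much matched.
  let diff = 0;
  for (let i = 0; i < expected.length; i++) {
    diff |= authorizationHeader.charCodeAt(i) ^ expected.charCodeAt(i);
  }
  return diff === 0;
}
```

Vercel's documentation describes sending the `CRON_SECRET` value as a Bearer token on cron invocations when that variable is set, so the same check can serve both manual and scheduled calls.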

Privacy & Analytics

  • Chat messages are NOT stored in our database
  • Analytics are opt-in only - Users must consent before any tracking
  • Anonymous metrics only - No personal information is collected
  • See /privacy page for full details
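
One way to enforce the opt-in rule is to route every analytics call through a single wrapper that checks consent first. The sketch below illustrates the idea with hypothetical names (`createTracker`, `AnalyticsEvent`, `Sender`); the real integration lives in lib/analytics.ts and may differ.

```typescript
// Hypothetical consent gate; the senders stand in for GA4 and Mixpanel calls.
type AnalyticsEvent = {
  name: string;
  properties?: Record<string, string | number | boolean>;
};
type Sender = (event: AnalyticsEvent) => void;

function createTracker(hasConsent: () => boolean, senders: Sender[]) {
  // Returns true when the event was forwarded, false when it was dropped.
  return function track(event: AnalyticsEvent): boolean {
    // Consent is re-read on every call so a withdrawal takes effect immediately.
    if (!hasConsent()) return false;
    for (const send of senders) {
      send(event);
    }
    return true;
  };
}
```

Because the gate sits in front of all senders, adding another analytics provider cannot bypass the consent check.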

Rate Limiting

The chat API is rate-limited to 20 requests per minute per user (based on anonymous ID + IP address).
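
To make the limit concrete, here is a minimal fixed-window sketch keyed on anonymous ID plus IP address. It rests on assumptions: the README does not say which algorithm is used, and this version keeps counters in memory, whereas a serverless deployment would need a shared store such as Redis. `FixedWindowRateLimiter` is a hypothetical name.

```typescript
// Hypothetical in-memory limiter; a real deployment would likely use Redis.
interface RateLimitResult {
  allowed: boolean;
  remaining: number;
}

class FixedWindowRateLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 20, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // `now` is injectable so the behavior can be tested without waiting.
  check(anonymousId: string, ip: string, now: number = Date.now()): RateLimitResult {
    const key = `${anonymousId}:${ip}`;
    const current = this.windows.get(key);
    // Start a fresh window on first use or once the previous one has expired.
    if (!current || now - current.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return { allowed: true, remaining: this.limit - 1 };
    }
    if (current.count >= this.limit) {
      return { allowed: false, remaining: 0 };
    }
    current.count += 1;
    return { allowed: true, remaining: this.limit - current.count };
  }
}
```

A fixed window is the simplest option but allows short bursts around window boundaries; a sliding window would smooth that out at the cost of more bookkeeping.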

Project Structure

├── app/
│   ├── api/
│   │   ├── chat/              # Chat API endpoint
│   │   ├── preferences/       # User preferences API
│   │   └── admin/
│   │       └── refresh-dataset/ # Dataset refresh endpoint
│   ├── privacy/               # Privacy policy page
│   ├── layout.tsx             # Root layout
│   ├── page.tsx               # Main chat page
│   └── globals.css            # Global styles
├── components/
│   ├── ChatInterface.tsx      # Main chat UI
│   ├── LanguagePicker.tsx     # Language selection component
│   ├── IssueCards.tsx         # Issue recommendation cards
│   ├── ConsentBanner.tsx      # Analytics consent banner
│   └── PrivacyDisclaimer.tsx  # Privacy notice component
├── lib/
│   ├── analytics.ts           # Analytics helpers (GA4, Mixpanel)
│   ├── redis.ts               # Redis client utilities
│   ├── kv.ts                  # KV utilities (uses Redis)
│   └── dataset.ts             # Dataset normalization and filtering
└── README.md
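
The tree lists lib/dataset.ts as handling dataset normalization and filtering. As a rough illustration of the filtering half, the sketch below matches issues against preferred languages. The `Issue` shape and `filterByLanguages` are hypothetical, and the case-insensitive comparison is an assumption rather than the module's documented behavior.

```typescript
// Hypothetical issue shape and filter; the real dataset fields may differ.
interface Issue {
  title: string;
  url: string;
  language: string;
}

function filterByLanguages(issues: Issue[], preferredLanguages: string[]): Issue[] {
  // Assumption: an empty preference list means "no filtering".
  if (preferredLanguages.length === 0) return issues;
  const wanted = new Set(
    preferredLanguages.map((language) => language.trim().toLowerCase()),
  );
  return issues.filter((issue) => wanted.has(issue.language.trim().toLowerCase()));
}
```

Normalizing both sides to lowercase keeps a preference such as "python" from missing issues labelled "Python".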

Environment Variables Reference

| Variable | Required | Description |
| --- | --- | --- |
| OPENAI_API_KEY | Yes | OpenAI API key for LLM |
| REDIS_URL | Yes | Redis connection URL (format: redis://[username]:[password]@[host]:[port]) |
| CRON_SECRET | Yes (prod) | Secret for protecting cron endpoints |
| NEXT_PUBLIC_GA_MEASUREMENT_ID | No | Google Analytics 4 measurement ID |
| NEXT_PUBLIC_MIXPANEL_TOKEN | No | Mixpanel project token |

License

MIT
