A chatbot that helps developers find good first issues in open source projects, powered by the goodfirstissues dataset.
- Chat-based interface for finding open source issues
- Language filtering - Select your preferred programming languages
- LLM-powered recommendations - Get personalized issue suggestions
- Privacy-first - Chat messages are not stored in our database
- Consent-gated analytics - Mixpanel and Google Analytics (opt-in only)
- Frontend: Next.js 16 (App Router), React 19, TypeScript
- Backend: Next.js API Routes, Vercel AI SDK
- Storage: Redis (via official redis package)
- LLM: OpenAI GPT-4o-mini (via Vercel AI SDK)
- Analytics: Mixpanel, Google Analytics 4
- Deployment: Vercel
- Node.js 20+ (required for Next.js 16)
- pnpm (install via `npm install -g pnpm` or see pnpm.io)
- Vercel account
- OpenAI API key
- Redis instance (e.g., Upstash Redis, Redis Cloud, or self-hosted)
- (Optional) Mixpanel account and Google Analytics 4 property
- Clone the repository:

      git clone <repository-url>
      cd pull-request-minimal

- Install dependencies:

      pnpm install

- Set up environment variables:
Create a .env.local file with the following variables:
# OpenAI API Key (required)
OPENAI_API_KEY=your_openai_api_key
# Redis (required)
REDIS_URL=redis://default:password@host:port
# Or for Redis with TLS:
# REDIS_URL=rediss://default:password@host:port
# Cron Secret for dataset refresh (required for production)
CRON_SECRET=your_random_secret_string
# Analytics (optional - only if you want analytics)
NEXT_PUBLIC_GA_MEASUREMENT_ID=G-XXXXXXXXXX
NEXT_PUBLIC_MIXPANEL_TOKEN=your_mixpanel_token

You can use any Redis provider:
- Upstash Redis (recommended for Vercel deployments):
  - Go to Upstash Console
  - Create a new Redis database
  - Copy the `REDIS_URL` from the dashboard
- Redis Cloud:
  - Sign up at Redis Cloud
  - Create a database
  - Copy the connection URL
- Self-hosted Redis:
  - Format: `redis://[username]:[password]@[host]:[port]`
  - For TLS: `rediss://[username]:[password]@[host]:[port]`
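To make the difference between the two URL schemes concrete, here is a minimal sketch that inspects a `REDIS_URL` value before it is handed to a client. The helper name `describeRedisUrl` and the `RedisTarget` shape are hypothetical and not part of this project; the fallback port 6379 is the conventional Redis default and is an assumption about what you want when the URL omits a port.

```typescript
// Hypothetical helper (not part of this repository): summarizes a REDIS_URL.
interface RedisTarget {
  tls: boolean; // true for rediss://, false for redis://
  host: string;
  port: number;
  username: string;
  hasPassword: boolean; // the password itself is deliberately not returned
}

function describeRedisUrl(raw: string): RedisTarget {
  const url = new URL(raw);
  if (url.protocol !== "redis:" && url.protocol !== "rediss:") {
    throw new Error(`Unsupported scheme: ${url.protocol}`);
  }
  return {
    tls: url.protocol === "rediss:",
    host: url.hostname,
    // Assumption: fall back to the conventional Redis port when none is given.
    port: url.port === "" ? 6379 : Number(url.port),
    username: decodeURIComponent(url.username),
    hasPassword: url.password !== "",
  };
}
```

A check like this can be useful at startup to log which host and scheme are in use without printing the password.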
Before using the chatbot, you need to load the dataset:
- Manual trigger (for development):

      curl -X POST http://localhost:3000/api/admin/refresh-dataset \
        -H "Authorization: Bearer your_cron_secret"

- Vercel Cron (for production):
  - Add a cron job in `vercel.json` (see below)
  - Or use Vercel Cron Jobs in the dashboard
`pnpm dev`

Open http://localhost:3000 in your browser.
- Push your code to GitHub/GitLab/Bitbucket
- Import the project in Vercel
- Add all environment variables in Vercel dashboard
- Deploy
Create a vercel.json file in the root directory:
{
"crons": [
{
"path": "/api/admin/refresh-dataset",
"schedule": "0 2 * * *"
}
]
}

This will refresh the dataset daily at 2 AM UTC.
Alternatively, you can set up cron jobs in the Vercel dashboard:
- Go to your project → Settings → Cron Jobs
- Add a new cron job:
  - Path: `/api/admin/refresh-dataset`
  - Schedule: `0 2 * * *` (daily at 2 AM UTC)
Important: Make sure to set the CRON_SECRET environment variable in Vercel. The cron job will automatically include the x-vercel-cron header, but you can also protect it with a Bearer token.
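The two ways of authorizing a refresh described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the project's actual handler: the name `isAuthorizedRefresh`, the `HeaderMap` type, and the convention that header names arrive lowercased are all hypothetical.

```typescript
// Hypothetical sketch (not the project's handler). Header names are assumed
// to be lowercased before they reach this function.
type HeaderMap = Record<string, string | undefined>;

function isAuthorizedRefresh(
  headers: HeaderMap,
  cronSecret: string | undefined
): boolean {
  // Path 1: the request carries the x-vercel-cron header.
  if (headers["x-vercel-cron"] !== undefined) {
    return true;
  }
  // Path 2: a Bearer token that matches CRON_SECRET.
  // With no secret configured, token-based access is refused outright.
  if (cronSecret === undefined || cronSecret === "") {
    return false;
  }
  return headers["authorization"] === `Bearer ${cronSecret}`;
}
```

Any client can set a request header, so a stricter variant would drop the first branch and require the Bearer token on every call.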
Chat endpoint that returns streaming LLM responses with issue recommendations.
Request:
{
"message": "Find me Python issues",
"preferredLanguages": ["Python"]
}

Response: Server-Sent Events (SSE) stream with:
- LLM text chunks
- Final `recommendedIssues` array with issue details
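The request body shown above could be checked with a type guard before it reaches the LLM. The sketch below is illustrative only: `ChatRequest` and `isChatRequest` are hypothetical names, and treating both fields as required (with a non-empty `message`) is an assumption, since the source does not say which fields are optional.

```typescript
// Hypothetical shape mirroring the example request body.
interface ChatRequest {
  message: string;
  preferredLanguages: string[];
}

// Assumption: both fields are required and message must not be blank.
function isChatRequest(value: unknown): value is ChatRequest {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.message === "string" &&
    candidate.message.trim().length > 0 &&
    Array.isArray(candidate.preferredLanguages) &&
    candidate.preferredLanguages.every((lang) => typeof lang === "string")
  );
}
```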
Get user preferences (languages, consent status).
Response:
{
"preferredLanguages": ["Python", "JavaScript"],
"consent": true
}

Update user preferences.
Request:
{
"preferredLanguages": ["Python"],
"consent": true
}

Refresh the dataset from the goodfirstissues repository.
Headers:
`Authorization: Bearer <CRON_SECRET>` (or `x-vercel-cron` header for Vercel Cron)
Response:
{
"success": true,
"issueCount": 1234,
"languageCount": 45
}

- Chat messages are NOT stored in our database
- Analytics are opt-in only - Users must consent before any tracking
- Anonymous metrics only - No personal information is collected
- See `/privacy` page for full details
The chat API is rate-limited to 20 requests per minute per user (based on anonymous ID + IP address).
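One way to picture a limit of 20 requests per minute keyed on anonymous ID plus IP address is a fixed-window counter. The sketch below is an assumption about the approach, not the project's implementation: it keeps counters in an in-memory `Map` as a stand-in for Redis, and the class name `FixedWindowLimiter` is hypothetical.

```typescript
// Values taken from the README: 20 requests per 60-second window.
const LIMIT = 20;
const WINDOW_MS = 60 * 1000;

interface WindowState {
  windowStart: number; // timestamp (ms) when the current window opened
  count: number; // requests seen in the current window
}

// Hypothetical in-memory limiter; a deployed version would likely keep
// these counters in Redis so they are shared across instances.
class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();

  allow(anonymousId: string, ip: string, now: number = Date.now()): boolean {
    const key = `${anonymousId}:${ip}`;
    const state = this.windows.get(key);
    // Open a fresh window on first sight or once the old one has expired.
    if (state === undefined || now - state.windowStart >= WINDOW_MS) {
      this.windows.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (state.count >= LIMIT) {
      return false;
    }
    state.count += 1;
    return true;
  }
}
```

The `now` parameter exists so the clock can be injected in tests. A fixed window allows short bursts across a window boundary; a sliding window would smooth that out at the cost of more bookkeeping.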
├── app/
│ ├── api/
│ │ ├── chat/ # Chat API endpoint
│ │ ├── preferences/ # User preferences API
│ │ └── admin/
│ │ └── refresh-dataset/ # Dataset refresh endpoint
│ ├── privacy/ # Privacy policy page
│ ├── layout.tsx # Root layout
│ ├── page.tsx # Main chat page
│ └── globals.css # Global styles
├── components/
│ ├── ChatInterface.tsx # Main chat UI
│ ├── LanguagePicker.tsx # Language selection component
│ ├── IssueCards.tsx # Issue recommendation cards
│ ├── ConsentBanner.tsx # Analytics consent banner
│ └── PrivacyDisclaimer.tsx # Privacy notice component
├── lib/
│ ├── analytics.ts # Analytics helpers (GA4, Mixpanel)
│ ├── redis.ts # Redis client utilities
│ ├── kv.ts # KV utilities (uses Redis)
│ └── dataset.ts # Dataset normalization and filtering
└── README.md
| Variable | Required | Description |
|---|---|---|
| `OPENAI_API_KEY` | Yes | OpenAI API key for LLM |
| `REDIS_URL` | Yes | Redis connection URL (format: `redis://[username]:[password]@[host]:[port]`) |
| `CRON_SECRET` | Yes (prod) | Secret for protecting cron endpoints |
| `NEXT_PUBLIC_GA_MEASUREMENT_ID` | No | Google Analytics 4 measurement ID |
| `NEXT_PUBLIC_MIXPANEL_TOKEN` | No | Mixpanel project token |
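The required/optional split in the table can be enforced at startup with a small check. This is a minimal sketch, assuming you want to fail fast on missing configuration; the function name `missingEnvVars` and the `Env` type are hypothetical and not part of this project.

```typescript
// Hypothetical startup check based on the table above.
type Env = Record<string, string | undefined>;

function missingEnvVars(env: Env, isProduction: boolean): string[] {
  const required = ["OPENAI_API_KEY", "REDIS_URL"];
  // CRON_SECRET is listed as required only for production.
  if (isProduction) {
    required.push("CRON_SECRET");
  }
  // A variable counts as missing when it is unset or blank.
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js process this could be called with `process.env` and a flag derived from `NODE_ENV`, throwing an error when the returned list is not empty. The two analytics variables are optional, so they are not checked.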
MIT