InternNexus

Production-ready internship aggregator with AI-powered job matching

Try the live app to search internships, apply filters, and test resume-based job matching.

InternNexus aggregates internship opportunities from multiple job boards (including Greenhouse, Lever, and Ashby) and provides intelligent filtering, categorization, visa sponsorship search, and AI-powered resume matching.

✨ Features

15,000+ Jobs from 145+ companies
Multi-Source Aggregation: Greenhouse, Lever, Workday, Ashby, SmartRecruiters
Hybrid Search: Keyword + semantic (vector) search combined for best results
Boolean Search: Advanced syntax (AND, OR, NOT, "exact", field:value)
AI-Powered Matching: Resume-to-job matching using local LLM embeddings
Smart Categorization: Automatic job categorization (Software Engineering, Data Science, PM, etc.)
Advanced Filtering: Category, location, visa sponsorship, FAANG+, work mode
Pipeline Resume: Interrupted runs can be resumed from the last successful step
Production Ready: Rate limiting, JWT auth, OAuth, comprehensive testing
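
To illustrate how keyword and semantic results might be combined, here is a minimal sketch that ranks jobs twice (by term overlap and by cosine similarity of embeddings) and merges the two rankings with reciprocal rank fusion. The fusion method, the function names, and the toy two-dimensional vectors are assumptions made for the example; this README does not state how InternNexus actually combines the two signals.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> int:
    """Count how many query terms appear in the text (case-insensitive)."""
    words = set(text.lower().split())
    return sum(1 for term in query.lower().split() if term in words)


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score(id) = sum over rankings of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, job_id in enumerate(ranking, start=1):
            scores[job_id] = scores.get(job_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is deterministic.
    return sorted(scores, key=lambda job_id: (-scores[job_id], job_id))


def hybrid_search(query: str, query_vec: list[float], jobs: list[dict]) -> list[str]:
    """Rank jobs by keyword overlap and by vector similarity, then fuse both rankings."""
    by_keyword = sorted(jobs, key=lambda j: (-keyword_score(query, j["title"]), j["id"]))
    by_vector = sorted(jobs, key=lambda j: (-cosine(query_vec, j["vec"]), j["id"]))
    return rrf_fuse([[j["id"] for j in by_keyword], [j["id"] for j in by_vector]])


# Hypothetical sample data; real embeddings have hundreds of dimensions.
JOBS = [
    {"id": "a", "title": "Python Backend Intern", "vec": [0.9, 0.1]},
    {"id": "b", "title": "Data Science Intern", "vec": [0.7, 0.7]},
    {"id": "c", "title": "Marketing Intern", "vec": [0.0, 1.0]},
]

print(hybrid_search("python intern", [1.0, 0.0], JOBS))
```

The same cosine comparison is the usual basis for resume-to-job matching: embed the resume text once, then rank jobs by similarity to that vector.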

🏗️ Architecture

┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   Frontend  │────▶│    Backend   │────▶│  PostgreSQL │
│  (Next.js)  │     │   (FastAPI)  │     │  + pgvector │
└─────────────┘     └──────────────┘     └─────────────┘
                            │
                            ▼
            ┌───────────────────────────────┐
            │       External Services       │
            │  OpenAI-compatible embeddings │
            │  Greenhouse / Lever / Ashby   │
            └───────────────────────────────┘

Tech Stack:

Frontend: Next.js 16, TypeScript, Tailwind CSS
Backend: FastAPI, SQLAlchemy 2.0, Pydantic
Database: PostgreSQL 18 + pgvector extension
Cache: In-memory TTL cache (optional external Redis)
AI: Ollama or an OpenAI-compatible API (embeddings)
Geo: pycountry (ISO country/state lookups)
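
For readers unfamiliar with the caching approach, the following is a minimal sketch of an in-memory TTL cache. The class name, method names, and the injectable clock are assumptions for illustration and are not taken from the InternNexus codebase.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Tiny in-memory cache whose entries expire `ttl` seconds after being set."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        """Store a value together with its expiry timestamp."""
        self._store[key] = (self.clock() + self.ttl, value)

    def get(self, key: str, default: Any = None) -> Any:
        """Return the cached value, or `default` if it is missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # evict lazily on read
            return default
        return value
```

A cache like this lives inside one process, which is why an external Redis becomes useful once several backend workers need to share entries.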

🚀 Quick Start

Prerequisites

Docker and docker compose
pnpm (frontend package manager)
uv (Python package manager)
Python 3.12+
OpenAI-compatible API or Ollama (for embeddings)

1. Clone & Setup

git clone <repository-url>
cd internjobs
cp .env.example .env
# Edit .env with your settings

2. Start Infrastructure

docker compose up -d db

3. Install & Run Backend

cd backend
uv sync --group dev

# Run database migrations
uv run alembic -c alembic.ini upgrade head

# Start the backend server
uv run uvicorn app.main:app --reload

Note: uv sync installs pycountry automatically; it is used for location normalization.

4. Install & Run Frontend

# In a new terminal, from the repository root
cd frontend
pnpm install
pnpm dev

5. Ingest Jobs

# In a new terminal, from the repository root
cd pipeline
uv sync --group dev
uv run internnexus-pipeline --skip-discover

Done! Visit http://localhost:3000

Local Terminal Workflow

For day-to-day development, run only Postgres in Docker and run the app services in terminals. Set POSTGRES_HOST=localhost in the repo root .env when using the local Docker database.

docker compose up -d db

Terminal 1:

cd backend
uv run alembic -c alembic.ini upgrade head
uv run uvicorn app.main:app --reload

Terminal 2:

cd pipeline
uv run internnexus-pipeline --skip-discover

Terminal 3:

cd frontend
pnpm dev

Bootstrap Admin Access

After signing in once with Google, promote your user from the local database. Replace the email value with the Google email you used to sign in.

docker exec jobs-db sh -c 'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "insert into admins (id, user_id, role, granted_by, notes) select gen_random_uuid(), id, '\''super_admin'\'', id, '\''local dev bootstrap'\'' from users where email = '\''you@example.com'\'' on conflict (user_id) do update set role = excluded.role, granted_by = excluded.granted_by, granted_at = now(), notes = excluded.notes;"'

📊 Data Pipeline

The ingestion system runs 7 sequential steps:

| Step | Action | Description |
|------|--------|-------------|
| 1 | Discover | Verify companies have active job boards |
| 2 | Sync inactive | Mark existing jobs inactive before refresh |
| 3 | Ingest | Fetch jobs from APIs, deduplicate, and upsert |
| 4 | Delete inactive | Remove jobs no longer present upstream |
| 5 | Cleanup | Normalize location data (city/state/country) |
| 6 | Classify | Categorize jobs with the configured model |
| 7 | Embed | Generate vector embeddings for matching |
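
The sequential, resumable design can be sketched as follows. This is an illustration only: the step identifiers `sync_inactive`, `delete_inactive`, and `classify`, the `run_pipeline` function, and the JSON checkpoint file are assumptions, and the real pipeline may track run state differently (for example, in the database).

```python
import json
from pathlib import Path
from typing import Callable

# Step order follows the pipeline table; some identifiers are hypothetical.
STEPS = ["discover", "sync_inactive", "ingest", "delete_inactive", "cleanup", "classify", "embed"]


def run_pipeline(
    handlers: dict[str, Callable[[], None]],
    state_file: Path,
    resume: bool = False,
) -> list[str]:
    """Run the steps in order and checkpoint after each successful one.

    With resume=True, steps recorded in the checkpoint are skipped.
    Returns the names of the steps executed during this call.
    """
    done: list[str] = []
    if resume and state_file.exists():
        done = json.loads(state_file.read_text())["done"]

    executed: list[str] = []
    for step in STEPS:
        if step in done:
            continue
        handlers[step]()  # if this raises, the checkpoint still points at the previous step
        done.append(step)
        state_file.write_text(json.dumps({"done": done}))
        executed.append(step)
    return executed
```

Writing the checkpoint only after a step succeeds means a crash mid-step causes that step to be re-run on resume, so each step should be safe to repeat.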

# Run from pipeline/
cd pipeline
uv run internnexus-pipeline

# Skip discovery (faster, uses cached companies)
uv run internnexus-pipeline --skip-discover

# Run continuously (interval from config)
uv run internnexus-pipeline -c

# Run with custom interval
uv run internnexus-pipeline -c --interval 3600

# Single step execution
uv run internnexus-pipeline --step discover
uv run internnexus-pipeline --step ingest
uv run internnexus-pipeline --step cleanup
uv run internnexus-pipeline --step embed

# Utility commands
uv run internnexus-pipeline --dry-run    # Preview without changes
uv run internnexus-pipeline --resume     # Resume failed run
uv run internnexus-pipeline --check      # Health checks only
uv run internnexus-pipeline --fresh      # Clear incomplete runs

# Re-process ALL locations (careful!)
uv run internnexus-pipeline --step cleanup --all

📚 Documentation

Documentation is still lightweight. For now, use this README, backend/.env.example, and the code in backend/, frontend/, and pipeline/ as the primary reference.

🧪 Testing

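# Run from the repository root. Each command runs in a subshell, so the working directory does not change.

# Backend
(cd backend && uv run pytest tests)
(cd backend && uv run pytest tests --cov=app)

# Pipeline
(cd pipeline && uv run pytest tests)

# Frontend
(cd frontend && pnpm run lint && pnpm test)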

🔧 Configuration

Key environment variables:

# Database
POSTGRES_DB=internnexus
POSTGRES_USER=internnexus
POSTGRES_PASSWORD=secure_password

# Redis (optional; leave empty for in-memory cache)
REDIS_URL=

# Auth (min 32 characters)
AUTH_SECRET=your-super-secret-key-min-32-chars

# AI Provider
EMBEDDING_PROVIDER=ollama
OPENAI_BASE_URL=http://localhost:11434
EMBEDDING_MODEL=nomic-embed-text

To use an OpenAI-compatible embeddings endpoint instead of Ollama, set EMBEDDING_PROVIDER=openai-compatible.

See .env.example for additional configuration options.
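
As a rough illustration of how these variables might be validated at startup, the sketch below enforces the 32-character minimum on AUTH_SECRET and treats an empty REDIS_URL as "use the in-memory cache". The `Settings` class, the `load_settings` function, and the fallback defaults are assumptions for the example; the backend's actual settings code may look different.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    auth_secret: str
    redis_url: Optional[str]  # None means "use the in-memory cache"
    embedding_provider: str
    embedding_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read and validate configuration from an environment-like mapping."""
    secret = env.get("AUTH_SECRET", "")
    if len(secret) < 32:
        raise ValueError("AUTH_SECRET must be at least 32 characters")

    # Defaults mirror the sample values shown in this README (an assumption).
    provider = env.get("EMBEDDING_PROVIDER", "ollama")
    if provider not in {"ollama", "openai-compatible"}:
        raise ValueError(f"unsupported EMBEDDING_PROVIDER: {provider}")

    return Settings(
        auth_secret=secret,
        redis_url=env.get("REDIS_URL") or None,  # empty string counts as unset
        embedding_provider=provider,
        embedding_model=env.get("EMBEDDING_MODEL", "nomic-embed-text"),
    )
```

Failing fast on a short secret surfaces misconfiguration at startup rather than at the first login attempt.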

🔍 Search Syntax

InternNexus supports advanced boolean search syntax:

| Query | Result |
|-------|--------|
| `python` | Hybrid search (keyword + semantic) |
| `python AND remote` | Both terms required |
| `python OR java` | Either term |
| `python NOT senior` | Exclude senior roles |
| `"software engineer"` | Exact phrase match |
| `title:python` | Search only in title |
| `company:google` | Search only in company |

Example: title:python AND remote NOT senior → Python remote roles, excluding senior positions.
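
To make the semantics concrete, here is a small, self-contained sketch of how such queries could be evaluated against a single job record. It is not the InternNexus parser: the function names, the plain substring matching (no semantic search), and the precedence rule (OR splits the query into groups, and every term inside a group must hold) are assumptions. Quoted values after a field prefix, such as `title:"software engineer"`, are not handled.

```python
import re

# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def tokenize(query: str) -> list[str]:
    """Split a query into tokens, keeping quoted phrases together."""
    return [phrase or word for phrase, word in TOKEN.findall(query)]


def term_matches(term: str, job: dict[str, str]) -> bool:
    """Match `field:value` against one field, anything else against all fields."""
    field, sep, value = term.partition(":")
    if sep and field in job:
        return value.lower() in job[field].lower()
    return any(term.lower() in text.lower() for text in job.values())


def matches(query: str, job: dict[str, str]) -> bool:
    """Evaluate a query: OR separates groups; within a group all terms must hold."""
    groups: list[list[tuple[bool, str]]] = [[]]
    negate = False
    for token in tokenize(query):
        if token == "OR":
            groups.append([])
        elif token == "AND":
            continue  # adjacency already means AND
        elif token == "NOT":
            negate = True  # applies to the next term only
        else:
            groups[-1].append((negate, token))
            negate = False
    return any(
        bool(group) and all(term_matches(term, job) != neg for neg, term in group)
        for group in groups
    )


# Hypothetical job record used only for this example.
job = {"title": "Python Backend Intern", "company": "Google", "location": "Remote"}
print(matches("title:python AND remote NOT senior", job))
```

Because matching is by substring, `java` would also match `javascript`; a real implementation would likely tokenize fields or rely on the database's full-text search.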

🤝 Contributing

We welcome contributions! Please follow the standard GitHub fork and pull request workflow.

Development Setup

# 1. Fork and clone
git clone https://github.com/your-username/internjobs.git
cd internjobs

# 2. Create branch
git checkout -b feature/your-feature

# 3. Make changes and run the checks for the surfaces you touched
# (run from the repository root; subshells keep the working directory unchanged)
(cd backend && uv run pytest tests)
(cd pipeline && uv run pytest tests)
(cd frontend && pnpm run lint && pnpm test)

# 4. Commit and push
git add -A
git commit -m "Add your feature"
git push origin feature/your-feature

# 5. Create Pull Request

📄 License

MIT License; see the LICENSE file

🙏 Acknowledgments

SimplifyJobs for job data sources
FastAPI for the excellent framework
llama.cpp for local AI capabilities

Built for job seekers everywhere
