CoderSATTY/Jobs-Tinder

💼 SwipeHire

AI-Powered Job Discovery Platform with Semantic Matching & Conversational RAG

SwipeHire is a job discovery platform that matches candidates with opportunities by comparing semantic embeddings of resumes and job descriptions. An LLM-driven parser extracts structured data from resumes, and a conversational assistant answers job-specific questions.


Architecture Overview

System Architecture

The platform follows a microservices-inspired architecture: a FastAPI backend deployed on serverless infrastructure (Modal) and a Next.js frontend hosted on Vercel.

[Figure: Architecture Diagram]

Key Components:

| Component | Technology | Description |
| --- | --- | --- |
| Client Layer | Next.js 16, Vercel Edge | React-based SSR frontend with CDN distribution |
| API Gateway | Firebase Auth, WebSocket | JWT authentication and real-time bidirectional streaming |
| Serverless Compute | Modal (ASGI) | FastAPI backend with Parser Agent, Semantic Ranker, RAG Engine |
| NLP Pipeline | Tesseract, Poppler, Gemini | OCR extraction, PDF rendering, LLM reasoning, vector embeddings |
| Persistence | Firestore, Vector Store | NoSQL database with semantic embedding index |
| Ingestion | JobSpy Scraper | Multi-source job aggregation from LinkedIn, Indeed, Glassdoor |

Data Pipeline Flow

The 7-stage pipeline processes data from ingestion through real-time delivery:

[Figure: Pipeline Flow]

Pipeline Stages Explained:

| Stage | Component | Technical Description |
| --- | --- | --- |
| 1. Data Acquisition | JobSpy Scraper | Multi-source job aggregation using web scraping from LinkedIn, Indeed, Glassdoor |
| 2. Document Processing | Tesseract OCR + Poppler | PDF rendering via Poppler, text extraction via Tesseract OCR engine |
| 3. NLP & Embeddings | Parser Agent + Embedder | Gemini 3 Flash Preview for entity extraction, text-embedding-004 for semantic vectors |
| 4. Storage & Indexing | Firestore + Vector Store | NoSQL persistence with approximate nearest neighbor (ANN) indexing |
| 5. Retrieval & Ranking | Semantic Retriever + Ranker | Pure cosine similarity scoring between resume and job embeddings |
| 6. RAG & Response | RAG Engine + LLM | In-memory context retrieval with streaming SSE responses |
| 7. Real-time Delivery | WebSocket Stream | Bidirectional WSS for instant job card updates |

Core Pipeline Details

1. Resume Parsing & Information Extraction

Multi-stage parsing pipeline to extract structured data from unstructured resume documents:

  • Document Ingestion: PDF/DOCX support via Poppler and python-docx
  • OCR Processing: Tesseract for scanned document text extraction
  • LLM-Powered Extraction: Gemini 3 Flash Preview extracts entities (name, skills, education, experience)
  • Schema Validation: Structured JSON output conforming to defined schemas
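The schema validation step can be illustrated with a minimal sketch. The `ParsedResume` fields mirror the entities listed above, but the exact schema, the field types, and the `validate_resume` helper are assumptions made for illustration, not the project's actual code.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ParsedResume:
    # Hypothetical schema: field names follow the entities named in this README.
    name: str
    skills: list[str] = field(default_factory=list)
    education: list[str] = field(default_factory=list)
    experience: list[str] = field(default_factory=list)


def validate_resume(raw_json: str) -> ParsedResume:
    """Parse the LLM's JSON output and reject anything that does not fit the schema."""
    data = json.loads(raw_json)  # raises ValueError (JSONDecodeError) on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    name = data.get("name")
    if not isinstance(name, str) or not name.strip():
        raise ValueError("'name' must be a non-empty string")

    lists = {}
    for key in ("skills", "education", "experience"):
        value = data.get(key, [])  # missing list fields default to empty
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"'{key}' must be a list of strings")
        lists[key] = value

    return ParsedResume(name=name.strip(), **lists)
```

Validating before persisting means a malformed LLM response fails loudly at the parsing boundary instead of producing a partially filled profile.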

2. Semantic Embedding Generation

Dense vector representations for similarity computation:

  • Embedding Model: Google's text-embedding-004
  • Dual Encoding: Separate embeddings for resumes and job descriptions
  • Batch Processing: Efficient corpus embedding for 3000+ jobs
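Batch processing of a large corpus might look like the sketch below. The embedding call itself is passed in as a function, `embed_batch`, which is a hypothetical stand-in for whatever client wraps text-embedding-004; the batch size of 100 is likewise an assumption, not a documented limit.

```python
from typing import Callable, Iterator, Sequence

Vector = list[float]


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_corpus(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], list[Vector]],  # hypothetical embedding client
    batch_size: int = 100,  # assumed value; tune to the provider's limits
) -> list[Vector]:
    """Embed every text, one batch per call, preserving input order."""
    vectors: list[Vector] = []
    for chunk in batched(texts, batch_size):
        result = embed_batch(chunk)
        if len(result) != len(chunk):
            raise RuntimeError("embedding count does not match batch size")
        vectors.extend(result)
    return vectors
```

Keeping the embedding client as a parameter makes the batching logic testable without network access.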

3. Recommendation & Ranking Engine

Pure semantic similarity ranking using cosine similarity:

  • Semantic Similarity: Cosine similarity between resume and job embeddings
  • Real-time Ranking: Jobs ranked and streamed via WebSocket in descending order
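As a rough illustration of the ranking step, the sketch below scores each job embedding against the resume embedding with cosine similarity and sorts in descending order. The function names and the `dict` of job vectors are assumptions for the example; the real service presumably reads embeddings from its vector store.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return cos(theta) between two vectors; 0.0 if either vector has zero length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_jobs(
    resume_vec: list[float],
    job_vecs: dict[str, list[float]],  # hypothetical mapping of job id -> embedding
) -> list[tuple[str, float]]:
    """Score every job against the resume and return (job_id, score), best first."""
    scored = [(job_id, cosine_similarity(resume_vec, vec)) for job_id, vec in job_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because the result is already sorted, a WebSocket handler could send the pairs one by one in list order to stream jobs from best to worst match.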

4. Conversational RAG (Retrieval-Augmented Generation)

Context-aware conversational AI for job-specific queries:

  • Context Injection: User resume + job posting injected into prompts
  • Streaming Responses: Real-time token streaming via Server-Sent Events (SSE)
  • Use Cases: Fit analysis, interview prep, cover letter generation, skills gap identification
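Context injection and SSE framing can be sketched as follows. The prompt wording, the `{"token": ...}` payload shape, and the `[DONE]` sentinel are assumptions chosen for the example; only the `data: ...` line followed by a blank line is fixed by the SSE format itself.

```python
import json
from typing import Iterable, Iterator


def build_prompt(resume: dict, job: dict, question: str) -> str:
    """Inject the user's resume and the job posting into a single prompt string."""
    return (
        "You are a career assistant. Answer using only the context below.\n\n"
        f"RESUME:\n{json.dumps(resume, indent=2)}\n\n"
        f"JOB POSTING:\n{json.dumps(job, indent=2)}\n\n"
        f"QUESTION: {question}"
    )


def to_sse(tokens: Iterable[str]) -> Iterator[str]:
    """Wrap each generated token in an SSE event, then emit an end-of-stream marker."""
    for token in tokens:
        # JSON-encoding keeps newlines inside a token from breaking the event framing.
        yield f"data: {json.dumps({'token': token})}\n\n"
    yield "data: [DONE]\n\n"  # assumed sentinel so the client knows when to stop
```

In a FastAPI handler, a generator like `to_sse` would typically be passed to a `StreamingResponse` with media type `text/event-stream`.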

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | Next.js 16, React, TypeScript, Tailwind CSS, Framer Motion |
| Backend | Python 3.11+, FastAPI (ASGI), Modal Serverless |
| AI/ML | Google Gemini 3 Flash Preview, text-embedding-004 |
| Database | Firebase Firestore, Vector Store |
| Auth | Firebase Authentication (JWT) |
| Real-time | WebSocket (WSS), Server-Sent Events (SSE) |
| OCR | Tesseract, Poppler |
| Deployment | Vercel (Frontend), Modal (Backend) |

API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /parse-resume | POST | OCR + LLM parsing of resume documents |
| /save-profile | GET | Fetch ranked job recommendations |
| /ws/jobs | WS | Real-time streaming of ranked jobs |
| /match | POST | Save job to user's matches |
| /matches | GET | Retrieve saved matches |
| /match/{id} | DELETE | Remove a saved match |
| /matches | DELETE | Clear all matches |
| /chat | POST | RAG-powered contextual chat |
| /health | GET | Health check endpoint |
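A minimal client sketch for these endpoints, using only the standard library, could look like this. The base URL matches the local uvicorn port used in this README; passing the Firebase JWT as a `Bearer` token in the `Authorization` header is an assumption, and request bodies are left out because their shapes are not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local backend port from the setup steps


def build_request(path, method="GET", token=None, payload=None):
    """Build a request for one endpoint without sending it."""
    headers = {}
    data = None
    if token:
        # Assumption: the backend expects the Firebase ID token as a Bearer token.
        headers["Authorization"] = f"Bearer {token}"
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(request):
    """Send the request and return (status_code, body_text)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, response.read().decode("utf-8")
```

With the backend running locally, `call(build_request("/health"))` would exercise the health check, and `build_request("/matches", method="DELETE", token=...)` would target the clear-all-matches route.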

Getting Started

Prerequisites

  • Node.js 18+
  • Python 3.11+
  • Firebase project with Firestore enabled
  • Google AI API key (Gemini)
  • Tesseract OCR installed
  • Poppler installed

Running Locally

Backend Setup

cd backend

# Install dependencies
pip install -r requirements.txt

# Set environment variables (Windows cmd syntax shown;
# on macOS/Linux use: export GEMINI_API_KEY=your-gemini-api-key)
set GEMINI_API_KEY=your-gemini-api-key

# Run the local server
uvicorn server:app --reload --port 8000

Frontend Setup

cd frontend

# Install dependencies
npm install

# Create .env.local file
echo NEXT_PUBLIC_API_URL=http://localhost:8000 > .env.local

# Run development server
npm run dev

The app will be available at http://localhost:3000


Production Deployment

Backend (Modal)

cd backend

# Create volume for credentials
modal volume create tfj-data
modal volume put tfj-data firebase-credentials.json /firebase-credentials.json
modal volume put tfj-data system_prompt.txt /system_prompt.txt

# Create secret for API key
modal secret create gemini-secret GEMINI_API_KEY=your_key

# Deploy to Modal
modal deploy modal_server.py

Frontend (Vercel)

cd frontend
vercel --prod -e NEXT_PUBLIC_API_URL=https://your-modal-url.modal.run

Features

  • ✅ Resume parsing with OCR + LLM extraction
  • ✅ Semantic job matching with vector embeddings
  • ✅ Real-time job recommendations via WebSocket
  • ✅ Swipe-based discovery interface (Tinder-style UX)
  • ✅ RAG-powered job-specific chatbot
  • ✅ Match persistence and management
  • ✅ Responsive, animated UI with Framer Motion
  • ✅ Serverless deployment on Modal + Vercel

Deployed Link

https://swipehire-lime.vercel.app/

License

MIT License - See LICENSE file for details.
