nnigam96/night-shift

Night Shift

A local-first, privacy-preserving agentic workflow for autonomous recruitment operations.

Night Shift targets a specific high-volume data problem: managing years of recruiter outreach without sacrificing data privacy. It runs inference, embedding, and storage entirely on an Apple Silicon device and follows a strict zero-data-exfiltration principle.

Architecture

Night Shift adopts a Lambda Architecture to balance historical processing with real-time responsiveness.

1. Data Pipeline

  • Batch layer (cold start): Ingests raw .mbox exports (27k+ emails), then a janitor process strips HTML artifacts, signatures, and legal disclaimers before vectorization.
  • Speed layer (warm state): Uses a multi-tenant Azure AD application with constrained scopes (Mail.Read) to fetch incremental updates and sync drafts while respecting enterprise firewalls.
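The batch layer's janitor pass can be sketched as below; the signature markers and disclaimer pattern here are illustrative stand-ins, not the repo's actual rules:

```python
import re
from html import unescape

# Illustrative heuristics -- the real janitor's rules are project-specific.
SIGNATURE_MARKERS = ("--", "Best regards", "Sent from my iPhone")
DISCLAIMER_RE = re.compile(r"(?is)this e-?mail (and any attachments )?is confidential.*")

def clean_body(raw: str) -> str:
    """Strip HTML tags, signatures, and legal disclaimers from an email body."""
    # Drop HTML tags and decode entities left over from multipart messages.
    text = unescape(re.sub(r"<[^>]+>", " ", raw))
    # Cut everything after a common signature delimiter.
    for marker in SIGNATURE_MARKERS:
        idx = text.find("\n" + marker)
        if idx != -1:
            text = text[:idx]
    # Remove boilerplate confidentiality disclaimers.
    text = DISCLAIMER_RE.sub("", text)
    # Collapse whitespace so vectorization sees clean running text.
    return re.sub(r"\s+", " ", text).strip()
```

Cleaning before vectorization matters because signatures and disclaimers repeat across thousands of emails and would dominate the embedding space.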

2. Agentic Brain (LangGraph)

An asynchronous LangGraph state machine replaces fragile linear chains:

  1. Classifier (Gatekeeper): Quantized Llama 3.1 in JSON mode separates genuine recruiter intent from automated spam.
  2. Historian (RAG): Queries a local FAISS vector store for interaction history with specific firms or senders.
  3. Drafter: Synthesizes a context-aware reply using the candidate resume plus the retrieved context.
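The three-node flow can be sketched without the LangGraph dependency. Every node body below is a stand-in (a keyword check instead of a Llama call, a stubbed retrieval instead of FAISS), and the state fields are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    email: str
    is_recruiter: bool = False
    history: list = field(default_factory=list)
    draft: str = ""

def classifier(state: AgentState) -> AgentState:
    # Gatekeeper: in the real graph, a quantized Llama 3.1 call in JSON mode.
    state.is_recruiter = "opportunity" in state.email.lower()
    return state

def historian(state: AgentState) -> AgentState:
    # RAG step: in the real graph, a FAISS similarity search over past threads.
    state.history = [f"prior thread matching: {state.email[:30]}"]
    return state

def drafter(state: AgentState) -> AgentState:
    # Synthesizes a reply from the resume plus retrieved context.
    state.draft = f"Thanks for reaching out. (context: {len(state.history)} threads)"
    return state

def run_graph(email: str) -> AgentState:
    state = classifier(AgentState(email=email))
    if not state.is_recruiter:  # conditional edge: spam short-circuits to END
        return state
    return drafter(historian(state))
```

In the actual pipeline this topology is a LangGraph `StateGraph`, which also supplies the interrupt-before-send hook described in the next section.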

3. Human-in-the-Loop

Automation pauses before send; drafts surface in a Streamlit dashboard for one-click approval or manual edits.

Tech Stack

| Component | Technology | Decision Driver |
| --- | --- | --- |
| Compute | Ollama (Llama 3.1) | Privacy-first inference on device; eliminates API costs and prevents PII egress. |
| Orchestration | LangGraph | Explicit state machine handles loops, thread history, and interrupt-before-send logic. |
| Vector Store | FAISS (CPU) | Local persistence is faster and cheaper than cloud DBs for fewer than 100k vectors. |
| Ingestion | Native Python | Parsing raw .mbox bytes bypasses corporate network restrictions and API rate limits. |
| UI | Streamlit | Decouples the review surface from backend logic for rapid iteration. |
| Outbound Email | Gmail API + OAuth2 | Local token storage keeps send authority on device; no third-party relay dependencies. |
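At sub-100k scale a flat index is all that is needed. For illustration only, here is a dependency-free brute-force nearest-neighbour lookup standing in for the FAISS index (the repo itself uses faiss-cpu over nomic-embed-text vectors; these function names are not from the codebase):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, corpus, k=3):
    """Return the k most similar (score, doc_id) pairs from an in-memory corpus."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in corpus.items()]
    return sorted(scored, reverse=True)[:k]
```

A flat (exhaustive) FAISS index performs the same exact search, just vectorized in C++; approximate indices only pay off at much larger corpus sizes.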

Setup and Usage

Prerequisites

  • Python 3.12+ (Conda recommended)
  • Ollama installed and serving (ollama serve)
  • Azure client ID for live-sync features

1. Environment

# Clone and set up environment
conda env create -f environment.yml
conda activate night-shift

# Pull local models
ollama pull llama3.1
ollama pull nomic-embed-text

2. Configuration

Create .env in the project root:

AZURE_CLIENT_ID=your-client-id-here
AZURE_TENANT_ID=common
OLLAMA_BASE_URL=http://localhost:11434

3. Gmail OAuth Setup

Night Shift sends final replies via the Gmail API. Create OAuth credentials in Google Cloud (Desktop App type), download the JSON secret, and place it in the project root:

night-shift/
├─ gmail_credentials.json   # downloaded client secret
└─ gmail_token.json         # auto-created after the first auth flow

Both files are git-ignored; keep them local.

4. Data Ingestion (Batch Job)

  1. Export Apple Mail or Outlook inbox to data/university_inbox.mbox.
  2. Clean and filter to isolate high-signal threads:
python -m src.ingest_local
  3. Vectorize to build the retrieval corpus:
python -m src.vectorize

5. Running the Agent

  1. Generate drafts with the LangGraph pipeline:
python -m src.night_shift
  2. Review, edit, and send via the dashboard (invokes Gmail immediately on send):
streamlit run ui/dashboard.py

Scale and Performance

  • Dataset: ~27k historic emails processed.
  • Filtering: heuristic and regex filters remove ~85% of low-value noise before inference.
  • Latency: end-to-end draft generation in under 15 seconds on M-series chips using 4-bit quantization.

Security

  • Authentication: OAuth2 via MSAL (public client flow).
  • Storage: local SQLite plus FAISS indices stored inside the git-ignored data/ directory.

About

TL;DR: a RAG project for inbox scrubbing.
