A local-first, privacy-preserving agentic workflow for autonomous recruitment operations.
Night Shift targets a specific high-volume data problem: managing years of recruiter outreach without sacrificing data privacy. It runs inference, embedding, and storage entirely on an Apple Silicon device and follows a strict zero-data-exfiltration principle.
Night Shift adopts a Lambda Architecture to balance historical processing with real-time responsiveness.
- Batch layer (cold start): Ingests raw `.mbox` exports (27k+ emails), then a janitor process strips HTML artifacts, signatures, and legal disclaimers before vectorization.
- Speed layer (warm state): Uses a multi-tenant Azure AD application with constrained scopes (`Mail.Read`) to fetch incremental updates and sync drafts while respecting enterprise firewalls.
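The batch-layer janitor can be sketched with the standard library alone. This is a minimal illustration, not the project's actual implementation; the function names and regex patterns below are assumptions:

```python
import mailbox
import re
from html import unescape

# Illustrative noise patterns: signature blocks, legal boilerplate, HTML tags.
SIG_RE = re.compile(r"(?m)^--\s*$.*", re.S)   # classic "-- " signature delimiter
LEGAL_RE = re.compile(r"(?is)this e-?mail (and any attachments )?(is|are) confidential.*")
TAG_RE = re.compile(r"<[^>]+>")               # crude HTML tag stripper

def strip_noise(body: str) -> str:
    """Remove HTML artifacts, signatures, and disclaimers before vectorization."""
    text = unescape(TAG_RE.sub(" ", body))
    text = SIG_RE.sub("", text)
    text = LEGAL_RE.sub("", text)
    return re.sub(r"\s+", " ", text).strip()

def iter_clean_messages(mbox_path: str):
    """Yield (subject, cleaned body) pairs from a raw .mbox export."""
    for msg in mailbox.mbox(mbox_path):
        payload = None if msg.is_multipart() else msg.get_payload(decode=True)
        if payload:
            yield msg.get("Subject", ""), strip_noise(payload.decode("utf-8", "replace"))
```

Running the cleaner once over the full export keeps downstream embedding costs proportional to signal, not raw bytes.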
An asynchronous LangGraph state machine replaces fragile linear chains:
- Classifier (Gatekeeper): Quantized Llama 3.1 in JSON mode separates genuine recruiter intent from automated spam.
- Historian (RAG): Queries a local FAISS vector store for interaction history with specific firms or senders.
- Drafter: Synthesizes a context-aware reply using the candidate resume plus the retrieved context.
Automation pauses before send; drafts surface in a Streamlit dashboard for one-click approval or manual edits.
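The three-node flow plus the interrupt-before-send gate can be sketched as a plain-Python state machine. The real project uses LangGraph; the `EmailState` fields, node logic, and routing names here are illustrative stand-ins:

```python
from dataclasses import dataclass, field

@dataclass
class EmailState:
    """Shared state threaded through the graph, LangGraph-style."""
    email: str
    is_recruiter: bool = False
    history: list = field(default_factory=list)
    draft: str = ""
    status: str = "pending"

def classifier(state: EmailState) -> str:
    # Stand-in for the quantized Llama 3.1 JSON-mode gatekeeper.
    state.is_recruiter = "opportunity" in state.email.lower()
    return "historian" if state.is_recruiter else "end"

def historian(state: EmailState) -> str:
    # Stand-in for the FAISS lookup of prior interactions.
    state.history.append("prior thread with this firm")
    return "drafter"

def drafter(state: EmailState) -> str:
    state.draft = f"Thanks for reaching out. (context: {len(state.history)} past threads)"
    state.status = "awaiting_approval"  # interrupt before send: a human reviews in the UI
    return "end"

NODES = {"classifier": classifier, "historian": historian, "drafter": drafter}

def run(state: EmailState) -> EmailState:
    node = "classifier"
    while node != "end":
        node = NODES[node](state)
    return state
```

The key property this models is that no path terminates in an automatic send: the drafter only ever parks the thread in `awaiting_approval`.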
| Component | Technology | Decision Driver |
|---|---|---|
| Compute | Ollama (Llama 3.1) | Privacy-first inference on device; eliminates API costs and prevents PII egress. |
| Orchestration | LangGraph | Explicit state machine handles loops, thread history, and interrupt-before-send logic. |
| Vector Store | FAISS (CPU) | Local persistence is faster and cheaper than cloud DBs for fewer than 100k vectors. |
| Ingestion | Native Python | Parsing raw .mbox bytes bypasses corporate network restrictions and API rate limits. |
| UI | Streamlit | Decouples the review surface from backend logic for rapid iteration. |
| Outbound Email | Gmail API + OAuth2 | Local token storage keeps send authority on device; no third-party relay dependencies. |
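The Gatekeeper's JSON-mode call can be illustrated by building the request payload for Ollama's `/api/chat` endpoint. The endpoint, `format`, and `stream` fields follow Ollama's documented API; the prompt wording and output schema are assumptions:

```python
import json

OLLAMA_BASE_URL = "http://localhost:11434"

def build_classifier_request(email_body: str) -> dict:
    """Payload for Ollama's /api/chat with JSON-constrained output."""
    return {
        "model": "llama3.1",
        "format": "json",   # ask Ollama to constrain the reply to valid JSON
        "stream": False,
        "messages": [
            {
                "role": "system",
                "content": 'Classify the email. Reply as {"is_recruiter": bool, "reason": str}.',
            },
            {"role": "user", "content": email_body},
        ],
    }

# The dict would be POSTed to f"{OLLAMA_BASE_URL}/api/chat" with urllib or httpx.
```

Constraining the model to JSON at the API level keeps the gatekeeper's output machine-parseable without brittle string scraping.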
- Python 3.12+ (Conda recommended)
- Ollama installed and serving (`ollama serve`)
- Azure client ID for live-sync features
```bash
# Clone and set up environment
conda env create -f environment.yml
conda activate night-shift

# Pull local models
ollama pull llama3.1
ollama pull nomic-embed-text
```

Create `.env` in the project root:
```
AZURE_CLIENT_ID=your-client-id-here
AZURE_TENANT_ID=common
OLLAMA_BASE_URL=http://localhost:11434
```

Night Shift sends final replies via the Gmail API. Create OAuth credentials in Google Cloud (Desktop App type), download the JSON secret, and place it in the project root:
```
night-shift/
├─ gmail_credentials.json   # downloaded client secret
└─ gmail_token.json         # auto-created after the first auth flow
```
Both files are git-ignored; keep them local.
- Export Apple Mail or Outlook inbox to `data/university_inbox.mbox`.
- Clean and filter to isolate high-signal threads:

  ```bash
  python -m src.ingest_local
  ```

- Vectorize to build the retrieval corpus:

  ```bash
  python -m src.vectorize
  ```

- Generate drafts with the LangGraph pipeline:

  ```bash
  python -m src.night_shift
  ```

- Review, edit, and send via the dashboard (invokes Gmail immediately on send):

  ```bash
  streamlit run ui/dashboard.py
  ```

- Dataset: ~27k historic emails processed.
- Filtering: heuristic and regex filters remove ~85% of low-value noise before inference.
- Latency: end-to-end draft generation in under 15 seconds on M-series chips using 4-bit quantization.
- Authentication: OAuth2 via MSAL (public client flow).
- Storage: local SQLite plus FAISS indices stored inside the git-ignored `data/` directory.
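The heuristic/regex pre-filter that removes low-value noise before any inference can be sketched as follows. The specific patterns are illustrative assumptions, not the project's actual rules:

```python
import re

# Illustrative low-value signals: newsletters, automated senders, calendar noise.
NOISE_PATTERNS = [
    re.compile(r"(?i)\bunsubscribe\b"),
    re.compile(r"(?i)\bno-?reply@"),
    re.compile(r"(?i)\b(newsletter|digest|webinar)\b"),
    re.compile(r"(?i)calendar invit(e|ation)"),
]

def is_high_signal(sender: str, subject: str, body: str) -> bool:
    """Cheap regex gate run before any LLM call; rejects obvious bulk mail."""
    text = f"{sender}\n{subject}\n{body}"
    return not any(p.search(text) for p in NOISE_PATTERNS)
```

Because each pattern is a single regex scan, the gate costs microseconds per message, which is what makes discarding the bulk of the corpus before inference practical on a laptop.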