A daily-run job search pipeline (Steps 1-6) that fetches job postings, disqualifies irrelevant roles through a multi-gate filter, scores the remainder with an LLM (Gemini free tier), and ranks them into a priority list. Steps 7+ (resume tailoring, outreach, application tracking) are described for context but are not included in this repository.
Pipeline state: post-recovery patch; first clean validation completed; still experimental and under human review.
This project recently went through a late-April 2026 recovery after filtering/scoring logic became too aggressive and priority output collapsed from dozens of rows per day to 1-5 rows per day. One clean forced validation run has completed successfully, but this should not be treated as proof of long-term stability.
Human review is requested before further automation expansion.
Primary review areas:
- hard reject vs soft reject separation
- remote/hybrid location safety
- AP/staff accountant/payroll systems scoring
- prevention of silent data loss
- LLM parse/API failure handling
- whether the system should produce a review queue for borderline or failed jobs
See REVIEW_REQUEST.md for the full reviewer brief and open questions.
This is a real-world job search pipeline built over roughly five weeks of daily use by a billing/AP/payroll systems/staff accountant specialist targeting remote and regional roles. It was refined through failures and is now open-sourced to share the architecture and invite community input.
The core insight: most job search tools just list jobs. This pipeline scores, ranks, and learns -- cutting through noise so you spend time applying to roles that match, not reading 50 postings to find 3 worth sending a resume to.
This pipeline is not:
- A plug-and-play tool. The scoring rubric, keyword weights, disqualifiers, and cluster definitions are all tuned for one person's profile.
- A web scraper wrapper. JobSpy handles the actual scraping; this pipeline adds filtering, scoring, ranking, and data retention on top of it.
- Production software. It runs locally on Windows via PowerShell and Python. No Docker, no cloud deployment, no database -- just files.
Early versions of this pipeline silently overwrote historical data on every run. After discovering that 30+ days of scoring data had been lost, every script was audited and a timestamped archive pattern was applied uniformly. The full findings are documented in PIPELINE_DATA_AUDIT.md.
Beyond data retention, the pipeline addresses three scoring problems:
- False positives -- jobs that sound related (e.g., "billing" in a nursing context) pass naive keyword filters and waste LLM quota.
- False negatives -- strong-fit jobs with unusual title wording get buried by keyword-only scoring.
- Static weights -- a system that scores "billing coordinator" the same in month 1 as month 6, even after learning which clusters produce interviews.
┌──────────────────┐
│ fetch_jobs.py │ JobSpy scrapes Indeed, LinkedIn, etc.
└────────┬─────────┘
│ new_jobs_YYYY-MM-DD_HHMM.csv
┌────────▼─────────┐
│ build_queue.py │ Dedup, standardize, archive
└────────┬─────────┘
│ queue.csv
┌────────▼─────────┐
│ pre_scoring.py │ Cluster, zone, environment, pre_score (0-1)
└────────┬─────────┘
│ queue_prescored.csv
┌────────▼─────────┐
│ disqualify.py │ 14-gate hard-rejection engine
└────────┬─────────┘
│ (adds is_disqualified flag)
┌────────▼─────────┐
│ prepare_scoring │ Split into batches of 8; skip disqualified
│ _batch.py │ and already-rejected URLs
└────────┬─────────┘
│ batch_*.md
┌────────▼─────────┐
│ gemini_score.py │ Gemini 2.5 Flash Lite via system_instruction
│ grok_score.py │ (xAI Grok fallback if Gemini quota exhausted)
└────────┬─────────┘
│ queue_scored.csv (TIER 1 / TIER 2 / MONITOR / SKIP)
┌────────▼─────────┐
│ priority_scoring │ Composite formula + time decay → 0-100
│ .py │ Tier 1 >= 85, Tier 2 >= 60
└────────┬─────────┘
│ priority.csv
┌────────▼─────────┐
│ generate_resume │ Tailored .docx for Tier 1 jobs (optional)
└──────────────────┘
Steps 1-6 run daily via run_daily.ps1. Steps 7+ cover outreach,
trajectory tracking, and analytics; they are not included in this repository.
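The Steps 1-6 ordering above can be sketched in Python. The script paths come from the repository layout; the fail-fast behavior is an assumption about how a runner should behave (stop loudly on the first non-zero exit), not a transcript of the actual run_daily.ps1:

```python
import subprocess
import sys

# Steps 1-6 in pipeline order (paths from the repository layout).
STEPS = [
    "JobFetcher/fetch_jobs.py",
    "Scripts/build_queue.py",
    "Scripts/pre_scoring.py",
    "Scripts/disqualify.py",
    "Scripts/prepare_scoring_batch.py",
    "Scripts/gemini_score.py",
    "Scripts/priority_scoring.py",
]

def run_pipeline(steps=STEPS):
    """Run each step in order; raise on the first non-zero exit so a
    failed step can never silently produce a partial priority.csv."""
    for script in steps:
        result = subprocess.run([sys.executable, script])
        if result.returncode != 0:
            raise RuntimeError(f"{script} failed (exit {result.returncode})")
```

The fail-fast choice matters here because silent partial output is exactly the failure mode the recovery patch was written to prevent.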
disqualify.py runs a 14-gate hard-rejection engine before any job reaches
Gemini. Gates catch:
- No billing/AR signal in title or description at all
- Wrong role archetype (clinical coder, benefits admin, etc.)
- Executive seniority mismatch
- Healthcare-specific EMR/EHR requirements the candidate lacks
- Out-of-state hybrid or onsite roles
- Required tool stacks the candidate hasn't used
- Specialty niches outside the candidate's experience
Result: 76-92% of fetched jobs are rejected before the LLM sees them, preserving free-tier quota for roles that actually warrant evaluation.
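The gate pattern can be sketched as follows. The regexes, gate names, and `home_state` placeholder are illustrative, not the repository's actual 14 gates; each gate returns a reject reason, and the first hit wins:

```python
import re

# Illustrative subset of the hard-reject gates (the real disqualify.py has 14).
SIGNAL = re.compile(r"\b(billing|accounts receivable|a/r|invoic\w*)\b", re.I)
CLINICAL = re.compile(r"\b(rn|nurse|clinical|patient)\b", re.I)

def hard_reject_reason(title, description, state, work_mode, home_state="XX"):
    """Return the first matching reject reason, or None if the job
    survives the gates and is eligible for LLM scoring."""
    text = f"{title} {description}"
    if not SIGNAL.search(text):
        return "no_billing_ar_signal"       # no billing/AR signal at all
    if CLINICAL.search(title):
        return "wrong_archetype_clinical"   # e.g. "billing" in a nursing title
    if work_mode in ("hybrid", "onsite") and state != home_state:
        return "out_of_state_non_remote"    # location safety gate
    return None
```

Returning a named reason rather than a bare boolean is what makes the later questions about permanent vs TTL suppression answerable per gate.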
The full scoring rubric lives in State/scoring_rubric.md and is injected
as system_instruction on every Gemini call. Batch files contain only job
data. This means:
- Swapping the scorer (Gemini, Grok, Claude, local model) requires changing one variable, not rewriting the prompt.
- The rubric is version-controlled and human-readable.
- Feedback constraints (`State/feedback.json`) are appended to the rubric per-run without modifying the base file.
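The rubric-assembly side of this pattern can be sketched as below. The feedback format (a flat JSON object mapping constraint name to instruction) is an assumption; only the file paths come from the repository layout:

```python
import json
from pathlib import Path

def build_system_instruction(rubric_path="State/scoring_rubric.md",
                             feedback_path="State/feedback.json"):
    """Load the version-controlled rubric and append per-run feedback
    constraints without ever modifying the base rubric file."""
    rubric = Path(rubric_path).read_text(encoding="utf-8")
    fb = Path(feedback_path)
    if fb.exists():
        constraints = json.loads(fb.read_text(encoding="utf-8"))
        extra = "\n".join(f"- {name}: {rule}" for name, rule in constraints.items())
        rubric += "\n\n## Session feedback constraints\n" + extra
    return rubric

# The assembled text is then passed as the system instruction on every call,
# so swapping scorers only means changing the client/model, e.g. with the
# google-genai SDK:
#   config = types.GenerateContentConfig(system_instruction=build_system_instruction())
```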
priority_scoring.py combines three signals into a 0-100 score:
final = (pre_score * 0.35) + (priority_base * 0.50) + (cluster_weight * 0.15)
Then applies time decay (jobs posted more than 7 days ago are penalized).
Gemini's rubric judgment (priority_base) is the dominant signal at 0.50.
The weights are tunable via State/scoring_weights.json.
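The formula plus decay can be sketched as follows. Two assumptions to flag: that all three signals are normalized to 0-100 before weighting, and that the decay is exponential with a one-week half-life; both are illustrative choices, not the repository's exact implementation:

```python
# Illustrative composite scorer; weights mirror State/scoring_weights.json.
WEIGHTS = {"pre_score": 0.35, "priority_base": 0.50, "cluster_weight": 0.15}
FRESH_DAYS = 7          # no penalty within the first week
DECAY_HALF_LIFE = 7.0   # assumed: score halves for each week past that

def composite_score(pre_score, priority_base, cluster_weight, days_old):
    """Composite 0-100 priority. Assumes all three input signals are
    already normalized to a 0-100 scale before weighting."""
    score = (pre_score * WEIGHTS["pre_score"]
             + priority_base * WEIGHTS["priority_base"]
             + cluster_weight * WEIGHTS["cluster_weight"])
    if days_old > FRESH_DAYS:
        score *= 0.5 ** ((days_old - FRESH_DAYS) / DECAY_HALF_LIFE)
    return score
```

With these assumptions, a 3-day-old job scoring 80/90/70 lands at 83.5 (Tier 2 territory), and the same job at 14 days old decays to roughly half that, dropping it out of both tiers.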
The pipeline supports Bayesian weight updates: scanning ResumeVersions/
for company-specific PDFs (a strong passive signal that an application was
submitted) and matching them to past priority outputs. Using Bayesian
dampening, it adjusts scoring_weights.json over time -- rewarding clusters
that generate applications, penalizing ignored ones. The update script relies
on excluded personal data folders and is not included in this repository.
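The dampening idea can be sketched as a pseudo-count blend. This exact update rule illustrates the concept, not the excluded script's actual math: the current weight acts as a prior worth `prior_strength` observations, so a cluster's weight drifts toward its observed application rate only as evidence accumulates.

```python
def updated_cluster_weight(old_weight, applications, surfaced, prior_strength=10.0):
    """Blend a cluster's current weight with its observed application rate.
    prior_strength pseudo-observations dampen the update, so one lucky
    week cannot swing scoring_weights.json violently."""
    if surfaced == 0:
        return old_weight  # no evidence this period; leave the weight alone
    observed_rate = applications / surfaced
    return ((old_weight * prior_strength + observed_rate * surfaced)
            / (prior_strength + surfaced))
```

For example, a cluster weighted 0.5 that produced 8 applications from 10 surfaced jobs moves only to 0.65, not to the raw 0.8 rate, because the 10 pseudo-observations pull it back.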
Every intermediate file is archived with _YYYY-MM-DD_HHMM timestamps.
The current design attempts to prevent silent overwrites by archiving key intermediate files and recording failure states. See PIPELINE_DATA_AUDIT.md
for the full per-file retention map.
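The timestamped-archive pattern can be sketched as an archive-before-overwrite helper. Function and directory names here are illustrative; the audit document defines the real per-file map:

```python
import shutil
from datetime import datetime
from pathlib import Path

def archive_then_write(path, new_text, archive_dir):
    """Copy the current file to a _YYYY-MM-DD_HHMM-stamped archive before
    overwriting it, so no run can silently destroy a previous day's output."""
    path, archive_dir = Path(path), Path(archive_dir)
    if path.exists():
        archive_dir.mkdir(parents=True, exist_ok=True)
        stamp = datetime.now().strftime("%Y-%m-%d_%H%M")
        shutil.copy2(path, archive_dir / f"{path.stem}_{stamp}{path.suffix}")
    path.write_text(new_text, encoding="utf-8")
```

The key property is ordering: the copy happens before the write, so even a crash mid-write leaves yesterday's data recoverable from the archive.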
JobSearchOptimizer/
├── JobFetcher/
│ └── fetch_jobs.py # Job scraper (JobSpy + Adzuna fallback)
├── Scripts/
│ ├── build_queue.py # Dedup + standardize -> queue.csv
│ ├── pre_scoring.py # Keyword scoring, zone/cluster
│ ├── disqualify.py # Multi-gate filter
│ ├── prepare_scoring_batch.py # Batch prep for LLM
│ ├── gemini_score.py # Gemini scorer (primary)
│ ├── grok_score.py # Grok scorer (fallback)
│ ├── priority_scoring.py # Composite formula + decay
│ ├── generate_resume.py # Tailored .docx builder (see note below)
│ ├── validate_queue_scored.py # Schema + range checks
│ ├── prune_reject_log.py # Prune stale rejected URLs
│ └── priority_drift_monitor.py # Detect scoring drift over time
├── State/
│ ├── scoring_rubric.md # LLM system instruction (the rubric)
│ ├── scoring_weights.json # Per-cluster priority weights
│ └── feedback.json # Session-level overrides (EXCLUDE)
├── Automation/ # Pipeline output -- EXCLUDE from repo
├── Profile/ # Resume bullets -- EXCLUDE from repo
├── ResumeVersions/ # Application PDFs -- EXCLUDE from repo
├── run_daily.ps1 # Headless daily pipeline runner (Steps 1-6)
├── PIPELINE.md # Full technical reference
└── PIPELINE_DATA_AUDIT.md # Per-file data retention audit
Note on `generate_resume.py` (Step 6): This script requires `build_resume_base.py` and `Profile/experience_library.json`, which contain personal resume data and are not included in this repository. The script exits gracefully with an error message if those files are missing -- the pipeline runner warns and continues. Steps 1-5 run fully without it.
This pipeline is tuned for one person's job search. To adapt it:
- Edit `disqualify.py` -- replace the gate logic with your target domain's disqualifiers (the 14 gates are well-commented).
- Edit `State/scoring_rubric.md` -- replace the scoring criteria with your profile and target roles.
- Edit `KEYWORD_WEIGHTS` in `pre_scoring.py` -- match your domain's vocabulary.
- Edit `JobFetcher/fetch_jobs.py` -- set your search terms, locations, and sites.
Prerequisites:
- Python 3.10+ with `google-genai`, `python-jobspy`, `python-docx`, `openai`
- Google AI Studio API key (free tier) -- set `GOOGLE_API_KEY` in `keys.env`
- PowerShell 5.1+ (Windows)
- Windows Task Scheduler (optional, for daily automation)
Install dependencies:

python -m pip install google-genai python-jobspy python-docx openai --break-system-packages

Run a single daily cycle (Steps 1-5):

.\run_daily.ps1 -phase all

Output: `Automation/priority.csv` -- your ranked Tier 1/2 jobs for the day.
Step 6 (`generate_resume.py`) will warn and skip gracefully if its excluded dependencies are missing. Customize or omit it as needed.
Run specific phases:
.\run_daily.ps1 -phase pre-score # Steps 1-3 only (fetch + filter)
.\run_daily.ps1 -phase post-score # Steps 4-6 only (score + rank)

# API keys
keys.env
# Personal data -- never commit
Profile/
ResumeVersions/
Automation/
State/feedback.json
# Pipeline state (large, regeneratable)
JobFetcher/seen_jobs.json
Automation/rejected_urls_seen.txt
pipeline_checkpoint.json
wave8_state.json
# Python
__pycache__/
*.pyc
*.pyo
# Archives (large, regeneratable)
Automation/PRESCORED_ARCHIVE/
Automation/SCORING_BATCH_ARCHIVE/
Automation/SCRAPES_RAW/

Note: This is a personal job-search pipeline, not a general-purpose tool. The
scoring rubric, keyword weights, and disqualifier logic are tuned for a specific
candidate profile. If you adapt it, start with State/scoring_rubric.md and
disqualify.py -- those two files drive most of the filtering behavior.
Human review, issues, and critical feedback are welcome. This repo is shared as a transparent architecture/recovery case study, not as a polished general-purpose product.
Community input is especially welcome on these open questions:
- Scoring formula -- Is the 0.35/0.50/0.15 weight split between pre_score, Gemini score, and cluster weight optimal? Would cross-validation against actual application outcomes improve it?
- Gate efficiency -- Are there false positive or false negative patterns in `disqualify.py` that a new rule could catch cleanly?
- Reject log taxonomy -- Which gate failures deserve permanent vs TTL suppression? Currently only salary floor and collections-primary are permanent.
- Dedup pruning -- `rejected_urls_seen.txt` (permanent log) and `disqualified_soft.csv`/`gemini_skipped.csv` (TTL logs) each have different retention needs. What are sensible policies for each?
- Cross-platform -- The runner is PowerShell on Windows. A `run_daily.sh` bash equivalent would make this usable on Linux/macOS.
- Monitoring integration -- `priority_drift_monitor.py`, `priority_tuning_assistant.py`, and `priority_action_alignment.py` exist but are not wired into the daily pipeline. Help integrating them would be valuable.
- LLM backends -- Adding support for other free-tier LLMs (Mistral, local LLaMA via Ollama) as additional fallback scorers.
Please read PIPELINE.md before contributing code. For architecture or design changes, open an issue first.
MIT -- use, modify, and share freely.
Built with significant assistance from Claude (Anthropic), ChatGPT, Gemini, Microsoft Copilot, and DeepSeek AI.