
singhalpooja9/JobScannerMCP


JobScannerMCP

One MCP server, any host. This is a Model Context Protocol server that exposes a job-posting scanner as clean, typed tools. The repo then drives that unmodified server from two different agent hosts (Claude and Goose) to show that MCP is a portability layer, not a buzzword.


What this is: a small, working FastMCP server (search jobs, check scan health, score fit, draft outreach) with a built-in guardrail — plus the config to run it from Claude Code and a validated Goose recipe. Build the tools once; any MCP host can use them.

What this is not: a live scraper or a product. It serves synthetic sample data so it runs offline with no keys and no private code.


Why I built it

"MCP" gets said a lot; far fewer people have actually built a server and consumed it from more than one host. I wanted a concrete artifact that shows the whole loop — clean tool schemas, a governance guardrail, and genuine host-portability — on a domain I know (a job scanner). The payoff is "build once, integrate everywhere": the same four tools work in Claude, Goose, Cursor, or anything that speaks MCP.

(Clean-room + synthetic data only — no real résumé, no private scanner code, no credentials.)


The idea in one picture

                    ┌─────────────────────────────┐
   Claude Code  ───▶│                             │
   (MCP client)     │   job-scanner  (FastMCP)    │
                    │                             │
   Goose        ───▶│   tools:  search_jobs        │──▶  synthetic
   (recipe)         │           get_company_health │     postings +
                    │           score_fit          │     company health
                    │           draft_outreach 🔒  │     (offline, no keys)
                    │   resource: companies://…    │
                    └─────────────────────────────┘
        two hosts, ONE unmodified server        🔒 = dry-run guardrail (never sends)

Quickstart (no API key)

git clone https://github.com/singhalpooja9/JobScannerMCP
cd JobScannerMCP
pip install -e ".[dev]"

pytest -q                          # 8 in-memory MCP tests, fully offline
python -m jobscanner.server        # run the server over stdio

The tools

| Tool | What it does | Notes |
| --- | --- | --- |
| search_jobs(keywords, country, remote_only, limit) | keyword search over postings | read-only |
| get_company_health(company) | per-company scan status (ok/blocked/…) | explains why a company returned nothing |
| score_fit(job_id, profile) | 0–100 fit + label for a posting | deterministic heuristic; for a calibrated LLM judge see JobFitJudge |
| draft_outreach(job_id, tone) | drafts a note; sent is always false | 🔒 the guardrail: the server cannot send |

Plus an MCP resource (companies://registry) and a reusable prompt (find_roles).
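As an illustration of the read-only search tool, here is a minimal sketch of a keyword filter with the same four arguments. It is not the code in jobscanner/server.py: the Posting dataclass, the SAMPLE rows, the "every keyword must appear" rule, and matching country against the location field are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posting:
    # Hypothetical shape; the field names follow the posting fields listed in this README.
    id: str
    company: str
    title: str
    description: str
    location: str = ""
    remote: bool = False


def search_jobs(postings, keywords, country=None, remote_only=False, limit=10):
    """Return up to `limit` postings whose title or description contains every keyword."""
    terms = [t.lower() for t in keywords.split()]
    hits = []
    for p in postings:
        text = f"{p.title} {p.description}".lower()
        if not all(t in text for t in terms):
            continue
        # Assumption: country is matched as a case-insensitive substring of location.
        if country and country.lower() not in p.location.lower():
            continue
        if remote_only and not p.remote:
            continue
        hits.append(p)
    return hits[:limit]


# Synthetic rows invented for this example.
SAMPLE = [
    Posting("j1", "Acme", "Conversational AI Engineer", "Build voice agents", "US", True),
    Posting("j2", "Globex", "Data Analyst", "Dashboards and SQL", "US", False),
    Posting("j3", "Initech", "AI Program Manager", "Lead conversational AI launches", "IN", False),
]
```

Calling search_jobs(SAMPLE, "conversational ai", remote_only=True) keeps only the remote posting. Plain substring matching is crude (a short term like "ai" also matches inside longer words), which is acceptable for a sketch but worth tightening on real data.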

The guardrail (governance by design)

draft_outreach is the only "write-ish" tool, and it is dry-run only: it returns a draft with sent: false, and the server has no code path that sends anything. A host (or a hijacked prompt) therefore cannot make this server take a real-world action. The stance that the server, not the prompt, enforces safety is the governance point, and a test covers it.
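One way to make a dry-run guarantee structural rather than a convention is to have the return type itself refuse sent=True. The sketch below shows that idea in plain Python. The OutreachDraft class, the ALLOWED_TONES values, and the draft wording are hypothetical, and the repo's actual implementation may look different.

```python
from dataclasses import dataclass, asdict

# Hypothetical tone names; the README does not list the supported tones.
ALLOWED_TONES = ("friendly", "formal")


@dataclass(frozen=True)
class OutreachDraft:
    job_id: str
    tone: str
    body: str
    sent: bool = False

    def __post_init__(self):
        # The guardrail: a draft marked as sent cannot even be constructed.
        if self.sent:
            raise ValueError("this server only drafts; sent must stay False")


def draft_outreach(job_id, tone="friendly"):
    """Build a draft note. There is deliberately no network or mail call here."""
    if tone not in ALLOWED_TONES:
        raise ValueError(f"unknown tone: {tone!r}")
    greeting = "Hi there" if tone == "friendly" else "Dear Hiring Manager"
    body = f"{greeting}, I'm interested in role {job_id} and would like to learn more."
    return asdict(OutreachDraft(job_id=job_id, tone=tone, body=body))
```

Because the function only builds and returns a dictionary, no argument a host passes in can turn the draft into a send. A test for the guardrail then only needs to assert that the returned sent field is False.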


Run it from two hosts

Host 1 — Claude Code / Claude Desktop

Add the server from recipes/claude_mcp_config.json to your MCP config (set the absolute path to your clone), then ask Claude:

"Use search_jobs to find conversational-AI roles, score each with score_fit, and summarize the good ones."

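For orientation, a stdio MCP server entry in a Claude config generally lives under an mcpServers key with a command and its args. The sketch below builds such an entry in Python. The server name job-scanner comes from the diagram above. The PYTHONPATH entry and the build_mcp_config helper are assumptions for illustration, and the real recipes/claude_mcp_config.json may be shaped differently.

```python
import json
from pathlib import Path


def build_mcp_config(clone_dir, python_cmd="python"):
    """Return a config dict that launches the server over stdio (hypothetical helper)."""
    clone = Path(clone_dir)
    return {
        "mcpServers": {
            "job-scanner": {
                "command": python_cmd,
                "args": ["-m", "jobscanner.server"],
                # Assumption: point Python at the clone so the package resolves
                # even when it is not pip-installed.
                "env": {"PYTHONPATH": str(clone)},
            }
        }
    }


if __name__ == "__main__":
    # Replace the placeholder with the absolute path to your clone.
    print(json.dumps(build_mcp_config("/absolute/path/to/JobScannerMCP"), indent=2))
```

If the package was installed with pip install -e, the env entry is likely unnecessary; in that case, python_cmd should point at the interpreter of the environment it was installed into.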
Host 2 — Goose

The same server, wired into a validated Goose recipe (recipes/find_roles.yaml):

goose recipe validate recipes/find_roles.yaml       # ✓ recipe file is valid
goose run --recipe recipes/find_roles.yaml --params keywords="conversational AI"
goose recipe deeplink recipes/find_roles.yaml       # shareable link

Same tools, two hosts, zero server changes — that's the whole point.


Use it for your own scanner / data

Point the tools at your own postings — no code changes:

  1. Replace data/postings.jsonl (one JSON posting per line: id, company, title, description required; location, department, remote, url optional).
  2. Replace data/companies.yaml with your company/health list.
  3. Restart the server. Both hosts pick up the new data automatically.
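Before swapping in your own postings file, it can help to check it against the required fields from step 1. The following is a small stdlib-only sketch. The validate_postings helper and its error messages are hypothetical and are not part of the repo.

```python
import json

# Required fields as listed in step 1 above.
REQUIRED = ("id", "company", "title", "description")


def validate_postings(lines):
    """Return (postings, errors) for an iterable of JSONL lines."""
    postings, errors = [], []
    for lineno, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            errors.append(f"line {lineno}: expected a JSON object")
            continue
        # A required field counts as missing when absent or empty.
        missing = [key for key in REQUIRED if not record.get(key)]
        if missing:
            errors.append(f"line {lineno}: missing {', '.join(missing)}")
            continue
        postings.append(record)
    return postings, errors
```

A file object can be passed directly, for example validate_postings(open("data/postings.jsonl", encoding="utf-8")), since iterating over it yields one line at a time.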

To swap the naive score_fit heuristic for a real, calibrated LLM judge, drop in the companion project JobFitJudge.
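To make concrete what kind of heuristic is being swapped out, here is a minimal sketch of a deterministic keyword-overlap score on a 0–100 scale with a label. It is not the code in jobscanner/fit.py: the keyword_fit name, the tokenizer, the label names, and the 70/40 thresholds are all assumptions.

```python
import re


def keyword_fit(description, profile_keywords):
    """Score how many profile keywords appear as whole tokens in a description."""
    tokens = set(re.findall(r"[a-z0-9+#]+", description.lower()))
    wanted = {k.lower() for k in profile_keywords}
    if not wanted:
        return {"score": 0, "label": "unknown", "matched": []}
    matched = sorted(wanted & tokens)
    score = round(100 * len(matched) / len(wanted))
    # Hypothetical thresholds, chosen only for illustration.
    if score >= 70:
        label = "strong"
    elif score >= 40:
        label = "possible"
    else:
        label = "weak"
    return {"score": score, "label": label, "matched": matched}
```

The same inputs always produce the same score, which keeps CI deterministic. The trade-off is that the score ignores synonyms, seniority, and multi-word phrases, which is the gap a calibrated LLM judge is meant to close.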


What this repo demonstrates (concepts + stack)

| Concept | Where it lives |
| --- | --- |
| MCP server (tools + resource + prompt, typed schemas) | jobscanner/server.py |
| MCP client / host portability | recipes/claude_mcp_config.json + recipes/find_roles.yaml |
| Guardrails / safe-by-design tools | draft_outreach dry-run (sent=false), tested |
| Tool schema design | typed args + docstrings the model reads |
| In-memory protocol testing | tests/ via FastMCP Client(mcp), no subprocess |
| Structured responses | Pydantic models + Goose recipe response.json_schema |
| Offline-first / deterministic CI | synthetic data, green with zero API keys |

Stack: Python · FastMCP (Model Context Protocol) · Pydantic · PyYAML · pytest + pytest-asyncio · GitHub Actions · Goose recipe · Claude MCP config.

Project layout

jobscanner/
  server.py      # the FastMCP server: 4 tools + 1 resource + 1 prompt + guardrail
  store.py       # load synthetic postings + company health (offline)
  fit.py         # deterministic fit heuristic (swap for JobFitJudge's LLM judge)
  models.py      # Job + CompanyHealth (Pydantic)
data/
  postings.jsonl # synthetic postings   (REPLACE with your own)
  companies.yaml # synthetic company/health registry
recipes/
  claude_mcp_config.json  # Host 1: Claude Code / Desktop MCP config
  find_roles.yaml         # Host 2: validated Goose recipe
tests/           # 8 offline in-memory MCP tests

Honest scope & limitations

  • Working MCP server, not a product. Synthetic data, no live scraping, no auth.
  • score_fit is a naive keyword heuristic — deliberately, so this repo stays about the protocol layer. Rigorous fit scoring lives in the companion JobFitJudge repo.
  • All shipped data is synthetic. No real résumé, no private code, no credentials.

Part of a larger series

Repo 2 of a small set exploring the agentic-AI ecosystem hands-on — evaluation (JobFitJudge), MCP (this repo), multi-agent orchestration, RAG, and spec-driven development. More at singhalpooja.com.


Built by Pooja Singhal — Senior Technical Program Manager.
