Ghostline

Automated GitHub lead generation and outreach tool for Chox. Discovers developers actively building with LangChain and LangGraph, extracts their public contact information, scores them against a multi-signal framework, exports qualified leads to Google Sheets daily, and enables personalized cold email outreach with human review before anything is sent.

Background — What Is Chox?

Chox is an AI agent governance layer — infrastructure that sits between an AI agent and the external APIs it calls (Stripe, databases, Twilio, file systems, etc.). It classifies every tool call by action type and risk, evaluates it against configurable policy rules, and logs a shadow verdict. In shadow mode the verdict is recorded but the call goes through — giving developers visibility into what would have been blocked before they turn enforcement on. When enforcement is enabled, a block verdict stops the agent from calling the underlying API entirely.

The meaningful distinction from existing tools: LangSmith governs what the LLM says. Chox governs what the agent does.

The ideal Chox customer is a developer using LangGraph, LangChain, CrewAI, or a similar framework to build agents that make real, consequential API calls — moving money, modifying databases, sending messages. Ghostline finds these developers on GitHub.

Quick Start

Requires Python 3.10+.

Clone the repo

git clone https://github.com/your-org/ghostline.git
cd ghostline

Create a virtual environment

python3 -m venv .venv
source .venv/bin/activate

Install dependencies
```
pip install -r requirements.txt
```
Configure environment variables
```
cp .env.example .env
```
Edit .env and fill in the three required values (see table below).
Set up a GitHub Personal Access Token
- Go to https://github.com/settings/tokens
- Generate a classic token with no scopes selected (public data only)
- Authentication is needed to raise the API rate limit from 60 requests/hour (unauthenticated) to 5,000 requests/hour
- Paste the token into GITHUB_TOKEN in your .env file
Set up a Google Cloud service account
- Go to Google Cloud Console
- Create a new project (or use an existing one)
- Enable the Google Sheets API and Google Drive API
- Go to IAM & Admin > Service Accounts and create a new service account
- Click the service account, go to Keys > Add Key > Create new key > JSON
- Download the JSON file and save it as service_account.json in the project root
- Copy the service account email address (it looks like name@project.iam.gserviceaccount.com)
Create and share a Google Sheet
- Create a new Google Sheet
- Share it with the service account email address (Editor permission)
- Copy the spreadsheet ID from the URL: https://docs.google.com/spreadsheets/d/{SPREADSHEET_ID}/edit
- Paste the ID into SPREADSHEET_ID in your .env file
Run
```
python run.py
```

Environment Variables

Variable	Description	Example
`GITHUB_TOKEN`	GitHub Personal Access Token (classic, no scopes needed)	`ghp_xxxxxxxxxxxx`
`SPREADSHEET_ID`	Google Sheets spreadsheet ID from the sheet URL	`1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms`
`SERVICE_ACCOUNT_FILE`	Path to Google Cloud service account JSON key file	`service_account.json`

How It Works

The pipeline runs in seven steps:

Discover -- Searches GitHub for repositories matching LangChain/LangGraph import patterns across six search queries, paginating up to three pages each.
Qualify -- Filters out forks, tutorials, demos, official example repos, and low-signal projects using name/description blocklists and structural heuristics.
Extract emails -- For each unique repository owner, runs a four-method email fallback chain: GitHub profile, commit metadata, public events API, and bio regex parsing.
Score -- Calculates a 0-100 lead score based on tool use signals, production maturity indicators, social proof, and developer profile signals. Assigns tier-1 (score >= 60) or tier-2 (score >= 30) classification and infers a pain point category.
Write to Sheets -- Deduplicates against existing rows and batch-appends new leads to the configured Google Sheet.
Report -- Prints a formatted run summary to stdout and appends it to runs.log.

Daily Automation

Use the included shell wrapper and cron configuration to run Ghostline automatically each day.

Manual run via wrapper:

./scripts/run.sh

This activates the virtual environment, runs the pipeline, and logs output to logs/run_YYYY-MM-DD.log.

Cron setup (runs daily at 6:00 AM):

# Review the cron schedule
cat scripts/cron.txt

# Install (replaces existing crontab)
crontab scripts/cron.txt

# Or merge into existing crontab
crontab -l > /tmp/existing_cron
cat scripts/cron.txt >> /tmp/existing_cron
crontab /tmp/existing_cron

Cron output is appended to logs/cron.log.

Outreach Agent

The outreach agent reads scored leads from your Google Sheet, fetches each lead's GitHub README, uses Claude to generate a personalized cold email, lets you review and approve them in the terminal, and sends approved emails via Gmail SMTP. The entire pipeline is a stateful LangGraph graph with a human-in-the-loop checkpoint between generation and sending.

Prerequisites

Before setting up the outreach agent you must have completed the lead generation setup above — the outreach agent reads from the same Google Sheet that run.py writes to.

Step 1 — Install the additional dependencies

pip install -r requirements.txt

The outreach agent adds langgraph, langchain-core, anthropic, and langgraph-checkpoint-sqlite on top of the existing packages. Running the command above installs everything.

Step 2 — Use your Gmail account

The agent sends emails from a Gmail account. Use your existing personal Gmail — no new account needed.

Step 3 — Generate a Gmail app password

Google requires an app password for SMTP access (your regular Gmail password will not work).

Enable 2-Step Verification on your Google account at myaccount.google.com/security (required)
Go to myaccount.google.com/apppasswords
Select Mail and click Generate
Google shows a 16-character password — copy it immediately (it is only shown once)
This is your SMTP_PASSWORD — paste it into .env

Step 4 — Get an Anthropic API key

Go to https://console.anthropic.com and sign in or create an account
Navigate to API Keys and click Create Key
Copy the key (starts with sk-ant-)
Paste it into ANTHROPIC_API_KEY in your .env file

Cost is approximately $0.15 per 10-email batch using Claude Sonnet.

Step 5 — Add outreach variables to your `.env` file

Open your .env file and add the following block:

# ── Outreach agent ────────────────────────────────────────
# Gmail SMTP credentials
SMTP_USERNAME=yourgmail@gmail.com
SMTP_PASSWORD=xxxx xxxx xxxx xxxx

# Anthropic Claude API
ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxx

# Sender identity (shown in From header of every email)
SENDER_NAME=Fayzan and Dilraj, Co-founders of Chox
SENDER_EMAIL=yourgmail@gmail.com

Required: SMTP_USERNAME, SMTP_PASSWORD, ANTHROPIC_API_KEY — the agent refuses to start if any of these are missing.

Optional: SENDER_NAME, SENDER_EMAIL — these have defaults but you should set them explicitly.

Step 6 — Review send limits in `outreach/outreach_config.py`

Open outreach/outreach_config.py and check these two values:

BATCH_SIZE = 10          # emails generated and reviewed per run
MAX_EMAILS_PER_DAY = 20  # hard ceiling on sends per calendar day

If this is your first week sending, lower MAX_EMAILS_PER_DAY to 5. Sending 20 cold emails per day from a Gmail account on day one will get you flagged as spam. Ramp up gradually:

Week	Recommended `MAX_EMAILS_PER_DAY`
1–2	5
3–4	10
5+	15–20

Step 7 — Verify setup with a dry run

A dry run executes the full pipeline — loads leads, fetches READMEs, generates emails, presents them for your review — but skips the actual send. Nothing is emailed and the Google Sheet is not updated.

python run_outreach.py --dry-run

What you will see:

Config validation (fails fast if any required env var is missing)
A business hours warning if you are running outside Mon–Fri 9am–5pm (advisory only)
Lead loading from the Google Sheet
GitHub README fetching (one API call per lead, rate-limited)
Claude generating emails for each lead
The review terminal — one email at a time with lead context

In the review terminal, your options for each email are:

Key	Action
`A`	Approve this email
`R`	Reject this email (lead stays uncontacted, available for the next run)
`E`	Edit the email body in `$EDITOR` (defaults to nano), then approve the edited version
`B`	Approve this email and all remaining emails in the batch — no more prompts
`Q`	Quit — saves state to checkpoint, no emails sent, resume later with `--resume`

After reviewing, the dry run prints a summary of what would have been sent and exits.

If anything errors during the dry run, fix it before running live.

Step 8 — Run live

python run_outreach.py

Same flow as the dry run, except after review it:

Sends approved emails via Gmail SMTP with 90–180 second randomized delays between each send
Writes back contacted=TRUE, contacted_at, and contact_method=email to the Google Sheet for every successfully sent lead
Marks bounced addresses as response_status=bounced in the sheet
Prints a run summary (sent / failed / bounced / rejected counts)

Tip: Review the first 2–3 emails carefully before pressing B to bulk-approve. Once you are confident in the email quality, B after a spot-check is the normal workflow.

Resuming an interrupted run

If you press Q during review or hit Ctrl-C at any point, the pipeline state is saved to ghostline_outreach.db in the project root. Resume from exactly where you left off:

python run_outreach.py --resume

The resume picks up the already-generated emails — it does not re-fetch READMEs or re-call Claude.

Other flags

# Override batch size for this run only (does not change config)
python run_outreach.py --batch-size 5

# Combine flags
python run_outreach.py --dry-run --batch-size 3

Outreach environment variables

Variable	Required	Description	Example
`SMTP_USERNAME`	Yes	Your Gmail address	`you@gmail.com`
`SMTP_PASSWORD`	Yes	Gmail app password (16 chars, from myaccount.google.com/apppasswords)	`abcd efgh ijkl mnop`
`ANTHROPIC_API_KEY`	Yes	Anthropic API key for Claude	`sk-ant-xxxx`
`SENDER_NAME`	No	Display name in the From header	`Fayzan and Dilraj, Co-founders of Chox`
`SENDER_EMAIL`	No	Reply-To address (defaults to SMTP_USERNAME)	`you@gmail.com`

Outreach troubleshooting

Error	Cause	Fix
`Missing required environment variable(s)`	`SMTP_USERNAME`, `SMTP_PASSWORD`, or `ANTHROPIC_API_KEY` not set in `.env`	Add the missing variable(s) and re-run
`SMTPAuthenticationError`	App password is wrong or 2FA is not enabled on the Gmail account	Re-generate the app password at myaccount.google.com/apppasswords
`No uncontacted leads available`	All leads in the sheet have already been contacted	Run `python run.py` to discover new leads first
`No checkpoint found for today's thread`	Used `--resume` but no run has been started today	Run without `--resume` to start a fresh pipeline
`ModuleNotFoundError: langgraph`	Dependencies not installed	Run `pip install -r requirements.txt`
Emails generating but all look generic	README fetch failed for most leads (404s)	Check GitHub token is set and valid in `.env`
`Editor 'X' not found`	`$EDITOR` env var points to a binary that does not exist	Run `export EDITOR=nano` before running the agent

Troubleshooting

Error	Cause	Fix
`Failed to authenticate with GitHub API`	`GITHUB_TOKEN` is missing, empty, or invalid	Verify the token at github.com/settings/tokens and update `.env`
`Spreadsheet not found`	`SPREADSHEET_ID` is wrong or the sheet is not shared with the service account	Double-check the ID in the sheet URL and confirm the sheet is shared with the service account email
`Service account file not found`	The JSON key file path in `SERVICE_ACCOUNT_FILE` does not exist	Verify the file is in the project root or update the path in `.env`
`Core API budget critically low`	GitHub rate limit exhausted during the run	Wait for the rate limit to reset (check `X-RateLimit-Reset` header) and run again
`No new leads found`	Normal if the tool ran recently	Leads are deduplicated by GitHub username. New leads appear as new developers push LangChain/LangGraph code

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Ghostline

Background — What Is Chox?

Quick Start

Environment Variables

How It Works

Daily Automation

Outreach Agent

Prerequisites

Step 1 — Install the additional dependencies

Step 2 — Use your Gmail account

Step 3 — Generate a Gmail app password

Step 4 — Get an Anthropic API key

Step 5 — Add outreach variables to your `.env` file

Step 6 — Review send limits in `outreach/outreach_config.py`

Step 7 — Verify setup with a dry run

Step 8 — Run live

Resuming an interrupted run

Other flags

Outreach environment variables

Outreach troubleshooting

Troubleshooting

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 40 Commits
discovery		discovery
docs		docs
outreach		outreach
scripts		scripts
shared		shared
.DS_Store		.DS_Store
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt
run.py		run.py
run_outreach.py		run_outreach.py
score_leads.py		score_leads.py

Folders and files

Latest commit

History

Repository files navigation

Ghostline

Background — What Is Chox?

Quick Start

Environment Variables

How It Works

Daily Automation

Outreach Agent

Prerequisites

Step 1 — Install the additional dependencies

Step 2 — Use your Gmail account

Step 3 — Generate a Gmail app password

Step 4 — Get an Anthropic API key

Step 5 — Add outreach variables to your .env file

Step 6 — Review send limits in outreach/outreach_config.py

Step 7 — Verify setup with a dry run

Step 8 — Run live

Resuming an interrupted run

Other flags

Outreach environment variables

Outreach troubleshooting

Troubleshooting

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Step 5 — Add outreach variables to your `.env` file

Step 6 — Review send limits in `outreach/outreach_config.py`

Packages