Journey

Find the right journalists, at the right publications, with the right contact info — automatically.

An AI-powered Python agent that discovers publications covering your topic, identifies the journalists and editors who write about it, extracts their contact information (email, Twitter/X, LinkedIn), and saves everything to clean, deduplicated CSV files.

Built for founders, PR teams, and anyone who wants to get published. Point it at a topic, and it does the research for you.

Journey is an AI agent that turns what you're building into targeted media opportunities.

It finds journalists already writing about your space — so you can pitch with relevance, not guesswork.

Why Journey?

Builders ship. Most never get seen.

Journey fixes distribution.

Use cases

SaaS launches
Open source releases
Product updates
Founder stories

Philosophy

Don't spray 1,000 emails.

Find the 10 people already writing about you.

What It Does

$ python journey.py "AI startups SaaS" --max-pubs 10

============================================================
Searching for publications covering: AI startups SaaS
============================================================

  Identified 12 relevant publications.

    - TechCrunch (online media)
    - The Information (online media)
    - Bloomberg (news outlet)
    - The Wall Street Journal (newspaper)
    - Forbes (magazine)
    - PitchBook News (trade publication)
    ...

  [1/10] Looking up journalists at TechCrunch...
    Found 8 journalist(s):
      - Russell Brandom (AI Editor)
      - Rebecca Bellan (Senior Reporter)
      - Julie Bort (Venture Editor)
      ...

  [2/10] Looking up journalists at The Information...
    Found 10 journalist(s):
      - Amir Efrati (Co-Executive Editor)
      - Katie Roof (Deputy Bureau Chief, VC)
      ...

============================================================
Enriching contact information...
============================================================

============================================================
PROGRESS REPORT
============================================================
  Publications:    10 total, 8 scanned, 2 pending
  Journalists:     39 total
  With email:      16
  With Twitter:    10
  With LinkedIn:   5
  Has any contact: 22
  Needs enriching: 17
============================================================

Real results from a single run: 10 publications selected (8 scanned, 2 pending), 39 journalists, and 22 with at least one contact detail, in under 2 minutes.

Example Output

Publications Found

Publication	Type	Journalists Found
TechCrunch	online media	8
The Information	online media	10
Bloomberg	news outlet	5
Forbes	magazine	5
PitchBook News	trade publication	4
Andreessen Horowitz (a16z)	blog	4
The Wall Street Journal	newspaper	2
AI Magazine	magazine	1

Sample Contacts

Name	Title	Publication	Contact Found
Russell Brandom	AI Editor	TechCrunch	email
Julie Bort	Venture Editor	TechCrunch	email, twitter
Amir Efrati	Co-Executive Editor	The Information	email, twitter
Katie Roof	Deputy Bureau Chief, VC	The Information	twitter
Rachel Metz	AI Reporter	Bloomberg	email, twitter
Alex Konrad	Senior Editor	Forbes	twitter, linkedin
Rosie Bradbury	Senior Reporter	PitchBook News	email, linkedin
Kate Clark	Reporter	The Wall Street Journal	twitter

Features

Publication Discovery — Searches the web and uses AI to identify the most relevant newspapers, magazines, online media, blogs, and trade publications covering your topic.
Journalist Lookup — For each publication, finds journalists, editors, reporters, and writers who cover your beat. Extracts editorial/team pages and cross-references with web search results.
Contact Enrichment — Searches for email addresses, Twitter/X handles, and LinkedIn profiles. Batched in groups of 5 per publication for speed.
Resumable Runs — Saves progress after every publication. If a run is interrupted, re-run the same command and it picks up exactly where it left off. No duplicate work, no wasted API calls.
Enrich-Only Mode — Already have journalists but need more contact info? Run --enrich-only to skip discovery and just fill in missing contacts.
Persistent CSVs — Each topic gets a single pair of CSV files that grow over time. New publications and journalists are appended; existing data is never overwritten.
Progress Reports — See at a glance how many publications are scanned, how many journalists are found, and how many still need contact enrichment.
Retry with Backoff — API calls automatically retry on transient errors (429, 5xx) with exponential backoff.
Smart Deduplication — Publications deduped by name + domain, journalists by normalized name + publication. Handles punctuation, casing, and whitespace.
Data Sanitization — All data is cleaned and normalized before saving. No nulls, no stray whitespace, no garbage in your CSV.
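The deduplication rule above (publications by name + domain, journalists by normalized name + publication) can be sketched as a pair of key functions. This is a minimal illustration, not the code in journey.py: the helper names `normalize_name`, `domain_of`, `publication_key`, and `journalist_key` are hypothetical, and the exact normalization Journey applies may differ.

```python
import re
from urllib.parse import urlparse


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def domain_of(url: str) -> str:
    """Return the host without a leading 'www.', or '' if there is none."""
    if "//" not in url:
        url = "//" + url  # lets urlparse treat a bare host as the netloc
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def publication_key(name: str, url: str) -> tuple[str, str]:
    """Assumed dedup key for a publication: normalized name + domain."""
    return (normalize_name(name), domain_of(url))


def journalist_key(name: str, publication: str) -> tuple[str, str]:
    """Assumed dedup key for a journalist: normalized name + publication."""
    return (normalize_name(name), normalize_name(publication))
```

Two rows that differ only in casing, punctuation, spacing, or a `www.` prefix map to the same key, so keeping a set of seen keys is enough to skip duplicates.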

Quick Start

(Commands shown are for Linux/macOS; on Windows, the virtual environment activation step differs.)

git clone https://github.com/aiassistsecure/journey.git
cd journey
python3 -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate (cmd) or venv\Scripts\Activate.ps1 (PowerShell)
pip install -r requirements.txt
cp .env.example .env        # Add your AiAssist.net API key
python3 journey.py "your topic here"

Requirements

Python 3.10+
requests library
An AiAssist.net API key (any plan; a 7-day free trial is available) with access to:
- /v1/chat/completions — AI analysis (GPT-5.4 via OpenAI provider)
- /v1/search — Web search
- /v1/web/extract — Web page extraction

Usage

Full Scan

Discover publications, find journalists, enrich contacts — the full pipeline:

python journey.py "AI startups SaaS artificial intelligence"

Limit Publications

Focus on the top N most relevant publications:

python journey.py "AI startups SaaS" --max-pubs 10

Enrich Only

Already ran a scan but want to fill in more contacts? Skip discovery entirely:

python journey.py "AI startups SaaS" --enrich-only

Resume

Interrupted mid-run? Just run the same command again. It picks up where it left off:

# Run 1 — gets through 5 of 10 publications, then times out
python journey.py "AI startups SaaS" --max-pubs 10

# Run 2 — resumes at publication 6, skips the first 5
python journey.py "AI startups SaaS" --max-pubs 10

# Run 3 — all 10 done, just enriches any remaining contacts
python journey.py "AI startups SaaS" --max-pubs 10
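One way a resume step like this could be derived from the saved CSVs is sketched below. It assumes, purely for illustration, that a publication counts as scanned once at least one contact row references it; how journey.py actually tracks scanned publications is not documented here, and `load_rows` and `pending_publications` are hypothetical helpers.

```python
import csv
from pathlib import Path


def load_rows(path: Path) -> list[dict[str, str]]:
    """Read a CSV into a list of dicts; a missing file means no progress yet."""
    if not path.exists():
        return []
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def pending_publications(pubs: list[dict], contacts: list[dict]) -> list[dict]:
    """Publications with no contact rows yet (assumed to mean 'not scanned')."""
    scanned = {row["publication"].strip().lower() for row in contacts}
    return [pub for pub in pubs if pub["name"].strip().lower() not in scanned]
```

Under this assumption a publication that yielded no journalists would be scanned again on the next run, so a real implementation would likely need an explicit marker for completed publications.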

Output Schema

All output is saved to the output/ directory (gitignored). Each topic gets two persistent CSV files.

Publications — `output/pubs_<topic>.csv`

Column	Description
`name`	Publication name
`url`	Main website URL
`type`	newspaper, magazine, online media, blog, trade publication
`relevance`	Why this publication covers your topic
`editorial_page_url`	URL of their about/team/staff page
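The append-only behaviour described for these files can be sketched with the standard `csv` module. This is an assumption-laden illustration rather than Journey's own code: `append_new_rows` and `PUB_FIELDS` are hypothetical names, and the duplicate check here uses the simplified pair (name, url) instead of name + domain.

```python
import csv
from pathlib import Path

# Column order taken from the publications schema above.
PUB_FIELDS = ["name", "url", "type", "relevance", "editorial_page_url"]


def append_new_rows(path: Path, rows: list[dict], fields: list[str]) -> int:
    """Append rows not already in the file; return how many were added."""
    seen = set()
    if path.exists():
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                seen.add((row["name"].strip().lower(), row["url"].strip().lower()))
    write_header = not path.exists() or path.stat().st_size == 0
    path.parent.mkdir(parents=True, exist_ok=True)
    added = 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        if write_header:
            writer.writeheader()
        for row in rows:
            # Sanitize: None becomes "", surrounding whitespace is stripped.
            clean = {f: str(row.get(f) or "").strip() for f in fields}
            key = (clean["name"].lower(), clean["url"].lower())
            if key in seen:
                continue  # existing rows are never rewritten
            seen.add(key)
            writer.writerow(clean)
            added += 1
    return added
```

Because the file is opened in append mode and existing keys are skipped, calling the function twice with the same rows leaves the file unchanged the second time.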

Contacts — `output/contacts_<topic>.csv`

Column	Description
`name`	Journalist's full name
`title`	Role — Senior Reporter, Editor-in-Chief, Staff Writer, etc.
`publication`	Which publication they work at
`publication_url`	Publication's website
`beat`	Topics they cover
`email`	Email address (verified from web sources)
`twitter`	Twitter/X handle
`linkedin`	LinkedIn profile URL
`source_url`	Where the journalist info was found
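The contact columns above are enough to recompute the counts shown in the progress report. The sketch below assumes that "has any contact" means at least one of `email`, `twitter`, or `linkedin` is non-empty, and that "needs enriching" is everyone else; `contact_stats` and `load_contacts` are hypothetical helper names.

```python
import csv

CONTACT_FIELDS = ("email", "twitter", "linkedin")


def load_contacts(path: str) -> list[dict]:
    """Read a contacts CSV (schema above) into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def contact_stats(rows: list[dict]) -> dict[str, int]:
    """Count journalists per contact type, with any contact, and with none."""
    stats = {"total": 0, "email": 0, "twitter": 0, "linkedin": 0, "any": 0}
    for row in rows:
        stats["total"] += 1
        present = [f for f in CONTACT_FIELDS if (row.get(f) or "").strip()]
        for field in present:
            stats[field] += 1
        if present:
            stats["any"] += 1
    stats["needs_enriching"] = stats["total"] - stats["any"]
    return stats
```

With the sample run shown earlier, 39 journalists of whom 22 have any contact would give `needs_enriching` of 17, matching the report.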

How It Works

┌─────────────┐     ┌──────────────┐     ┌────────────────┐     ┌─────────────┐
│   Search     │────>│   Analyze    │────>│    Extract     │────>│   Enrich    │
│              │     │              │     │                │     │             │
│ Web search   │     │ GPT-5.4      │     │ Editorial      │     │ Batch web   │
│ for pubs     │     │ ranks &      │     │ pages, staff   │     │ search for  │
│ covering     │     │ identifies   │     │ directories,   │     │ email,      │
│ your topic   │     │ relevant     │     │ bylines →      │     │ Twitter,    │
│              │     │ publications │     │ journalists    │     │ LinkedIn    │
└─────────────┘     └──────────────┘     └────────────────┘     └─────────────┘
                                                                       │
                                                                       v
                                                              ┌─────────────┐
                                                              │    Save     │
                                                              │             │
                                                              │ Dedupe,     │
                                                              │ sanitize,   │
                                                              │ write CSV   │
                                                              │ (resumable) │
                                                              └─────────────┘

Search — Runs 3 parallel web searches with different query angles to cast a wide net.
Analyze — GPT-5.4 analyzes all search results to identify and rank the most relevant publications.
Extract — For each publication, searches for journalists by name, extracts editorial/team pages, and cross-references to build a verified list.
Enrich — For journalists missing contact info, runs batched web searches (5 per batch) and uses AI to extract verified contact details.
Save — Results are deduplicated, sanitized, and saved to CSV after each publication. Nothing is lost if the process is interrupted.
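The Enrich step's batching (groups of 5, per publication) can be sketched as a small generator. This is an illustration under stated assumptions, not the agent's actual code: `needs_enrichment` and `enrichment_batches` are hypothetical names, and a journalist is assumed to need enrichment only when all three contact fields are empty.

```python
from collections import defaultdict
from typing import Iterator


def needs_enrichment(row: dict) -> bool:
    """True when email, twitter, and linkedin are all empty (assumed rule)."""
    return not any((row.get(f) or "").strip() for f in ("email", "twitter", "linkedin"))


def enrichment_batches(contacts: list[dict], size: int = 5) -> Iterator[tuple[str, list[dict]]]:
    """Yield (publication, batch) pairs with at most `size` journalists each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    by_publication: dict[str, list[dict]] = defaultdict(list)
    for row in contacts:
        if needs_enrichment(row):
            by_publication[row["publication"]].append(row)
    for publication, rows in by_publication.items():
        for start in range(0, len(rows), size):
            yield publication, rows[start:start + size]
```

Each yielded batch would correspond to one enrichment call, so a publication with 7 journalists missing contacts produces two batches, of 5 and 2.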

File Structure

journey/
├── journey.py          Main entry point — the agent
├── requirements.txt    Python dependencies
├── .env.example        Template for API key setup
├── .gitignore          Excludes output/, .env, __pycache__/
├── output/             CSV output directory (gitignored)
│   ├── pubs_<topic>.csv
│   └── contacts_<topic>.csv
├── README.md
└── LICENSE             MIT

Technical Details

Detail	Value
AI Model	GPT-5.4 (via AiAssist.net OpenAI proxy)
Search Depth	`advanced` (full page analysis)
Max Content Extraction	15,000 characters per page
Enrichment Batch Size	5 journalists per API call
Retry Policy	3 attempts, exponential backoff (1.5x)
Retryable Status Codes	408, 409, 429, 500, 502, 503, 504
Connection Pooling	`requests.Session` reused across all calls
Deduplication	Publications by name + domain, journalists by normalized name + publication
Data Integrity	All fields sanitized, nulls converted to empty strings, whitespace stripped
Fabrication Policy	Never — only includes details verified from web sources
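The retry policy in the table (3 attempts, 1.5x exponential backoff, a fixed set of retryable status codes) can be sketched as follows. The 1-second starting delay is an assumption, since the table gives only the multiplier, and `call_with_retry` is a hypothetical helper rather than a function from journey.py. It takes any zero-argument callable that returns an object with a `status_code` attribute.

```python
import time
from typing import Any, Callable

# Status codes listed as retryable in the table above.
RETRYABLE_STATUS = {408, 409, 429, 500, 502, 503, 504}


def call_with_retry(
    send: Callable[[], Any],
    attempts: int = 3,
    base_delay: float = 1.0,  # assumed starting delay, in seconds
    factor: float = 1.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until the status is not retryable or attempts run out."""
    response = None
    for attempt in range(attempts):
        response = send()
        if response.status_code not in RETRYABLE_STATUS:
            return response
        if attempt < attempts - 1:
            # Delays grow as base_delay * factor**attempt: 1.0s, then 1.5s.
            sleep(base_delay * factor ** attempt)
    return response  # last response, still a retryable status
```

With `requests`, a pooled session could be passed in as `call_with_retry(lambda: session.get(url, timeout=30))`, where `session` is a `requests.Session`; the `sleep` parameter exists so the delays can be replaced in tests.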

Use Cases

Founders — Find journalists who cover your space to pitch your startup story.
PR Teams — Build targeted media lists for press releases and product launches.
Content Marketers — Identify publications accepting guest posts or contributed articles.
Researchers — Map the media landscape covering a specific industry or topic.
Job Seekers — Find editors and hiring managers at publications you want to write for.

License

MIT — see LICENSE for details.
