Substack Scraper

A Python tool for scraping articles from Substack publications for research and content analysis.

Overview

This scraper extracts article metadata (date, author, headline, URL, subheading) from any Substack publication and exports it to CSV format. Designed for journalists, researchers, and data analysts studying media narratives and content patterns.

Output Format

CSV with the following columns:

date	author_byline	headline	url	subheading
2025-12-10	Author Name	Article Title	https://...	Brief description
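
As an illustration of the output format above, here is a minimal sketch of writing rows with these five columns using the built-in csv module. The helper name `write_articles` and the sample row are hypothetical placeholders, not taken from the repo's script, which may structure this step differently.

```python
import csv
import io

# Column names taken from the output format described above.
FIELDNAMES = ["date", "author_byline", "headline", "url", "subheading"]


def write_articles(rows, out_file):
    """Write an iterable of article dicts to an open text file as CSV.

    Hypothetical helper for illustration; not the repo's actual function.
    """
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        # Keys missing from a row are written as empty cells (restval defaults to "").
        writer.writerow(row)


# Placeholder row for illustration only; this is not real scraped data.
sample = [{
    "date": "2025-12-10",
    "author_byline": "Author Name",
    "headline": "Article Title",
    "url": "https://example.com/p/article-title",
    "subheading": "Brief description",
}]

# Write to an in-memory buffer so the sketch runs without touching disk.
buffer = io.StringIO()
write_articles(sample, buffer)
csv_text = buffer.getvalue()
```

When writing to a real file instead of a buffer, open it with `open(path, "w", newline="", encoding="utf-8")` so the csv module controls line endings itself.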

Examples

Works with any Substack publication:

Drop Site News: https://www.dropsitenews.com
Zeteo: https://zeteo.com
Any custom domain running on Substack

Recommended Use

Copy the code from substack-scraper.py, set the URL to the publication you want to scrape, rename the output CSV file, and run the script.

Features

Scrapes all articles from any Substack publication
Extracts: publication date, author byline, headline, URL, and subheading/description
Exports to clean CSV format
Respectful scraping with built-in delays
Fully commented code for learning and customization
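
The "scrapes all articles" and "built-in delays" features above could be combined roughly as in the sketch below: request one page of posts at a time and pause between requests. This is an assumption about the general approach, not a copy of the repo's code. `collect_posts`, `fetch_page`, the page size of 12, and the one-second delay are all hypothetical choices. The fetcher is passed in as a function so the loop can be tried offline.

```python
import time


def collect_posts(fetch_page, page_size=12, delay_seconds=1.0, max_pages=1000):
    """Collect posts page by page, pausing between requests.

    fetch_page(offset, limit) is a hypothetical callable that returns a list
    of post dicts for one page, and an empty list when nothing is left.
    """
    posts = []
    offset = 0
    for _ in range(max_pages):  # hard cap so a misbehaving fetcher cannot loop forever
        page = fetch_page(offset, page_size)
        if not page:
            break  # an empty page means the archive is exhausted
        posts.extend(page)
        offset += len(page)
        time.sleep(delay_seconds)  # pause before the next request
    return posts


# Offline demonstration with a fake fetcher standing in for real HTTP calls.
fake_archive = [{"headline": f"Post {i}"} for i in range(30)]


def fake_fetch(offset, limit):
    return fake_archive[offset:offset + limit]


all_posts = collect_posts(fake_fetch, page_size=12, delay_seconds=0)
```

In a real run, `fetch_page` would wrap an HTTP request and `delay_seconds` would stay above zero so the publication's server is not hit in rapid succession.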

Use Cases

Journalism research: Analyze coverage patterns across independent media
Content analysis: Study narrative framing and topic trends
Media monitoring: Track publication output over time
Academic research: Dataset creation for discourse analysis

Dependencies

External packages (install required):

requests - HTTP requests to API endpoints
beautifulsoup4 - HTML parsing for author extraction

Built-in packages (no install needed):

csv - CSV file writing
time - Rate limiting delays
urllib.parse - URL construction
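
To show what "URL construction" with urllib.parse might look like, here is a small sketch that builds a paginated request URL. The endpoint path `/api/v1/archive` and the query parameters `sort`, `offset`, and `limit` are assumptions based on how Substack archive pages are commonly queried. They are not confirmed by this README, so check substack-scraper.py for the endpoint it actually uses. `build_archive_url` is a hypothetical name.

```python
from urllib.parse import urlencode, urljoin


def build_archive_url(base_url, offset=0, limit=12):
    """Build a paginated archive URL for a publication.

    ASSUMPTION: the endpoint path and parameter names below are illustrative
    and may differ from what the repo's script requests.
    """
    query = urlencode({"sort": "new", "offset": offset, "limit": limit})
    # Normalise the trailing slash so urljoin appends the path to the domain root.
    endpoint = urljoin(base_url.rstrip("/") + "/", "api/v1/archive")
    return endpoint + "?" + query


url = build_archive_url("https://www.dropsitenews.com", offset=24)
```

A URL built this way could then be passed to `requests.get(url, timeout=30)`, with the offset advanced after each page.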

Transparency

The code was generated with Perplexity and has been tested.
The repo includes two sample CSVs as output.

License

MIT
