
XScrapper

Automated newsletter generator from X (Twitter) hashtags. Fetches tweets, processes them with AI, and sends email newsletters.

Features

  • Automated Scraping: Extract tweets from X.com based on configurable hashtag groups
  • AI Processing: Filter and summarize tweets with an LLM via OpenRouter
  • Email Delivery: Send newsletters via Resend API
  • Scheduling: Configure automatic execution times

Requirements

  • Python 3.11+
  • Playwright (browser automation)
  • API keys (see Configuration)

Installation

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # Linux/Mac
.venv\Scripts\activate.ps1     # Windows

# Install dependencies
pip install -e ".[dev]"

# Install Playwright browsers
playwright install chromium

Configuration

Create a .env file with the following variables:

# OpenRouter (AI processing)
OPENROUTER_API_KEY=your_openrouter_api_key

# Resend (Email delivery)
RESEND_API_KEY=your_resend_api_key
EMAIL_FROM=your_email@example.com
EMAIL_TO=recipient@example.com
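At runtime these variables must end up in the process environment. The project's actual loader isn't shown here (python-dotenv is a common choice); as an illustration, a minimal stdlib-only sketch, where the `load_env` helper is hypothetical and not part of the codebase:

```python
import os
from pathlib import Path


def load_env(path: str = ".env") -> None:
    """Read KEY=VALUE lines from a .env file into os.environ.

    Blank lines and '#' comments are skipped; variables already set
    in the environment are not overwritten.
    """
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())
```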

Cookies Configuration

To scrape X.com, you need to export your browser cookies:

  1. Log in to X.com in your browser (Chrome/Firefox)
  2. Install a cookie-export extension (e.g., "Cookie-Editor")
  3. Export cookies for x.com in JSON format (Netscape/cookies.txt format will not work here)
  4. Save the export as cookies.json in the project root

Example cookies.json format:

[
  {
    "domain": ".x.com",
    "name": "auth_token",
    "value": "your_token_here",
    "path": "/",
    "secure": true,
    "sameSite": "Lax"
  }
]

Note: Cookies expire periodically. Re-export if you encounter login walls.
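Entries in this format map directly onto what Playwright's `BrowserContext.add_cookies` accepts. A sketch of loading and sanity-checking the file before handing it to Playwright (the `load_cookies` helper is illustrative, not necessarily how the project does it):

```python
import json
from pathlib import Path

# Fields BrowserContext.add_cookies needs on every entry
REQUIRED_KEYS = {"name", "value", "domain", "path"}


def load_cookies(path: str = "cookies.json") -> list[dict]:
    """Load cookies.json and verify each entry has the required fields."""
    cookies = json.loads(Path(path).read_text())
    for cookie in cookies:
        missing = REQUIRED_KEYS - cookie.keys()
        if missing:
            raise ValueError(f"cookie {cookie.get('name')!r} is missing {missing}")
    return cookies


# Usage with Playwright (sketch):
# context = browser.new_context()
# context.add_cookies(load_cookies())
```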

Hashtag Groups

Edit config/hashtags.yaml to configure hashtag groups and scraping parameters:

groups:
  - name: "AI & Data"
    hashtags:
      - "#AI"
      - "#MachineLearning"

scraper:
  min_tweets: 50
  min_interactions: 10
  wait_between_requests_ms: 7000

scheduler:
  hours:
    - 8
    - 13
    - 17
  timezone: "America/Lima"
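Reading this file back is a one-liner with PyYAML; this is an assumption (the project may use a different parser), and the `load_config` helper is illustrative:

```python
import yaml  # PyYAML -- assumed; the project may use another YAML library


def load_config(path: str = "config/hashtags.yaml") -> dict:
    """Parse the hashtag-groups config into a plain dict."""
    with open(path, encoding="utf-8") as f:
        return yaml.safe_load(f)


# config = load_config()
# for group in config["groups"]:
#     print(group["name"], group["hashtags"])
```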

Usage

Run Immediately

python main.py --run-now

Run Scheduled

python main.py --schedule

Options

Option          Description
--run-now       Execute pipeline immediately
--schedule      Start scheduler with configured times
--config        Path to configuration file (default: config/hashtags.yaml)
--headless      Run browser in headless mode (default: True)
--no-headless   Run browser in visible mode for debugging
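The flags above can be modeled with stdlib argparse; on Python 3.11, `argparse.BooleanOptionalAction` generates the paired `--headless`/`--no-headless` flags automatically. A sketch, not the project's actual parser:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--run-now", action="store_true",
                        help="Execute pipeline immediately")
    parser.add_argument("--schedule", action="store_true",
                        help="Start scheduler with configured times")
    parser.add_argument("--config", default="config/hashtags.yaml",
                        help="Path to configuration file")
    # BooleanOptionalAction creates both --headless and --no-headless
    parser.add_argument("--headless", default=True,
                        action=argparse.BooleanOptionalAction,
                        help="Run browser in headless mode")
    return parser
```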

Project Structure

XScrapper/
├── main.py              # Entry point
├── config/
│   └── hashtags.yaml    # Hashtag groups configuration
├── src/
│   ├── scraper.py       # X.com scraping module
│   ├── ai_processor.py  # AI processing module
│   ├── email_sender.py  # Email delivery module
│   └── scheduler.py     # Scheduling module
├── tests/               # Test suite
└── output/              # Raw tweet exports

Development

Run Tests

pytest

Run Tests with Coverage

pytest --cov=src --cov-report=term-missing

Type Checking

mypy src/

Linting

ruff check src/ tests/

License

MIT
