
prime-optimal/vinny

Vinny Scraper

Vegas nightlife event scraper covering 13+ venues. v2.3.1

Vinny scrapes event listings, artist data, table pricing, and images from Las Vegas nightclubs and dayclubs. It outputs structured JSON, CSV, Markdown, and Cloudflare D1 SQL — powering the live vinny.vegas site.

Venues

Wynn / Fontainebleau properties:

  • LIV Las Vegas (nightclub)
  • LIV Beach (dayclub)
  • XS Nightclub
  • Encore Beach Club (day + night)

TAO Group properties:

  • Omnia Nightclub
  • Hakkasan Nightclub
  • Marquee Nightclub + Dayclub
  • Jewel Nightclub
  • Tao Nightclub + Tao Beach Dayclub
  • Palm Tree Beach Club
  • Liquid Pool Lounge

Quick Start

# Install dependencies
uv sync

# Scrape LIV events
vinny scrape liv

# Scrape all TAO Group venues
vinny scrape tao

# Full pipeline: scrape → images → R2 → D1
vinny sync liv

# See all commands
vinny --help

CLI

Built with Cyclopts. Command groups:

| Group | Commands | Description |
| --- | --- | --- |
| Scrape | scrape, process | Crawl venue sitemaps, extract events |
| Pipeline | sync | Full pipeline: scrape → images → R2 → D1 |
| Export | export-csv, export-md, export-d1, export-sqlite | Output in various formats |
| Images | images download, images upload-r2, images sync-d1, images status, images validate | Download, upload to R2, sync URLs to D1 |
| Enrich | enrich artists | Spotify, Resident Advisor, Tracklists data |
| Tables | tables, deals, heatmap | Table pricing exploration |
| Info | stats, list-runs, venues, diff, sitemap-status | Database, run, and sitemap inspection |

Full reference: vinny --help or see the MkDocs site.
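As a rough illustration of what a SQL export for D1 can look like (D1 is SQLite-compatible), the sketch below renders event dicts as INSERT OR REPLACE statements. This is not the actual export-d1 implementation: the table name, the column names (event_key, venue, title), and the helpers sql_literal and events_to_sql are all hypothetical.

```python
import sqlite3
from typing import Any


def sql_literal(value: Any) -> str:
    """Render a Python value as a SQLite literal (strings are single-quoted)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def events_to_sql(events: list[dict[str, Any]], table: str = "events") -> str:
    """Build one INSERT OR REPLACE statement per event (hypothetical schema)."""
    statements = []
    for event in events:
        columns = ", ".join(event)
        values = ", ".join(sql_literal(v) for v in event.values())
        statements.append(
            f"INSERT OR REPLACE INTO {table} ({columns}) VALUES ({values});"
        )
    return "\n".join(statements)


# Illustrative data only; not real scraper output.
demo_events = [
    {"event_key": "demo-1", "venue": "demo-venue", "title": "DJ's Night"},
    {"event_key": "demo-2", "venue": "demo-venue", "title": None},
]
script = events_to_sql(demo_events)

# Verify the generated SQL against a local in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_key TEXT PRIMARY KEY, venue TEXT, title TEXT)")
conn.executescript(script)
rows = conn.execute("SELECT event_key, title FROM events ORDER BY event_key").fetchall()
print(rows)
```

Using INSERT OR REPLACE keyed on a primary key makes repeated exports idempotent, which suits a pipeline that re-runs on a schedule.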

Architecture

flowchart TD
    A["Sitemap URLs"] --> B["Crawlee\n(incremental)"]
    B --> C["Venue Extractor"]
    C --> D["VegasEvent\n(Pydantic)"]
    D --> E["StorageManager\ntimestamped run"]
    D --> F["MasterDatabase\nfield-level diff tracking"]
    F --> G["JSON / CSV / Markdown"]
    F --> H["D1 SQL"]
    F --> I["Artist Images\n(VEA CDN → R2)"]
    H --> J["vinny.vegas\n(Astro + Cloudflare Pages)"]
    I --> J

    style A fill:#7c3aed,color:#fff
    style D fill:#059669,color:#fff
    style J fill:#f59e0b,color:#000

Stack: Python 3.10+, Crawlee, Pydantic v2, Cyclopts, Rich, uv
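The "field-level diff tracking" step in the diagram can be pictured with a minimal sketch like the one below. It compares two versions of an event record and reports which fields changed. The function name diff_fields and the field names are assumptions for illustration, and plain dicts stand in for the Pydantic VegasEvent model.

```python
from typing import Any


def diff_fields(
    old: dict[str, Any], new: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {field: (old_value, new_value)} for every field that differs."""
    changes: dict[str, tuple[Any, Any]] = {}
    for field in sorted(old.keys() | new.keys()):
        before, after = old.get(field), new.get(field)
        if before != after:
            changes[field] = (before, after)
    return changes


# Illustrative records only; field names are hypothetical.
previous = {"event_key": "demo-1", "artist": "Example DJ", "min_table_price": 1500}
current = {"event_key": "demo-1", "artist": "Example DJ", "min_table_price": 2000}

print(diff_fields(previous, current))  # {'min_table_price': (1500, 2000)}
```

Recording changes per field rather than per record makes it possible to see, for example, that only a price moved between two runs.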

Extractors: Each venue family has its own extractor in src/extractors/ — LIV (liv.py), Wynn Social (wynn.py for XS/EBC), TAO Group (tao.py).
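One way to picture the "one extractor per venue family" layout is a small registry that picks an extractor by hostname. This is a sketch under assumptions, not the project's actual Extractor Contract: ExtractorEntry, pick_extractor, the extract_* stubs, and the *.example domains are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable
from urllib.parse import urlparse


def extract_liv(html: str) -> dict:
    # Placeholder: a real extractor would parse event fields from the page.
    return {"family": "liv"}


def extract_wynn(html: str) -> dict:
    return {"family": "wynn"}


def extract_tao(html: str) -> dict:
    return {"family": "tao"}


@dataclass(frozen=True)
class ExtractorEntry:
    family: str
    domains: frozenset[str]  # hostnames this family handles (placeholders here)
    extract: Callable[[str], dict]


REGISTRY = [
    ExtractorEntry("liv", frozenset({"liv.example"}), extract_liv),
    ExtractorEntry("wynn", frozenset({"xs.example", "ebc.example"}), extract_wynn),
    ExtractorEntry("tao", frozenset({"tao.example"}), extract_tao),
]


def pick_extractor(url: str, registry: list[ExtractorEntry]) -> ExtractorEntry:
    """Return the registry entry whose domains include the URL's hostname."""
    host = urlparse(url).hostname or ""
    for entry in registry:
        if host in entry.domains:
            return entry
    raise LookupError(f"no extractor registered for host {host!r}")


entry = pick_extractor("https://xs.example/events/demo", REGISTRY)
print(entry.family, entry.extract("<html></html>"))
```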

Incremental scraping: SitemapIndex tracks lastmod per URL — only new/updated events are re-scraped. Use --force for a full re-scrape.
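The lastmod check described above can be sketched as follows. This is a simplified stand-in for SitemapIndex, not its real interface: the function urls_to_scrape and the dict-based inputs are assumptions, and the URLs are placeholders.

```python
def urls_to_scrape(
    sitemap: dict[str, str],
    seen: dict[str, str],
    force: bool = False,
) -> list[str]:
    """Pick URLs that are new or whose lastmod differs from the stored value.

    sitemap: URL -> lastmod as published in the venue's sitemap
    seen:    URL -> lastmod recorded on the previous run
    force:   when True, return every URL (a full re-scrape)
    """
    if force:
        return sorted(sitemap)
    return sorted(
        url for url, lastmod in sitemap.items() if seen.get(url) != lastmod
    )


# Placeholder URLs and dates for illustration.
sitemap = {
    "https://venue.example/events/a": "2025-01-01",
    "https://venue.example/events/b": "2025-01-05",
    "https://venue.example/events/c": "2025-01-02",
}
seen = {
    "https://venue.example/events/a": "2025-01-01",
    "https://venue.example/events/b": "2025-01-03",
}

print(urls_to_scrape(sitemap, seen))              # b (updated) and c (new)
print(len(urls_to_scrape(sitemap, seen, force=True)))  # 3
```

Comparing lastmod for inequality rather than ordering avoids having to parse the timestamp format, at the cost of re-scraping a page if its lastmod ever moves backwards.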

The Site

The vinny.vegas Astro SSR site lives in site/ (gitignored, separate GitLab repo). It reads directly from Cloudflare D1 and serves artist images from R2 (img.vinny.vegas).

Pages: homepage, /events, /events/[key], /venues, /venues/[id], /artists/[id], /tables (interactive pricing comparison with React sliders).

Development

# Lint + type check
just check

# Run tests
just test

# Format code
just format

Pre-commit hooks enforce ruff, ruff-format, ty, and commitlint (conventional commits).

Documentation

Project docs live in docs/ (readable on GitHub). A MkDocs Material site renders them as a navigable reference:

uv run mkdocs serve -f mkdocs/mkdocs.yml

Key docs: Data Model, Extractors, Extractor Contract, Table Pricing, Images, Plugin Development, D1 Deployment, Site Development.

Related Repos

| Repo | Host | Purpose |
| --- | --- | --- |
| prime-optimal/vinny | GitHub | Scraper, CLI, pipeline (this repo) |
| optimalprime/vinny-vegas-app | GitLab | Astro site at vinny.vegas |

License

MIT
