luckydsf/dddMapdata

DDD Map Data

A Python pipeline that keeps dddfanmap.com up to date. It scrapes Diners, Drive-Ins and Dives episode data from Wikipedia, geocodes each restaurant location using the Google Places API, stores the results in SQL Server, and exports a GeoJSON file for the web map.

How it works

  1. Scrape — wiki_scrape.py fetches the episode list from Wikipedia and writes ddd_episodes.csv
  2. Diff — refresh_data.ipynb compares the CSV against the database, identifies new episodes, checks for changes in business status (OPERATIONAL, TEMPORARILY_CLOSED, PERMANENTLY_CLOSED), and updates the dataset accordingly
  3. Geocode — new restaurant locations are looked up via the Google Places API (geocode.py) and inserted into the database
  4. Export — location data is exported from the database to locations.json (GeoJSON), which the web map consumes
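The geocoding step (3) boils down to parsing a Google Places response into coordinates and a status. Here is a minimal sketch of that parsing — the function name, the field handling, and the sample payload are illustrative and not taken from geocode.py:

```python
def extract_location(places_response: dict):
    """Pull coordinates and business status from a Places Text Search response.

    `places_response` is the parsed JSON from the Places Text Search endpoint;
    real geocode.py may handle more fields and edge cases.
    """
    if places_response.get("status") != "OK" or not places_response.get("results"):
        return None
    top = places_response["results"][0]
    loc = top["geometry"]["location"]
    return {
        "lat": loc["lat"],
        "lng": loc["lng"],
        # Google reports e.g. OPERATIONAL / CLOSED_TEMPORARILY / CLOSED_PERMANENTLY
        "business_status": top.get("business_status", "UNKNOWN"),
    }

# Hypothetical sample payload for illustration
sample = {
    "status": "OK",
    "results": [{
        "geometry": {"location": {"lat": 37.77, "lng": -122.42}},
        "business_status": "OPERATIONAL",
    }],
}
```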

Prerequisites

  • Python 3.x
  • SQL Server with ODBC Driver 17 (or a database of your choice)
  • Google Maps API key (with Places API and Geocoding API enabled)

Install dependencies:

pip install requests beautifulsoup4 pandas geopandas sqlalchemy pyodbc geodatasets matplotlib python-dotenv

Setup

  1. Copy .env.example to .env and add your Google Maps API key
  2. Copy database.ini.example to database.ini and fill in your database connection details

Usage

Scrape Wikipedia only (generates ddd_episodes.csv):

python wiki_scrape.py
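The scrape itself amounts to pulling rows out of Wikipedia's episode tables. A minimal sketch with BeautifulSoup — the table class and column layout are assumptions, and wiki_scrape.py almost certainly does more cleanup:

```python
from bs4 import BeautifulSoup

# Hypothetical fragment standing in for the fetched Wikipedia page
SAMPLE_HTML = """
<table class="wikitable">
  <tr><th>No.</th><th>Title</th><th>Original air date</th></tr>
  <tr><td>1</td><td>"Classics"</td><td>April 23, 2007</td></tr>
</table>
"""

def parse_episode_rows(html: str) -> list[list[str]]:
    # Collect every data row from each episode wikitable, skipping header rows.
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for table in soup.select("table.wikitable"):
        for tr in table.find_all("tr")[1:]:
            cells = [td.get_text(strip=True) for td in tr.find_all("td")]
            if cells:
                rows.append(cells)
    return rows
```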

Full pipeline (scrape, sync to DB, geocode new locations, update business status, export GeoJSON):

Run all cells in refresh_data.ipynb.
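The final export step might look roughly like this — building a GeoJSON FeatureCollection from database rows before writing locations.json. The tuple layout is an assumption; adapt it to the actual query in db_conn.py:

```python
import json

def rows_to_geojson(rows) -> dict:
    """Convert (name, lat, lng, business_status) tuples into a FeatureCollection.

    The tuple layout is hypothetical, not the repo's actual schema.
    """
    features = [
        {
            "type": "Feature",
            # GeoJSON coordinate order is [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lng, lat]},
            "properties": {"name": name, "business_status": status},
        }
        for name, lat, lng, status in rows
    ]
    return {"type": "FeatureCollection", "features": features}

geo = rows_to_geojson([("Example Diner", 37.77, -122.42, "OPERATIONAL")])
# then: json.dump(geo, open("locations.json", "w"))
```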

Files

File Description
wiki_scrape.py Scrapes episode data from Wikipedia into a CSV
geocode.py Google Places API helpers for geocoding and business status
db_conn.py Database query and update functions
config.py Reads database.ini and builds the DB connection string
refresh_data.ipynb Notebook orchestrating the full data refresh pipeline
.env.example Template for environment variables
database.ini.example Template for database connection config
ddd_episodes.csv Output CSV file from wiki_scrape.py
locations.json Final GeoJSON output, consumed by the dddfanmap.com map

