Static-first personal travel guides built from Google Maps saved lists.
Frontend package management and script execution use bun.
The scraper dependency is vendored into this repo as a git subtree at
vendor/gmaps-scraper/, and uv installs it from that in-repo path.
- Astro for the site
- Python + uv for scraping, normalization, and enrichment
- Cloudflare Pages or GitHub Pages for static hosting
Install frontend dependencies:

```sh
bun install
```

Install Python dependencies:

```sh
uv sync
```

The repo pins Python via `.python-version` so local uv usage and Cloudflare Pages builds both resolve the intended 3.14 runtime instead of Cloudflare's default 3.13.x.
Put local secrets in `.env`.
Recommended split:
```sh
# Browser Google Maps display only.
GOOGLE_MAPS_JS_API_KEY=...

# Optional backup for server/build-time enrichment fallback.
GOOGLE_PLACES_API_KEY=...

# Optional: force the old non-Google map path.
PUBLIC_MAP_PROVIDER=leaflet
```

Notes:
- `GOOGLE_MAPS_JS_API_KEY` is read by Astro during render/build and embedded into the page only when the Google map provider is active. Treat it as a browser key: restrict the production key by HTTP referrer and allow only the Maps JavaScript API.
- `GOOGLE_PLACES_API_KEY` should never be exposed to the browser. Use it only as a server/build-time fallback when place-page enrichment cannot recover enough data.
- Use a separate production browser key instead of reusing a local dev key.
- `PUBLIC_MAP_PROVIDER=leaflet` is an escape hatch if you need to force the Leaflet fallback while keeping the Google Maps codepath in the repo.
- `GMAPS_SCRAPER_PROXY` optionally routes scraper traffic through a proxy. The pipeline keeps proxy-specific scraper sessions under `.context/gmaps-scraper/`, clears them after obvious block/cookie-jar failures, and expires idle sessions after 14 days.
Populate local raw data from public Google Maps lists:
```sh
bun run sync:sources
```

This refreshes every configured source and then rebuilds generated site data.
Public Google Maps URLs are always re-scraped. Local Google export CSV files are re-imported only when their
contents or config change.
Headless refreshes run up to 4 scraper workers in parallel by default. Use
`uv run python3 scripts/build_data.py --refresh --refresh-workers 1` to force
serial execution, or `--headed`, which shows browser windows and runs a single worker.
Force-refresh raw source imports even if a CSV input is unchanged:
```sh
bun run sync:sources:force
```

Refresh one configured source by slug, source URL, or source path:
```sh
bun run sync:source -- tokyo-japan
bun run sync:source -- https://maps.app.goo.gl/your-public-list
bun run sync:source -- data/imports/taipei-taiwan.csv
```

Build generated site data and the static browser search index from local raw JSON:
```sh
bun run build:data
```

Use this when `data/raw/` is already up to date and you only want to regenerate site inputs and search data.
Configured local CSV sources are auto-imported before rebuild. Public Google Maps URL sources are not refreshed here.
Fill missing or stale place enrichment cache entries, then rebuild:
```sh
GOOGLE_PLACES_API_KEY=... bun run enrich:data
```

The same key can live in `.env` as `GOOGLE_PLACES_API_KEY=...`.
Force-refresh all place enrichment cache entries:
```sh
GOOGLE_PLACES_API_KEY=... bun run refresh:enrichment
```

Start the site:
```sh
bun run dev
```

Guide pages use Google Maps by default when `GOOGLE_MAPS_JS_API_KEY` is set at build/render time. If it is missing, or if `PUBLIC_MAP_PROVIDER=leaflet`, the site falls back to Leaflet/OpenStreetMap.
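The provider fallback described above is a small decision rule. A minimal sketch in Python (the function name and dict-based env are illustrative, not the site's actual Astro code):

```python
def resolve_map_provider(env: dict[str, str]) -> str:
    """Pick the map provider the way the build is described to:
    an explicit Leaflet override wins, otherwise Google is used
    only when a browser key is present at build/render time."""
    if env.get("PUBLIC_MAP_PROVIDER") == "leaflet":
        return "leaflet"
    if env.get("GOOGLE_MAPS_JS_API_KEY"):
        return "google"
    return "leaflet"
```

The same rule keeps the Google codepath in the repo while making the Leaflet fallback reachable with a single environment variable.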
Verify the site:
```sh
bun run test
bun run check
bun run build
```

Cloudflare Pages should not auto-install Python dependencies for this repo.
The root `pyproject.toml` exists for the local data pipeline, and Cloudflare's
default pip install path does not understand the vendored scraper declared in
`[tool.uv.sources]`.
Use these Pages settings instead:
- Environment variable: `SKIP_DEPENDENCY_INSTALL=true`
- Environment variable: `BUN_VERSION=1.3.12`
- Python version: keep the root `.python-version` in sync with `pyproject.toml`
- Build command:

```sh
bun ci && pipx install uv==0.11.6 && export PATH="$HOME/.local/bin:$PATH" && uv sync && bun run build:data && bun run build
```

Why this is necessary:
- `bun ci` installs the frontend dependencies from `bun.lock`
- `pipx install uv==0.11.6` makes `uv` available in the Pages build image
- `uv sync` installs Python dependencies, including the vendored scraper from `vendor/gmaps-scraper`
- `bun run build:data` generates `src/data/generated/`, which Astro reads at build time
- `bun run build` builds the static site
Do not rely on Cloudflare's automatic Python dependency detection for this repo unless the packaging layout changes.
This repo can commit raw scraped list snapshots in `data/raw/` and reproducible Google Places
enrichment cache files in `data/cache/google-places/` when you want stable source data in git.
It still does not commit generated build data.
- Export your saved lists from Google Takeout.
Go to Google Takeout, select Saved, and download the export.
After extracting the archive, you should get a folder with one or more .csv files for your saved lists.
You can then either keep those CSVs as your own reference data, or use the place names and URLs while
building scripts/config/list_sources.json.
- Add your source definitions to `scripts/config/list_sources.json`.
If you are starting from this repo as a base template, copy the example file first:
```sh
cp scripts/config/list_sources.example.json scripts/config/list_sources.json
```

Every source needs a `slug`.
- `url` sources infer `type: "google_list_url"` for supported Google Maps links, including `https://maps.app.goo.gl/...` shortlinks and `https://www.google.com/maps/...` share links.
- `path` sources infer `type: "google_export_csv"` and require `title`.
- `type` can still be included explicitly, but it must match the configured `url` or `path`.
- `title` is optional for Google Maps URL sources and acts as a fallback list title.
- Google My Maps URLs such as `https://www.google.com/maps/d/...` are not supported yet.
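The inference rules above can be sketched as a small validator (a minimal sketch; the function name is illustrative, not the pipeline's actual code):

```python
def infer_source_type(source: dict) -> str:
    """Mirror the documented config rules: url sources become
    google_list_url, path sources become google_export_csv,
    and an explicit type must match what was inferred."""
    if "url" in source:
        inferred = "google_list_url"
    elif "path" in source:
        inferred = "google_export_csv"
    else:
        raise ValueError("source needs either 'url' or 'path'")
    explicit = source.get("type")
    if explicit is not None and explicit != inferred:
        raise ValueError(f"type {explicit!r} does not match inferred {inferred!r}")
    if inferred == "google_export_csv" and not source.get("title"):
        raise ValueError("CSV sources require a title")
    return inferred
```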
Example:

```json
[
  {
    "slug": "tokyo-japan",
    "url": "https://maps.app.goo.gl/your-public-list"
  },
  {
    "slug": "taipei-taiwan",
    "path": "data/imports/taipei-taiwan.csv",
    "title": "Taipei, Taiwan 🇹🇼"
  }
]
```

Optional fallback title example:
```json
[
  {
    "slug": "tokyo-japan",
    "url": "https://maps.app.goo.gl/your-public-list",
    "title": "Tokyo, Japan 🇯🇵"
  }
]
```

- Pull raw list data through the installed scraper dependency:
```sh
bun run sync:sources
```

This writes local JSON files into `data/raw/`, including refresh metadata like `fetched_at`,
`refresh_after`, and a source signature. URL-backed sources skip network refreshes until
their refresh window expires unless the source config changes. CSV-backed sources skip rewrites
when the input file hash is unchanged.
It also rebuilds the generated site JSON afterward.
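The skip logic for URL-backed sources can be sketched as follows (a minimal sketch under stated assumptions: the helper names, the SHA-256 signature, and the ISO-format `refresh_after` field are illustrative, not the pipeline's actual code):

```python
import hashlib
import json
from datetime import datetime

def source_signature(config: dict) -> str:
    """Stable hash of the source config; any config change
    invalidates the existing raw snapshot."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def needs_refresh(snapshot: dict, config: dict, now: datetime) -> bool:
    """A URL-backed snapshot is re-scraped when its recorded signature
    no longer matches the config, or its refresh window has expired."""
    if snapshot.get("source_signature") != source_signature(config):
        return True
    refresh_after = datetime.fromisoformat(snapshot["refresh_after"])
    return now >= refresh_after
```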
- Add manual curation files in `src/data/overrides/`.
Example files live alongside the real override directories:
- `src/data/overrides/lists/list.example.json`
- `src/data/overrides/places/list.example.json`
Per-list example at `src/data/overrides/lists/tokyo-japan.json`:

```json
{
  "city_name": "Tokyo",
  "country_name": "Japan",
  "country_code": "JP",
  "list_tags": ["tokyo", "japan", "food", "coffee"]
}
```

Per-place example at `src/data/overrides/places/tokyo-japan.json`:
```json
{
  "cid:6924437575605096209": {
    "top_pick": true,
    "tags": ["coffee", "nakameguro"],
    "why_recommended": "A very easy first stop."
  }
}
```

- Optionally fill Google Places enrichment cache:
```sh
bun run enrich:data
```

This writes cache files into `data/cache/google-places/`, which may be committed for reproducible
enrichment results.
- Build generated site data:
```sh
bun run build:data
```

This writes local generated JSON into `src/data/generated/` and the client-side search index into `public/data/search-index.json` from the current contents of `data/raw/`.
Configured local CSV sources are imported into `data/raw/<slug>.json` first when needed.
- Run the site:
```sh
bun run dev
```

If you already have raw JSON from elsewhere, you can skip source refresh and place compatible files directly in `data/raw/<slug>.json`, then run `bun run build:data`.
For a targeted refresh, run `bun run sync:source -- <slug-or-url-or-path>`.
For a full forced refresh, run `bun run sync:sources:force`.
Legacy aliases still work:
- `bun run refresh:data`
- `bun run refresh:data:force`
- `bun run refresh:data:list -- <slug-or-url>`
This repo can keep personal data and still act as the basis for a cleaner template extraction later. The key is to keep "replace me" files obvious and colocated with the real paths future users will edit.
- `scripts/config/list_sources.json` is your real source list config.
- `scripts/config/list_sources.example.json` is the starter file for template users.
- `src/data/site.ts` is the site-level branding, favicon, and copy config for this instance.
- `src/data/site.example.ts` shows the expected shape for a new instance.
- `src/data/overrides/lists/*.json` and `src/data/overrides/places/*.json` are real handwritten curation files, excluding `*.example.json`.
- `src/data/overrides/lists/list.example.json` and `src/data/overrides/places/list.example.json` are starter examples showing the expected override shapes.
For future extraction into a dedicated template repo, the split is:
- Engine: `scripts/`, `src/lib/`, `src/components/`, and Astro wiring.
- Content: `scripts/config/list_sources.json`, `data/raw/`, and `src/data/overrides/`.
- Theme and branding: `src/data/site.ts` plus any styling and assets under `src/styles/` and `public/`.
The project keeps four layers separate:
- `data/raw/` stores disposable scraper output.
- `data/cache/google-places/` stores cached Google Places lookups keyed by stable place id and may be committed.
- `src/data/overrides/` stores handwritten metadata, tags, notes, and ranking.
- `src/data/generated/` stores the static JSON that Astro reads at build time.
Manual overrides always win over machine-enriched fields.
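That precedence is a simple layered merge. A minimal sketch (the function name is illustrative; the real merge may be field-aware rather than a flat dict update):

```python
def merge_place(scraped: dict, enriched: dict, overrides: dict) -> dict:
    """Layered merge: scraped fields form the base, machine enrichment
    sits on top, and handwritten overrides always win."""
    merged = dict(scraped)
    merged.update(enriched)
    merged.update(overrides)
    return merged
```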
Enrichment is optional and cached. A normal build never calls Google.
- `--enrich` fills missing or stale cache entries according to the cache entry's own refresh window.
- `--refresh-enrichment` ignores the 30-day cache window and refetches every place.
- Manual overrides still win over Google data.
- Cache invalidation is field-aware: raw input changes force a refresh, operational places refresh more slowly, and volatile or risky states like ratings, closures, unmatched results, and API errors refresh sooner.
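The tiered refresh windows can be sketched as a TTL picker (a minimal sketch; the 3/7/30-day thresholds and entry field names are illustrative assumptions, only the 30-day normal window comes from the text above):

```python
from datetime import timedelta

def cache_ttl(entry: dict) -> timedelta:
    """Pick a refresh window by how volatile or risky the entry is:
    failures retry soon, closures re-check weekly, healthy entries
    use the normal 30-day window."""
    if entry.get("status") in {"unmatched", "api_error"}:
        return timedelta(days=3)   # assumed short retry window for failures
    if entry.get("business_status") != "OPERATIONAL":
        return timedelta(days=7)   # assumed faster re-check for closures
    return timedelta(days=30)      # normal cache window from the docs
```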
The current enrichment pass uses Google Places Text Search with a narrow field mask and location bias around the scraped coordinates. It is meant to fill in useful metadata such as category, Maps URI, and business status without turning the site build into a runtime dependency on Google.
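A request to the Places API (New) Text Search endpoint with a narrow field mask and a location bias looks roughly like this (a sketch only: the exact field mask and 500 m bias radius are assumptions, not the pipeline's actual values):

```python
def text_search_request(name: str, lat: float, lng: float) -> dict:
    """Build a Places Text Search (v1) request biased around the scraped
    coordinates; the field mask limits both the response and billing
    to the metadata the guide actually uses."""
    return {
        "url": "https://places.googleapis.com/v1/places:searchText",
        "headers": {
            "X-Goog-FieldMask": ",".join([
                "places.id",
                "places.displayName",
                "places.primaryType",
                "places.googleMapsUri",
                "places.businessStatus",
            ]),
        },
        "body": {
            "textQuery": name,
            "locationBias": {
                "circle": {
                    "center": {"latitude": lat, "longitude": lng},
                    "radius": 500.0,  # assumed bias radius in meters
                }
            },
        },
    }
```

The actual HTTP call would also send `X-Goog-Api-Key` with the server-side `GOOGLE_PLACES_API_KEY`.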
For CSV imports, the pipeline trusts each place's Google Maps URL more than the exported title. It derives stable place IDs from the Maps place token when needed and prefers Google Places display names during normalization so mojibake in the export does not leak into the final guide.

