[GH-2700] Refresh docker example notebooks: scaffolding + quickstart #2876
Merged
jiayuasu merged 2 commits into apache:master on May 1, 2026
…start

The notebooks bundled in the Sedona docker image were several releases behind. Start a refresh series (issue apache#2700, milestone 1.9.1) by:

- moving the five legacy notebooks under docs/usecases/legacy/. They remain reachable via existing GitHub URLs but are no longer copied into the image (the Dockerfile's COPY of *.ipynb is non-recursive).
- adding 00-quickstart.ipynb: a ten-cell, ~30-second walkthrough that reads two shapefiles, runs a spatial join, aggregates, writes GeoParquet 1.1, and renders a SedonaKepler choropleth. Uses the Natural Earth data already shipped under docs/usecases/data/, so it needs no new bytes and no network.
- bumping docker/test-notebooks.sh per-notebook timeout 600s -> 900s to absorb network variance for upcoming notebooks that hit STAC and remote object stores.
- teaching docker/test-notebooks.sh to honour SEDONA_NOTEBOOK_OFFLINE=1 so notebooks tagged "requires-network: true" can be skipped in sandboxed CI environments without outbound network access.
- redirecting the two existing tutorial/sql.md links from the moved AirportsPerCountry notebook to its new legacy/ path.
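The commit message above notes that the Dockerfile's `COPY` of `*.ipynb` is non-recursive. Docker's `COPY` resolves wildcards with its own pattern matcher rather than a shell, but a shell `*` behaves the same way in this one respect: it never crosses a directory separator. The sketch below is only an analogy in a throwaway directory; the file names are illustrative stand-ins, not the harness or the Dockerfile itself.

```bash
# Illustrative only: a temp tree standing in for docs/usecases/.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/legacy"
touch "$demo_dir/00-quickstart.ipynb" "$demo_dir/legacy/ApacheSedonaSQL.ipynb"

# Single-level wildcard: matches files directly in demo_dir, never in legacy/.
matched=""
for f in "$demo_dir"/*.ipynb; do
  matched="$matched $(basename "$f")"
done

echo "matched:$matched"
rm -rf "$demo_dir"
```

Only the top-level notebook is matched, which is why moving files into `legacy/` is enough to keep them out of the image without touching the `COPY` line.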
Pull request overview
Refreshes the notebooks shipped in the Sedona docker image by introducing a new quickstart notebook, moving older notebooks into a non-shipped legacy/ folder, and extending the docker notebook test harness (timeout + offline skipping).
Changes:
- Added a new `00-quickstart.ipynb` notebook intended for the docker image.
- Moved existing example notebooks under `docs/usecases/legacy/` and updated tutorial links accordingly.
- Updated `docker/test-notebooks.sh` to support offline-mode skipping via a `requires-network: true` marker and increased the per-notebook timeout.
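The PR text does not show how the harness implements the offline skip, so the following is a minimal sketch under stated assumptions rather than the actual `docker/test-notebooks.sh` logic. It assumes the marker appears as the literal text `requires-network: true` somewhere in the notebook file; the helper names `should_skip` and `summarize` are hypothetical.

```bash
# Hypothetical helpers; the real docker/test-notebooks.sh may detect the tag differently.
# Assumption: the marker is the literal text "requires-network: true" in the .ipynb file.
should_skip() {
  [ "${SEDONA_NOTEBOOK_OFFLINE:-0}" = "1" ] && grep -q 'requires-network: true' "$1"
}

summarize() {
  ran=0
  skipped=0
  for nb in "$@"; do
    if should_skip "$nb"; then
      echo "SKIP (offline): $nb"
      skipped=$((skipped + 1))
    else
      echo "RUN: $nb"   # a real harness would execute the notebook here
      ran=$((ran + 1))
    fi
  done
  echo "ran=$ran skipped=$skipped"
}
```

A call such as `SEDONA_NOTEBOOK_OFFLINE=1 summarize docs/usecases/*.ipynb` would then list each notebook as run or skipped and end with the counts. Leaving the variable unset runs everything, so the default behaviour stays unchanged.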
Reviewed changes
Copilot reviewed 3 out of 8 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| docs/usecases/00-quickstart.ipynb | Adds a new quickstart notebook workflow (read shapefiles → spatial join → GeoParquet write/read → Kepler viz). |
| docker/test-notebooks.sh | Adds offline skip logic + increases timeout + summary includes skipped count. |
| docs/tutorial/sql.md | Updates links to the AirportsPerCountry notebook’s new legacy/ path. |
| docs/usecases/legacy/ApacheSedonaCore.ipynb | Legacy notebook relocated under legacy/. |
| docs/usecases/legacy/ApacheSedonaSQL.ipynb | Legacy notebook relocated under legacy/. |
| docs/usecases/legacy/ApacheSedonaSQL_SpatialJoin_AirportsPerCountry.ipynb | Legacy notebook relocated under legacy/ (and referenced by docs). |
| docs/usecases/legacy/Sedona_OvertureMaps_GeoParquet.ipynb | Legacy notebook relocated under legacy/. |
- docker/test-notebooks.sh: enable `set -o pipefail`. The per-notebook pipeline `timeout 900 python3 ... | tee ...` was returning tee's exit status, so a notebook crash or timeout would be silently misreported as a pass. With pipefail the `if` block now sees the real exit code.
- docs/usecases/00-quickstart.ipynb: drop the bullet list pointing to notebooks 01..05, which don't exist yet and would be dead links until the rest of the refresh series lands. Replace with a one-line note that more notebooks are coming. Also drop the "ten cells" wording (the notebook has nine) and the trailing forward-reference to 01-mobility-pulse in the closing cell.
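The pipefail fix described above can be reproduced in isolation. This is a small demonstration, not an excerpt from the harness: `failing_step` is a hypothetical stand-in for a crashing notebook run, and `tee /dev/null` stands in for the log capture.

```bash
#!/usr/bin/env bash
# Stand-in for a crashing notebook run: exits with status 3.
failing_step() { return 3; }

# Without pipefail, the pipeline reports the status of its last command (tee), which is 0.
set +o pipefail
failing_step | tee /dev/null
status_without=$?

# With pipefail, the rightmost non-zero status in the pipeline is reported instead.
set -o pipefail
failing_step | tee /dev/null
status_with=$?

echo "without pipefail: $status_without, with pipefail: $status_with"
```

Because GNU `timeout` exits with status 124 when the time limit expires, the same option should also let a timed-out notebook surface as a failure instead of being masked by `tee`.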
All three review comments addressed in 43e47e1:
This was referenced May 1, 2026
Summary
Issue: #2700. Milestone: 1.9.1.
The notebooks bundled in the Sedona docker image are several releases behind. This is the first of a planned series of PRs that refresh them. It's intentionally scoped to scaffolding so the rest of the series (vector / raster / STAC notebooks) can land independently on top of it.
- `docs/usecases/legacy/`. All five existing notebooks (`ApacheSedonaCore`, `ApacheSedonaSQL`, `ApacheSedonaSQL_SpatialJoin_AirportsPerCountry`, `ApacheSedonaRaster`, `Sedona_OvertureMaps_GeoParquet`) now live under `docs/usecases/legacy/`. Existing GitHub URLs continue to work; the docker image stops bundling them because the Dockerfile's `COPY docs/usecases/*.ipynb` is non-recursive.
- `00-quickstart.ipynb`. A nine-cell, ~30-second walkthrough: read two shapefiles, spatial join, aggregate, write GeoParquet 1.1, render a SedonaKepler choropleth. Uses the Natural Earth data already shipped under `docs/usecases/data/`, so no new bytes and no network are required.
- `docker/test-notebooks.sh`:
  - `SEDONA_NOTEBOOK_OFFLINE=1`: notebooks tagged `requires-network: true` are skipped, so the harness still passes in sandboxed CI environments without outbound network access. The skip count is reported in the summary.
- `docs/tutorial/sql.md`: redirect the two existing links from the moved AirportsPerCountry notebook to its new `legacy/` path.

Test plan
- `docker build -f docker/sedona-docker.dockerfile -t sedona:dev .` succeeds.
- `docker run --rm sedona:dev /opt/sedona/docker/test-notebooks.sh` exits 0 and reports `00-quickstart` passing.
- `docker run --rm -e SEDONA_NOTEBOOK_OFFLINE=1 sedona:dev /opt/sedona/docker/test-notebooks.sh` still exits 0 (no network-tagged notebooks exist yet, so the offline path is a no-op until the next PR).
- `docker run -p 8888:8888 -p 8080:8080 sedona:dev`, open `00-quickstart.ipynb` in JupyterLab, run all cells, eyeball the Kepler map.
- Links in `docs/tutorial/sql.md` resolve to `docs/usecases/legacy/ApacheSedonaSQL_SpatialJoin_AirportsPerCountry.ipynb`.