
feat(docs): Add markdown-exec for executable code blocks#309

Merged
chekos merged 3 commits into main from feat/markdown-exec
Mar 16, 2026

Owner

@chekos chekos commented Mar 13, 2026

Summary

  • Adds markdown-exec plugin so Python code blocks in docs execute at build time, replacing manually-written output blocks
  • Converts 8 guide/quickstart pages (~66 code blocks) to use exec="on" source="tabbed-left" with per-page sessions
  • Adds --extra spatial to .readthedocs.yaml so geometry=True blocks can run during docs build
  • Requires CENSUS_API_KEY environment variable in ReadTheDocs admin (already configured)

Pages left unconverted due to pre-existing API bugs (get_pums 400s, load_variables wrong URL pattern, get_estimates 404s): variables, pums-microdata, population-estimates, survey-design.

Test plan

  • uv run mkdocs build --strict passes with no warnings
  • ReadTheDocs build succeeds with CENSUS_API_KEY env var set
  • Visually verify converted pages render output correctly
  • Confirm vegalite charts and Material annotations still render

🤖 Generated with Claude Code

Replace manually-written output blocks with code that runs at build time
via markdown-exec, so docs always reflect the current API behavior.

Converted pages: quickstart, acs-data, decennial-data, geography,
margins-of-error, migration-flows, multi-year, spatial.

Pages left unconverted due to pre-existing API bugs (get_pums 400s,
load_variables wrong URL pattern, get_estimates 404s): variables,
pums-microdata, population-estimates, survey-design.

Also adds --extra spatial to .readthedocs.yaml so geometry=True
blocks can execute during the docs build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

claude bot commented Mar 13, 2026

Claude finished @chekos's task in 2m 37s


PR Review: feat(docs): Add markdown-exec for executable code blocks

  • Gather context and read changed files
  • Review configuration changes (mkdocs.yml, pyproject.toml, .readthedocs.yaml)
  • Review converted documentation pages
  • Summarize findings

Overall this is a well-structured PR that dramatically reduces manually-maintained output blocks. The approach (hidden exec blocks alongside annotated display blocks in quickstart.md; source="tabbed-left" pattern elsewhere) is clean and idiomatic for markdown-exec. The configuration additions look correct. A few issues need attention before merging.

Issues

🔴 Critical: known-broken APIs used in exec blocks

The PR description explicitly calls out get_pums (400s) and get_estimates (404s) as broken, but both appear in exec blocks that will run during the docs build:

  • quickstart.md lines 259–270: get_pums() in a hidden exec block. Even with # markdown-exec: hide, the code still executes. If it raises an exception, the build will either fail (--strict mode) or show a traceback in the output tab.
  • docs/guides/spatial.md lines 62–74: get_estimates() in a source="tabbed-left" exec block. This is inside the guide that was ostensibly fully converted, but it calls one of the functions explicitly excluded from conversion.

If the local mkdocs build --strict passes, it's likely because CENSUS_API_KEY is not set locally, causing all API calls to fail with the same error, which markdown-exec may handle non-fatally. On ReadTheDocs with the key configured, these blocks will hit the broken API endpoints.

🟡 Bug: missing print() in migration-flows.md

docs/guides/migration-flows.md line 68: ca_flows.head() without print() will produce no visible output. markdown-exec captures stdout; a bare expression doesn't write to stdout in standard Python. The output tab will be blank.

# Current (produces no output):
ca_flows.head()

# Should be:
print(ca_flows.head())
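
The difference can be reproduced without any docs tooling. The sketch below is only an illustration using plain exec() with redirected stdout; it is not a description of how markdown-exec is implemented, and the helper name captured_stdout and the sample rows are hypothetical.

```python
import contextlib
import io


def captured_stdout(source: str) -> str:
    """Run `source` with exec() and return whatever it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


# Hypothetical sample data standing in for a DataFrame.
rows = "rows = ['06037', '06073', '06059']\n"

# A bare expression is evaluated and its value is discarded.
bare = captured_stdout(rows + "rows[:2]")

# print() writes the value to stdout, so it is captured.
printed = captured_stdout(rows + "print(rows[:2])")

print(repr(bare))
print(repr(printed))
```

Only the second variant leaves anything in the captured buffer, which is why the bare ca_flows.head() call would render an empty output tab.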


🟡 Data inconsistency: Washington state 2020 Census population

Two "2020 Decennial Census Population by State" Vega-Lite charts have different values for Washington (FIPS 53):

  • docs/guides/decennial-data.md: 7705281 ✓ (correct 2020 Census figure)
  • docs/guides/spatial.md line 159: 7614893 ✗ (wrong figure)

The spatial.md chart should use 7705281 to match the actual 2020 Census count.

🟡 Performance: nationwide ZCTA query in geography.md

docs/guides/geography.md line 129: get_acs("zcta", variables="B01003_001", year=2022) fetches all ~33,000 ZCTAs nationwide. No cache_table=True. This will be one of the slowest calls in the docs build, and will be repeated on every rebuild. Consider adding cache_table=True here and in other large exec blocks.
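
To illustrate why caching matters for repeated builds, here is a minimal sketch of a file-backed cache around a slow fetch. It does not show how cache_table=True works inside PyPUMS; slow_fetch, cached_fetch, CACHE_DIR, and the returned row are all hypothetical stand-ins.

```python
import json
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp())  # hypothetical cache location
call_count = 0  # counts how often the "network" is actually hit


def slow_fetch(geography: str, variable: str, year: int) -> list[dict]:
    """Hypothetical stand-in for a slow, nationwide API request."""
    global call_count
    call_count += 1
    # Placeholder row, not real Census data.
    return [{"GEOID": "00000", "variable": variable, "estimate": 0}]


def cached_fetch(geography: str, variable: str, year: int) -> list[dict]:
    """Return the cached table if present; otherwise fetch and store it."""
    cache_file = CACHE_DIR / f"{geography}_{variable}_{year}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    table = slow_fetch(geography, variable, year)
    cache_file.write_text(json.dumps(table))
    return table


first = cached_fetch("zcta", "B01003_001", 2022)   # hits slow_fetch
second = cached_fetch("zcta", "B01003_001", 2022)  # served from disk
print(call_count)
```

The second call returns the same table without touching slow_fetch, which is the behavior one would hope for from a cached docs rebuild.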

🔵 Fragility: session variable reuse in spatial.md

In docs/guides/spatial.md, four tabbed exec blocks all in session="spatial" each redefine gdf with different data (ACS county, decennial state, estimates state, flows county). Then at line 195, print(gdf.crs) references the last value of gdf in the session—which will be the flows GeoDataFrame (last tab). This works incidentally because all PyPUMS geometry is NAD83, but it's fragile. If the tab order changes or if an earlier block fails and gdf isn't updated, the behavior becomes unpredictable.

Consider naming each result (gdf_acs, gdf_dec, etc.) or using separate sessions per tab.
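
A rough sketch of the hazard, using a plain dict as a stand-in for the shared session="spatial" namespace. This assumes blocks in one session share variables, as the review describes; the strings are placeholders for the four GeoDataFrames.

```python
# One shared namespace, standing in for session="spatial".
session: dict = {}

# Four blocks that all assign to the same name, in tab order.
blocks = [
    "gdf = 'acs county'",
    "gdf = 'decennial state'",
    "gdf = 'estimates state'",
    "gdf = 'flows county'",
]
for block in blocks:
    exec(block, session)

# Whichever block ran last determines what later code sees.
last_shared = session["gdf"]

# Distinct names keep every result addressable regardless of order.
named: dict = {}
exec("gdf_acs = 'acs county'", named)
exec("gdf_dec = 'decennial state'", named)

print(last_shared)
print(named["gdf_acs"], named["gdf_dec"])
```

With distinct names, a later block can refer to the exact result it needs, and reordering or removing a tab does not silently change what it reads.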

🔵 Note on # markdown-exec: hide pattern

The pattern in quickstart.md and decennial-data.md of showing an annotated display block followed by a hidden exec block that runs the same code is valid and intentional. This works correctly assuming the underlying APIs return results.

Configuration: all good ✅

  • mkdocs.yml: markdown-exec placed correctly before mkdocstrings; pymdownx.tabbed with alternate_style: true required for source="tabbed-left" is already configured.
  • pyproject.toml: markdown-exec[ansi] is the right extra (includes ANSI color support for rich terminal output).
  • .readthedocs.yaml: --extra spatial correctly added so geometry=True blocks can run.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

claude bot commented Mar 13, 2026

Claude finished @chekos's task in 2m 20s


PR Review: feat(docs): Add markdown-exec for executable code blocks

  • Gather context and read changed files
  • Review configuration changes (mkdocs.yml, pyproject.toml, .readthedocs.yaml)
  • Review converted documentation pages
  • Summarize findings

Overall this is a well-executed PR. Replacing ~370 lines of manually-maintained output blocks with live execution is a clear quality-of-life win. The session-based approach, consistent use of source="tabbed-left", and the hidden-block pattern for annotated code are all sound choices.

Configuration — looks good

  • mkdocs.yml: - markdown-exec cleanly inserted into the plugins list (line 69).
  • pyproject.toml: markdown-exec[ansi] — using the [ansi] extra is the right call to handle colored terminal output.
  • .readthedocs.yaml: Adding --extra spatial is necessary for the geometry=True blocks; correct.
  • uv.lock: Updated as required by CLAUDE.md; 31 lines added for the new dependency.

Issues found

Bug: missing print() in migration-flows.md

docs/guides/migration-flows.md line 68 calls ca_flows.head() without print(). markdown-exec captures stdout, not the REPL return value, so this will silently produce no output:

from pypums import get_flows

ca_flows = get_flows(
    "county",
    state="CA",
    year=2019,
)

ca_flows.head()   # ← no output will appear

All other guide pages use print(...) consistently. This line should be print(ca_flows.head()) or print(ca_flows.head().to_string()).



Risk: get_estimates and get_flows tabs in spatial.md execute at build time

docs/guides/spatial.md lines 62–88 include exec="on" blocks for get_estimates() and get_flows():

gdf = pypums.get_estimates(
    geography="state",
    product="population",
    vintage=2023,
    geometry=True,
)

The PR description lists get_estimates (404s) among the functions with pre-existing API bugs; get_flows is not named there, but its tab runs at build time in the same way. If these calls fail during mkdocs build, the build will break on ReadTheDocs. The local mkdocs build --strict passing today does not guarantee they'll pass on ReadTheDocs (different API key, network conditions, Census API availability).

Recommendation: Either remove exec="on" from these two tabs and restore static output, or add exec="on" raise_on_error="false" if markdown-exec supports that, or document which tabs are known-broken.
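
If no suitable markdown-exec option turns out to exist, one alternative is to guard the call inside the block itself. The sketch below is an assumption-laden illustration: run_or_fallback and broken_fetch are hypothetical helpers, not part of markdown-exec or PyPUMS, and catching a broad Exception is a shortcut for the example.

```python
from typing import Callable


def run_or_fallback(fetch: Callable[[], str], static_output: str) -> str:
    """Return live output, or the static text with a note if the call fails."""
    try:
        return fetch()
    except Exception as error:  # broad on purpose for this sketch
        return f"{static_output}\n(static output; live call failed: {error})"


def broken_fetch() -> str:
    """Hypothetical stand-in for an API call that currently returns an error."""
    raise RuntimeError("404 Not Found")


result = run_or_fallback(broken_fetch, "NAME  value")
print(result)
```

The trade-off is that a failing endpoint no longer breaks the build, but it also no longer surfaces as a build error, so the fallback note should stay visible in the rendered output.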


Inconsistent pattern in decennial-data.md

Most blocks in this file use exec="on" source="tabbed-left" session="decennial", but the "Common patterns" section switches to exec="on" session="decennial" (no source="tabbed-left") with a preceding non-executing display block:

# Display block (not exec, has comments for readability)
race = pypums.get_decennial(...)   # (shown to user)

# Hidden execution block (produces output, no source tab)
```python exec="on" session="decennial"
# markdown-exec: hide
race = pypums.get_decennial(...)   # (runs but hidden)
```

This is functional but breaks the visual consistency established everywhere else in the file. Users will see a different tab layout for those sections. Consider applying source="tabbed-left" uniformly and using a comment in the exec block to explain why the code is duplicated, or simply convert these to exec="on" source="tabbed-left" directly.


Minor: income DataFrame fetched but unused in margins-of-error.md

Lines 118–135 of docs/guides/margins-of-error.md make an API call to fetch income with output="wide", but the variable is never used — the significance() call immediately below uses hardcoded values:

income = get_acs(
    "place",
    variables="B19013_001",
    state="CA",
    year=2022,
    output="wide",
)

# income is never referenced below
is_different = significance(85000, 78000, 4000, 5000, clevel=0.90)

This adds unnecessary build time and API calls. The income fetch was probably carried over from a previous version that used real data for the significance example. Either wire income into the calculation (extracting real estimates and MOEs from the DataFrame) or remove the fetch and just use the hardcoded example.
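
For context on what the hardcoded call is likely checking, here is a sketch of the commonly used two-estimate z-test for survey estimates with margins of error. It is not taken from the PyPUMS source; is_significant and MOE_Z are hypothetical names, and the sketch assumes the MOEs are published at the 90% confidence level.

```python
import math
from statistics import NormalDist

MOE_Z = 1.645  # assumption: MOEs are published at the 90% confidence level


def is_significant(est1: float, est2: float, moe1: float, moe2: float,
                   clevel: float = 0.90) -> bool:
    """Two-sided z-test on the difference between two estimates."""
    # Convert each margin of error to a standard error.
    se1 = moe1 / MOE_Z
    se2 = moe2 / MOE_Z
    # Test statistic: difference divided by its combined standard error.
    z_stat = abs(est1 - est2) / math.sqrt(se1**2 + se2**2)
    # Two-sided critical value for the requested confidence level.
    critical = NormalDist().inv_cdf(1 - (1 - clevel) / 2)
    return z_stat > critical


at_90 = is_significant(85000, 78000, 4000, 5000, clevel=0.90)
at_95 = is_significant(85000, 78000, 4000, 5000, clevel=0.95)
print(at_90, at_95)
```

Under these assumptions the test statistic for the hardcoded values is roughly 1.80, so the difference clears the 90% threshold but not the 95% one. Wiring real estimates and MOEs from income into such a call would make the API fetch worthwhile.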



Summary

| Severity | Issue | File |
|---|---|---|
| Bug | ca_flows.head() missing print() → no output rendered | migration-flows.md:68 |
| Risk | get_estimates/get_flows tabs use exec="on" despite known API bugs | spatial.md:62-88 |
| Style | "Common patterns" section uses different pattern than rest of file | decennial-data.md:411-495 |
| Minor | income DataFrame fetched but never used in significance example | margins-of-error.md:118-135 |

The print() fix is a clear bug. The get_estimates/get_flows risk is worth addressing before merging to avoid a broken ReadTheDocs build. The rest are polish items.

- Add missing print() in migration-flows.md (no output was rendered)
- Revert get_estimates/get_flows exec blocks in spatial.md to static output
- Fix Washington state population value in spatial.md (7614893 -> 7705281)
- Remove unused income API call in margins-of-error.md
- Add cache_table=True to ZCTA query in geography.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

claude bot commented Mar 16, 2026

Claude finished @chekos's task in 2m 8s


PR Review: feat(docs): Add markdown-exec for executable code blocks

  • Read changed files
  • Review configuration changes
  • Review converted documentation pages
  • Summarize findings

The latest commit (fbb3d19) addressed five of the issues raised in the previous reviews. Here's the current state:

Fixed ✅

| Issue | File |
|---|---|
| ca_flows.head() missing print() | migration-flows.md:67 |
| get_estimates/get_flows exec blocks → static output | spatial.md:62-106 |
| Washington state population 7614893 → 7705281 | spatial.md:174 |
| Unused income API call removed | margins-of-error.md |
| cache_table=True added to ZCTA query | geography.md:129 |

Remaining issue

🔴 Critical: get_pums hidden exec block in quickstart.md

docs/getting-started/quickstart.md lines 259–270 still has a hidden exec block that calls get_pums():

```python exec="on" session="qs-pums"
# markdown-exec: hide
import pypums
ca_pums = pypums.get_pums(
    variables=["AGEP", "SEX", "WAGP"],
    state="CA",
    year=2023,
    survey="acs5",
    recode=True,
)
print(ca_pums.head())
```

The PR description explicitly lists `get_pums` (400s) as a function with a known API bug. Unlike the `get_estimates`/`get_flows` blocks in `spatial.md` that were correctly reverted to static output, this one was not touched. When ReadTheDocs runs the docs build with `CENSUS_API_KEY` set, this will hit the broken API endpoint and the output tab will either show a traceback or the build will fail.

**Fix:** Remove `exec="on"` and convert to a static output block (same pattern used for the now-fixed `spatial.md` tabs):

```python
# Static output block after the display block:

SERIALNO SPORDER PWGTP ST PUMA AGEP SEX WAGP SEX_label
0 ...
```


Notes (non-blocking)

Session variable in spatial.md:213 — with get_estimates/get_flows reverted to static, gdf at line 213 is now the decennial state GeoDataFrame from line 50. The CRS print will work correctly. However, lines 239–243 call gdf.to_crs(epsg=2229) (NAD83 / California zone 5) on a nationwide state-level GDF. The code runs without error but the "California zone 5" comment in the surrounding prose is slightly misleading for a dataset containing all 50 states. Low priority.

Inconsistent pattern in decennial-data.md (Common patterns section, line ~430) — still uses exec="on" session="decennial" without source="tabbed-left" unlike the rest of the file. Cosmetic only.


Overall

This PR is very close to merge-ready. The get_pums exec block is the only remaining issue that could break the ReadTheDocs build once the API key is in the environment. Everything else looks good.

Owner Author

chekos commented Mar 16, 2026

Addressed most review feedback in fbb3d19:

  • Fixed: missing print() in migration-flows.md
  • Fixed: reverted get_estimates()/get_flows() exec blocks in spatial.md to static output
  • Fixed: Washington state population (7614893 → 7705281) in spatial.md
  • Fixed: removed unused income API call in margins-of-error.md
  • Fixed: added cache_table=True to ZCTA query in geography.md

Not changing:

  • get_pums() exec block in quickstart.md — The RTD build passes successfully, so the API call works. Leaving as-is.
  • Inconsistent pattern in decennial-data.md — The split pattern (display block + hidden exec block) is intentional for sections with Material annotations (# (1)!), which don't render correctly inside executed blocks. The visual difference is minor and the pattern is used consistently in quickstart.md too.
  • Session variable reuse in spatial.md — Now only 2 of 4 tabs are exec blocks (the other 2 were reverted to static), reducing the fragility. The remaining gdf reuse between get_acs() and get_decennial() tabs is fine since both produce GeoDataFrames with the same CRS.

@chekos chekos merged commit a8f81ae into main Mar 16, 2026
13 checks passed
@chekos chekos deleted the feat/markdown-exec branch March 16, 2026 20:58