Precip+drying lit review: rag-build + 7 papers + cite map (2/3) #62
Merged
Lit-review precipitation + drying methodology + interpretation backing. Issue 2 of 3 in the climate-departure 3-split lit reviews (temperature [done #58/v0.2.2], precip+drying [this], interpretation framing [forthcoming]). Mirrors #58 phase structure verbatim.

PWF baseline carries forward 5 explicit lessons from #58:

1. BBT 9.x for Zotero 8/9 (compat split)
2. No Citation Key: overrides in extra (BBT auto-derives, soul#43)
3. PATCH individual authors when CrossRef returns only corporate name (Pepin 2015 working-group case)
4. OCR image-only scans before Zotero attach (Karl 93, Richter 05)
5. noun_verb naming for new rag scripts
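Lesson 3 (the corporate-author guard) can be sketched as a small pre-POST check. This is an illustration, not code from the repo: it assumes a CrossRef-style author list in which people carry a `family` field and organisations carry only `name`, and the helper names `count_individual_creators()` and `needs_author_patch()` are hypothetical.

```r
# Hypothetical helpers, not part of the repo.
# Assumption: each author is a list with `family` (and usually `given`) for a
# person, or only `name` for an organisation / working group.
count_individual_creators <- function(authors) {
  is_person <- vapply(
    authors,
    function(a) !is.null(a$family) && nzchar(a$family),
    logical(1)
  )
  sum(is_person)
}

# TRUE when no individual creators are present, i.e. the item would need its
# authors PATCHed in by hand after the metadata import.
needs_author_patch <- function(authors) {
  count_individual_creators(authors) == 0
}

# Illustrative inputs only; these are not real CrossRef responses.
people <- list(
  list(given = "A.", family = "Author"),
  list(given = "B.", family = "Writer")
)
corporate <- list(list(name = "Example Working Group"))

needs_author_patch(people)
needs_author_patch(corporate)
```

Running the check before the POST would flag a Pepin-style record up front, rather than after a malformed citation key has already been generated.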
… cross-rag)

Targeted web search confirmed DOIs + OA paths for 7 new papers: Williams 2020 (NA megadrought), Ficklin & Novick 2017 (VPD US), Grossiord 2020 (VPD plants), Trenberth 2014 (global drought), Min 2011 (anthropogenic precip extremes), Mekis & Vincent 2011 (Canadian precip dataset), Marvel 2019 (20th-century hydroclimate).

Skipped from initial candidate list: Donat 2013 HadEX2 (Min 2011 covers attribution; supplementary), Mass 2002 (NWP not orographic processes per se), Daly 2008 PRISM (methodology aside; cd uses ERA5-Land), Sheffield & Wood 2008 (superseded by Trenberth 2014).

5 existing climate-collection items + 4 cross-rag references from snow + temperature methodology stores cover the BC-specific hydrology, ERA5-Land soil-moisture validation, and trend-test methodology angles without duplicate Zotero entries.

11-topic coverage matrix in findings.md maps every #61 vignette claim type to its primary + supporting citations, ready for the Phase 5 cite-this-for-that map. Per philosophy memory (feedback_vignette_citations_sparse.md), the matrix is a menu: the downstream branch picks sparingly per AOI graph/table findings.

PDF acquisition: 5 OA-fetchable (UCAR, NASA-GISS, escholarship, Canada.gov, utah.edu), 2 paywalled needing user RG download (Ficklin & Novick 2017, Min 2011).

Refs #61.
POSTed all 7 candidates to NewGraphEnvironment/climate (key 8MH9LCC9) via Zotero Web API with PDFs attached via S3. CrossRef-driven metadata; tags precip-drying-departure-methodology + cd-issue-61. 3 fresh uploads (Williams, Ficklin, Marvel), 4 md5-deduped. All 7 items have >=2 individual creators per CrossRef (no Pepin-style corporate-only authorship to PATCH around).

PDF sourcing: 4 via curl (Williams emnrd.nm.gov, Grossiord utah.edu, Mekis Vincent ec.gc.ca, Min Edinburgh ghegerl PDF), 3 user-RG (Ficklin, Trenberth, Marvel). Marvel needed OCR (LLNL preprint image-only scan; working title differs from published Nature title but same DOI).

No Citation Key overrides in extra per NGE convention (soul#43 + #58 lesson).

User action pending: restart Zotero desktop so BBT generates citation keys. After restart, keys get captured into findings.md + Phase 3 rag build script.

Refs #61.
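The fresh-upload versus md5-deduped split above can be illustrated with a local hash comparison. This is a minimal sketch under stated assumptions, not the upload code used here: `classify_uploads()` is a hypothetical helper, the "known" hashes are supplied by the caller, and the temporary files merely stand in for PDFs. It uses `tools::md5sum()`, which ships with R.

```r
# Hypothetical helper: label each PDF as already stored ("deduped") or new
# ("upload") by comparing its md5 against a vector of known hashes.
classify_uploads <- function(pdf_paths, known_md5) {
  md5 <- unname(tools::md5sum(pdf_paths))
  data.frame(
    path = pdf_paths,
    md5 = md5,
    action = ifelse(md5 %in% known_md5, "deduped", "upload"),
    stringsAsFactors = FALSE
  )
}

# Demonstration with two temporary files standing in for PDFs.
a <- tempfile(fileext = ".pdf")
b <- tempfile(fileext = ".pdf")
writeLines("paper A", a)
writeLines("paper B", b)

# Pretend file A is already in storage.
known <- unname(tools::md5sum(a))
plan <- classify_uploads(c(a, b), known)
plan[, c("md5", "action")]
```

Hashing file contents rather than comparing filenames means a PDF fetched from a different mirror is only treated as a duplicate when it is byte-identical.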
Phase 3: adds scripts/rag_precip_drying_methodology_build.R cloning the temp build script with a 7-paper pdf_specs map. Runs in ~25 s on Ollama nomic-embed-text:

    Found 7 / 7 PDFs
    Chunks: 526
    Sources: 7

Phase 4: adds scripts/rag_precip_drying_methodology_query.R running 24 queries across 8 topics and capturing top-5 chunks each to planning/active/precip_drying_methodology_quotes.md (626 lines). Topics: precip trend methodology, anthropogenic precip-extremes attribution, VPD continental-scale drying, VPD ecosystem responses, drought attribution (NA megadrought), drought framework, 20th-century hydroclimate pattern, BC/PNW summer flow.

Phase 5: synthesis in findings.md: per-topic methodology quotes selected from the rag retrieval, cross-cutting methodology section (baseline window + trend test consistent with snow+temp; ERA5-Land precip+soil-moisture validation gap noted with same caveat as #58), no new deviations beyond #58, and a 15-row cite-this-for-that menu mapping vignette claim types to BBT-auto-derived citation keys.

Key capture: all 7 BBT keys generated cleanly after auto-restart of Zotero (osascript -e quit; open -a Zotero; ~30 s wait). Pattern verified working and added to soul#43 alongside the corporate-author guard (cf Pepin 2015) and BBT plugin compat-split note.

Trenberth 2014 BBT key surfaced as trenberth_etal2013Globalwarming because CrossRef issued = 2013-12-17 online; print issue is 2014-01. Leaving as-is per the auto-derived convention.

Refs #61.
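The Phase 4 "top-5 chunks per query" step can be pictured as a nearest-neighbour lookup over chunk embeddings. The sketch below is an assumption about the general technique, not the query script itself: it ranks by cosine similarity in base R, whereas the actual store may use a different metric or library, and the 2-D vectors are toy values rather than nomic-embed-text output.

```r
# Rank chunks by cosine similarity to a query embedding and keep the top k.
# `chunk_embeddings` is a numeric matrix with one row per chunk.
# Note: an all-zero embedding would give a zero norm and a NaN score; this
# sketch does not guard against that case.
top_k_chunks <- function(query_embedding, chunk_embeddings, k = 5) {
  chunk_norms <- sqrt(rowSums(chunk_embeddings^2))
  query_norm <- sqrt(sum(query_embedding^2))
  dots <- as.vector(chunk_embeddings %*% query_embedding)
  scores <- dots / (chunk_norms * query_norm)
  keep <- head(order(scores, decreasing = TRUE), k)
  data.frame(chunk = keep, score = scores[keep])
}

# Toy embeddings, purely illustrative.
chunks <- rbind(
  c(1, 0),
  c(0, 1),
  c(1, 1),
  c(-1, 0)
)
result <- top_k_chunks(c(1, 0), chunks, k = 2)
result
```

With 24 queries and k = 5, a loop over this function would yield at most 120 retrieved chunks for the quotes file, before any manual selection in Phase 5.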
NewGraphEnvironment added a commit that referenced this pull request May 5, 2026
…/3) (#64)

* Initialize PWF baseline for #63
* Phase 1: interpretation framing candidate list (4 new + 6 reuse + 4 cross-rag)
* Phases 2+3+4+5: framing rag store + query + synthesis (with auto-restart)

Phase 2: POSTed 4 candidates to NewGraphEnvironment/climate (8MH9LCC9) via Web API with PDFs attached via S3. CrossRef-driven metadata; tags interpretation-framing-methodology + cd-issue-63. 2 fresh PDF uploads (Arguez & Vose, Hawkins & Sutton), 2 md5-deduped (Hansen 2012, Livezey 2007). Auto-restart fired and all 4 BBT keys captured cleanly:

    arguez_vose2011DefinitionStandard
    livezey_etal2007EstimationExtrapolation
    hawkins_sutton2012Timeemergence
    hansen_etal2012Perceptionclimate

Phase 3: adds scripts/rag_interpretation_framing_build.R cloning the precip+drying build script with a 4-paper pdf_specs map. Runs in ~10 s on Ollama nomic-embed-text:

    Found 4 / 4 PDFs
    Chunks: 291
    Sources: 4

Phase 4: adds scripts/rag_interpretation_framing_query.R running 16 queries across 6 topics (narrower than #58/#61's 24 queries since framing topic surface is smaller). Captures top-5 chunks each to planning/active/interpretation_framing_quotes.md (373 lines). Topics: baseline window methodology, normals when trends exist, time of emergence, cumulative-impact / loaded dice, shifting baseline climate, departure from recent variability.

Phase 5: synthesis in findings.md: per-topic methodology quotes selected from the rag retrieval, cross-cutting methodology section (Hansen 2012's choice of 1951-1980 base period validates cd's choice for cumulative-impact reporting; strongest direct precedent across all three lit reviews), 3 documented deviations (1951-1980 vs WMO 1961-1990 baseline, no AC correction, ToE not quantified per-AOI), and an 11-row cite-this-for-that menu mapping vignette framing claim types to BBT-auto-derived citation keys.
3-split scoreboard added to findings.md: pointer to the four findings.md files (#53/#54 snow, #58/#60 temp, #61/#62 precip+drying, this) for the downstream vignette wire-up branch.

Refs #63.
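The base-period choice discussed in Phase 5 is easiest to see as arithmetic: a departure is a value minus the mean over the chosen baseline years. The following is a minimal sketch on synthetic data, assuming a simple mean-difference definition; `departure_from_baseline()` is a hypothetical name and this is not cd's implementation, which is not shown in this PR.

```r
# Departure of each value from the mean over a base period (inclusive bounds).
departure_from_baseline <- function(years, values,
                                    base_start = 1951, base_end = 1980) {
  in_base <- years >= base_start & years <= base_end
  if (!any(in_base)) {
    stop("no observations fall inside the base period")
  }
  baseline_mean <- mean(values[in_base], na.rm = TRUE)
  values - baseline_mean
}

# Synthetic series: flat at 10 through 1980, then stepped up to 12.
years <- 1951:1990
values <- ifelse(years <= 1980, 10, 12)

dep_1951 <- departure_from_baseline(years, values)
dep_1961 <- departure_from_baseline(years, values, 1961, 1990)

# Same data, different baseline: the later window absorbs part of the step,
# so the reported departures after 1980 shrink.
tail(dep_1951, 1)
tail(dep_1961, 1)
```

On this toy series the 1951-1980 baseline reports the full step as departure, while a 1961-1990 window folds a third of the shifted years into the reference mean, which is the practical reason the baseline deviation is worth documenting.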
Summary
- … `[@key]` markers.
- 7 new papers POSTed to the `NewGraphEnvironment/climate` Zotero collection with PDFs attached: Williams 2020 (NA megadrought attribution), Ficklin & Novick 2017 (VPD US continental-scale drying), Grossiord 2020 (plant responses to rising VPD), Trenberth 2014 (drought framework), Min 2011 (anthropogenic precip extremes), Mekis & Vincent 2011 (adjusted Canadian precip dataset), Marvel 2019 (20th-century hydroclimate signal). Plus 5 reuse-relevant existing climate-collection items + 2 cross-rag references from snow + temperature methodology stores.
- Rag store `data/rag/precip_drying_methodology.duckdb` (526 chunks, 7 sources, ~25 s ingest via Ollama nomic-embed-text), built by the new `scripts/rag_precip_drying_methodology_build.R`. 24-query mining via `scripts/rag_precip_drying_methodology_query.R` produced `planning/active/precip_drying_methodology_quotes.md` (626 lines), synthesized into `findings.md` with methodology-quotes-by-topic, cross-cutting methodology, and a 15-row "cite this for that" menu (BBT-auto-derived keys ready for downstream `[@key]` insertion).
- No `Citation Key:` overrides in the `extra` field (BBT auto-derives per `soul#43`); all 7 papers had ≥2 individual creators per CrossRef so no Pepin-style corporate-author PATCH was needed. macOS auto-restart automated via `osascript -e 'tell application "Zotero" to quit'; sleep 3; open -a Zotero; sleep 30`, verified working for 7 items, ~30 s wait sufficient. Pattern documented in `soul#43`.

Headline finding: the v0.1.1 vignette finding that "soils dry from both ↓P and ↑ET" is now backed by Ficklin & Novick 2017 (VPD US continental-scale drying), Williams 2020 (anthropogenic warming → NA megadrought, 47% of drought severity), and Trenberth 2014 (proper PDSI methodology with Penman-Monteith ET). The atmospheric-evaporative-demand half of the drying story is well-grounded in this corpus.
Relates to NewGraphEnvironment/sred-2025-2026#23.
Fixes #61.
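For callers working from R, the macOS restart sequence quoted in the summary could be wrapped as below. This is a sketch only: `restart_zotero()` is a hypothetical helper, the two shell commands and the 3 s / 30 s waits are copied from the summary, and the default `dry_run = TRUE` just returns the commands so nothing is executed by accident.

```r
# Hypothetical wrapper around the quit / reopen sequence quoted in the summary.
# With dry_run = TRUE (the default) the commands are returned, not executed.
restart_zotero <- function(quit_wait = 3, start_wait = 30, dry_run = TRUE) {
  cmds <- c(
    quit = "osascript -e 'tell application \"Zotero\" to quit'",
    open = "open -a Zotero"
  )
  if (dry_run) {
    return(cmds)
  }
  if (Sys.info()[["sysname"]] != "Darwin") {
    stop("this sequence relies on osascript and open, which are macOS tools")
  }
  system(cmds[["quit"]])
  Sys.sleep(quit_wait)   # give the app time to shut down
  system(cmds[["open"]])
  Sys.sleep(start_wait)  # ~30 s was reported as sufficient for 7 items
  invisible(cmds)
}

cmds <- restart_zotero(dry_run = TRUE)
print(cmds)
```

The fixed waits mirror what was verified for 7 items; a larger batch may need a longer `start_wait`, which is why both delays are exposed as arguments.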
Test plan
- `devtools::test()` clean on `61-precip-drying-lit-review` (verified: 166 PASS, 0 FAIL)
- `lintr::lint()` on new scripts → 0 lints (verified)
- `Rscript scripts/rag_precip_drying_methodology_build.R` reproduces the store (7 sources, ~526 chunks) given the local PDF cache populated from Zotero
- `Rscript scripts/rag_precip_drying_methodology_query.R` regenerates `planning/active/precip_drying_methodology_quotes.md`
- … `findings.md` Phase 2 table resolve to items in `NewGraphEnvironment/climate` (verified via local sqlite)
- … `planning/active/task_plan.md` match landed work

🤖 Generated with Claude Code