Resolve R1-R8 desert_farm_leverage_points + add BioNumbers subset #26
Merged
R1: CO2 fixation Time_min 7.69E-03 → 6.67E-02 (Bar-On natural kcat ceiling, 15/s)
R2: Replace Moore et al. 2013 mis-citations on Nutrient transport, Biochemical
synthesis, and Cell growth with Milo & Phillips (2015) Cell Biology by
the Numbers
R3: Rename Growth → Cell growth; tighten bounds to single algal cell scope
(Time 1e3-1e5 s, Space 1e-18 to 1e-14 m³). Population scales remain
covered by Community Ecology row.
R4: MD Space_max 1e-27 → 1e-22 m³ (modern atomistic-MD reach)
R5: Community Metabolic Models Time_min 99 → 1.00E+02 (formatting consistency)
R6: Fill empty Reference cells:
- Molecular Dynamics Models: Karplus & McCammon (2002) Nat Struct Biol 9:646
- Community Metabolic Models: Zakem et al. (2020) ISME J 14:288
- Biogeochemical Circulation Models: Levine et al. (2025) Annu Rev
Earth Planet Sci 53:595
R7: Rename Extraction → Fossil-fuel formation; Time bounds 3.16E+13 to
3.16E+15 s reflect formation duration rather than deposit age
R8: Extend CO2 fixation Reference text to justify both time and space bounds
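The unit arithmetic behind R1 and R7 can be checked with a short sketch. The `sci` helper is hypothetical. Reading the R7 bounds as 1e6 and 1e8 years is an assumption that happens to reproduce the stated values; the commit message does not say so.

```python
# Julian year in seconds (about 3.16e7 s).
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def sci(x: float) -> str:
    """Format a number in the CSV's scientific style, e.g. 6.67E-02."""
    return f"{x:.2E}"


# R1: at kcat = 15 turnovers per second, one turnover takes 1/15 s.
r1_time_min = sci(1 / 15)

# R7 (assumption): formation durations of 1e6 and 1e8 years, in seconds.
r7_time_min = sci(1e6 * SECONDS_PER_YEAR)
r7_time_max = sci(1e8 * SECONDS_PER_YEAR)

print(r1_time_min, r7_time_min, r7_time_max)
```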
25 entries justifying time/space bounds for the Cell growth (8), Biochemical synthesis (8), and Nutrient transport (9) rows. Phototroph-specific where available; generic-organism proxies (E. coli, Xenopus, generic) are used where phototroph data was unavailable. Each entry includes a direct URL to its BioNumbers page.
Documents the schema, curation criteria, known gaps (including the Cell growth Time_min mismatch flagged by phototroph data), update procedure, and BioNumbers attribution per Milo et al. (2010) Nucleic Acids Res 38:D750.
The Time bound for cell division varies by ~3 orders of magnitude across phyla. The previous 'Cell growth' label was silently scoped to phototrophs, since Time_min was set from phototroph cell-cycle data. Renaming the row to Photoautotroph cell growth makes that implicit organism scope explicit. Time_min 1.00E+03 → 7.56E+03 s (2.1 h, Synechococcus elongatus UTEX 2973, currently the fastest known photoautotroph). Anchored to Yu et al. (2015) Sci Rep 5:8132. Reference cell extended to cite both Milo & Phillips and Yu et al.
Sibling file to bionumbers_subset.csv for entries from primary literature where BioNumbers either has no value (NaN) or no entry. Initial entry: Synechococcus elongatus UTEX 2973 doubling time 2.1 h from Yu et al. (2015) Sci Rep 5:8132 — anchors the Time_min for the Photoautotroph cell growth row. BioNumbers entry 112484 exists for this strain but has NaN Value, hence the supplementary citation.
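A minimal sketch of the BioNumbers-NaN → supplementary check described above. The column names other than `bion_id` and `Organism`, the placeholder row, and the `needs_supplement` helper are hypothetical; only entry 112484 and its NaN Value come from the commit message.

```python
import csv
import io
import math

# Hypothetical excerpt of a BioNumbers-style CSV. The second row is a placeholder.
SAMPLE = """bion_id,Organism,Property,Value,Units
112484,Synechococcus elongatus UTEX 2973,Doubling time,NaN,h
000000,Placeholder organism,Doubling time,5.0,h
"""


def needs_supplement(rows):
    """Return bion_ids whose Value is empty, unparseable, or NaN."""
    missing = []
    for row in rows:
        raw = (row.get("Value") or "").strip()
        try:
            value = float(raw)
        except ValueError:
            value = math.nan
        if math.isnan(value):
            missing.append(row["bion_id"])
    return missing


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
missing_ids = needs_supplement(rows)
print(missing_ids)
```

Entries flagged this way are the ones that would need a primary-literature value, as with the 2.1 h doubling time from Yu et al. (2015).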
- Add phototroph_growth_supplementary.csv documentation and schema
- Remove the 'Cell growth Time_min mismatch' known gap (now resolved by anchoring to UTEX 2973 in supplementary file)
- Update curation criteria to reflect Photoautotroph cell growth rename and the BioNumbers-NaN → supplementary workflow
- Restructure 'Known gaps' to reflect current state
Matches the R3 rename in desert_farm_leverage_points.csv. The supplementary file already used the new name; this aligns the BioNumbers subset so the Cited_in_row join key works for all 26 entries.
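The join-key alignment can be illustrated with a small consistency check. This is a sketch only: the `orphaned_keys` helper and the sample lists are hypothetical, and the row names are taken from the commit messages rather than read from the actual files.

```python
def orphaned_keys(main_rows, cited_rows):
    """Return Cited_in_row values that match no row name in the main CSV."""
    main = set(main_rows)
    return sorted({name for name in cited_rows if name not in main})


# Row names in the main CSV after the rename (illustrative subset).
main_rows = [
    "Nutrient transport",
    "Biochemical synthesis",
    "Photoautotroph cell growth",
]

# Cited_in_row values before and after aligning the subset with the rename.
before = ["Nutrient transport", "Cell growth", "Biochemical synthesis"]
after = ["Photoautotroph cell growth" if n == "Cell growth" else n for n in before]

print(orphaned_keys(main_rows, before))
print(orphaned_keys(main_rows, after))
```

A stale name shows up as an orphaned key before the alignment and disappears after it.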
Discoverable cross-link from main CSV to data/references/ for the rows backed by curated BioNumbers / primary-literature data: Nutrient transport, Biochemical synthesis, Photoautotroph cell growth. Non-breaking — Reference column is free-form display text.
Per Madison's clarification: BioNumbers data is meant as a standalone resource for future work, not as a backing-data layer for the main CSV.
Changes:
- Drop (see data/references/) markers from 3 Reference cells in desert_farm_leverage_points.csv (main CSV stands alone)
- Drop Cited_in_row column from bionumbers_subset.csv (no longer a join key into the main CSV)
- Delete phototroph_growth_supplementary.csv (its only entry was UTEX 2973 / Yu 2015 / 2.1h doubling, already inline-cited in the main CSV's Photoautotroph cell growth Reference cell)
- Rewrite README.md to describe bionumbers_subset.csv as a standalone phototroph reference set, removing all cross-linking schema docs
Resolves R1–R8 data concerns + adds a standalone curated BioNumbers subset.
Supersedes #25 (which fell behind main during the BioNumbers addition; rebuilt fresh from current main per branch-hygiene rule).
Changes to data/datasets/desert_farm_leverage_points.csv:
- CO2 fixation: Time_min 7.69e-3 → 6.67e-2 (Bar-On natural RuBisCO kcat ceiling, 15/s)
- Population scales remain covered by the Community Ecology row
- Molecular Dynamics Models: Space_max 1e-27 → 1e-22 m³ (modern atomistic MD reaches 100-nm boxes)
- Community Metabolic Models: Time_min 99 → 1.00E+02

New: data/references/ (standalone resource)
Curated phototroph reference data, not coupled to the main CSV; kept as a resource for future work.
- bionumbers_subset.csv: 25 entries from the BioNumbers database with stable bion_id identifiers and direct URLs. Phototroph-relevant (cyanobacteria, green algae, diatoms) across cell generation/doubling times, biochemical synthesis rates (transcription/translation elongation), and small-molecule diffusion / transporter kinetics. Generic-organism proxies (e.g., E. coli, Xenopus) used where phototroph-specific entries were unavailable; flagged per-row in Organism.
- README.md: schema, scope, attribution.