
Shorten w3id prefix, stage redirect, dogfood #30 on SPRUCE#79

Merged
realmarcin merged 2 commits into main from w3id-and-spruce-dogfood on May 23, 2026


@realmarcin
Contributor

Summary

  • Closes #12 (Register w3id.org redirects for communitymech CURIEs): stages the upstream w3id.org redirect for https://w3id.org/communitymech/ under w3id/communitymech/.htaccess and shortens the LinkML schema prefix from https://w3id.org/culturebot-ai/communitymech/ to https://w3id.org/communitymech/ so that the registered redirect resolves CommunityMech CURIEs. The generated Python datamodel and SPARQL examples are updated to match.
  • First dogfood of the cross-repo linking schema from #30 (Enhance cross-repository environmental linking with CultureMech and MediaIngredientMech) on the SPRUCE peatland community: declares humic-substance electron acceptors and acetate as related_ingredients, anchored to existing PMID:40715043 evidence. No MediaIngredientMech IDs are minted yet, so this exercises the schema using preferred_term + chebi_term + evidence only, which is the dogfood pattern recommended in docs/cross_repo_linking.md.
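
To illustrate why the prefix change matters for CURIE resolution, here is a minimal sketch of CURIE expansion under the old and new prefix IRIs. The two IRIs are the ones quoted above; the `expand_curie` helper and the local name `ExampleCommunity` are hypothetical and are not taken from the repository.

```python
# Prefix IRIs quoted in the summary above.
OLD_PREFIX = "https://w3id.org/culturebot-ai/communitymech/"
NEW_PREFIX = "https://w3id.org/communitymech/"


def expand_curie(curie: str, prefixes: dict[str, str]) -> str:
    """Expand 'prefix:local' into a full IRI using a prefix map."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"cannot expand CURIE: {curie!r}")
    return prefixes[prefix] + local


# 'ExampleCommunity' is an illustrative local name, not a real record.
curie = "communitymech:ExampleCommunity"
old_iri = expand_curie(curie, {"communitymech": OLD_PREFIX})
new_iri = expand_curie(curie, {"communitymech": NEW_PREFIX})

print(old_iri)
print(new_iri)
```

Only the second IRI falls under https://w3id.org/communitymech/, which is the path the staged redirect is meant to cover.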

Files

| Path | Change |
|---|---|
| src/communitymech/schema/communitymech.yaml | Prefix shortened (id: line 1, prefixes: line 14) |
| src/communitymech/datamodel/communitymech.py | Regenerated from updated schema |
| docs/cross_repo_linking.md | SPARQL PREFIX examples updated |
| kb/communities/SPRUCE_Peatland_Methane_Cycling_Community.yaml | related_ingredients block added (humic substances + acetate) |
| w3id/communitymech/.htaccess | New: upstream redirect rules for perma-id/w3id.org |
| w3id/README.md | New: submission steps and target table |

Upstream w3id submission (manual, not part of this PR)

  1. Fork perma-id/w3id.org.
  2. Copy w3id/communitymech/ from this PR into the fork root.
  3. Open a PR there per their CONTRIBUTING.md.

Resolution targets:

| URL | Target |
|---|---|
| https://w3id.org/communitymech/ | https://culturebotai.github.io/CommunityMech/ |
| https://w3id.org/communitymech/<Name> | https://culturebotai.github.io/CommunityMech/communities/<Name>.html |
| https://w3id.org/communitymech/<Name>.yaml | raw YAML on GitHub |
| https://w3id.org/communitymech/schema/communitymech.yaml | raw schema on GitHub |
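
The resolution targets above can be modeled as a small function, which may help when sanity-checking the redirect rules before the upstream submission. This is a sketch of the mapping only, not the contents of the .htaccess file. `RAW_BASE` is a hypothetical placeholder because the table does not spell out the raw GitHub URL, and the repo-relative paths for raw files are assumptions based on the Files table.

```python
W3ID_BASE = "https://w3id.org/communitymech/"
PAGES_BASE = "https://culturebotai.github.io/CommunityMech/"
# Hypothetical placeholder: the table only says "raw ... on GitHub",
# so the real raw-content base URL is deliberately not filled in.
RAW_BASE = "https://raw.example.invalid/"


def resolve(url: str, raw_base: str = RAW_BASE) -> str:
    """Return the target a w3id.org/communitymech/ URL should resolve to."""
    if not url.startswith(W3ID_BASE):
        raise ValueError(f"not a communitymech w3id URL: {url!r}")
    path = url[len(W3ID_BASE):]
    if path == "":
        return PAGES_BASE
    # Most specific rule first, so the schema path is not treated as a community.
    if path == "schema/communitymech.yaml":
        # Assumed repo-relative location, taken from the Files table.
        return raw_base + "src/communitymech/schema/communitymech.yaml"
    if path.endswith(".yaml"):
        # Assumed repo-relative location of community files (kb/communities/).
        return raw_base + "kb/communities/" + path
    return PAGES_BASE + "communities/" + path + ".html"


print(resolve(W3ID_BASE + "SPRUCE_Peatland_Methane_Cycling_Community"))
```

Rule order matters in the sketch: the schema path is checked before the generic `.yaml` rule, and redirect rules written in a different order could route the schema URL to the communities directory instead.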

#30 Phase 1 status (separate finding, not in this PR)

The schema work for #30 Phases 1 and 2 is already complete in the repo: MediaRelationshipEnum, RelatedMedia, RelatedIngredient, the related_media / related_ingredients slots, full tests (tests/test_cross_repo_linking.py — 30 passing), and reference docs all exist. Adoption was at 0 communities before this PR; this PR brings it to 1 (SPRUCE).
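
The adoption figure can be read as "communities with a non-empty related_ingredients block out of all communities". A minimal sketch of that count over already-parsed community records follows. The `name` key and the two placeholder community names are assumptions for illustration; only the SPRUCE entry and the `related_ingredients` / `preferred_term` names come from this PR.

```python
def adoption(communities: list[dict]) -> tuple[int, int]:
    """Count communities that declare at least one related ingredient."""
    adopted = sum(1 for c in communities if c.get("related_ingredients"))
    return adopted, len(communities)


# Illustrative records; 'Placeholder_A' and 'Placeholder_B' are hypothetical.
sample = [
    {
        "name": "SPRUCE_Peatland_Methane_Cycling_Community",
        "related_ingredients": [{"preferred_term": "acetate"}],
    },
    {"name": "Placeholder_A"},
    {"name": "Placeholder_B", "related_ingredients": []},
]

adopted, total = adoption(sample)
print(f"{adopted}/{total}")
```

An empty list is treated the same as a missing key, so a community only counts as adopted once it declares at least one ingredient.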

Test plan

  • just gen-python regenerates datamodel cleanly with new prefix
  • just validate kb/communities/SPRUCE_Peatland_Methane_Cycling_Community.yaml — passes
  • just validate-references kb/communities/SPRUCE_Peatland_Methane_Cycling_Community.yaml — passes (PMID:40715043 cached)
  • uv run pytest — 121 passed, 9 skipped, 7 deselected (no regressions vs main)
  • After merge: open upstream PR to perma-id/w3id.org following w3id/README.md

🤖 Generated with Claude Code

Closes #12 by staging the upstream w3id.org redirect for
https://w3id.org/communitymech/ (apply by copying w3id/communitymech/
into a fork of perma-id/w3id.org). The LinkML schema prefix is
shortened from /culturebot-ai/communitymech/ to /communitymech/ so the
registered redirect resolves CommunityMech CURIEs, and the generated
datamodel and SPARQL examples are updated to match.

Also adds the first dogfood use of #30's cross-repo linking schema:
SPRUCE peatland community now declares humic-substance electron
acceptors and acetate as related_ingredients, anchored to existing
PMID:40715043 evidence. No MediaIngredientMech IDs minted yet; this
exercises the schema using preferred_term + CHEBI + evidence only,
which is the dogfood pattern recommended in cross_repo_linking.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 23, 2026 20:47

Copilot AI left a comment

Pull request overview

This PR aligns CommunityMech identifiers with an intended w3id.org permanent prefix by shortening the LinkML schema prefix to https://w3id.org/communitymech/, and stages the corresponding upstream w3id.org redirect rules. It also “dogfoods” the cross-repo linking schema (#30) by adding related_ingredients to the SPRUCE community and updates documentation/examples accordingly.

Changes:

  • Shorten the LinkML schema id/communitymech prefix to https://w3id.org/communitymech/ and regenerate the Python datamodel.
  • Add staged w3id.org redirect configuration (w3id/communitymech/.htaccess) plus submission README.
  • Add related_ingredients (humic substances, acetate) to the SPRUCE peatland community and update SPARQL docs to use the new prefix.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| w3id/README.md | Documents the planned upstream w3id.org registration and redirect targets. |
| w3id/communitymech/.htaccess | Adds redirect rules mapping w3id.org/communitymech/* to GitHub Pages and raw GitHub content. |
| src/communitymech/schema/communitymech.yaml | Updates schema id and communitymech prefix to the shortened w3id.org/communitymech base. |
| src/communitymech/datamodel/communitymech.py | Regenerated datamodel reflecting the new schema identifier/prefix. |
| kb/communities/SPRUCE_Peatland_Methane_Cycling_Community.yaml | Adds related_ingredients entries with evidence anchored to PMID:40715043. |
| docs/cross_repo_linking.md | Updates SPARQL examples to use the new cm: prefix IRI. |

Comment thread src/communitymech/datamodel/communitymech.py Outdated
Addresses Copilot review feedback on #79: the regenerated LinkML
datamodel didn't match the repo's black formatting (line length 100,
default string normalization), so `just lint`'s `black --check` would
flag it. Running `just format` reformats only this file; the rest of
src/ and tests/ is untouched and `black --check src/ tests/` now
reports all 46 files clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@realmarcin realmarcin merged commit 84af7bd into main May 23, 2026
@realmarcin realmarcin deleted the w3id-and-spruce-dogfood branch May 23, 2026 21:19
realmarcin added a commit that referenced this pull request May 24, 2026
* #30 backfill batch 2: 4 metals + 3 gut/rhizosphere communities

Continues the SPRUCE/wetland dogfood pattern from PRs #79/#80/#81. Each
entry uses CHEBI terms with snippets taken verbatim from cached
PMID/DOI abstracts; no cross-repo IDs (MIM IDs haven't been minted).

AMD/biomining/REE (4 of 16 remaining):

| Community | Ingredients | Source |
|---|---|---|
| Cyprus_Copper_Sulphide_Bioleaching_Consortium | chalcopyrite (Cu(II) surrogate), chalcocite (Cu(I) sulfide), iron(2+) | PMID:41381092 |
| Ferroplasma_Leptospirillum_Syntrophy | iron(2+), pyrite | PMID:16104851 |
| Iberian_Pit_Lake_Stratified_Community | sulfate, iron(2+) | PMID:23840525 |
| Ewaste_Bioleaching_Consortium | glycine (10 g/L cyanide substrate), hydrogen cyanide (gold lixiviant) | PMID:26704063 |

Gut/rhizosphere (3 of ~13 remaining):

| Community | Ingredients | Source |
|---|---|---|
| Bacteroides_Eubacterium_Gnotobiotic_Gut_Model | acetate, butyrate, host-derived mucin glycans | PMID:19321416 |
| Brachypodium_Young_Root_Rhizosphere_EcoFAB_Community | root exudates, labile root carbon | PMID:37280433 |
| ORNL_PMI_Populus_PD10_SynCom | glucose (minimal-medium axis) | PMID:33995895 |

#30 related_ingredients adoption: 12/265 -> 19/265.

Test plan: just test (136 passed), all 7 modified files validate
clean against the schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Address Copilot review on #83

Five findings, all valid:

1. Cyprus chalcopyrite was mapped to CHEBI:30074 / "copper(2+)", which
   is wrong on both axes. Updated to CHEBI:50885 / "chalcopyrite" —
   the mapping the repo already uses (Copper_Biomining_Heap_Leach
   metabolites).

2. Ewaste cyanide entry's `chebi_term.label` said "hydrogen cyanide"
   but CHEBI:17514's canonical label is "cyanide". Aligned label.

3. Ewaste cyanide entry's snippet ("This gold complexing agent was
   used…") did not literally mention cyanide. Replaced with the more
   direct adjacent abstract sentence ("cyanide-producing heterotrophic
   Pseudomonas fluorescens and Pseudomonas putida were used") and
   moved the gold-complexing context into the explanation field.

4. Iberian Pit Lake relevance text described an Fe(II)/Fe(III) cycle
   across the chemocline but only iron(2+) was listed. Added a
   separate iron(3+) related_ingredient with its own snippet
   anchoring the bottom-layer iron-reducing guild
   (Acidiphilium / Ferroplasma / Acidithiobacillus ferrooxidans in
   reducing mode); split the original Fe(II) relevance text to
   reference only the oxidising guild.

5. Ewaste "gold-mobilisation" -> "gold-mobilization" for spelling
   consistency with the rest of the repo (American spelling).

136 tests still pass; all 3 modified YAMLs validate clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
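
Findings 2 and 3 above come down to one check: an evidence snippet should literally mention the CHEBI label it is anchored to. A minimal sketch of that check follows, assuming an entry shape with `chebi_term.label` and an `evidence` list carrying `reference` and `snippet` keys. The exact nesting in the repo's YAML is not shown in this thread, so the `reference` and `snippet` key names and the nesting are assumptions. The snippet strings are the ones quoted in the commit message.

```python
def snippet_mentions_label(entry: dict) -> bool:
    """True if every evidence snippet contains the CHEBI label (case-insensitive)."""
    label = entry["chebi_term"]["label"].lower()
    return all(label in ev["snippet"].lower() for ev in entry["evidence"])


def cyanide_entry(snippet: str) -> dict:
    # Assumed entry shape; key names other than chebi_term.label are illustrative.
    return {
        "preferred_term": "cyanide",
        "chebi_term": {"id": "CHEBI:17514", "label": "cyanide"},
        "evidence": [{"reference": "PMID:26704063", "snippet": snippet}],
    }


before = cyanide_entry("This gold complexing agent was used…")
after = cyanide_entry(
    "cyanide-producing heterotrophic Pseudomonas fluorescens "
    "and Pseudomonas putida were used"
)

print(snippet_mentions_label(before), snippet_mentions_label(after))
```

A literal substring match is deliberately strict. It would also flag snippets that refer to a compound only by a synonym, which may or may not be desired for other entries.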