Merged
27 changes: 27 additions & 0 deletions docs/cross_repo_linking.md
Original file line number Diff line number Diff line change
Expand Up @@ -299,6 +299,8 @@ All new fields are optional:

## Validation

### Schema-level tests

Run the cross-repo linking tests:

```bash
Expand All @@ -311,6 +313,31 @@ Test data files are in `tests/data/test_cross_repo_linking/`:
- `community_no_links.yaml` -- Backward compatibility
- `community_all_relationship_types.yaml` -- All 5 enum values

### Cross-repo ID validator

`just validate-cross-repo-ids FILE` checks that `culturemech_id` /
`mediaingredientmech_id` values match their CURIE patterns and, when
sibling-repo paths are configured, that the referenced IDs actually
exist in those repos.

```bash
# Pattern check only (no sibling-repo paths)
just validate-cross-repo-ids kb/communities/SPRUCE_Peatland_Methane_Cycling_Community.yaml

# Pattern + existence check across all community files
COMMUNITYMECH_SIBLING_REPOS="CultureMech=../CultureMech/kb/media,MediaIngredientMech=../MediaIngredientMech/kb/ingredients" \
just validate-cross-repo-ids-all
```
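
As a rough illustration of the `Name=path,Name=path` format used by `COMMUNITYMECH_SIBLING_REPOS`, the value could be parsed along these lines. This is a minimal sketch, not the code in `scripts/validate_cross_repo_ids.py`: the function name `parse_sibling_repos` and its error handling are assumptions made for the example.

```python
from __future__ import annotations

import os
from pathlib import Path


def parse_sibling_repos(raw: str) -> dict[str, Path]:
    """Parse 'Name=path,Name=path' into a {name: Path} mapping.

    Hypothetical helper: the real validator may parse this differently.
    """
    repos: dict[str, Path] = {}
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty segments such as a trailing comma
        name, sep, path = entry.partition("=")
        if not sep or not name.strip() or not path.strip():
            raise ValueError(f"expected Name=path, got {entry!r}")
        repos[name.strip()] = Path(path.strip())
    return repos


if __name__ == "__main__":
    # An unset variable yields an empty mapping, i.e. pattern checks only.
    print(parse_sibling_repos(os.environ.get("COMMUNITYMECH_SIBLING_REPOS", "")))
```

With the value from the example above, the mapping would contain a `CultureMech` entry pointing at `../CultureMech/kb/media` and a `MediaIngredientMech` entry pointing at `../MediaIngredientMech/kb/ingredients`.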

The validator returns:
- `error` for malformed CURIEs or IDs missing from a configured sibling repo
- `info` for IDs whose existence check was skipped because the relevant
sibling-repo path wasn't configured
- nothing if a community has no cross-repo IDs at all
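
The severity rules above can be sketched as a small decision function. This is an illustration under stated assumptions, not the validator's actual implementation: `check_id` and `CULTUREMECH_ID` are hypothetical names, the six-digit pattern is inferred from IDs such as `CultureMech:000139` seen in the community files (the authoritative pattern lives in the schema), and returning nothing for an ID that passes both checks is also an assumption.

```python
import re
from typing import Optional, Set, Tuple

# Assumed pattern, inferred from example IDs like CultureMech:000139.
CULTUREMECH_ID = re.compile(r"^CultureMech:\d{6}$")


def check_id(
    curie: str,
    pattern: "re.Pattern[str]",
    known_ids: Optional[Set[str]],
) -> Optional[Tuple[str, str]]:
    """Return (severity, message) for one cross-repo ID, or None if it passes.

    known_ids is None when no sibling-repo path was configured.
    """
    if not pattern.match(curie):
        return ("error", f"{curie}: malformed CURIE")
    if known_ids is None:
        return ("info", f"{curie}: existence check skipped (no sibling-repo path)")
    if curie not in known_ids:
        return ("error", f"{curie}: not found in configured sibling repo")
    return None  # assumption: a well-formed, existing ID produces no finding
```

A community with no cross-repo IDs never reaches this function, which matches the "nothing" case in the list.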

Sibling-repo paths can also be passed via `--culturemech` /
`--mediaingredientmech` flags to `scripts/validate_cross_repo_ids.py`.

## See Also

- [Growth Media Linking](media_linking.md) -- Existing cultivation-based linking
10 changes: 10 additions & 0 deletions justfile
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,16 @@ validate-references-all:
uv run linkml-reference-validator validate data "$file" -s src/communitymech/schema/communitymech.yaml --config conf/reference_validator.yaml
done

# Validate cross-repo IDs (CultureMech, MediaIngredientMech) in one community file.
# Pattern checks always run; existence checks run when sibling-repo paths are
# configured via COMMUNITYMECH_SIBLING_REPOS env (Name=path,Name=path).
validate-cross-repo-ids FILE:
PYTHONPATH=src uv run python scripts/validate_cross_repo_ids.py {{FILE}}

# Validate cross-repo IDs across all community files.
validate-cross-repo-ids-all:
PYTHONPATH=src uv run python scripts/validate_cross_repo_ids.py kb/communities/*.yaml

# Validate ontology terms in a community file
validate-terms FILE:
uv run linkml-term-validator validate-data {{FILE}} -s src/communitymech/schema/communitymech.yaml --labels
1 change: 0 additions & 1 deletion kb/communities/AMD_Acidophile_Heterotroph_Network.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -765,7 +765,6 @@ metals_present:
- COPPER
- GOLD
- IRON
- TITANIUM
metal_relevance: PRIMARY
metal_notes: Metal/REE detected via CHEBI terms in metabolites; Metal/REE detected via environmental factor
measurements; Metal/REE detected via keyword matching in description (context-validated)
2 changes: 0 additions & 2 deletions kb/communities/AMD_Nitrososphaerota_Archaeal.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -699,9 +699,7 @@ environmental_factors:
explanation: Demonstrates value of genomic data for understanding archaeal adaptations
metals_present:
- COPPER
- GOLD
- IRON
- TITANIUM
metal_relevance: PRIMARY
metal_notes: Metal/REE detected via CHEBI terms in metabolites; Metal/REE detected via environmental factor
measurements; Metal/REE detected via keyword matching in description (context-validated)
Original file line number Diff line number Diff line change
Expand Up @@ -179,6 +179,50 @@ environmental_factors:
snippet: contributions in soil ecosystems remain unknown
explanation: Supports terrestrial soil as the focus environment.
growth_media: []
related_ingredients:
- preferred_term: acetate
chebi_term:
id: CHEBI:30089
label: acetate
relevance: Asgard archaeal acetogens in this wetland soil generate acetate
from carbohydrate breakdown via the Wood-Ljungdahl pathway; acetate is
therefore both the headline output of Asgard metabolism in this community
and the substrate that feeds the co-resident acetoclastic methanogens.
evidence:
- reference: PMID:39085194
supports: SUPPORT
evidence_source: COMPUTATIONAL
snippet: carbohydrate breakdown to acetate and formate
explanation: Anchors acetate as the central Asgard metabolic output that
modulates downstream methanogenesis substrates in wetland soil.
- preferred_term: formate
chebi_term:
id: CHEBI:15740
label: formate
relevance: Formate co-produced with acetate is a major C1 substrate for
hydrogenotrophic and formate-utilizing methanogens, defining a second
Asgard-mediated methanogenesis-substrate channel in this community.
evidence:
- reference: PMID:39085194
supports: SUPPORT
evidence_source: COMPUTATIONAL
snippet: carbohydrate breakdown to acetate and formate
explanation: Anchors formate as the second Asgard-derived methanogenesis
substrate alongside acetate.
- preferred_term: dihydrogen
chebi_term:
id: CHEBI:18276
label: dihydrogen
relevance: Expression of [NiFe]-hydrogenases by both Atabeyarchaeia and
Freyarchaeia genomes implicates H2 cycling as a core Asgard activity in
this wetland soil; any cultivation medium designed around the community
would need an H2 headspace.
evidence:
- reference: PMID:39085194
supports: SUPPORT
evidence_source: COMPUTATIONAL
snippet: expression of genes for [NiFe]-hydrogenases
explanation: Anchors H2 cycling as an in situ expressed Asgard activity.
external_resources:
- name: Primary publication for the Asgard wetland soil methanogenesis-substrate
community
3 changes: 1 addition & 2 deletions kb/communities/At_RSPHERE_SynCom.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -427,8 +427,7 @@ environmental_factors:
plant-microbe interactions, and synthetic community design

'
metals_present:
- TITANIUM
metals_present: []
metal_relevance: INCIDENTAL
metal_notes: Metal/REE detected via environmental factor measurements; Metal/REE detected
via keyword matching in description (context-validated)
2 changes: 0 additions & 2 deletions kb/communities/Australian_Lead_Zinc_Polymetallic.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -966,10 +966,8 @@ environmental_factors:
explanation: Documents long-term weathering profile development
metals_present:
- COPPER
- GOLD
- IRON
- LEAD
- TITANIUM
- ZINC
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via CHEBI terms in metabolites; Metal/REE detected via environmental factor
2 changes: 0 additions & 2 deletions kb/communities/Bayan_Obo_REE_Tailings_Consortium.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -544,9 +544,7 @@ environmental_factors:
was narrated for plausible real-world use
explanation: Describes REE mineralogy at Bayan Obo
metals_present:
- GOLD
- IRON
- TITANIUM
rare_earth_elements_present:
- CERIUM
- LANTHANUM
3 changes: 1 addition & 2 deletions kb/communities/Chlamydomonas_Bacterial_H2_Consortium.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -351,7 +351,6 @@ growth_media:
explanation: Establishes mannitol and yeast extract as key medium components for sustained H2 production
culturemech_id: CultureMech:000139
culturemech_url: https://github.com/CultureBotAI/CultureMech/tree/main/kb/media/CultureMech:000139
metals_present:
- TITANIUM
metals_present: []
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements
Original file line number Diff line number Diff line change
Expand Up @@ -327,6 +327,5 @@ growth_media:
culturemech_url: https://github.com/CultureBotAI/CultureMech/tree/main/kb/media/CultureMech:000139
metals_present:
- IRON
- TITANIUM
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements
1 change: 0 additions & 1 deletion kb/communities/Chlorella_Rhizobium_Bioflocculation.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -234,7 +234,6 @@ environmental_factors:
explanation: Co-culture timing optimized for maximum harvesting efficiency
metals_present:
- IRON
- TITANIUM
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements; Metal/REE detected via keyword
matching in description (context-validated)
1 change: 0 additions & 1 deletion kb/communities/Chromium_Sulfur_Reduction_Enrichment.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -610,7 +610,6 @@ environmental_factors:
metals_present:
- CHROMIUM
- IRON
- TITANIUM
metal_relevance: PRIMARY
metal_notes: Metal/REE detected via environmental factor measurements; Metal/REE detected via keyword
matching in description (context-validated)
3 changes: 1 addition & 2 deletions kb/communities/Cinnamate_Degradation_Consortium.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -182,8 +182,7 @@ environmental_factors:
complete mineralization to methane

'
metals_present:
- TITANIUM
metals_present: []
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements; Metal/REE detected via keyword
matching in description (context-validated)
Original file line number Diff line number Diff line change
Expand Up @@ -257,6 +257,54 @@ environmental_factors:
snippet: emissions of carbon dioxide, methane, and nitrous oxide
explanation: Supports greenhouse gas endpoints.
growth_media: []
related_ingredients:
- preferred_term: sulfate
chebi_term:
id: CHEBI:16189
label: sulfate
relevance: Sulfate is the explicitly manipulated ion in this microcosm
experiment; an environment-analog cultivation medium for this community
would need sulfate as a controllable variable rather than a fixed
background anion, since the study disentangles sulfate effects from other
seawater ions on the community.
evidence:
- reference: PMID:38628812
supports: SUPPORT
evidence_source: IN_VIVO
snippet: tease apart the effects of sulfate from other seawater ions
explanation: Anchors sulfate as the central controllable environmental
variable for any medium designed to cultivate this community.
- preferred_term: seawater ions
chebi_term:
id: CHEBI:26710
label: sodium chloride
relevance: Artificial-seawater ions (NaCl as the representative bulk salt)
drove the community shifts and GHG emission changes more strongly than
sulfate alone; any cultivation medium for this community would need a
seawater-equivalent ion background, not just sulfate.
evidence:
- reference: PMID:38628812
supports: SUPPORT
evidence_source: IN_VIVO
snippet: other ions present in seawater, not sulfate, drive ecological and
biogeochemical responses to seawater intrusion
explanation: Anchors NaCl-dominated artificial seawater (representative
bulk seawater-ion mixture) as the primary driver of community responses.
- preferred_term: methane
chebi_term:
id: CHEBI:16183
label: methane
relevance: Methane is one of the headline greenhouse gas emission endpoints
monitored across all microcosm treatments; an environment-analog medium
targeting this community would need methane in the headspace as a key
quantitative endpoint.
evidence:
- reference: PMID:38628812
supports: SUPPORT
evidence_source: IN_VIVO
snippet: emissions of carbon dioxide, methane, and nitrous oxide
explanation: Anchors methane as a measured greenhouse gas endpoint of this
microcosm community.
associated_datasets: []
external_resources:
- name: Primary publication for coastal forested wetland seawater-ion microcosms
1 change: 0 additions & 1 deletion kb/communities/Copper_Biomining_Heap_Leach.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -566,7 +566,6 @@ environmental_factors:
metals_present:
- COPPER
- IRON
- TITANIUM
metal_relevance: PRIMARY
metal_notes: Metal/REE detected via CHEBI terms in metabolites; Metal/REE detected via environmental factor
measurements; Metal/REE detected via keyword matching in description (context-validated)
3 changes: 1 addition & 2 deletions kb/communities/DVM_Triculture.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -375,7 +375,6 @@ environmental_factors:
snippet: We find that tri-cultures with both routes increase methane production by almost twofold
compared to co-cultures and are stable in the absence of sulfate
explanation: Quantifies enhanced productivity from tri-culture interactions
metals_present:
- TITANIUM
metals_present: []
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements
3 changes: 1 addition & 2 deletions kb/communities/Dangl_SynComm_35.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -541,7 +541,6 @@ environmental_factors:
evidence_source: IN_VITRO
snippet: Suppressors and nonsuppressors co-occur in the root microbiome and the
presence of the former can enhance the colonization ability of the latter
metals_present:
- TITANIUM
metals_present: []
metal_relevance: INCIDENTAL
metal_notes: Metal/REE detected via environmental factor measurements
1 change: 0 additions & 1 deletion kb/communities/Desulfovibrio_Methanococcus_Syntrophy.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -239,7 +239,6 @@ environmental_factors:
explanation: Quantifies growth conditions for the syntrophic consortium
metals_present:
- IRON
- TITANIUM
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements; Metal/REE detected via keyword
matching in description (context-validated)
1 change: 0 additions & 1 deletion kb/communities/Ewaste_Bioleaching_Consortium.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -570,7 +570,6 @@ metals_present:
- NICKEL
- PALLADIUM
- SILVER
- TITANIUM
- ZINC
metal_relevance: PRIMARY
metal_notes: Metal/REE detected via CHEBI terms in metabolites; Metal/REE detected via environmental factor
1 change: 0 additions & 1 deletion kb/communities/Ferroplasma_Leptospirillum_Syntrophy.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -472,7 +472,6 @@ environmental_factors:
metals_present:
- COPPER
- IRON
- TITANIUM
metal_relevance: PRIMARY
metal_notes: Metal/REE detected via CHEBI terms in metabolites; Metal/REE detected via environmental factor
measurements; Metal/REE detected via keyword matching in description (context-validated)
3 changes: 1 addition & 2 deletions kb/communities/GLBRC_Populus_Variovorax_SynCom28.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -833,7 +833,6 @@ external_resources:
resource_id: zenodo.17466836
url: https://zenodo.org/records/17466836
description: Reproducible workflow and supplemental tables including strain metadata for the DefCom.
metals_present:
- TITANIUM
metals_present: []
metal_relevance: INCIDENTAL
metal_notes: Metal/REE detected via environmental factor measurements
4 changes: 1 addition & 3 deletions kb/communities/GLBRC_UFMP_Fermentation_Community.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -462,9 +462,7 @@ associated_datasets:

'
explanation: Links the metagenome dataset to the UFMP fermentation community study
metals_present:
- GOLD
- TITANIUM
metals_present: []
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements; Metal/REE detected via keyword
matching in description (context-validated)
3 changes: 1 addition & 2 deletions kb/communities/GOM_Oil_Degrading_Consortium.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -174,8 +174,7 @@ environmental_factors:
description: 'Designed for practical application in bioremediation of oil-contaminated marine environments

'
metals_present:
- TITANIUM
metals_present: []
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements; Metal/REE detected via keyword
matching in description (context-validated)
3 changes: 1 addition & 2 deletions kb/communities/Geobacter_Clostridium_DIET.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -437,7 +437,6 @@ growth_media:
explanation: Establishes anaerobic requirement maintained using Hungate technique
culturemech_id: CultureMech:015432
culturemech_url: https://github.com/CultureBotAI/CultureMech/tree/main/kb/media/CultureMech:015432
metals_present:
- TITANIUM
metals_present: []
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements
3 changes: 1 addition & 2 deletions kb/communities/Geobacter_Methanosaeta_DIET.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -293,8 +293,7 @@ growth_media:
specialized medium
culturemech_id: CultureMech:015435
culturemech_url: https://github.com/CultureBotAI/CultureMech/tree/main/kb/media/CultureMech:015435
metals_present:
- TITANIUM
metals_present: []
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements; Metal/REE detected
via keyword matching in description (context-validated)
3 changes: 1 addition & 2 deletions kb/communities/Geobacter_Methanosarcina_DIET.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -336,8 +336,7 @@ growth_media:
explanation: Confirms ethanol as electron donor and DIET mechanism in coculture
culturemech_id: CultureMech:015434
culturemech_url: https://github.com/CultureBotAI/CultureMech/tree/main/kb/media/CultureMech:015434
metals_present:
- TITANIUM
metals_present: []
metal_relevance: SIGNIFICANT
metal_notes: Metal/REE detected via environmental factor measurements; Metal/REE detected
via keyword matching in description (context-validated)
1 change: 0 additions & 1 deletion kb/communities/Iberian_Pit_Lake_Stratified_Community.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -725,7 +725,6 @@ environmental_factors:
metals_present:
- COPPER
- IRON
- TITANIUM
- ZINC
metal_relevance: PRIMARY
metal_notes: Metal/REE detected via CHEBI terms in metabolites; Metal/REE detected via environmental factor
2 changes: 0 additions & 2 deletions kb/communities/Industrial_Bioreactor_Consortium.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -999,9 +999,7 @@ environmental_factors:
explanation: Links high Fe³⁺ to Ferroplasma competitive advantage
metals_present:
- COPPER
- GOLD
- IRON
- TITANIUM
metal_relevance: PRIMARY
metal_notes: Metal/REE detected via CHEBI terms in metabolites; Metal/REE detected via environmental factor
measurements; Metal/REE detected via keyword matching in description (context-validated)
Original file line number Diff line number Diff line change
Expand Up @@ -671,7 +671,6 @@ environmental_factors:
'
metals_present:
- IRON
- TITANIUM
rare_earth_elements_present:
- DYSPROSIUM
- ERBIUM
3 changes: 1 addition & 2 deletions kb/communities/Lotus_LjSC3.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -597,7 +597,6 @@ environmental_factors:
interactions relevant to sustainable agriculture and reduced fertilizer use

'
metals_present:
- TITANIUM
metals_present: []
metal_relevance: INCIDENTAL
metal_notes: Metal/REE detected via environmental factor measurements