Integrate METPO release changes across transforms, mappings, and tooling #573

Merged
realmarcin merged 7 commits into master from chore/metpo-release-integration on Jun 12, 2026

@realmarcin

Collaborator

Update transforms (bacdive, gtdb, madin_etal, mediadive, metatraits, ontologies, ontologies_stubs) and supporting utils (metpo_predicates, microbial_trait_mappings, mapping_file_utils, isolation_source_mapping, stub_curie_collection) for METPO release integration. Refresh mappings, merge configs, prefixmap, and dependency lockfiles. Add new dev skills (gtdb-phylo-diagram, kg-postprocess-report), tests, and validation/analysis scripts.

realmarcin and others added 7 commits June 10, 2026 11:32
Update transforms (bacdive, gtdb, madin_etal, mediadive, metatraits,
ontologies, ontologies_stubs) and supporting utils (metpo_predicates,
microbial_trait_mappings, mapping_file_utils, isolation_source_mapping,
stub_curie_collection) for METPO release integration. Refresh mappings,
merge configs, prefixmap, and dependency lockfiles. Add new dev skills
(gtdb-phylo-diagram, kg-postprocess-report), tests, and validation/analysis
scripts.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…fix)

The gtdb transform emits name-derived CURIEs that retain the rank prefix
(GTDB:s__<species>). metatraits_gtdb was stripping s__, so its synthetic
species nodes never joined the GTDB taxonomy in the merged graph (0 CURIE
overlap). Retain s__ at both CURIE-construction sites (synthetic node id
and subClassOf target); the lookup maps still strip it because they must
match the prefix-free metatraits input labels. Input is species-level only.
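
A minimal sketch of the two code paths described above, assuming Python 3.9+ (for `str.removeprefix`). The helper names `lookup_key` and `gtdb_species_curie` and the species token `Example_species` are hypothetical illustrations, not the actual metatraits_gtdb code:

```python
RANK_PREFIX = "s__"  # GTDB species rank prefix, as in GTDB:s__<species>


def lookup_key(label: str) -> str:
    """Key for the lookup maps: rank prefix stripped so it matches
    the prefix-free metatraits input labels."""
    return label.removeprefix(RANK_PREFIX)


def gtdb_species_curie(label: str) -> str:
    """CURIE used for synthetic node ids and subClassOf targets:
    the rank prefix is always retained, whether or not the input had it."""
    return f"GTDB:{RANK_PREFIX}{lookup_key(label)}"


# Hypothetical species token, used only for illustration.
curie = gtdb_species_curie("Example_species")
key = lookup_key("s__Example_species")
```

Here `curie` is `GTDB:s__Example_species` and `key` is `Example_species`: the strip is confined to lookups, while every emitted CURIE keeps the form the gtdb transform produces.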

Verified after re-running gtdb + metatraits_gtdb + merge: all 2531
subClassOf targets now resolve to real gtdb taxonomy nodes (was 0), and
the synthetic->taxonomy edges survive into the merged graph. Regenerates
merged_graph_stats.yaml (2,438,443 nodes / 12,900,316 edges).
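
The resolution check reported above could be reproduced with a set intersection along these lines. This is a sketch under assumptions: edges are modeled as (subject, predicate, object) tuples, the predicate string `rdfs:subClassOf` and all ids are illustrative placeholders, and `resolve_targets` is a hypothetical helper rather than a script from this PR.

```python
def resolve_targets(edges, node_ids, predicate="rdfs:subClassOf"):
    """Split the distinct targets of `predicate` edges into those that
    match a known node id and those that do not."""
    targets = {obj for _subj, pred, obj in edges if pred == predicate}
    resolved = targets & set(node_ids)
    return resolved, targets - resolved


# Toy data (placeholders): one target keeps the s__ prefix, one lost it.
taxonomy_nodes = {"GTDB:s__Example_a", "GTDB:s__Example_b"}
edges = [
    ("synthetic:1", "rdfs:subClassOf", "GTDB:s__Example_a"),
    ("synthetic:2", "rdfs:subClassOf", "GTDB:Example_b"),
    ("synthetic:1", "EX:other_predicate", "EX:trait"),
]

resolved, unresolved = resolve_targets(edges, taxonomy_nodes)
print(f"{len(resolved)} resolved, {len(unresolved)} unresolved")
```

In the toy data, the target that lost its `s__` prefix lands in `unresolved`, which is the failure mode the fix removes.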

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
# Conflicts:
#	mappings/kgmicrobe_unified_entity_mappings.sssom.tsv.gz
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
CI's Linux ruff classifies the repo-root `run` module as third-party
(filesystem detection misses run.py), wanting the click/run imports in
tests/test_run.py merged into one block, while local detection treats it
as first-party. Declaring `run` in known-first-party makes the 3-section
import layout canonical in both environments.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The release dependency bump pulled setuptools 82, which removed
pkg_resources. The eutils transitive dependency (via oaklib) still imports
it, so CI's clean install broke pytest collection across the suite. Constrain
setuptools <81 and re-lock (only setuptools changes: 82.0.1 -> 80.10.2),
matching master's working environment.
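
A quick way to see whether a given environment is affected is to probe for the module without importing it. A minimal sketch using only the standard library; `has_pkg_resources` is a hypothetical helper name, not something defined in this repository:

```python
import importlib.util


def has_pkg_resources() -> bool:
    """Return True if `pkg_resources` can be located in this environment.

    find_spec looks the module up without executing it, so the check is
    safe to run even where importing pkg_resources would fail.
    """
    return importlib.util.find_spec("pkg_resources") is not None


if __name__ == "__main__":
    status = "available" if has_pkg_resources() else "missing"
    print(f"pkg_resources is {status}")
```

If this reports the module as missing after a clean install, any dependency that still imports `pkg_resources` will fail at import time, which is what broke pytest collection here.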

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Commit 1aaaf8f re-added two old test files (absent on master) that import
the repo-root run.py CLI and kg_microbe/utils/transform_utils.py — both
removed from the repo in 2023 and present only as untracked working-tree
files. They break CI's clean-checkout collection. Drop the tests to restore
parity with master, which intentionally carries neither the tests nor the
modules they reference.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@realmarcin realmarcin merged commit 4c23ed9 into master Jun 12, 2026
3 checks passed
@realmarcin realmarcin deleted the chore/metpo-release-integration branch June 12, 2026 23:09