Integrate METPO release changes across transforms, mappings, and tooling #573
Merged
Update transforms (bacdive, gtdb, madin_etal, mediadive, metatraits, ontologies, ontologies_stubs) and supporting utils (metpo_predicates, microbial_trait_mappings, mapping_file_utils, isolation_source_mapping, stub_curie_collection) for METPO release integration. Refresh mappings, merge configs, prefixmap, and dependency lockfiles. Add new dev skills (gtdb-phylo-diagram, kg-postprocess-report), tests, and validation/analysis scripts. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…fix)

The gtdb transform emits name-derived CURIEs that retain the rank prefix (GTDB:s__<species>). metatraits_gtdb was stripping s__, so its synthetic species nodes never joined the GTDB taxonomy in the merged graph (0 CURIE overlap).

Retain s__ at both CURIE-construction sites (synthetic node id and subClassOf target); the lookup maps still strip it because they must match the prefix-free metatraits input labels. Input is species-level only.

Verified after re-running gtdb + metatraits_gtdb + merge: all 2531 subClassOf targets now resolve to real gtdb taxonomy nodes (was 0), and the synthetic->taxonomy edges survive into the merged graph. Regenerates merged_graph_stats.yaml (2,438,443 nodes / 12,900,316 edges).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
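The prefix handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the transform's actual code: the function names, the `old_style_curie` helper, and the example species label are all hypothetical, and any name normalization the real transform performs is not shown in the commit message, so labels are used as-is.

```python
RANK_PREFIX = "s__"   # species rank prefix kept by the gtdb transform
CURIE_PREFIX = "GTDB:"


def strip_rank_prefix(label: str) -> str:
    """Return the label without a leading s__, if one is present."""
    if label.startswith(RANK_PREFIX):
        return label[len(RANK_PREFIX):]
    return label


def lookup_key(label: str) -> str:
    """Key for lookup maps: prefix-free, to match metatraits input labels."""
    return strip_rank_prefix(label)


def species_curie(label: str) -> str:
    """CURIE that retains s__, so it matches gtdb taxonomy node ids."""
    return f"{CURIE_PREFIX}{RANK_PREFIX}{strip_rank_prefix(label)}"


def old_style_curie(label: str) -> str:
    """Pre-fix behaviour (hypothetical reconstruction): s__ dropped."""
    return f"{CURIE_PREFIX}{strip_rank_prefix(label)}"


# Hypothetical taxonomy node id, in the GTDB:s__<species> form.
taxonomy_ids = {"GTDB:s__Escherichia_coli"}

fixed = species_curie("Escherichia_coli")
broken = old_style_curie("Escherichia_coli")
print(fixed in taxonomy_ids, broken in taxonomy_ids)
```

The same `species_curie` helper would be used for both the synthetic node id and the subClassOf target, which is why the commit touches two construction sites while leaving the lookup maps unchanged.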
# Conflicts:
#	mappings/kgmicrobe_unified_entity_mappings.sssom.tsv.gz
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
CI's Linux ruff classifies the repo-root `run` module as third-party (its filesystem detection misses run.py) and therefore wants the click and run imports in tests/test_run.py merged into one block, whereas local detection treats `run` as first-party. Declaring `run` in known-first-party makes the three-section import layout canonical in both environments. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
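For reference, a declaration of this kind might look like the fragment below. This is a sketch under assumptions: the commit message does not say which file holds the ruff configuration or which table layout the repo uses, so `pyproject.toml` and the `[tool.ruff.lint.isort]` table are guesses (older ruff releases place the same option under `[tool.ruff.isort]`).

```toml
# Assumed location: pyproject.toml at the repo root.
[tool.ruff.lint.isort]
# Treat the repo-root run.py module as first-party in every environment.
known-first-party = ["run"]
```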
The release dependency bump pulled setuptools 82, which removed pkg_resources. The eutils transitive dependency (via oaklib) still imports it, so CI's clean install broke pytest collection across the suite. Constrain setuptools <81 and re-lock (only setuptools changes: 82.0.1 -> 80.10.2), matching master's working environment. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
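The failure mode and the cap can be illustrated with a short check. This is only a sketch: the helper names are hypothetical, the version comparison looks at the major component only, and the real constraint lives in the project's dependency metadata rather than in code like this.

```python
import importlib.util


def major_version(version: str) -> int:
    """Extract the leading major component from a dotted version string."""
    return int(version.split(".")[0])


def within_setuptools_cap(version: str, cap_major: int = 81) -> bool:
    """True when the version satisfies a '<81'-style upper bound."""
    return major_version(version) < cap_major


def pkg_resources_available() -> bool:
    """True when pkg_resources can be found in the current environment."""
    return importlib.util.find_spec("pkg_resources") is not None


# The two versions named in the commit message.
print(within_setuptools_cap("80.10.2"))  # locked version after the fix
print(within_setuptools_cap("82.0.1"))   # version pulled by the bump
print(pkg_resources_available())         # depends on the local install
```

Running `pkg_resources_available()` in a clean CI-like environment is a quick way to confirm whether packages that still import `pkg_resources` will survive test collection.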
Commit 1aaaf8f re-added two old test files (absent on master) that import the repo-root run.py CLI and kg_microbe/utils/transform_utils.py — both removed from the repo in 2023 and present only as untracked working-tree files. They break CI's clean-checkout collection. Drop the tests to restore parity with master, which intentionally carries neither the tests nor the modules they reference. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>