Kg microbe host #168

Open
wants to merge 43 commits into
base: master
43 commits
413464f
Support for getting disease from Uniprot and MONDO. Assumes that unip…
bsantan May 7, 2024
f7c4a63
Ctd transform with node normalizer and chebi xrefs
bsantan May 8, 2024
8bae663
Merge branch 'master' into kg-microbe-host
bsantan May 8, 2024
121862d
ctd init file
bsantan May 8, 2024
6c3e3a8
merge yaml
bsantan May 8, 2024
3f8a97b
Merge branch 'kg-microbe-host' of github.com:Knowledge-Graph-Hub/kg-m…
bsantan May 8, 2024
4dd9309
fix libraries
bsantan May 8, 2024
ab56acf
MONDO genes
bsantan May 9, 2024
1701453
added Rapp file to gitignore
bsantan May 9, 2024
1e40193
Use HGNC prefix
bsantan May 9, 2024
f0d3760
ctd and human transforms
bsantan May 9, 2024
24cac14
bacdive changes
bsantan May 9, 2024
48015a4
Disbiome ingest
bsantan May 16, 2024
c2ddff7
Ingest pdmetagenomics data
bsantan May 21, 2024
dd30ebc
Disbiome ingest improvements to microbe indexing
bsantan May 30, 2024
e12156c
disbiome ingest fix
bsantan May 30, 2024
5f376e8
exclude Disbiome microbes not found
bsantan May 30, 2024
dde363f
Exclude microbes not in NCBITaxon subset from disbiome, formatting
bsantan May 30, 2024
601a2b6
Remove not found microbes from disbiome transform
bsantan May 30, 2024
4c4f971
Fix disease label
bsantan May 31, 2024
e4aa925
fix disease association predicates
bsantan Jun 4, 2024
990ab34
ontology transform add back in upa
bsantan Jun 4, 2024
827ae69
changes from master
bsantan Jun 4, 2024
9549f4f
Merge branch 'master' into kg-microbe-host
bsantan Jun 4, 2024
ef32715
fix xref filepath search
bsantan Jun 4, 2024
935140f
formatting
bsantan Jun 4, 2024
bd10e36
formatting
bsantan Jun 4, 2024
0499bd7
bacdive master changes
bsantan Jun 4, 2024
eee5425
add uniprot human transform
bsantan Jun 5, 2024
83b6678
uniprot human raw file
bsantan Jun 5, 2024
4ef69b0
bacdive to master, use disease microbe set in download.yaml
bsantan Jun 14, 2024
562e173
fix hp path
bsantan Jun 19, 2024
e12b627
Merge pull request #181 from Knowledge-Graph-Hub/master
bsantan Jun 19, 2024
21c4d61
fix uniprot human and microbial combination issues
bsantan Jun 20, 2024
1b8020c
Merge branch 'master' into kg-microbe-host
bsantan Jun 26, 2024
2192919
Pdmetagenomics and disbiome transforms
bsantan Jun 26, 2024
1b6cd0e
fix writing duplicate nodes
bsantan Jun 27, 2024
e7753ba
Merge branch 'master' into kg-microbe-host
bsantan Jun 27, 2024
86fae8a
Merge branch 'master' into kg-microbe-host
bsantan Jul 3, 2024
a095fff
linting
bsantan Jul 3, 2024
5f42d3d
updated mondo and remove special chars from description
bsantan Jul 4, 2024
8595098
Include only unique and necessary transforms in biomedical branch
bsantan Jul 23, 2024
1e1ddca
Merge branch 'master' into kg-microbe-host
bsantan Jul 23, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -163,3 +163,5 @@ data/transformed/uniprot_genome_features/*.tsv
kg_microbe/transform_utils/uniprot/tmp/relevant_file_content.txt
kg_microbe/transform_utils/uniprot/tmp/nodes_and_edges/*
data/transformed/uniprot_genome_features/uniprot_kgx.zip

data/transformed/ontologies/.Rapp.history
Binary file added data/raw/uniprot_human.tar.gz
Binary file not shown.
80 changes: 80 additions & 0 deletions data/transformed/PdMetagenomics/edges.tsv
@@ -0,0 +1,80 @@
subject predicate object relation primary_knowledge_source
NCBITaxon:100884 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:113107 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1150298 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1203556 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1232426 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1262792 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1262889 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1263044 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1309 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1322 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1343 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1352 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1406512 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1432052 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1463165 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1472761 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:150055 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1520815 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1522 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1535 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:154046 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:154288 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1550024 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1596 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1598 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1603888 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1613 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1624 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:165179 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1655 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:166486 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1681 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1685 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1689 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1736 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1759399 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1776081 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1849041 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:187327 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1965555 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:1965576 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:2107999 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:214856 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:216816 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:2173 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:218538 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:230143 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:270498 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:28037 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:28123 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:301302 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:3062497 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:33945 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:360807 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:39490 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:39491 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:40519 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:418240 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:43675 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:46228 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:47715 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:501571 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:508460 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:53442 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:544580 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:562 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:573 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:626932 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:626937 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:649756 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:671232 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:712124 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:78257 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:78344 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:78448 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:823 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:853 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:89153 biolink:associated_with_increased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
NCBITaxon:Actinomyces sp. ICM47 biolink:associated_with_decreased_likelihood_of MONDO:0005180 PATO:0001668 PdMetagenomics
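Review note: the last row of edges.tsv uses a taxon name (`NCBITaxon:Actinomyces sp. ICM47`) where every other row has a numeric NCBI Taxonomy ID, which suggests the name lookup did not resolve for that microbe. A minimal sketch of a check that could flag such rows is below. It assumes the file is tab-separated with the header shown above; the function name `find_unresolved_taxa` is hypothetical and not part of the kg-microbe code.

```python
import csv
import io
import re

# Assumption: a resolved NCBITaxon CURIE always has a numeric local ID.
NCBITAXON_RE = re.compile(r"^NCBITaxon:\d+$")


def find_unresolved_taxa(tsv_text):
    """Return edge subjects that carry the NCBITaxon prefix without a numeric ID."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["subject"]
        for row in reader
        if row["subject"].startswith("NCBITaxon:")
        and not NCBITAXON_RE.match(row["subject"])
    ]


# Two rows copied from the diff above, joined with tabs for illustration.
sample = (
    "subject\tpredicate\tobject\trelation\tprimary_knowledge_source\n"
    "NCBITaxon:853\tbiolink:associated_with_decreased_likelihood_of\t"
    "MONDO:0005180\tPATO:0001668\tPdMetagenomics\n"
    "NCBITaxon:Actinomyces sp. ICM47\tbiolink:associated_with_decreased_likelihood_of\t"
    "MONDO:0005180\tPATO:0001668\tPdMetagenomics\n"
)

print(find_unresolved_taxa(sample))
```

Running a check like this after the transform would make it easy to decide whether unresolved names should be dropped or mapped by hand before the edges are written.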
81 changes: 81 additions & 0 deletions data/transformed/PdMetagenomics/nodes.tsv
@@ -0,0 +1,81 @@
id category name description xref provided_by synonym iri object predicate relation same_as subject subsets
MONDO:0005180 biolink:Disease
NCBITaxon:544580 biolink:OrganismTaxon
NCBITaxon:1689 biolink:OrganismTaxon
NCBITaxon:1309 biolink:OrganismTaxon
NCBITaxon:1535 biolink:OrganismTaxon
NCBITaxon:1550024 biolink:OrganismTaxon
NCBITaxon:418240 biolink:OrganismTaxon
NCBITaxon:1520815 biolink:OrganismTaxon
NCBITaxon:33945 biolink:OrganismTaxon
NCBITaxon:166486 biolink:OrganismTaxon
NCBITaxon:1432052 biolink:OrganismTaxon
NCBITaxon:100884 biolink:OrganismTaxon
NCBITaxon:154288 biolink:OrganismTaxon
NCBITaxon:46228 biolink:OrganismTaxon
NCBITaxon:1624 biolink:OrganismTaxon
NCBITaxon:1262792 biolink:OrganismTaxon
NCBITaxon:1232426 biolink:OrganismTaxon
NCBITaxon:230143 biolink:OrganismTaxon
NCBITaxon:113107 biolink:OrganismTaxon
NCBITaxon:649756 biolink:OrganismTaxon
NCBITaxon:712124 biolink:OrganismTaxon
NCBITaxon:1613 biolink:OrganismTaxon
NCBITaxon:40519 biolink:OrganismTaxon
NCBITaxon:154046 biolink:OrganismTaxon
NCBITaxon:1150298 biolink:OrganismTaxon
NCBITaxon:28123 biolink:OrganismTaxon
NCBITaxon:1203556 biolink:OrganismTaxon
NCBITaxon:2107999 biolink:OrganismTaxon
NCBITaxon:853 biolink:OrganismTaxon
NCBITaxon:1776081 biolink:OrganismTaxon
NCBITaxon:562 biolink:OrganismTaxon
NCBITaxon:508460 biolink:OrganismTaxon
NCBITaxon:626932 biolink:OrganismTaxon
NCBITaxon:2173 biolink:OrganismTaxon
NCBITaxon:1262889 biolink:OrganismTaxon
NCBITaxon:1965555 biolink:OrganismTaxon
NCBITaxon:47715 biolink:OrganismTaxon
NCBITaxon:39491 biolink:OrganismTaxon
NCBITaxon:1736 biolink:OrganismTaxon
NCBITaxon:671232 biolink:OrganismTaxon
NCBITaxon:187327 biolink:OrganismTaxon
NCBITaxon:78448 biolink:OrganismTaxon
NCBITaxon:1596 biolink:OrganismTaxon
NCBITaxon:78344 biolink:OrganismTaxon
NCBITaxon:1322 biolink:OrganismTaxon
NCBITaxon:1603888 biolink:OrganismTaxon
NCBITaxon:573 biolink:OrganismTaxon
NCBITaxon:1655 biolink:OrganismTaxon
NCBITaxon:78257 biolink:OrganismTaxon
NCBITaxon:28037 biolink:OrganismTaxon
NCBITaxon:89153 biolink:OrganismTaxon
NCBITaxon:1343 biolink:OrganismTaxon
NCBITaxon:1522 biolink:OrganismTaxon
NCBITaxon:Actinomyces sp. ICM47 biolink:OrganismTaxon
NCBITaxon:1681 biolink:OrganismTaxon
NCBITaxon:218538 biolink:OrganismTaxon
NCBITaxon:301302 biolink:OrganismTaxon
NCBITaxon:1406512 biolink:OrganismTaxon
NCBITaxon:3062497 biolink:OrganismTaxon
NCBITaxon:165179 biolink:OrganismTaxon
NCBITaxon:53442 biolink:OrganismTaxon
NCBITaxon:626937 biolink:OrganismTaxon
NCBITaxon:1463165 biolink:OrganismTaxon
NCBITaxon:216816 biolink:OrganismTaxon
NCBITaxon:1472761 biolink:OrganismTaxon
NCBITaxon:1352 biolink:OrganismTaxon
NCBITaxon:1965576 biolink:OrganismTaxon
NCBITaxon:360807 biolink:OrganismTaxon
NCBITaxon:43675 biolink:OrganismTaxon
NCBITaxon:823 biolink:OrganismTaxon
NCBITaxon:1598 biolink:OrganismTaxon
NCBITaxon:501571 biolink:OrganismTaxon
NCBITaxon:214856 biolink:OrganismTaxon
NCBITaxon:270498 biolink:OrganismTaxon
NCBITaxon:1849041 biolink:OrganismTaxon
NCBITaxon:1263044 biolink:OrganismTaxon
NCBITaxon:39490 biolink:OrganismTaxon
NCBITaxon:1685 biolink:OrganismTaxon
NCBITaxon:150055 biolink:OrganismTaxon
NCBITaxon:1759399 biolink:OrganismTaxon
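Review note: since one of the commits in this PR fixes duplicate nodes being written, a small consistency check between nodes.tsv and edges.tsv may be worth running on the transform output. The sketch below is an illustration only; it assumes both files are tab-separated with the `id`, `subject`, and `object` columns shown in this diff, and `check_nodes_cover_edges` is a hypothetical helper, not an existing kg-microbe function.

```python
import csv
import io
from collections import Counter


def check_nodes_cover_edges(nodes_text, edges_text):
    """Return (missing, duplicates).

    missing: IDs used as an edge subject or object that have no node row.
    duplicates: node IDs that appear in more than one node row.
    """
    node_rows = csv.DictReader(io.StringIO(nodes_text), delimiter="\t")
    id_counts = Counter(row["id"] for row in node_rows)
    duplicates = sorted(i for i, n in id_counts.items() if n > 1)

    edge_rows = csv.DictReader(io.StringIO(edges_text), delimiter="\t")
    referenced = set()
    for row in edge_rows:
        referenced.update((row["subject"], row["object"]))

    missing = sorted(referenced - set(id_counts))
    return missing, duplicates


# Made-up miniature inputs: one duplicated node and one edge without a node.
nodes_sample = (
    "id\tcategory\n"
    "MONDO:0005180\tbiolink:Disease\n"
    "NCBITaxon:562\tbiolink:OrganismTaxon\n"
    "NCBITaxon:562\tbiolink:OrganismTaxon\n"
)
edges_sample = (
    "subject\tpredicate\tobject\n"
    "NCBITaxon:562\tbiolink:associated_with_increased_likelihood_of\tMONDO:0005180\n"
    "NCBITaxon:573\tbiolink:associated_with_increased_likelihood_of\tMONDO:0005180\n"
)

missing, duplicates = check_nodes_cover_edges(nodes_sample, edges_sample)
print(missing)
print(duplicates)
```

On the sample input, the first list holds the edge endpoint that lacks a node row and the second holds the node ID written twice; both would be empty for a consistent pair of files.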