Refactor or better document loaders #57
Comments
This was referenced Mar 3, 2017
|
Migrating relevant comments over to here... Whilst setting up the new code, I was able to create a new
public BMKnowledgeBase bmKnowledgeBase() throws Exception {
    OWLLoader owlLoader = new OWLLoader();
    owlLoader.load("src/main/resources/ontologies/hp.obo");
    owlLoader.loadDataFromTsvGzip("src/main/resources/data/human-pheno.assocs.gz");
    return owlLoader.createKnowledgeBaseInterface();
}
This loaded and started up OK, but when a set of HP terms (
@GetMapping(path = "phenodigm")
public MatchSet matchPhenodigm(@RequestParam(value = "id") Set<String> ids) throws IncoherentStateException {
    ProfileMatcher phenodigmProfileMatcher = PhenodigmICProfileMatcher.create(bmKnowledgeBase);
    logger.info("Created {} profile matcher", phenodigmProfileMatcher.getShortName());
    ProfileQuery query = ProfileQueryFactory.createQuery(ids);
    logger.info("Querying {} matcher with Class ids: {}", phenodigmProfileMatcher.getShortName(), query.getQueryClassIds());
    return phenodigmProfileMatcher.findMatchProfile(query);
}
the following NPE was thrown
java.lang.NullPointerException: null
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:210) ~[guava-18.0.jar:na]
at org.monarchinitiative.owlsim.kb.impl.BMKnowledgeBaseOWLAPIImpl.getIndexForClassNode(BMKnowledgeBaseOWLAPIImpl.java:740) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
at org.monarchinitiative.owlsim.kb.impl.BMKnowledgeBaseOWLAPIImpl.getIndex(BMKnowledgeBaseOWLAPIImpl.java:625) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
at org.monarchinitiative.owlsim.kb.impl.BMKnowledgeBaseOWLAPIImpl.getClassIndex(BMKnowledgeBaseOWLAPIImpl.java:634) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
at org.monarchinitiative.owlsim.compute.matcher.impl.AbstractProfileMatcher.getProfileSetBM(AbstractProfileMatcher.java:94) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
at org.monarchinitiative.owlsim.compute.matcher.impl.PhenodigmICProfileMatcher.findMatchProfileImpl(PhenodigmICProfileMatcher.java:66) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
at org.monarchinitiative.owlsim.compute.matcher.impl.AbstractProfileMatcher.findMatchProfileAll(AbstractProfileMatcher.java:223) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
at org.monarchinitiative.owlsim.compute.matcher.impl.AbstractProfileMatcher.findMatchProfile(AbstractProfileMatcher.java:194) ~[owlsim-core-3.0-SNAPSHOT.jar:na]
at org.monarchinitiative.controllers.MatchController.matchPhenodigm(MatchController.java:43) ~[classes/:na]
Debugging the application, it turned out this was because there were no CURIE mappings provided to the
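To illustrate why a missing prefix map can surface as a failed lookup, here is a minimal sketch of CURIE expansion. `CurieExpander` is a hypothetical helper written for this comment, not a class from owlsim; the only value taken from this thread is the `HP` prefix mapping. The actual lookup inside `BMKnowledgeBaseOWLAPIImpl` may well differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not part of owlsim): expands a CURIE such as "HP:0000194"
// into a full IRI using a prefix -> IRI-base map.
final class CurieExpander {
    private final Map<String, String> prefixToIriBase;

    CurieExpander(Map<String, String> prefixToIriBase) {
        this.prefixToIriBase = new LinkedHashMap<>(prefixToIriBase);
    }

    // Returns the expanded IRI, or empty when the input has no colon
    // or the prefix is not in the map.
    Optional<String> expand(String curie) {
        int colon = curie.indexOf(':');
        if (colon < 0) {
            return Optional.empty();
        }
        String base = prefixToIriBase.get(curie.substring(0, colon));
        if (base == null) {
            return Optional.empty();
        }
        return Optional.of(base + curie.substring(colon + 1));
    }

    public static void main(String[] args) {
        Map<String, String> curies = new LinkedHashMap<>();
        curies.put("HP", "http://purl.obolibrary.org/obo/HP_");

        CurieExpander withMapping = new CurieExpander(curies);
        CurieExpander withoutMapping = new CurieExpander(new LinkedHashMap<>());

        // With the mapping the CURIE resolves to an IRI that can be looked up.
        System.out.println(withMapping.expand("HP:0000194"));
        // Without it nothing resolves, so any later index lookup has nothing to find.
        System.out.println(withoutMapping.expand("HP:0000194"));
    }
}
```

Returning an empty `Optional` rather than `null` is one way to make the "no mapping" case explicit at the call site instead of failing deep inside a precondition check.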
|
To fix the above, and to try and lead people through the creating of the
OwlKnowledgeBase.loader()
    .loadCuries(curies())
    .loadOntology("src/main/resources/ontologies/hp.obo")
    .loadData("src/main/resources/data/human-pheno.assocs.gz")
    .createKnowledgeBase();
Where
|
Is the code of OwlKnowledgeBase available somewhere? Also, what is the rationale behind the format of the G2P annotations (human-pheno.assocs.gz)? This is yet another format... Can we create a public addX2Pannotation(...) method so that I can put content into the KB from my code?
|
Not yet available - I'm working on it, but will let you know when I'm done. We need to think more about the actual import format used - I'm just taking what's currently in place and trying to consolidate code to reduce duplication. The
Alternatively, it might be worthwhile allowing people to define their own parsers and domain objects, so long as these implement standard interfaces. That way you can load up a knowledgebase with any given set of data, provided each item has a unique identifier and a set of associated phenotypes. Ultimately these all end up as asserted axioms of class to individual.
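As a rough sketch of the "standard interfaces" idea: any domain object exposing a unique identifier and a set of phenotype ids could be reduced to a plain map before loading. Everything here (`PhenotypedEntity`, `Patient`, `toIndividualsMap`) is hypothetical and named only for illustration; none of it exists in owlsim.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;

// Hypothetical sketch, not owlsim API.
final class PluggableLoaderSketch {

    // The minimal contract: a unique identifier plus associated phenotype ids.
    interface PhenotypedEntity {
        String getId();
        Collection<String> getPhenotypeIds();
    }

    // One example domain object; a disease, gene or paper could implement the same interface.
    static final class Patient implements PhenotypedEntity {
        private final String id;
        private final Collection<String> phenotypeIds;

        Patient(String id, Collection<String> phenotypeIds) {
            this.id = id;
            this.phenotypeIds = phenotypeIds;
        }

        @Override
        public String getId() {
            return id;
        }

        @Override
        public Collection<String> getPhenotypeIds() {
            return phenotypeIds;
        }
    }

    // Collapses any entities into identifier -> phenotype ids, merging repeated identifiers.
    static Map<String, Collection<String>> toIndividualsMap(Collection<? extends PhenotypedEntity> entities) {
        Map<String, Collection<String>> individuals = new LinkedHashMap<>();
        for (PhenotypedEntity entity : entities) {
            individuals.computeIfAbsent(entity.getId(), id -> new LinkedHashSet<>())
                    .addAll(entity.getPhenotypeIds());
        }
        return individuals;
    }

    public static void main(String[] args) {
        // Made-up patient identifier with HP term ids used purely as sample data.
        Patient patient = new Patient("PTNT:1", Arrays.asList("HP:0000194", "HP:0000218"));
        System.out.println(toIndividualsMap(Arrays.asList(patient)));
    }
}
```

The parser for each file format would then only need to produce objects implementing the interface; the loader itself would not have to know about the format.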
|
@drseb do you want to load these annotations as a Map or from file or both? I think it only really makes sense to load them in bulk, as they are all added to the ontology which is then passed to the
|
Loading in bulk is totally OK. I would suggest making the key of the Map generic - sometimes one may use int (entrez-id) and sometimes Strings (disease-id).
|
We use URIs internally as identifiers. Anything coming in or out has to be a CURIE or a URI. We have all the prefixes you should need in the Monarch prefixmap, e.g. NCBIGene
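One way to reconcile generic keys with the CURIE requirement is to convert keys at the boundary. The sketch below is an assumption about how a caller might do that, not an owlsim method: `CurieKeyMapper` and `withCurieKeys` are hypothetical names, and the gene id and HP terms are arbitrary sample values rather than a real annotation.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper (not part of owlsim): re-keys annotations of any key type by CURIE.
final class CurieKeyMapper {

    // Applies toCurie to every key. If two keys produce the same CURIE, the later entry wins.
    static <K> Map<String, Collection<String>> withCurieKeys(
            Map<K, ? extends Collection<String>> annotations,
            Function<? super K, String> toCurie) {
        Map<String, Collection<String>> result = new LinkedHashMap<>();
        for (Map.Entry<K, ? extends Collection<String>> entry : annotations.entrySet()) {
            result.put(toCurie.apply(entry.getKey()), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Integer keys standing in for Entrez gene ids; values are sample HP term ids.
        Map<Integer, List<String>> byEntrezId = new LinkedHashMap<>();
        byEntrezId.put(2200, Arrays.asList("HP:0000194", "HP:0000218"));

        // Prefix each integer id so that it becomes an NCBIGene CURIE.
        Map<String, Collection<String>> byCurie = withCurieKeys(byEntrezId, id -> "NCBIGene:" + id);
        System.out.println(byCurie.keySet());
    }
}
```

This keeps the loader's input type fixed at String keys while still letting callers hold their data under whatever key type is convenient.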
|
And if my annotation set is a bunch of patients? Or papers?
|
I think you can define these in the CurieUtil initialisation map, making up a namespace. So you could map PTNT:1, PTNT:2 etc. using the mapping
In the tests there are examples of gene identifiers just being used as is, without any real URI. These are mapped to HP terms or whatever, so they will probably be OK, at least for matching purposes. Papers presumably have a PubMed id or some such. These are mapped in Monarch using these identifiers:
|
julesjacobsen added a commit to julesjacobsen/owlsim-v3 that referenced this issue Mar 13, 2017: b3c3422
julesjacobsen added a commit to julesjacobsen/owlsim-v3 that referenced this issue Mar 13, 2017: bd6ea08
julesjacobsen added a commit to julesjacobsen/owlsim-v3 that referenced this issue Mar 13, 2017: d9746b0
|
@drseb I've added a
Map<String, String> curies = new LinkedHashMap<>();
curies.put("ORPHA", "http://www.orpha.net/ORDO/Orphanet_");
curies.put("HP", "http://purl.obolibrary.org/obo/HP_");
Map<String, Collection<String>> individuals = new LinkedHashMap<>();
individuals.put("ORPHA:710", Arrays.asList("HP:0000194",
"HP:0000218",
"HP:0000262",
"HP:0000303",
"HP:0000316",
"HP:0000322",
"HP:0000324",
"HP:0000348",
"HP:0000431",
"HP:0000470",
"HP:0000508",
"HP:0001156",
"HP:0001385",
"HP:0003307",
"HP:0004209",
"HP:0004322",
"HP:0005048",
"HP:0006101",
"HP:0009773",
"HP:0010669",
"HP:0011304",
"HP:0012368"));
BMKnowledgeBase knowledgeBase = BMKnowledgeBase.owlLoader()
.loadOntology("http://purl.obolibrary.org/obo/hp.owl")
.loadCuries(curies)
.loadDataFromMap(individuals)
.createKnowledgeBase();
Will this work for you, or do you need some other way of loading in the individuals data?
julesjacobsen added a commit to julesjacobsen/owlsim-v3 that referenced this issue Mar 15, 2017: bdee79f
julesjacobsen added a commit to julesjacobsen/owlsim-v3 that referenced this issue Mar 15, 2017: ed63056
julesjacobsen referenced this issue Mar 15, 2017: Merged "Adding ability to load individuals from in-memory data as well as off-disk." #71
julesjacobsen added a commit to julesjacobsen/owlsim-v3 that referenced this issue Mar 17, 2017: 75bf89f
|
cmungall commented Feb 28, 2017
The ways in which ontologies and annotations are loaded are confusing and not well documented.
Ultimately we want to be able to fetch the latest version from an API, to avoid stale files in GitHub.
cc @jnguyenx @damiansm @kshefchek