Identifying the subset of human relevant terms #703

Open
dhimmel opened this Issue Jun 18, 2015 · 4 comments

dhimmel commented Jun 18, 2015

For my project, I would like to identify the set of Uberon, and potentially Cell Ontology, terms that apply to humans. In other words, which structures exist in humans?

I posted the question on Thinklab -- an open science platform that pays people for feedback -- and @cmungall gave a detailed and expert answer. I will continue the discussion on Thinklab, but wanted to post here in case others had the same question.

fbastian commented Jun 18, 2015

Chris showed you how to generate a human subset of Uberon. From this, I also generate a CSV file with a true/false status for every term in Uberon; I find it more convenient for spotting errors. See https://www.dropbox.com/s/r97fykcg68tvmux/ext_human_constraints.csv?dl=0

Chris mentioned two approaches, but IMO the best approach would be to stick to the first solution and fix the incorrect taxon constraints in Uberon directly. If you are interested, we could share the effort, as this is something we want to do.

dhimmel commented Jun 18, 2015

Thanks for generating the file. It's a TSV rather than a CSV, which is better, and it should be named as such.

I will look into how the negative evidence method compares with our (likely suboptimal) implementation of the positive evidence method.

I will help in what ways I can, which will be pretty limited because I don't know much about anatomy and am inexperienced with OWL (although here I am looking to learn).
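
For anyone else picking up the file, here is a minimal sketch of reading a tab-separated true/false table like the one described above. The column names (`uberon_id`, `name`, `human`) and the sample rows are assumptions made up for illustration; the real file's header and contents may differ.

```python
import csv
import io

# Hypothetical excerpt: placeholder IDs and made-up statuses, not real rows.
SAMPLE_TSV = (
    "uberon_id\tname\thuman\n"
    "EX:0001\texample structure A\ttrue\n"
    "EX:0002\texample structure B\ttrue\n"
    "EX:0003\texample structure C\tfalse\n"
)


def human_terms(handle):
    """Return the set of term IDs whose 'human' column reads 'true'."""
    # delimiter="\t" is the important part: the file is tab-separated
    # even though its name ends in .csv.
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["uberon_id"]
        for row in reader
        if row["human"].strip().lower() == "true"
    }


terms = human_terms(io.StringIO(SAMPLE_TSV))
print(sorted(terms))
```

With a downloaded copy, the same function should work on `open(path, newline="")` in place of the `io.StringIO` wrapper, provided the column names are adjusted to match the real header.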

dhimmel commented Jun 18, 2015

@fbastian, take a look at my comparison of the "positive evidence" versus "no negative evidence" methods. For now, we plan to use your TSV file to determine whether a term is human-appropriate. However, we identified 8 terms where the "no negative evidence" approach failed. These may be a good place to start regarding taxon constraints.
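
A comparison like this can be sketched as plain set arithmetic over term IDs. This is only an illustration, not the actual analysis: the function name, the placeholder IDs, and the reading of "failed" as "has positive human evidence but is excluded by the taxon constraints" are all assumptions.

```python
def compare_methods(positive, no_negative):
    """Split term IDs by which method marks them as human-appropriate.

    positive:    terms with some positive evidence of existing in humans
    no_negative: terms with no taxon constraint ruling humans out
    """
    return {
        "both": positive & no_negative,
        # Positive evidence exists, yet a taxon constraint excludes humans.
        # Assumed here to be the candidates for a constraint that needs fixing.
        "positive_only": positive - no_negative,
        # Not ruled out, but nothing positively ties the term to humans.
        "no_negative_only": no_negative - positive,
    }


# Placeholder IDs, made up for illustration.
positive = {"EX:0001", "EX:0002", "EX:0004"}
no_negative = {"EX:0001", "EX:0002", "EX:0003"}

result = compare_methods(positive, no_negative)
print(sorted(result["positive_only"]))
```

Under that reading, the `positive_only` bucket is the short list worth reviewing by hand, while `no_negative_only` is expected to be large because most terms carry no explicit positive evidence.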

fbastian commented Jun 19, 2015

Hey, this is really great! I wanted to use "only_in_taxon" and "present_in_taxon" to infer "positive evidence", but never had the time. It's also nice to use the xrefs. The only problem is that it identified more than 4,000 terms across the full Uberon; I don't want to believe there is so much work to do :p

I think @cmungall will easily work out how to fix the taxon constraints for the 8 terms you identified.
