Subset with non-specific terms excluded #1133

Closed
dhimmel opened this Issue Jul 18, 2015 · 3 comments

dhimmel commented Jul 18, 2015

I am interested in excluding terms that are too broad to be biologically meaningful. Examples include anatomical entity, organ part, reproductive structure, and potentially even digestive system part.

I saw that GO has a "do not annotate" term set, defined as "the set of high level terms that are useful for grouping, but should have no direct annotations". Is there anything like this for Uberon? If not, is there an easy way to accomplish this?

Finally, are there any slim subsets that remove redundant nodes and achieve broad coverage for a given specificity level?

Member

cmungall commented Jul 18, 2015

These two antislims will get you a long way

subsetdef: upper_level "abstract upper-level terms not directly useful for analysis"
subsetdef: non_informative "abstract class brought in to group ontology classes but not informative"

Can you give an example of redundant nodes? Formally nothing is redundant, but some classes buy you nothing for certain purposes; these should largely go into the grouping_class subset (let us know if any are missing).

You may also want to exclude any subclasses of anatomical space. These are useless for gene expression (but potentially useful for phenotype: e.g. lumen volume)
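
For anyone wanting to apply these antislims programmatically, here is a minimal sketch of filtering on the `subset:` tag. It uses only the Python standard library and a toy OBO fragment; the `EX:` IDs and the subset assignments in the sample are invented for illustration and are not taken from Uberon, and the helper names `parse_terms` and `keep_informative` are hypothetical. A real run would read the Uberon OBO file from disk instead of `SAMPLE_OBO`.

```python
# Subsets whose members should be dropped (names taken from the comment above).
EXCLUDED_SUBSETS = {"upper_level", "non_informative"}

# Toy OBO fragment. IDs and subset assignments are made up for illustration.
SAMPLE_OBO = """\
[Term]
id: EX:0000001
name: anatomical entity
subset: upper_level

[Term]
id: EX:0000002
name: organ part
subset: non_informative
subset: some_other_subset

[Term]
id: EX:0000003
name: liver
"""


def parse_terms(obo_text):
    """Yield one dict per [Term] stanza with its id, name, and subsets."""
    term = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts: emit the previous term, if any.
            if term is not None:
                yield term
            # Only [Term] stanzas are collected; others (e.g. [Typedef]) are skipped.
            term = {"id": None, "name": None, "subsets": set()} if line == "[Term]" else None
        elif term is not None and ": " in line:
            tag, value = line.split(": ", 1)
            value = value.split(" ! ", 1)[0].strip()  # drop trailing OBO comments
            if tag in ("id", "name"):
                term[tag] = value
            elif tag == "subset":
                term["subsets"].add(value)
    if term is not None:
        yield term


def keep_informative(terms, excluded=EXCLUDED_SUBSETS):
    """Return the terms that belong to none of the excluded subsets."""
    return [t for t in terms if not (t["subsets"] & excluded)]


kept = keep_informative(parse_terms(SAMPLE_OBO))
print([t["name"] for t in kept])
```

Excluding subclasses of anatomical space would need the `is_a` hierarchy as well, which this sketch does not parse.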

dhimmel commented Jul 18, 2015

Thanks @cmungall -- these subsets should fulfill my needs.

By redundant, I meant nodes that subsume each other. Are there ways to get a set of only independent terms?
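
To make the question concrete: one possible reading of "independent" is a set in which no term is an ancestor of any other. A minimal sketch of that reading, assuming the hierarchy is available as a child-to-parents mapping; the single-letter term names and the toy hierarchy are placeholders, not Uberon relations, and keeping the most specific terms (rather than the most general) is just one choice.

```python
def ancestors(term, parents):
    """Return all transitive ancestors of `term` in a child -> parents mapping."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def independent_terms(selected, parents):
    """Drop every selected term that subsumes (is an ancestor of) another selected term."""
    selected = set(selected)
    subsuming = set()
    for term in selected:
        subsuming |= ancestors(term, parents) & selected
    return selected - subsuming


# Toy hierarchy (placeholder names): c is_a b, b is_a a, d is_a a.
toy_parents = {"b": ["a"], "c": ["b"], "d": ["a"]}
print(sorted(independent_terms({"a", "c", "d"}, toy_parents)))
```

Here `a` is dropped because it subsumes both `c` and `d`, which are unrelated to each other and so both stay.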

dhimmel commented Aug 3, 2015

Thanks, I created a subset of 402 Uberon terms tailored for my application. As suggested, I removed terms in the upper_level and non_informative subsets.

@dhimmel dhimmel closed this Aug 3, 2015
