reconcile OBO/OWL legal term names with our current label name restrictions #1113

Closed
bkmartinjr opened this issue Jan 16, 2020 · 2 comments

@bkmartinjr
Contributor

The current code does not allow the user to select a (legal) ontology term if it contains certain non-alphanumeric characters (e.g., a comma or dash). Anything in the ontology file should be legal, which means any Unicode code point. In addition, this will require validating that CSV escaping is handled correctly when saving and loading.

Notes from discussion on desired approach:
1 - We still validate user input when they enter a label (i.e., not an ontology term), and still restrict it to a subset of all characters.
2 - When they enter an ontology term, we use the exact term, even if it contains characters outside the set used to validate de novo terms in item 1.
3 - To make these behaviors as similar as possible, we will add ( ) - to the set of characters allowed in user-entered labels (see item 1).
4 - CSV escaping will be properly implemented for load and save.

End result:

  • any term in the ontology will be selectable, loadable and savable.
  • de novo terms entered by the user, not in the ontology, will be restricted to a subset of ASCII: A-Z, a-z, 0-9, and the characters _ - ( )
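
A minimal Python sketch of the two rules above, assuming the allowed set is exactly the one listed (so spaces and commas are rejected for de novo labels). The names `DE_NOVO_LABEL_RE` and `is_legal_label` are hypothetical and are not taken from the project's code.

```python
import re

# Assumption: the character set is taken literally from this issue:
# ASCII letters, digits, underscore, dash, and parentheses.
DE_NOVO_LABEL_RE = re.compile(r"[A-Za-z0-9_\-()]+")


def is_legal_label(label, ontology_terms):
    """Return True if `label` may be used as an annotation label.

    Ontology terms are accepted verbatim, whatever characters they contain.
    Any other (de novo) label must match the restricted character set.
    """
    if label in ontology_terms:
        return True
    return DE_NOVO_LABEL_RE.fullmatch(label) is not None
```

With this shape, a term name containing commas passes only when it is present in `ontology_terms`; the same string typed as a de novo label is rejected.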

@sidneymbell - please edit if I missed something.

bkmartinjr added the bug label Jan 16, 2020
bkmartinjr added this to the 0.14.0 milestone Jan 16, 2020
@bkmartinjr
Contributor Author

An example term in the current Cell Ontology that exercises this bug:

[Term]
id: CL:0002037
name: CD2-positive, CD5-positive, CD44-positive alpha-beta intraepithelial T cell
namespace: cell
def: "Intraepithelial T cells with a memory phenotype of CD2-positive, CD5-positive, and CD44-positive." [GOC:tfm, PMID:16048545]
synonym: "memory alpha beta IEL" EXACT [PMID:16048545]
is_a: CL:0000793 ! CD4-positive, alpha-beta intraepithelial T cell
created_by: tmeehan
creation_date: 2009-10-20T02:20:43Z
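
One way to check the CSV escaping requirement against a term like this is a save/load round trip. The sketch below uses Python's standard `csv` module; `save_labels` and `load_labels` are hypothetical helper names for illustration, not the project's actual functions, and the two-column row layout is an assumption.

```python
import csv
import io


def save_labels(rows):
    """Serialize rows of strings to CSV text, quoting fields as needed."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


def load_labels(text):
    """Parse CSV text back into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text, newline="")))


term = "CD2-positive, CD5-positive, CD44-positive alpha-beta intraepithelial T cell"
rows = [["cell_0", term], ["cell_1", 'label with "quotes"']]

# The commas inside the term name must survive as part of a single field.
assert load_labels(save_labels(rows)) == rows
```

The default writer quotes any field containing a comma or a double quote, so the term name stays in one column after reloading.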

bkmartinjr modified the milestones: 0.14.0, 0.15.0 Feb 5, 2020
@sidneymbell
Contributor

Closing for now, with the caveat that long-term we're going to have to handle the inconsistency between user-created annotations and ontology-provided terms regarding commas and similar characters.
