Add lexical validation (spellchecking, punctuation, casing) to OAK #306

Open
cmungall opened this issue Oct 10, 2022 · 1 comment

cmungall commented Oct 10, 2022

Follow-up to #305.

From @gouttegd

At FlyBase we have a small pronto-based script that does some minimal validation of the text in definitions and comments:
https://github.com/FlyBase/flybase-ontology-scripts/blob/master/misc/obo_spellchecker.py
It checks spelling and obvious typos such as missing or duplicated punctuation (it started as a spell-checking script only, hence the name).
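
To make the kind of punctuation check described above concrete, here is a minimal sketch. It is not taken from obo_spellchecker.py; the function name `find_punctuation_issues` and the three rules it applies are assumptions chosen purely for illustration.

```python
import re

# Hypothetical rules, not the ones used by the FlyBase script.
# A punctuation mark immediately repeated, e.g. ",," or "..".
# Note: this would also flag a deliberate ellipsis ("...").
DUPLICATED_PUNCT = re.compile(r"([,;:.!?])\1+")
# Whitespace directly before a punctuation mark, e.g. "vein ."
SPACE_BEFORE_PUNCT = re.compile(r"\s+[,;:.!?]")


def find_punctuation_issues(text: str) -> list[str]:
    """Return a list of human-readable issue labels for one text field."""
    issues = []
    if DUPLICATED_PUNCT.search(text):
        issues.append("duplicated punctuation")
    if SPACE_BEFORE_PUNCT.search(text):
        issues.append("whitespace before punctuation")
    # A definition is assumed to end with sentence-final punctuation.
    if text and text.rstrip()[-1:] not in ".!?":
        issues.append("missing terminal punctuation")
    return issues


if __name__ == "__main__":
    for sample in ["A wing vein,, of the adult.", "A wing vein"]:
        print(repr(sample), find_punctuation_issues(sample))
```

A real checker would likely need exceptions for abbreviations, ellipses, and chemical or gene names, which is one reason to keep such rules configurable.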

cmungall changed the title from "Add lexical validation to OAK" to "Add lexical validation (spellchecking, punctuation, casing) to OAK" on Oct 10, 2022

cmungall commented Oct 10, 2022

OAK already has some of these checks, for example whitespace normalization:

$ cat tests/input/lint-test.obo
format-version: 1.2
ontology: lint-test

[Term]
id: X:1
name: test  1 ! double whitespace
def: " foo    bar  " [PMID:1] ! training spaces
$ runoak -i tests/input/lint-test.obo lint
[
{
  "id": "x",
  "old_value": "test  1 ",
  "new_value": "test 1",
  "about_node": "X:1",
  "@type": "NodeRename"
}
,
{
  "id": "x",
  "old_value": " foo    bar  ",
  "new_value": "foo bar",
  "about_node": "X:1",
  "@type": "NodeTextDefinitionChange"
}
]
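
The whitespace fix shown in this output can be sketched in a few lines of Python. This is not OAK's implementation; `normalize_whitespace` and `lint_term` are hypothetical names, and the dictionaries simply mirror the fields in the JSON above (including the placeholder `"id": "x"`).

```python
import json
from typing import Optional


def normalize_whitespace(text: str) -> str:
    """Strip leading/trailing whitespace and collapse internal runs to one space."""
    return " ".join(text.split())


def lint_term(term_id: str, name: Optional[str], definition: Optional[str]) -> list[dict]:
    """Propose change objects for a term's name and definition (illustrative only)."""
    changes = []
    fields = [(name, "NodeRename"), (definition, "NodeTextDefinitionChange")]
    for value, change_type in fields:
        if value is None:
            continue
        new_value = normalize_whitespace(value)
        if new_value != value:
            changes.append(
                {
                    "id": "x",  # placeholder id, copied from the output above
                    "old_value": value,
                    "new_value": new_value,
                    "about_node": term_id,
                    "@type": change_type,
                }
            )
    return changes


if __name__ == "__main__":
    print(json.dumps(lint_term("X:1", "test  1 ", " foo    bar  "), indent=2))
```

Emitting proposed changes instead of editing in place keeps linting and fixing separate, so the proposals can be reviewed before they are applied.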

You can then save this output to a file (changes.json in the command below) and apply it:

$ runoak -i tests/input/lint-test.obo apply --changes-format json --changes-input changes.json -o fixed.obo -O obo
$ diff fixed.obo tests/input/lint-test.obo
6,7c6,7
< name: test 1
< def: "foo bar" [PMID:1]
---
> name: test  1 ! double whitespace
> def: " foo    bar  " [PMID:1] ! training spaces
