unable to convert owl format to obo #1001

Open
lmanchon opened this issue May 12, 2022 · 6 comments

Comments

@lmanchon

Hi,

The latest version of ROBOT fails to convert this OWL file:
https://raw.githubusercontent.com/stemcellontologyresource/OSCI/master/src/ontology/osci.owl

Is there a specific parameter I need to add? Thank you.

java -jar ./robot.jar convert --input osci.owl --format obo --output osci.obo

ERROR Input ontology contains 3 triple(s) that could not be parsed:

OBO STRUCTURE ERROR Ontology does not conform to OBO structure rules:
multiple def tags not allowed. in frame:Frame(CLO:0000001 id( CLO:0000001)comment( A 'cell line cell' is a part of a cell line established through the passaging/selection of a primary cultured cells or the experimental modification of an existing cell line. New types of cell line cells are established after sufficient passaging of a primary culture to establish a stable and homogenous population that qualifies as a line (typically 1-20 passages), or following some spontaneous or experimental modification that confers novel characteristics to an existing line. A cell line cell typically has mutations of five or more genes compared to the original cell that derives the cell line cell. Some gene mutations may turn on some oncogenes. Cell line cells can be in active culture, stored in a quiescent state for future use (e.g. frozen in liquid nitrogen), or applied in experimental procedures. )name( cell line cell)property_value( IAO:0000111 cell line cell xsd:string)def( A cultured cell that is part of a cell line - a stable and homogeneous population of cells with a common biological origin and propagation history in culture)def( A cultured cell that is part of a cell line - a stable and homogeneous population of cells with a common biological origin and propagation history in culture
)property_value( IAO:0000412 http://purl.obolibrary.org/obo/clo.owl)property_value( IAO:0000117 Yongqun He, Matthew Brush, Sirarat Sarntivijai, Alexander Diehl, Jie Zheng, Yu Lin, Bjoern Peters xsd:string)property_value( IAO:0000412 http://purl.obolibrary.org/obo/obi.owl)relationship( RO:0001000 CL:0000001)is_a( OBI:0001866{{is_inferred=true} }))
For details see: http://robot.obolibrary.org/errors#obo-structure-error
Use the -vvv option to show the stack trace.
Use the --help option to see usage information.

@matentzn
Contributor

Hey @lmanchon

ERROR Input ontology contains 3 triple(s) that could not be parsed:

This has nothing to do with OBO. Just ignore it (see #829).

OBO structure:

  1. OBO goes losslessly into OWL, but not the other way around. For example, having two labels or two definitions is illegal in OBO.
  2. That said, if you add --check false you may be lucky and ROBOT will create a broken OBO file for you (one with multiple definitions, etc.). This may be OK for some use cases, but many strict OBO libraries may not be able to read an OBO file that does not conform to the OBO format.
  3. If you really have to convert an OWL ontology to legal OBO, you may have to massage the input, for example by deleting duplicate labels (a difficult problem; I think you have to write a really advanced SPARQL query to do that).
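
As a rough illustration of the "massage" step for the case in this issue, where the two def values differ only by a trailing newline, here is a minimal Python sketch that post-processes the OBO text written with --check false. The function names (drop_equivalent_defs, find_duplicate_defs) are hypothetical and not part of ROBOT, and the sketch assumes the extra newline ends up as an escaped \n inside the quoted definition. It also ignores the xref list when comparing definitions, so treat it as a starting point only.

```python
import re

# Matches the quoted text of a def tag, e.g. def: "some text" [xrefs]
DEF_RE = re.compile(r'def:\s*"((?:[^"\\]|\\.)*)"')


def _def_key(line):
    """Normalize a def line so that whitespace-only differences compare equal."""
    match = DEF_RE.match(line)
    text = match.group(1) if match else line
    # Assumption: a stray newline is serialized as the two characters \n.
    return " ".join(text.replace("\\n", " ").split())


def drop_equivalent_defs(obo_text):
    """Within each stanza, keep only the first of several equivalent def lines."""
    kept = []
    seen = set()
    for line in obo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):  # stanza header such as [Term]
            seen = set()
        if stripped.startswith("def:"):
            key = _def_key(stripped)
            if key in seen:
                continue  # same definition already kept for this stanza
            seen.add(key)
        kept.append(line)
    return "\n".join(kept) + "\n"


def find_duplicate_defs(obo_text):
    """Return {stanza id: [def lines]} for stanzas that still have several defs."""
    offenders = {}
    # Stanzas are separated by blank lines.
    for stanza in re.split(r"\n\s*\n", obo_text.strip()):
        lines = [ln.strip() for ln in stanza.splitlines()]
        ids = [ln[len("id:"):].strip() for ln in lines if ln.startswith("id:")]
        defs = [ln for ln in lines if ln.startswith("def:")]
        if ids and len(defs) > 1:
            offenders[ids[0]] = defs
    return offenders
```

Running drop_equivalent_defs on the file and then find_duplicate_defs on the result lists the stanzas that carry genuinely different definitions; those still need a manual decision about which definition to keep.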

@matentzn
Contributor

You also asked elsewhere where to get OBO files from. Many OBO Foundry ontologies provide OBO format outputs, but to be honest, most use --check false. There is no guarantee they are standard-conformant.

For downstream processing I recommend using ROBOT table generation processes: https://oboacademy.github.io/obook/tutorial/sparql-report-robot/

@lmanchon
Author

So there is nothing obvious for the OWL --> OBO conversion.
That's a problem: the Chado schema and the Tripal module only recognize the OBO format.

@matentzn
Contributor

To be honest, I think this would be a great ROBOT feature. I struggle with this so much as well. I have been debugging dozens of OBO format violations.

@lmanchon
Author

It's a problem with all these file formats.
There should be a single standard format, why not OWL?
And drop the OBO format.

@jamesaoverton
Member

The OBO community requires all projects to publish an OWL file in RDF/XML format as their primary release product. All other products are optional. If you use OBO community projects, building your tools to use OWL in RDF/XML format makes good sense, so why doesn't everybody do that?

  1. A number of projects predate the OWL specification, and/or still rely on OBO-format toolchains.

  2. OBO-format is pretty simple to read and write, and there are libraries to work with it in a number of different languages, including Java, Rust, and Python. On the other hand, the only thorough implementation of OWL that I'm aware of has been OWLAPI, which is limited to Java and the JVM. That is starting to change with horned-owl (Rust), and with early work I've been pushing to encode OWL logic in JSON ("wiring") so it can be inserted into SQL (ldtab.clj), and variations such as semantic-sql.
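
To give a feel for why OBO-format is considered simple to read, here is a small Python sketch that collects the tag-value pairs of each [Term] stanza using only the standard library. The function name parse_obo_terms is hypothetical, and the sketch deliberately skips parts of the format (escape sequences, trailing modifiers, trailing "!" comments, non-Term stanzas), so it is not a replacement for one of the existing OBO libraries.

```python
from collections import defaultdict


def parse_obo_terms(obo_text):
    """Map each [Term] id to a dict of tag -> list of raw string values."""
    stanzas = []
    current = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("!"):
            continue  # skip blank lines and whole-line comments
        if line.startswith("[") and line.endswith("]"):
            # A new stanza starts; only [Term] stanzas are collected here.
            current = defaultdict(list) if line == "[Term]" else None
            if current is not None:
                stanzas.append(current)
            continue
        if current is None:
            continue  # header lines or stanzas of another type
        # Split on the first colon only, since values such as ids contain colons.
        tag, sep, value = line.partition(":")
        if sep:
            current[tag].append(value.strip())
    return {s["id"][0]: dict(s) for s in stanzas if s.get("id")}
```

Because every tag maps to a list, a term with two def values (as in the error above) shows up as a list of length two rather than being silently overwritten.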
