error writing to sbml without prefixing identifiers #883

Karrenbelt · 2019-07-25T13:01:26Z

Trying to prevent cobra from prefixing my identifiers when writing SBML, I got this (using the textbook model, model history error is unrelated):

cobra.io.write_sbml_model(model, 'test.xml', f_replace={})

Error encountered trying to <set model history>.
LibSBML error code -5: The object passed as an argument to the method is not of a type that is valid for the operation or kind of object involved. For example, handing an invalidly-constructed ASTNode to a method expecting an ASTNode will result in this error.
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/Zarathustra/ETH/repositories/sbml_tools/venv/lib/python3.6/site-packages/cobra/io/sbml.py", line 831, in write_sbml_model
    doc = _model_to_sbml(cobra_model, f_replace=f_replace, **kwargs)
  File "/Users/Zarathustra/ETH/repositories/sbml_tools/venv/lib/python3.6/site-packages/cobra/io/sbml.py", line 1059, in _model_to_sbml
    gpa.setAssociation(gpr_new)
UnboundLocalError: local variable 'gpr_new' referenced before assignment

replacing gpr_new with gpr in these lines should fix this

The text was updated successfully, but these errors were encountered:

matthiaskoenig · 2019-07-25T13:12:58Z

I basically fixed this yesterday in the devel branch as part of other bugfixes ;). Please try the devel version.
Do in your virtualenv

(cobra) git clone https://github.com/opencobra/cobrapy.git
(cobra) cd cobrapy
(cobra) git checkout devel
(cobra) pip install -e .

This will be resolved with the next release.

Karrenbelt · 2019-07-25T13:29:26Z

it writes, but reading back it shows to be erroneous

cobra.io.read_sbml_model('test.xml')

Traceback (most recent call last):
  File "/Users/Zarathustra/ETH/repositories/sbml_tools/cobrapy/cobra/io/sbml.py", line 210, in read_sbml_model
    **kwargs)
  File "/Users/Zarathustra/ETH/repositories/sbml_tools/cobrapy/cobra/io/sbml.py", line 381, in _sbml_to_model
    sid = _check_required(specie, specie.getIdAttribute(), "id")
  File "/Users/Zarathustra/ETH/repositories/sbml_tools/cobrapy/cobra/io/sbml.py", line 1194, in _check_required
    raise CobraSBMLError(msg)
cobra.io.sbml.CobraSBMLError: Required attribute 'id' cannot be found or parsed in '<Species "3-Phospho-D-glyceroyl phosphate">'. with name '3-Phospho-D-glyceroyl phosphate'
None
Traceback (most recent call last):
  File "/Users/Zarathustra/ETH/repositories/sbml_tools/cobrapy/cobra/io/sbml.py", line 210, in read_sbml_model
    **kwargs)
  File "/Users/Zarathustra/ETH/repositories/sbml_tools/cobrapy/cobra/io/sbml.py", line 381, in _sbml_to_model
    sid = _check_required(specie, specie.getIdAttribute(), "id")
  File "/Users/Zarathustra/ETH/repositories/sbml_tools/cobrapy/cobra/io/sbml.py", line 1194, in _check_required
    raise CobraSBMLError(msg)
cobra.io.sbml.CobraSBMLError: Required attribute 'id' cannot be found or parsed in '<Species "3-Phospho-D-glyceroyl phosphate">'. with name '3-Phospho-D-glyceroyl phosphate'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Users/Zarathustra/ETH/repositories/sbml_tools/cobrapy/cobra/io/sbml.py", line 217, in read_sbml_model
    "Something went wrong reading the SBML model. Most likely the SBML"
cobra.io.sbml.CobraSBMLError: Something went wrong reading the SBML model. Most likely the SBML model is not valid. Please check that your model is valid using the `cobra.io.sbml.validate_sbml_model` function or via the online validator at http://sbml.org/validator .
	`(model, errors) = validate_sbml_model(filename)`
If the model is valid and cannot be read please open an issue at https://github.com/opencobra/cobrapy/issues .

same when using libsbml to read

doc = libsbml.readSBMLFromFile('test.xml')
doc.printErrors() # 16 in total

Midnighter · 2019-07-25T14:27:59Z

I think this is a different error and a duplicate of #879. Basically, the model you were using doesn't have correct SBML identifiers and those are dropped upon writing the model thus the model is broken on import.

cdiener · 2019-07-25T16:33:45Z

Yeah, most common pitfall is that you have identifiers that start with numbers (common for metabolites) and this is not allowed in SBML (and is why we add a to IDs).

matthiaskoenig · 2019-07-26T07:30:54Z

Not sure what to do here. There are clear error messages on writing the SBML that the identifiers are incorrect. We agreed on only reading valid SBML.
Only solution here is to change the invalid identifiers if you want to write them out.

We could be more drastic and just not create a SBML file at all on write errors like identifiers. Or restrict the allowed identifiers within cobrapy (nobody will like that most of all the bigg database).

Not sure what the issue is here. Please let me know if there is anything to fix.

cdiener · 2019-07-26T16:15:19Z

I would say there is a slightly incorrect behavior in writing the initial SBML. It seems to have invalid identifiers but no error is given. Writing a SBML model with invalid identifiers should nope out and make it clear that this is not right.

Midnighter · 2019-07-28T14:27:26Z

I agree, invalid identifiers is definitely an error that should not only be a logged error level message but actually raise an exception and not write the file.

gregmedlock · 2019-07-28T15:57:58Z

Maybe we could steer users in the right direction with a function that checks SBML compliance prior to writing, so that what we end up putting in documentation is more like:

# check SBML compliance
cobra.io.check_sbml_model(model) # prints any compliance issues

# write the compliant SBML model
cobra.io.write_sbml_model(model)

check_sbml_model could even return a "validated model" that is the only acceptable input to write_sbml_model if we wanted to be really strict... check_sbml_model could also have output that reads more like a report than an error, telling the user what they need to fix to be compliant.

I think only writing fully valid SBML files, plus adding a checking function of some sort that we recommend use of prior to writing, is the way to go. For cases like BIGG, IMO identifiers should be changed on a case-by-case basis for SBML compliance and/or shared as json.

matthiaskoenig · 2019-07-29T07:21:20Z

My personal take on this is to just make the used ids SBML compliant, i.e., if somebody tries to set an id on a cobra model which starts with a number the user should get a warning/error and the replacement should be performed right there and then.

Having invalid SBML ids internally in cobra creates many issues, e.g.,

things like tab completion are not working (Prepend underscores for tab completion of IDs that start with a number #828)
mapping data on networks becomes cumbersome, because one has to keep track of two set of ids (one internally in cobrapy and working with python, a second set of ids as soon as using the SBML model in a different tool; this hampers interoperability; makes it complicated to do simple things like hashmap lookups, because one has to perform possible id replacements everywhere and so on)
SBML export becomes slower and much more complicated (because one has to keep track of id replacements all the way through the model)
default SBML export behavior becomes strange (why are ids replaced by default)? I look for my cobrapy id in my SBML model but cannot find it, because strange replacements were made.

I would prefer the following behavior:

enforce (or highly recommend the use of) SBML compliant ids internally in cobrapy
add a convenience function allowing to make all ids SBML compliant
SBML export with invalid SBML ids does raise an error
id replacement is not the norm on SBML export any more (but can be enabled if people want this, for instance if some tools expect M_, R_ or G_ prefixes)

cdiener · 2019-07-29T17:07:37Z

@gregmedlock I think this woul be great but may be hard to implement given the sheer size of the SBML spec. The only way that seems simple would be to write the broken model and use a validator on the written SBML file (maybe @matthiaskoenig) knows if there is functionality for that in libSBML (something like the online validator).

@matthiaskoenig That sounds good. I think due to backwards compatibility we can't enforce those IDs but we could nudge users to use good ones (even though a warning may be too strong since non-SBML compliant IDs are valid in other export formats such as JSON and YAML). The rest sounds great too me (error on write and optional replacement).

gregmedlock · 2019-07-30T14:48:15Z

@cdiener agreed--I should have been more specific. The proposed function would only check compliance for SBML aspects that are already checked when writing the file via libSBML (e.g., IDs).

The idea is that users should be free to modify IDs for their own use, but should hit a hard roadblock when they attempt to write SBML without making the IDs compliant. Of course we should encourage compliance from the beginning, but I think throwing errors when a user sets a non-compliant ID is going to scare people away from the package. In my experience, most people struggle to understand the dire need for SBML when they first start using COBRA tools. For that reason, I favor warnings over errors when non-compliant IDs are set (and when a model containing non-compliant IDs is read). E.g., the suggestions by @matthiaskoenig with strong preference for warnings over errors when IDs are first set/read.

cdiener · 2019-07-30T16:40:24Z

@gregmedlock Very much agreed. To clarify I would only throw an error if the user enforces no prefixes and has invalid SBML IDs. The default would still be to add prefixes if the IDs are not compliant and just show a warning.

akaviaLab · 2019-07-30T23:05:30Z

I ran across this problem or a similar one. What about implementing something similar to https://github.com/opencobra/cobratoolbox/blob/develop/src/base/io/utilities/SBML/convertSBMLID.m
This is needed for various reasons, but since both read and write SBML call it, the user can't even tell that anything is happening.
If this is okay with you guys, I can take a stab at it.

matthiaskoenig · 2019-07-31T07:31:07Z

@akaviaLab This is happening right now. Code is part of the current behavior and is already implemented.

We are only talking about the very special use case in this issue of "someone has invalid identifiers but does not want the id replacements on writing SBML. What should we do in this case?"

matthiaskoenig · 2019-07-31T07:35:05Z

@akaviaLab In addition, ID replacement is a nightmare for interoperability between tools. Whenever possible tools should highly encourage to use valid ids. It becomes very difficult to map datasets onto networks if the ids change between formats. E.g. I run a flux simulation in cobra and want to visualize this with the SBML network. The ids of the flux simulation do not match the ids of the SBML exchange format any more.

Karrenbelt · 2019-07-31T08:00:40Z

naming conventions are more often a pain than they ought to be.
I agree with matthiaskoenig and cdiener, it is best to only throw an error in the special case someone does not want identifier replacement and thereby would create an invalid SBML model.

akaviaLab · 2019-08-01T21:55:52Z

@matthiaskoenig - the code does not replace the GIDs when reading old-style SBML files that have the gene association in NOTES field, like Recon 2.2. See example file exampleModel.xml.gz, where if you read SBML and try to write SBML, you get an error because the gene IDs have ":".
I think the SBML needs a function like the MATLAB Cobratoolbox I linked to. It might work with GPRcleaner, but it would need to be extended to colons - I also think it is a good idea that the replacement is the same in Cobrapy and cobatoolbox.
If you'd like, I can open a different bug for this issue.

Midnighter · 2019-08-03T16:38:53Z

If someone feels like mediating, coordinating with the Matlab folks to standardize on how identifiers are transformed could be a useful thing?

Other than that, I take from this issue that we need to:

Do transformations for GPR read from notes.
Raise an error upon writing invalid SBML.
Anything else I missed now?

matthiaskoenig closed this as completed Jul 25, 2019

Midnighter reopened this Jul 25, 2019

Midnighter added the SBML Related to reading and writing SBML models. label Jul 25, 2019

matthiaskoenig self-assigned this Jul 29, 2019

akaviaLab mentioned this issue Aug 21, 2019

Sbml improvements - notes, gene IDs, group IDs, and test #901

Merged

mmp3 mentioned this issue May 3, 2021

cannot load final model files: Required attribute 'id' cannot be found or parsed in... maureencarey/paradigm#4

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

error writing to sbml without prefixing identifiers #883

error writing to sbml without prefixing identifiers #883

Karrenbelt commented Jul 25, 2019 •

edited

Loading

matthiaskoenig commented Jul 25, 2019 •

edited

Loading

Karrenbelt commented Jul 25, 2019 •

edited

Loading

Midnighter commented Jul 25, 2019

cdiener commented Jul 25, 2019

matthiaskoenig commented Jul 26, 2019

cdiener commented Jul 26, 2019

Midnighter commented Jul 28, 2019

gregmedlock commented Jul 28, 2019

matthiaskoenig commented Jul 29, 2019

cdiener commented Jul 29, 2019

gregmedlock commented Jul 30, 2019

cdiener commented Jul 30, 2019

akaviaLab commented Jul 30, 2019

matthiaskoenig commented Jul 31, 2019

matthiaskoenig commented Jul 31, 2019

Karrenbelt commented Jul 31, 2019

akaviaLab commented Aug 1, 2019

Midnighter commented Aug 3, 2019

error writing to sbml without prefixing identifiers #883

error writing to sbml without prefixing identifiers #883

Comments

Karrenbelt commented Jul 25, 2019 • edited Loading

matthiaskoenig commented Jul 25, 2019 • edited Loading

Karrenbelt commented Jul 25, 2019 • edited Loading

Midnighter commented Jul 25, 2019

cdiener commented Jul 25, 2019

matthiaskoenig commented Jul 26, 2019

cdiener commented Jul 26, 2019

Midnighter commented Jul 28, 2019

gregmedlock commented Jul 28, 2019

matthiaskoenig commented Jul 29, 2019

cdiener commented Jul 29, 2019

gregmedlock commented Jul 30, 2019

cdiener commented Jul 30, 2019

akaviaLab commented Jul 30, 2019

matthiaskoenig commented Jul 31, 2019

matthiaskoenig commented Jul 31, 2019

Karrenbelt commented Jul 31, 2019

akaviaLab commented Aug 1, 2019

Midnighter commented Aug 3, 2019

Karrenbelt commented Jul 25, 2019 •

edited

Loading

matthiaskoenig commented Jul 25, 2019 •

edited

Loading

Karrenbelt commented Jul 25, 2019 •

edited

Loading