-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cannot export SBML when annotation is a list of lists #736
Comments
This will be solved as part of #684 |
This is a JSON import bug, not an SBML related bug. When reading the json the list of lists must be converted in correct annotations. |
I also had this issue when I was trying to parse universal_model to sbml which still shows annotations in the form of list of lists. But when I checked the json_schema for annotations, it specify annotations to be of type object :
And performing a validation check over this json using the corresponding json_schema do points the error :
Error :
So I think json_scheme is upto date but the error exist with the models where annotations are list of lists. Because other models like e_coli_core where annotations values are json objects passes the validation test. |
A JSON object is not the same as an object in Python but rather a something like a dict (https://restfulapi.net/json-objects/). @zakandrewking when you said that the JSON schema allows annotations as list which source did you refer to? The one currently implemented in cobrapy does not seem to allow it. |
I see this in a similar light to you, Christian. To me what should possibly happen in a next version of the JSON schema is:
An example of an annotation is then: "annotation": {
"bigg.reaction": [["is", "PFK26"]],
"kegg.reaction": [["is", "R02732"]],
"rhea": [["is", "15656"]]
} |
This looks good to me and has the advantage that is already valid under the current schema which makes it backwards compatible. |
I just tested the steps from my original posting, and they are still the case with COBRA 0.17.1. It's not that the schema is wrong but that COBRApy doesn't stop you from using it this way. |
Oh I see so the problem is that loading it does not throw an error because it should not pass the schema. |
That was also one thing that confused me. Though the json_schema is defined in cobra.core.json.py file, it is nowhere used as a validator to validate the model which we are loading from the external json file. The load_json_model(filename) method of cobra.core.json.py directly parse the json into dictionary without any validation.The above error which I caught came when I myself checked the json against the json_schema. If validation is done while parsing the model from the json file, the 'list of lists' error or any other possible error will be caught there only instead of it being rising at the time of making of sbml or any other kind of model. |
@Hemant27031999 yeah I saw the the same thing. I think this is due to schema validation being really slow and reading JSON is currently really fast. Also most validation is performed in |
Yeah, I think that can be the best thing to do. Not only it will stop the complete validation check on the json model using the schema that can make the process slow, it will even catch the error of "list of lists" while reading the model from the json file. |
Sure go ahead. The |
I am working on my proposal for GSoC 2020 where I have to add the complete support for annotation in JSON format which, right now, have many issues. I have gone through the SBML level 3, version 2 specification doc for annotation in section 6 and have figured out following three feasible data format in SBML-JSON form. SBML Annotation :
Format 1 :
Format 2 :
Format 3:
Out of the above three formats, format 2 is somewhat similar to one discussed above (suggested by @Midnighter ) in above discussion. This format is good, but it has a repetition of biological identifier which makes it a little bit lengthy. Format 1 has identifier as key and list of resources as value. It looks a little more compact and also follow structure similar to one followed in SBML itself i.e resources listed under biological qualifier name. Format 3 is also compact but it looks a little scattered. |
Awesome, thank you for putting in the extra effort here @Hemant27031999. I have opted for format 2 in the pydantic models created in cobra-component-models. In particular, you can look at the definition of the class for annotations which is used in the common base class for species, compartments, and reactions. I opted for this slightly more verbose option because it is closest to the current JSON schema. I'm not a fan of format 1 at all because separating namespace and identifier will require a lot of string manipulation and it is the biggest change from the existing format. |
In my case, I just get rid of annotations completely:
|
This has derailed a bit into discussing the new annotation format. The original issue has been fixed by enforcing the schema. |
Problem description
The JSON format allows annotations as lists of lists, but trying to export such a model as SBML causes an error.
This is the case for the BiGG Models universal model (http://bigg.ucsd.edu/data_access). If that model is incorrectly formatted, I can fix it, but it might be good to also add some warning in COBRApy.
Might be related to #580. Originally reported as SBRG/bigg_models#299.
(Thanks!!!!)
Code Sample
Model:
Script:
Actual Output
Expected Output
Valid SBML model.
Dependency Information
Please paste the output of
python -c "from cobra import show_versions; show_versions()"
in the details block below.System Information
OS Darwin
OS-release 17.6.0
Python 3.6.5
Package Versions
cobra 0.13.0
depinfo 1.3.0
future 0.16.0
numpy 1.14.2
optlang 1.4.2
pandas 0.22.0
pip 10.0.1
ruamel.yaml 0.14.12
setuptools 39.2.0
six 1.11.0
swiglpk 1.4.4
tabulate 0.8.2
wheel 0.31.1
The text was updated successfully, but these errors were encountered: