Metadata fbc3 group #988

Open
wants to merge 81 commits into
base: devel

Conversation

Hemant27031999
Contributor

@Hemant27031999 Hemant27031999 commented Aug 15, 2020

Description of Features

  1. Metadata Classes: A separate directory for handling metadata has been created inside the cobra/core directory. Every object derived from SBase can have meta information such as annotation (CVTerms), notes, and history attached to it.
  • The Notes class holds a string containing the notes data (an XHTML string) and a dictionary, synchronized with the notes string, that stores the key-value pairs of the form <p> key : value </p> present in the notes string. Users can only modify these existing key-value pairs; they cannot add new ones, because notes are not the right place to store key-value data.
  • The CVTerms class stores externally linked resources for each component derived from SBase. This class maintains the new-format and the old-format annotation simultaneously and keeps them synchronized: changing one modifies the other accordingly. It can handle any type of annotation data, including nested and alternative annotations. It can read old-format as well as new-format annotation data from JSON and other formats. When the model is written back, the new format is used because it contains the complete data, organized in the same way as in SBML.
  • The History class, used for storing the history, validating dates, etc., is now attached to each component derived from SBase.
  • The KeyValuePair class stores key-value pairs as defined by fbc-v3.

The last three metadata objects (i.e., CVTerms, History, KeyValuePair) live inside a single attribute of the SBase (Object) class and can be accessed via the object.annotation.cvterms, object.annotation.history and object.annotation.key_value_data attributes. Accessing the annotation attribute itself (object.annotation) returns the annotation data in the old format, which keeps it backward compatible.
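The Notes behaviour described above (a notes string plus a synchronized view of its `<p> key : value </p>` entries, where existing pairs can be modified but new ones cannot be added) can be sketched as follows. This is a minimal illustration under stated assumptions, not the class from this PR: `NotesSketch`, `pairs` and `update` are hypothetical names, and the regular expression only handles the simple `<p> key : value </p>` form.

```python
import re

# Matches one "<p> key : value </p>" entry inside an XHTML notes string.
_PAIR = re.compile(r"<p>\s*(?P<key>[^:<]+?)\s*:\s*(?P<value>[^<]*?)\s*</p>")


class NotesSketch:
    """Hypothetical stand-in for a notes container (not the PR's Notes class)."""

    def __init__(self, notes_xhtml):
        self.notes_xhtml = notes_xhtml

    @property
    def pairs(self):
        # Derived from the string on every access, so it cannot drift out of sync.
        return {m["key"]: m["value"] for m in _PAIR.finditer(self.notes_xhtml)}

    def update(self, key, value):
        # Only existing keys may be modified; new pairs are rejected.
        if key not in self.pairs:
            raise KeyError(f"{key!r} is not present in the notes")
        pattern = re.compile(
            r"(<p>\s*" + re.escape(key) + r"\s*:\s*)[^<]*?(\s*</p>)"
        )
        # A function replacement avoids backslash handling in the new value.
        self.notes_xhtml = pattern.sub(
            lambda m: m.group(1) + value + m.group(2), self.notes_xhtml, count=1
        )


notes = NotesSketch(
    "<html><p> GENE_ASSOCIATION : b0001 </p><p> SUBSYSTEM : Glycolysis </p></html>"
)
notes.update("SUBSYSTEM", "Pentose phosphate")
```

Deriving the dictionary from the string on each access is one simple way to keep the two views synchronized; a real implementation may cache or parse the XHTML properly instead.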

  2. Group to JSON: Support for the group package is extended to JSON.

  3. JSON schema v2: Version 2 of the JSON schema has been added. It defines the new-format annotation, history, key-value pairs, notes, group package data, user-defined constraints data and basic SBML info.

Issues Fixed

Tests

Tests for all the newly implemented features have been added to check their functionality. A few old tests have been modified accordingly. Some tests that were initially marked 'xfail' now pass due to the modified formats.

@codecov-commenter

codecov-commenter commented Aug 26, 2020

Codecov Report

Merging #988 into devel will decrease coverage by 1.31%.
The diff coverage is 75.92%.

@@            Coverage Diff             @@
##            devel     #988      +/-   ##
==========================================
- Coverage   84.45%   83.13%   -1.32%     
==========================================
  Files          58       64       +6     
  Lines        5036     5888     +852     
  Branches     1092     1276     +184     
==========================================
+ Hits         4253     4895     +642     
- Misses        508      660     +152     
- Partials      275      333      +58     
Impacted Files Coverage Δ
src/cobra/core/formula.py 25.00% <ø> (ø)
src/cobra/core/metabolite.py 70.65% <ø> (ø)
src/cobra/core/reaction.py 88.02% <ø> (ø)
src/cobra/flux_analysis/deletion.py 93.33% <ø> (ø)
src/cobra/flux_analysis/loopless.py 91.11% <ø> (ø)
src/cobra/flux_analysis/variability.py 92.78% <ø> (ø)
src/cobra/sampling/achr.py 100.00% <ø> (ø)
src/cobra/sampling/optgp.py 96.92% <ø> (ø)
src/cobra/summary/metabolite_summary.py 89.65% <ø> (ø)
src/cobra/summary/model_summary.py 86.66% <ø> (ø)
... and 28 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ad49414...010bb44. Read the comment docs.

@matthiaskoenig
Contributor

@Hemant27031999 I went over most of the code and cleaned it up. It was much more work than expected, but most of the metadata is now pretty clean and neat. I also updated to the latest develop code.

Could you

  • add the cvterms.to_dict method, which will handle the serialization of the cvterms. I added similar methods on the other metadata objects. You can currently see in the two failing tests where this is still missing.
  • update the `metadata.ipynb` notebook I started for the documentation of the metadata features. Here you should add the common use cases, i.e., adding CVTerms, adding ModelHistory, modifying notes, storing information as key-value pairs, ...

@Hemant27031999
Contributor Author

@matthiaskoenig I have made the requested changes. All tests are passing now.

@Hemant27031999
Contributor Author

Hey people, metadata and UserDefinedConstraints are two separate functionalities, so I have removed UserDefinedConstraint from this branch. It now mainly includes metadata and some JSON serialization. I will link another PR for the Constraint class.

Comment on lines 156 to 158
cvterms = data["cvterms"] if "cvterms" in data else None
history = data["history"] if "history" in data else None
keyValueDict = data["history"] if "key_value_data" in data else None

Would it be possible to define string literals as constants? Repeating such literal expressions is a possible source of errors (because of typos) that can be eliminated relatively easily.
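As a minimal sketch of what this suggestion could look like for the quoted lines: the constant names and the helper `split_annotation_data` below are hypothetical and not part of the PR. With named constants, a misspelled key becomes a `NameError` instead of a silently missing entry, and `dict.get` returns `None` for absent keys, which matches the `x if key in data else None` pattern. Note also that the third quoted line reads `data["history"]` while testing for `"key_value_data"`, which may be the kind of slip this pattern helps to avoid.

```python
# Hypothetical constant names; the PR itself does not define these.
KEY_CVTERMS = "cvterms"
KEY_HISTORY = "history"
KEY_KEY_VALUE_DATA = "key_value_data"


def split_annotation_data(data):
    """Return the three optional metadata entries, or None where a key is absent."""
    cvterms = data.get(KEY_CVTERMS)
    history = data.get(KEY_HISTORY)
    key_value_data = data.get(KEY_KEY_VALUE_DATA)
    return cvterms, history, key_value_data


example = {"history": {"creators": []}, "key_value_data": [{"key": "k", "value": "v"}]}
cvterms, history, key_value_data = split_annotation_data(example)
```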

Comment on lines 165 to 166
if "sbo" in data:
annotation["sbo"] = [data["sbo"]]

Here also: Please avoid using String literals repeatedly.

@@ -180,7 +186,7 @@ def _update_optional(cobra_object, new_dict, optional_attribute_dict,
 def metabolite_to_dict(metabolite):
     new_met = OrderedDict()
     for key in _REQUIRED_METABOLITE_ATTRIBUTES:
-        if key == 'id':
+        if key == "id":

Again... Define as a String constant if possible.

         cobra_members.append(cobra_obj)
     elif member["type"] == "Gene":
         cobra_obj = model.genes.get_by_id(
-            F_REPLACE["F_GENE"](member["idRef"]))
+            F_REPLACE["F_GENE"](member["idRef"])

I just noticed that there are many string literals all over the source code. I will not add further remarks about this. If possible, please go through the source code, identify string literals and replace them with named constants.

@@ -295,7 +295,7 @@ def constraint_from_expression(id=None, expression: 'str' = '',
     tree = ast.parse(source=expression, mode='eval')
     compute_nodes = UserDefinedConstraint.ComputeNumericNodes()
     tree = compute_nodes.visit(tree)
-    print(ast.dump(tree))
+    print((ast.dump(tree)))

Why are additional parentheses needed?

Contributor Author


I am not sure whether I added them or the code formatter did. Either way, I will fix it. This file is actually part of the other PR now; it is no longer on this branch.

Comment on lines 74 to 86
result = Num(n=node.left.n + node.right.n)
return copy_location(result, node)
elif isinstance(node.op, Sub):
result = Num(n=node.left.n - node.right.n)
return copy_location(result, node)
elif isinstance(node.op, Mult):
result = Num(n=node.left.n * node.right.n)
return copy_location(result, node)
elif isinstance(node.op, Div):
result = Num(n=node.left.n / node.right.n)
return copy_location(result, node)
elif isinstance(node.op, Mod):
result = Num(n=node.left.n % node.right.n)

Please add blanks between variables and operators, e.g., change n=xyz to n = xyz.

Contributor Author


The Python code formatter "black" does not put spaces around the "=" sign when a value is passed as a keyword argument.
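For context, the folding of numeric nodes in the quoted lines can be sketched in a more compact form. This is an illustration under stated assumptions, not the PR's implementation: `FoldNumericNodes` and `fold` are hypothetical names, `ast.Constant` is used in place of `Num` (which has been deprecated since Python 3.8), and `ast.unparse` requires Python 3.9 or later. The keyword arguments are written without spaces around `=`, as black and PEP 8 format them.

```python
import ast
import operator

# Map AST operator node types to the functions that evaluate them.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}


class FoldNumericNodes(ast.NodeTransformer):
    """Replace binary operations on two numeric constants with their result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        func = _BINARY_OPS.get(type(node.op))
        left, right = node.left, node.right
        if (
            func is not None
            and isinstance(left, ast.Constant)
            and isinstance(right, ast.Constant)
            and isinstance(left.value, (int, float))
            and isinstance(right.value, (int, float))
        ):
            result = ast.Constant(value=func(left.value, right.value))
            return ast.copy_location(result, node)
        return node  # leave anything involving names untouched


def fold(expression):
    """Parse an expression, fold numeric sub-expressions, return source text."""
    tree = ast.parse(expression, mode="eval")
    tree = FoldNumericNodes().visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


folded = fold("2 * 3 + x")
```

A dispatch table keeps the visitor short compared with a chain of `isinstance` checks; note that this sketch does not guard against division by zero in the folded expression.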

Comment on lines 100 to 214
"P": 30.973761,
"S": 32.065000,
"Cl": 35.453000,
"Ar": 39.948000,
"K": 39.098300,
"Ca": 40.078000,
"Sc": 44.955910,
"Ti": 47.867000,
"V": 50.941500,
"Cr": 51.996100,
"Mn": 54.938049,
"Fe": 55.845000,
"Co": 58.933200,
"Ni": 58.693400,
"Cu": 63.546000,
"Zn": 65.409000,
"Ga": 69.723000,
"Ge": 72.640000,
"As": 74.921600,
"Se": 78.960000,
"Br": 79.904000,
"Kr": 83.798000,
"Rb": 85.467800,
"Sr": 87.620000,
"Y": 88.905850,
"Zr": 91.224000,
"Nb": 92.906380,
"Mo": 95.940000,
"Tc": 98.000000,
"Ru": 101.070000,
"Rh": 102.905500,
"Pd": 106.420000,
"Ag": 107.868200,
"Cd": 112.411000,
"In": 114.818000,
"Sn": 118.710000,
"Sb": 121.760000,
"Te": 127.600000,
"I": 126.904470,
"Xe": 131.293000,
"Cs": 132.905450,
"Ba": 137.327000,
"La": 138.905500,
"Ce": 140.116000,
"Pr": 140.907650,
"Nd": 144.240000,
"Pm": 145.000000,
"Sm": 150.360000,
"Eu": 151.964000,
"Gd": 157.250000,
"Tb": 158.925340,
"Dy": 162.500000,
"Ho": 164.930320,
"Er": 167.259000,
"Tm": 168.934210,
"Yb": 173.040000,
"Lu": 174.967000,
"Hf": 178.490000,
"Ta": 180.947900,
"W": 183.840000,
"Re": 186.207000,
"Os": 190.230000,
"Ir": 192.217000,
"Pt": 195.078000,
"Au": 196.966550,
"Hg": 200.590000,
"Tl": 204.383300,
"Pb": 207.200000,
"Bi": 208.980380,
"Po": 209.000000,
"At": 210.000000,
"Rn": 222.000000,
"Fr": 223.000000,
"Ra": 226.000000,
"Ac": 227.000000,
"Th": 232.038100,
"Pa": 231.035880,
"U": 238.028910,
"Np": 237.000000,
"Pu": 244.000000,
"Am": 243.000000,
"Cm": 247.000000,
"Bk": 247.000000,
"Cf": 251.000000,
"Es": 252.000000,
"Fm": 257.000000,
"Md": 258.000000,
"No": 259.000000,
"Lr": 262.000000,
"Rf": 261.000000,
"Db": 262.000000,
"Sg": 266.000000,
"Bh": 264.000000,
"Hs": 277.000000,
"Mt": 268.000000,
"Ds": 281.000000,
"Rg": 272.000000,
"Cn": 285.000000,
"Uuq": 289.000000,
"Uuh": 292.000000,

Please note that elements naturally occur as different isotopes. Please state, in the form of a user-readable description, whether these values are the average or the most likely molecular weight. It might be better to use a dictionary whose values are lists of values for the different isotopes.
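A minimal sketch of the suggested structure, assuming each element maps to a list of (isotopic mass, natural abundance) pairs. The names `ISOTOPES`, `average_weight` and `most_abundant_mass` are hypothetical, and the chlorine figures are approximate values included only for illustration; they are not taken from this PR.

```python
# Illustrative data for chlorine only: (isotopic mass in u, natural abundance).
# Approximate values for the sake of the example, not taken from this PR.
ISOTOPES = {
    "Cl": [(34.96885, 0.7576), (36.96590, 0.2424)],
}


def average_weight(element):
    """Abundance-weighted mean over all listed isotopes of the element."""
    return sum(mass * abundance for mass, abundance in ISOTOPES[element])


def most_abundant_mass(element):
    """Mass of the isotope with the highest natural abundance."""
    mass, _abundance = max(ISOTOPES[element], key=lambda pair: pair[1])
    return mass


cl_average = average_weight("Cl")
cl_most_abundant = most_abundant_mass("Cl")
```

With these inputs the weighted mean comes out close to the 35.453 listed for "Cl" in the quoted table, which suggests that the table holds average values.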

Contributor Author


Members, please review this. I have not changed these values; the file was only reformatted by "black", which replaced the single quotes with double quotes.

Comment on lines 5 to 11
QUALIFIER_TYPES = (
"is", "hasPart", "isPartOf", "isVersionOf", "hasVersion",
"isHomologTo", "isDescribedBy", "isEncodedBy", "encodes",
"occursIn", "hasProperty", "isPropertyOf", "hasTaxon",
"unknown", "bqm_is", "bqm_isDescribedBy", "bqm_isDerivedFrom",
"bqm_isInstanceOf", "bqm_hasInstance", "bqm_unknown",
)

Please add a documentation comment here that says where to find the most recent list of qualifiers, to ease later updates.

@cdiener cdiener added the stale The issue or pull request lacks activity. label Sep 1, 2021
@akaviaLab akaviaLab mentioned this pull request May 29, 2022
@cdiener cdiener mentioned this pull request Jun 6, 2022
3 tasks
akaviaLab pushed a commit to akaviaLab/cobrapy that referenced this pull request Jun 19, 2022
@akaviaLab akaviaLab mentioned this pull request Jun 19, 2022
1 task
Labels
enhancement ready Finished PR that requires review and merge. stale The issue or pull request lacks activity.
Projects
Roadmap

6 participants