new in GECKO 3.2.0: protein usage reactions always draw from protein pool, also when proteomics is integrated #375

Closed
2 tasks done
edkerk opened this issue May 8, 2024 · 4 comments

@edkerk
Member

edkerk commented May 8, 2024

27 May 2024: The suggested changes are now implemented in GECKO 3.2.0. The text below describes how GECKO worked before version 3.2.0.

Currently, the enzyme usage reactions can be defined in two ways, depending on whether proteomics data is integrated.

| Model content | Without proteomics | With proteomics |
| --- | --- | --- |
| Protein usage rxn | `prot_Q99312[c] <= prot_pool[c]` | `prot_Q99312[c] <=` |
| LB of protein usage rxn | -1000 | Measured Q99312 concentration, as taken from model.ec.concs, or potentially flexibilized by flexibilizeEnzConcs. Example = -0.0416 |
| Protein pool exchange rxn | `prot_pool[c] <=` | `prot_pool[c] <=` |
| LB of protein pool exchange rxn | Total enzyme content, as defined by Ptot * sigma * f. Example = -125 | Non-measured enzyme content, as calculated by updateProtPool. Example = -95.915 |

A problem that I have encountered with this approach is that the new lower bound of the protein pool exchange reaction might be too strict. The model can then no longer be solved unless some proteins are flexibilized by a large amount, and sometimes even that does not resolve the problem.

  • The calculation in updateProtPool assumes that the f-factor (the fraction of protein that is enzymes) is the same for the measured and the unmeasured protein fractions.
  • When the f-factor is first calculated, it is already based only on the measured protein fraction (if this data is available). This might be somewhat biased, but at that stage the bias is compensated by the fitting of the sigma-factor.
  • In addition, to avoid over-constraining individual proteins based on noisy proteomics data, loadProtData by default adds one or more standard deviations to the protein measurements. As a consequence, the sum of measured protein concentrations, Pmeas, is substantially higher, which means that the unmeasured protein fraction, Ptot - Pmeas, is always lower than it should be.
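The effect described in the last bullet can be made concrete with a small arithmetic sketch. This is not GECKO code: the function names are hypothetical, the formula for the non-measured pool, (Ptot - Pmeas) * sigma * f, is inferred from the description above rather than taken from updateProtPool, and all numbers are made up for illustration (Ptot = sigma = f = 0.5 is chosen only because it reproduces the example bound of -125, assuming Ptot in g/gDW and bounds in mg/gDW).

```python
def pool_bound(ptot, f, sigma):
    """LB of the pool exchange (mg/gDW) as -(Ptot * sigma * f); Ptot in g/gDW."""
    return -(ptot * sigma * f) * 1000.0


def inflated_sum(means, stds, n_std=1.0):
    """Sum of measurements after adding n_std standard deviations to each one."""
    return sum(m + n_std * s for m, s in zip(means, stds))


def unmeasured_pool_bound(ptot, p_meas, f, sigma):
    """Assumed old-approach bound: the same f and sigma applied to Ptot - Pmeas."""
    return -((ptot - p_meas) * sigma * f) * 1000.0


# Illustrative values only (g/gDW), not taken from any real dataset.
ptot, f, sigma = 0.5, 0.5, 0.5
means = [0.02, 0.03, 0.05]
stds = [0.005, 0.01, 0.01]

full_pool = pool_bound(ptot, f, sigma)
raw = unmeasured_pool_bound(ptot, inflated_sum(means, stds, n_std=0.0), f, sigma)
padded = unmeasured_pool_bound(ptot, inflated_sum(means, stds, n_std=1.0), f, sigma)

print(full_pool, raw, padded)
```

Padding the measurements with standard deviations increases Pmeas, so the remaining pool for unmeasured enzymes shrinks further, even though the padding was only meant to loosen the constraints on individual proteins.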

As an alternative, there is actually no good reason why the enzyme usage reaction has to change when proteomics data is integrated, except for changing its lower bound. The new approach suggested below would prevent the issues raised above. It would keep the lower bound of the protein pool exchange reaction at the value that was fitted earlier in the model generation pipeline to give realistic growth predictions. New suggestion:

| Model content | Without proteomics | With proteomics |
| --- | --- | --- |
| Protein usage rxn | `prot_Q99312[c] <= prot_pool[c]` | `prot_Q99312[c] <= prot_pool[c]` |
| LB of protein usage rxn | -1000 | Measured Q99312 concentration, as taken from model.ec.concs, or potentially flexibilized by flexibilizeEnzConcs. Example = -0.0416 |
| Protein pool exchange rxn | `prot_pool[c] <=` | `prot_pool[c] <=` |
| LB of protein pool exchange rxn | Total enzyme content, as defined by Ptot * sigma * f. Example = -125 | Total enzyme content, as defined by Ptot * sigma * f. Example = -125 |
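To illustrate how the two sets of constraints differ, here is a toy feasibility check. It is a deliberate simplification, not how GECKO solves the model: the enzyme demands are fixed numbers instead of LP variables, the function names and the enzyme IDs enzA and enzB are hypothetical, and the demand values are invented. Only the bounds 0.0416, 95.915 and 125 come from the examples in the tables.

```python
def within_measured(demand, measured):
    """Each measured enzyme may not be used beyond its measured concentration."""
    return all(demand[enz] <= conc for enz, conc in measured.items() if enz in demand)


def feasible_old(demand, measured, unmeasured_pool):
    """Old approach: measured enzymes bypass the pool; the rest share a reduced pool."""
    rest = sum(need for enz, need in demand.items() if enz not in measured)
    return within_measured(demand, measured) and rest <= unmeasured_pool


def feasible_new(demand, measured, total_pool):
    """New approach: every enzyme draws from the full pool; measured ones are also capped."""
    return within_measured(demand, measured) and sum(demand.values()) <= total_pool


# Magnitudes in mg/gDW; the demands are hypothetical.
measured = {"Q99312": 0.0416}
demand = {"Q99312": 0.03, "enzA": 60.0, "enzB": 40.0}

print(feasible_old(demand, measured, unmeasured_pool=95.915))
print(feasible_new(demand, measured, total_pool=125.0))
```

In this made-up case the unmeasured enzymes need more than the reduced pool allows, so the old constraints fail, while the same demand fits within the full pool under the new constraints.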

I hereby confirm that:

  • The new feature is not already in the main branch of the repository.
  • A similar issue does not already exist.
@Yu-sysbio
Collaborator

The new suggestion looks very nice!

@edkerk
Member Author

edkerk commented May 26, 2024

This will be implemented in GECKO 3.2.0.

  • constrainEnzConcs will no longer remove prot_pool[c] from the protein usage reaction when it is constrained with protein concentrations, as described above.
  • updateProtPool has become obsolete; setProtPoolSize should be used instead. updateProtPool will check whether the model follows the new approach, in which case it throws an error that explains the change and directs the user to setProtPoolSize. If the protein usage reactions follow the old approach, updateProtPool will run as usual.
  • tutorial.m will be updated to mention the change.
  • README.md will have a section with changes since the Nature Protocols publication.
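The guard described for updateProtPool could look roughly like the sketch below. This is a guess at the logic written in Python for illustration, not the actual GECKO (MATLAB) implementation: the data layout, the function names and the error text are all assumptions.

```python
POOL = "prot_pool[c]"


def follows_new_approach(usage_rxns, measured):
    """True if all measured proteins still have prot_pool in their usage reaction.

    usage_rxns maps a protein ID to the set of metabolites in its usage reaction
    (a hypothetical layout chosen for this sketch).
    """
    return bool(measured) and all(POOL in usage_rxns[p] for p in measured)


def update_prot_pool_guard(usage_rxns, measured):
    """Refuse to run on new-style models; otherwise continue with the legacy path."""
    if follows_new_approach(usage_rxns, measured):
        raise RuntimeError(
            "Protein usage reactions draw from the protein pool; "
            "use setProtPoolSize instead of updateProtPool."
        )
    return "legacy"


old_style = {"Q99312": {"prot_Q99312[c]"}}
new_style = {"Q99312": {"prot_Q99312[c]", POOL}}

print(update_prot_pool_guard(old_style, ["Q99312"]))
try:
    update_prot_pool_guard(new_style, ["Q99312"])
except RuntimeError as err:
    print(err)
```

Raising an error rather than silently continuing matches the behaviour described above: a model built with the new approach should never have its pool reduced by the measured protein content.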

@edkerk
Member Author

edkerk commented May 26, 2024

In the full_ecModel tutorial, the flexibilizeEnzConcs step that is run after constraining enzyme concentrations flexibilizes 34 enzymes with the new approach, compared with 36 with the previous approach. This already shows that the model is allowed a little more flexibility, which should indeed avoid the potential problems mentioned in the OP.

edkerk added a commit that referenced this issue May 26, 2024
edkerk added a commit that referenced this issue May 27, 2024
* fix: calculateFfactor can take protData input

* feat: constrainEnzConcs keep prot pool draw

addresses #375

* fix: updateProtPool obsolete

* doc: update full_ecModel for prot_usage rxns

* fix: enzymeUsage correct mention of output units

solves #376

* doc: updateGECKOdoc

* doc: update README.md with protocol change

* fix: README.md link
@edkerk edkerk mentioned this issue May 27, 2024
2 tasks
@edkerk edkerk pinned this issue May 27, 2024
@edkerk edkerk changed the title feat: keep protein usage reaction draw from protein pool when proteomics is integrated new in GECKO 3.2.0: protein usage reactions always draw from protein pool, also when proteomics is integrated May 27, 2024
@edkerk
Member Author

edkerk commented May 27, 2024

This issue will be closed, as the changes have been applied. It will remain pinned for now, for easy access to this explanation.

@edkerk edkerk closed this as completed May 27, 2024