Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numerical structural identifiability analysis #196

Closed
formersbach opened this issue Sep 22, 2022 · 4 comments
Closed

Numerical structural identifiability analysis #196

formersbach opened this issue Sep 22, 2022 · 4 comments
Assignees
Labels
enhancement New feature or request text2model Text-to-model conversion

Comments

@formersbach
Copy link
Contributor

I believe identifiability analysis is a very useful tool in model development and experiment design!
Based on the work of Joubert and Stigter I have implemented the numerical part of their algorithm in a proof of concept script.
Structural identifiability is a structural property of any model. To quote Joubert et al:
"One way of characterizing structural identifiability is to say that at least 1 parameter has a confidence interval that spans the interval negative infinity to positive infinity. Any form of unidentifiability [...] calls into question the predictive power of a model and urges its user to interpret all results with caution"
The idea of their algorithm is to analyze the so called parametric output sensitivity matrix (OSM). Full rank of the OSM is a sufficient condition for structural identifiability. Rank deficiency indicates non-influence by parameters or correlation between parameters.
The rank of the OSM is calculated using singular value decomposition. Since all calculations are numerical and non-exhaustive there is a non-zero possibility of false negatives (identifiable parameter is deemed unidentifiable).

In principle no user input is required. A user defined observable matrix can be supplied however.

The script is ~ 100 lines long and requires the sympy package for symbolic calculations.

The only information required is a Text2Model object.
Where in Pasmopy/Biomass do you think this should go? Should I just provide you the proof of concept script?

@formersbach formersbach added enhancement New feature or request text2model Text-to-model conversion labels Sep 22, 2022
@himoto
Copy link
Contributor

himoto commented Sep 22, 2022

Hi @formersbach, sounds fantastic to me!
Regarding implementation, it would be nice to add your script to text2model.py as a method of Text2Model class.
Or you can create something like identifiability_analysis.py in construction/ folder and implement as follows:

from .text2model import Text2Model

# Example
class IdentifiabilityAnalysis(Text2Model):
    ...

Anyways it depends on the usage.
If you need sympy, please add it to requirements.txt.
Can you please show me a usage example of it in the reply of this or Show and tell in a Discussion?

@himoto himoto assigned himoto and formersbach and unassigned himoto Sep 23, 2022
@formersbach
Copy link
Contributor Author

Hey @himoto !
This is the proof of concept script for identifiability:

from biomass import Text2Model
import sympy as sym
import numpy as np
from scipy.integrate import solve_ivp
import os
import re
import matplotlib.pyplot as plt

class ident_analysis:
    def __init__(self, model_file) -> None:
        self.model = Text2Model(model_file)
        self.model.convert(overwrite=True)
        self.symbol_list = self.create_symbols()
        self.v = self.get_v()
        self.N = sym.Matrix(self.model.stoichiometry_matrix.toarray())
        self.dydt = self.N * self.v

    def create_symbols(self):
        #Creates list of sympy symbols for integration
        temp_text = "import sympy\nsymbol_list=[\n"
        for species in self.model.species:
            temp_text += f"\tsympy.Symbol(\"{species}\"),\n"
        for parameter in self.model.parameters:
            temp_text += f"\tsympy.Symbol(\"{parameter}\"),\n"
        temp_text += "]"
        with open("tmp_symbols.py", "w") as tmp:
            tmp.writelines(temp_text)
        try:
            from tmp_symbols import symbol_list
        except ImportError:
            raise ImportError("The temporary symbols file could not be created.")
        os.remove("tmp_symbols.py")
        return symbol_list

    def get_v(self):
        #Turn reaction velocities to sympy equation vector
        v = []
        for reaction in self.model.kinetics:
            v += [sym.parsing.sympy_parser.parse_expr(reaction.rate)]
        return sym.Matrix(v)

    def get_values(self, perturbation, t_steps=100, perturb_species=None):
        #Performs the integration by substituting numerical values into the equations
        params, y0 = [], []
        for symbol in self.symbol_list:
            value=1
            default_values = self.model.param_info + self.model.init_info
            for default in default_values:
                if str(symbol) in default:
                    value = float(default.split("=")[1].strip())
            if str(symbol) == perturb_species:
                value += perturbation
            if str(symbol) in self.model.species:
                y0.append(value)
            elif str(symbol) in self.model.parameters:
                params.append([symbol, value])
        system = self.dydt.subs(params)
        def int_step(t, y):
            assert len(y) == len(self.model.species)
            replace = list(zip(self.symbol_list[:len(self.model.species)], y))
            return np.array(system.subs(replace)).flatten().astype(np.float64)
        t_eval = np.linspace(0, 100, num=t_steps, endpoint=False)
        integrated = solve_ivp(int_step, t_span=(0, 100), y0=y0, t_eval=t_eval)
        assert integrated.y.shape == (len(self.model.species), len(t_eval))
        return integrated

    def default_output_matrix(self):
        # Generates a default output matrix from the observable description
        observables = self.model.obs_desc
        output = np.zeros((len(observables), len(self.model.species)))
        indx_dict = {species: indx for indx, species in enumerate(self.model.species)}
        for i, observable in enumerate(observables):
            assert len(observable) == 2
            _, equation = observable
            re_find_u = r"(?<=u\[)(.+?)(?=\])"
            summands = re.split(r"\+|\-",equation)
            for summand in summands:
                spec = re.findall(re_find_u, summand)
                assert len(spec) == 1, "Observable equation is non-linear."
                spec = spec[0]
                output[i, indx_dict[spec]] = 1
        return output

    def apply_output(self, raw_sim, output_matrix=None):
        # Drops the non-observed species concentrations
        if output_matrix is None:
            output_matrix = self.default_output_matrix()
        return np.matmul(output_matrix, raw_sim.y)

    def run_perturbation(self, output_matrix=None, init_cond=True, const=None, perturbation=0.001, t_steps=100):
        # Performs unperturbed integration, perturbed integration and calculates sensitivities
        if init_cond:
            perturb = self.model.species + self.model.parameters
        else:
            perturb = self.model.parameters
        if const:
            perturb = [element for element in perturb if element not in const]
        base = self.apply_output(self.get_values(perturbation=perturbation, t_steps=t_steps))
        if output_matrix:
            num_obs = output_matrix.shape[0]
        else:
            num_obs = self.default_output_matrix().shape[0]
        sensitivity = np.zeros((num_obs*t_steps, len(perturb)))
        for i, param in enumerate(perturb):
            res = self.apply_output(self.get_values(perturbation, t_steps=t_steps, perturb_species=param))
            sensitivity[:, i] = ((res - base)/perturbation).flatten()
        assert sensitivity.shape == (num_obs * t_steps, len(perturb))
        return sensitivity

    def analyze_sensitivities(self):
        # Singular value decomposition of sensitivity matrix
        sensitivity = self.run_perturbation()
        U, E, Vstar = np.linalg.svd(sensitivity)
        return U, E, Vstar.T

def plot_sing_values(E):
    # Plots log10 transformed singular values
    log10E = np.log10(E)
    x = list(range(1, len(E) + 1))
    plt.stem(x, log10E, use_line_collection=True)
    plt.savefig("Singular_values")
    return

def plot_unid_components(V, index, ident_analys):
    # Plots given column of V
    column_vec = V[:, index]
    x = [str(symb) for symb in ident_analys.symbol_list]
    plt.stem(x, column_vec, use_line_collection=True)
    plt.savefig("Column_vector")

if __name__ == "__main__":
    import time
    start = time.time()
    test = ident_analysis("test.txt")
    U, E, V = test.analyze_sensitivities()
    end = time.time()
    print(f"Analysis took {end - start} seconds.")
    # plot_sing_values(E)
    #or
    # plot_unid_components(V, 7, test)

You can try it out with this simple unidentifiable text model:

A --> B
B --> P
A --> C
C --> P

@obs A: u[A]
@obs P: u[P]

If you add an observation for C the unidentifiability is resolved.

@formersbach
Copy link
Contributor Author

Hey @himoto!
I would like to close this issue, since it's currently not planned to add this to biomass right?
In any case discussions is more fitting for this, sorry for opening it as an issue!

@himoto
Copy link
Contributor

himoto commented Jan 20, 2023

No problem.
New idea to improve BioMASS is always more than welcome!
I would like to transfer this issue to Discussions.

@biomass-dev biomass-dev locked and limited conversation to collaborators Jan 20, 2023
@himoto himoto converted this issue into discussion #234 Jan 20, 2023

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
enhancement New feature or request text2model Text-to-model conversion
Projects
None yet
Development

No branches or pull requests

2 participants