# Exercises Week 9: Temperature regulated gene expression

**Course**: [Topics in life sciences engineering](https://moodle.epfl.ch/enrol/index.php?id=17061) (BIO-411)

**Professors**:  _Gönczy Pierre_, _Naef Felix_, _McCabe Brian Donal_

SSV, MA, 2024-2025


This week's exercises are inspired by the work of [Gotic et al., 2015](http://genesdev.cshlp.org/content/30/17/2005), in which RNA levels, including pre-mRNA and mRNA (total RNA-seq), were measured in mouse NIH-3T3 fibroblasts after cells were switched from a warm to a colder temperature (38°C to 33°C), and vice versa (33°C to 38°C).
The experiments were conducted over multiple time points in an "*approach to steady-state*" design. The cold-inducible *Cirbp* gene was used as an example to examine the kinetics and mechanisms of post-transcriptional control of gene expression.

## Set up the Jupyter environment 
In order to run this notebook, we advise you to use EPFL's centralized JupyterLab platform [noto.epfl.ch](http://noto.epfl.ch). Some of the required libraries are not available on the platform, but you can install them by copy-pasting the following commands into the **Jupyter Terminal** (open a new tab with the '+' icon and select "Terminal"):

````
my_venvs_create Week9_env
my_venvs_activate Week9_env
pip install gseapy
pip install adjustText
my_kernels_create Week9_env "Week9"
my_venvs_deactivate
````
Refresh your browser's page and select $\textbf{\color{red}the "Week9" kernel}$ using the top right toggle button. Import the libraries by running the code cell below.

In [None]:
## Import libraries
import pandas as pd
import numpy as np

from sklearn.decomposition import PCA
from sklearn.preprocessing import scale
from scipy.integrate import odeint

import plotly.express as px
import seaborn as sns
import matplotlib.pyplot as plt

import gseapy as gp
from ipywidgets import interact, fixed
from adjustText import adjust_text


## **Temperature regulates splicing efficiency of the cold-inducible RNA-binding protein gene *Cirbp***
This week’s exercises are based on the work of [Gotic et al., 2015](http://genesdev.cshlp.org/content/30/17/2005), which explores how core body temperature (CBT) rhythms influence diurnal gene expression, focusing on the role of the protein Cirbp.

In mammals, core body temperature (CBT) fluctuates diurnally around a mean value of 36°C–37°C. While the period of circadian rhythms is stable under temperature changes, the phase of the cycle is highly sensitive to (even small) temperature shifts. For instance, in cultured fibroblasts, the phase of circadian gene expression can be synchronized with simulated CBT cycles.

The paper investigates how CBT rhythms might synchronize gene expression, possibly through temperature-sensitive proteins. One such protein, Cirbp (Cold-Inducible RNA-Binding Protein), shows increased mRNA levels at lower temperatures. The authors aim to understand the mechanisms behind this temperature-sensitive response.

They explore different ways mRNA regulation could change with temperature, including:
- Increased Cirbp transcription at lower temperatures
- Reduced degradation of Cirbp mRNA at lower temperatures
- Enhanced pre-mRNA splicing efficiency at lower temperatures
  
Through various experiments (RNase protection assays, ChIP assays, luminescence assays), the authors rule out transcriptional and degradation regulation, showing that increased pre-mRNA splicing efficiency is key to the higher Cirbp mRNA accumulation at lower temperatures.

**Experimental design**
In these exercises, we will focus on one of the experiments that led to this conclusion. The authors used an “approach to steady-state” (ATSS) RNA-seq technique to examine genome-wide transcription and mRNA accumulation after temperature changes.  Specifically, NIH3T3 cells were exposed to a temperature change (either from 33°C to 38°C or from 38°C to 33°C) and collected at various time points after the shift. Total RNA was then extracted, ribosomal RNA was removed, and cDNA libraries were sequenced. This method enabled the authors to measure both pre-mRNA and mRNA levels across the entire genome.

<img src="./Experiment.png" alt="Alt text" width="50%"/>


## Exercise 1: Simulation of a simplified production-decay ODE model of transcription

We first consider a system in which nuclear pre-mRNA is transcribed and subsequently spliced to produce mRNA.  
Both molecular species are subject to distinct degradation processes, which can be modeled with a two-dimensional system of ordinary differential equations (ODEs):

\begin{aligned}
&\frac{dP}{dt} = s - (k_p + \rho)P \\
&\frac{dM}{dt} = \rho P - k_m M \\
\end{aligned}

where $P$ and $M$ represent the concentrations of pre-mRNA and mRNA, respectively.

Here, $s$, the transcription rate, is assumed to be constant, but it could also be a time-dependent function $s(t)$ (this will be explored further in the exercise session for Week 11).  
The parameters $k_m$ and $k_p$ are the degradation rates of $M$ and $P$, respectively, while $\rho$ is the splicing rate of pre-mRNA $P$ into mRNA $M$; these rates are also constants.  
When all parameters are constant, the system will reach steady-state levels after a transient period, meaning that the derivatives $\frac{dP}{dt}$ and $\frac{dM}{dt}$ will both equal zero.
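
The notion of steady state can be checked numerically before deriving anything by hand. The sketch below integrates the two equations above for a long time and evaluates the derivatives at the final point; the function name `rhs` and the parameter values are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.integrate import odeint

def rhs(y, t, s, kp, km, rho):
    """Right-hand side of the two-species pre-mRNA/mRNA model."""
    P, M = y
    return [s - (kp + rho) * P, rho * P - km * M]

# Illustrative parameter values (assumed): s, kp, km, rho
params = (1.0, 0.1, 0.2, 2.0)

# Integrate over a span much longer than 1/km, the slowest time scale here
t = np.linspace(0, 200, 2001)
traj = odeint(rhs, [0.0, 0.0], t, args=params)

# At steady state both derivatives should be numerically close to zero
dP_end, dM_end = rhs(traj[-1], t[-1], *params)
print(f"dP/dt = {dP_end:.2e}, dM/dt = {dM_end:.2e}")
```

The final values in `traj[-1]` can then be compared with the closed-form expressions you derive in Question 1.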

### Question 1
1. Derive the steady-state concentrations and interpret what happens to $M$ when $\rho\gg k_p$.  

2. Compute the steady-state ratio mRNA/pre-mRNA.   
    a. Comment on the result, in particular discuss which parameters are important and which are not.  
    b. Discuss how this ratio might be used to compare mRNA half-lives across different conditions.

3. Using $\rho = 2 h^{-1}$, $s = 1 [\text{P}]h^{-1}$ and $k_m = 0.1 h^{-1}$ (values from the literature), estimate the ratio of mRNA to pre-mRNA molecules in typical mammalian cells.

### Answer 1

1. *Type your answers here*  

2. *Type your answers here*  

3. *Type your answers here*


### Question 2 
Simulate the production-decay ODE with a constant transcription rate $s$ using the provided code:
1. Using the interactive widget (read the code below), describe the effects obtained by varying parameters such as splicing rate ($\rho$), degradation rates ($k_p$ and $k_m$) and transcription rate ($s$).\
Experiment with different initial conditions for the pre-mRNA and mRNA concentrations as well.

2. Adapt the code by adding a few lines to plot the steady-state concentrations of pre-mRNA and mRNA, as derived in Question 1.1. 

3. Optional: Extend the model by incorporating a third differential equation that represents the protein level (G). Update the simulation and the corresponding plots to include protein dynamics. How similar or different is the relation between P and M compared to the one between M and G?

Remember to clearly label all axes and legends on your plots for easy interpretation.

### Answer 2  

1. *Type your answer here*  

2. *Type your answer here*  

3. *Type your answer here*  

#### Definition of important functions


In [None]:
def production_decay_ode(y, time, s, kp, km, rho):

    """
    Defines the ODEs for the production and decay of pre-mRNA and mRNA.
    P: pre-mRNA concentration
    M: mRNA concentration
    s: transcription rate (constant)
    kp: pre-mRNA degradation rate
    km: mRNA degradation rate
    rho: splicing rate
    """
    
    P, M = y
    dPdt = s - (kp + rho) * P
    dMdt = rho * P - km * M

    return [dPdt, dMdt] 


def simulation(ode_function, parameters, initial_conditions = None, time = None):

    """
    Simulate the ODEs for the production and decay of pre-mRNA and mRNA.
    parameters: parameters for the ODEs
    initial_conditions: initial conditions for the ODEs
    time: time intervals for the simulation
    """

    s = parameters['s']
    kp = parameters['kp']
    km = parameters['km']
    rho = parameters['rho']

    # Placeholder for adding steady-state solutions
    # P_ss = ... #Add your steady-state solution for P here 
    # M_ss = ... #Add your steady-state solution for M here
    
    results = odeint(ode_function, initial_conditions, time, args=(s, kp, km, rho))

    plt.figure(figsize=(10, 6))
    plt.ylim([0,150])

    # Uncomment the lines below after adding your steady-state solutions
    #plt.axhline(y=P_ss, color='blue', linestyle='--', label='P_ss')  
    #plt.axhline(y=M_ss, color='orange', linestyle='--', label='M_ss')

    plt.title('Simulation of pre-mRNA and mRNA Production-Decay Dynamics')
    plt.plot(time, results[:, 0], label = 'P(t)', linewidth=2)
    plt.plot(time, results[:, 1], label = 'M(t)', linewidth=2)
    plt.xlabel("Time [min]")
    plt.ylabel("Concentration")
    plt.legend()
    plt.show()
    
def animate(initial_conditions, time, s=5, kp= 0.05, km=0.05, rho=0.4):
    """
    Runs the simulation with interactive widgets for parameter exploration.
    """
      
    parameters = {'s':s,'kp':kp, 'km':km, 'rho':rho}
    simulation(production_decay_ode, parameters, initial_conditions, time)


#### Run the simulation

In [None]:

# Define initial conditions and time intervals for the simulation
initial_conditions = [0, 0] #[P, M]
time = np.linspace(0, 100, 1001)

interact(animate, 
         initial_conditions=fixed(initial_conditions), 
         time=fixed(time), 
         s = (1,10,0.01), #  Transcription rate units: concentration [P] per min    
         kp = (1/120,1/10,1/1000), # Pre-mRNA degradation rate units: 1/min (range: 10 min to 120 min)
         km = (1/600,1/10,1/1000),  # mRNA degradation rate units: 1/min (range: 10 min to 600 min)
         rho = (1/30,1,0.05)); #splicing rate units: 1/min (range: 1 min to 30 min)

## Exercise 2: RNA-seq analysis and modelling of gene expression response to temperature up- and down-shifts

In [Gotic et al., 2015](http://genesdev.cshlp.org/content/30/17/2005), cells were harvested at various time points (0, 1, 3, 6, and 9 hours) after a temperature switch from 33°C to 38°C and vice versa, with duplicates subjected to total RNA-seq analysis. The resulting data were processed to assign reads to pre-mRNA and mRNA species and to obtain their respective levels. Here, we will analyze those data to study and model gene expression responses to temperature shifts.

The functions provided below (run the cell) will assist you in the analysis.
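
The sample selection used throughout this exercise relies on a column index with four levels (condition, time, replicate, feature) and on boolean masks over those levels. The toy sketch below shows the idea; the column names and values are made up for illustration and are not the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical column names following a condition.time.replicate.feature pattern
cols = ["33to38.0.2.exon", "33to38.0.2.intron",
        "33to38.1.2.exon", "38to33.0.5.exon"]
toy = pd.DataFrame(np.arange(8).reshape(2, 4),
                   index=["geneA", "geneB"], columns=cols)

# Split the names into a four-level column index
toy.columns = toy.columns.str.split(".", expand=True)
toy.columns.names = ["condition", "time", "replicate", "feature"]

# One boolean mask per level, combined with & to select columns
is_exon = toy.columns.get_level_values("feature").isin(["exon"])
is_33to38 = toy.columns.get_level_values("condition").isin(["33to38"])
selected = toy.loc[:, is_exon & is_33to38]
print(selected)
```

Note that the level values stay strings after the split, which is why time points and replicates are passed as `'0'`, `'1'`, `'2'` and so on rather than as integers.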

#### Implementation of useful functions

In [None]:
def subselect(data, feature, replicate, time, condition):

    """
    Subselects samples from the full dataset based on the specified features, conditions, 
    replicates, and time points.

    Parameters:
    data: The full gene expression dataset.
    feature: The features to filter by (e.g., ['intron', 'exon']).
    replicate: The replicates to include.
    time: The time points to include.
    condition: The temperature shift conditions to include (e.g, ['33to38', '38to33']).

    Returns:
    pd.DataFrame: The filtered dataset.
    """
    
    # Create boolean masks for each filtering criterion

    feature_mask  = data.columns.get_level_values('feature').isin(feature)
    condition_mask  = data.columns.get_level_values('condition').isin(condition)
    replicate_mask  = data.columns.get_level_values('replicate').isin(replicate)
    time_mask = data.columns.get_level_values('time').isin(time)

    # Use the masks to filter the data

    filtered_data = data.iloc[:, feature_mask & condition_mask & replicate_mask & time_mask]
    
    return filtered_data

def run_PCA(data, n_components=5, color_by='replicate', symbol_by='time', scale_data=True, log_transform=True):
    
    """
    Runs PCA on the provided data and generates a scatter plot of pairs of principal components.
    It also returns a DataFrame containing the PCA loadings.

    Parameters:
    data: The gene expression data to analyze.
    n_components: The number of principal components to calculate.
    color_by: The metadata category to color the data points by.
    symbol_by: The metadata category to symbolize the data points by.
    scale_data: Whether to scale the data before PCA.
    log_transform: Whether to apply log transformation to the data.

    Returns:
    pd.DataFrame: A DataFrame containing the PCA loadings.
    """

    data_tmp = data.copy()
    
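    # Log-transform first (expression values are non-negative), then scale if specified.
    # Scaling before the log could produce negative values, for which log2 is undefined.
    if log_transform:
        data = np.log2(data + 1)

    if scale_data:
        data = scale(data)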
        
    # Fit the PCA model
    pca = PCA(n_components=n_components)
    pca.fit(data.T)

    # Transform the data
    pca_transformed = pca.transform(data.T)

    # Create a DataFrame for the PCA results
    labels = [f'PC{i+1}' for i in range(n_components)]
    pca_df = pd.DataFrame(pca_transformed, columns=labels)
    pca_df[color_by] = data_tmp.columns.get_level_values(color_by)
    pca_df[symbol_by] = data_tmp.columns.get_level_values(symbol_by)

    # Plotting each pair of consecutive principal components
    fig, axes = plt.subplots(1, n_components -1, figsize=(30, 5))
    for i in range(n_components-1):
        sns.scatterplot(ax=axes[i], data=pca_df, x=f'PC{i+1}', y=f'PC{i+2}', 
        hue=color_by, style=symbol_by, s=200, palette='Set1')

        axes[i].set_xlabel(f'PC{i+1} ({pca.explained_variance_ratio_[i]*100:.2f}%)')
        axes[i].set_ylabel(f'PC{i+2} ({pca.explained_variance_ratio_[i+1]*100:.2f}%)')
        # remove legend if it's not the first plot
        if i != 0:
            axes[i].get_legend().remove()
    
    fig.suptitle(f'PCA of Gene Expression Data (color={color_by}, symbol={symbol_by})', fontsize=16)

    # Return the PCA loadings
    df_pca = pd.DataFrame(pca.components_, columns=data_tmp.index, index=labels)

    return df_pca


## Plot Gene profile across all conditions
def plot_gene(data, gene , xx_33_38=None, xx_38_33=None, t_2=None):

    """
    Plots the gene expression profile across different conditions and time points.
    
    Parameters:
    data: The gene expression data.
    gene: The gene to plot.
    xx_33_38: The simulated solution for the 33to38 condition.
    xx_38_33: The simulated solution for the 38to33 condition.
    t_2: The time points (in hours) at which the simulated solutions are given.

    Returns:
    None
    """

    #subselect
    time= ['0', '1' ,'3' ,'6', '9']
    replicate = ['2', '13', '11', '5']
    dat_1= subselect(data, ['intron'], replicate, time, ['33to38'])
    dat_2= subselect(data, ['exon'], replicate, time, ['33to38'])
    dat_3= subselect(data, ['intron'], replicate, time, ['38to33'])
    dat_4= subselect(data, ['exon'], replicate, time, ['38to33'])

    #plot
    t=np.array([0, 0, 1, 1, 3, 3, 6, 6, 9 , 9])
    fig, axs = plt.subplots(2, 2, figsize=(8,8), sharex=True, sharey=True)
    #fig.subplots_adjust(wspace=0.25)
    axs[0, 0].scatter(t, dat_1[dat_1.index.str.endswith(gene)].values.T, c = [0,2]*5, vmin=0, vmax=3, label='intron 33to38') #color is replicate
    axs[0, 0].set_title("intron 33to38")
    
    axs[1, 0].scatter(t, dat_2[dat_2.index.str.endswith(gene)].values.T, c = [0,2]*5, vmin=0, vmax=3,  label='exon 33to38') #color is replicate
    axs[1, 0].set_title("exon 33to38")
    
    axs[0, 1].scatter(t, dat_3[dat_3.index.str.endswith(gene)].values.T, c = [1,3]*5, vmin=0, vmax=3, label='intron 38to33') #color is replicate
    axs[0, 1].set_title("intron 38to33")

    axs[1, 1].scatter(t, dat_4[dat_4.index.str.endswith(gene)].values.T, c = [1,3]*5, vmin=0, vmax=3,  label='exon 38to33') #color is replicate
    axs[1, 1].set_title("exon 38to33")

    if xx_33_38 is not None:
        axs[0,0].plot(t_2, xx_33_38[:,0],label = 'P(t)', color="tab:orange")
        axs[1,0].plot(t_2, xx_33_38[:,1], color="tab:orange")
        axs[0,1].plot(t_2, xx_38_33[:,0], color="tab:orange")
        axs[1,1].plot(t_2, xx_38_33[:,1], color="tab:orange")
    
        
    for ax in axs.flat:
        ax.set(xlabel='Time [h]', ylabel='RPKM')

    fig.suptitle(f'Gene Expression Profile for {gene}', fontsize=16)
    plt.show()

def plot_DE(dat_all, condition_x1, condition_x2, condition_y1, condition_y2, time_x1, time_x2, time_y1, time_y2, feature_x1, feature_x2, feature_y1, feature_y2, FC=2, xlab="", ylab="", FC_type="delta"):
    
    """
    Plots differentially expressed genes based on log fold changes between specified conditions, times, and features.

    Parameters:
    dat_all: The complete dataset with gene expression values.
    condition_x1, condition_x2, condition_y1, condition_y2: Lists of conditions to compare.
    time_x1, time_x2, time_y1, time_y2: Lists of time points to compare.
    feature_x1, feature_x2, feature_y1, feature_y2: Lists of features to compare.
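    FC: Absolute log2 fold change threshold used to highlight genes.
    FC_type: 'delta' highlights genes beyond the threshold on both axes in the same direction; 'delta_delta' highlights genes for which |x - y| exceeds the threshold.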
    xlab: Label for the x-axis representing the log fold change between condition_x2 and condition_x1.
    ylab: Label for the y-axis representing the log fold change between condition_y2 and condition_y1.

    Returns:
    Array: Boolean array indicating differentially expressed genes that meet the fold change threshold.
    """
     
    replicate = ['2', '13', '11', '5']

    dat_x1=subselect(dat_all, feature_x1, replicate, time_x1, condition_x1).mean(axis=1)
    dat_x2=subselect(dat_all, feature_x2, replicate, time_x2, condition_x2).mean(axis=1)

    dat_y1=subselect(dat_all, feature_y1, replicate, time_y1, condition_y1).mean(axis=1)
    dat_y2=subselect(dat_all, feature_y2, replicate, time_y2, condition_y2).mean(axis=1)

    x=np.log2(1 + dat_x2)- np.log2(1 + dat_x1)
    y=np.log2(1 + dat_y2) - np.log2(1 + dat_y1)

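    if FC_type == "delta":
        pos =((x > FC) & (y > FC)) | ((x < -FC) & (y < -FC))
    elif FC_type == "delta_delta":
        pos = abs(x-y) > FC
    else:
        # Fail early instead of hitting an undefined `pos` further down
        raise ValueError("FC_type must be 'delta' or 'delta_delta'")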

    x_2 = x[pos]
    y_2 = y[pos]
    fig, ax = plt.subplots(figsize=(10,10))

    ax.plot(x,y, 'o', color = 'lightblue', markersize=2)
    ax.axhline(y=0, color='b')
    ax.axvline(x=0, color='b')
    ax.set_xlim(-8,8)
    ax.set_ylim(-8,8)
    ax.set_xlabel(xlab)
    ax.set_ylabel(ylab)
    texts = [ax.text(x_2.iloc[k], y_2.iloc[k], v.split("|")[1]) for k, v in enumerate(dat_all.index[pos])]
    adjust_text(texts)
    
    return(pos)    


###  Question 1

**PCA** (Read carefully)

*"Principal Component Analysis (PCA) is a dimensionality reduction technique that simplifies complex datasets by transforming them into a set of orthogonal axes, or principal components (PCs), that summarize the underlying patterns of variation. It works by calculating the eigenvectors and eigenvalues from the covariance matrix of the data, which then define the new axes. The first principal component captures the largest amount of variance in the data, with each subsequent component—perpendicular to the last—capturing progressively less variance. This orthogonal nature of PCA ensures that each component adds distinct information. This means that all the PCs are uncorrelated to each other. When applied to gene expression data, PCA reduces the numerous gene variables to a few composite indicators without losing critical information. After transformation, each principal component represents a combination of genes. The loadings of these components then tell us about the contribution of each gene to the component. Genes with high absolute loadings have a greater role in defining the structure revealed by that component, allowing us to identify which genes are most responsible for differentiating the samples on the basis of the observed variance."*
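
As a small illustration of the ideas above, the sketch below runs PCA on a synthetic matrix in which two "genes" separate two groups of samples. All values are simulated and the variable names are assumptions for this example only.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Toy matrix: 6 samples (rows) x 4 genes (columns); values are synthetic
group = np.array([0, 0, 0, 1, 1, 1])
X = rng.normal(scale=0.1, size=(6, 4))
X[:, 0] += 3.0 * group   # gene 0 goes up in the second group
X[:, 1] -= 3.0 * group   # gene 1 goes down in the second group

pca = PCA(n_components=2)
scores = pca.fit_transform(X)   # sample coordinates on PC1 and PC2
loadings = pca.components_      # shape (2, 4): one row per PC, one column per gene

print("explained variance ratio:", pca.explained_variance_ratio_.round(3))
print("PC1 loadings:", loadings[0].round(2))
```

In this toy case, PC1 should carry most of the variance, and the two group-dependent genes should have the largest absolute loadings on it, with opposite signs.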
  
**GO enrichment analysis (GEA)**

*"Gene Ontology (GO) analysis is a common bioinformatics technique used to gain insight into the biological significance behind large lists of genes. When we perform experiments that result in the identification of many genes, such as with RNA-seq, we often want to understand the underlying biology of these genes. However, looking at each gene individually can be overwhelming due to the sheer number. This is where GO analysis comes in handy. GO analysis categorizes genes into groups known as "GO terms" describing different aspects of molecular biology or pathways. Using GO analysis, we can identify which biological processes are enriched in our gene list—meaning that these categories contain more genes from our list than we would expect by chance. This can suggest that the genes are collectively involved in certain biological functions or pathways, which may be crucial to the condition or treatment we're studying."*
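
The phrase "more genes than we would expect by chance" can be made concrete with a hypergeometric test. The sketch below is a minimal illustration with made-up numbers; the helper `enrichment_pvalue` is hypothetical, and enrichment tools may use a different statistic or add a multiple-testing correction.

```python
from scipy.stats import hypergeom

def enrichment_pvalue(n_background, n_in_term, n_selected, n_overlap):
    """P(X >= n_overlap) when drawing n_selected genes without replacement
    from n_background genes, of which n_in_term carry the GO term."""
    # sf(k - 1) gives P(X >= k) for a discrete distribution
    return hypergeom.sf(n_overlap - 1, n_background, n_in_term, n_selected)

# Made-up numbers for illustration only
p = enrichment_pvalue(n_background=10000, n_in_term=200,
                      n_selected=100, n_overlap=10)
expected = 100 * 200 / 10000  # overlap expected by chance
print(f"expected overlap = {expected:.1f}, p-value = {p:.2e}")
```

Here 10 genes from the list fall in a term where about 2 would be expected by chance, so the p-value is small.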
  
Using the provided code below (PCA and enrichR cells), explore the RNA-Seq data:

1. Perform principal component analysis (PCA) selecting different temperature conditions, time points, replicates and features (i.e. intron or exon). For each scenario described below, explore the principal components (PCs) and comment:
  
    a. With all samples included, determine the primary sources of variance. Is there an outlier? Do replicates cluster as expected?  
    b. Focus on exonic counts and identify which principal component correlates with time.  
    c. In the PCA representation, describe the transcriptome response over time. Are there notable differences between the two conditions? 

2. Extract gene PCA loadings (positive and/or negative) from relevant sample groups. 

    a. Identify gene functions that are temperature-regulated.  
    b. Compare cold-induced versus heat-induced gene functions. What differences do you find?    
    c. Advanced (Optional): Identify and discuss the main sources of variance between replicates.

*Modify only the code marked as 'PCA cell' and 'enrichR' below*.

### Answer 1

1. *Type your answers here*   

a.   
b.  
c.  

2. *Type your answers here*   

a.   
b.  
c.  

#### Load the RNA-seq data (normalized counts) and format it as a pandas DataFrame


In [None]:
#Load 33to38 data
dat_33_38 = pd.read_csv("./GoticData/GSE85553_33-38_exon_intron_RPKM.txt",sep='\t')
dat_33_38=dat_33_38.add_prefix("33to38.")

#Load 38to33 data
dat_38_33 = pd.read_csv("./GoticData/GSE85553_38-33_exon_intron_RPKM.txt",sep='\t')
dat_38_33=dat_38_33.add_prefix("38to33.")

#concatenate
dat_all = pd.concat([dat_33_38, dat_38_33.iloc[:, :20]],axis=1)
dat_all.columns=dat_all.columns.str.replace(r'\.t','.', regex=True)  # raw string avoids an invalid-escape warning
dat_all.replace([np.inf, -np.inf], 0, inplace=True)

#create column names
dat_all.columns=dat_all.columns.str.split('.', expand = True)
dat_all.columns.names = ['condition','time','replicate','feature']

# have a look at the data
with pd.option_context('display.max_columns', None):
    print(dat_all.head())

#### Keep expressed genes (i.e. genes with mean expression > 1 and > 10 RPKM in intron and exon quantification, respectively)


In [None]:
time= ['0', '1' ,'3' ,'6', '9']
replicate = ['2', '13', '11', '5']
conditions = ['33to38','38to33']

dat_intron= subselect(dat_all, ['intron'], replicate, time, conditions)
dat_exon= subselect(dat_all, ['exon'], replicate, time, conditions)
dat_all = dat_all.loc[(dat_intron.mean(axis=1) > 1 ) & (dat_exon.mean(axis=1) > 10), :]

### PCA cell
#### Perform PCA selecting different temperature conditions, time points, replicates and features (intron vs. exon). Change those parameters to select different samples for the PCA.

In [None]:

#Choose the features, conditions, replicates, and time points to include in the analysis for the different questions.
condition = ['38to33','33to38'] 
time= ['0', '1' ,'3' ,'6', '9']
replicate = ['2', '13', '11', '5'] # replicates 13 and 2 are from the 33to38 experiment, 5 and 11 from the 38to33.
feature = ['intron']

#Subselect the samples 
dat_sub = subselect(dat_all, feature, replicate, time, condition)

#Perform the PCA, plot, and return the PCA loadings. Adjust the parameters color_by and symbol_by to answer the different questions.
PC_loadings_time_replicate = run_PCA(dat_sub, n_components=5, color_by='time', symbol_by='replicate', scale_data=True, log_transform=True) 

### Question 2
In this exercise, we will explore the RNA-seq data to find genes whose expression levels change in response to temperature shifts (for example, *Cirbp* is known to be cold-inducible). We will use an approach to directly identify genes that are differentially expressed (DE) when cells are moved from one temperature to another. Modify only the *DE cell* below.

1.1. Use the function *plot_DE* to visualize genes with a change in expression between 33°C and 38°C. Look for genes with patterns similar to or opposite that of *Cirbp* — that is, those that are upregulated (cold-inducible) or downregulated (heat-inducible) when temperature decreases. 

1.2. Using the *plot_gene* function, show the temporal dynamics of genes identified as having temperature-dependent regulation of gene expression.


2.1. Use the function *plot_DE* to highlight genes that are regulated after transcription (i.e. post-transcriptionally) by temperature changes. These genes may show differential expression at the mRNA level that is not seen at the pre-mRNA level. 

2.2. Using the *plot_gene* function, show the temporal dynamics of some of those genes. What can you say in terms of transcriptional versus post-transcriptional regulation?  


*Hint: Be aware that the FC_type argument of the plot_DE function allows you to either:*
- *Highlight genes with 'extreme' fold changes (FC_type = 'delta')*
- *Highlight genes that lie off the diagonal (FC_type = 'delta_delta')*
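
The two selection modes in the hint can be illustrated on a toy table of log2 fold changes. The gene names and values below are made up; the masks mirror the logic of the two `FC_type` options.

```python
import pandas as pd

# Toy log2 fold changes on the two axes (values are made up)
lfc = pd.DataFrame(
    {"x": [2.5, -3.0, 0.2, 2.0, 0.1],
     "y": [2.2, -2.4, 0.1, -0.5, 1.8]},
    index=["g1", "g2", "g3", "g4", "g5"],
)
FC = 1.0  # threshold on the log2 scale

# 'delta': both axes beyond the threshold, in the same direction
delta = ((lfc["x"] > FC) & (lfc["y"] > FC)) | ((lfc["x"] < -FC) & (lfc["y"] < -FC))

# 'delta_delta': the two axes disagree by more than the threshold
delta_delta = (lfc["x"] - lfc["y"]).abs() > FC

print("delta:      ", list(lfc.index[delta]))
print("delta_delta:", list(lfc.index[delta_delta]))
```

In this toy table, `g1` and `g2` change consistently on both axes, whereas `g4` and `g5` change on one axis only and therefore lie off the diagonal.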

3.1. Looking at the temporal dynamics of *Cirbp* during temperature shifts at the intronic and exonic levels, what can you say about its regulation?  

### Answer 2  
1.1. *Type your answer here*  

1.2. *Type your answer here*  

2.1. *Type your answer here*  

2.2. *Type your answer here*  

3.1. *Type your answer here*  


In [None]:
#Running example, modify it to answer the different questions:
DE_genes = plot_DE(dat_all, condition_x1=['33to38'], condition_x2=['33to38'], condition_y1=['38to33'], condition_y2=['38to33'], # modify the condition parameters ['33to38'], ['38to33']
                time_x1=['0'], time_x2=['1'], time_y1=['1'], time_y2=['0'], # modify the time parameters ['0'], ['1'], ['3'], ['6'], ['9']
                feature_x1=['exon'], feature_x2=['exon'], feature_y1=['exon'], feature_y2=['exon'], # modify the feature parameters ['intron'], ['exon']
                FC=1, # adapt the FC threshold for readability
                xlab='38C(t=1) - 33C(t=0) (33to38 condition, Log2FC)',
                ylab='38C(t=0) - 33C(t=1) (38to33 condition, Log2FC)',
                FC_type="delta") # this allows to select the type of fold change to use for the plot ("delta" or "delta_delta", extremities or out-of-diagonal)

### Question 3

We will now simulate the production-decay model from Exercise 1 again, but with a temperature shift (i.e. temperature-dependent parameters).

1. Using the widget provided below, simulate the ODE with distinct parameters for the two temperatures. Comment on the outcome.

2. Use the widget to determine the parameters ($\rho_{33}$, $\rho_{38}$, $k_{m,33}$, $k_{m,38}$) that best fit the *Cirbp* gene expression profile. Comment on your result.  

3. *Optional*: replace *Cirbp* with other genes identified previously as regulated post-transcriptionally. You might want to change the parameters and limits of the y-scale. Comment on your results. 


*Note: For simplification, compared to the model in the paper, we have used a model without the $\alpha$ parameter, which is still adequate for fitting most gene expression profiles. However, a limitation of this simplified model is that the obtained parameters may not be entirely realistic, which is the argument the authors used to favor the model with the $\alpha$ parameter.*
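
When choosing values in the widget, it can help to convert between half-lives and first-order rate constants using $k = \ln 2 / t_{1/2}$. The two helper functions below are hypothetical conveniences for this conversion and are not part of the provided code.

```python
import numpy as np

def rate_from_half_life(t_half_hours):
    """First-order rate constant (h^-1) for a given half-life in hours."""
    return np.log(2) / t_half_hours

def half_life_from_rate(k_per_hour):
    """Half-life in hours for a first-order rate constant in h^-1."""
    return np.log(2) / k_per_hour

# A half-life of 2 minutes, expressed in hours
k = rate_from_half_life(2 / 60)
print(f"k = {k:.3f} per hour")
```

A 2-minute half-life corresponds to roughly 20.8 per hour, which is the order of magnitude of the default `kp` used in the simulation below.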

#### Implementation of the functions for the simulation with a temperature shift

In [None]:
def compute_ss_param(s, kp, km, rho):
    """
    Compute the steady-state concentrations of pre-mRNA and mRNA (P_ss and M_ss) given the parameters of the model.

    """
    P_ss = s / (kp + rho)
    M_ss = (rho / km) * (s / (kp + rho))  # equals (rho / km) * P_ss
    return [P_ss, M_ss]  

def simulation(ode_function, parameters, initial_conditions = None, time = None):

    """
    Simulate the ODEs for the production and decay of pre-mRNA and mRNA.
    parameters: parameters for the ODEs
    initial_conditions: initial conditions for the ODEs
    time: time intervals for the simulation
    """

    s=parameters['s']
    kp=parameters['kp']
    km=parameters['km']
    rho=parameters['rho']

    xx=odeint(ode_function, initial_conditions, time, args=(s, kp, km, rho))
    
    return(xx)

def animate(time, s_33=5560, s_38=5560, km_33=0.1, km_38=0.1, kp= 20.794, rho_33=0.05, rho_38=0.05, gene='Cirbp'):
    
    """
    Runs the simulation with interactive widgets for parameter exploration and using the steady-state of each temperature condition as initial condition to simulate the switch.
    """

    #33 to 38
    initial_conditions = compute_ss_param(s_33, kp, km_33, rho_33)
    parameters = {'s':s_38,'km':km_38, 'kp':kp, 'rho':rho_38}
    xx_33_38 = simulation(production_decay_ode, parameters, initial_conditions, time)
    
    #38 to 33
    initial_conditions = compute_ss_param(s_38, kp, km_38, rho_38)

    parameters = {'s':s_33,'km':km_33, 'kp':kp, 'rho':rho_33}
    xx_38_33 = simulation(production_decay_ode, parameters, initial_conditions, time)
    
   
    plot_gene(dat_all, gene, xx_33_38, xx_38_33, time)    

#### Run the simulation

In [None]:
t = np.linspace(0,9,9*3600)
interact(animate, time=fixed(t), gene = 'Cirbp',
         s_33 = (500,10000,1), # [RPKM/hr] at 33C
         s_38 = (500,10000,1), # [RPKM/hr] at 38C
         km_33 = (0.05,2,0.05), # [hr^-1], you have to find the correct km_33 
         km_38 = (0.05,2,0.05), # [hr^-1], you have to find the correct km_38 
         kp = (0.05,24,0.1),  # [hr^-1], default 20.794 = ln(2)/(2 min), i.e. a half-life of about 2 min for Cirbp
         rho_33 = (0.05,5,0.1), # [hr^-1],  you have to find the correct rho_33 
         rho_38 = (0.05,5,0.1)); # [hr^-1], you have to find the correct rho_38