Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel $\rightarrow$ Restart) and then **run all cells** (in the menubar, select Cell $\rightarrow$ Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name, r-number, collaborators, and the generative AI statement below:

In [None]:
# fill in your name, r-number, and collaborators (if any)
NAME = ""
r_number=""
COLLABORATORS = ""

# gen-AI (LLM) statement. In case large language models (generative AI systems such as ChatGPT, Copilot, ...) were used to complete this notebook,
# fill in which models were used and what for
gen_AI_statement = ""

---


# Hands‑on Patlak Plot

**Goal.** In this notebook you will implement a Patlak plot to estimate the **influx rate constant** 

$$K_i = \frac{K_1 k_3}{k_2 + k_3}$$

using the irreversible 2-tissue compartment (**FDG**) model described by the two coupled differential equations:
$$
\frac{dC_1(t)}{dt} = K_1 C_A(t) - (k_2 + k_3)C_1(t)
$$
$$
\frac{dC_2(t)}{dt} = k_3 C_1(t)
$$

From dynamic PET images, it is possible to extract the tissue activity concentration given by
$$
C_T(t) = C_1(t) + C_2(t)
$$
for different regions of interest (e.g. a tumor or an organ of interest).
The concentration of the tracer in arterial blood $C_A(t)$ can be measured from arterial blood samples.
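
To get a feeling for the model, the two differential equations above can be integrated numerically. The following is only a minimal sketch: the rate constants, the function `arterial_input`, and the time grid are hypothetical choices made for illustration and are not the values behind the data files used later.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical rate constants (1/min), chosen only for illustration
K1, k2, k3 = 0.1, 0.15, 0.05


def arterial_input(t):
    """Hypothetical input function: quick bolus peak followed by a slow tail."""
    return 0.5 * t * np.exp(-t) + 0.02 * (1.0 - np.exp(-t))


def rhs(t, state):
    """Right-hand side of the irreversible 2-tissue compartment model."""
    C1, C2 = state
    CA = arterial_input(t)
    dC1 = K1 * CA - (k2 + k3) * C1
    dC2 = k3 * C1
    return [dC1, dC2]


t_eval = np.linspace(0.0, 60.0, 121)  # minutes
sol = solve_ivp(rhs, (0.0, 60.0), [0.0, 0.0], t_eval=t_eval, rtol=1e-8, atol=1e-10)

C1, C2 = sol.y
CT = C1 + C2              # what a PET region of interest would measure
Ki = K1 * k3 / (k2 + k3)  # net influx rate constant of this toy model

print(f"Ki = {Ki:.4f} 1/min, CT(60 min) = {CT[-1]:.4f}")
```

Because $C_2$ only receives tracer and never releases it, the trapped compartment keeps growing for as long as tracer is present in the blood, which is what makes the model "irreversible".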

**In this notebook, you will** estimate the net influx parameter $K_i$ and the apparent volume of distribution $V_a$ by calculating the slope and intercept in the so-called **Patlak plot** (see lecture and below).



## Recap Patlak Plot

The Patlak plot is a plot of
$$
\underbrace{\frac{C_T(t)}{C_A(t)}}_{\text{y-axis: tissue to arterial blood ratio}}
\quad \text{vs} \quad
\underbrace{\frac{\int_0^t C_A(t^\prime)\, dt^\prime}{C_A(t)}}_{\text{x-axis: "stretched Patlak time"}}
$$


At **"late" times**, when the arterial input function changes only "slowly" (in our examples after ca. 18 min),
the ratio of tissue to arterial blood tracer concentration in the irreversible 2-tissue compartment model is approximately given by
$$
\underbrace{\frac{C_T(t)}{C_A(t)}}_{y(t)} \approx K_i \,\underbrace{\frac{\int_0^t C_A(t^\prime)\, dt^\prime}{C_A(t)}}_{\tilde{t}(t)} \;+\; V_a \, .
$$
Note that the slope of $y$ as a function of the Patlak time $\tilde{t}$ is equal to $K_i$, and the intercept is equal to $V_a$.
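
Why the relation becomes linear can be seen in a few lines. The following is a sketch that assumes zero initial conditions, $C_1(0) = C_2(0) = 0$ (an assumption not stated explicitly above). Adding the two model equations and using the first one to eliminate $C_1(t)$ gives

$$
\frac{dC_T(t)}{dt} = K_1 C_A(t) - k_2 C_1(t) = K_i\, C_A(t) + \frac{k_2}{k_2 + k_3}\,\frac{dC_1(t)}{dt} \, .
$$

Integrating from $0$ to $t$ and dividing by $C_A(t)$ yields the exact relation

$$
\frac{C_T(t)}{C_A(t)} = K_i \,\frac{\int_0^t C_A(t^\prime)\, dt^\prime}{C_A(t)} + \frac{k_2}{k_2 + k_3}\,\frac{C_1(t)}{C_A(t)} \, .
$$

Once $C_A$ changes slowly, $dC_1/dt \approx 0$ in the first model equation, so $C_1(t)/C_A(t) \approx K_1/(k_2 + k_3)$ is nearly constant. The last term then acts as a constant intercept,

$$
V_a \approx \frac{K_1 k_2}{(k_2 + k_3)^2} \, .
$$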



## What we will do in this notebook

In this notebook, we will load and visualize four simulated pairs of arterial input function $C_A$ and tissue concentration $C_T$. For each data set, we will make a Patlak plot and extract $K_i$ as the slope at late times.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

from scipy.integrate import cumulative_trapezoid
from scipy.stats import linregress

Load four simulated pairs of arterial input function $C_A$ and tissue concentration $C_T$. All data sets contain **34 measurements** of $C_A$ and $C_T$ at discrete time points. We call the set of measurements at a fixed time point a **frame**. Note that the spacing of the time points is non-uniform and that, since we start counting at 0, the frames are numbered 0, 1, 2, ..., 33.

In [None]:
# load the time, arterial input function (CA) and the tissue concentration (CT) from files
# in all cases, t is the time of the measurement, CA, is the arterial tracer concentration, and CT is the tissue tracer concentration 

data1 = np.loadtxt("data_01.txt")
t1, CA1, CT1 = data1[:, 0], data1[:, 1], data1[:, 2]

data2 = np.loadtxt("data_02.txt")
t2, CA2, CT2 = data2[:, 0], data2[:, 1], data2[:, 2]

data3 = np.loadtxt("data_03.txt")
t3, CA3, CT3 = data3[:, 0], data3[:, 1], data3[:, 2]

data4 = np.loadtxt("data_04.txt")
t4, CA4, CT4 = data4[:, 0], data4[:, 1], data4[:, 2]

print("first data set")
for i in range(t1.shape[0]):
    print(f"frame: {i:02}, frame time (min): {t1[i]:8.3f}, CA: {CA1[i]:8.5f}, CT: {CT1[i]:8.5f}")


Plot the arterial and tissue concentrations over time for all 4 data sets

In [None]:
# plot time activity curves
fig, ax = plt.subplots(1, 4, figsize=(12, 4), layout='constrained')
ax[0].plot(t1, CA1, '.-', label="arterial input CA1")
ax[0].plot(t1, CT1, '.-', label="tissue concentration CT1")
ax[0].set_xlabel("Time (min)")
ax[0].set_ylabel("activity concentration (arb. units)")
ax[0].legend()
ax[1].plot(t2, CA2, '.-', label="arterial input CA2")
ax[1].plot(t2, CT2, '.-', label="tissue concentration CT2")
ax[1].set_xlabel("Time (min)")
ax[1].legend()
ax[2].plot(t3, CA3, '.-', label="arterial input CA3")
ax[2].plot(t3, CT3, '.-', label="tissue concentration CT3")
ax[2].set_xlabel("Time (min)")
ax[2].legend()
ax[3].plot(t4, CA4, '.-', label="arterial input CA4")
ax[3].plot(t4, CT4, '.-', label="tissue concentration CT4")
ax[3].set_xlabel("Time (min)")
ax[3].legend()

for axx in ax:
    axx.grid(ls = ':')
    axx.set_ylim(0, 0.085)


## Part 1 - Compute the stretched Patlak time

In order to make the Patlak plot, we first have to calculate
the "stretched" Patlak time (the horizontal axis in the Patlak plot) given by the formula:
$$
\tilde{t}(t, C_A) = \frac{\int_0^t C_A(t^\prime)\, dt^\prime}{C_A(t)}
$$

In the following cell, **you have to implement** a function that calculates the stretched Patlak time 
given a discrete time series of sampled arterial input function $C_A(t_i)$ values. 
To numerically calculate the time integral of sampled values of $C_A$ you can use the trapezoidal rule.
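
As a reminder of how a running trapezoidal integral behaves on a non-uniform grid, here is a small sketch on made-up numbers. The arrays `t_demo` and `f_demo` are hypothetical and unrelated to the notebook data; turning this into the Patlak time is still left to you.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

# Hypothetical samples on a non-uniform time grid (not the notebook data)
t_demo = np.array([0.0, 1.0, 3.0, 6.0])
f_demo = np.array([0.0, 2.0, 2.0, 4.0])

# initial=0 prepends the value of the integral at the first time point,
# so the result has the same length as the input
running_integral = cumulative_trapezoid(f_demo, x=t_demo, initial=0)

# trapezoid areas: 1*(0+2)/2 = 1, 2*(2+2)/2 = 4, 3*(2+4)/2 = 9
# running sums:    0, 1, 5, 14
print(running_integral)
```

Without `initial=0` the returned array would be one element shorter than `t_demo`, which is easy to overlook when dividing by the sampled input function afterwards.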

In [None]:
def patlak_time(t: np.ndarray, CA: np.ndarray) -> np.ndarray:
    """Patlak time based on discretized input arrays t (time) and CA (arterial input function)

    Parameters
    ----------
    t : np.ndarray
        discrete time points of the measurements
    CA : np.ndarray
        arterial input function values at time points t

    Returns
    -------
    np.ndarray
        patlak time values at time points t
    """

    # return a 1D numpy array containing the stretched Patlak time for all 34 time frames
    # make sure that the Patlak time of frame 0 (which has time 0) is 0
    # for the numerical integration of CA(t), use scipy.integrate's cumulative_trapezoid (imported above)
    # see https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.cumulative_trapezoid.html
    
    # YOUR CODE HERE
    raise NotImplementedError()

Given your implemented function, we now calculate the 1D arrays containing the Patlak times for all 4 data sets.

In [None]:
t_patlak1 = patlak_time(t1, CA1)
t_patlak2 = patlak_time(t2, CA2)
t_patlak3 = patlak_time(t3, CA3)
t_patlak4 = patlak_time(t4, CA4)

We also calculate the 1D vectors containing the y values of the Patlak plot. This is a simple elementwise division of $C_T$ by $C_A$.
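
The cell below uses `np.divide` with the `where` and `out` arguments to avoid a 0/0 division in the first frame. How this masked division behaves can be seen on a small toy example; the arrays `num` and `den` are hypothetical values.

```python
import numpy as np

# Hypothetical values; the first entry is zero in both arrays
num = np.array([0.0, 1.0, 3.0])
den = np.array([0.0, 2.0, 4.0])

# Entries where den == 0 are skipped and keep the value from `out` (here 0)
ratio = np.divide(num, den, where=den != 0, out=np.zeros_like(den))

print(ratio)  # values: 0, 0.5, 0.75
```

Note that `out` must be a floating point array, so this pattern assumes that the denominator is stored as floats.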

In [None]:
# instead of simply calculating y_patlak1 = CT1/CA1, we use np.divide to make sure 
# the case CT=0/CA=0 for the first (0) frame is handled correctly, in this case y should be 0.
y_patlak1 = np.divide(CT1, CA1, where=CA1 != 0, out=np.zeros_like(CA1))
y_patlak2 = np.divide(CT2, CA2, where=CA2 != 0, out=np.zeros_like(CA2))
y_patlak3 = np.divide(CT3, CA3, where=CA3 != 0, out=np.zeros_like(CA3))
y_patlak4 = np.divide(CT4, CA4, where=CA4 != 0, out=np.zeros_like(CA4))

The **linearized Patlak equation**
is **only valid for "late" times**, when the changes in the input function are slow compared to the exchange between plasma and tissue.
In our examples this is the case for times larger than approximately 18 minutes.

In the following cell we determine the index of the first frame acquired at or after 18 minutes.
Since all data sets in this notebook use the same time points, we should get the same result for all data sets.
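
The lookup relies on `np.searchsorted`, which returns the first index at which a value could be inserted into a sorted array while keeping it sorted. A minimal sketch on a hypothetical array `times`:

```python
import numpy as np

# Hypothetical sorted time points in minutes (not the notebook data)
times = np.array([0.0, 5.0, 10.0, 17.5, 18.0, 20.0, 30.0])

# default side="left": index of the first element >= the searched value
idx_exact = int(np.searchsorted(times, 18.0))    # a point at exactly 18.0 is included
idx_between = int(np.searchsorted(times, 18.5))  # otherwise the next later point is used

print(idx_exact, idx_between)  # 4 5
```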

In [None]:
tstart_patlak = 18.0 # "late" time after which patlak plot becomes linear
# search in the measurement times t1..t4 (real time in minutes), not in the stretched Patlak times
patlak_start_frame_1 = int(np.searchsorted(t1, tstart_patlak))
patlak_start_frame_2 = int(np.searchsorted(t2, tstart_patlak))
patlak_start_frame_3 = int(np.searchsorted(t3, tstart_patlak))
patlak_start_frame_4 = int(np.searchsorted(t4, tstart_patlak))

print(f"First frame after which linearized Patlak equation is valid for data set 1: {patlak_start_frame_1}")
print(f"First frame after which linearized Patlak equation is valid for data set 2: {patlak_start_frame_2}")
print(f"First frame after which linearized Patlak equation is valid for data set 3: {patlak_start_frame_3}")
print(f"First frame after which linearized Patlak equation is valid for data set 4: {patlak_start_frame_4}")

In [None]:
fig2, ax2 = plt.subplots(1, 4, figsize=(12, 4), layout='constrained')
ax2[0].plot(t_patlak1, y_patlak1, '.-')
ax2[0].plot(t_patlak1[patlak_start_frame_1:], y_patlak1[patlak_start_frame_1:], '.')
ax2[0].set_ylabel("Patlak y = CT / CA")
ax2[1].plot(t_patlak2, y_patlak2, '.-')
ax2[1].plot(t_patlak2[patlak_start_frame_2:], y_patlak2[patlak_start_frame_2:], '.')
ax2[2].plot(t_patlak3, y_patlak3, '.-')
ax2[2].plot(t_patlak3[patlak_start_frame_3:], y_patlak3[patlak_start_frame_3:], '.')
ax2[3].plot(t_patlak4, y_patlak4, '.-')
ax2[3].plot(t_patlak4[patlak_start_frame_4:], y_patlak4[patlak_start_frame_4:], '.')
for axx in ax2:
    axx.grid(ls = ':')
    axx.set_xlabel("Patlak time (min)")
    axx.set_aspect(45, adjustable='box') # we make sure that the x/y ratio is the same across all plots to compare slopes

## Part 2 - Linear regression of Patlak transformed data to estimate $K_i$

Given the calculated Patlak times and the Patlak "y values", you will now implement a linear regression that determines the slope ($K_i$) and the intercept ($V_a$) in the linear part of the Patlak plot.
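
As a reminder of what a straight-line fit restricted to the late points looks like, here is a sketch on synthetic numbers. The arrays `x_demo` and `y_demo` and the values 0.02 and 0.5 are made up for illustration; writing `patlak_regression` itself is still your task.

```python
import numpy as np
from scipy.stats import linregress

# Hypothetical data: a line with slope 0.02 and intercept 0.5 ...
x_demo = np.array([0.0, 1.0, 2.0, 4.0, 8.0, 16.0])
y_demo = 0.02 * x_demo + 0.5
# ... except for two early points that do not follow the line
y_demo[:2] = [0.0, 0.3]

# Fit only the points from index 2 onwards
result = linregress(x_demo[2:], y_demo[2:])

print(f"slope = {result.slope:.4f}, intercept = {result.intercept:.4f}")
```

Including the two early points in the fit would bias both the slope and the intercept, which is why the start frame matters.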

In [None]:
def patlak_regression(t_patlak: np.ndarray, y_patlak: np.ndarray, start_frame: int) -> tuple[float, float]:
    """Estimate Ki (slope) and Va (intercept) from the linear Patlak region.

    Parameters
    ----------
    t_patlak : (N,) array
        Stretched Patlak time values (minutes) for each frame.
    y_patlak : (N,) array
        C_T / C_A ratio (unitless) for each frame.
    start_frame : int
        Index of the first frame to include in the linear fit (i.e., use
        t_patlak[start_frame:], y_patlak[start_frame:]).

    Returns
    -------
    Ki : float
        Net influx rate constant (1/min).
    Va : float
        Intercept (apparent distribution volume; unitless).
    """

    # perform a linear regression using linregress from scipy.stats (already imported above)
    # read the documentation at https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html
    # make sure to include only the time frames starting from "start_frame" (where the Patlak plot is linear),
    # meaning that we want to do the regression of y_patlak[start_frame:] vs t_patlak[start_frame:]

    # YOUR CODE HERE
    raise NotImplementedError()

Given the implemented Patlak regression function, we now calculate the slope ($K_i$) and intercept ($V_a$) values for all 4 data sets.

In [None]:
Ki1, Va1 = patlak_regression(t_patlak1, y_patlak1, patlak_start_frame_1)
Ki2, Va2 = patlak_regression(t_patlak2, y_patlak2, patlak_start_frame_2)
Ki3, Va3 = patlak_regression(t_patlak3, y_patlak3, patlak_start_frame_3)
Ki4, Va4 = patlak_regression(t_patlak4, y_patlak4, patlak_start_frame_4)

print("Estimated Ki and Va values from Patlak plots:")
print(f"Data set 1: Ki = {Ki1:.4f}, Va = {Va1:.4f}")
print(f"Data set 2: Ki = {Ki2:.4f}, Va = {Va2:.4f}")
print(f"Data set 3: Ki = {Ki3:.4f}, Va = {Va3:.4f}")
print(f"Data set 4: Ki = {Ki4:.4f}, Va = {Va4:.4f}")

Last but not least, we add the calculated regression line to the Patlak plots.

In [None]:
fig3, ax3 = plt.subplots(1, 4, figsize=(12, 4), layout='constrained')
ax3[0].plot(t_patlak1, y_patlak1, '.-')
ax3[0].plot(t_patlak1[patlak_start_frame_1:], y_patlak1[patlak_start_frame_1:], '.')
ax3[0].plot(t_patlak1, Ki1 * t_patlak1 + Va1, "--", color = "k")
ax3[0].set_ylabel("Patlak y = CT / CA")
ax3[1].plot(t_patlak2, y_patlak2, '.-')
ax3[1].plot(t_patlak2[patlak_start_frame_2:], y_patlak2[patlak_start_frame_2:], '.')
ax3[1].plot(t_patlak2, Ki2 * t_patlak2 + Va2, "--", color = "k")
ax3[2].plot(t_patlak3, y_patlak3, '.-')
ax3[2].plot(t_patlak3[patlak_start_frame_3:], y_patlak3[patlak_start_frame_3:], '.')
ax3[2].plot(t_patlak3, Ki3 * t_patlak3 + Va3, "--", color = "k")
ax3[3].plot(t_patlak4, y_patlak4, '.-')
ax3[3].plot(t_patlak4[patlak_start_frame_4:], y_patlak4[patlak_start_frame_4:], '.')
ax3[3].plot(t_patlak4, Ki4 * t_patlak4 + Va4, "--", color = "k")
for axx in ax3:
    axx.grid(ls = ':')
    axx.set_xlabel("Patlak time (min)")
    axx.set_aspect(45, adjustable='box')

ax3[0].set_title(f"data set 1: Ki = {Ki1:.3f}, Va = {Va1:.2f}", fontsize = "medium")
ax3[1].set_title(f"data set 2: Ki = {Ki2:.3f}, Va = {Va2:.2f}", fontsize = "medium")
ax3[2].set_title(f"data set 3: Ki = {Ki3:.3f}, Va = {Va3:.2f}", fontsize = "medium")
ax3[3].set_title(f"data set 4: Ki = {Ki4:.3f}, Va = {Va4:.2f}", fontsize = "medium")

## Part 3 - Describe your learnings

Briefly summarize (2-4 bullet points) what you learned by executing this notebook in the markdown cell below.

YOUR ANSWER HERE