#### Notes and plots

Plot sources (figures 1 to 8):

https://i.ibb.co/GvZXHZLs/figure1.png

https://i.ibb.co/8LRCHrF2/figure2.png   

https://i.ibb.co/Jw7wWQtg/figure3.png (1st version)

https://i.ibb.co/8nKwSybf/figure3.png (2nd version)

https://i.ibb.co/t191Cng/figure4.png  

https://i.ibb.co/yJbpvvT/figure5.png

https://i.ibb.co/vv12dTTR/figure6.png

https://i.ibb.co/nsMmYRfK/figure7.png

https://i.ibb.co/chN8Q9pg/figure8.png

# **Introduction**

Our project is about describing flagellar movement patterns by extracting and using their dominant modes via principal component analysis (PCA).


The given **`fluidE.mat`** file contains:

**XX** and **YY**: Matrices holding the x- and y-coordinates of each tracked point along the flagellum at every frame. Together, these two matrices encode the shape of the flagellum over time.

**space_scale**: Distance between adjacent tracked points, in micrometers.

**time_scale**: Time between frames, in seconds.

We can use these data to understand flagellar movement patterns (how the flagellum's shape changes over time).




# **Part (1)**

First, we want to visualize how the **raw** flagellum shape changes over time.


We first output the dimensions: $m$, the number of spatial points, and $n$, the number of time steps.

>disp('Data:');

>disp(['Number of spatial points: ', num2str(m)]);

>disp(['Number of time points: ', num2str(n)]);

>disp(['Space scale (\mum): ', num2str(space_scale)]);

>disp(['Time scale (seconds): ', num2str(time_scale)])
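The `disp` calls above rely on `m`, `n`, `space_scale`, and `time_scale` already being in the workspace. A minimal sketch of how they might be obtained, assuming `fluidE.mat` stores its variables under the names listed in the introduction; the fallback values are placeholders (not the real data) so that the snippet runs without the file:

```matlab
% Assumes fluidE.mat is on the MATLAB path and provides XX, YY,
% space_scale and time_scale. Otherwise, fall back to placeholders.
if exist('fluidE.mat', 'file')
    load('fluidE.mat');
else
    XX = zeros(41, 80);  YY = zeros(41, 80);    % placeholder sizes only
    space_scale = 0.15;  time_scale = 0.00165;  % placeholder values only
end

[m, n] = size(XX);   % rows = spatial points, columns = time steps
assert(isequal(size(YY), [m, n]), 'XX and YY must have the same size');
```

Reading the sizes from `XX` rather than hard-coding them keeps the later loops valid if the data set changes.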

Then we plot the first 5 time steps. (Note: this plot is figure 2, but we discuss it first for explanation purposes.)

<img src="https://i.ibb.co/8LRCHrF2/figure2.png" alt="figure2" width="900"/>

We can also display the duration of the first 5 time steps (4 frame intervals):

>disp(['Duration of the first 5 time steps: ', num2str(4 * time_scale), ' seconds']);

This outputs:

 > Duration of the first 5 time steps: 0.0066007 seconds


Together with the figure, this shows how the flagellum's shape changes over the first 5 frames, which span about 6.6 ms. Each plotted curve is one of 5 successive snapshots (time steps) of the flagellum shape.


Next we plot all 80 time steps of the flagellum shapes.

<img src="https://i.ibb.co/GvZXHZLs/figure1.png" alt="figure1" width="900"/>


#### **Creating figures in MATLAB:**

At any single time step $t_i$, the `fluidE.mat` data provide two column vectors of length $m$: one contains all the $X$ coordinates and the other contains all the $Y$ coordinates.

$$
XX(:,i) = \bigl[X_1, X_2, \dots, X_m\bigr]^{\top}
\quad\text{and}\quad
YY(:,i) = \bigl[Y_1, Y_2, \dots, Y_m\bigr]^{\top}
$$

Since the two vectors are aligned index by index, the coordinates of the $j$th point are obtained by pairing

$(X_j,\,Y_j)\quad\longleftrightarrow\quad (\text{XX}(j,i),\,\text{YY}(j,i))$

In MATLAB:

> plot(XX(:,i), YY(:,i))

This plots the entire flagellum shape at time $t_i$ by automatically connecting each point $(XX(j,i),YY(j,i))$ to $(XX(j+1,i),YY(j+1,i))$ in sequence.


Thus, to visualize the flagellum shapes for the first 5 time steps, we simply do:

```
for i = 1:5
    plot(XX(:,i), YY(:,i));
    hold on;
end
```
And for all 80 time steps:

```
for i = 1:80
    plot(XX(:,i), YY(:,i));
    hold on;
end
```

# **Part (2)**

We want to quantify how the flagellum bends and how waves propagate along its length. To do so, we convert each shape into tangent angles, which describe the local direction of the flagellum at each point along its length.

#### **Steps in MATLAB**

For each time step $t_i$ we are given $m$ points along the flagellum, $\bigl\{(X_1,Y_1),\,(X_2,Y_2),\dots,(X_m,Y_m)\bigr\}$.

Then, for each pair of adjacent points $(X_j, Y_j)$ and $(X_{j+1}, Y_{j+1})$ at that time step, we compute the coordinate differences, whose ratio is the slope of the segment joining them:

$$
\Delta x_j(t_i) = X_{j+1}(t_i) - X_j(t_i), \qquad
\Delta y_j(t_i) = Y_{j+1}(t_i) - Y_j(t_i), \qquad
\frac{\Delta y_j(t_i)}{\Delta x_j(t_i)} = \frac{Y_{j+1}(t_i) - Y_j(t_i)}{X_{j+1}(t_i) - X_j(t_i)}.
$$

We then convert these differences into the tangent angle $\phi_j(t_i)$ using `atan2`:

$$
\phi_j(t_i) = \mathrm{atan2}\bigl(\Delta y_j(t_i),\,\Delta x_j(t_i)\bigr)\in(-\pi,\pi],
$$

which coincides with $\arctan\bigl(\Delta y_j(t_i)/\Delta x_j(t_i)\bigr)$ whenever $\Delta x_j(t_i) > 0$, and otherwise selects the correct quadrant.
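The difference between `atan2` and a plain arctangent of the slope only shows up when a segment points backwards ($\Delta x_j < 0$). A minimal sketch with made-up numbers (not taken from the flagellum data):

```matlab
% A segment pointing up and to the left: dx < 0, dy > 0.
dx = -1;
dy =  1;

phi_atan2 = atan2(dy, dx);   % 3*pi/4: second quadrant, as expected
phi_atan  = atan(dy / dx);   % -pi/4: the slope alone loses the quadrant

fprintf('atan2: %.4f rad, atan: %.4f rad\n', phi_atan2, phi_atan);
```

The two results differ by exactly $\pi$, which is why the tangent angles are computed from the pair $(\Delta y_j, \Delta x_j)$ rather than from the slope alone.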

We repeat the previous steps for every time step $t_i$, $i = 1,\dots,n$ (here $n = 80$), and build the $(m-1)\times n$ matrix $\Phi$, where column $i$ lists the tangent angles at time $t_i$: $\bigl[\phi_{1}(t_i),\,\phi_2(t_i),\,\dots,\,\phi_{m-1}(t_i)\bigr]^{\top}$.

$$
  \Phi =
  \begin{bmatrix}
    \phi_1(t_1) & \phi_1(t_2) & \cdots & \phi_1(t_n) \\
    \phi_2(t_1) & \phi_2(t_2) & \cdots & \phi_2(t_n) \\
    \vdots      & \vdots      & \ddots & \vdots      \\
    \phi_{m-1}(t_1) & \phi_{m-1}(t_2) & \cdots & \phi_{m-1}(t_n)
  \end{bmatrix},
$$

with entries

$$
  \Phi_{j,i} = \phi_j(t_i).
$$
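As a small worked example of this construction, the sketch below builds $\Phi$ for two made-up frames of a 4-point curve. It uses `diff` along the first dimension so that no loop over frames is needed; the coordinates are illustrative only.

```matlab
% Two made-up frames (columns) of a 4-point curve (rows).
XX = [0 0; 1 1; 2 2; 3 3];
YY = [0 0; 1 0; 1 0; 0 0];

% diff(..., 1, 1) takes first differences down each column,
% giving the (m-1)-by-n arrays of dy and dx for all frames at once.
phi = atan2(diff(YY, 1, 1), diff(XX, 1, 1));

disp(phi);   % column 1: [pi/4; 0; -pi/4], column 2: all zeros
```

Frame 1 rises, runs flat, then falls, so its angles are $\pi/4$, $0$, $-\pi/4$; frame 2 is a straight horizontal line, so all of its angles are zero.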


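Because $\phi_j \in (-\pi,\pi]$, neighboring angles can differ by an artificial jump of $\pm2\pi$ when the angle crosses the branch cut. We apply the `unwrap` function to the tangent angles to remove these jumps.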
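To illustrate what `unwrap` does, here is a minimal sketch on a made-up angle sequence that crosses the $\pm\pi$ branch cut (the values are not from the data):

```matlab
% Angles creeping past +pi reappear near -pi when restricted to (-pi, pi].
phi_raw = [3.0; 3.1; -3.1; -3.0];

% unwrap adds multiples of 2*pi wherever consecutive entries jump by more than pi.
phi_fix = unwrap(phi_raw);

disp([phi_raw, phi_fix]);   % last two entries become -3.1 + 2*pi and -3.0 + 2*pi
```

After unwrapping, the sequence increases smoothly (3.0, 3.1, about 3.18, about 3.28) instead of dropping by roughly $2\pi$ in the middle.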

Next we need to create physical axes for the kymograph.

**X-axis:** Spatial (arclength) axis:

$$s_j = (j-1) \cdot \text{space_scale}$$

**Y-axis:** Temporal axis:

$$t_i=(i-1) \cdot \text{time_scale}$$

This tells us where along the flagellum (in micrometers) and when (in seconds) each tangent angle was measured.


The whole process is done in MATLAB as:

```
phi = zeros(m-1, n);
for i = 1:n
    dx = diff(XX(:,i));  % [x2-x1, ..., xm-x(m-1)] at time step i
    dy = diff(YY(:,i));
    phi(:,i) = atan2(dy, dx);  % store the tangent angles in column i
end

phi_unwrapped = zeros(size(phi));
for i = 1:n
    phi_unwrapped(:,i) = unwrap(phi(:,i));  % remove 2*pi jumps along the arclength
end

delta_s = space_scale; % distance between two adjacent points
delta_t = time_scale;  % time between frames

% s = [0*space_scale, 1*space_scale, 2*space_scale, ...]
s = (0:size(phi_unwrapped,1)-1)' * delta_s; % arclength positions where phi is defined
t = (0:n-1)' * delta_t;                     % times at which phi is evaluated
```
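The plotting call for the kymograph is not shown in this report. One possible way to draw it is sketched below with `imagesc`, using a synthetic travelling wave in place of `phi_unwrapped`. The sizes, scales, wave parameters, and axis orientation are all assumptions for illustration, not the values or settings behind the figure.

```matlab
% Synthetic stand-in for phi_unwrapped (assumed sizes and scales).
m = 41;  n = 80;
space_scale = 0.15;      % micrometers, assumed
time_scale  = 0.00165;   % seconds, assumed
s = (0:m-2)' * space_scale;
t = (0:n-1)' * time_scale;
phi_unwrapped = 0.8 * sin(2*pi*(s/6 - t'/0.05));   % (m-1)-by-n travelling wave

figure;
imagesc(s, t, phi_unwrapped' * 180/pi);   % rows = time, columns = arclength
set(gca, 'YDir', 'normal');               % optional: time increases upwards
xlabel('Arclength s (\mum)');
ylabel('Time t (s)');
cb = colorbar;
ylabel(cb, 'Tangent angle (degrees)');
title('Kymograph of tangent angle');
```

The transpose puts arclength on the horizontal axis and time on the vertical axis, matching the axis definitions given above. In such a plot a travelling wave appears as slanted stripes.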

<img src="https://i.ibb.co/8nKwSybf/figure3.png" alt="figure3" width="1300"/>


**Interpretation of the kymograph**

The **x-axis** is the position along the flagellum: $s = 0$ is the base, where the flagellum attaches to the cell, and $s\approx 6\,\mu\text{m}$ (rightmost) corresponds to the very tip of the flagellum.

The color bar indicates the tangent angle in degrees.

If we look at the far-left section of the graph, where $s\lesssim0.4$, we can observe a repeating $\text{orange} \leftrightarrow \text{red}$ pattern as time advances ($t = 0,\; 0.02,\; \dots,\; 0.12,\; \dots$).

This region performs only a small, one-sided oscillation: it alternates between two positive values and never passes through zero (green) or negative values (blue). Meanwhile, the mid section ($s\approx3$) shows a full oscillation: from positive, through zero, to negative, and back.

This indicates that the flagellum near the base moves far less than the mid section.

# **Part (3)**

While the tangent-angle matrix $\Phi$ gives the local tangent angle at every position along the flagellum and at every time $t_i$, it is high-dimensional, with $m-1 = 40$ spatial points and $n = 80$ time points. We therefore compress $\Phi$ with a singular value decomposition (SVD) and use it to find the dominant spatial patterns of bending (shape modes), which explain most of the variance in the data.

#### **Steps in MATLAB**

First we compute `phi_mean` ($\bar\phi_j$), the average tangent angle at each spatial point across all $n$ time steps.

$$
  \bar\phi_j
  = \frac{1}{n}\sum_{i=1}^n \Phi_{j,i},
  \quad j=1,\dots,m-1.
$$


Then we subtract it from every column of the data matrix to get the demeaned matrix `demean_phi` ($\Phi'$):

$$
\Phi'_{j,i} = \Phi_{j,i} - \bar\phi_j,
\qquad
\Phi'\in\mathbb R^{(m-1)\times n}.
$$

Next we perform an **SVD** of $\Phi'$:

$$
\Phi' = U\,S\,V^T,
$$

where, for the economy-size decomposition used below with $r = \min(m-1,\,n)$:

$U \in \mathbb{R}^{(m-1) \times r}$: Left singular vectors, whose columns are the **shape modes** $\mathbf{u}_k$.

$S \in \mathbb{R}^{r \times r}$: Diagonal matrix of **singular values** $\sigma_k = s_k \geq 0$.

$V \in \mathbb{R}^{n \times r}$: Right singular vectors, whose columns are the **temporal modes** $\mathbf{v}_k$.

```
phi_mean = mean(phi_unwrapped, 2);
demean_phi = phi_unwrapped - phi_mean;
[U,S,V] = svd(demean_phi, 'econ');
```
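As a sanity check on this step, the sketch below repeats the demeaning and the economy-size SVD on a synthetic tangent-angle matrix (assumed sizes, made-up wave parameters). It then verifies the factor sizes, the exactness of the factorization, and the orthonormality of the shape modes.

```matlab
% Synthetic stand-in for phi_unwrapped (not the flagellum data).
m = 41;  n = 80;
s = (0:m-2)';  t = 0:n-1;
phi_unwrapped = 0.8 * sin(2*pi*(s/40 - t/30)) + 0.3;

phi_mean   = mean(phi_unwrapped, 2);        % time average at each spatial point
demean_phi = phi_unwrapped - phi_mean;      % subtract it from every column
[U, S, V]  = svd(demean_phi, 'econ');

disp(size(U));  disp(size(S));  disp(size(V));     % 40x40, 40x40, 80x40

recon_err = norm(U*S*V' - demean_phi, 'fro');      % factorization is exact up to round-off
orth_err  = norm(U'*U - eye(size(U, 2)), 'fro');   % shape modes are orthonormal
row_mean  = max(abs(mean(demean_phi, 2)));         % every row now averages to zero
fprintf('recon %.2e, orth %.2e, row mean %.2e\n', recon_err, orth_err, row_mean);
```

With $m-1 < n$, the economy-size option keeps only the first $m-1$ columns of $V$, which is why `V` is 80-by-40 here rather than 80-by-80.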


We then create one plot for each of these three matrices.

#### **The first 4 shape modes (figure 4)**

```
figure; clf; hold on;
for k = 1:4
    plot(U(:,k))
end
title('First 4 Shape modes');
hold off;
```
<img src="https://i.ibb.co/Vc3nvfq9/figure4.png" alt="figure4" width="1200"/>

Each line (shape mode) corresponds to a singular (column) vector $\mathbf{u}_k$ of $U$.

**X-axis**: Spatial position along the flagellum, from base ($j=1$) to tip ($j=m-1$).

**Y-axis**: The amplitude of mode $k$ at point $j$. Each $\mathbf{u}_k$ has unit norm, so the bending amplitude in radians is obtained after scaling by $\sigma_k \mathbf{v}_k$.




#### **The first 4 temporal modes**

```
figure; clf; hold on;
for k = 1:4
    plot(V(:,k))
end
title('First 4 Temporal Modes');
hold off;
```
<img src="https://i.ibb.co/yJbpvvT/figure5.png" alt="figure5" width="1200"/>


**X-axis**: Time step index $i = 1,\dots,n$ (time $t_i$).

**Y-axis**: Mode coefficient $v_k(t_i)$; each temporal mode $\mathbf{v}_k$ has unit length.


#### **The strength of the modes (Singular values): Variance Retained vs. Number of Modes**

We compute the cumulative fraction of total variance explained by the first $K$ modes:


$$
\frac{\sum_{k=1}^{K} s_k^2}{\sum_{k=1}^{r} s_k^2}, \qquad K = 1,\dots,r,
$$

where $s_k^2 = \sigma_k^2$ are the squared singular values and $r$ is the number of singular values.

```
s = diag(S);
figure;
plot(cumsum(s.^2)./sum(s.^2), 'bo', 'MarkerFaceColor','b','MarkerSize',6)
xlabel("kth Singular Value");  
ylabel("% Information Retained");
title("Strength of Singular Values")

```

<img src="https://i.ibb.co/vv12dTTR/figure6.png" alt="figure6" width="1200"/>

**X-axis**: Number of modes $K$.

**Y-axis**: Fraction of total variance retained.

This figure shows that the first two modes, $U_1$ and $U_2$, capture most of the flagellum's shape variance.
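One way to turn this plot into a number is to ask how many modes are needed to pass a chosen variance threshold. A minimal sketch with made-up singular values (not those of the flagellum data); the 95% threshold and the names `sv`, `frac`, and `K_needed` are arbitrary choices for illustration:

```matlab
% Hypothetical singular values, sorted in decreasing order as svd returns them.
sv = [5; 4; 1; 0.5];

% Cumulative fraction of variance retained by the first K modes.
frac = cumsum(sv.^2) / sum(sv.^2);

% Smallest K whose retained fraction reaches the threshold.
threshold = 0.95;
K_needed  = find(frac >= threshold, 1);

fprintf('Modes needed for %.0f%% of the variance: %d\n', 100*threshold, K_needed);
```

For these values the first mode retains about 59% and the first two about 97%, so `K_needed` is 2.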

# **Part (4)**

Using the shape and temporal modes found in part (3), we now want the vectors $V_{1}$ and $V_{2}$ that give the coordinates of each frame along the first two dimensions of shape space. $V_{i}$ is obtained by multiplying the $i$th temporal mode by the $i$th singular value:

$$
V_{i}=\sigma_{i}\,\vec{v}_{i},
$$

since the columns of the $V$ matrix found in part (3) are the temporal modes. Plotting $V_{1}$ against $V_{2}$ gives
![](https://cdn.mathpix.com/cropped/2025_07_30_73dec793a37a0ff85a0fg-1.jpg?height=906&width=1335&top_left_y=906&top_left_x=366)

Figure 1: $V_{1}$ vs $V_{2}$

It can be observed that these points form a closed loop. We want to find the best-fitting low-dimensional Fourier series $A \cos (\theta)+B \sin (\theta)$ to describe the loop. To do this, we need the phase angle $\theta$ of each point on the loop, i.e. its polar angle in the plane spanned by the first two temporal modes. This can be computed with the MATLAB code

> theta = unwrap(atan2(V(:,2), V(:,1)));

Now we can use least squares to fit $V_{i}\approx A_{i} \cos (\theta)+B_{i} \sin (\theta)$ and find $A_i$ and $B_i$ for both $V_{1}$ and $V_{2}$. To do this, we solve the normal equations

$$
M^{T} M \vec{x}=M^{T} \vec{b},
$$

where $\vec{b}=V_{i}$, $M=\left[\begin{array}{ll}\overrightarrow{\cos (\theta)} & \overrightarrow{\sin (\theta)}\end{array}\right]$, and $\vec{x}=\left[\begin{array}{ll}A_{i} & B_{i}\end{array}\right]^{T}$. Doing these calculations in MATLAB gives

$$
\begin{aligned}
& V_{1} \approx 4.2925 \cos (\theta)-0.1475 \sin (\theta) \\
& V_{2} \approx-0.1124 \cos (\theta)+4.2011 \sin (\theta)
\end{aligned}
$$

Plotting these best fit series against the data gives
![](https://cdn.mathpix.com/cropped/2025_07_30_73dec793a37a0ff85a0fg-2.jpg?height=909&width=1332&top_left_y=1182&top_left_x=368)

Figure 2: Best Fit Fourier Series
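The fitting code itself is not shown in this report. A minimal sketch of how the least-squares step could be carried out is given below, on a synthetic loop with made-up coefficients. The matrix name `M` is an assumption, and MATLAB's backslash operator is used in place of forming the normal equations explicitly; it solves the same least-squares problem. In the report, `theta` would come from the `atan2` expression above and `V1`, `V2` from the scaled temporal modes.

```matlab
% Synthetic loop standing in for the scaled temporal modes (hypothetical values).
n = 80;
theta = linspace(0, 4*pi, n)';               % phase angle over two beat cycles
V1 =  4.3*cos(theta) - 0.1*sin(theta);
V2 = -0.1*cos(theta) + 4.2*sin(theta);

M  = [cos(theta), sin(theta)];               % n-by-2 matrix of basis functions
c1 = M \ V1;                                 % least-squares [A_1; B_1]
c2 = M \ V2;                                 % least-squares [A_2; B_2]

V_1_fit = M * c1;                            % fitted series evaluated at every frame
V_2_fit = M * c2;

fprintf('V1 ~ %.4f cos + %.4f sin\n', c1(1), c1(2));
fprintf('V2 ~ %.4f cos + %.4f sin\n', c2(1), c2(2));
```

Because the synthetic data contain no noise, the fit recovers the coefficients that generated them. On real data the residual `V1 - V_1_fit` measures how far the loop departs from a pure ellipse.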



# **Part (5)**


Using the least-squares fits of $V_1$ and $V_2$, we can reconstruct the tangent-angle matrix from the first two shape modes:


$$ \phi_{j,i}^{\rm recon} \;=\; \bar\phi_j \;+\; V_{1,\rm fit}(t_i)\,U_{j,1} \;+\; V_{2,\rm fit}(t_i)\,U_{j,2}. $$

In MATLAB:

```
phi_recon = zeros(size(phi_unwrapped));
for i = 1:size(phi_unwrapped, 2)
    % mean shape plus the two leading modes weighted by their fitted coefficients
    phi_recon(:,i) = phi_mean + V_1_fit(i) * U(:,1) + V_2_fit(i) * U(:,2);
end
```
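The same reconstruction can be written as a single matrix product. The sketch below compares the loop form with the matrix form on synthetic inputs; the stand-in modes and coefficients are made up for illustration and are not the ones computed from the data.

```matlab
% Synthetic stand-ins (assumed sizes, made-up values).
m = 41;  n = 80;
j = (1:m-1)';
U = [sin(pi*j/(m-1)), cos(pi*j/(m-1))];      % two stand-in shape modes
phi_mean = 0.1 * ones(m-1, 1);
theta    = linspace(0, 4*pi, n)';
V_1_fit  = 4.3 * cos(theta);
V_2_fit  = 4.2 * sin(theta);

% Loop form, one column per frame.
phi_loop = zeros(m-1, n);
for i = 1:n
    phi_loop(:,i) = phi_mean + V_1_fit(i) * U(:,1) + V_2_fit(i) * U(:,2);
end

% Matrix form: (m-1)-by-2 times 2-by-n, plus the mean added to every column.
phi_recon = phi_mean + U(:,1:2) * [V_1_fit, V_2_fit]';

max_diff = max(abs(phi_loop(:) - phi_recon(:)));
fprintf('Largest difference between the two forms: %.2e\n', max_diff);
```

The matrix form makes it explicit that the reconstruction is a rank-2 approximation of the demeaned data plus the mean shape.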


To reconstruct the $(x,y)$ coordinates of the flagellar centerline, for each time step $t_i$ we initialize

$$x_{1,i} = 0,\quad y_{1,i} = 0, $$

and then, for $j = 2,\dots,m$, we step along the segment whose tangent angle is $\phi_{j-1,i}^{\rm recon}$, where $\Delta s$ is `space_scale`:

$$
\begin{aligned}
  x_{j,i} &= x_{j-1,i} + \Delta s \cdot \cos\bigl(\phi_{j-1,i}^{\rm recon} \bigr)\\
  y_{j,i} &= y_{j-1,i} + \Delta s \cdot \sin\bigl(\phi_{j-1,i}^{\rm recon}\bigr)
\end{aligned}
$$

In MATLAB:

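```
for i = 1:n
    x_recon = zeros(m, 1);   % the base of the flagellum starts at the origin
    y_recon = zeros(m, 1);
    for j = 2:m
        x_recon(j) = x_recon(j-1) + space_scale * cos(phi_recon(j-1, i));
        y_recon(j) = y_recon(j-1) + space_scale * sin(phi_recon(j-1, i));
    end
    % x_recon and y_recon now hold the reconstructed shape at frame i
end
```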
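Since each point is the running sum of the segment steps before it, the integration can also be done for all frames at once with `cumsum`. A minimal sketch on a synthetic angle matrix (assumed sizes and spacing, made-up wave parameters):

```matlab
% Synthetic stand-in for phi_recon (not the reconstructed flagellum data).
m = 41;  n = 80;
space_scale = 0.15;                          % assumed spacing
s = (0:m-2)';  t = 0:n-1;
phi_recon = 0.8 * sin(2*pi*(s/40 - t/30));   % (m-1)-by-n

% Row 1 is the base at the origin; each later row adds one segment step.
x_recon = [zeros(1, n); cumsum(space_scale * cos(phi_recon), 1)];
y_recon = [zeros(1, n); cumsum(space_scale * sin(phi_recon), 1)];

% Every reconstructed segment should have length space_scale.
seg_len = sqrt(diff(x_recon).^2 + diff(y_recon).^2);
fprintf('Segment length range: [%.4f, %.4f]\n', min(seg_len(:)), max(seg_len(:)));
```

Here column `i` of `x_recon` and `y_recon` holds the shape at frame `i`. Because every shape starts at the origin, overlaying it on the raw data would require shifting it by the measured base position, for example `XX(1,i)` and `YY(1,i)`. Whether the report's figure applies such a shift is not stated.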

Plotting this gives us:

<img src="https://i.ibb.co/chN8Q9pg/figure8.png" alt="figure8" width="1200"/>

# **Part (6)**

We can see that the reconstructed shapes roughly match the originals using only the two PCA modes $U_1$ and $U_2$ and their least-squares-fitted coefficients $V_{1,\rm fit}$ and $V_{2,\rm fit}$, which capture roughly $96\%$ of the total shape variance.

The deviation is expected, since the remaining $4\%$ resides in the higher modes, chiefly modes 3 and 4. To improve the reconstruction, we could simply include one or two more shape modes; including the third mode, for example, would likely raise the captured variance to about 98%. Decomposing the data and reconstructing it in a lower dimension is useful in practice: compressing the data with minimal information loss makes it easier to work with and analyze than in its original, higher-dimensional form.
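The link between discarded variance and reconstruction error can be checked numerically. The sketch below uses a synthetic two-wave signal (made-up parameters, not the flagellum data) and the exact SVD coefficients rather than the Fourier-fitted ones. It compares the relative Frobenius error of a rank-$K$ reconstruction with the square root of the variance fraction left in the discarded modes.

```matlab
% Synthetic data with a strong and a weak travelling wave (hypothetical values).
m = 41;  n = 80;
s = (0:m-2)';  t = 0:n-1;
phi = 0.8*sin(2*pi*(s/40 - t/30)) + 0.1*sin(2*pi*(s/13 - t/15));

D = phi - mean(phi, 2);                      % demeaned data
[U, S, V] = svd(D, 'econ');
sv = diag(S);

Kmax = 4;
rel_err  = zeros(Kmax, 1);                   % measured reconstruction error
rel_pred = zeros(Kmax, 1);                   % error predicted from singular values
for K = 1:Kmax
    D_K = U(:,1:K) * S(1:K,1:K) * V(:,1:K)'; % rank-K reconstruction
    rel_err(K)  = norm(D - D_K, 'fro') / norm(D, 'fro');
    rel_pred(K) = sqrt(sum(sv(K+1:end).^2)) / sqrt(sum(sv.^2));
end

disp([rel_err, rel_pred]);                   % the two columns agree
```

The two columns match because the squared Frobenius error of a truncated SVD equals the sum of the discarded squared singular values. A reconstruction built from fitted coefficients, as in Part (5), carries the fit residual on top of this truncation error.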