ikeda042/PhenoPixel

PhenoPixel

PhenoPixel is a backend + frontend app for microscopy cell extraction and batch analytics. The backend exposes APIs under /api/v1, and the frontend provides a UI for running workflows.

The montage below shows a population-level fluorescence overview from two channels, rendered as a GFP / mCherry-style double-stained overlay. It combines the green and magenta fluorescence layers without scale bars, with display intensity balanced across cells while avoiding saturation.

Manual Label 1 Overlay Fluo montage

Cell extraction preview

For Research Use / Citation

This repository is maintained to support reproducible reporting in research papers. If you cite this software in a manuscript, please include:

  • Software name: PhenoPixel
  • Author: Yunosuke Ikeda
  • Contact: d263846@hiroshima-u.ac.jp
  • Repository URL: (this repository URL)
  • Version evidence: Git commit hash used in the analysis
  • Access date: date you accessed the repository

Recommended citation template

Ikeda, Y. PhenoPixel: microscopy single-cell extraction and batch phenotype analysis software.
GitHub repository. URL: <repository-url> (accessed <YYYY-MM-DD>), commit <commit-hash>.

Reproducibility checklist (for papers)

When linking this repository in a paper, we strongly recommend reporting:

  1. Runtime environment (OS, Python, Node.js versions)
  2. Exact backend/frontend dependency snapshots
  3. Input image format and acquisition conditions (e.g., ND2 metadata)
  4. Analysis parameters (Canny thresholds, ROI size, channel count, Auto Annotation on/off)
  5. Labeling protocol used in Annotation (Label 1 criteria)
  6. Bulk Engine mode(s), threshold(s), and export settings
  7. Exact PhenoPixel commit hash used for generating results

ND2 Manager

Use this page to manage ND2 files: upload new datasets, delete existing ones, and select a specific ND2 file to proceed to Cell Extraction.

ND2 manager

Cell Extraction

  1. Configure extraction. For the selected ND2 file, choose the Canny algorithm parameters, ROI crop size, number of fluorescence layers, and whether Auto Annotation is on or off. Press Extract cells to start the extraction run.

Cell extraction setup

  2. Auto annotation behavior. When Auto Annotation is On, an additional post-processing step runs after extraction to automatically separate cells from debris.

The current implementation is a contour-only heuristic. After a contour $C = \{(x_i, y_i)\}_{i=1}^N$ is extracted, the backend computes two geometric scores and assigns Label 1 only when both pass their thresholds.

First, it measures how thick the contour is in the direction orthogonal to the major axis. Let

$$ \Sigma_C = \frac{1}{N-1} \sum_{i=1}^{N} (\mathbf{p}_i - \bar{\mathbf{p}}) (\mathbf{p}_i - \bar{\mathbf{p}})^{\mathsf{T}}, \qquad \mathbf{p}_i = (x_i, y_i)^{\mathsf{T}}. $$

If the eigenvalues of $\Sigma_C$ are $\lambda_1 \ge \lambda_2$, Auto Annotation uses the smaller one, $\lambda_2$, as a width / lateral-spread proxy and accepts only contours satisfying

$$ \lambda_2 \le 120. $$

Second, it measures contour convexity from perimeter ratios. If $P(C)$ is the contour perimeter and $P(\mathrm{Hull}(C))$ is the perimeter of its convex hull, the code defines

$$ \kappa(C) = \frac{P(\mathrm{Hull}(C))}{P(C)}. $$

Because irregular debris or merged objects tend to have a perimeter much longer than their convex hull, they produce smaller $\kappa$. The contour is accepted only when

$$ \kappa(C) > 0.85. $$

The final Auto Annotation score can be written as

$$ s(C) = \mathbf{1}[\lambda_2 \le 120] \, \mathbf{1}[\kappa(C) > 0.85]. $$

Auto Annotation assigns Label 1 when $s(C) = 1$ and N/A otherwise. In other words, it keeps contours that are both laterally compact and close to convex, and it filters out broad, jagged, or debris-like shapes before manual review.
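
The two-score rule above can be sketched in a few lines of NumPy. This is an illustration of the formulas only, not the backend code: the function names are hypothetical, the convex hull is computed with Andrew's monotone chain rather than whatever routine the backend uses, and perimeters are taken as closed-polygon lengths through the contour points in their stored order.

```python
import numpy as np

LAMBDA2_MAX = 120.0  # threshold on the smaller covariance eigenvalue (from the text)
KAPPA_MIN = 0.85     # threshold on hull perimeter / contour perimeter (from the text)


def closed_perimeter(points):
    """Length of the closed polygon through `points` (N x 2), in the given order."""
    diffs = np.roll(points, -1, axis=0) - points
    return float(np.hypot(diffs[:, 0], diffs[:, 1]).sum())


def convex_hull(points):
    """Convex hull via Andrew's monotone chain; vertices in counter-clockwise order."""
    pts = sorted(set(map(tuple, points.tolist())))
    if len(pts) <= 2:
        return np.array(pts, dtype=float)

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            # Pop points that would make a clockwise or collinear turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return np.array(half(pts) + half(reversed(pts)), dtype=float)


def auto_annotation_score(contour):
    """Return (lambda_2, kappa, label1) for one contour given as an N x 2 array."""
    contour = np.asarray(contour, dtype=float)
    cov = np.cov(contour, rowvar=False)          # 1/(N-1) normalisation, as in Sigma_C
    lambda2 = float(np.linalg.eigvalsh(cov)[0])  # eigvalsh returns ascending eigenvalues
    kappa = closed_perimeter(convex_hull(contour)) / closed_perimeter(contour)
    label1 = (lambda2 <= LAMBDA2_MAX) and (kappa > KAPPA_MIN)
    return lambda2, kappa, label1


if __name__ == "__main__":
    t = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
    rod_like = np.column_stack([40.0 * np.cos(t), 8.0 * np.sin(t)])
    print(auto_annotation_score(rod_like))
```

A thin ellipse passes both tests, while a spiky star-shaped contour fails on $\kappa$ because its perimeter is much longer than that of its hull.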

Auto annotation processing

  3. Review results and proceed. When extraction finishes, the right panel shows all extracted cell contours across every frame. From here you can open the generated cell database or go to the cell labeling (annotation) page. If contours are not extracted well (for example, due to mismatched Canny parameters), adjust settings in the parameter tuning section and click Re-extract to run extraction again.

Extraction results and next actions

Database Manager

This screen lists the cell databases generated by Cell Extraction. You can upload or download databases here, making it possible to separate an experiment’s database from the system as a single file.

Database manager

When you click Access on a specific cell database row, you are taken to a page where you can review information for each individual cell.

Database access

The function panel offers the following view modes:

  • Contour: view extracted contours only.
  • Replot: refresh the current plot from stored data.
  • Overlay: overlay contours on the default image.
  • Overlay Raw: overlay contours on the raw image.
  • Overlay Fluo: overlay contours on the fluorescence image.
  • Heatmap: visualize signal intensity as a heatmap.
  • Map 256: render a 256-level mapped view.
  • Map Raw: render the mapped view at native pixel resolution.
  • Distribution: show the value distribution for the selected cell or region.

Function panel modes

Annotation

Auto-detected contours can include debris or merged cells (not single cells), so you need to remove these manually.

Annotation cleanup

To assign Label 1 (right panel), click a target single cell or use Shift + drag to select multiple cells, then press Apply. The right panel updates the labels in real time. You can also revert a cell from Label 1 back to N/A.

Annotation labeling

Annotation labeling multiple

Bulk Engine

For an annotated database, the left panel shows the cells carrying the default label, Label 1. If debris or non-single cells are mixed in, return to the Annotation page and relabel. Once only single cells are labeled, you can run batch analytics on this population.

Bulk engine selection

Bulk engine analysis

Batch analysis modes available in Bulk Engine include:

  • Cell length: measure cell length (µm) from contours.
  • Cell area: compute cell area (px^2).
  • Normalized median: calculate normalized median intensity per cell for a selected channel.
  • FITC aggregation ratio: compute aggregation ratio for FITC signal.
  • Entropy: quantify intensity distribution using entropy (1 - sparsity).
  • Heatmap: generate heatmap vectors/plots for the selected channel.
  • Contours: visualize aligned contours and export contour coordinates.
  • Map256: render a Map256 strip across cells.
  • Raw data: export raw intensity values inside each contour.

JSON export is also supported, including raw intensity data.

Bulk engine analysis modes

For example, in Heatmap mode you can aggregate and visualize GFP localization for all cells of the selected label in a single plot.

Bulk engine heatmap example

Methods

The quantitative routines in PhenoPixel follow a common single-cell analysis pipeline: detect a contour from the phase-contrast image, re-parameterize the cell in its intrinsic coordinate system, and compute shape or fluorescence descriptors that are directly comparable across cells. Let $C = \{(x_i, y_i)\}_{i=1}^n$ be the contour points of one cell and let $\Omega_C$ be the set of pixels inside that contour.

1. Contour Extraction, Principal Axis, and Basis Transform

Contours are extracted from phase-contrast images with a Canny-based pipeline. The major elongation axis is estimated from the covariance of contour coordinates,

$$ \Sigma = \begin{pmatrix} \mathrm{Var}[X_1] & \mathrm{Cov}[X_1, X_2] \\ \mathrm{Cov}[X_1, X_2] & \mathrm{Var}[X_2] \end{pmatrix}, $$

and the principal direction is the solution of

$$ \mathbf{w}^* = \underset{\|\mathbf{w}\| = 1}{\mathrm{arg\,max}}\ \mathbf{w}^{\mathsf{T}} \Sigma \mathbf{w}, \qquad \Sigma \mathbf{w} = \lambda \mathbf{w}. $$

If $Q = (\mathbf{v}_1\ \mathbf{v}_2)$ is the orthonormal eigenvector basis, coordinates are transformed to the cell-aligned frame by

$$ \mathbf{u} = Q^{\mathsf{T}} \mathbf{x}, \qquad \mathbf{x} = Q \mathbf{u}. $$

This removes arbitrary image rotation and makes bent or filamentous cells easier to model analytically.

Because $Q$ is orthonormal, this basis conversion also preserves Euclidean length. For any vector $\mathbf{x}$ and its transformed coordinate $\mathbf{u} = Q^{\mathsf{T}}\mathbf{x}$,

$$ \|\mathbf{u}\|^2 = \mathbf{u}^{\mathsf{T}}\mathbf{u} = (Q^{\mathsf{T}}\mathbf{x})^{\mathsf{T}} (Q^{\mathsf{T}}\mathbf{x}) = \mathbf{x}^{\mathsf{T}} Q Q^{\mathsf{T}} \mathbf{x} = \mathbf{x}^{\mathsf{T}} \mathbf{x} = \|\mathbf{x}\|^2, $$

since $Q^{\mathsf{T}}Q = QQ^{\mathsf{T}} = I$. Therefore distances measured before and after the basis transform are identical.
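
As a numerical check of the argument above, the following sketch builds $Q$ from the covariance of a point set, applies $\mathbf{u} = Q^{\mathsf{T}}\mathbf{x}$, and compares norms before and after. The function name `to_cell_frame` and the synthetic point cloud are illustrative assumptions, not PhenoPixel code.

```python
import numpy as np


def to_cell_frame(points):
    """Return (U, Q): coordinates in the eigenvector basis and the basis itself.

    `points` is an N x 2 array of (x, y) rows. Q has the major-axis eigenvector
    as its first column, so U[:, 0] runs along the cell's long axis.
    """
    points = np.asarray(points, dtype=float)
    cov = np.cov(points, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)  # ascending eigenvalues, orthonormal columns
    Q = eigvecs[:, ::-1]              # reorder so the largest-variance axis comes first
    U = points @ Q                    # row-vector form of u = Q^T x
    return U, Q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Elongated synthetic cloud, rotated by 30 degrees to mimic arbitrary image rotation.
    base = rng.normal(size=(200, 2)) * np.array([20.0, 3.0])
    angle = np.deg2rad(30.0)
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle), np.cos(angle)]])
    X = base @ rot.T

    U, Q = to_cell_frame(X)
    print(np.allclose(Q.T @ Q, np.eye(2)))  # Q is orthonormal
    print(np.allclose(np.linalg.norm(U, axis=1), np.linalg.norm(X, axis=1)))
```

Both checks should print `True` up to floating-point tolerance, which is the length-preservation property the derivation states.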

2. Centerline Fitting and Cell Length

In the aligned frame, the cell centerline is approximated by a $k$-th order polynomial

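$$ \hat{f}(u_1) = \theta^{\mathsf{T}} \phi(u_1), \qquad \theta = (W^{\mathsf{T}} W)^{-1} W^{\mathsf{T}} f. $$

Here $\phi(u_1) = (1, u_1, \dots, u_1^k)^{\mathsf{T}}$ is the monomial basis, $W$ is the design matrix whose rows are $\phi$ evaluated at the sampled points, and $f$ collects the corresponding $u_2$ coordinates, so $\theta$ is the ordinary least-squares solution.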

For a curved cell, the thesis formulation defines cell length as the arc length between the two contour-centerline intersection points:

$$ L = \int_{u_{1,a}}^{u_{1,b}} \sqrt{1 + \left(\frac{d\hat{f}}{du_1}\right)^2}\,du_1. $$
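
A minimal sketch of the fit and the arc-length integral, under stated assumptions: `np.linalg.lstsq` is used in place of the explicit normal-equation formula (it returns the same least-squares solution with better numerical behaviour), the integral is approximated with the trapezoid rule on a dense grid, and the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_centerline(u1, u2, degree=4):
    """Least-squares polynomial u2 ~ f(u1); coefficients are returned lowest order first."""
    W = np.vander(np.asarray(u1, dtype=float), degree + 1, increasing=True)
    theta, *_ = np.linalg.lstsq(W, np.asarray(u2, dtype=float), rcond=None)
    return theta


def arc_length(theta, a, b, samples=2001):
    """Integrate sqrt(1 + f'(u1)^2) from a to b with the trapezoid rule."""
    grid = np.linspace(a, b, samples)
    slope = P.polyval(grid, P.polyder(theta))
    speed = np.sqrt(1.0 + slope ** 2)
    return float(np.sum(0.5 * (speed[1:] + speed[:-1]) * np.diff(grid)))


if __name__ == "__main__":
    # Synthetic bent "cell": points scattered around a parabola in the aligned frame.
    rng = np.random.default_rng(0)
    u1 = np.linspace(-30.0, 30.0, 120)
    u2 = 0.01 * u1 ** 2 + rng.normal(scale=0.5, size=u1.size)
    theta = fit_centerline(u1, u2, degree=4)
    # The integration limits here are simply the extremes of u1, which stands in
    # for the contour-centerline intersection points used in the text.
    print(arc_length(theta, u1.min(), u1.max()))
```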

Centerline fitting

In the current backend implementation, Cell length is returned as a robust PCA major-axis extent of pixels inside the contour and converted with a fixed pixel size of $0.065\,\mu\mathrm{m}/\mathrm{px}$:

$$ L_{\mathrm{API}} \approx (\max_i \pi_i - \min_i \pi_i) \times 0.065. $$
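
One way to read this formula is to take $\pi_i$ as the scalar projection of each interior pixel onto the first principal axis. The sketch below follows that reading; it is an assumption, and it does not reproduce whatever outlier handling makes the backend version "robust". The function name is hypothetical.

```python
import numpy as np

PIXEL_SIZE_UM = 0.065  # fixed pixel size quoted in the text (micrometres per pixel)


def major_axis_length_um(pixels):
    """Extent of a pixel cloud (N x 2) along its first principal axis, in micrometres."""
    pixels = np.asarray(pixels, dtype=float)
    centered = pixels - pixels.mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    major = eigvecs[:, -1]   # eigh sorts ascending, so the last column is the major axis
    proj = centered @ major  # pi_i: projection of each pixel onto the major axis
    return float((proj.max() - proj.min()) * PIXEL_SIZE_UM)


if __name__ == "__main__":
    # A 100 x 5 px axis-aligned block of pixels as a stand-in for a rod-shaped cell.
    xs, ys = np.meshgrid(np.arange(100), np.arange(5))
    block = np.column_stack([xs.ravel(), ys.ravel()])
    print(major_axis_length_um(block))
```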

3. Cell Area and Raw Pixel Export

Cell area is the area enclosed by the contour,

$$ A(C) = \iint_{\Omega_C} 1\,dA, $$

which is stored during extraction and reported by Cell area. Raw data exports the unaggregated intensity set

$$ \{ I(p) \mid p \in \Omega_C \} $$

for the selected channel.
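
For a polygonal contour, the area integral reduces to the shoelace formula, and $\Omega_C$ can be materialised as a boolean pixel mask. The sketch below uses plain NumPy with even-odd ray casting on pixel centres; the function names are hypothetical, and the backend may well rely on a library routine with a different pixel-inclusion convention.

```python
import numpy as np


def shoelace_area(contour):
    """Area enclosed by a closed polygon (N x 2 array of x, y vertices), in px^2."""
    x, y = np.asarray(contour, dtype=float).T
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def inside_mask(contour, shape):
    """Boolean mask of pixel centres inside the polygon (even-odd ray casting)."""
    poly = np.asarray(contour, dtype=float)
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    mask = np.zeros(shape, dtype=bool)
    for (x0, y0), (x1, y1) in zip(poly, np.roll(poly, -1, axis=0)):
        if y0 == y1:
            continue  # horizontal edges never cross a horizontal ray
        crosses = (y0 > yy) != (y1 > yy)
        x_at = x0 + (yy - y0) * (x1 - x0) / (y1 - y0)
        mask ^= crosses & (xx < x_at)  # toggle on every edge crossed to the right
    return mask


def raw_intensities(image, contour):
    """Unaggregated intensity values I(p) for pixels p inside the contour (one channel)."""
    return image[inside_mask(contour, image.shape)]


if __name__ == "__main__":
    square = np.array([[1.5, 1.5], [5.5, 1.5], [5.5, 5.5], [1.5, 5.5]])
    image = np.arange(64).reshape(8, 8)
    print(shoelace_area(square), raw_intensities(image, square).size)
```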

4. Fluorescence Vectorization Along the Centerline

For each intracellular pixel $(p_i, q_i)$ with intensity $G(p_i, q_i)$, the nearest point on the fitted centerline is found by

$$ u_{1,i}^* = \underset{u_1 \in [u_{1,a}, u_{1,b}]}{\mathrm{arg\,min}} \left[ (u_1 - p_i)^2 + (\hat{f}(u_1) - q_i)^2 \right]. $$

This position is converted to arc length,

$$ \ell(u_1) = \int_{u_{1,a}}^{u_1} \sqrt{1 + (\hat{f}'(t))^2}\,dt, \qquad \ell_i^* = \ell(u_{1,i}^*). $$

To obtain a fixed-dimensional descriptor, the arc-length interval $[0, L]$ is divided into $n$ bins and max-pooled:

$$ g_j = \max \{ G(p_i, q_i) \mid \ell_i^* \in I_j \}. $$

If no projected pixel falls into $I_j$, we set $g_j = 0$. The resulting fixed-length localization vector is

$$ \mathbf{g} = (g_1, \dots, g_n)^{\mathsf{T}}. $$

The current implementation uses $n = 35$ and a default polynomial degree of $k = 4$. Heatmap visualizes these peak vectors either in absolute-length coordinates or in relative-position coordinates.
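
The projection, arc-length conversion, and max-pooling steps can be sketched together as follows. This is an illustration under assumptions, not the backend routine: the arg min is approximated by brute-force search over a densely sampled centerline, arc length is the cumulative length of that polyline, intensities are assumed non-negative (so initialising bins to 0 also covers empty bins), and `peak_vector` is a hypothetical name.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def peak_vector(p, q, G, theta, a, b, n_bins=35, samples=1001):
    """Max-pool pixel intensities G along the centerline f(u1) = sum_k theta[k] * u1**k.

    p, q  : pixel coordinates in the cell-aligned frame
    G     : pixel intensities (assumed non-negative)
    a, b  : centerline limits u_{1,a}, u_{1,b}
    """
    p, q, G = (np.asarray(v, dtype=float) for v in (p, q, G))
    grid = np.linspace(a, b, samples)
    curve = P.polyval(grid, theta)

    # Cumulative arc length l(u1) along the sampled centerline.
    seg = np.hypot(np.diff(grid), np.diff(curve))
    ell = np.concatenate([[0.0], np.cumsum(seg)])

    # Nearest sampled centerline point for every pixel (brute-force arg min).
    d2 = (grid[None, :] - p[:, None]) ** 2 + (curve[None, :] - q[:, None]) ** 2
    ell_star = ell[np.argmin(d2, axis=1)]

    # Equal-width bins on [0, L]; clip so that l = L lands in the last bin.
    bins = np.minimum((ell_star / ell[-1] * n_bins).astype(int), n_bins - 1)

    g = np.zeros(n_bins)           # empty bins stay at 0, as in the text
    np.maximum.at(g, bins, G)      # unbuffered per-bin maximum
    return g


if __name__ == "__main__":
    theta = np.zeros(5)            # straight centerline f(u1) = 0, degree-4 layout
    p = np.array([0.5, 1.5, 1.6, 34.5])
    q = np.array([0.0, 1.0, -1.0, 0.2])
    G = np.array([5.0, 2.0, 9.0, 7.0])
    print(peak_vector(p, q, G, theta, a=0.0, b=35.0))
```

The distance matrix has one row per pixel and one column per centerline sample, which is affordable for single cells but would need chunking for much larger regions.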

Peak-vector heatmap construction

5. Normalized Median and Aggregation-Style Scores

For any selected channel, intensities inside a cell are normalized by the cellwise maximum,

$$ \tilde{I}_i = \frac{I_i}{\max_{p \in \Omega_C} I(p)}, \qquad m(C) = \mathrm{median}(\tilde{I}_i). $$

This scalar is reported by Normalized median. A population-level aggregation score can then be written as

$$ R(\tau) = \frac{1}{N} \sum_{c=1}^{N} \mathbf{1}[m(C_c) < \tau]. $$

The current FITC aggregation ratio plot uses this form with a default cutoff $\tau = 0.7414$. In the thesis experiments, the same normalized-median idea was also used for IbpA-GFP and TorA-GFP abnormal-localization calls, with an example threshold of $m \le 0.6$ for those datasets.
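
The normalized median and the population ratio $R(\tau)$ translate directly into NumPy. The sketch below assumes each cell is given as a flat array of intracellular intensities with a strictly positive maximum; the function names are hypothetical.

```python
import numpy as np

TAU_FITC = 0.7414  # default cutoff quoted in the text


def normalized_median(intensities):
    """m(C): median of intensities after dividing by the cellwise maximum."""
    vals = np.asarray(intensities, dtype=float)
    return float(np.median(vals / vals.max()))


def aggregation_ratio(cells, tau=TAU_FITC):
    """R(tau): fraction of cells whose normalized median falls below tau."""
    m = np.array([normalized_median(c) for c in cells])
    return float(np.mean(m < tau))


if __name__ == "__main__":
    cells = [
        [10, 10, 10, 10],  # uniform signal: m = 1.0
        [1, 1, 1, 10],     # one bright focus on a dim background: m = 0.1
        [8, 9, 10],        # nearly uniform: m = 0.9
    ]
    print([normalized_median(c) for c in cells])
    print(aggregation_ratio(cells))
```

A cell with a single bright focus has a low normalized median because most pixels are dim relative to the peak, which is why small $m$ is read as aggregation.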

6. Thesis-Specific Phenotype Calls

For HU-GFP compaction, a 35-bin peak vector is first computed and summarized as

$$ s(C) = \sum_{j=1}^{35} g_j. $$

Using the control population, the abnormality threshold is defined by the 5th percentile,

$$ \tau_{\mathrm{HU}} = Q_{0.05}(\{ s(C_c^{\mathrm{ctrl}}) \}), $$

and the HU aggregation ratio is the fraction of cells with $s(C) < \tau_{\mathrm{HU}}$.

For PI permeability, the mean intracellular PI intensity is

$$ \mu(C) = \frac{1}{|\Omega_C|} \sum_{p \in \Omega_C} I_{\mathrm{PI}}(p), $$

with a control-derived positivity threshold

$$ \tau_{\mathrm{PI}} = Q_{0.95}(\{ \mu(C_c^{\mathrm{ctrl}}) \}). $$

The PI-positive fraction is then the proportion of cells satisfying $\mu(C) > \tau_{\mathrm{PI}}$.
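
Both control-derived calls follow the same pattern: compute a per-cell scalar, take a quantile of the control population as the threshold, and report the fraction of test cells beyond it. The sketch below assumes `np.quantile` with its default linear interpolation as the quantile estimator, which the text does not specify; the function names are hypothetical.

```python
import numpy as np


def hu_aggregation_ratio(peak_vectors, control_peak_vectors):
    """Fraction of cells whose summed peak vector is below the control 5th percentile."""
    s = np.asarray(peak_vectors, dtype=float).sum(axis=1)
    s_ctrl = np.asarray(control_peak_vectors, dtype=float).sum(axis=1)
    tau_hu = float(np.quantile(s_ctrl, 0.05))
    return float(np.mean(s < tau_hu)), tau_hu


def pi_positive_fraction(mean_pi, control_mean_pi):
    """Fraction of cells whose mean PI intensity exceeds the control 95th percentile."""
    tau_pi = float(np.quantile(np.asarray(control_mean_pi, dtype=float), 0.95))
    return float(np.mean(np.asarray(mean_pi, dtype=float) > tau_pi)), tau_pi


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    control = rng.uniform(50.0, 100.0, size=(500, 35))  # synthetic 35-bin peak vectors
    treated = rng.uniform(30.0, 100.0, size=(500, 35))
    print(hu_aggregation_ratio(treated, control))
    print(pi_positive_fraction(rng.normal(120, 30, 500), rng.normal(100, 10, 500)))
```

By construction, roughly 5% of the control population itself falls beyond either threshold, so ratios near 0.05 indicate no difference from control.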

Requirements

  • Python 3.x (Launch uses python3.14)
  • Node.js with npm (frontend dev/build)
  • SQLite (used by the backend; databases generated by Cell Extraction)

Quick Start

Backend:

python3.14 -m venv venv
source ./venv/bin/activate
cd backend
pip install -r requirements.txt
python main.py

Frontend:

cd frontend
npm install
npm run dev

Local URLs

Docker Deploy (Traefik)

Use docker/compose.yaml to start Traefik + backend.

  1. Create backend/.env (use backend/.env.template as a reference)
  2. Set SERVER_HOST and TRAEFIK_ACME_EMAIL
  3. Start:
cd docker
docker compose -f compose.yaml up -d --build

Traefik uses 80/443. Access the hostname set in SERVER_HOST, and the API is exposed under /api/v1.

Tech Stack

Backend:

  • FastAPI, Uvicorn, Pydantic for the API layer
  • SQLAlchemy for SQLite access
  • NumPy, OpenCV, Matplotlib for image processing and plotting

Frontend:

  • React + React Router for the UI
  • Vite for dev/build tooling
  • Chakra UI and Framer Motion for styling and motion

Docs