PhenoPixel

PhenoPixel is a backend + frontend app for microscopy cell extraction and batch analytics. The backend exposes APIs under /api/v1, and the frontend provides a UI for running workflows.

The montage below shows a population-level fluorescence overview from two channels, rendered as a GFP / mCherry-style double-stained overlay. It combines the green and magenta fluorescence layers without scale bars, with display intensity balanced across cells while avoiding saturation.

For Research Use / Citation

This repository is maintained to support reproducible reporting in research papers. If you cite this software in a manuscript, please include:

Software name: PhenoPixel
Author: Yunosuke Ikeda
Contact: d263846@hiroshima-u.ac.jp
Repository URL: (this repository URL)
Version evidence: Git commit hash used in the analysis
Access date: date you accessed the repository

Recommended citation template

Ikeda, Y. PhenoPixel: microscopy single-cell extraction and batch phenotype analysis software.
GitHub repository. URL: <repository-url> (accessed <YYYY-MM-DD>), commit <commit-hash>.

Reproducibility checklist (for papers)

When linking this repository in a paper, we strongly recommend reporting:

Runtime environment (OS, Python, Node.js versions)
Exact backend/frontend dependency snapshots
Input image format and acquisition conditions (e.g., ND2 metadata)
Analysis parameters (Canny thresholds, ROI size, channel count, Auto Annotation on/off)
Labeling protocol used in Annotation (Label 1 criteria)
Bulk Engine mode(s), threshold(s), and export settings
Exact PhenoPixel commit hash used for generating results
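The environment and commit items above can be captured automatically. The following is a minimal sketch, not part of PhenoPixel itself; the function names `run_or_none` and `environment_snapshot` are hypothetical, and it assumes `git` and `node` are on the PATH and that the script runs inside the repository checkout. Missing tools are reported as `None` rather than raising.

```python
import json
import platform
import subprocess
import sys


def run_or_none(cmd):
    """Return the stripped stdout of a command, or None if it cannot be run."""
    try:
        out = subprocess.run(
            cmd, capture_output=True, text=True, check=True, timeout=10
        )
        return out.stdout.strip()
    except (OSError, subprocess.SubprocessError):
        # Command not installed, non-zero exit status, or timeout.
        return None


def environment_snapshot():
    """Collect the runtime facts listed in the reproducibility checklist."""
    return {
        "os": platform.platform(),
        "python": sys.version.split()[0],
        "node": run_or_none(["node", "--version"]),
        "phenopixel_commit": run_or_none(["git", "rev-parse", "HEAD"]),
    }


if __name__ == "__main__":
    print(json.dumps(environment_snapshot(), indent=2))
```

Saving this JSON next to the exported results keeps the commit hash and runtime versions together with the data they describe.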

ND2 Manager

Manage ND2 files on this page: upload new datasets, delete existing ones, and select a specific ND2 file to proceed to Cell Extraction.

Cell Extraction

Configure extraction. For the selected ND2 file, choose the Canny algorithm parameters, ROI crop size, number of fluorescence layers, and whether Auto Annotation is on or off. Press Extract cells to start the extraction run.

Auto annotation behavior. When Auto Annotation is On, an additional post-processing step runs after extraction to automatically separate cells from debris.

The current implementation is a contour-only heuristic. After a contour $C = \{(x_i, y_i)\}_{i=1}^N$ is extracted, the backend computes two geometric scores and assigns Label 1 only when both pass their thresholds.

First, it measures how thick the contour is in the direction orthogonal to the major axis. Let

$$ \Sigma_C = \frac{1}{N-1} \sum_{i=1}^{N} (\mathbf{p}_i - \bar{\mathbf{p}}) (\mathbf{p}_i - \bar{\mathbf{p}})^{\mathsf{T}}, \qquad \mathbf{p}_i = (x_i, y_i)^{\mathsf{T}}. $$

If the eigenvalues of $\Sigma_C$ are $\lambda_1 \ge \lambda_2$, Auto Annotation uses the smaller one, $\lambda_2$, as a width / lateral-spread proxy and accepts only contours satisfying

$$ \lambda_2 \le 120. $$

Second, it measures contour convexity from perimeter ratios. If $P(C)$ is the contour perimeter and $P(\mathrm{Hull}(C))$ is the perimeter of its convex hull, the code defines

$$ \kappa(C) = \frac{P(\mathrm{Hull}(C))}{P(C)}. $$

Because irregular debris or merged objects tend to have a perimeter much longer than their convex hull, they produce smaller $\kappa$. The contour is accepted only when

$$ \kappa(C) > 0.85. $$

The final Auto Annotation score can be written as

$$ s(C) = \mathbf{1}[\lambda_2 \le 120] \, \mathbf{1}[\kappa(C) > 0.85]. $$

Auto Annotation assigns Label 1 when $s(C) = 1$ and N/A otherwise. In other words, it keeps contours that are both laterally compact and close to convex, and it filters out broad, jagged, or debris-like shapes before manual review.
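The two scores can be reproduced with NumPy alone. The sketch below is an illustration of the formulas above, not the backend code: the function names are hypothetical, the convex hull uses Andrew's monotone chain instead of whatever routine the backend calls, and the contour is assumed to be an ordered, closed sequence of $(x, y)$ points.

```python
import numpy as np


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(map(tuple, np.asarray(points, dtype=float))))
    if len(pts) <= 2:
        return np.array(pts, dtype=float)

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return np.array(lower[:-1] + upper[:-1], dtype=float)


def closed_perimeter(points):
    """Perimeter of a closed polygon given as ordered vertices."""
    pts = np.asarray(points, dtype=float)
    return float(np.linalg.norm(np.roll(pts, -1, axis=0) - pts, axis=1).sum())


def auto_annotation_score(contour, max_lambda2=120.0, min_kappa=0.85):
    """Return (s, lambda_2, kappa) for one ordered contour."""
    pts = np.asarray(contour, dtype=float)
    cov = np.cov(pts, rowvar=False)              # uses the 1/(N-1) normalization
    lambda2 = float(np.linalg.eigvalsh(cov)[0])  # eigenvalues ascend: smallest first
    kappa = closed_perimeter(convex_hull(pts)) / closed_perimeter(pts)
    return int(lambda2 <= max_lambda2 and kappa > min_kappa), lambda2, kappa
```

A smooth ellipse with semi-axes 40 px and 8 px passes both tests, while a spiky star-shaped contour fails the convexity test because its perimeter is much longer than that of its hull.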

Review results and proceed. When extraction finishes, the right panel shows all extracted cell contours across every frame. From here you can open the generated cell database or go to the cell labeling (annotation) page. If contours are not extracted well (for example, due to mismatched Canny parameters), adjust settings in the parameter tuning section and click Re-extract to run extraction again.

Database Manager

This screen lists the cell databases generated by Cell Extraction. You can upload or download databases here, so an experiment's database can be moved into or out of the system as a single file.

When you click Access on a specific cell database row, you are taken to a page where you can review information for each individual cell.

The function panel offers the following view modes:

Contour: view extracted contours only.
Replot: refresh the current plot from stored data.
Overlay: overlay contours on the default image.
Overlay Raw: overlay contours on the raw image.
Overlay Fluo: overlay contours on the fluorescence image.
Heatmap: visualize signal intensity as a heatmap.
Map 256: render a 256-level mapped view.
Map Raw: render the mapped view at native pixel resolution.
Distribution: show the value distribution for the selected cell or region.

Annotation

Auto-detected contours can include debris or merged cells (not single cells), so you need to remove these manually.

To assign Label 1, click a target single cell or use Shift + drag to select multiple cells, then press Apply. The right panel updates the labels in real time. You can also revert a cell from Label 1 back to N/A.

Bulk Engine

For an annotated database, the left panel shows the cells carrying the default Label 1. If debris or non-single cells are mixed in, return to the Annotation page and relabel. Once only single cells are labeled, you can run batch analytics on this population.

Batch analysis modes available in Bulk Engine include:

Cell length: measure cell length (µm) from contours.
Cell area: compute cell area (px^2).
Normalized median: calculate normalized median intensity per cell for a selected channel.
FITC aggregation ratio: compute aggregation ratio for FITC signal.
Entropy: quantify intensity distribution using entropy (1 - sparsity).
Heatmap: generate heatmap vectors/plots for the selected channel.
Contours: visualize aligned contours and export contour coordinates.
Map256: render a Map256 strip across cells.
Raw data: export raw intensity values inside each contour.

JSON export is also supported, including raw intensity data.

For example, in Heatmap mode you can aggregate and visualize GFP localization for all cells of the selected label in a single plot.

Methods

The quantitative routines in PhenoPixel follow a common single-cell analysis pipeline: detect a contour from the phase-contrast image, re-parameterize the cell in its intrinsic coordinate system, and compute shape or fluorescence descriptors that are directly comparable across cells. Let $C = \{(x_i, y_i)\}_{i=1}^n$ be the contour points of one cell and let $\Omega_C$ be the set of pixels inside that contour.

1. Contour Extraction, Principal Axis, and Basis Transform

Contours are extracted from phase-contrast images with a Canny-based pipeline. The major elongation axis is estimated from the covariance of contour coordinates,

$$ \Sigma = \begin{pmatrix} \mathrm{Var}[X_1] & \mathrm{Cov}[X_1, X_2] \\ \mathrm{Cov}[X_1, X_2] & \mathrm{Var}[X_2] \end{pmatrix}, $$

and the principal direction is the solution of

$$ \mathbf{w}^* = \underset{\|\mathbf{w}\| = 1}{\mathrm{arg\,max}} \, \mathbf{w}^{\mathsf{T}} \Sigma \mathbf{w}, \qquad \Sigma \mathbf{w} = \lambda \mathbf{w}. $$

If $Q = (\mathbf{v}_1\ \mathbf{v}_2)$ is the orthonormal eigenvector basis, coordinates are transformed to the cell-aligned frame by

$$ \mathbf{u} = Q^{\mathsf{T}} \mathbf{x}, \qquad \mathbf{x} = Q \mathbf{u}. $$

This removes arbitrary image rotation and makes bent or filamentous cells easier to model analytically.

Because $Q$ is orthonormal, this basis conversion also preserves Euclidean length. For any vector $\mathbf{x}$ and its transformed coordinate $\mathbf{u} = Q^{\mathsf{T}}\mathbf{x}$,

$$ \|\mathbf{u}\|^2 = \mathbf{u}^{\mathsf{T}}\mathbf{u} = (Q^{\mathsf{T}}\mathbf{x})^{\mathsf{T}} (Q^{\mathsf{T}}\mathbf{x}) = \mathbf{x}^{\mathsf{T}} Q Q^{\mathsf{T}} \mathbf{x} = \mathbf{x}^{\mathsf{T}} \mathbf{x} = \|\mathbf{x}\|^2, $$

since $Q^{\mathsf{T}}Q = QQ^{\mathsf{T}} = I$. Therefore distances measured before and after the basis transform are identical.
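The basis change and its length-preserving property can be checked numerically. This is a minimal sketch with a hypothetical function name `to_cell_frame`; it assumes contour points are stored one per row and orders the eigenvectors so that the first column of $Q$ is the major axis.

```python
import numpy as np


def to_cell_frame(points):
    """Rotate points into the eigenvector basis of their covariance matrix.

    Returns (u, Q): u has one transformed point per row, and Q holds the
    orthonormal eigenvectors as columns, major axis first.
    """
    pts = np.asarray(points, dtype=float)
    cov = np.cov(pts, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    Q = eigvecs[:, np.argsort(eigvals)[::-1]]
    u = pts @ Q                              # each row is (Q^T x)^T
    return u, Q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.normal(size=(100, 2)) * np.array([20.0, 3.0])
    u, Q = to_cell_frame(demo)
    # Norms agree before and after the transform, up to rounding.
    print(np.allclose(np.linalg.norm(u, axis=1), np.linalg.norm(demo, axis=1)))
```

The inverse transform is `u @ Q.T`, which corresponds to $\mathbf{x} = Q\mathbf{u}$ in the row-vector layout used here.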

2. Centerline Fitting and Cell Length

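In the aligned frame, the cell centerline is approximated by a $k$-th order polynomial fitted by least squares,

$$ \hat{f}(u_1) = \theta^{\mathsf{T}} \phi(u_1), \qquad \theta = (W^{\mathsf{T}} W)^{-1} W^{\mathsf{T}} f, $$

where $\phi(u_1) = (1, u_1, \dots, u_1^k)^{\mathsf{T}}$ is the monomial basis, $W$ is the design matrix whose rows are $\phi(u_{1,i})^{\mathsf{T}}$, and $f$ is the vector of the corresponding $u_2$ coordinates.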

For a curved cell, the thesis formulation defines cell length as the arc length between the two contour-centerline intersection points:

$$ L = \int_{u_{1,a}}^{u_{1,b}} \sqrt{1 + \left(\frac{d\hat{f}}{du_1}\right)^2} \, du_1. $$

In the current backend implementation, Cell length is returned as a robust PCA major-axis extent of pixels inside the contour and converted with a fixed pixel size of $0.065\,\mu\mathrm{m}/\mathrm{px}$:

$$ L_{\mathrm{API}} \approx \left(\max_i \pi_i - \min_i \pi_i\right) \times 0.065, $$

where $\pi_i$ is the projection of the $i$-th interior pixel onto the PCA major axis.
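Both length definitions can be sketched in a few lines. The code below is an illustration under stated assumptions, not the backend routine: the function names are hypothetical, the arc-length integral is evaluated with a plain trapezoid rule, and the major-axis extent uses the raw max-minus-min form of the formula without any of the robustness handling the backend may apply.

```python
import numpy as np


def fit_centerline(u1, u2, degree=4):
    """Least-squares polynomial fit; coefficients are returned lowest order first."""
    W = np.vander(np.asarray(u1, dtype=float), degree + 1, increasing=True)
    theta, *_ = np.linalg.lstsq(W, np.asarray(u2, dtype=float), rcond=None)
    return theta


def arc_length(theta, a, b, samples=2001):
    """Trapezoid-rule approximation of the integral of sqrt(1 + f'(u1)^2) on [a, b]."""
    poly = np.polynomial.Polynomial(theta)
    t = np.linspace(a, b, samples)
    speed = np.sqrt(1.0 + poly.deriv()(t) ** 2)
    return float(np.sum((speed[1:] + speed[:-1]) * np.diff(t)) / 2.0)


def major_axis_extent_um(major_axis_coords, um_per_px=0.065):
    """Extent of the projected pixels along the major axis, converted to micrometres."""
    coords = np.asarray(major_axis_coords, dtype=float)
    return float((coords.max() - coords.min()) * um_per_px)
```

For a straight cell the two definitions agree up to the pixel-size factor; for a bent cell the arc length is longer than the major-axis extent, which is why the two numbers should not be mixed in one analysis.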

3. Cell Area and Raw Pixel Export

Cell area is the area enclosed by the contour,

$$ A(C) = \iint_{\Omega_C} 1 \, dA, $$

which is stored during extraction and reported by Cell area. Raw data exports the unaggregated intensity set

$$ \{ I(p) \mid p \in \Omega_C \} $$

for the selected channel.
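As an illustration of these two quantities, the sketch below computes the polygon area with the shoelace formula and collects the intensities of pixels whose centres lie inside the contour. It is an assumption-laden example rather than the backend implementation: the function names are hypothetical, the contour is taken as ordered $(x, y)$ vertices with $x$ as the column index, and the interior test is a simple even-odd ray cast.

```python
import numpy as np


def shoelace_area(contour):
    """Area (px^2) enclosed by an ordered polygon."""
    pts = np.asarray(contour, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def inside_mask(contour, shape):
    """Boolean mask of pixel centres inside the polygon (even-odd rule)."""
    pts = np.asarray(contour, dtype=float)
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    px, py = cols.astype(float), rows.astype(float)
    mask = np.zeros(shape, dtype=bool)
    for (x0, y0), (x1, y1) in zip(pts, np.roll(pts, -1, axis=0)):
        if y0 == y1:
            continue  # horizontal edges never cross a horizontal ray
        crosses = (y0 > py) != (y1 > py)
        x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
        mask ^= crosses & (px < x_at)  # toggle on each crossing to the right
    return mask


def raw_intensities(image, contour):
    """Unaggregated intensity values of all pixels inside the contour."""
    image = np.asarray(image)
    return image[inside_mask(contour, image.shape)]
```

The pixel count of the mask and the shoelace area agree only approximately in general, because one counts pixel centres and the other measures the polygon itself.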

4. Fluorescence Vectorization Along the Centerline

For each intracellular pixel $(p_i, q_i)$ with intensity $G(p_i, q_i)$, the nearest point on the fitted centerline is found by

$$ u_{1,i}^* = \underset{u_1 \in [u_{1,a}, u_{1,b}]}{\mathrm{arg\,min}} \left[ (u_1 - p_i)^2 + (\hat{f}(u_1) - q_i)^2 \right]. $$

This position is converted to arc length,

$$ \ell(u_1) = \int_{u_{1,a}}^{u_1} \sqrt{1 + (\hat{f}'(t))^2} \, dt, \qquad \ell_i^* = \ell(u_{1,i}^*). $$

To obtain a fixed-dimensional descriptor, the arc-length interval $[0, L]$ is divided into $n$ bins $I_1, \dots, I_n$ and max-pooled:

$$ g_j = \max \{ G(p_i, q_i) \mid \ell_i^* \in I_j \}. $$

If no projected pixel falls into $I_j$, we set $g_j = 0$. The resulting fixed-length localization vector is

$$ \mathbf{g} = (g_1, \dots, g_n)^{\mathsf{T}}. $$

The current implementation uses $n = 35$ and a default polynomial degree of $k = 4$. Heatmap visualizes these peak vectors either in absolute-length coordinates or in relative-position coordinates.
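The projection and max-pooling steps can be sketched as follows. This is a simplified illustration, not the backend code: the function name `peak_vector` is hypothetical, the nearest centerline point is found by brute force on a sampled grid instead of a continuous minimization, arc length is approximated by summing chord lengths along that grid, bins are assumed to be of equal width, and intensities are assumed to be non-negative so that empty bins stay at 0.

```python
import numpy as np


def peak_vector(p, q, intensity, theta, a, b, n_bins=35, samples=1001):
    """Max-pool pixel intensities into arc-length bins along a polynomial centerline.

    p, q      : pixel coordinates in the cell-aligned frame
    intensity : fluorescence value of each pixel (assumed >= 0)
    theta     : polynomial coefficients, lowest order first
    a, b      : u1 limits of the centerline
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    vals = np.asarray(intensity, dtype=float)

    grid = np.linspace(a, b, samples)
    curve = np.polynomial.Polynomial(theta)(grid)

    # Cumulative arc length at each grid point (chord-length approximation).
    seg = np.hypot(np.diff(grid), np.diff(curve))
    ell = np.concatenate([[0.0], np.cumsum(seg)])

    # Nearest sampled centerline point for every pixel.
    d2 = (grid[None, :] - p[:, None]) ** 2 + (curve[None, :] - q[:, None]) ** 2
    ell_star = ell[np.argmin(d2, axis=1)]

    # Equal-width bins on [0, L]; the end point L falls into the last bin.
    bins = np.minimum((ell_star / ell[-1] * n_bins).astype(int), n_bins - 1)

    g = np.zeros(n_bins)
    np.maximum.at(g, bins, vals)  # per-bin maximum; untouched bins remain 0
    return g
```

The distance matrix has one row per pixel and one column per grid sample, so memory grows with the product of the two; for large cells a chunked loop over pixels would be a reasonable variation.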

5. Normalized Median and Aggregation-Style Scores

For any selected channel, intensities inside a cell are normalized by the cellwise maximum,

$$ \tilde{I}_i = \frac{I_i}{\max_{p \in \Omega_C} I(p)}, \qquad m(C) = \mathrm{median}(\tilde{I}_i). $$

This scalar is reported by Normalized median. A population-level aggregation score can then be written as

$$ R(\tau) = \frac{1}{N} \sum_{c=1}^{N} \mathbf{1}[m(C_c) < \tau]. $$

The current FITC aggregation ratio plot uses this form with a default cutoff $\tau = 0.7414$. In the thesis experiments, the same normalized-median idea was also used for IbpA-GFP and TorA-GFP abnormal-localization calls, with an example threshold of $m \le 0.6$ for those datasets.
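A minimal sketch of the two formulas above, using hypothetical function names. The handling of a cell whose maximum intensity is zero is an assumption made here to avoid division by zero; the source does not say how the backend treats that case.

```python
import numpy as np


def normalized_median(intensities):
    """Median of the intensities after dividing by the cellwise maximum."""
    vals = np.asarray(intensities, dtype=float)
    peak = vals.max()
    if peak <= 0:
        return 0.0  # assumption: an all-zero cell is reported as 0
    return float(np.median(vals / peak))


def aggregation_ratio(medians, tau=0.7414):
    """Fraction of cells whose normalized median is below the cutoff tau."""
    m = np.asarray(medians, dtype=float)
    return float(np.mean(m < tau))


if __name__ == "__main__":
    cells = [[10, 20, 40], [30, 35, 40], [5, 38, 40]]
    medians = [normalized_median(c) for c in cells]
    print(medians, aggregation_ratio(medians))
```

A cell with a few bright foci on a dim background has a low normalized median, so a larger $R(\tau)$ indicates more cells with aggregated signal.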

6. Thesis-Specific Phenotype Calls

For HU-GFP compaction, a 35-bin peak vector is first computed and summarized as

$$ s(C) = \sum_{j=1}^{35} g_j. $$

Using the control population, the abnormality threshold is defined by the 5th percentile,

$$ \tau_{\mathrm{HU}} = Q_{0.05}(\{ s(C_c^{\mathrm{ctrl}}) \}), $$

and the HU aggregation ratio is the fraction of cells with $s(C) < \tau_{\mathrm{HU}}$.

For PI permeability, the mean intracellular PI intensity is

$$ \mu(C) = \frac{1}{|\Omega_C|} \sum_{p \in \Omega_C} I_{\mathrm{PI}}(p), $$

with a control-derived positivity threshold

$$ \tau_{\mathrm{PI}} = Q_{0.95}(\{ \mu(C_c^{\mathrm{ctrl}}) \}). $$

The PI-positive fraction is then the proportion of cells satisfying $\mu(C) > \tau_{\mathrm{PI}}$.
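The control-derived thresholds can be sketched with `numpy.quantile`. The function names are hypothetical, and the quantile uses NumPy's default linear interpolation, which is an assumption: the thesis may have used a different percentile convention.

```python
import numpy as np


def hu_aggregation_ratio(ctrl_vectors, sample_vectors):
    """Fraction of sample cells whose summed peak vector is below the control 5th percentile.

    Both inputs are arrays with one peak vector (e.g. 35 bins) per row.
    Returns (ratio, tau_HU).
    """
    s_ctrl = np.sum(np.asarray(ctrl_vectors, dtype=float), axis=1)
    tau = float(np.quantile(s_ctrl, 0.05))
    s = np.sum(np.asarray(sample_vectors, dtype=float), axis=1)
    return float(np.mean(s < tau)), tau


def pi_positive_fraction(ctrl_means, sample_means):
    """Fraction of sample cells whose mean PI intensity exceeds the control 95th percentile.

    Returns (fraction, tau_PI).
    """
    tau = float(np.quantile(np.asarray(ctrl_means, dtype=float), 0.95))
    frac = float(np.mean(np.asarray(sample_means, dtype=float) > tau))
    return frac, tau
```

By construction about 5% of control cells fall on the abnormal side of each threshold, so treated populations are best read relative to that baseline.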

Requirements

Python 3.x (the Quick Start commands use python3.14)
Node.js with npm (frontend dev/build)
SQLite (used by the backend; databases generated by Cell Extraction)

Quick Start

Backend:

python3.14 -m venv venv
source ./venv/bin/activate
cd backend
pip install -r requirements.txt
python main.py

Frontend:

cd frontend
npm install
npm run dev

Backend: http://localhost:3000
Frontend dev server: http://localhost:3001

Local URLs

API base: http://localhost:3000/api/v1
Swagger UI (OpenAPI): http://localhost:3000/api/v1/docs
OpenAPI JSON: http://localhost:3000/api/v1/openapi.json
Health check: http://localhost:3000/api/v1/health

Docker Deploy (Traefik)

Use docker/compose.yaml to start Traefik + backend.

Create backend/.env (use backend/.env.template as a reference)
Set SERVER_HOST and TRAEFIK_ACME_EMAIL
Start:

cd docker
docker compose -f compose.yaml up -d --build

Traefik listens on ports 80 and 443. Open the hostname set in SERVER_HOST; the API is exposed under /api/v1.

Tech Stack

Backend:

FastAPI, Uvicorn, Pydantic for the API layer
SQLAlchemy for SQLite access
NumPy, OpenCV, Matplotlib for image processing and plotting

Frontend:

React + React Router for the UI
Vite for dev/build tooling
Chakra UI and Framer Motion for styling and motion

Docs

Bulk Engine API: backend/app/bulk_engine/README.md
Cell Extraction API: backend/app/cellextraction/README.md
Frontend: frontend/README.md
