PhenoPixel is a FastAPI + React application for microscopy single-cell extraction, annotation, visualization, and batch phenotype analysis. It is designed around ND2 microscopy workflows: upload an ND2 file, extract cell contours, clean labels, inspect individual cells, and export population-level shape and fluorescence descriptors.
The montage below shows a label-1 cell population from microscope_data.db.
The overlay renders fluo1 as magenta and fluo2 as green without scale bars,
with display intensity balanced per cell while avoiding saturation.
| Area | What PhenoPixel provides |
|---|---|
| ND2 management | Upload, list, inspect metadata, download, delete, and parse .nd2 files. |
| Cell extraction | Generate cropped cell images, contours, preview frames, and SQLite databases. |
| Annotation | Review auto-detected cells and relabel single cells or debris in bulk. |
| Cell viewer | Inspect phase, fluorescence, overlays, replot views, heatmaps, map views, distributions, and raw images per cell. |
| Bulk analysis | Export and visualize cell length, area, fluorescence intensity summaries, heatmap vectors, contours, Map256 strips, raw intensities, and JSON/CSV data. |
| Research reporting | Keep database files, exports, screenshots, and method formulas close to the analysis code for reproducible reporting. |
| Component | Notes |
|---|---|
| Python | Local launch examples use python3.14; the Docker backend image uses Python 3.11. |
| Node.js / npm | Required for the React frontend, Docusaurus docs build, and Storybook screenshots. Dependency versions are locked in frontend/package-lock.json. |
| SQLite | Used for extracted cell databases through SQLAlchemy. |
| OpenCV system libraries | Local installs use opencv-python; Docker also installs libgl1 and libglib2.0-0. |
Run the backend and frontend from the repository root in separate terminals.
The local quick-start path uses python3.14; the Docker backend image uses
Python 3.11.
python3.14 -m venv venv
source ./venv/bin/activate
cd backend
pip install -r requirements.txt
python main.py
cd frontend
npm install
npm run dev
If npm reports peer dependency conflicts around Vite and Storybook, use:
npm install --legacy-peer-deps
| Service | URL |
|---|---|
| Backend | http://localhost:3000 |
| Frontend dev server | http://localhost:3001 |
| API base | http://localhost:3000/api/v1 |
| Swagger UI | http://localhost:3000/api/v1/docs |
| OpenAPI JSON | http://localhost:3000/api/v1/openapi.json |
| Health check | http://localhost:3000/api/v1/health |
| Item | Notes |
|---|---|
| Primary input | .nd2 files. Upload through the UI or place them under backend/app/nd2files/. |
| Filename rules | Only .nd2 is accepted; path components are stripped and dots in stems are normalized during processing. |
| Channel/layer modes | single, dual, dual(reversed), triple, and quad are supported by the extraction UI/API. |
| Metadata | ND2 metadata can be viewed from the ND2 Manager. Channel count may be inferred when the metadata exposes it. |
| Time/Z data | The parser can expose ND2 frame metadata, but cell extraction expects the selected layer mode to match the frame/channel organization. Validate previews before running large jobs. |
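The filename rules in the table can be illustrated with a short sketch. The helper name `normalize_nd2_filename` is hypothetical, and replacing dots in the stem with underscores is an assumption about what "normalized" means here; the backend may use a different rule.

```python
from pathlib import PurePosixPath


def normalize_nd2_filename(raw_name: str) -> str:
    """Return a safe ND2 filename or raise ValueError (hypothetical helper)."""
    # Treat backslashes as separators too, then keep only the final component.
    name = PurePosixPath(raw_name.replace("\\", "/")).name
    stem, dot, ext = name.rpartition(".")
    if not dot or not stem or ext.lower() != "nd2":
        raise ValueError(f"only .nd2 files are accepted: {raw_name!r}")
    # Assumption: dots inside the stem are replaced with underscores.
    return stem.replace(".", "_") + ".nd2"
```

Under these assumptions, `normalize_nd2_filename("../../data/sample.v2.nd2")` returns `sample_v2.nd2`, and a name such as `notes.txt` raises `ValueError`.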
Manage ND2 files: upload datasets, inspect metadata, delete old files, and open a selected ND2 file in Cell Extraction.
Choose the layer mode, objective scale, Canny threshold (param1), crop size,
and Auto Annotation setting, then start extraction. The backend creates contour
previews and one or more SQLite databases for downstream analysis.
When extraction finishes, review the preview frames. If contours are poor, adjust parameters and re-extract.
Cell databases generated by extraction can be uploaded, downloaded, renamed, deleted, and opened for cell-level review.
The cell viewer function panel provides these modes:
| Mode | Description |
|---|---|
| Contour | Show extracted contour coordinates. |
| Replot | Replot a cell in its aligned coordinate system. |
| Overlay | Overlay the contour on the default cell image. |
| Overlay Raw | Overlay fluorescence without contour masking on the raw view. |
| Overlay Fluo | Compose fluorescence channels with selectable colors. |
| Heatmap | Visualize centerline-projected fluorescence intensity. |
| Map 256 | Render a 256-level mapped fluorescence view. |
| Map Raw | Render the mapped view at native pixel scale. |
| Distribution | Plot the intensity distribution for the selected cell/channel. |
Overlay Raw places fluorescence-channel signal on top of the PH image, making
it possible to check fluorescence localization against the original cell
morphology.
Overlay Fluo shows the fluorescence channels as an overlay without the PH
background, which is useful for inspecting signal overlap and channel-specific
patterns.
Heatmap projects fluorescence intensity along the selected cell's long axis.
The target fluorescence channel can be selected before plotting, so the same
cell can be compared across channels.
Map 256 renders fluorescence information as a 256-level intensity map. This
view preserves the spatial pattern inside each cell and is useful for examining
subcellular localization, such as Nucleoid positioning.
The same Map256 representation can also be used at population scale. The
example below arranges Map256 views from the Label 1 population in
microscope_data.db, summarizing Nucleoid localization across the selected cell
group.
Distribution shows a histogram of intracellular fluorescence intensities for
the selected cell/channel.
Auto-detected contours may include debris or merged cells. Use Annotation to
move cells between N/A and analysis labels such as Label 1. Click cells
individually or use Shift-drag to select multiple cells before applying a label.
After annotation, use Bulk Engine to run population-level analysis on a selected label. Return to Annotation if non-single cells remain in the selected group.
Bulk Engine modes:
| Mode | Description |
|---|---|
| Cell length | Measure cell length in micrometers from contour geometry and stored pixel size. |
| Cell area | Export cell area in pixels squared. |
| Normalized median | Compute cellwise median intensity after max normalization. |
| FITC aggregation ratio | Report the fraction of cells below the configured normalized-median threshold. |
| Entropy | Quantify fluorescence distribution with entropy / sparsity-style metrics. |
| Heatmap | Generate centerline heatmap vectors and absolute/relative heatmap plots. |
| Contours | Visualize aligned contours and export transformed contour coordinates. |
| Map256 | Render population-level Map256 strips or contour maps. |
| Raw data | Export raw pixel intensities inside each contour. |
JSON and CSV exports are available for downstream analysis.
For example, Heatmap aggregates and visualizes fluorescence localization for
all cells in the selected label.
Contours mode focuses on contour geometry only. It exports aligned contour
coordinates as JSON or CSV, making it easier to prototype and validate
algorithms that use cell outline information without carrying fluorescence
images or raw intensity data through the analysis pipeline.
| Output | Location / format |
|---|---|
| Uploaded ND2 files | backend/app/nd2files/ |
| Cell databases | backend/app/databases/<nd2_stem>.db |
| Extracted contour previews | backend/app/extracted_data/<nd2_stem>/<frame>.png |
| Cell table | SQLite table cells, including cell_id, manual_label, perimeter, area, img_ph, img_fluo1, img_fluo2, contour, center coordinates, objective, and pixel size. |
| Bulk exports | JSON/CSV/PNG responses from Bulk Engine endpoints or browser downloads. |
| Frontend production build | frontend/dist/; the backend serves this folder when it exists. |
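As a rough illustration of working with an extracted cell database, the sketch below counts cells per label using only the standard library. The table name `cells` and the column `manual_label` come from the table above; the reduced demo schema, the cell IDs, and the label strings `"1"` and `"N/A"` are assumptions made for the example.

```python
import sqlite3
from collections import Counter


def count_cells_by_label(conn: sqlite3.Connection) -> Counter:
    """Count rows of the cells table per manual_label value."""
    rows = conn.execute(
        "SELECT manual_label, COUNT(*) FROM cells GROUP BY manual_label"
    ).fetchall()
    return Counter({label: n for label, n in rows})


# Self-contained demo on an in-memory database with a reduced schema.
# The cell IDs and label strings below are illustrative only.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE cells (cell_id TEXT, manual_label TEXT, area REAL)")
demo.executemany(
    "INSERT INTO cells VALUES (?, ?, ?)",
    [("F0C1", "1", 410.0), ("F0C2", "N/A", 95.5), ("F0C3", "1", 388.0)],
)
label_counts = count_cells_by_label(demo)
```

For a real database, the same function could be called on `sqlite3.connect("backend/app/databases/<nd2_stem>.db")` after substituting the actual file name.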
| Parameter | Default / current behavior |
|---|---|
| Extraction threshold param1 | 130 |
| Cell crop size | 200 px |
| Default layer mode | dual in the UI |
| Auto Annotation | On by default in the UI |
| Auto Annotation width screen | second PCA variance lambda_2 <= 120 |
| Auto Annotation convexity screen | hull_perimeter / perimeter > 0.85 |
| Extraction concurrency | CELLEXTRACTION_MAX_CONCURRENCY, default 2 |
| Objective presets | 100x = 0.065 um/px, 60x = 0.108 um/px |
| Heatmap vector bins | 35 |
| Centerline polynomial degree | 4 unless a request overrides it |
| FITC aggregation cutoff | 0.7414 |
The quantitative routines in PhenoPixel follow a common single-cell analysis
pipeline: detect a contour from the phase-contrast image, re-parameterize the
cell in its intrinsic coordinate system, and compute shape or fluorescence
descriptors that are directly comparable across cells.
When Auto Annotation is On, an additional post-processing step runs after
extraction to automatically separate single-cell candidates from debris and
merged cells. The default backend path loads a bundled supervised model from
backend/autoannotation/artifacts/autoannotator.pkl and assigns Label 1 when
the predicted probability is above the trained threshold; otherwise it assigns
N/A.
The model was trained from the bundled reference SQLite dataset
backend/autoannotation/testdata/autoannotation_testdata.db (520 labeled cells:
300 label 1, 220 N/A). The dataset is a single merged DB built from
microscope_data.db and test_database (1).db, with source_db and
source_cell_id provenance columns. Training compares a small set of
dependency-light classifiers and selects the best 5-fold stratified
cross-validation result. The bundled model is an ensemble of weighted
k-nearest neighbors and L2-regularized logistic regression using 96 features
from:
- contour geometry: area, perimeter, circularity, convexity, solidity, bounding box extent, PCA axis variances, eccentricity, and Hu moments.
- cropped images: PH/Fluo contour-inside and outside-ring intensity quantiles, contrast, gradient, and edge-density descriptors.
The selected model achieved F1 0.9608, accuracy 0.9538, precision 0.9423,
and recall 0.9800 in 5-fold CV. The previous contour-only rule remains as a
safe fallback when the model file cannot be loaded or feature extraction fails.
Set PHENOPIXEL_AUTOANNOTATION_MODEL=/path/to/autoannotator.pkl to use a
different trained model.
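The model selection and thresholding described above could be sketched as follows. The function names are hypothetical, the label strings `"1"` and `"N/A"` are assumed encodings, and "above the trained threshold" is read as a strict inequality; none of this is confirmed backend code.

```python
import os
from pathlib import Path

DEFAULT_MODEL_PATH = Path("backend/autoannotation/artifacts/autoannotator.pkl")


def resolve_model_path(env=None) -> Path:
    """Prefer PHENOPIXEL_AUTOANNOTATION_MODEL, else the bundled artifact."""
    env = os.environ if env is None else env
    override = env.get("PHENOPIXEL_AUTOANNOTATION_MODEL", "").strip()
    return Path(override) if override else DEFAULT_MODEL_PATH


def assign_label(probability: float, threshold: float) -> str:
    """Map a predicted single-cell probability to a label string."""
    # Assumption: "above the trained threshold" means strictly greater.
    return "1" if probability > threshold else "N/A"
```

Passing a plain dictionary as `env` keeps the helper easy to test without touching the real process environment.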
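The fallback contour screen computes two geometric scores. After a contour is
extracted, the fallback assigns Label 1 only when both scores pass their
thresholds. First, it measures how thick the contour is in the direction
orthogonal to the major axis. Let $\Sigma$ be the $2 \times 2$ covariance matrix
of the contour coordinates.
If the eigenvalues of $\Sigma$ are $\lambda_1 \ge \lambda_2$, then $\lambda_2$ is
the variance along the minor axis, and the width screen passes when
$\lambda_2 \le 120$.
Second, it measures contour convexity from perimeter ratios. If $P$ is the
contour perimeter and $P_{\mathrm{hull}}$ is the perimeter of its convex hull,
the convexity score is

$$
\rho = \frac{P_{\mathrm{hull}}}{P}.
$$

Because irregular debris or merged objects tend to have a perimeter much longer
than their convex hull, they produce smaller $\rho$.
The final Auto Annotation score can be written as

$$
s = \mathbf{1}[\lambda_2 \le 120] \cdot \mathbf{1}[\rho > 0.85].
$$

The fallback assigns Label 1 when $s = 1$ and N/A otherwise. In other
words, it keeps contours that are both laterally compact and close to convex.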
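Contours are extracted from phase-contrast images with a Canny-based pipeline. The major elongation axis is estimated from the covariance of contour coordinates $(x, y)$,

$$
\Sigma = \begin{pmatrix} \mathrm{Var}(x) & \mathrm{Cov}(x, y) \\ \mathrm{Cov}(x, y) & \mathrm{Var}(y) \end{pmatrix},
$$

and the principal direction is the solution of

$$
\Sigma \mathbf{q} = \lambda \mathbf{q}.
$$

If $Q = (\mathbf{q}_1 \;\; \mathbf{q}_2)$ collects the unit eigenvectors as columns, with $\mathbf{q}_1$ belonging to the larger eigenvalue, each contour point is expressed in the cell-aligned basis as

$$
\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = Q^{-1} \begin{pmatrix} x \\ y \end{pmatrix}.
$$

This removes arbitrary image rotation and makes bent or filamentous cells easier to model analytically.
Because $\Sigma$ is symmetric, its eigenvectors are orthogonal, so the inverse reduces to a transpose,

$$
Q^{-1} = Q^{\top},
$$

since $Q^{\top} Q = I$.
In the aligned frame, the cell centerline is approximated by a degree-$k$ polynomial (degree 4 unless a request overrides it),

$$
f(u_1) = \sum_{j=0}^{k} \theta_j u_1^{j}.
$$

For a curved cell, the thesis formulation defines cell length as the arc length between the two contour-centerline intersection points $u_{1,a}$ and $u_{1,b}$:

$$
L = \int_{u_{1,a}}^{u_{1,b}} \sqrt{1 + \left( f'(u_1) \right)^2} \, du_1.
$$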
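The arc-length definition of cell length can be evaluated numerically once a polynomial centerline is available. The sketch below is an illustration, not the backend routine: the function names are hypothetical, coefficients are assumed to be in ascending order, and composite Simpson's rule is one reasonable choice of quadrature. The micrometer conversion uses the 100x preset of 0.065 um/px from the parameter table.

```python
import math


def polyval(coeffs, x):
    """Evaluate a polynomial with coefficients in ascending order."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def arc_length(coeffs, a, b, steps=1000):
    """Arc length of u2 = f(u1) on [a, b] by composite Simpson's rule."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals.
    deriv = [j * c for j, c in enumerate(coeffs)][1:]
    h = (b - a) / steps

    def speed(x):
        return math.sqrt(1.0 + polyval(deriv, x) ** 2)

    total = speed(a) + speed(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * speed(a + k * h)
    return total * h / 3.0


def length_um(length_px, um_per_px=0.065):
    """Convert a pixel length to micrometers (default: 100x preset)."""
    return length_px * um_per_px
```

For a straight centerline `f(u1) = u1` on `[0, 10]`, `arc_length([0.0, 1.0], 0.0, 10.0)` is `10 * sqrt(2)` up to floating-point error, which is a convenient sanity check before applying the routine to a fitted degree-4 curve.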
In the current backend implementation, Cell length is returned as a robust PCA
major-axis extent of pixels inside the contour and converted with the stored
pixel size. For the 100x preset, this is:

$$
L_{\mu\mathrm{m}} = 0.065 \, L_{\mathrm{px}}.
$$

Cell area is the area of the region $\Omega$ enclosed by the contour,

$$
A = \iint_{\Omega} dx \, dy,
$$

which is stored during extraction and reported by Cell area. Raw data
exports the unaggregated intensity set

$$
\{ I(p) : p \in \Omega \}
$$

for the selected channel, where $I(p)$ is the intensity at pixel $p$.
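For each intracellular pixel $\mathbf{u}_i = (u_{1,i}, u_{2,i})$ with intensity $I_i$, the nearest point on the fitted centerline $u_2 = f(u_1)$ is located at

$$
t_i = \arg\min_{t} \left[ (u_{1,i} - t)^2 + (u_{2,i} - f(t))^2 \right].
$$

This position is converted to arc length,

$$
s_i = \int_{t_0}^{t_i} \sqrt{1 + \left( f'(t) \right)^2} \, dt,
$$

measured from the start $t_0$ of the centerline.
To obtain a fixed-dimensional descriptor, the arc-length interval $[0, L]$, where $L$ is the centerline arc length of the cell, is divided into $B$ bins $\mathcal{B}_1, \dots, \mathcal{B}_B$, and each bin keeps its peak intensity,

$$
h_b = \max_{i \,:\, s_i \in \mathcal{B}_b} I_i.
$$

If no projected pixel falls into $\mathcal{B}_b$, that bin has no observed peak.
The current implementation uses $B = 35$. Heatmap visualizes these peak vectors either in absolute-length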
coordinates or in relative-position coordinates.
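The binning step that turns projected pixels into a fixed-length peak vector might look like the sketch below. The function name is hypothetical, equal-width bins are an assumption, and filling empty bins with `0.0` is also an assumption, since the fill rule is not stated here.

```python
def peak_vector(arc_positions, intensities, total_length, n_bins=35, empty_value=0.0):
    """Per-bin maximum intensity along the centerline arc length."""
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    peaks = [None] * n_bins
    for s, value in zip(arc_positions, intensities):
        # Clamp so that s == total_length lands in the last bin.
        b = min(max(int(s / total_length * n_bins), 0), n_bins - 1)
        if peaks[b] is None or value > peaks[b]:
            peaks[b] = value
    # Assumption: empty bins are filled with empty_value (0.0 by default).
    return [empty_value if p is None else p for p in peaks]
```

Because every cell yields a vector of the same length, vectors from different cells can be stacked row by row for a population heatmap, either against absolute arc length or against relative position.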
Map 256 keeps more spatial information than the 35-bin heatmap vector. It uses
the same cell-aligned coordinate system and polynomial centerline, but instead
of reducing each longitudinal bin to a single peak value, it remaps every
intracellular fluorescence pixel onto a two-dimensional long-axis / lateral-axis
image.
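Before mapping, the selected fluorescence image $I$ is converted to grayscale and background-subtracted with a morphological opening:

$$
\tilde{I} = I - (I \circ K),
$$

where $I \circ K$ is the opening of $I$ by the structuring element $K$.
For each pixel $\mathbf{u}_i = (u_{1,i}, u_{2,i})$ inside the contour, the
nearest point on the fitted centerline $u_2 = f(u_1)$ is located at

$$
t_i = \arg\min_{t} \left[ (u_{1,i} - t)^2 + (u_{2,i} - f(t))^2 \right].
$$

The projected long-axis coordinate is the centerline arc length,

$$
s_i = \int_{t_0}^{t_i} \sqrt{1 + \left( f'(t) \right)^2} \, dt,
$$

measured from the start $t_0$ of the centerline,
and the signed lateral coordinate is

$$
d_i = \sigma_i \left\| \mathbf{u}_i - (t_i, f(t_i)) \right\|, \qquad \sigma_i \in \{-1, +1\},
$$

where $\sigma_i$ records on which side of the centerline the pixel lies.
These coordinates define a high-resolution map $M$.
Each intracellular pixel is assigned to integer map coordinates $(r_i, c_i)$ obtained by discretizing $(d_i, s_i)$,
and intensities are max-pooled into the mapped image:

$$
M[r, c] = \max_{i \,:\, (r_i, c_i) = (r, c)} \tilde{I}(\mathbf{u}_i).
$$

The implementation also writes the same value to the four direct neighbors $(r \pm 1, c)$ and $(r, c \pm 1)$.
For Map 256, the high-resolution map is resized by nearest-neighbor
interpolation to a fixed display size.
The display image $D$ is then converted to 8-bit intensity,

$$
D_8 = 255 \cdot \frac{D - D_{\min}}{D_{\max} - D_{\min}},
$$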
with a zero image used when the map has no intensity range. Map Raw keeps the
native high-resolution map before the fixed-size resize. The Jet view applies a
pseudocolor map to the same normalized intensity image.
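The max-pooling and 8-bit conversion steps can be sketched with plain lists. Both function names are hypothetical. Keeping the larger value when a neighbor write collides with an existing entry, and rounding to the nearest integer during scaling, are assumptions rather than documented behavior.

```python
def max_pool_into_map(coords, values, height, width):
    """Max-pool pixel values into a height x width grid of zeros."""
    grid = [[0.0] * width for _ in range(height)]
    for (r, c), v in zip(coords, values):
        # Write to the cell and its four direct neighbors.
        # Assumption: collisions keep the larger value.
        for rr, cc in ((r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= rr < height and 0 <= cc < width:
                grid[rr][cc] = max(grid[rr][cc], v)
    return grid


def to_uint8(grid):
    """Min-max scale a grid to integers in 0..255; flat grids become zeros."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0] * len(row) for row in grid]
    scale = 255.0 / (hi - lo)
    return [[int(round((v - lo) * scale)) for v in row] for row in grid]
```

The flat-grid branch mirrors the zero image used when the map has no intensity range, which also avoids a division by zero.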
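For population-level Map256 contour plots, the backend first builds one map per cell. To make the population summary less sensitive to arbitrary left/right or top/bottom orientation, each cell contributes four symmetric variants: the original map, left-right flip, top-bottom flip, and both flips. The population mean map is

$$
\bar{M} = \frac{1}{4N} \sum_{n=1}^{N} \left( M_n + M_n^{\mathrm{LR}} + M_n^{\mathrm{TB}} + M_n^{\mathrm{LR+TB}} \right),
$$

where $M_n$ is the map of cell $n$, $N$ is the number of cells, and the superscripts denote the left-right flip, the top-bottom flip, and both flips. In relative mode,
each cell map is normalized before averaging; in absolute mode, the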
background-subtracted raw intensities are averaged. This makes Map256 useful for
summarizing subcellular localization patterns, such as Nucleoid positioning,
across a labeled cell population. The square montage below was generated from
all Label 1 cells in the current microscope_data.db.
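The four-variant averaging can be sketched for equally sized maps stored as nested lists. The function names are hypothetical, and the per-cell normalization used in relative mode is deliberately left out of this example.

```python
def flips(grid):
    """Return the original grid and its LR, TB, and LR+TB flips."""
    lr = [row[::-1] for row in grid]
    tb = grid[::-1]
    both = [row[::-1] for row in grid[::-1]]
    return [grid, lr, tb, both]


def population_mean_map(cell_maps):
    """Average all four symmetric variants of every equally sized cell map."""
    variants = [v for m in cell_maps for v in flips(m)]
    height, width = len(variants[0]), len(variants[0][0])
    return [
        [sum(v[r][c] for v in variants) / len(variants) for c in range(width)]
        for r in range(height)
    ]
```

A single 2 x 2 map with all of its signal in one corner averages to a uniform map, which shows how the flips remove any dependence on the arbitrary orientation of an individual cell.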
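For any selected channel, intensities $I_i$ inside a cell are normalized by the cellwise maximum,

$$
\hat{I}_i = \frac{I_i}{\max_j I_j}, \qquad m = \operatorname{median}_i \hat{I}_i.
$$

This scalar $m$ is reported by Normalized median. A population-level aggregation
score can then be written as

$$
R = \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}[m_n < \tau],
$$

where $m_n$ is the normalized median of cell $n$ and $N$ is the number of cells.
The current FITC aggregation ratio plot uses this form with a default cutoff $\tau = 0.7414$.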
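The normalized median and the aggregation ratio reduce to a few lines with the standard library. The function names are hypothetical, the default cutoff is the 0.7414 value from the parameter table, and returning 0 for an all-zero cell is an assumption.

```python
from statistics import median


def normalized_median(intensities):
    """Median of one cell's intensities after dividing by the cell maximum."""
    peak = max(intensities)
    if peak <= 0:
        return 0.0  # Assumption: an all-zero cell is reported as 0.
    return median(v / peak for v in intensities)


def aggregation_ratio(cells, cutoff=0.7414):
    """Fraction of cells whose normalized median is below the cutoff."""
    scores = [normalized_median(c) for c in cells]
    return sum(s < cutoff for s in scores) / len(scores)
```

For example, a cell with intensities `[10, 20, 40]` has a normalized median of 0.5 and counts toward the ratio, while `[30, 38, 40]` has 0.95 and does not.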
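For HU-GFP compaction, a 35-bin peak vector is first computed and summarized as a scalar compaction score $c_n$ for each cell $n$.
Using the control population, the abnormality threshold is defined by the 5th percentile,

$$
\tau_{\mathrm{HU}} = Q_{0.05}\left( \{ c_n^{\mathrm{ctrl}} \} \right),
$$

and the HU aggregation ratio is the fraction of cells with $c_n < \tau_{\mathrm{HU}}$.
For PI permeability, the mean intracellular PI intensity is

$$
\bar{I}_{\mathrm{PI}} = \frac{1}{|\Omega|} \sum_{p \in \Omega} I_{\mathrm{PI}}(p),
$$

where $\Omega$ is the set of pixels inside the contour,
with a control-derived positivity threshold $\tau_{\mathrm{PI}}$.
The PI-positive fraction is then the proportion of cells satisfying $\bar{I}_{\mathrm{PI}} > \tau_{\mathrm{PI}}$.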
This repository is maintained to support reproducible reporting in research papers. If you cite this software in a manuscript, include:
- Software name: PhenoPixel
- Author: Yunosuke Ikeda
- Contact: d263846@hiroshima-u.ac.jp
- Repository URL: this repository URL
- Version evidence: Git commit hash used in the analysis
- Access date: date you accessed the repository
Recommended citation:
Ikeda, Y. PhenoPixel: microscopy single-cell extraction and batch phenotype analysis software.
GitHub repository. URL: <repository-url> (accessed <YYYY-MM-DD>), commit <commit-hash>.
For reproducibility, report OS/Python/Node versions, dependency snapshots, ND2 metadata or acquisition conditions, extraction parameters, labeling criteria, Bulk Engine modes, thresholds, export settings, and the exact commit hash.
Backend checks:
source ./venv/bin/activate
PYTHONPATH=backend python -m unittest discover backend/tests
Frontend checks:
cd frontend
npm run build
npm run lint
Screenshot/storybook assets:
./docs/make_screenshots.sh
Build the frontend before serving the app from the backend:
cd frontend
npm install --legacy-peer-deps
npm run build
cd ../backend
source ../venv/bin/activate
python main.py
When frontend/dist/index.html exists, backend/main.py serves the SPA and
static assets alongside the /api/v1 API.
docker/compose.yaml starts Traefik and the backend container. The current
compose file exposes the backend under /api/v1/ and mounts persistent backend
data directories.
Create backend/.env first:
cp backend/.env.template backend/.env
Environment variables used by local/Docker deployment:
| Variable | Used by | Meaning |
|---|---|---|
| SLACK_WEHBOOK_URL | Backend | Optional Slack webhook URL for job notifications. The variable name intentionally matches the current code/template spelling. |
| BASE_PATH | Backend | Optional frontend base URL used in Slack notification links. |
| CELLEXTRACTION_MAX_CONCURRENCY | Backend | Optional extraction job concurrency limit; defaults to 2. |
| SERVER_HOST | Docker compose / Traefik | Hostname routed to the backend; compose defaults to localhost when unset. |
| TRAEFIK_ACME_EMAIL | Docker compose / Traefik | Email address for Let's Encrypt certificate registration. |
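As an illustration of how an optional variable with a default can be read, the sketch below parses `CELLEXTRACTION_MAX_CONCURRENCY` and falls back to 2. The function name is hypothetical, and treating non-numeric or non-positive values as invalid is an assumption about the desired behavior, not a description of the backend.

```python
import os


def extraction_concurrency(env=None, default=2):
    """Read CELLEXTRACTION_MAX_CONCURRENCY, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("CELLEXTRACTION_MAX_CONCURRENCY", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    # Assumption: non-positive values are treated as invalid.
    return value if value >= 1 else default
```

Calling `extraction_concurrency({})` returns the default of 2, and a mapping with the variable set to `"4"` returns 4.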
Start the stack:
cd docker
docker compose -f compose.yaml up -d --build
Traefik uses ports 80 and 443. Access the hostname set in SERVER_HOST;
the API is exposed under /api/v1.
| Symptom | What to try |
|---|---|
| Port already in use | Backend defaults to 3000, frontend to 3001, docs dev server to 3002; stop the existing process or change the port. |
| eslint: command not found | Install frontend dependencies with npm install or npm install --legacy-peer-deps. |
| npm peer dependency conflict | Use npm install --legacy-peer-deps, matching docs/make_screenshots.sh. |
| ND2 upload is rejected | Confirm the file extension is .nd2 and the filename does not rely on path components. |
| No contours or poor contours | Check layer mode, param1, crop size, and preview frames before annotating. |
| OpenCV import/runtime errors | Use the project venv and install backend/requirements.txt; Docker also installs libgl1 and libglib2.0-0. |
| SQLite permission errors | Ensure backend/app/databases/, backend/app/extracted_data/, backend/app/nd2files/, and backend/app/tempdata/ are writable. |
Backend:
- FastAPI, Uvicorn, and Pydantic for the API layer
- SQLAlchemy and SQLite for database access
- NumPy, OpenCV, Pillow, and Matplotlib for image processing and plotting
- FastMCP for repository/context tooling
Frontend:
- React + React Router for the UI
- Vite for dev/build tooling
- Chakra UI and Framer Motion for styling and motion
- Docusaurus docs build under frontend/docs-site
PhenoPixel is released under the MIT License. See LICENSE for details. Third-party dependencies remain under their respective licenses.