Interactive web application for exploring and analyzing scRNA-seq and spatial transcriptomics data. Load an h5ad, 10x Genomics h5, Seurat .rds file, 10x CellRanger matrix folder, or prefixed 10x file trio from GEO, visualize cells on a scatter plot, run Scanpy analysis pipelines, and explore results, all from your browser.
XCell uses pixi to manage its environment. A single pixi install provisions
the exact Python and Node versions plus every dependency: no manual venv,
no Node-version juggling, no version troubleshooting.
If you've never installed software from GitHub before, follow every step below in order. Anything in a code block is meant to be pasted into a terminal:
- macOS: open the Terminal app (⌘+Space, type "Terminal", press Enter).
- Linux: open your terminal emulator (GNOME Terminal, Konsole, etc.).
- Windows: open PowerShell (Start menu → type "PowerShell" → Enter).
Git is the tool that downloads the source code from GitHub.
- macOS: run git --version. If Git isn't installed, macOS will prompt you to install the Command Line Tools; click Install and wait for it to finish.
- Linux (Debian/Ubuntu): sudo apt-get install git
- Linux (Fedora): sudo dnf install git
- Windows: download and run the installer from https://git-scm.com/download/win, accepting the defaults.
Verify with:
git --version
Prefer not to use Git? You can also click the green Code button at https://github.com/cahanlab/xcell, choose Download ZIP, then unzip it anywhere on your machine. Skip ahead to step 3.
Pick a folder where you'd like XCell to live (your home directory is fine) and clone the repository into it:
cd ~ # or wherever you want the xcell/ folder created
git clone https://github.com/cahanlab/xcell.git
cd xcell
This creates an xcell/ directory containing the source code. The final
cd xcell puts your terminal inside that directory; every command from here
on must be run from there.
pixi is what installs Python, Node, and every project dependency in one shot.
curl -fsSL https://pixi.sh/install.sh | bash # macOS / Linux
# Windows (PowerShell): iwr -useb https://pixi.sh/install.ps1 | iex
pixi is a single self-contained binary. It does not require, or conflict with,
an existing conda installation. Close and reopen your terminal after the
install so pixi is on PATH, then cd xcell again. Verify with:
pixi --version
From inside the xcell/ directory:
pixi install # creates ./.pixi/ with Python, Node, and all dependencies
This reads pixi.lock, so every platform gets identical, reproducible versions.
The first run downloads several hundred MB and can take a few minutes, which is
normal. You only do this once (or after pulling updates).
XCell runs as two processes: a Python backend and a JavaScript frontend. You'll
need two terminal windows, both cd'd into the xcell/ directory.
In the first terminal:
pixi run backend # FastAPI on http://localhost:8000
In the second terminal:
pixi run dev # Vite dev server on http://localhost:5173 (installs frontend deps on first run)
Wait until the second terminal prints something like Local: http://localhost:5173/,
then open http://localhost:5173 in your browser. Leave both terminals running
while you use XCell; press Ctrl+C in each one to stop the servers when done.
A bundled toy dataset (toy_spatial.h5ad) loads automatically if no data path is specified. To load your own data, set the XCELL_DATA_PATH environment variable when starting the backend:
XCELL_DATA_PATH=/path/to/your/data.h5ad pixi run backend # also supports .h5 and .rds
From inside the xcell/ directory:
git pull # fetch the latest code
pixi install # refresh dependencies if they changed
Then restart the two pixi run commands.
Loading .rds files is optional and needs R with the Seurat and SeuratDisk packages installed separately; SeuratDisk is not available as a conda package.
Not using pixi? XCell still installs the classic way (pip install -e backend in a Python 3.10+ venv, npm install in frontend/ on Node 18+). pixi just removes the version-matching guesswork.
The included test_data/toy_spatial.h5ad dataset is a small spatial transcriptomics dataset for exploring XCell's features. Here's a step-by-step walkthrough:
- The center panel shows cells as points at their embedding coordinates (spatial, UMAP, PCA, …). The tab is labeled Embedding; if multiple embeddings are available, switch via the in-plot Embedding dropdown.
- Pan by clicking and dragging
- Zoom with scroll wheel
- Zoom/pan are preserved across in-place data changes (cell delete, filter, normalize, etc.). The camera only re-centers when you explicitly switch embeddings.
- Open Cell Manager (left panel)
- Select a metadata column to color cells by that annotation
- Click the Select button in the toolbar (use the dropdown arrow to choose between Lasso and Polygon tools)
- Lasso: click and drag to draw a freehand selection
- Polygon: click to add vertices, double-click to close and select cells inside
- Hold Shift while selecting to add to the existing selection
- Checkboxes in the Cell Manager also select/deselect cells by category
- Rename a category label by double-clicking the label in the expanded category list. Press Enter to commit (or Escape to cancel). Works on Leiden clusters, Contourize results, user annotations, and any other categorical metadata.
- Merge two or more labels by clicking the ⋯ menu in a column header and choosing Merge labels…. Pick the labels to merge, type a new name (or reuse an existing one to fold them in), then click Merge.
- Selected cells can be masked or deleted
The Adjust toolbar dropdown has three sections:
- Rotate: enter Rotate mode, then drag inside the plot to rotate around the data centroid. A live angle badge and a faint orange ring at the pivot show what's happening. Hold Shift to snap to 15° increments. The bottom-of-viewport toolbar gives ±90° quick buttons and a precise degree input (Enter to apply).
- Quilt: lasso a cell subset, then drag to translate it (or Shift+drag to rotate it). This is meant for stitching together adjacent tissue sections. Arrow keys nudge the selection (Shift+arrow for a 10× larger step). Press Ctrl/Cmd+Z (or click "Undo") to revert the last quilt transform.
- Flip: one-shot actions. Flip Horizontal mirrors the embedding left to right (about the y-axis), Flip Vertical mirrors top to bottom (about the x-axis). If you're in Quilt mode with cells selected, the flip applies only to those cells.
All adjustments persist on the backend and are saved on h5ad export.
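As a rough illustration of the geometry behind Rotate and Flip, the sketch below rotates 2D points about their centroid, snaps an angle to 15° steps, and mirrors points horizontally. It is not XCell's implementation: the function names are hypothetical, the counter-clockwise sign convention is an assumption, and mirroring about the vertical line through the centroid (rather than through x = 0) is also an assumption.

```python
import math


def snap_angle(angle_deg, step_deg=15.0):
    """Round an angle to the nearest multiple of step_deg (like holding Shift)."""
    return round(angle_deg / step_deg) * step_deg


def centroid(points):
    """Mean x and mean y of a list of (x, y) tuples."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def rotate_about_centroid(points, angle_deg):
    """Rotate points counter-clockwise (assumed convention) about their centroid."""
    cx, cy = centroid(points)
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return rotated


def flip_horizontal(points):
    """Mirror points left to right about the vertical line through the centroid
    (assumption: the real tool may mirror about a different vertical axis)."""
    cx, _ = centroid(points)
    return [(2 * cx - x, y) for x, y in points]


if __name__ == "__main__":
    square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    print(rotate_about_centroid(square, snap_angle(23.0)))
    print(flip_horizontal(square))
```

Rotating about the centroid keeps the data in roughly the same place on screen, which is consistent with zoom and pan being preserved across in-place changes.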
- Open the Scanpy modal (top toolbar)
- Go to Preprocessing and run in order:
- Normalize Total: normalize counts per cell
- Log1p: log-transform the data
- Highly Variable Genes: identify informative genes
- In the Scanpy modal, go to Cell Analysis and run in order:
- PCA: reduce dimensionality
- PCA Loadings (optional): scan the top-loading genes on each side of every PC (hover a gene to see its exact loading). If you spot PCs dominated by technical signal (cell cycle, mitochondrial genes, etc.), check them and click Create PC subset to persist a derived embedding (e.g. X_pca_noPC2_5).
- Neighbors: build cell neighborhood graph (requires PCA). If you created derived subsets in step 2, pick one from the PC source dropdown; UMAP and Leiden inherit the choice automatically through the neighbors graph.
- UMAP: compute 2D embedding (requires Neighbors)
- Leiden: cluster cells (requires Neighbors)
Re-running PCA clears all derived PC subsets (with a toast) since their column indices refer to the previous eigenvectors.
- In Cell Manager, select the leiden column to color by cluster
- Switch the embedding to X_umap to see the UMAP layout
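For readers curious about what the first two preprocessing steps compute, here is a small pure-Python sketch of per-cell total normalization followed by a log1p transform. XCell runs these steps through Scanpy (the corresponding Scanpy functions are sc.pp.normalize_total and sc.pp.log1p); the helper names and the target_sum value below are illustrative assumptions, not XCell's defaults.

```python
import math


def normalize_total(counts, target_sum=1e4):
    """Scale each cell (row) so that its counts sum to target_sum.

    target_sum is an illustrative value; check the Scanpy modal for the
    value XCell actually uses.
    """
    normalized = []
    for row in counts:
        total = sum(row)
        # All-zero cells are left as zeros to avoid dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([value * scale for value in row])
    return normalized


def log1p_transform(matrix):
    """Apply the natural log of (1 + x) to every entry."""
    return [[math.log1p(value) for value in row] for row in matrix]


if __name__ == "__main__":
    # Three cells (rows) by two genes (columns) of raw counts.
    raw = [[1, 3], [0, 0], [5, 5]]
    scaled = normalize_total(raw, target_sum=100)
    print(scaled)
    print(log1p_transform(scaled))
```

Normalizing first removes differences in sequencing depth between cells; the log transform then compresses the dynamic range so that a few highly expressed genes do not dominate PCA.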
- Open Gene Manager (right panel)
- If the dataset has alternative gene identifier columns (e.g., gene symbols alongside Ensembl IDs), use the Gene IDs dropdown at the top of the panel to switch
- Search or browse genes
- Click a gene to color cells by its expression
To scope the Gene Panel to a relevant gene universe, click the ⋯ button in the Genes panel header and choose Gene mask…. The modal lists all boolean columns in your dataset's .var (for example, highly_variable after running Highly Variable Genes, or spatially_variable after spatial autocorrelation). For each column, choose:
- Off: ignore this column
- Keep: include genes where this column is True
- Hide: exclude genes where this column is True
When you have multiple Keep columns, choose whether to match ANY (union) or ALL (intersection). Hide columns always combine as a union.
The mask applies to the gene browse list, gene search, expanded gene set rows, and gene set score aggregation used for display coloring. It does not apply to analysis operations (Diff Exp, Marker Genes, Gene PCA, etc.); those have their own gene subset dropdowns. The mask is per-dataset and session-only; reloading the page clears it.
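The Keep/Hide rules can be read as a small piece of boolean logic. The sketch below is one way to express it, not XCell's code: apply_gene_mask and its arguments are hypothetical names, and treating "no Keep columns" as "every gene passes the Keep stage" is an assumption.

```python
def apply_gene_mask(var_columns, modes, keep_match="any"):
    """Return one boolean per gene: True means the gene stays visible.

    var_columns: dict mapping a boolean .var column name to a list of bools
    modes:       dict mapping column name to "off", "keep", or "hide"
    keep_match:  "any" (union of Keep columns) or "all" (intersection)
    """
    n_genes = len(next(iter(var_columns.values())))
    keep_cols = [var_columns[name] for name, mode in modes.items() if mode == "keep"]
    hide_cols = [var_columns[name] for name, mode in modes.items() if mode == "hide"]
    combine = any if keep_match == "any" else all

    visible = []
    for i in range(n_genes):
        # Assumption: with no Keep columns, nothing is filtered at this stage.
        kept = combine(col[i] for col in keep_cols) if keep_cols else True
        # Hide columns always combine as a union.
        hidden = any(col[i] for col in hide_cols)
        visible.append(kept and not hidden)
    return visible


if __name__ == "__main__":
    var = {
        "highly_variable": [True, True, False, False],
        "spatially_variable": [True, False, True, False],
        "mito": [False, True, False, False],  # hypothetical column
    }
    modes = {"highly_variable": "keep", "spatially_variable": "keep", "mito": "hide"}
    print(apply_gene_mask(var, modes, keep_match="any"))
    print(apply_gene_mask(var, modes, keep_match="all"))
```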
- Create gene sets manually in Gene Manager
- Import gene lists from files
The Manual category at the top of the Gene Panel is the home for gene sets
you create by hand. Click + 📁 to create a named folder (e.g. "Fig 3 markers").
Inside a folder, click + to add a new empty set, or drag an existing
top-level set onto the folder row to move it in. Drag a set back onto the thin
strip above the first folder to move it out. Drag sets within the same container
to reorder them.
Each gene set and folder row has a ⋯ button with secondary actions.
On a gene set row, that's where you find Pin and Cluster genes.
On a manual folder row, that's where you find Pin and Export (JSON/GMT/CSV).
Use the Pin/Unpin option in the ⋯ menu on any set or folder to float it to
the top of its container. Pinning works in every category, including
auto-generated ones, and survives moving a set between folders.
The Export ▸ option in the ⋯ menu on any manual folder lets you export just
that folder's gene sets to JSON, GMT, or CSV. Filename defaults to the sanitized
folder name. JSON round-trips via the existing Import modal.
Use the 👁 button on a category header to hide a whole category from view
(useful when an analysis has filled Gene Clusters or Differential Expression
with results you're done with). An "N hidden ▸" footer appears at the bottom of
the Gene Panel; click it and then Unhide to bring a category back.
Tip: double-click any gene set name or manual folder name to rename it inline.
Any gene set with at least 4 genes can be sub-clustered by expression
pattern. Click the ⋯ button on a gene set row and choose Cluster genes….
Pick a method (Hierarchical or K-means), a number of clusters K (default 3),
and a cell context ("All cells", "Current selection" if you've lasso-picked
some cells, or "Annotation category" to restrict to specific categorical
values in a .obs column). Clicking Run creates a new folder in
Gene Clusters named after the source set, containing one gene set per
cluster. Re-running with different K or a different cell context appends
another folder so you can compare runs side by side.
You can select cells based on a gene's expression or a gene set score without needing to eyeball the scatter plot:
- In the Gene Panel, click the ⋯ menu on any gene row or gene set row and choose Select cells….
- The modal opens and the scatter plot switches to expression coloring for that source. An interactive histogram of the values is shown.
- Pick a threshold mode (Above, Below, or Between) and drag the red cutoff line(s). The match counter updates live.
- Choose an action:
- Update selection replaces, adds to, or intersects with your current lasso selection.
- Label cells creates a new annotation column with high/low labels for the cells in the chosen context (current selection or all cells). On success, click Open Diff Exp ▸ to immediately run differential expression between the two groups.
Typical workflow for "find DEGs by expression state in a region": lasso a region → ⋯ → Select cells… on a gene → drag the threshold → Label cells → Open Diff Exp.
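The threshold modes and the high/low labeling can be sketched in a few lines. This is an illustration only: the function names are hypothetical, whether the bounds are inclusive is an assumption, and so is the choice that matched cells receive the "high" label.

```python
def select_by_threshold(values, mode, cutoff, upper=None):
    """Return indices of cells whose value passes the threshold.

    mode is "above", "below", or "between"; "between" also needs upper.
    Strict comparison for above/below and inclusive bounds for between are
    assumptions made for this sketch.
    """
    if mode == "above":
        return [i for i, v in enumerate(values) if v > cutoff]
    if mode == "below":
        return [i for i, v in enumerate(values) if v < cutoff]
    if mode == "between":
        if upper is None:
            raise ValueError("mode 'between' requires an upper cutoff")
        return [i for i, v in enumerate(values) if cutoff <= v <= upper]
    raise ValueError(f"unknown mode: {mode}")


def label_cells(n_cells, matched, context=None):
    """Build a high/low annotation column.

    context is the list of cell indices to label (for example the current
    lasso selection); None means all cells. Cells outside the context get None.
    """
    in_context = set(range(n_cells)) if context is None else set(context)
    matched = set(matched)
    labels = []
    for i in range(n_cells):
        if i not in in_context:
            labels.append(None)
        else:
            labels.append("high" if i in matched else "low")
    return labels


if __name__ == "__main__":
    expression = [0.0, 1.2, 3.4, 0.5, 2.8]
    hits = select_by_threshold(expression, "above", 1.0)
    print(hits)
    print(label_cells(len(expression), hits, context=[0, 1, 2, 3]))
```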
- Open the Analyze modal (top toolbar) → Cell Analysis → Compare Cells
- Select an .obs column (e.g., leiden) from the dropdown
- Check 2 or more groups to compare:
- 2 checked → pairwise differential expression
- 3+ checked → one-vs-rest marker gene analysis
- Set Top N genes and click Run
- You can also use lasso selection: select cells → Set as Group 1 / Set as Group 2 → click Compare in the comparison bar
- Draw lines on the scatter plot
- Click the gear icon on a shape in the Shapes panel to open Line Tools
- Under Gene Association, configure:
- Test against: position along line or distance from line
- Gene subset: filter to highly variable genes or other boolean columns
- Spline knots: number of interior knots for the B-spline model (default 5; higher = more flexible fit)
- FDR: significance threshold (default 0.05)
- Max genes/direction (or /module when clustering is on): cap on genes returned
- Cluster genes into modules (default off): when checked, significant genes are grouped by expression profile shape (increasing, decreasing, peak, trough, complex); when unchecked, only positive/negative lists are returned
- Click Find Associated Genes to run the analysis
- In the results modal, use the Filters bar to refine results interactively: adjust min R², min amplitude, max FDR, or toggle pattern types (increasing, decreasing, peak, trough, complex)
- Click Add to Gene Sets in the results modal to save the genes. Each run creates its own folder in the Line Association category of the Gene Panel (one set per module if clustering is on, or a single combined Associated genes set if clustering is off)
- Click Download CSV in the results modal to export stats (gene, f_stat, pval, fdr, r_squared, amplitude, direction) for every gene tested, giving a ranked list suitable for GSEA or other external analyses
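The two "Test against" options correspond to two per-cell covariates: where a cell falls along the line, and how far it sits from the line. A minimal sketch of that geometry for a single straight segment is shown below. The function name is hypothetical, drawn lines may have more than two vertices, and clamping cells that project beyond the endpoints is an assumption.

```python
import math


def project_onto_segment(point, start, end):
    """Project a 2D point onto the segment from start to end.

    Returns (position, distance):
      position is the fraction along the segment, clamped to [0, 1]
      distance is the Euclidean distance from the point to its projection
    """
    px, py = point
    ax, ay = start
    bx, by = end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        raise ValueError("start and end must be different points")
    # Scalar projection of (point - start) onto the segment direction.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # assumption: clamp to the segment ends
    closest_x, closest_y = ax + t * dx, ay + t * dy
    return t, math.hypot(px - closest_x, py - closest_y)


if __name__ == "__main__":
    line_start, line_end = (0.0, 0.0), (10.0, 0.0)
    for cell in [(5.0, 3.0), (-2.0, 0.0), (12.0, 4.0)]:
        print(cell, project_onto_segment(cell, line_start, line_end))
```

Either covariate can then serve as the predictor in the spline model, with one fit per gene.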
- Draw a line on each tissue section representing the same biological axis
- For each line, select cells (via lasso or clicking a category value in the Cells panel) and click + to associate them with the line
- Check the lines to include using the checkboxes that appear on lines with projected cells
- Click Find Associated Genes in the action bar
- In the multi-line modal, toggle direction per line if needed (arrow button) and set analysis parameters
- Results pool cells across all lines for a single, higher-powered analysis
- After computing both Neighbors (Cell Analysis) and Spatial Neighbors (Spatial Analysis), open Analyze → Cell Analysis → Combine Neighbors
- Select two or more graphs and set their weights (default: equal weights; weights are normalized to sum to 1)
- Click Combine graphs; the combined graph becomes the default connectivities slot
- Run Leiden (or UMAP) afterward and clustering/embedding will reflect both graphs, encouraging spatially neighboring cells to cluster together when the spatial graph is weighted in
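To make the weighting concrete, the sketch below normalizes user weights to sum to 1 and takes a weighted sum of edge connectivities. The steps above only state that weights are normalized; combining by weighted sum is an assumption, and the dictionary-of-edges representation and function name are hypothetical stand-ins for the sparse matrices a real backend would use.

```python
def combine_graphs(graphs, weights=None):
    """Combine neighbor graphs into one weighted graph.

    graphs:  list of dicts mapping an edge (i, j) to a connectivity value
    weights: one non-negative number per graph; defaults to equal weights
    """
    if weights is None:
        weights = [1.0] * len(graphs)
    if len(weights) != len(graphs):
        raise ValueError("need exactly one weight per graph")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    normalized = [w / total for w in weights]  # weights now sum to 1

    combined = {}
    for graph, weight in zip(graphs, normalized):
        for edge, value in graph.items():
            combined[edge] = combined.get(edge, 0.0) + weight * value
    return combined


if __name__ == "__main__":
    expression_graph = {(0, 1): 1.0, (1, 2): 0.5}
    spatial_graph = {(0, 1): 0.5, (0, 2): 1.0}
    print(combine_graphs([expression_graph, spatial_graph], weights=[3, 1]))
```

An edge present in only one graph still appears in the result, scaled down by that graph's weight, which is how a spatial graph can pull physically adjacent cells into the same cluster.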
- In the Scanpy modal, go to Gene Analysis:
- Build Gene Graph: compute gene-gene similarity
- Cluster Genes: group genes by expression pattern
- Select genes in the Gene Panel (click individual genes or use a gene set)
- Open the Scanpy modal, go to Spatial Analysis > Contourize
- Adjust smoothing sigma, contour levels, and grid resolution as needed
- Click Run; a new categorical column appears in the Cell Panel
- Color cells by the contour column to visualize spatial expression zones
To compare the same tissue across timepoints (or any cross-sample analysis), you can load 2+ spatial-transcriptomics h5ads into one dataset:
- Click File → Combine spatial sections… in the toolbar
- In the load modal, switch the mode toggle to Combine sections (already set when you arrive via the menu)
- Click .h5ad files in the browser to add them to the list; each file gets an editable label (defaults to the filename stem)
- Adjust the gap (% of mean section width) and the slot to load into
- Click Combine N sections. Sections are placed left-to-right along the spatial x-axis with the configured gap; a new sample categorical .obs column tags each cell with its source file label
- The combined dataset behaves like any other: color by sample to see the layout, run Compare Cells across timepoints, etc.
Notes:
- Genes = intersection of the input files' var indices. Use Gene IDs swap in the Gene Panel beforehand if your files use different identifier columns.
- v1 supports .h5ad only. For .rds / 10x files, load them once via single-file Load and export as h5ad first.
- Per-file UMAPs/PCAs are dropped; re-run PCA/UMAP via the Scanpy modal on the combined data.
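The left-to-right placement described above can be sketched as a calculation of one x-offset per section, with the gap expressed as a percentage of the mean section width. The function name and the default gap value are hypothetical, and starting the first section at x = 0 is an assumption.

```python
def section_offsets(section_x_ranges, gap_percent=10.0):
    """Compute the x shift for each section so they sit side by side.

    section_x_ranges: list of (min_x, max_x) spatial extents, one per file
    gap_percent:      gap between sections as a percent of mean section width
                      (the default here is illustrative only)
    """
    widths = [hi - lo for lo, hi in section_x_ranges]
    gap = (gap_percent / 100.0) * (sum(widths) / len(widths))

    offsets = []
    cursor = 0.0  # x position where the next section's left edge should land
    for (lo, _hi), width in zip(section_x_ranges, widths):
        offsets.append(cursor - lo)
        cursor += width + gap
    return offsets


if __name__ == "__main__":
    # Two sections whose original coordinates overlap along x.
    ranges = [(0.0, 100.0), (50.0, 250.0)]
    print(section_offsets(ranges, gap_percent=10.0))
```

Adding each offset to the x-coordinates of the matching section keeps every section's internal geometry intact while preventing overlap in the combined view.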
- Click Load in the toolbar. The modal shows a sidebar with quick-access locations (Home, Desktop, Documents, Downloads) and recently loaded files, plus breadcrumb path navigation for clicking any ancestor directory
- Choose Secondary from the "Load into" dropdown
- Browse or enter the path to a second h5ad, h5, rds file, 10x matrix folder, or prefixed 10x file trio and click Load
- A dataset switcher dropdown appears in the header; switch between Primary and Secondary to compare datasets
- Click the Split button to view both datasets side by side
- Click on either plot to make it the active dataset; the Cell and Gene panels update accordingly
- Each plot has its own embedding selector, legend, and independent pan/zoom
- Click Export in the toolbar to download annotations and results
XCell ships with hardcoded defaults for every form in the Scanpy modal, the Line Association dialog, and the Display Settings panel (e.g. filter_cells → min genes = 25, point size = 3). To change these without touching code, drop a YAML (or JSON) file at ~/.xcell/config.yaml, or set XCELL_CONFIG_PATH to point somewhere else. A sample is included at docs/config.example.yaml.
The file is a nested mapping that mirrors the form namespace. Only include keys you want to override; everything else falls back to the built-in default:
scanpy:
  filter_cells:
    min_genes: 15 # was 25
  neighbors:
    n_neighbors: 20 # was 15
line_association:
  fdr_threshold: 0.1 # was 0.05
  cluster_genes: true # was false
display:
  point_size: 4 # was 3
  point_opacity: 0.7 # was 0.85
  background_color: '#000000' # was '#1a1a2e'
  color_scale: magma # was viridis
  clip_percentile: 0.5 # was 1.0
  gene_set_aggregation: median # was mean
A backend restart is required to pick up edits. Verify what was loaded by hitting GET /api/config/defaults; unknown keys are silently ignored. Display defaults are applied to every dataset slot at startup and re-applied on each fresh dataset load. You can still tweak any value in the Display Settings panel for the current session.
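The override behavior described here (only listed keys change, unknown keys are ignored) amounts to a recursive merge over the defaults. The sketch below shows one way this could work; it is not XCell's loader. DEFAULTS holds just a few values taken from the "was" comments above, and apply_overrides is a hypothetical name.

```python
import copy

# A small subset of built-in defaults, taken from the "was" comments above.
DEFAULTS = {
    "scanpy": {
        "filter_cells": {"min_genes": 25},
        "neighbors": {"n_neighbors": 15},
    },
    "line_association": {"fdr_threshold": 0.05, "cluster_genes": False},
    "display": {"point_size": 3, "color_scale": "viridis"},
}


def apply_overrides(defaults, overrides):
    """Return a copy of defaults with matching keys replaced by overrides.

    Keys that do not exist in defaults are skipped, mirroring the
    "unknown keys are silently ignored" behavior.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if key not in merged:
            continue  # unknown key: ignore
        if isinstance(merged[key], dict):
            if isinstance(value, dict):
                merged[key] = apply_overrides(merged[key], value)
            # A non-mapping given for a nested section is ignored as malformed.
        else:
            merged[key] = value
    return merged


if __name__ == "__main__":
    user_config = {
        "scanpy": {"filter_cells": {"min_genes": 15}},
        "display": {"point_size": 4, "not_a_real_key": 1},
    }
    print(apply_overrides(DEFAULTS, user_config))
```

Merging into a deep copy leaves the built-in defaults untouched, so a later reload can start again from a clean baseline.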
Most changes you make in a session survive on the backend process: deleted cells, transformed embeddings, computed PCA / neighbors / UMAP / Leiden, drawn lines, and (as of this version) your gene sets (categories, folders, individual sets). If the browser tab accidentally reloads, the gene panel is rehydrated from the server. Restarting the backend still clears everything; persist important sets via the Gene Panel export controls before shutting down.
- Interactive scatter plot: deck.gl-powered visualization with pan, zoom, lasso selection
- Cell Manager: browse/color by metadata, mask/delete cells
- Gene Manager: search genes, create gene sets, import gene lists
- Scanpy integration: run preprocessing, cell analysis (PCA, Neighbors, UMAP, Leiden), gene analysis, spatial analysis (contourize), and differential expression directly in the browser. Long-running operations (gene neighbors, spatial neighbors, spatial autocorrelation, contourize, line gene association) can be cancelled mid-run without corrupting session data.
- Trajectory analysis: draw lines and associate genes with spatial trajectories
- Quilt mode: lasso and rearrange tissue pieces. Drag to translate, shift+drag to rotate, flip to reflect selected cell subsets
- Display settings: adjust point size, opacity, colormaps, bivariate coloring, and an optional coordinate grid behind the plot (with data-coordinate tick labels along the bottom/left axes for visual reference and troubleshooting)
- Highlight overlay: stack one or more colored layers on top of the active coloring without replacing it. Each layer is either a gene-set expression threshold (above / below / between, with a draggable histogram cutoff) or a frozen cell-set mask (current selection or category value). Useful for marking e.g. epithelium in green while keeping bivariate coloring on the rest.
- Figure builder: compose multi-panel publication figures from a cell selection (or the full dataset). Each panel renders the same cells colored independently (single gene, gene set, bivariate two-gene-set, or metadata column), with its own color scale and title. Per-figure point size, opacity, background, and optional N×N grid overlay are shared so panels stay visually consistent. Per-panel "show highlight layers" toggle blends the dataset's current Highlight overlays into the panel. Shared pan/zoom keeps panels aligned. Export to PNG at 1× to 4× DPI from the new Figure tab.
- Multi-dataset support: load two datasets (h5ad, h5, rds, 10x matrix folders, or prefixed 10x file trios from GEO), switch between them, or view side by side in split mode
- Export: download annotations and analysis results
xcell/
├── backend/
│   ├── xcell/
│   │   ├── main.py               # FastAPI app entry point
│   │   ├── adaptor.py            # DataAdaptor class (wraps AnnData)
│   │   ├── diffexp.py            # Differential expression
│   │   ├── data/
│   │   │   └── toy_spatial.h5ad  # Bundled toy dataset
│   │   └── api/
│   │       └── routes.py         # REST API endpoints
│   └── pyproject.toml            # Python dependencies
├── frontend/
│   ├── src/
│   │   ├── App.tsx               # Main app component
│   │   ├── store.ts              # Zustand state management
│   │   ├── main.tsx              # Entry point
│   │   ├── components/
│   │   │   ├── ScatterPlot.tsx          # deck.gl scatter plot
│   │   │   ├── CellPanel.tsx            # Cell metadata manager
│   │   │   ├── GenePanel.tsx            # Gene browser / gene sets
│   │   │   ├── ScanpyModal.tsx          # Scanpy analysis pipeline UI
│   │   │   ├── DiffExpModal.tsx         # Differential expression
│   │   │   ├── LineAssociationModal.tsx # Trajectory analysis
│   │   │   ├── DisplaySettings.tsx      # Visualization settings
│   │   │   ├── ShapeManager.tsx         # Shape/selection tools
│   │   │   └── ImportModal.tsx          # Gene list import
│   │   └── hooks/
│   │       └── useData.ts        # Data fetching hooks
│   ├── package.json              # Node dependencies
│   └── vite.config.ts            # Vite configuration
└── README.md
test_data/
├── toy_spatial.h5ad              # Toy dataset for testing
└── generate_toy.py               # Script to regenerate toy data
- Backend: FastAPI + AnnData + Scanpy, serving data and running analysis via REST API
- Frontend: React + TypeScript + Vite + deck.gl + Zustand for state management
- Data flow: h5ad file → DataAdaptor → REST API → React hooks → deck.gl visualization
- API docs: Available at http://localhost:8000/docs when the backend is running
