An AI-powered interactive web application for automated single-cell RNA-seq cell type annotation. CellMaster-UI combines hypothesis-driven analysis with advanced machine learning agents to provide accurate, iterative cell type identification.
- Interactive Workflow: Upload data, provide hypotheses, and refine annotations through an intuitive UI
- Multiple Annotation Methods: Compare results from CellTypist, GPTCellType, and the custom CellMaster pipeline
- Visual Analytics: Real-time UMAP plots, dot plots, and marker gene visualization
- Iterative Refinement: Human-in-the-loop feedback system for improving annotation accuracy
- Python: 3.8 or higher
- Node.js: 14.x or higher
- npm: 6.x or higher
- R: Required for Cell Ontology (CL) lookups
- OpenAI API Key: Required for AI-powered features
```
git clone <repository-url>
cd CellMaster-UI
```

Edit `config/settings.py` and replace the placeholder with your OpenAI API key:

```python
OPENAI_API_KEY = "your-api-key-here"
```

Install the Python dependencies:

```
pip install -r requirements.txt
cd server
pip install -r requirements.txt
cd ..
```

Install the frontend dependencies:

```
cd ai-scientist-ui
npm install
cd ..
```

The application uses R for ontology lookups. Ensure R is installed and the rols package is available:
```r
# rols is distributed via Bioconductor, not CRAN:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("rols")
```

Open a terminal and run:

```
cd server
python app.py
```

The server will start on http://localhost:5000.

Open a new terminal and run:

```
cd ai-scientist-ui
npm start
```

The UI will open automatically at http://localhost:3000.
- H5AD File (Required): Upload your single-cell RNA-seq data in H5AD format
- Marker Genes CSV (Optional): Upload a CSV file with cluster-specific marker genes
- Original Grouping Column: Specify the column name in your H5AD file that contains cluster assignments (e.g., "leiden", "seurat_clusters")
- CellTypist Model (Optional): Specify a CellTypist model name for comparison (e.g., "Healthy_Adult_Heart.pkl")
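Before uploading, a quick offline sanity check of these inputs can save a failed run. A minimal sketch of such a check (the `validate_upload` function and its error messages are illustrative, not part of CellMaster-UI):

```python
from pathlib import Path

def validate_upload(h5ad_path, markers_path=None, grouping="leiden"):
    """Mirror the form's required/optional fields with basic file-name checks."""
    errors = []
    if Path(h5ad_path).suffix != ".h5ad":
        errors.append(f"expected an .h5ad file, got {h5ad_path}")
    if markers_path is not None and Path(markers_path).suffix != ".csv":
        errors.append(f"marker genes must be a .csv file, got {markers_path}")
    if not grouping:
        errors.append("original grouping column must be non-empty")
    return errors

print(validate_upload("pbmc.h5ad", "markers.csv"))  # [] (no problems found)
```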
Enter your hypothesis about the tissue type or expected cell types. For example:
- "This is liver tissue"
- "PBMC sample containing immune cells"
- "Retinal tissue with photoreceptor cells"
Click "Upload and Hypothesis" to start the annotation pipeline. The system will:
- Load and preprocess your data
- Generate marker gene signatures
- Query AI models for cell type predictions
- Compare with CellTypist annotations
- Display results with confidence scores
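The marker-signature step can be approximated offline: rank each cluster's genes by fold change from a Seurat-style marker table and keep the top few as the cluster's signature. A minimal sketch (`top_markers` is illustrative, not the pipeline's actual function):

```python
import csv
import io
from collections import defaultdict

def top_markers(csv_text, n=2):
    """Keep the n genes with the highest avg_log2FC per cluster."""
    by_cluster = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_cluster[row["cluster"]].append((float(row["avg_log2FC"]), row["gene"]))
    return {cluster: [gene for _, gene in sorted(pairs, reverse=True)[:n]]
            for cluster, pairs in by_cluster.items()}

table = """cluster,gene,p_val,avg_log2FC,pct.1,pct.2
0,CD3D,0.001,2.5,0.9,0.1
0,CD3E,0.002,2.3,0.85,0.15
1,CD79A,0.001,3.1,0.95,0.05"""
print(top_markers(table))  # {'0': ['CD3D', 'CD3E'], '1': ['CD79A']}
```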
The interface displays:
- Analysis Results Panel: Iteration history, cluster annotations, and confidence metrics
- Dot Plot: Marker gene expression across clusters
- UMAP Plot: Spatial visualization of cell populations with annotations
- Zoom in and out to change the granularity of the clustering
- Request re-annotation of uncertain clusters
- Provide feedback to refine predictions
Standard AnnData format with:
- `.X`: Expression matrix (cells × genes)
- `.obs`: Cell metadata including cluster assignments
- `.var`: Gene metadata
CSV file with columns:
- `cluster`: Cluster identifier
- `gene`: Gene symbol
- `avg_log2FC` (or similar): Fold change metric
- Additional metrics as available
Example:

```csv
cluster,gene,p_val,avg_log2FC,pct.1,pct.2
0,CD3D,0.001,2.5,0.9,0.1
0,CD3E,0.002,2.3,0.85,0.15
1,CD79A,0.001,3.1,0.95,0.05
```

The application generates outputs in the following directories:
- `annotation_dict_*.txt`: Cluster-to-cell-type mappings for each iteration
- `*_umap_plot.png`: UMAP visualizations with annotations
- `dot_plot_*.png`: Marker gene dot plots
- `uploads/`: Uploaded input files are stored here
Run `generate_score.py` to score the annotations.
Edit the variables at the top of `generate_score.py`:

```python
input_dir = "uploads/"
h5ad_file = "your_file.h5ad"
markers_file = "your_markers.csv"
original_grouping = "leiden"
correct_column = "ground_truth"  # If available for benchmarking
threshold = 0.95                 # Confidence threshold
tissue_name = "your_tissue"
```

The `cell_type_mapping` dictionary in `generate_score.py` can be customized to standardize cell type names across different nomenclatures.
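Since `cell_type_mapping` is a plain dictionary from predicted labels to canonical names, standardization reduces to a lookup with a pass-through fallback. A sketch with illustrative entries (the shipped defaults may differ):

```python
# Illustrative entries only; edit cell_type_mapping in generate_score.py to
# match the nomenclatures you need to reconcile.
cell_type_mapping = {
    "T cells": "T cell",
    "T-cell": "T cell",
    "B cells": "B cell",
    "NK cells": "natural killer cell",
}

def standardize(label, mapping=cell_type_mapping):
    """Return the canonical name, or the cleaned input when no entry exists."""
    return mapping.get(label.strip(), label.strip())

print(standardize("T-cell"))    # T cell
print(standardize("Monocyte"))  # Monocyte (unmapped labels pass through)
```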
R Environment Not Found: ensure `R_HOME` is set correctly. The default on macOS is:

```
export R_HOME=/Library/Frameworks/R.framework/Resources
```

```
CellMaster-UI/
├── ai-scientist-ui/        # React frontend
│   ├── src/
│   │   ├── components/     # UI components
│   │   ├── context/        # React context providers
│   │   └── types/          # TypeScript definitions
├── server/                 # Flask backend
│   ├── app.py              # Main server application
│   ├── uploads/            # User uploaded files
│   └── outputs/            # Generated results
├── agents/                 # AI agent modules
│   ├── hypothesis_agent/
│   ├── experiment_agent/
│   ├── evaluation_agent/
│   └── environment_agent/
├── config/                 # Configuration files
├── utils/                  # Utility functions
└── generate_score.py       # Evaluation script
```