Full-featured Python implementation of R pathview + SBGNview with support for KEGG, Reactome, MetaCyc, and more.
- KEGG Pathways: Download and visualize any KEGG pathway
- SBGN Pathways: Support for Reactome, MetaCyc, PANTHER, SMPDB
- Multiple Formats: PNG (native overlay), SVG (vector), PDF (graph layout)
- Gene & Metabolite Data: Overlay expression and abundance data
- Multi-Condition: Visualize multiple experiments side-by-side
- ID Conversion: Automatic mapping: Entrez ↔ Symbol ↔ UniProt ↔ Ensembl
- Highlighting: Post-hoc emphasis of specific nodes/edges/paths
- Spline Curves: Smooth Bezier edge routing
- Custom Colors: Configurable diverging color scales
- Full SBGN-ML support: Parse and render SBGN Process Description files
- Database integration: Direct download from Reactome, MetaCyc
- SVG vector output: Scalable graphics for web and publication
- Highlighting system: ggplot2-style composable modifications
- Spline rendering: Cubic Bezier and Catmull-Rom curves
pip install pathview-plus

# Clone repository
git clone https://github.com/raw-lab/pathview-plus
cd pathview-plus
# Install dependencies
pip install -r requirements.txt
pip install .
# Or install specific packages
pip install polars numpy matplotlib seaborn Pillow networkx requests

Dependencies:
- Python ≥ 3.10
- polars ≥ 0.19.0
- matplotlib ≥ 3.7.0
- seaborn ≥ 0.12.0
- numpy ≥ 1.24.0
- Pillow ≥ 10.0.0
- networkx ≥ 3.1
- requests ≥ 2.31.0
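To confirm that an environment meets the minimum versions listed above, a small standard-library check can help. This is a minimal sketch: `parse_version` and `check_requirements` are illustrative helpers, not part of pathview-plus, and the numeric parsing is a simplifying assumption that ignores pre-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "polars": "0.19.0",
    "matplotlib": "3.7.0",
    "seaborn": "0.12.0",
    "numpy": "1.24.0",
    "Pillow": "10.0.0",
    "networkx": "3.1",
    "requests": "2.31.0",
}


def parse_version(text: str) -> tuple[int, ...]:
    """Keep the leading digits of each dot-separated piece, e.g. '3.7.0' -> (3, 7, 0)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first piece with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(minimums: dict[str, str]) -> dict[str, str]:
    """Return 'ok', 'too old', or 'missing' for each package name."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = "ok" if ok else "too old"
    return report


for name, status in check_requirements(MINIMUMS).items():
    print(f"{name}: {status}")
```

Any package reported as `missing` or `too old` can then be installed or upgraded with the `pip install` commands above.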
import polars as pl
from pathview import pathview

# Load your data
gene_data = pl.read_csv("gene_expr.tsv", separator="\t")

# Visualize on KEGG pathway
result = pathview(
    pathway_id="04110",  # Cell cycle
    gene_data=gene_data,
    species="hsa",
    output_format="png",
)

from pathview import download_reactome, parse_sbgn, sbgn_to_df, pathview
# Download Reactome pathway
path = download_reactome("R-HSA-109582")  # Hemostasis
# Parse and visualize
pathway = parse_sbgn(path)
node_df = sbgn_to_df(pathway)
# Overlay data
result = pathview(
    pathway_id="R-HSA-109582",
    gene_data=gene_data,
    output_format="svg",  # Vector graphics
)

# Three experimental conditions
gene_data = pl.DataFrame({
    "entrez": ["1956", "2099", "5594", "207"],
    "Control": [0.5, -0.3, 1.2, -0.8],
    "Treatment_A": [2.1, -1.5, 0.4, 1.3],
    "Treatment_B": [1.8, -0.9, 2.3, 0.7],
})
result = pathview(
    pathway_id="04010",  # MAPK signaling
    gene_data=gene_data,
    species="hsa",
    limit={"gene": 2.5, "cpd": 1.5},
)
# Each node shows 3 color bands (one per condition)

result = pathview(
    pathway_id="04151",
    gene_data=gene_data,
    species="hsa",
    low={"gene": "#2166AC", "cpd": "#4575B4"},   # Blue
    mid={"gene": "#F7F7F7", "cpd": "#F7F7F7"},   # White
    high={"gene": "#D6604D", "cpd": "#B2182B"},  # Red
)

gene_data = pl.DataFrame({
    "symbol": ["TP53", "EGFR", "KRAS", "PIK3CA", "AKT1"],
    "log2fc": [-1.8, 2.4, 1.1, 1.5, 0.9],
})
result = pathview(
    pathway_id="04151",
    gene_data=gene_data,
    species="hsa",
    gene_idtype="SYMBOL",  # Automatic conversion to Entrez
)

from pathview import sim_mol_data
gene_data = sim_mol_data(mol_type="gene", species="hsa", n_mol=80)
cpd_data = sim_mol_data(mol_type="cpd", n_mol=30)
result = pathview(
    pathway_id="00010",  # Glycolysis
    gene_data=gene_data,
    cpd_data=cpd_data,
    species="hsa",
    low={"gene": "green", "cpd": "blue"},
    high={"gene": "red", "cpd": "yellow"},
)

result = pathview(
    pathway_id="04110",
    gene_data=gene_data,
    species="hsa",
    output_format="svg",  # Scalable vector graphics
)
# Output: hsa04110.pathview.svg
# - Scalable without quality loss
# - Smaller file size
# - Editable in Inkscape/Illustrator

result = pathview(
    pathway_id="04010",
    gene_data=gene_data,
    species="hsa",
    kegg_native=False,  # Use NetworkX layout
    output_format="pdf",
)
# Output: hsa04010.pathview.pdf

from pathview import highlight_nodes, highlight_path
result = pathview("04010", gene_data=gene_data)
# Composable modifications (ggplot2-style)
highlighted = (
    result
    + highlight_nodes(["1956", "2099"], color="red", width=4)
    + highlight_path(["1956", "2099", "5594"], color="orange")
)
highlighted.save("highlighted.png")

from pathview import cubic_bezier, catmull_rom_spline
import matplotlib.pyplot as plt
# Smooth Bezier curve
curve = cubic_bezier((0, 0), (1, 2), (3, 2), (4, 0), n_points=100)
plt.plot(curve[:, 0], curve[:, 1], linewidth=2)
plt.title("Bezier Curve Edge Routing")
plt.savefig("bezier_example.png")

pathways = ["04110", "04010", "04151", "00010"]
for pw_id in pathways:
    try:
        result = pathview(
            pathway_id=pw_id,
            gene_data=gene_data,
            species="hsa",
            out_suffix=f"batch_{pw_id}",
        )
        print(f"✓ Completed {pw_id}")
    except Exception as e:
        print(f"✗ Failed {pw_id}: {e}")

# Basic usage
python pathview_cli.py --pathway-id 04110 --gene-data expr.tsv
# Specify species and ID type
python pathview_cli.py \
--pathway-id 04110 \
--species hsa \
--gene-data expr.tsv \
--gene-idtype SYMBOL
# Custom colors
python pathview_cli.py \
--pathway-id 04010 \
--gene-data expr.tsv \
--low-gene '#2166AC' \
--high-gene '#D6604D' \
--output-format svg
# Simulate data (for testing)
python pathview_cli.py \
--pathway-id 04110 \
--simulate \
--n-sim 200
# Display KEGG legend
python pathview_cli.py --legend

CLI Arguments:
Pathway:
  --pathway-id ID          KEGG pathway number (e.g., '04110')
Input data:
  --gene-data TSV          Gene expression file (TSV)
  --cpd-data TSV           Compound abundance file (TSV)
  --gene-idtype TYPE       Gene ID type: ENTREZ, SYMBOL, UNIPROT, ENSEMBL
  --cpd-idtype TYPE        Compound ID type: KEGG, PUBCHEM, CHEBI
Species & paths:
  --species CODE           KEGG species code (default: hsa)
  --kegg-dir DIR           Directory for files (default: .)
  --out-suffix SUFFIX      Output filename suffix (default: pathview)
Rendering:
  --kegg-native            Use KEGG PNG background (default: True)
  --output-format FORMAT   Output format: png, pdf, svg (default: png)
  --map-symbol             Replace Entrez with symbols (default: True)
  --node-sum METHOD        Aggregation: sum, mean, median, max
  --no-signature           Suppress watermark
  --no-col-key             Suppress color legend
Color scale:
  --limit-gene FLOAT       Color scale limit (default: 1.0)
  --bins-gene INT          Color bins (default: 10)
  --low-gene COLOR         Low-end color (default: green)
  --mid-gene COLOR         Mid-point color (default: gray)
  --high-gene COLOR        High-end color (default: red)
  --low-cpd COLOR          Low compound color (default: blue)
  --high-cpd COLOR         High compound color (default: yellow)
Utilities:
  --legend                 Display KEGG legend and exit
  --simulate               Generate simulated data
  --n-sim INT              Number of simulated molecules (default: 200)
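The `--node-sum` option (and the `node_sum` argument of `pathview()`) selects how values are combined when several molecules map to one pathway node. The sketch below illustrates the four documented methods on plain Python lists. `summarize_node` is a hypothetical helper, not the library's `mol_sum`, and the assumption that aggregation runs per node over all mapped IDs follows the behavior of the original R pathview.

```python
from statistics import mean, median

# The four aggregation methods listed for --node-sum.
METHODS = {"sum": sum, "mean": mean, "median": median, "max": max}


def summarize_node(values: list[float], method: str = "sum") -> float:
    """Collapse the values of all molecules mapped to one node into one number."""
    if method not in METHODS:
        raise ValueError(f"unknown method {method!r}; choose from {sorted(METHODS)}")
    return float(METHODS[method](values))


# Hypothetical node with three mapped genes.
node_values = [1.0, 2.0, 6.0]
for name in METHODS:
    print(name, summarize_node(node_values, name))
```

With `sum`, nodes that collect many genes tend to saturate the color scale sooner than with `mean` or `median`, which may matter when choosing `--limit-gene`.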
First column = gene IDs, remaining columns = numeric expression values.
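The layout described above can be read with the standard library alone. This sketch assumes a tab-separated file with a header row; `read_expression_tsv` is an illustrative helper and not part of pathview-plus, which takes a polars DataFrame as input.

```python
import csv
import io


def read_expression_tsv(handle) -> tuple[list[str], dict[str, list[float]]]:
    """Return (condition names, {gene ID: values}) from a TSV file object."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    conditions = header[1:]  # every column after the ID column
    values = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        values[row[0]] = [float(cell) for cell in row[1:]]
    return conditions, values


# In-memory sample so the example runs without a file on disk.
sample = "entrez\tControl\tTreatment_A\n1956\t2.31\t0.45\n2099\t-1.14\t-0.88\n"
conditions, values = read_expression_tsv(io.StringIO(sample))
print(conditions)      # ['Control', 'Treatment_A']
print(values["1956"])  # [2.31, 0.45]
```

IDs are kept as strings on purpose, so that numeric-looking Entrez IDs are not converted to integers by accident.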
entrez  Control  Treatment_A  Treatment_B
1956    2.31     0.45         1.82
2099    -1.14    -0.88        0.33
5594    0.72     1.33         -0.51
207     -0.88    1.21         0.94

gene_symbol  log2fc  p_value
TP53         -1.8    0.001
EGFR         2.4     0.0001
KRAS         1.1     0.01

kegg    abundance
C00031  1.45
C00118  -0.83
C00022  2.11

pathview(
    pathway_id="04110",
    gene_data=gene_data,
    limit={"gene": 2.0, "cpd": 1.5},  # ±2.0 for genes, ±1.5 for compounds
    bins={"gene": 20, "cpd": 10},     # Color resolution
    low={"gene": "blue", "cpd": "green"},
    mid={"gene": "white", "cpd": "gray"},
    high={"gene": "red", "cpd": "yellow"},
)

The scale maps:
- low value → low color (default: green/blue)
- 0 → mid color (default: gray)
- high value → high color (default: red/yellow)
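To make the mapping concrete, the sketch below places a value on a three-color diverging scale using `limit` and `bins`. It is an illustration under stated assumptions, not the library's `color_mapping` module: values are clipped to ±limit, split into equal-width bins, and colors are interpolated linearly in RGB. The name `value_to_color` and the hex colors standing in for green, gray, and red are hypothetical.

```python
def hex_to_rgb(color: str) -> tuple[int, int, int]:
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return (int(color[0:2], 16), int(color[2:4], 16), int(color[4:6], 16))


def blend(start, end, t):
    """Linear interpolation between two RGB tuples, with t in [0, 1]."""
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def value_to_color(value, limit=1.0, bins=10,
                   low="#00FF00", mid="#BEBEBE", high="#FF0000"):
    """Map a value onto a binned low/mid/high scale spanning [-limit, limit]."""
    if bins < 2:
        raise ValueError("bins must be at least 2")
    clipped = max(-limit, min(limit, value))  # out-of-range values saturate
    width = 2 * limit / bins                  # width of one bin
    index = min(int((clipped + limit) / width), bins - 1)
    t = 2 * index / (bins - 1) - 1            # bin position rescaled to [-1, 1]
    if t < 0:
        rgb = blend(hex_to_rgb(mid), hex_to_rgb(low), -t)
    else:
        rgb = blend(hex_to_rgb(mid), hex_to_rgb(high), t)
    return "#{:02X}{:02X}{:02X}".format(*rgb)


for v in (-5.0, -0.5, 0.0, 0.5, 5.0):
    print(v, value_to_color(v, limit=1.0, bins=11))
```

In this sketch an odd number of bins gives zero its own bin at exactly the mid color; with an even number, values near zero fall into one of the two bins next to it.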
both_dirs={"gene": False, "cpd": False}
# Maps: 0 (mid) → max (high)

| Type | Value | Example |
|---|---|---|
| Entrez | ENTREZ | 1956 |
| Symbol | SYMBOL | EGFR |
| UniProt | UNIPROT | P00533 |
| Ensembl | ENSEMBL | ENSG00000146648 |
| KEGG | KEGG | hsa:1956 |

| Type | Value | Example |
|---|---|---|
| KEGG | KEGG | C00031 |
| PubChem | PUBCHEM | 5793 |
| ChEBI | CHEBI | 4167 |
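When the ID type of a gene column is not known in advance, a rough pattern check can suggest which `gene_idtype` value to pass. The regular expressions below are simplified assumptions derived from the example IDs in the gene table (for instance, only the six-character UniProt accession form is matched), and `guess_gene_idtype` is a hypothetical helper rather than part of the package. Compound IDs are left out because PubChem and ChEBI IDs are both plain integers and cannot be told apart this way.

```python
import re

# Checked in order; the first matching pattern wins.
PATTERNS = [
    ("KEGG", re.compile(r"^[a-z]{3,4}:\d+$")),                 # e.g. hsa:1956
    ("ENSEMBL", re.compile(r"^ENS[A-Z]*G\d{11}$")),            # e.g. ENSG00000146648
    ("ENTREZ", re.compile(r"^\d+$")),                          # e.g. 1956
    ("UNIPROT", re.compile(r"^[A-Z][0-9][A-Z0-9]{3}[0-9]$")),  # e.g. P00533
]


def guess_gene_idtype(gene_id: str) -> str:
    """Return a likely ID type label; falls back to 'SYMBOL'."""
    for name, pattern in PATTERNS:
        if pattern.match(gene_id):
            return name
    return "SYMBOL"


for example in ["1956", "EGFR", "P00533", "ENSG00000146648", "hsa:1956"]:
    print(example, "->", guess_gene_idtype(example))
```

Because the fallback is `SYMBOL`, any unrecognized string is reported as a symbol, so the result is best treated as a hint to confirm rather than a validation step.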
KEGG:
- Format: KGML (XML)
- Species: 500+ organisms
- Download: Automatic via KEGG REST API
- Example: pathway_id="hsa04110"

Reactome:
- Format: SBGN-ML
- Species: Human, mouse, rat, and more
- Download: download_reactome("R-HSA-109582")
- Example: Hemostasis, Immune System, Signaling

MetaCyc:
- Format: SBGN-ML
- Coverage: 2,800+ metabolic pathways
- Download: download_metacyc("PWY-7210")
- Example: Pyrimidine biosynthesis

PANTHER:
- Format: SBGN-ML
- Coverage: 177 signaling and metabolic pathways
- Note: Manual download required

SMPDB:
- Format: SBGN-ML
- Coverage: Small molecule pathways
- Note: Manual download from website
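The pathway ID formats used in the examples differ enough that the source database can often be inferred from the ID alone. The following sketch is written only from the example IDs in this README. It is not the implementation of the package's `detect_database`; the name `guess_database`, the returned labels, and the patterns are all assumptions.

```python
import re


def guess_database(pathway_id: str) -> str:
    """Guess the source database from the shape of a pathway ID."""
    # Reactome stable IDs look like R-HSA-109582.
    if re.fullmatch(r"R-[A-Z]{3}-\d+", pathway_id):
        return "reactome"
    # MetaCyc IDs such as PWY-7210 (assumed: 'PWY' prefix or '-PWY' suffix).
    if pathway_id.startswith("PWY") or pathway_id.endswith("-PWY"):
        return "metacyc"
    # KEGG: five digits, optionally preceded by a short lowercase prefix.
    if re.fullmatch(r"[a-z]{2,4}\d{5}|\d{5}", pathway_id):
        return "kegg"
    return "unknown"


for pid in ["R-HSA-109582", "PWY-7210", "hsa04110", "04110", "my_pathway"]:
    print(pid, "->", guess_database(pid))
```

PANTHER and SMPDB are not covered here because the README gives no example IDs for them.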
pathview/
├── __init__.py          # Public API exports
├── constants.py         # Type definitions
├── utils.py             # String/numeric utilities
│
├── id_mapping.py        # Gene/compound ID conversion
├── mol_data.py          # Data aggregation, simulation
│
├── kegg_api.py          # KEGG REST API
├── databases.py         # Reactome, MetaCyc downloaders
│
├── kgml_parser.py       # KEGG KGML (XML) parser
├── sbgn_parser.py       # SBGN-ML (XML) parser
│
├── color_mapping.py     # Colormaps, node coloring
├── node_mapping.py      # Map data onto nodes
│
├── rendering.py         # PNG/PDF renderers
├── svg_rendering.py     # SVG vector renderer
├── highlighting.py      # Post-hoc modifications
├── splines.py           # Bezier curve math
│
└── pathview.py          # Core orchestrator
pathview_cli.py          # Command-line interface
requirements.txt         # Dependencies
README.md                # This file
Module Statistics:
- 15 modules | 3,506 lines of code
- Functional programming style
- Full type hints
- Comprehensive docstrings
pathview(
    pathway_id: str,
    gene_data: Optional[pl.DataFrame] = None,
    cpd_data: Optional[pl.DataFrame] = None,
    species: str = "hsa",
    kegg_dir: Path = ".",
    kegg_native: bool = True,
    output_format: str = "png",  # "png", "pdf", "svg"
    gene_idtype: str = "ENTREZ",
    cpd_idtype: str = "KEGG",
    out_suffix: str = "pathview",
    node_sum: str = "sum",
    map_symbol: bool = True,
    map_null: bool = True,
    min_nnodes: int = 3,
    new_signature: bool = True,
    plot_col_key: bool = True,
    # Color scale parameters
    limit: dict = {"gene": 1.0, "cpd": 1.0},
    bins: dict = {"gene": 10, "cpd": 10},
    both_dirs: dict = {"gene": True, "cpd": True},
    low: dict = {"gene": "green", "cpd": "blue"},
    mid: dict = {"gene": "gray", "cpd": "gray"},
    high: dict = {"gene": "red", "cpd": "yellow"},
    na_col: str = "transparent",
) -> dict

sim_mol_data(mol_type="gene", species="hsa", n_mol=100, n_exp=1) → pl.DataFrame
mol_sum(mol_data, id_map, sum_method="sum") → pl.DataFrame

id2eg(ids, category, org="Hs") → pl.DataFrame
eg2id(eg_ids, category="SYMBOL", org="Hs") → pl.DataFrame
cpd_id_map(in_ids, in_type, out_type="KEGG") → pl.DataFrame

# KEGG
parse_kgml(filepath) → KGMLPathway
node_info(pathway) → pl.DataFrame
# SBGN
parse_sbgn(filepath) → SBGNPathway
sbgn_to_df(pathway) → pl.DataFrame

download_kegg(pathway_id, species="hsa", kegg_dir=".") → dict
download_reactome(pathway_id, output_dir=".") → Path
download_metacyc(pathway_id, output_dir=".") → Path
list_reactome_pathways(species="Homo sapiens") → list[dict]
detect_database(pathway_id) → str

# API design (full implementation in progress)
result = pathview(...)
highlighted = result + highlight_nodes(["1956", "2099"], color="red")
highlighted.save("output.png")

cubic_bezier(p0, p1, p2, p3, n_points=50) → np.ndarray
quadratic_bezier(p0, p1, p2, n_points=50) → np.ndarray
catmull_rom_spline(points, n_points=50, alpha=0.5) → np.ndarray
route_edge_spline(source, target, obstacles, mode="orthogonal") → np.ndarray
bezier_to_svg_path(curve, close=False) → str

- KEGG pathways: ~2-5 seconds (download + render)
- SBGN pathways: ~3-8 seconds (more complex)
- Multi-condition: Linear scaling with # conditions
- Batch processing: Parallel processing possible
Optimization tips:
- Cache downloaded files (automatic)
- Use output_format="svg" for faster rendering
- Disable color key for batch jobs: plot_col_key=False
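Because each pathway is rendered independently, a batch can be spread over a thread pool. This sketch uses only `concurrent.futures`. `render_one` is a stand-in for a call such as `pathview(pathway_id=..., gene_data=..., plot_col_key=False)`, and whether pathview-plus can safely be called from several threads at once is an assumption to verify before relying on this pattern.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def render_one(pathway_id: str) -> str:
    """Stand-in for a pathview() call; returns a fake output filename."""
    if not pathway_id.isdigit():
        raise ValueError(f"unexpected pathway id: {pathway_id!r}")
    return f"hsa{pathway_id}.pathview.png"


def render_batch(pathway_ids, worker=render_one, max_workers=4):
    """Run worker on each ID in parallel; collect results and errors separately."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, pw_id): pw_id for pw_id in pathway_ids}
        for future in as_completed(futures):
            pw_id = futures[future]
            try:
                results[pw_id] = future.result()
            except Exception as exc:  # one failure should not stop the batch
                errors[pw_id] = str(exc)
    return results, errors


results, errors = render_batch(["04110", "04010", "bad-id"])
print(sorted(results))  # ['04010', '04110']
print(sorted(errors))   # ['bad-id']
```

Threads suit this workload when most of the time is spent waiting on downloads; for CPU-bound rendering, a process pool may be the better choice.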
Contributions welcome! Areas for improvement:
- SBGN rendering: Improve glyph shape variety
- Edge routing: Implement A* pathfinding for splines
- Database integration: Add PANTHER, SMPDB auto-download
- Highlighting: Wire up image modification backend
- Performance: Parallel pathway processing
Creative Commons Attribution-NonCommercial (CC BY-NC 4.0). See LICENSE file.
Citations:
If you are publishing results obtained using Pathview-Plus, please cite:
- Pre-Print Pathview-Plus: Figueroa III JL, Brouwer CR, White III RA. 2026. Pathview-plus: unlocking the metabolic pathways from cells to ecosystems. bioRxiv.
If you are using the R version, please cite:
- Original Pathview R: Luo, W., & Brouwer, C. 2013. Pathview: an R/Bioconductor package for pathway-based data integration and visualization. Bioinformatics, 29(14), 1830–1831.
- Original SBGNview R: Shashikant, T., et al. 2022. SBGNview: Data analysis, integration and visualization on all pathways using SBGN. Bioinformatics, 38(11), 3006–3008.
We welcome contributions from other experts to expand the features of Pathview-plus, including the R and Python versions. Please contact us via support.
- Issues: open an issue.
- Email: Dr. Richard Allen White III
Made with ❤️ for the pathway visualization community