color='orange' silently uses literal color instead of column when names collide #619

@timtreis

Description


Environment: spatialdata-plot 0.3.4.dev (main, commit 5cfedc7), Python 3.13


Problem

When color=<string> is passed to any render function, the string is first tested with _is_color_like(). If it is a valid matplotlib color name, it is treated as a literal color and any matching column in the table is silently ignored. The only notification is an INFO-level log line, which is invisible at the default Python logging level (WARNING).
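The visibility problem can be reproduced with the standard library alone. The sketch below is illustrative only: the logger name `demo_plot_lib` and the two messages are hypothetical and are not spatialdata-plot's own. It shows that a logger left at WARNING drops INFO records while letting WARNING records through.

```python
import logging


class _Collect(logging.Handler):
    """Handler that records the level name of every record it receives."""

    def __init__(self) -> None:
        super().__init__()
        self.seen = []

    def emit(self, record: logging.LogRecord) -> None:
        self.seen.append(record.levelname)


def emitted_levels(level: int = logging.WARNING) -> list:
    """Log one INFO and one WARNING message; return the levels that got through.

    `level` defaults to WARNING, which mirrors Python's default root level.
    """
    logger = logging.getLogger("demo_plot_lib")  # hypothetical logger name
    logger.setLevel(level)
    logger.propagate = False  # keep the demo output away from the root logger
    handler = _Collect()
    logger.addHandler(handler)
    try:
        logger.info("'orange' is color-like, using it as a literal color")
        logger.warning("'orange' is both a color and a column name")
    finally:
        logger.removeHandler(handler)
    return handler.seen
```

With the default level only the WARNING record is collected; calling `emitted_levels(logging.INFO)` collects both. Which logger object spatialdata-plot exposes for raising the level is not stated in this issue.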

This means a column named "orange", "red", "blue", "green", etc. can never be used for coloring — every call silently renders all elements in the literal color, with no legend and no error.

There is no escape hatch: color="orange" always means the literal matplotlib color whenever _is_color_like("orange") returns True.
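Until this is resolved, one possible user-side workaround is to rename colliding columns before plotting. The sketch below is a suggestion, not part of spatialdata-plot: the helper names and the `_col` suffix are hypothetical, and it uses matplotlib's public `is_color_like`, which is assumed (not confirmed here) to agree with the private `_is_color_like` for plain color names.

```python
from matplotlib.colors import is_color_like


def colliding_columns(columns):
    """Return the column names that matplotlib would also accept as a color."""
    return [c for c in columns if isinstance(c, str) and is_color_like(c)]


def rename_map(columns, suffix="_col"):
    """Map each colliding column name to a suffixed, non-color name."""
    return {c: f"{c}{suffix}" for c in colliding_columns(columns)}
```

In the example below, this could be applied as `obs = obs.rename(columns=rename_map(obs.columns))` before `TableModel.parse`, after which `color="orange_col"` no longer passes the color check and should be looked up as a column.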


Minimal reproducible example

import matplotlib; matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np, pandas as pd, geopandas as gpd, anndata as ad
import dask; dask.config.set({"dataframe.query-planning": False})
from shapely.geometry import box
import spatialdata as sd
from spatialdata.models import ShapesModel, TableModel
import spatialdata_plot

shapes = ShapesModel.parse(gpd.GeoDataFrame(
    {"geometry": [box(i, 0, i+1, 1) for i in range(4)], "radius": [0.5]*4},
    geometry="geometry"
))
obs = pd.DataFrame({
    "region": pd.Categorical(["s"]*4),
    "instance_id": list(range(4)),
    # Column deliberately named after a matplotlib color
    "orange": pd.Categorical(["cell_type_A", "cell_type_B", "cell_type_A", "cell_type_B"]),
})
table = TableModel.parse(
    ad.AnnData(X=np.zeros((4, 1)), obs=obs),
    region="s", region_key="region", instance_key="instance_id"
)
sdata = sd.SpatialData(shapes={"s": shapes}, tables={"table": table})

fig, ax = plt.subplots()
sdata.pl.render_shapes("s", color="orange").pl.show(ax=ax)
# Expected: 4 shapes colored by cell type (A vs B) with a legend
# Actual:   all shapes solid orange, no legend, no warning
print(ax.get_legend())  # None

Expected behaviour

Either:

  • A warning is raised when color=<string> matches both a valid matplotlib color AND an existing column name, telling the user which one is being used
  • Or: column lookup takes priority over literal color (column-first semantics), with a note if the column name happens to also be a color

Actual behaviour

All shapes are rendered in solid orange. No legend, no warning — the "orange" column with cell-type labels is entirely ignored.

The affected names are all strings that pass _is_color_like(): 800+ named matplotlib colors, including "red", "blue", "green", "orange", "black", "white", "gray", "grey", "pink", "purple", "salmon".


Fix sketch

At utils.py:2302 (where _is_color_like detection occurs), upgrade from logger.info to logger.warning when color is also an existing column name:

# column_names: placeholder for the table columns available for this element
if _is_color_like(color) and color in column_names:
    logger.warning(
        f"'{color}' is both a valid matplotlib color and a column name. "
        "Treating it as a literal color. To color by the column, rename it "
        "or check the _is_color_like documentation."
    )

Alternatively, invert the lookup order: check for a column match first, and only fall back to literal color when no column matches.
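A minimal sketch of the column-first alternative, written as standalone Python so the ordering is easy to see. Everything here is hypothetical: `resolve_color`, the returned tuples, and the stand-in predicate `demo_is_color_like` are illustration only and do not reflect the actual code in utils.py.

```python
from typing import Callable, Iterable, Tuple

# Stand-in for the real color check; the actual predicate is _is_color_like.
_DEMO_COLORS = {"orange", "red", "blue"}


def demo_is_color_like(value: str) -> bool:
    return value in _DEMO_COLORS


def resolve_color(
    color: str,
    column_names: Iterable[str],
    is_color_like: Callable[[str], bool] = demo_is_color_like,
) -> Tuple[str, str]:
    """Decide how to interpret `color`, preferring a column over a literal.

    Returns ("column", name) or ("literal", name); raises if neither applies.
    """
    if color in set(column_names):
        # A column match wins, even if the name is also a valid color.
        return ("column", color)
    if is_color_like(color):
        # Fall back to a literal color only when no column matches.
        return ("literal", color)
    raise ValueError(f"'{color}' is neither a column name nor a valid color")
```

With this ordering, `resolve_color("orange", ["region", "orange"])` selects the column, while the same call without an `"orange"` column falls back to the literal color. A note could still be logged in the first branch when the name is also color-like, as suggested under "Expected behaviour".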


Triage tier: Tier 3
