Finding a better home for cluster centroids (i.e. 🚫_tmp_cluster_pos) #938

@gokceneraslan

We have a weird temporary global variable called sc.pl._utils._tmp_cluster_pos. We use it for storing the positions of cluster centroids (actually the centroids of any categorical variable for any sort of embedding). The weird part is that it's set in scatterplot functions (see https://github.com/theislab/scanpy/blob/master/scanpy/plotting/_anndata.py#L468 and https://github.com/theislab/scanpy/blob/master/scanpy/plotting/_tools/scatterplots.py#L809) and used only by sc.pl.paga_compare (https://github.com/theislab/scanpy/blob/master/scanpy/plotting/_tools/paga.py#L119).

First, it's not obvious where paga_compare finds the centroids (it was a mystery to me until recently). Second, the current design is error-prone (see the corner case in #686). Therefore, there should be a better place to store cluster centroids :)
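To make the fragility concrete, here is a toy sketch of the pattern. It is not scanpy's actual code: `_tmp_pos`, `scatter` and `compare` are hypothetical stand-ins, and using the median as the centroid is an assumption of this sketch.

```python
from statistics import median

# Hypothetical stand-in for a module-level global such as _tmp_cluster_pos.
_tmp_pos = None


def scatter(coords, labels):
    """Pretend plotting call: records per-category centroids as a side effect."""
    global _tmp_pos
    groups = {}
    for (x, y), label in zip(coords, labels):
        groups.setdefault(label, []).append((x, y))
    _tmp_pos = {
        label: (median(p[0] for p in pts), median(p[1] for p in pts))
        for label, pts in groups.items()
    }


def compare():
    """Pretend downstream plot: reads whatever was stored last."""
    if _tmp_pos is None:
        raise RuntimeError("no scatter plot has been drawn yet")
    return _tmp_pos


coords = [(0.0, 0.0), (2.0, 2.0), (10.0, 10.0)]
scatter(coords, ["a", "a", "b"])  # the grouping the caller cares about
scatter(coords, ["x", "y", "y"])  # an unrelated plot drawn afterwards
stale = compare()                 # keyed by "x"/"y", not "a"/"b"
```

Nothing in `compare()` says which grouping or embedding the stored positions belong to, so the result depends entirely on which plot happened to run last.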

I'm not following the discussion about the future of AnnData, but maybe having something like adata.uns['obs_category_leiden'] and storing colors and centroids in it, e.g. adata.uns['obs_category_leiden']['colors'] and adata.uns['obs_category_leiden']['centroids']['X_umap'], would be more structured.
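A minimal sketch of what that layout could look like. Assumptions: a plain dict stands in for `adata.uns`, `category_centroids` is a hypothetical helper (not an existing scanpy function), the centroid is taken as the per-axis median, and the coordinates and color values are made-up placeholders.

```python
from statistics import median


def category_centroids(coords, labels):
    """Hypothetical helper: per-category median position of an embedding."""
    groups = {}
    for point, label in zip(coords, labels):
        groups.setdefault(label, []).append(point)
    return {
        label: tuple(median(axis) for axis in zip(*points))
        for label, points in groups.items()
    }


# Plain dict standing in for adata.uns (assumption for this sketch).
uns = {}

# Placeholder embedding coordinates and cluster labels.
umap = [(0.0, 1.0), (2.0, 3.0), (4.0, 5.0), (10.0, 10.0)]
leiden = ["0", "0", "0", "1"]

# One entry per categorical variable, holding colors and centroids together.
entry = uns.setdefault("obs_category_leiden", {})
entry["colors"] = ["#1f77b4", "#ff7f0e"]  # placeholder color values
entry.setdefault("centroids", {})["X_umap"] = category_centroids(umap, leiden)

centroids = uns["obs_category_leiden"]["centroids"]["X_umap"]
```

With this layout a consumer such as paga_compare could look up centroids by both the categorical variable and the embedding, instead of relying on whichever scatterplot ran last. A real implementation would probably store arrays rather than nested dicts of tuples.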
