mu.pp.neighbors gives a segmentation fault. #158

@Young-Sook

Description

To Reproduce

test-data.tar.gz

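import muon as mu
import scanpy as sc

mdata = mu.read_h5mu('FL135-small.h5mu')

# Key under which each modality's neighbor graph is stored
nn_modality_key = 'nn_15_euclidean'

for modality, adata in mdata.mod.items():
    # Copy the PCA representation and L2-normalize the copy
    adata.obsm['X_pca_l2'] = adata.obsm['X_pca'].copy()
    mu.pp.l2norm(mdata, mod=modality, rep='X_pca_l2')
    # Per-modality neighbors on the normalized representation
    sc.pp.neighbors(adata, use_rep='X_pca_l2', key_added=nn_modality_key)

# Multimodal neighbors (the segmentation fault occurs here)
mu.pp.neighbors(
    mdata=mdata,
    metric='euclidean',
    neighbor_keys={'rna': nn_modality_key, 'prot': nn_modality_key},
)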

This produces the following error:

Segmentation fault: 11

After playing with the source code, I found that the problem is at https://github.com/scverse/muon/blob/main/muon/_core/preproc.py#L519.

In my mdata['prot'], some cells have exactly the same rep. This corrupts nn_indices, the result of running nearest_neighbors. For example, if cell 2 and cell 4 have the same rep, then nn_indices[:, 0] is [0, 1, 4, 3, 4, 5, ...] instead of [0, 1, 2, 3, 4, 5, ...]. As a result, the csr_matrix (graph) is constructed incorrectly, which causes the error when the two graph matrices are added here: https://github.com/scverse/muon/blob/main/muon/_core/preproc.py#L534.
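To make the failure mode concrete, here is a small NumPy-only sketch. It does not call muon: the toy `nn_indices` array, the value `n_multineighbors = 2`, and the helpers `row_pointers` and `is_valid_indptr` are made up for illustration, and it assumes the row pointers at the linked line are derived from `nn_indices[:, 0]`.

```python
import numpy as np

n_multineighbors = 2  # hypothetical number of neighbors kept per cell

# Toy neighbor table: column 0 is supposed to be each cell's own index.
# Cells 2 and 4 share the same rep, so row 2 reports cell 4 as "self".
nn_indices = np.array([
    [0, 1, 3],
    [1, 0, 3],
    [4, 2, 1],  # should start with 2
    [3, 1, 0],
    [4, 2, 5],
    [5, 4, 3],
])

def row_pointers(row_ids, n_neighbors, n_entries):
    """Build CSR-style row pointers from one row index per cell."""
    return np.concatenate((row_ids * n_neighbors, (n_entries,)))

def is_valid_indptr(indptr):
    """CSR row pointers must start at 0 and never decrease."""
    return bool(indptr[0] == 0 and np.all(np.diff(indptr) >= 0))

n_entries = nn_indices[:, 1:].size

# Assumed current behaviour: trust column 0 to be 0, 1, 2, ...
indptr_from_self = row_pointers(nn_indices[:, 0], n_multineighbors, n_entries)

# Suggested behaviour: use the row number directly.
indptr_from_arange = row_pointers(
    np.arange(nn_indices.shape[0]), n_multineighbors, n_entries
)

print(indptr_from_self, is_valid_indptr(indptr_from_self))
print(indptr_from_arange, is_valid_indptr(indptr_from_arange))
```

With the duplicated cell, the first set of row pointers goes backwards (8 followed by 6), which is not a valid CSR layout; the second set increases in steps of `n_multineighbors`.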

I suggest changing that line as below:

graph = csr_matrix(
    (
        distances[:, 1:].reshape(-1),
        nn_indices[:, 1:].reshape(-1),
        # Row pointers built from row numbers, so duplicate cells cannot disturb them
        np.concatenate((np.arange(nn_indices.shape[0]) * n_multineighbors, (nn_indices[:, 1:].size,))),
    ),
    shape=(rep.shape[0], rep.shape[0]),
)
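A standalone sketch of the suggested construction on toy data, in case it helps with review. The arrays below and `n_multineighbors = 2` are invented for illustration and are not taken from the attached dataset; only `numpy` and `scipy.sparse.csr_matrix` are used, not muon itself.

```python
import numpy as np
from scipy.sparse import csr_matrix

n_multineighbors = 2  # hypothetical number of neighbors kept per cell

# Toy inputs: column 0 is the "self" column. Cells 2 and 4 are duplicates,
# so row 2 reports cell 4 in column 0 instead of itself.
nn_indices = np.array([
    [0, 1, 3],
    [1, 0, 3],
    [4, 2, 1],
    [3, 1, 0],
    [4, 2, 5],
    [5, 4, 3],
])
distances = np.array([
    [0.0, 0.1, 0.4],
    [0.0, 0.1, 0.3],
    [0.0, 0.0, 0.5],
    [0.0, 0.3, 0.4],
    [0.0, 0.0, 0.2],
    [0.0, 0.2, 0.6],
])
n_cells = nn_indices.shape[0]

# Row pointers come from the row number, not from nn_indices[:, 0].
indptr = np.concatenate(
    (np.arange(n_cells) * n_multineighbors, (nn_indices[:, 1:].size,))
)

graph = csr_matrix(
    (distances[:, 1:].reshape(-1), nn_indices[:, 1:].reshape(-1), indptr),
    shape=(n_cells, n_cells),
)

# Each row owns exactly n_multineighbors stored entries.
print(np.diff(graph.indptr))

# Adding two such graphs is the kind of operation that failed before.
combined = graph + graph
print(combined.shape)
```

Because the row pointers no longer depend on the contents of `nn_indices`, every row gets the same number of stored entries even when duplicate cells are present.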
