Dask support for min/max #134

@quentinblampey

Thanks again @flying-sheep for merging the min/max PR.
While using the new release (1.3.0), I ran into two bugs. Both can be reproduced with the following setup:

from fast_array_utils import stats
import dask.array as da
import numpy as np
from scipy.sparse import csr_matrix
from anndata import AnnData
import anndata

# Create the same matrix as NumPy, SciPy sparse, Dask, and backed (h5py) inputs

x_np = np.array([
    [1, 0, 0, 3],
    [0, 5, 6, 0],
    [7, 0, -1, 0],
])

x_sparse = csr_matrix(x_np)
x_dask = da.from_array(x_np)

adata = AnnData(X=x_np)
adata.write_h5ad("test_backed.h5ad")
x_backed = anndata.read_h5ad("test_backed.h5ad", backed="r").X

Bug 1 - mean_var on h5py.Dataset

>>> stats.mean_var(x_backed)
TypeError: unsupported operand type(s) for ** or pow(): 'Dataset' and 'int'
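
The traceback suggests that `**` ends up being applied to the `h5py.Dataset` directly. Until that is fixed, a stopgap could be to read the data into memory first. The following is only a minimal sketch, not how `fast_array_utils` computes it: `mean_var_in_memory` is a hypothetical helper, and it assumes the data fits in memory and that the population variance (no correction) is wanted.

```python
import numpy as np


def mean_var_in_memory(x, axis=None):
    """Hypothetical helper: mean and population variance of an array-like."""
    # np.asarray reads the full contents into memory (h5py.Dataset supports this)
    arr = np.asarray(x).astype(np.float64)
    mean = arr.mean(axis=axis)
    mean_sq = (arr**2).mean(axis=axis)
    # var = E[x^2] - E[x]^2
    return mean, mean_sq - mean**2


x_np = np.array([
    [1, 0, 0, 3],
    [0, 5, 6, 0],
    [7, 0, -1, 0],
])
mean, var = mean_var_in_memory(x_np)
```

With the setup above, `mean_var_in_memory(x_backed)` would be the corresponding call. It loads the whole dataset, so it is only reasonable for small inputs like this one.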

Bug 2 - min/max on dask array

>>> stats.max(x_dask)
TypeError: max() got an unexpected keyword argument 'dtype'
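
In the meantime, Dask's own reductions can be called directly instead of going through `stats.max` / `stats.min`. This is a minimal sketch of a workaround, not the fix for the library; the chunk size below is an arbitrary choice for illustration.

```python
import dask.array as da
import numpy as np

x_np = np.array([
    [1, 0, 0, 3],
    [0, 5, 6, 0],
    [7, 0, -1, 0],
])
# chunks=(2, 2) is an arbitrary choice so the reduction spans several chunks
x_dask = da.from_array(x_np, chunks=(2, 2))

# Use the dask array methods directly, then materialize the result
overall_max = x_dask.max().compute()
col_min = x_dask.min(axis=0).compute()
```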

I can have a quick look at both this week.
