(feat): pre-processing functions for dask with sparse chunks #2856

Merged
merged 96 commits into from Mar 22, 2024
Changes from 12 commits
Commits
d3163cc
(chore): add dask sparse chunks creation
ilan-gold Feb 27, 2024
2a7a54c
(feat): add dask summation
ilan-gold Feb 27, 2024
8dd9a7a
(refactor): `materialize_as_ndarray` needs to operate on indidiual da…
ilan-gold Feb 27, 2024
d55b6a4
(feat): `filter_genes` and `filter_cells`
ilan-gold Feb 27, 2024
41a5f15
(feat): normalization
ilan-gold Feb 27, 2024
e36699e
(fix) `lop1p` tests working
ilan-gold Feb 27, 2024
da6eff0
(refactor): clean up writing test
ilan-gold Feb 27, 2024
63ca2f0
(refactor): us `da.chunk.sum`
ilan-gold Feb 27, 2024
fd22a19
(fix): remove `Client`
ilan-gold Feb 27, 2024
8b9a792
(refactor): remove unnecessary `count_nonzero`
ilan-gold Feb 27, 2024
1592571
(fix): change expected fail on sparse normalization
ilan-gold Feb 27, 2024
6ac32e5
(fix): update comment
ilan-gold Feb 28, 2024
78a3ab6
(feat): `_get_mean_var` dask
ilan-gold Feb 28, 2024
60bbdb8
(feat): clean up tests for what should/should not work
ilan-gold Feb 28, 2024
0c1f254
(refactor): `_compat.sum` to `_utils.elem_sum`
ilan-gold Feb 28, 2024
2f4d11a
(chore): add `elem_sum` test
ilan-gold Feb 28, 2024
12502e8
(refactor): `elem_sum` -> `axis_sum`
ilan-gold Feb 29, 2024
b3bb95a
(feat): add `scale` support
ilan-gold Feb 29, 2024
2b6f717
(fix): maintain dtype
ilan-gold Feb 29, 2024
448dc40
(chore): add back condition
ilan-gold Feb 29, 2024
7226bf0
(fix): use `sum` when needed
ilan-gold Feb 29, 2024
2bc7c3a
(chore): release notes
ilan-gold Feb 29, 2024
62c75fc
(fx): don't use `mean_func` name twice
ilan-gold Feb 29, 2024
b987a68
(chore): revert sparse-chunks-in-dask
ilan-gold Feb 29, 2024
902238a
(chore): type hint
ilan-gold Feb 29, 2024
0abbab5
(chore): check `test_compare_to_upstream`
ilan-gold Feb 29, 2024
a8606ae
(chore): remove comment
ilan-gold Feb 29, 2024
feac6bc
(chore): allow passing `dtype` arg in `axis_sum`
ilan-gold Feb 29, 2024
bcdeddb
(fix): revert fixture changes
ilan-gold Feb 29, 2024
4716d8f
(refactor): cleaner with `array_type` conversion before if-then
ilan-gold Feb 29, 2024
3912b63
(chore): clarify hvg support
ilan-gold Feb 29, 2024
c884c63
(chore): handle array types better
ilan-gold Mar 1, 2024
af351d4
(chore): clean up `materialize_as_ndarray`
ilan-gold Mar 1, 2024
da22953
(chore): fix typing/dispatch problem in 3.9
ilan-gold Mar 1, 2024
dbbc6a2
(chore): `list` type -> `Callable`
ilan-gold Mar 1, 2024
6a4f0c5
(feat): `row_divide` for better division handling
ilan-gold Mar 1, 2024
c0182cb
(fix): use `tuple` for `ARRAY_TYPEXXXX`
ilan-gold Mar 1, 2024
a065a78
(refactor): `mean_func` -> `axis_mean` + types
ilan-gold Mar 1, 2024
c3ee138
(chore): remove unnecessary aggregation
ilan-gold Mar 1, 2024
c246a41
(fix): raise `ValueError` for summing over more than one axis
ilan-gold Mar 1, 2024
743c327
(fix): grammar
ilan-gold Mar 1, 2024
db88560
(fix): better type hints
ilan-gold Mar 1, 2024
2a5faa6
(revert): use old `test_normalize_total` siince we have `csr`
ilan-gold Mar 1, 2024
d6ceb4c
(revert): extraneous diff
ilan-gold Mar 1, 2024
48a1a1e
(fix): try `Union`
ilan-gold Mar 1, 2024
07fc5ba
(chore): add column division ability
ilan-gold Mar 1, 2024
3cc4be2
(chore): add scale test
ilan-gold Mar 1, 2024
4cc9eef
(fix): duplicate in release note
ilan-gold Mar 1, 2024
d8afe5c
(refactor): guard clause + comments
ilan-gold Mar 1, 2024
271d5d8
(chore): add `out` check for `dask`
ilan-gold Mar 1, 2024
c61324b
(chore): add `divisor` type hints
ilan-gold Mar 1, 2024
c688aff
(fix): remove some erroneous diffs
ilan-gold Mar 1, 2024
02be7a7
(chore): `axis_{sum,mean}` type hint fixes
ilan-gold Mar 1, 2024
6acc08c
(refactor): generalize to scaling
ilan-gold Mar 4, 2024
0944429
(chore): remove erroneous comment
ilan-gold Mar 4, 2024
3538572
(chore): remove non-public API
ilan-gold Mar 4, 2024
5ef1487
(fix): import from `sc._utils`
ilan-gold Mar 4, 2024
0f43362
(fix): `inidices` -> `indices`
ilan-gold Mar 4, 2024
c100a8f
(fix): remove erroneous `axis_sum` calls
ilan-gold Mar 5, 2024
22b4e90
(fix): return statements for `axis_scale`
ilan-gold Mar 5, 2024
4fef58e
(refactor): return out of `axis_sum` if `X._meta` is `np.ndarray`
ilan-gold Mar 5, 2024
ce574e3
(core): comment fix
ilan-gold Mar 5, 2024
e5a82fc
(fix): use `normalize_total` in HVG test for true reproducibility
ilan-gold Mar 5, 2024
a4e53a6
(refactor): separate out `out` test for dask
ilan-gold Mar 6, 2024
f0b2d97
(fix): correct chunking/rechunking behavior
ilan-gold Mar 6, 2024
f9ea93d
(chore): add guard clause for `sparse` `out != X != None` in scaling
ilan-gold Mar 6, 2024
66f04b6
(fix): guard clause condition
ilan-gold Mar 6, 2024
daca210
(fix): try finishing `|` typing for 3.9
ilan-gold Mar 6, 2024
036391e
(fix): call `register` to allow unions?
ilan-gold Mar 6, 2024
cac4160
(fix): clarify warning
ilan-gold Mar 6, 2024
9ec6935
(feat): test for `max_value`/`zero_center` combos
ilan-gold Mar 6, 2024
0ae76ee
(fix): allow settings of `X` in `scale_array`
ilan-gold Mar 6, 2024
2367f46
(chore): add tests for `normalize` correctness
ilan-gold Mar 6, 2024
b2c3a96
(fix): refactor for pure dask in `median`
ilan-gold Mar 6, 2024
340894b
Merge branch 'main' into dask-sparse-mean-var
ilan-gold Mar 6, 2024
fa66f58
(refactor): add clarifying condition
ilan-gold Mar 6, 2024
2601fe8
Merge branch 'dask-sparse-mean-var' of github.com:scverse/scanpy into…
ilan-gold Mar 6, 2024
750af59
(chore): skip warning computations + tests
ilan-gold Mar 6, 2024
25fe1f9
(fix): actually skip computation in `normalize_total` condition
ilan-gold Mar 6, 2024
57c8389
(fix): actually skip in `filter_genes` + tests
ilan-gold Mar 6, 2024
69ebf98
(fix): use all-in-one median implemetation
ilan-gold Mar 7, 2024
67f47f4
(refactor): remove erreous dask warnings
ilan-gold Mar 7, 2024
e328eb5
(chore): add note about `exclude_highly_expressed`
ilan-gold Mar 7, 2024
0aafabd
(feat): `axis_scale` -> `axis_mul_or_truediv`
ilan-gold Mar 7, 2024
be988c9
(feat): `allow_divide_by_zero`
ilan-gold Mar 7, 2024
3166909
(chore): add notes + type hints
ilan-gold Mar 7, 2024
6552324
Have hvg compute earlier and only once
ivirshup Mar 20, 2024
936eb87
Merge branch 'main' into dask-sparse-mean-var
ilan-gold Mar 21, 2024
37eb1a0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 21, 2024
0cecb94
(refactor): make codecov better by removing dead code/refactoring
ilan-gold Mar 22, 2024
21faa0d
(fix): `np.clip` in dask does not take min/max as `kwargs`
ilan-gold Mar 22, 2024
5998ae8
Update docs/release-notes/1.11.0.md
ilan-gold Mar 22, 2024
f49b929
(chore): move release note
ilan-gold Mar 22, 2024
937c6db
Merge branch 'main' into dask-sparse-mean-var
ilan-gold Mar 22, 2024
ba445f8
(chore): remove erroneous comment
ilan-gold Mar 22, 2024
b3581ea
Merge branch 'dask-sparse-mean-var' of github.com:scverse/scanpy into…
ilan-gold Mar 22, 2024
31 changes: 30 additions & 1 deletion scanpy/_compat.py
@@ -1,11 +1,14 @@
from __future__ import annotations

from dataclasses import dataclass, field
from functools import partial
from functools import partial, singledispatch
from pathlib import Path

import dask.array as da
import numpy as np
from legacy_api_wrap import legacy_api
from packaging import version
from scipy import sparse as sp

try:
from functools import cache
@@ -81,3 +84,29 @@ def pkg_version(package):


old_positionals = partial(legacy_api, category=FutureWarning)


@singledispatch
def sum(X: np.ndarray | sp.spmatrix, axis=None):
    return np.sum(X, axis=axis)


@sum.register
def _(X: da.Array, axis=None):
    def sum_drop_keepdims(*args, **kwargs):
        kwargs.pop("computing_meta", None)
        if isinstance(X._meta, (sp.spmatrix, np.matrix)) or isinstance(
            args[0], (sp.spmatrix, np.matrix)
        ):  # forcing the `_meta` to be a sparse array really isn't desirable?
            kwargs.pop("keepdims", None)
            if isinstance(kwargs["axis"], tuple):
                kwargs["axis"] = kwargs["axis"][0]
        return da.chunk.sum(*args, **kwargs)

    dtype = getattr(np.zeros(1, dtype=X.dtype).sum(), "dtype", object)

    # operates on `np.matrix` for some reason with sparse chunks in dask so need explicit casting
    def aggregate_sum(*args, **kwargs):
        return da.chunk.sum(np.array(args[0]), **kwargs)

    return da.reduction(X, sum_drop_keepdims, aggregate_sum, axis=axis, dtype=dtype)
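The pattern in this hunk (dispatch on the array type, reduce each chunk on its own, then aggregate the partial results) can be sketched without dask. Everything below is an illustrative assumption rather than scanpy or dask code: `RowChunks` is a hypothetical stand-in for a row-chunked array and `sum_along` is a hypothetical name.

```python
from functools import singledispatch

import numpy as np


class RowChunks:
    """Hypothetical stand-in for a row-chunked array (not a dask type)."""

    def __init__(self, chunks):
        self.chunks = list(chunks)


@singledispatch
def sum_along(X, axis=None):
    # Fallback for plain in-memory arrays.
    return np.sum(X, axis=axis)


@sum_along.register
def _(X: RowChunks, axis=None):
    # Step 1: reduce every chunk independently.
    partials = [np.asarray(np.sum(c, axis=axis)) for c in X.chunks]
    if axis == 1:
        # Row sums of row chunks are simply laid end to end.
        return np.concatenate(partials)
    # Step 2: combine the per-chunk partial results (axis=0 or axis=None).
    return np.sum(np.stack(partials), axis=0)


dense = np.arange(12).reshape(4, 3)
chunked = RowChunks([dense[:3], dense[3:]])

print(sum_along(chunked, axis=0))  # column sums, same as dense.sum(axis=0)
print(sum_along(chunked, axis=1))  # row sums, same as dense.sum(axis=1)
```

The real implementation additionally has to drop `keepdims` and cast away `np.matrix` results, because sparse chunks do not follow the ndarray reduction API; the sketch leaves that out.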
11 changes: 4 additions & 7 deletions scanpy/preprocessing/_distributed.py
@@ -35,15 +35,12 @@ def materialize_as_ndarray(


def materialize_as_ndarray(
a: ArrayLike | tuple[ArrayLike | ZappyArray | DaskArray, ...],
a: DaskArray | ArrayLike | tuple[ArrayLike | ZappyArray | DaskArray, ...],
) -> tuple[np.ndarray] | np.ndarray:
"""Compute distributed arrays and convert them to numpy ndarrays."""
if isinstance(a, DaskArray):
return a.compute()
if not isinstance(a, tuple):
return np.asarray(a)

if not any(isinstance(arr, DaskArray) for arr in a):
return tuple(np.asarray(arr) for arr in a)

import dask.array as da

return da.compute(*a, sync=True)
return tuple(materialize_as_ndarray(arr) for arr in a)
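The recursive shape of the new `materialize_as_ndarray` can be shown with a small stand-in. This is a sketch under stated assumptions: `LazyArray` and `materialize` are hypothetical names, and `LazyArray` only mimics the `.compute()` method of a dask array.

```python
import numpy as np


class LazyArray:
    """Hypothetical lazy array: holds data and only yields it on `.compute()`."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def compute(self):
        return self._data


def materialize(a):
    # Lazy arrays are computed, tuples are handled element by element,
    # and everything else is coerced with `np.asarray`.
    if isinstance(a, LazyArray):
        return a.compute()
    if isinstance(a, tuple):
        return tuple(materialize(arr) for arr in a)
    return np.asarray(a)


result = materialize((LazyArray([1, 2]), [3, 4], (LazyArray([5]), 6)))
print(result)
```

One design consequence worth keeping in mind: computing each element separately is simpler, but with real dask arrays separate `compute()` calls do not share intermediate results the way a single joint `da.compute(*arrays)` call can.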
16 changes: 9 additions & 7 deletions scanpy/preprocessing/_normalization.py
@@ -8,7 +8,7 @@
from sklearn.utils import sparsefuncs

from .. import logging as logg
from .._compat import DaskArray, old_positionals
from .._compat import DaskArray, old_positionals, sum
from .._utils import view_to_actual
from ..get import _get_obs_rep, _set_obs_rep

@@ -35,7 +35,7 @@ def _normalize_data(X, counts, after=None, copy: bool = False):
elif isinstance(counts, np.ndarray):
np.divide(X, counts[:, None], out=X)
else:
X = np.divide(X, counts[:, None]) # dask does not support kwarg "out"
X = X / counts[:, None]
return X
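The per-row division in `_normalize_data` relies on NumPy broadcasting. A minimal dense sketch with made-up toy values, showing that the out-of-place `X / counts[:, None]` form and the in-place `np.divide(..., out=...)` form give the same result:

```python
import numpy as np

X = np.array([[1.0, 3.0], [2.0, 2.0], [0.0, 5.0]])
counts = X.sum(axis=1)  # one total per row, shape (3,)

# `counts[:, None]` has shape (3, 1), so it broadcasts across the columns.
out_of_place = X / counts[:, None]

# The in-place variant writes into an existing buffer via `out=`.
in_place = X.copy()
np.divide(in_place, counts[:, None], out=in_place)

print(out_of_place.sum(axis=1))  # every row now sums to 1
```

The out-of-place form builds a new array instead of mutating a buffer, which is the form the diff uses for the branch that handles non-ndarray `counts`.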


@@ -187,22 +187,23 @@ def normalize_total(

gene_subset = None
msg = "normalizing counts per cell"

counts_per_cell = sum(X, axis=1)
if exclude_highly_expressed:
counts_per_cell = X.sum(1) # original counts per cell
counts_per_cell = sum(X, axis=1) # original counts per cell
counts_per_cell = np.ravel(counts_per_cell)

# at least one cell has more than max_fraction of counts per cell

gene_subset = (X > counts_per_cell[:, None] * max_fraction).sum(0)
gene_subset = sum((X > counts_per_cell[:, None] * max_fraction), axis=0)
gene_subset = np.asarray(np.ravel(gene_subset) == 0)

msg += (
". The following highly-expressed genes are not considered during "
f"normalization factor computation:\n{adata.var_names[~gene_subset].tolist()}"
)
counts_per_cell = X[:, gene_subset].sum(1)
else:
counts_per_cell = X.sum(1)
counts_per_cell = sum(X[:, gene_subset], axis=1)

start = logg.info(msg)
counts_per_cell = np.ravel(counts_per_cell)

@@ -253,3 +254,4 @@ def normalize_total(
return adata
elif not inplace:
return dat
return None
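The `exclude_highly_expressed` branch of `normalize_total` can be traced on a small dense matrix. The values and the `n_cells_dominated` name below are made up for illustration; the expressions mirror the ones in the hunk but use plain NumPy instead of the dispatching `sum`.

```python
import numpy as np

X = np.array(
    [
        [90.0, 5.0, 5.0],  # gene 0 dominates this cell
        [10.0, 10.0, 10.0],
        [2.0, 4.0, 4.0],
    ]
)
max_fraction = 0.5

counts_per_cell = X.sum(axis=1)

# Count, per gene, the cells in which it exceeds `max_fraction` of the total.
n_cells_dominated = (X > counts_per_cell[:, None] * max_fraction).sum(axis=0)
gene_subset = n_cells_dominated == 0  # True for genes that are kept

# Normalization factors are recomputed from the kept genes only.
counts_per_cell = X[:, gene_subset].sum(axis=1)

print(gene_subset)      # gene 0 is excluded
print(counts_per_cell)  # totals over genes 1 and 2
```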
10 changes: 5 additions & 5 deletions scanpy/preprocessing/_simple.py
@@ -17,7 +17,7 @@
from sklearn.utils import check_array, sparsefuncs

from .. import logging as logg
from .._compat import old_positionals
from .._compat import old_positionals, sum
from .._settings import settings as sett
from .._utils import (
AnyRandom,
@@ -162,7 +162,7 @@ def filter_cells(
X = data # proceed with processing the data matrix
min_number = min_counts if min_genes is None else min_genes
max_number = max_counts if max_genes is None else max_genes
number_per_cell = np.sum(
number_per_cell = sum(
X if min_genes is None and max_genes is None else X > 0, axis=1
)
if issparse(X):
@@ -172,7 +172,7 @@ def filter_cells(
if max_number is not None:
cell_subset = number_per_cell <= max_number

s = materialize_as_ndarray(np.sum(~cell_subset))
s = materialize_as_ndarray(sum(~cell_subset))
if s > 0:
msg = f"filtered out {s} cells that have "
if min_genes is not None or min_counts is not None:
@@ -278,7 +278,7 @@ def filter_genes(
X = data # proceed with processing the data matrix
min_number = min_counts if min_cells is None else min_cells
max_number = max_counts if max_cells is None else max_cells
number_per_gene = np.sum(
number_per_gene = sum(
X if min_cells is None and max_cells is None else X > 0, axis=0
)
if issparse(X):
@@ -288,7 +288,7 @@ def filter_genes(
if max_number is not None:
gene_subset = number_per_gene <= max_number

s = np.sum(~gene_subset)
s = sum(~gene_subset)
if s > 0:
msg = f"filtered out {s} genes that are detected "
if min_cells is not None or min_counts is not None:
Expand Down
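The counting step shared by `filter_cells` and `filter_genes` can be sketched on a dense toy matrix. The hunks above only show the `<= max_number` comparison; the `>=` comparison for the minimum threshold is an assumption here, as are the toy values and the `n_filtered` name.

```python
import numpy as np

X = np.array([[0, 2, 0], [1, 1, 1], [0, 0, 0], [5, 0, 3]])
min_genes = 2

# `X > 0` turns counts into detection flags, so the row sum counts genes.
number_per_cell = np.sum(X > 0, axis=1)
cell_subset = number_per_cell >= min_genes

# Number of cells that fail the filter (used for the log message).
n_filtered = int(np.sum(~cell_subset))

print(number_per_cell, cell_subset, n_filtered)
```

Swapping `axis=1` for `axis=0` gives the per-gene version used by `filter_genes`.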
52 changes: 51 additions & 1 deletion scanpy/testing/_helpers/data.py
@@ -18,11 +18,15 @@

from typing import TYPE_CHECKING

import dask.array as da
from dask import delayed
from scipy import sparse

import scanpy as sc

if TYPE_CHECKING:
from anndata import AnnData

from anndata._core.sparse_dataset import SparseDataset
# Functions returning the same objects (easy to misuse)


@@ -76,3 +80,49 @@

def pbmc3k_normalized() -> AnnData:
return _pbmc3k_normalized().copy()


class CSRCallable:
    """Dummy class to bypass dask checks"""

    def __new__(cls, shape, dtype):
        return csr_callable(shape, dtype)


def csr_callable(shape: tuple[int, int], dtype) -> sparse.csr_matrix:
    if len(shape) == 0:
        shape = (0, 0)
    if len(shape) == 1:
        shape = (shape[0], 0)
[Codecov / codecov/patch] scanpy/testing/_helpers/data.py#L96: added line #L96 was not covered by tests
    elif len(shape) == 2:
        pass
    else:
        raise ValueError(shape)
[Codecov / codecov/patch] scanpy/testing/_helpers/data.py#L100: added line #L100 was not covered by tests

    return sparse.csr_matrix(shape, dtype=dtype)


def make_dask_chunk(x: SparseDataset, start: int, end: int) -> da.Array:
    def take_slice(x, idx):
        return x[idx]

    return da.from_delayed(
        delayed(take_slice)(x, slice(start, end)),
        dtype=x.dtype,
        shape=(end - start, x.shape[1]),
        meta=CSRCallable,
    )


def sparse_dataset_as_dask(x: SparseDataset, stride: int):
    n_chunks, rem = divmod(x.shape[0], stride)

    chunks = []
    cur_pos = 0
    for i in range(n_chunks):
        chunks.append(make_dask_chunk(x, cur_pos, cur_pos + stride))
        cur_pos += stride
    if rem:
        chunks.append(make_dask_chunk(x, cur_pos, x.shape[0]))
[Codecov / codecov/patch] scanpy/testing/_helpers/data.py#L126: added line #L126 was not covered by tests

    return da.concatenate(chunks, axis=0)
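The row ranges that `sparse_dataset_as_dask` hands to `make_dask_chunk` follow from a single `divmod`. The helper below is a hypothetical, dask-free sketch of just that boundary arithmetic; `chunk_bounds` is not part of the PR.

```python
def chunk_bounds(n_rows: int, stride: int) -> list[tuple[int, int]]:
    """Return (start, end) row ranges; hypothetical helper for illustration."""
    n_chunks, rem = divmod(n_rows, stride)
    bounds = [(i * stride, (i + 1) * stride) for i in range(n_chunks)]
    if rem:
        # The final, shorter chunk covers whatever rows are left.
        bounds.append((n_chunks * stride, n_rows))
    return bounds


bounds = chunk_bounds(10, 4)
print(bounds)  # [(0, 4), (4, 8), (8, 10)]
```

The `if rem:` branch is the one Codecov flags as untested for line 126, which suggests the test data divides evenly by the stride used in the tests.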
10 changes: 1 addition & 9 deletions scanpy/testing/_pytest/params.py
@@ -66,15 +66,7 @@ def param_with(
at for (_, spsty), ats in MAP_ARRAY_TYPES.items() if spsty == "dense" for at in ats
)

ARRAY_TYPES_SUPPORTED = tuple(
(
param_with(at, marks=[pytest.mark.xfail(reason="sparse-in-dask not supported")])
if attrs == ("dask", "sparse")
else at
)
for attrs, ats in MAP_ARRAY_TYPES.items()
for at in ats
)
ARRAY_TYPES_SUPPORTED = tuple(at for ats in MAP_ARRAY_TYPES.values() for at in ats)
"""
Sparse matrices in dask arrays aren’t officially supported upstream,
so add xfail to them.
4 changes: 4 additions & 0 deletions scanpy/tests/_data/10x-10k-subset.zarr/.zattrs
@@ -0,0 +1,4 @@
{
"encoding-type": "anndata",
"encoding-version": "0.1.0"
}
4 changes: 2 additions & 2 deletions scanpy/tests/_data/10x-10k-subset.zarr/X/.zarray
@@ -1,7 +1,7 @@
{
"chunks": [
2000,
1000
1250,
125
],
"compressor": {
"blocksize": 0,
4 changes: 4 additions & 0 deletions scanpy/tests/_data/10x-10k-subset.zarr/X/.zattrs
@@ -0,0 +1,4 @@
{
"encoding-type": "array",
"encoding-version": "0.2.0"
}
Binary file modified scanpy/tests/_data/10x-10k-subset.zarr/X/0.0
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/0.1
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/0.2
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/0.3
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/0.4
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/0.5
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/0.6
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/0.7
Binary file not shown.
Binary file modified scanpy/tests/_data/10x-10k-subset.zarr/X/1.0
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/1.1
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/1.2
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/1.3
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/1.4
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/1.5
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/1.6
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/1.7
Binary file not shown.
Binary file modified scanpy/tests/_data/10x-10k-subset.zarr/X/2.0
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/2.1
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/2.2
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/2.3
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/2.4
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/2.5
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/2.6
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/2.7
Binary file not shown.
Binary file modified scanpy/tests/_data/10x-10k-subset.zarr/X/3.0
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/3.1
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/3.2
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/3.3
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/3.4
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/3.5
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/3.6
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/3.7
Binary file not shown.
Binary file modified scanpy/tests/_data/10x-10k-subset.zarr/X/4.0
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/4.1
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/4.2
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/4.3
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/4.4
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/4.5
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/4.6
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/4.7
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/5.0
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/5.1
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/5.2
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/5.3
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/5.4
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/5.5
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/5.6
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/5.7
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/6.0
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/6.1
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/6.2
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/6.3
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/6.4
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/6.5
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/6.6
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/6.7
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/7.0
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/7.1
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/7.2
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/7.3
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/7.4
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/7.5
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/7.6
Binary file not shown.
Binary file added scanpy/tests/_data/10x-10k-subset.zarr/X/7.7
Binary file not shown.
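The list of chunk files above is consistent with the `.zarray` change from `[2000, 1000]` to `[1250, 125]` chunks. The sketch below checks the arithmetic, assuming `X` has the same 10000 x 1000 shape that the `CSR_X` layer declares; `chunk_grid` is a hypothetical helper. In the zarr v2 layout each chunk is stored under a key of dot-separated chunk indices, such as `0.0` or `7.7`.

```python
import math


def chunk_grid(shape, chunks):
    """Number of chunks along each axis: ceil(shape / chunk size)."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))


shape = (10000, 1000)  # assumed shape of `X`, taken from the `CSR_X` layer attrs

old_grid = chunk_grid(shape, (2000, 1000))
new_grid = chunk_grid(shape, (1250, 125))

print(old_grid)  # (5, 1): files 0.0 through 4.0, the ones marked "modified"
print(new_grid)  # (8, 8): files 0.0 through 7.7
```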
4 changes: 4 additions & 0 deletions scanpy/tests/_data/10x-10k-subset.zarr/layers/.zattrs
@@ -0,0 +1,4 @@
{
"encoding-type": "dict",
"encoding-version": "0.1.0"
}
3 changes: 3 additions & 0 deletions scanpy/tests/_data/10x-10k-subset.zarr/layers/.zgroup
@@ -0,0 +1,3 @@
{
"zarr_format": 2
}
8 changes: 8 additions & 0 deletions scanpy/tests/_data/10x-10k-subset.zarr/layers/CSR_X/.zattrs
@@ -0,0 +1,8 @@
{
"encoding-type": "csr_matrix",
"encoding-version": "0.1.0",
"shape": [
10000,
1000
]
}
3 changes: 3 additions & 0 deletions scanpy/tests/_data/10x-10k-subset.zarr/layers/CSR_X/.zgroup
@@ -0,0 +1,3 @@
{
"zarr_format": 2
}
20 changes: 20 additions & 0 deletions scanpy/tests/_data/10x-10k-subset.zarr/layers/CSR_X/data/.zarray
@@ -0,0 +1,20 @@
{
"chunks": [
43245
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "<i8",
"fill_value": 0,
"filters": null,
"order": "C",
"shape": [
691914
],
"zarr_format": 2
}
Binary files not shown (16 files).
@@ -0,0 +1,20 @@
{
"chunks": [
86490
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "<i4",
"fill_value": 0,
"filters": null,
"order": "C",
"shape": [
691914
],
"zarr_format": 2
}
Binary files not shown (8 files).
20 changes: 20 additions & 0 deletions scanpy/tests/_data/10x-10k-subset.zarr/layers/CSR_X/indptr/.zarray
@@ -0,0 +1,20 @@
{
"chunks": [
10001
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "<i4",
"fill_value": 0,
"filters": null,
"order": "C",
"shape": [
10001
],
"zarr_format": 2
}
Binary file not shown.
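The three arrays of the `CSR_X` layer follow the usual CSR layout: `indptr` has one entry more than the number of rows (10001 for 10000 rows), and `data` and `indices` each have one entry per stored value (691914 here). A toy sketch with made-up values shows how the three arrays fit together; the `row` helper is hypothetical.

```python
indptr = [0, 2, 2, 5]      # n_rows + 1 entries
indices = [0, 3, 1, 2, 3]  # column index of each stored value
data = [7, 1, 4, 9, 6]     # the stored values themselves

n_rows = len(indptr) - 1
nnz = indptr[-1]


def row(i):
    """Return the (column, value) pairs stored for row `i`."""
    start, end = indptr[i], indptr[i + 1]
    return list(zip(indices[start:end], data[start:end]))


print(n_rows, nnz)
print(row(0))  # [(0, 7), (3, 1)]
print(row(1))  # [] because the row has no stored values
```

Because a row slice only needs a contiguous range of `indptr`, `indices`, and `data`, row-wise chunks of a CSR dataset can be read independently, which is what the row-based `make_dask_chunk` helper relies on.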
6 changes: 6 additions & 0 deletions scanpy/tests/_data/10x-10k-subset.zarr/obs/.zattrs
@@ -0,0 +1,6 @@
{
"_index": "index",
"column-order": [],
"encoding-type": "dataframe",
"encoding-version": "0.2.0"
}
3 changes: 3 additions & 0 deletions scanpy/tests/_data/10x-10k-subset.zarr/obs/.zgroup
@@ -0,0 +1,3 @@
{
"zarr_format": 2
}
Binary file removed scanpy/tests/_data/10x-10k-subset.zarr/obs/0
Binary file not shown.
@@ -9,14 +9,13 @@
"id": "blosc",
"shuffle": 1
},
"dtype": [
[
"index",
"|S18"
]
"dtype": "|O",
"fill_value": 0,
"filters": [
{
"id": "vlen-utf8"
}
],
"fill_value": "AAAAAAAAAAAAAAAAAAAAAAAA",
"filters": null,
"order": "C",
"shape": [
10000