
[BUG] pydeseq2 with anndata>=0.13.0.dev14 #387

@bsiranosian


Describe the bug
The latest version of anndata has introduced some changes that seem to break pydeseq2.

To Reproduce
Package versions:

  • anndata>=0.13.0.dev14
  • pydeseq2==0.5.1

Create a dummy AnnData object and run the normalization and fitting steps:

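import anndata
import pandas as pd
from pydeseq2.dds import DeseqDataSet

design_factors = "type"
# Not defined in the original report; assumed here to be the pydeseq2 defaults.
refit_cooks = True
inference = None
quiet = False
ref_level = None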
counts = pd.DataFrame([[5, 5, 5, 5], [4, 4, 4, 4], [3, 3, 3, 3], [2, 2, 2, 2], [1, 1, 1, 1], [1, 1, 1, 1]])
obs_meta = pd.DataFrame([["A"], ["A"], ["A"], ["B"], ["B"], ["B"]], columns=["type"], index=["A1","A2", "A3", "B1", "B2", "B3"])
var_meta = pd.DataFrame(
    data={
        "gene_id": ["ENSG00000000003", "ENSG00000000005", "ENSG00000000419", "ENSG00000000457"],
        "gene_name": ["TSPAN6", "TNMD", "DPM1", "SCYL3"],
    },
    index=["TSPAN6", "TNMD", "DPM1", "SCYL3"],
)
adata_count = anndata.AnnData(counts, obs=obs_meta, var=var_meta)

dds = DeseqDataSet(
    counts=adata_count.X,
    metadata=adata_count.obs,
    design_factors=design_factors,
    refit_cooks=refit_cooks,
    inference=inference,
    quiet=quiet,
    ref_level=ref_level,
)

dds.fit_size_factors()
dds.fit_genewise_dispersions()
dds.fit_dispersion_trend()
dds.fit_dispersion_prior()
dds.fit_MAP_dispersions()
dds.fit_LFC()
dds.calculate_cooks()

This raises an error like the following:

self = AnnData object with n_obs × n_vars = 6 × 4
    obs: 'sample-type'
    obsm: 'design_matrix', 'size_factors'
    varm: 'non_zero'
    layers: 'normed_counts'

    def fit_genewise_dispersions(self) -> None:
        """Fit gene-wise dispersion estimates.
    
        Fits a negative binomial per gene, independently.
        """
        # Check that size factors are available. If not, compute them.
        if "size_factors" not in self.obsm:
            self.fit_size_factors()
    
        # Exclude genes with all zeroes
        self.varm["non_zero"] = ~(self.X == 0).all(axis=0)
>       self.non_zero_idx = np.arange(self.n_vars)[self.varm["non_zero"]]
E       IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

Storing the "non_zero" column in var instead of varm fixes this particular error, but similar varm-related errors appear once it is fixed. I haven't looked into those yet.
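
For anyone triaging this, the IndexError can be reproduced with NumPy alone. The sketch below assumes (this is a guess, not confirmed against the anndata source) that the newer anndata returns the varm entry as a 2-D array of shape (n_vars, 1) rather than the 1-D mask that was stored. The helper names `naive_indices` and `robust_indices` are made up for illustration and are not part of pydeseq2.

```python
import numpy as np

# Toy count matrix: 3 samples x 4 genes, with the third gene all zero.
counts = np.array([[5, 5, 0, 5], [4, 4, 0, 4], [3, 3, 0, 3]])

# 1-D boolean mask, one entry per gene, as computed in fit_genewise_dispersions.
mask_1d = ~(counts == 0).all(axis=0)

# Assumption: the mask comes back from varm with shape (n_vars, 1).
mask_2d = mask_1d.reshape(-1, 1)


def naive_indices(mask):
    """Index a 1-D arange with the mask as-is (the pattern in the traceback)."""
    return np.arange(mask.shape[0])[mask]


def robust_indices(mask):
    """Hypothetical workaround: flatten the mask before indexing."""
    flat = np.asarray(mask).ravel()
    return np.arange(flat.shape[0])[flat]


# A 2-D boolean mask counts as two indices, so indexing a 1-D array fails.
try:
    naive_indices(mask_2d)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

indices = robust_indices(mask_2d)
```

Flattening is only one possible defensive fix. Keeping the mask in var, as described above, avoids the problem in a different way, since a var column is read back as a 1-D Series.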

Expected behavior
Normalization works as expected.

Desktop:

  • OS: Ubuntu
  • Version: 22.04

Additional context
Some recent commits in anndata reference varm: scverse/anndata@a5deb2a

Labels: bug (Something isn't working)
