Skip to content

highly variable genes + batch_key --> reciprocal condition number error #2669

@iamsalil

Description

@iamsalil

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of scanpy.
  • (optional) I have confirmed this bug exists on the master branch of scanpy.

What happened?

Note: This bug seems to have been mentioned on 2/13/2022 in this discourse.scverse.org thread: https://discourse.scverse.org/t/error-in-highly-variable-gene-selection/276. However, I can't seem to access scverse.org so I can't see what the cause/resolution was.

Issue: I am running highly_variable_genes with flavor='seurat_v3'. When I do not include a batch_key, the function runs fine. When I add a batch_key, I get a numerical error.

Minimal code sample

sc.pp.highly_variable_genes(adata, flavor='seurat_v3', n_top_genes=1000, batch_key="Covariate")

Error output

----> 1 sc.pp.highly_variable_genes(adata, flavor='seurat_v3', n_top_genes=1000, batch_key="Covariate")

File ~/miniconda3/envs/singlecell/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py:428, in highly_variable_genes(adata, layer, n_top_genes, min_disp, max_disp, min_mean, max_mean, span, n_bins, flavor, subset, inplace, batch_key, check_values)
    422     raise ValueError(
    423         '`pp.highly_variable_genes` expects an `AnnData` argument, '
    424         'pass `inplace=False` if you want to return a `pd.DataFrame`.'
    425     )
    427 if flavor == 'seurat_v3':
--> 428     return _highly_variable_genes_seurat_v3(
    429         adata,
    430         layer=layer,
    431         n_top_genes=n_top_genes,
    432         batch_key=batch_key,
    433         check_values=check_values,
    434         span=span,
    435         subset=subset,
    436         inplace=inplace,
    437     )
    439 if batch_key is None:
    440     df = _highly_variable_genes_single_batch(
    441         adata,
    442         layer=layer,
   (...)
    449         flavor=flavor,
    450     )

File ~/miniconda3/envs/singlecell/lib/python3.8/site-packages/scanpy/preprocessing/_highly_variable_genes.py:85, in _highly_variable_genes_seurat_v3(adata, layer, n_top_genes, batch_key, check_values, span, subset, inplace)
     83 x = np.log10(mean[not_const])
     84 model = loess(x, y, span=span, degree=2)
---> 85 model.fit()
     86 estimat_var[not_const] = model.outputs.fitted_values
     87 reg_std = np.sqrt(10**estimat_var)

File _loess.pyx:899, in _loess.loess.fit()

ValueError: b'reciprocal condition number  6.4708e-16'

Versions

Details
-----
anndata     0.9.1
scanpy      1.9.3
-----
PIL                         9.5.0
asttokens                   NA
backcall                    0.2.0
cffi                        1.15.1
comm                        0.1.2
cycler                      0.10.0
cython_runtime              NA
dateutil                    2.8.2
debugpy                     1.5.1
decorator                   5.1.1
defusedxml                  0.7.1
executing                   0.8.3
h5py                        3.8.0
igraph                      0.10.4
importlib_resources         NA
ipykernel                   6.19.2
ipython_genutils            0.2.0
jedi                        0.18.1
joblib                      1.2.0
jupyter_server              1.23.4
kiwisolver                  1.4.4
leidenalg                   0.9.1
llvmlite                    0.40.0
matplotlib                  3.7.1
mpl_toolkits                NA
natsort                     8.3.1
numba                       0.57.0
numpy                       1.24.3
packaging                   23.0
pandas                      2.0.1
parso                       0.8.3
pexpect                     4.8.0
pickleshare                 0.7.5
pkg_resources               NA
platformdirs                2.5.2
prompt_toolkit              3.0.36
psutil                      5.9.0
ptyprocess                  0.7.0
pure_eval                   0.2.2
pydev_ipython               NA
pydevconsole                NA
pydevd                      2.6.0
pydevd_concurrency_analyser NA
pydevd_file_utils           NA
pydevd_plugins              NA
pydevd_tracing              NA
pygments                    2.15.1
pyparsing                   3.0.9
pytz                        2022.7
scipy                       1.10.1
session_info                1.0.0
setuptools                  66.0.0
six                         1.16.0
sklearn                     1.2.2
stack_data                  0.2.0
texttable                   1.6.7
threadpoolctl               3.1.0
tornado                     6.2
traitlets                   5.7.1
typing_extensions           NA
wcwidth                     0.2.5
yaml                        6.0.1
zipp                        NA
zmq                         25.0.2
-----
IPython             8.12.0
jupyter_client      8.1.0
jupyter_core        5.3.0
jupyterlab          3.5.3
notebook            6.5.4
-----
Python 3.8.16 (default, Mar  2 2023, 03:21:46) [GCC 11.2.0]
Linux-4.15.0-212-generic-x86_64-with-glibc2.17
-----
Session information updated at 2023-09-19 02:18

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions