Problem with .highly_variable_genes #2242

Closed
dr-sayyadhury opened this issue Apr 22, 2022 · 2 comments
Labels
Needs info❔ More information needed

dr-sayyadhury commented Apr 22, 2022

Hi, I know this issue has been previously opened but I am still unable to resolve this problem. Any help would be great.

I am new to Scanpy and I followed the tutorial linked below.
https://nbisweden.github.io/workshop-scRNAseq/labs/compiled/scanpy/scanpy_01_qc.html

It's a great tutorial, and everything works until I run the following code:

sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5, flavor='seurat')

The error I receive is:

sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5, flavor='seurat')
/Users/ShaminiAyyadhury/anaconda3/envs/scIntegration/lib/python3.10/site-packages/scanpy/preprocessing/_highly_variable_genes.py:200: RuntimeWarning: overflow encountered in expm1
X = np.expm1(X)
/Users/ShaminiAyyadhury/anaconda3/envs/scIntegration/lib/python3.10/site-packages/scanpy/preprocessing/_utils.py:11: RuntimeWarning: overflow encountered in multiply
mean_sq = np.multiply(X, X).mean(axis=axis, dtype=np.float64)
/Users/ShaminiAyyadhury/anaconda3/envs/scIntegration/lib/python3.10/site-packages/scanpy/preprocessing/_utils.py:12: RuntimeWarning: invalid value encountered in subtract
var = mean_sq - mean**2
Traceback (most recent call last):

File "/var/folders/xl/40x0m_b12y5fz7w2hqr_yf480000gp/T/ipykernel_11768/414963115.py", line 1, in <cell line: 1>
sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5, flavor='seurat')

File "/Users/ShaminiAyyadhury/anaconda3/envs/scIntegration/lib/python3.10/site-packages/scanpy/preprocessing/_highly_variable_genes.py", line 434, in highly_variable_genes
df = _highly_variable_genes_single_batch(

File "/Users/ShaminiAyyadhury/anaconda3/envs/scIntegration/lib/python3.10/site-packages/scanpy/preprocessing/_highly_variable_genes.py", line 215, in _highly_variable_genes_single_batch
df['mean_bin'] = pd.cut(df['means'], bins=n_bins)

File "/Users/ShaminiAyyadhury/anaconda3/envs/scIntegration/lib/python3.10/site-packages/pandas/core/reshape/tile.py", line 262, in cut
raise ValueError(

ValueError: cannot specify integer bins when input data contains infinity
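
For readers following the traceback, here is a minimal sketch of how the three warnings and the final ValueError can chain together. It does not use the poster's data: the small float32 matrix and `bins=20` are assumptions for illustration, and only the `np.expm1` and `pd.cut` calls are taken from the traceback itself.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical cells-by-genes matrix holding count-sized values rather than
# log1p values (an assumption, not the poster's data).
X = np.array(
    [[0.0, 3.0, 120.0],
     [1.0, 0.0, 250.0],
     [0.0, 5.0, 90.0]],
    dtype=np.float32,
)

with warnings.catch_warnings():
    # Silence the "overflow encountered in expm1" RuntimeWarning for this demo.
    warnings.simplefilter("ignore", RuntimeWarning)
    undone = np.expm1(X)         # same call as in the traceback
    means = undone.mean(axis=0)  # per-gene mean; infinite where expm1 overflowed

has_inf = bool(np.isinf(means).any())

try:
    pd.cut(pd.Series(means), bins=20)  # integer bin count, as in the traceback
    cut_error = None
except ValueError as err:
    cut_error = str(err)

print(has_inf, cut_error)
```

In float32, `exp(x)` leaves the representable range once `x` is above roughly 88.7, so any entry of that size becomes infinite after `np.expm1`, and the infinite per-gene mean is what `pd.cut` rejects.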


Concerns:

  1. I am not sure if it's the way I created the AnnData object that is causing this problem.
  2. I have already log-normalized my data object, and I am not sure what else to do.
  3. I have also read the related GitHub issues and tried the suggested fixes, without success.
  4. I am attaching my own code here, which is exactly the code from the website above. The error occurs in the last few lines of my script. I am not sure whether anything I do earlier is causing the problem.
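
One way to narrow down concerns 1 and 2 is to look at the values actually stored in the matrix before calling the function. The helper below is a hypothetical sketch (the name `summarize_expression` and the returned keys are made up for illustration); it only uses NumPy and accepts either a dense array or a SciPy sparse matrix.

```python
import numpy as np


def summarize_expression(X):
    """Report simple statistics hinting at whether X holds raw counts or log1p data."""
    # SciPy sparse matrices expose .toarray(); their stored values live in .data.
    values = X.data if hasattr(X, "toarray") else np.asarray(X).ravel()
    values = np.asarray(values, dtype=np.float64)
    if values.size == 0:
        return {"max": 0.0, "all_integers": True, "all_finite": True,
                "expm1_overflows_float32": False}
    max_value = float(values.max())
    # exp(x) exceeds the float32 range for x above roughly 88.7.
    float32_limit = float(np.log(np.finfo(np.float32).max))
    return {
        "max": max_value,
        "all_integers": bool(np.all(values == np.round(values))),
        "all_finite": bool(np.isfinite(values).all()),
        "expm1_overflows_float32": bool(max_value > float32_limit),
    }


# Toy data for illustration only.
raw = np.array([[0, 3, 120], [1, 0, 250]], dtype=np.float32)
logged = np.log1p(raw)
raw_summary = summarize_expression(raw)
logged_summary = summarize_expression(logged)
print(raw_summary)
print(logged_summary)
```

Calling `summarize_expression(adata.X)` on the real object would be the analogous check. Integer-only values with a large maximum suggest raw counts, while a small non-integer maximum is what log1p-transformed data usually looks like; this is a heuristic, not a guarantee.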

[scanpy.txt](https://github.com/scverse/scanpy/files/8536536/scanpy.txt)

Thank you.
Regards,
Shamini A

@LisaSikkema
Contributor

Hi Shamini,

Have you tried running the highly variable genes function on the non-log-transformed, non-normalised counts? The expected input depends on the flavor; see the documentation:
"Expects logarithmized data, except when flavor='seurat_v3', in which count data is expected."
At some point in the HVG calculation the numbers in your matrix become too large; passing the data in the correct format might solve this!
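
Going by the documentation sentence quoted in this reply, a possible ordering of the preprocessing steps could look like the sketch below. It assumes `adata.X` holds raw counts on entry; `expected_input` and `run_hvg` are hypothetical helper names, and `target_sum=1e4` and `n_top_genes=2000` are arbitrary illustrative values rather than settings taken from this thread.

```python
def expected_input(flavor):
    """Map an HVG flavor to the input type named in the quoted documentation."""
    return "counts" if flavor == "seurat_v3" else "logarithmized"


def run_hvg(adata, flavor="seurat"):
    """Hypothetical helper; assumes adata.X contains raw counts on entry."""
    import scanpy as sc  # imported lazily so expected_input works without scanpy

    if expected_input(flavor) == "counts":
        # seurat_v3 is documented to expect count data; n_top_genes is set
        # explicitly here (2000 is an arbitrary choice).
        sc.pp.highly_variable_genes(adata, flavor="seurat_v3", n_top_genes=2000)
    else:
        adata.layers["counts"] = adata.X.copy()  # keep the raw counts around
        sc.pp.normalize_total(adata, target_sum=1e4)
        sc.pp.log1p(adata)  # log-transform exactly once
        sc.pp.highly_variable_genes(
            adata, min_mean=0.0125, max_mean=3, min_disp=0.5, flavor=flavor
        )
    return adata


print(expected_input("seurat"), expected_input("seurat_v3"))
```

With `flavor='seurat'` the traceback shows the function undoing the log with `np.expm1`, so handing it values that were never log-transformed is one plausible way to end up with the overflow reported above.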

eroell added the "Needs info❔ More information needed" label on Oct 10, 2023
@eroell
Contributor

eroell commented Nov 2, 2023

Just came across this. As we haven't heard back after the follow-up, we will close the issue for now. Hopefully you obtained the expected behaviour in the end :)

However, please don't hesitate to reopen this issue or create a new one if you have any more questions or run into any related problems in the future.

Thanks for being a part of our community! :)

eroell closed this as not planned (won't fix, can't repro, duplicate, stale) on Nov 2, 2023