Invalid Counts data #126

Closed
zhuyewen opened this issue Jun 30, 2023 · 2 comments

zhuyewen commented Jun 30, 2023

Hello, cellphonedb team!

Thank you very much for open-sourcing such a great single-cell tool. I have been using it since version 3.0 and now use 4.0, and it has helped me a lot.

With the recent 4.0 release, however, I have run into a problem. Everything works fine with the example data you provide, but with my own data I get an error.

This is how I set up the inputs:

cpdb_file_path = '/Users/zhuyewen/Downloads/CellphoneDB-master/db/v4.1.0/cellphonedb.zip'

meta_file_path = '/Users/zhuyewen/R/技术摸索/23.cellphonedb/metadata.tsv'
counts_file_path = '/Users/zhuyewen/R/技术摸索/23.cellphonedb/pbmc3k.h5ad'

out_path = '/Users/zhuyewen/Downloads/CellphoneDB-master/result/method2'

...

from cellphonedb.src.core.methods import cpdb_statistical_analysis_method

deconvoluted, means, pvalues, significant_means = cpdb_statistical_analysis_method.call(
    cpdb_file_path = cpdb_file_path,                 # mandatory: CellPhoneDB database zip file.
    meta_file_path = meta_file_path,                 # mandatory: tsv file defining barcodes to cell label.
    counts_file_path = counts_file_path,             # mandatory: normalized count matrix.
    counts_data = 'hgnc_symbol',                     # defines the gene annotation in counts matrix.
#     microenvs_file_path = microenvs_file_path,       # optional (default: None): defines cells per microenvironment.
    iterations = 1000,                               # denotes the number of shufflings performed in the analysis.
    threshold = 0.1,                                 # defines the min % of cells expressing a gene for this to be employed in the analysis.
    threads = 4,                                     # number of threads to use in the analysis.
    debug_seed = 42,                                 # debug random seed. To disable >=0.
    result_precision = 3,                            # Sets the rounding for the mean values in significant_means.
    pvalue = 0.05,                                   # P-value threshold to employ for significance.
    subsampling = False,                             # To enable subsampling the data (geometric sketching).
    subsampling_log = False,                         # (mandatory) enable subsampling log1p for non log-transformed data inputs.
    subsampling_num_pc = 100,                        # Number of components to subsample via geometric sketching (default: 100).
    subsampling_num_cells = 1000,                    # Number of cells to subsample (integer) (default: 1/3 of the dataset).
    separator = '|',                                 # Sets the string to employ to separate cells in the results dataframes "cellA|CellB".
    debug = False,                                   # Saves all intermediate tables employed during the analysis in pkl format.
    output_path = out_path,                          # Path to save results.
    output_suffix = None                             # Replaces the timestamp in the output files with a user-defined string (default: None).
    )

And I got this error:

ParseCountsException: Invalid Counts data
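One way to narrow down where this error comes from is to compare the cell barcodes and gene names stored in the h5ad file with the barcodes listed in metadata.tsv. The sketch below is not part of CellPhoneDB, and the three checks are assumptions about what commonly goes wrong after a Seurat conversion, not a confirmed diagnosis. The function names are hypothetical, and `load_h5ad_names` assumes the `anndata` package is installed.

```python
import csv
import io


def read_meta_barcodes(meta_tsv_text):
    """Return the first-column values (cell barcodes) of a metadata TSV, skipping the header."""
    reader = csv.reader(io.StringIO(meta_tsv_text), delimiter="\t")
    next(reader, None)  # skip the header row
    return [row[0] for row in reader if row]


def check_inputs(count_barcodes, meta_barcodes, gene_names):
    """Return a list of readable problems; an empty list means none of the checks fired."""
    problems = []
    # Every barcode in the metadata should also exist in the counts matrix.
    missing = sorted(set(meta_barcodes) - set(count_barcodes))
    if missing:
        problems.append(
            f"{len(missing)} metadata barcode(s) not in counts, e.g. {missing[0]}"
        )
    # Duplicated gene names make the gene index ambiguous.
    if len(set(gene_names)) != len(gene_names):
        problems.append("gene names are not unique")
    # A conversion that lost the feature names may leave only integer positions.
    if gene_names and all(str(g).isdigit() for g in gene_names):
        problems.append("gene names look like integer positions, not symbols")
    return problems


def load_h5ad_names(path):
    """Assumption: anndata is installed. Returns (cell barcodes, gene names)."""
    import anndata  # imported lazily so the checks above run without it

    adata = anndata.read_h5ad(path)
    return list(adata.obs_names), list(adata.var_names)


# Tiny in-memory demo with made-up barcodes and genes.
meta_text = "barcode_sample\tcell_type\nAAAC-1\tT cell\nAAAG-1\tB cell\n"
print(check_inputs(["AAAC-1"], read_meta_barcodes(meta_text), ["CD3E", "CD19"]))
```

With real files, the barcodes and gene names would come from `load_h5ad_names(counts_file_path)` and the metadata text from reading `meta_file_path`.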

Here's how I generated the expression matrix file in RStudio:


library(SeuratDisk)
library(Seurat)

SaveH5Seurat(pbmc3k.final, filename = "pbmc3k.h5Seurat")
Convert("pbmc3k.h5Seurat", dest = "h5ad",assay = 'RNA')

I am not sure whether something is wrong with the code I used to generate the h5ad file. If so, could you please provide R code showing how to generate that file?

I browsed through all the issues and found no similar questions or answers. I hope you have time to tell me how to generate an h5ad file from a Seurat object in R, and hopefully this will also help people who encounter the same problem later. Thanks a lot.

My platform: macOS 13.3, Apple M1 Ultra, R version 4.3.0, Python 3.8

Best regards

Yewen Zhu

ktroule (Collaborator) commented Jul 3, 2023

Hi.

We have this notebook on how to convert your Seurat object for CellPhoneDB. Another option is to use sceasy to convert from Seurat to Scanpy.

Kind regards

luzgaral (Contributor) commented Jul 6, 2023

Hi,

Also, you can use the strategies mentioned here (how-to-extract-the-cellphonedb-input-files-from-a-seurat-object) to convert Seurat objects into formats accepted by CellPhoneDB.
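As a rough illustration of what such input files can look like once the barcodes, cell-type labels, and normalized values have been exported from the Seurat object, here is a minimal sketch that writes a two-column metadata TSV and a genes-by-cells text table. The function names are hypothetical, the header names `barcode_sample`, `cell_type`, and `Gene` are assumptions modeled on the tutorial data, and whether your CellPhoneDB version accepts a plain-text counts table should be checked against its documentation.

```python
import csv


def write_cpdb_meta(path, cell_labels):
    """Write a two-column TSV: cell barcode, then cell-type label.

    The header names are an assumption, not a documented requirement.
    """
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["barcode_sample", "cell_type"])
        for barcode, label in cell_labels.items():
            writer.writerow([barcode, label])


def write_counts_txt(path, genes, barcodes, matrix):
    """Write a genes-by-cells table: a header row of barcodes, then one row per gene."""
    if len(matrix) != len(genes) or any(len(row) != len(barcodes) for row in matrix):
        raise ValueError("matrix shape must be len(genes) x len(barcodes)")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["Gene", *barcodes])
        for gene, row in zip(genes, matrix):
            writer.writerow([gene, *row])
```

Writing both files from the same barcode list keeps the metadata and the counts columns in agreement, which is the property the conversion strategies aim for.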

Best

Luz
