Some questions about normalization #104

Open
809825706 opened this issue Nov 22, 2022 · 0 comments

WOT is a great tool for time-course single-cell analysis! Thank you for developing it. I want to use it to analyze my reprogramming data too, but I am confused about the normalization steps in your paper. It seems that you normalized the data twice (before and after finding HVGs).

[Screenshot 2022-11-22 102721]

[Screenshot 2022-11-22 102741]

I followed your approach (as I understand it) using the code below:

import numpy as np
import pandas as pd
import scanpy as sc

adata = sc.read_h5ad("adata_filtered.h5ad")

adata.var_names_make_unique() 
sc.pp.filter_cells(adata, min_genes=200)
sc.pp.filter_genes(adata, min_cells=3)

sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
adata

adata.write("ExprMatrix.h5ad")

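# Select highly variable genes with scanpy (instead of Seurat)
sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5)
# .copy() turns the view into a standalone AnnData that can be modified in place
adata = adata[:, adata.var.highly_variable].copy()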

sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
adata

adata.write("ExprMatrix.var.genes.h5ad")

I would like to know whether the following choices are reasonable:

  1. I didn't downsample; it does not seem to be required.
  2. I used scanpy's HVG selection function in place of Seurat's.
  3. Last but not least, which data should the normalization after HVG selection be applied to: the raw counts, or the already normalized data as in my code above?
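
To make question 3 concrete, here is a minimal NumPy sketch of the two options as I understand them. The toy count matrix, the `hvg` mask, and the helper names `normalize_total` and `log_normalize` are made up for illustration (the helper only mimics what I assume `sc.pp.normalize_total` followed by `sc.pp.log1p` does). I do not know which option matches the paper.

```python
import numpy as np

def normalize_total(values, target_sum=1e4):
    """Scale each cell (row) so that its values sum to target_sum."""
    totals = values.sum(axis=1, keepdims=True)
    return values / totals * target_sum

def log_normalize(values, target_sum=1e4):
    """Total-count normalization followed by log(1 + x)."""
    return np.log1p(normalize_total(values, target_sum))

# Toy raw count matrix: 3 cells x 4 genes (made-up values).
raw = np.array([[10, 0, 5, 85],
                [2, 8, 0, 90],
                [40, 10, 30, 20]], dtype=float)

# Hypothetical HVG mask, standing in for adata.var.highly_variable.
hvg = np.array([True, True, True, False])

# Option A: subset the RAW counts to the HVGs, then normalize once.
option_a = log_normalize(raw[:, hvg])

# Option B (my code above): normalize all genes, subset to the HVGs,
# then normalize the already log-transformed values a second time.
option_b = log_normalize(log_normalize(raw)[:, hvg])

# The two options do not give the same matrix.
print("max abs difference:", np.abs(option_a - option_b).max())
```

With scanpy, option A would presumably require keeping the raw counts around before the first normalization, for example in `adata.layers["counts"]`, and re-normalizing from that layer after subsetting.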