pp.normalize_geometric(protein) #1208

Closed
shendong124 opened this issue May 12, 2020 · 6 comments

@shendong124

Hi Scanpy team,
I am trying to analyse CITE-seq data. At the normalization step for the protein data, the attribute normalize_geometric is not recognized. Could this be a version issue?
Thank you for your help!

sc.pp.normalize_geometric(protein)

```pytb
AttributeError                            Traceback (most recent call last)
<ipython-input-80-db93ca6d0f1d> in <module>
----> 1 sc.pp.normalize_geometric(protein)

AttributeError: module 'scanpy.preprocessing' has no attribute 'normalize_geometric'
```

Versions:

scanpy==1.4.7.dev30+g668b6776 anndata==0.7.1 umap==0.3.10 numpy==1.16.2 scipy==1.3.0 pandas==0.24.2 scikit-learn==0.22.2.post1 statsmodels==0.10.1 python-igraph==0.7.1 louvain==0.6.1

@LuckyMD
Contributor

LuckyMD commented May 12, 2020

Hi, I'm not sure where you found the function normalize_geometric(), but Scanpy's inbuilt normalization is called sc.pp.normalize_total(). You can find the documentation here:
https://scanpy.readthedocs.io/en/stable/api/scanpy.pp.normalize_total.html

@LuckyMD LuckyMD closed this as completed May 12, 2020
@shendong124
Author

shendong124 commented May 12, 2020 via email

@LuckyMD
Contributor

LuckyMD commented May 12, 2020

@ivirshup where did you get this function from?

@maximz
Contributor

maximz commented Jun 3, 2020

@shendong124 @ivirshup I assume normalize_geometric was intended to be similar to Seurat's centered log ratio (CLR) transformation, which is implemented as follows in R: `log1p(x = x / (exp(x = sum(log1p(x = x[x > 0]), na.rm = TRUE) / length(x = x))))`. This is CLR with some safeguards for zero counts.
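
To make the arithmetic in that R expression concrete, here is a minimal pure-Python sketch for a single cell's count vector. The name `seurat_clr_vector` and the example counts are made up for illustration; this is not Scanpy or Seurat API, and it assumes non-negative counts.

```python
import math


def seurat_clr_vector(counts):
    """Seurat-style CLR for one cell; `counts` is a list of non-negative numbers."""
    # Sum log1p over the positive entries only, but divide by the full length,
    # mirroring sum(log1p(x[x > 0])) / length(x) in the R expression.
    log_sum = sum(math.log1p(c) for c in counts if c > 0)
    scale = math.exp(log_sum / len(counts))
    # Zero counts stay exactly zero because log1p(0) == 0.
    return [math.log1p(c / scale) for c in counts]


# Hypothetical antibody counts for one cell.
print(seurat_clr_vector([0, 3, 10, 50]))
```

Because the divisor is the exponential of a mean of `log1p` values, it is always at least 1, so the division is safe even for a cell with no counts at all.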

Here's a reimplementation of the Seurat CLR transformation for scanpy. Call this with clr_normalize_each_cell(adata):

```python
import numpy as np
from scipy import sparse


def clr_normalize_each_cell(adata, inplace=True):
    """Normalize count vector for each cell, i.e. for each row of .X"""

    def seurat_clr(x):
        # TODO: support sparseness
        # Sum log1p over the positive counts, but divide by the full vector length.
        s = np.sum(np.log1p(x[x > 0]))
        scale = np.exp(s / len(x))
        return np.log1p(x / scale)

    if not inplace:
        adata = adata.copy()

    # Densify sparse input, then apply row by row; the result is a dense matrix.
    X = adata.X.toarray() if sparse.issparse(adata.X) else np.asarray(adata.X)
    adata.X = np.apply_along_axis(seurat_clr, 1, X)
    return adata
```
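
The function above densifies sparse input before normalizing. As a rough sketch of how the "support sparseness" TODO might be addressed, the variant below works row-wise on either a dense array or a SciPy sparse matrix and keeps sparse input sparse. The name `seurat_clr_rows` is hypothetical (it is not part of Scanpy), and the sketch assumes non-negative counts, so that `log1p` of a zero entry contributes nothing to the row sum.

```python
import numpy as np
from scipy import sparse


def seurat_clr_rows(X):
    """Seurat-style CLR per row of a dense array or SciPy sparse matrix.

    Hypothetical helper, not part of Scanpy. Assumes non-negative counts.
    """
    n_features = X.shape[1]
    if sparse.issparse(X):
        X = X.tocsr().astype(np.float64)
        # log1p only touches stored entries; implicit zeros contribute 0.
        logged = X.copy()
        logged.data = np.log1p(logged.data)
        row_log_sums = np.asarray(logged.sum(axis=1)).ravel()
        scale = np.exp(row_log_sums / n_features)
        # Divide each row by its scale via a diagonal matrix product.
        out = (sparse.diags(1.0 / scale) @ X).tocsr()
        out.data = np.log1p(out.data)
        return out
    X = np.asarray(X, dtype=np.float64)
    scale = np.exp(np.log1p(X).sum(axis=1, keepdims=True) / n_features)
    return np.log1p(X / scale)
```

With an AnnData object, something like `adata.X = seurat_clr_rows(adata.X)` should apply it in place of the row-by-row loop, though that has not been checked against Seurat's output here.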

@maximz
Contributor

maximz commented Jun 3, 2020

Actually there's a nice ongoing thread about this at #1117

@andreas-wilm

normalize_geometric() is still mentioned in the tutorial at https://scanpy-tutorials.readthedocs.io/en/multiomics/cite-seq/pbmc5k.html
