pp.normalize_geometric(protein) #1208

shendong124 · 2020-05-12T02:00:29Z

Hi Scanpy team,
I am trying to analyse CITE-seq data. At the normalization step for the protein data, the attribute normalize_geometric is not recognized. Could this be a version issue?
Thank you for your help!

sc.pp.normalize_geometric(protein)

```pytb
...AttributeError                            Traceback (most recent call last)
<ipython-input-80-db93ca6d0f1d> in <module>
----> 1 sc.pp.normalize_geometric(protein)

AttributeError: module 'scanpy.preprocessing' has no attribute 'normalize_geometric'
```

Versions:

scanpy==1.4.7.dev30+g668b6776 anndata==0.7.1 umap==0.3.10 numpy==1.16.2 scipy==1.3.0 pandas==0.24.2 scikit-learn==0.22.2.post1 statsmodels==0.10.1 python-igraph==0.7.1 louvain==0.6.1

LuckyMD · 2020-05-12T10:58:40Z

Hi, I'm not sure where you found the function normalize_geometric(), but Scanpy's inbuilt normalization is called sc.pp.normalize_total(). You can find the documentation here:
https://scanpy.readthedocs.io/en/stable/api/scanpy.pp.normalize_total.html
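For readers comparing the two approaches, here is a minimal NumPy sketch of what total-count normalization does conceptually: each cell (row) is rescaled so that its counts sum to the same value. The helper name `total_count_normalize`, the default `target_sum`, and the handling of all-zero rows are assumptions made for illustration; this is not scanpy's implementation.

```python
import numpy as np


def total_count_normalize(counts, target_sum=1e4):
    """Scale each row (cell) so its counts sum to target_sum.

    Illustrative helper only, not part of scanpy.
    """
    counts = np.asarray(counts, dtype=np.float64)
    totals = counts.sum(axis=1, keepdims=True)
    # Assumption: leave all-zero rows as zeros instead of dividing by zero.
    totals[totals == 0] = 1.0
    return counts / totals * target_sum


if __name__ == "__main__":
    toy = np.array([[1, 3], [0, 0], [5, 5]])
    print(total_count_normalize(toy).sum(axis=1))
```

Note that this scales by the arithmetic total of each cell, which is different from the geometric-mean-based scaling that the name normalize_geometric suggests.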

shendong124 · 2020-05-12T16:21:09Z

I found it in the CITE-seq tutorial: https://scanpy-tutorials.readthedocs.io/en/multiomics/cite-seq/pbmc5k.html

LuckyMD · 2020-05-12T17:44:35Z

@ivirshup where did you get this function from?

maximz · 2020-06-03T22:34:13Z

@shendong124 @ivirshup I assume normalize_geometric was intended to be similar to Seurat's centered log ratio transformation, which is implemented as follows in R: `log1p(x = x / (exp(x = sum(log1p(x = x[x > 0]), na.rm = TRUE) / length(x = x))))`. This is CLR with some safeguards for 0 counts.
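Written out as a formula, the R expression above reads as follows for a count vector $x = (x_1, \dots, x_n)$, where $n$ is the full length of the vector (zeros included). The symbols are my own notation, not Seurat's:

$$
\mathrm{CLR}(x)_i = \log\!\left(1 + \frac{x_i}{\exp\!\left(\frac{1}{n}\sum_{j:\,x_j > 0} \log(1 + x_j)\right)}\right)
$$

For comparison, the textbook CLR is $\log\big(x_i / g(x)\big)$ with $g(x)$ the geometric mean of $x$, which is undefined when any $x_i = 0$. Replacing $\log$ with $\log(1 + \cdot)$ in both places is what keeps the expression finite for zero counts.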

Here's a reimplementation of the Seurat CLR transformation for scanpy. Call this with clr_normalize_each_cell(adata):

```python
def clr_normalize_each_cell(adata, inplace=True):
    """Normalize count vector for each cell, i.e. for each row of .X"""

    import numpy as np
    from scipy import sparse  # import the submodule explicitly

    def seurat_clr(x):
        # TODO: support sparseness
        s = np.sum(np.log1p(x[x > 0]))
        exp = np.exp(s / len(x))
        return np.log1p(x / exp)

    if not inplace:
        adata = adata.copy()

    # apply to dense or sparse matrix, along axis. returns dense matrix
    # (.toarray() is used instead of .A, which newer SciPy versions no longer provide)
    X = adata.X.toarray() if sparse.issparse(adata.X) else np.asarray(adata.X)
    adata.X = np.apply_along_axis(seurat_clr, 1, X)
    return adata
```
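Regarding the "support sparseness" TODO: since `log1p(0) == 0` and each row is only divided by a positive scalar, zeros stay zeros, so the transform can in principle keep a sparse matrix sparse. Below is a rough sketch of that idea, assuming non-negative counts. The helper name `seurat_clr_rows` is hypothetical and not part of scanpy or Seurat.

```python
import numpy as np
from scipy import sparse


def seurat_clr_rows(X):
    """Row-wise Seurat-style CLR for a dense array or SciPy sparse matrix.

    Hypothetical helper, assumes non-negative counts.
    Returns a sparse matrix for sparse input, a dense array otherwise.
    """
    if sparse.issparse(X):
        X = sparse.csr_matrix(X, dtype=np.float64)
        # log1p(0) == 0, so summing over stored entries matches
        # summing log1p over the positive entries of each row.
        log_sums = np.asarray(X.log1p().sum(axis=1)).ravel()
        scale = np.exp(log_sums / X.shape[1])
        # Dividing row i by scale[i] keeps zeros at zero, so sparsity is preserved.
        return sparse.diags(1.0 / scale).dot(X).tocsr().log1p()
    X = np.asarray(X, dtype=np.float64)
    scale = np.exp(np.log1p(X).sum(axis=1) / X.shape[1])
    return np.log1p(X / scale[:, None])


if __name__ == "__main__":
    toy = np.array([[1, 0, 3], [0, 0, 0], [2, 2, 2]])
    print(seurat_clr_rows(toy))
    print(seurat_clr_rows(sparse.csr_matrix(toy)).toarray())
```

With an AnnData object this could presumably be used as `adata.X = seurat_clr_rows(adata.X)`, but I have only checked it against the row-wise function above on small toy matrices.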

maximz · 2020-06-03T22:38:31Z

Actually there's a nice ongoing thread about this at #1117

andreas-wilm · 2021-05-20T03:36:25Z

normalize_geometric() is still mentioned in the tutorial at https://scanpy-tutorials.readthedocs.io/en/multiomics/cite-seq/pbmc5k.html

shendong124 added the Bug 🐛 label May 12, 2020

LuckyMD closed this as completed May 12, 2020
