CLR normalization and scaling #1268

benslack19 · 2019-03-22T23:05:35Z

Hi Seurat team,

I'm using multi-processing output from CITE-seq-Count as input to Seurat v3. I applied CLR normalization to my ADT counts, and I have a question about how the noisy, low-signal values are distributed afterwards. From left to right in the image below, I plotted the distribution of the raw counts, the CLR-normalized values from Seurat, and the values from a manually calculated CLR.

You can see that the raw counts are heavily right-skewed, which is not a huge surprise. The Seurat version makes all values positive and retains some right skew, ranging from about 0 to 10 (estimating from the x-axis of the plot). However, antibodies with low signal (the majority) get fattened in this transformation. This is okay for the very highly expressed antibodies, but a real yet more subtle positive signal would get lost in the noise. The manually calculated CLR has a similar range to the Seurat normalization, but its noise distribution is thinner, which allows positive values in the right tail to come out. (Some values are less than 0, but I think that's okay.)

It appears that Seurat applies a scaling factor that raises the noise level of antibody signals that would otherwise be low in the manually calculated CLR normalization. Can you provide more insight into what the scaling factor is doing, and possibly comment on our data? (I can't see what the normalization function is doing.) Your insights would be much appreciated.

Thanks,
Ben

satijalab · 2019-04-26T23:04:30Z

What function are you using to calculate CLR?

satijalab · 2019-05-10T15:50:03Z

I'm closing this issue now as we have not heard back, but please note that you can see the exact code that we use when running CLR normalization in the NormalizeData.default function.

massonix · 2020-10-09T09:53:59Z

Hi,

I am also concerned about the way CITE-seq data are normalized. Here you can see how the scanpy team is tackling the issue, although they do not provide much detail. Do you have any updates on the current best practices?

Thanks a lot!

ttriche · 2021-07-06T15:34:56Z

It looks like the inverse used for log1p is exp instead of expm1, which is going to cause problems eventually:

clr_function <- function(x) {
  return(log1p(x = x / (exp(x = sum(log1p(x = x[x > 0]), na.rm = TRUE) / length(x = x)))))
}
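To make the quoted function concrete, here is a minimal sketch on made-up counts. It shows that the denominator is the geometric mean of `counts + 1` (zeros add nothing to the sum but still count toward `length(x)`), so every output is nonnegative. For contrast it also includes a textbook centered log-ratio with a pseudocount of 1. The `counts` vector and the `classic_clr` helper are assumptions for illustration only; the thread never states how the "manual" CLR in the opening post was computed.

```r
# Stock function as quoted above, reformatted.
clr_function <- function(x) {
  log1p(x / exp(sum(log1p(x[x > 0]), na.rm = TRUE) / length(x)))
}

# Hypothetical helper (not from Seurat): textbook CLR with a pseudocount of 1,
# i.e. log((x + 1) / geometric_mean(x + 1)).
classic_clr <- function(x) {
  lx <- log1p(x)
  lx - mean(lx)
}

# Made-up ADT counts, for illustration only.
counts <- c(0, 1, 3, 10, 200)

# Denominator used by the stock function: geometric mean of (counts + 1).
denom <- exp(mean(log1p(counts)))

stock   <- clr_function(counts)
classic <- classic_clr(counts)

print(round(stock, 3))    # all values >= 0
print(round(classic, 3))  # centered on 0, so low counts come out negative
```

Under these assumptions, the stock version compresses low counts into a band just above zero, while the textbook version spreads them below zero. That may account for part of the difference described in the opening post, but it is only a guess without knowing the manual method.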

ttriche · 2021-07-06T15:56:26Z

An invertible version of the Seurat CLR is as follows:

cl1pr <- function(x) {
  log1p(x=x/(expm1(x=sum(log1p(x=x[x > 0]), na.rm=TRUE)/length(x=x))))
}

This gives roughly the same results as the stock Seurat CLR but has the benefit of being invertible if needed, given ADT counts.
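A quick way to check the "roughly the same" claim is to run both functions on the same vector. The sketch below uses made-up counts (an assumption, not data from this thread). The two denominators differ by exactly 1, because `expm1(m)` equals `exp(m) - 1`, so the outputs are close whenever the geometric mean of `x + 1` is large. One edge case worth knowing about: for an all-zero vector the `expm1` denominator is 0, so `cl1pr` returns `NaN` where the stock function returns 0.

```r
# Stock Seurat CLR and the expm1 variant, both as quoted in this thread.
clr_function <- function(x) {
  log1p(x / exp(sum(log1p(x[x > 0]), na.rm = TRUE) / length(x)))
}
cl1pr <- function(x) {
  log1p(x = x / (expm1(x = sum(log1p(x = x[x > 0]), na.rm = TRUE) / length(x = x))))
}

# Made-up ADT counts, for illustration only.
counts <- c(5, 20, 100, 400, 1500)

# Mean of log1p(counts); the two functions exponentiate it differently.
m <- mean(log1p(counts))
denom_gap <- exp(m) - expm1(m)   # difference between the two denominators

# Largest elementwise difference between the two normalizations.
max_diff <- max(abs(clr_function(counts) - cl1pr(counts)))
print(denom_gap)
print(max_diff)

# Edge case: every count is zero.
zeros <- c(0, 0, 0)
print(clr_function(zeros))   # zeros stay at 0
print(cl1pr(zeros))          # 0 / 0 inside log1p gives NaN
```

If all-zero features or cells can occur in the input, the variant would need a guard for that case before being used as a drop-in replacement.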

ttriche · 2021-07-06T16:24:07Z

Note that the corrected nonnegative CLR is roughly interchangeable with stock (plotted on some dendritic cells).

mojaveazure added the more-information-needed label ("We need more information before this can be addressed") Apr 26, 2019

satijalab closed this as completed May 10, 2019

scharch mentioned this issue Feb 16, 2020: CLR transform appears incorrect #2624 (Closed)

gtca mentioned this issue Jun 17, 2021: Feat / seurat flavored clr scverse/muon#28 (Merged)
