raw is also normalized with clr #126

Closed
GirayEryilmaz opened this issue Sep 9, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@GirayEryilmaz

When I save the raw protein counts in .raw and then apply clr, the raw counts are normalized as well.

Snippet:

import muon as mu

# pbmc is assumed to be an existing MuData object with an 'adt' modality
adt = pbmc.mod['adt'].copy()
adt.raw = adt

mu.prot.pp.clr(adt)

After this, I expect adt.X to be clr-normalized and adt.raw.X to stay the same. Yet adt.raw.X is also normalized.

@GirayEryilmaz added the bug label Sep 9, 2023
@gtca
Collaborator

gtca commented Sep 9, 2023

This seems to behave as expected for me:

import numpy as np
from anndata import AnnData
from muon import prot as pt

adata = AnnData(np.arange(1000).reshape(-1, 10))
adata.raw = adata

pt.pp.clr(adata)
print(adata.X[:2,:2])
# [[0.         0.00276899]
#  [0.02767339 0.03004515]]
print(adata.raw.X[:2,:2])
# [[ 0.  1.]
#  [10. 11.]]

Any idea how your observation can be reproduced?

@GirayEryilmaz
Author

GirayEryilmaz commented Sep 9, 2023

Very interesting!
Here is my attempt:

import muon as mu
import anndata
import numpy as np
import scanpy as sc  # needed for the version printout below

x = np.array([[10, 10],
              [10, 20]], dtype=float)
adata = anndata.AnnData(x)
adata.raw = adata

mu.prot.pp.clr(adata)
print(adata.X)
# [[0.64662716 0.50558292]
#  [0.64662716 0.83979984]]
print(adata.raw.X)
# [[0.64662716 0.50558292]
#  [0.64662716 0.83979984]]
print(mu.__version__) # 0.1.5
print(anndata.__version__) # 0.9.2
print(sc.__version__) # 1.9.3

I also noticed that passing inplace=False avoids the problem for me:

x = np.array([[10, 10],
              [10, 20]], dtype=float)
adata = anndata.AnnData(x)
adata.raw = adata

adata = mu.prot.pp.clr(adata, inplace=False)
print(adata.X)
# [[0.64662716 0.50558292]
#  [0.64662716 0.83979984]]
print(adata.raw.X)
# [[10. 10.]
#  [10. 20.]]

Still, I would like to understand what is going on and avoid an unnecessary copy. Any suggestions are welcome!

@GirayEryilmaz
Author

I found the source of the problem.

import anndata
import numpy as np

x = np.array([[10, 20]], dtype=float)
adata = anndata.AnnData(x)

adata.raw = adata

print(adata.X is adata.raw.X) # Prints True

For me, apparently, adata.X is the very same object as adata.raw.X. That is why raw.X also changes when clr normalizes X in place.
I don't know whether this is the intended behavior of raw, or why I am seeing this issue while @gtca is not.

@gtca, would you mind sharing which version of anndata you are using?
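
The aliasing described above can be illustrated with plain NumPy arrays, without anndata or muon. This is only a sketch: clr_inplace is a hypothetical stand-in whose formula is inferred from the numbers printed earlier in this thread (it is not taken from muon's source), and the names alias and snapshot are illustrative.

```python
import numpy as np


def clr_inplace(x: np.ndarray) -> None:
    """Hypothetical in-place CLR stand-in (not muon's implementation).

    Divides each column by exp(mean(log1p(column))) and applies log1p,
    writing the result back into the same buffer.
    """
    scale = np.exp(np.log1p(x).mean(axis=0))
    x[:] = np.log1p(x / scale)


counts = np.array([[10, 10],
                   [10, 20]], dtype=float)

alias = counts            # a second name for the same array, like X and raw.X here
snapshot = counts.copy()  # an independent buffer

clr_inplace(counts)

print(alias is counts)                      # True
print(np.shares_memory(snapshot, counts))   # False
print(alias)     # normalized, because it is the same object as counts
print(snapshot)  # still the original counts
```

Any in-place operation on one name is visible through every other name bound to the same buffer, which would explain why raw.X changes whenever X and raw.X are one object.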

@gtca
Collaborator

gtca commented Sep 12, 2023

Indeed, in anndata v0.8:

id(adata.raw.X) == id(adata.X)
# => False

whereas in anndata v0.9.2 as well as in anndata v0.10:

id(adata.raw.X) == id(adata.X)
# => True

A new issue in the anndata repo is probably a better place to track this: scverse/anndata#1139!

@gtca closed this as completed Sep 12, 2023