[BUG] Lack of "params" entry in adata.uns["pca"] might break compatibility with scanpy #90

Closed
lorenzoamir opened this issue Nov 9, 2023 · 1 comment · Fixed by #91
Labels
bug Something isn't working

lorenzoamir commented Nov 9, 2023

Describe the bug
I had trouble using scanpy.tl.ingest to map observations from reference data to query data. The PCA of the reference data was computed with rsc.pp.pca (rapids_singlecell instead of scanpy). The problem seems to be the missing "params" entry in adata.uns['pca'], which scanpy normally creates when running scanpy.pp.pca().

Steps/Code to reproduce bug

import scanpy as sc
import rapids_singlecell as rsc

# Load example dataset
adata = sc.datasets.pbmc68k_reduced()

# Perform PCA with scanpy
sc.pp.pca(
    adata,
    n_comps = 30,
    zero_center = True,
    use_highly_variable = False
)

# Save scanpy entries (copied into a list so the snapshot is independent of adata.uns)
scanpy_keys = list(adata.uns['pca'].keys())

# Perform PCA with rapids_singlecell
rsc.utils.anndata_to_GPU(adata)
rsc.pp.pca(
    adata,
    n_comps = 30,
    zero_center = True,
    use_highly_variable = False
)

# Save rapids_singlecell entries (copied into a list so the snapshot is independent of adata.uns)
rapids_keys = list(adata.uns['pca'].keys())

print(f"scanpy: {scanpy_keys}")
print(f"rapids: {rapids_keys}")

Expected behavior
The entries in adata.uns['pca'] should be the same after running sc.pp.pca and rsc.pp.pca.
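
The comparison made by the reproduction script can be stated as a set difference. The sketch below does not run either library: the two key lists are hypothetical stand-ins based on this issue's description ("params" present after scanpy, only "variance" and "variance_ratio" after rapids_singlecell), and `missing_keys` is an illustrative helper, not part of scanpy or rapids_singlecell.

```python
# Hypothetical key lists standing in for the two snapshots taken in the
# reproduction script; they follow the issue's description, not a captured run.
scanpy_keys = ["params", "variance", "variance_ratio"]
rapids_keys = ["variance", "variance_ratio"]


def missing_keys(reference, candidate):
    """Return the keys present in `reference` but absent from `candidate`, sorted."""
    return sorted(set(reference) - set(candidate))


# Keys that scanpy writes but rapids_singlecell does not
missing = missing_keys(scanpy_keys, rapids_keys)
print(f"missing from rapids_singlecell: {missing}")
```

In the real script, the same helper could be called on the two snapshots of adata.uns['pca'].keys() instead of the hard-coded lists.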

Environment details

I am currently unable to provide hardware information for the machine I run the code on.

  • Method of Rapids install: pip
  • Output of pip list:
Package                       Version
----------------------------- ---------------------
absl-py                       2.0.0
aiohttp                       3.8.5
aiosignal                     1.3.1
anndata                       0.10.0rc1
annotated-types               0.5.0
anyio                         3.7.1
argon2-cffi                   23.1.0
argon2-cffi-bindings          21.2.0
array-api-compat              1.4
arrow                         1.3.0
asttokens                     2.4.0
async-timeout                 4.0.3
attrs                         23.1.0
backcall                      0.2.0
backoff                       2.2.1
backports.functools-lru-cache 1.6.5
beautifulsoup4                4.12.2
bleach                        6.0.0
blessed                       1.20.0
bokeh                         3.2.2
cachetools                    5.3.1
certifi                       2023.7.22
cffi                          1.16.0
charset-normalizer            3.3.0
chex                          0.1.7
click                         8.1.7
click-plugins                 1.1.1
cligj                         0.7.2
cloudpickle                   2.2.1
colorcet                      3.0.1
comm                          0.1.4
contextlib2                   21.6.0
contourpy                     1.1.1
croniter                      1.4.1
cubinlinker-cu11              0.3.0.post1
cucim                         23.8.0
cuda-python                   11.8.2
cudf-cu11                     23.8.0
cugraph-cu11                  23.8.0
cuml-cu11                     23.8.0
cuproj-cu11                   23.8.1
cupy-cuda11x                  12.2.0
cuspatial-cu11                23.8.1
cuxfilter-cu11                23.8.2
cycler                        0.12.0
Cython                        3.0.2
dask                          2023.7.1
dask-cuda                     23.8.0
dask-cudf-cu11                23.8.0
datashader                    0.15.2
datashape                     0.5.2
dateutils                     0.6.12
debugpy                       1.8.0
decorator                     5.1.1
decoupler                     1.5.0
deepdiff                      6.6.0
defusedxml                    0.7.1
distributed                   2023.7.1
dm-tree                       0.1.8
docrep                        0.3.2
etils                         1.5.0
exceptiongroup                1.1.3
executing                     1.2.0
fastapi                       0.103.2
fastjsonschema                2.18.1
fastrlock                     0.8.2
filelock                      3.9.0
Fiona                         1.9.4.post1
flax                          0.7.4
fonttools                     4.43.0
fqdn                          1.5.1
frozenlist                    1.4.0
fsspec                        2023.9.2
geopandas                     0.14.0
h11                           0.14.0
h5py                          3.9.0
holoviews                     1.17.1
idna                          3.4
igraph                        0.10.8
importlib-metadata            6.8.0
importlib-resources           6.1.0
inquirer                      3.1.3
ipykernel                     6.25.2
ipython                       8.16.1
isoduration                   20.11.0
itsdangerous                  2.1.2
jax                           0.4.17
jaxlib                        0.4.17+cuda11.cudnn86
jedi                          0.19.1
Jinja2                        3.1.2
joblib                        1.3.2
jsonpointer                   2.4
jsonschema                    4.19.1
jsonschema-specifications     2023.7.1
jupyter_client                8.3.1
jupyter_core                  5.3.2
jupyter-events                0.7.0
jupyter_server                2.7.3
jupyter_server_proxy          4.1.0
jupyter_server_terminals      0.4.4
jupyterlab-pygments           0.2.2
kiwisolver                    1.4.5
lazy_loader                   0.3
leidenalg                     0.10.1
lightning                     2.0.9.post0
lightning-cloud               0.5.39
lightning-utilities           0.9.0
linkify-it-py                 2.0.2
llvmlite                      0.41.0
locket                        1.0.0
Markdown                      3.4.4
markdown-it-py                3.0.0
MarkupSafe                    2.1.3
matplotlib                    3.8.0
matplotlib-inline             0.1.6
mdit-py-plugins               0.4.0
mdurl                         0.1.2
mistune                       3.0.2
ml-collections                0.1.1
ml-dtypes                     0.3.1
mpmath                        1.3.0
msgpack                       1.0.7
mudata                        0.2.3
multidict                     6.0.4
multipledispatch              1.0.0
natsort                       8.4.0
nbclient                      0.8.0
nbconvert                     7.9.2
nbformat                      5.9.2
nest-asyncio                  1.5.6
networkx                      3.1
numba                         0.58.0
numpy                         1.25.2
numpyro                       0.13.2
nvidia-cublas-cu11            2022.4.8
nvidia-cublas-cu117           11.10.1.25
nvidia-cuda-cupti-cu11        2022.4.8
nvidia-cuda-cupti-cu117       11.7.50
nvidia-cuda-nvcc-cu11         2022.5.4
nvidia-cuda-nvcc-cu117        11.7.64
nvidia-cuda-runtime-cu11      2022.4.25
nvidia-cuda-runtime-cu117     11.7.60
nvidia-cudnn-cu11             2022.5.19
nvidia-cudnn-cu116            8.4.0.27
nvidia-cufft-cu11             2022.4.8
nvidia-cufft-cu117            10.7.2.50
nvidia-cusolver-cu11          2022.4.8
nvidia-cusolver-cu117         11.3.5.50
nvidia-cusparse-cu11          2022.4.8
nvidia-cusparse-cu117         11.7.3.50
nvtx                          0.2.8
opt-einsum                    3.3.0
optax                         0.1.7
orbax-checkpoint              0.4.1
ordered-set                   4.1.0
overrides                     7.4.0
packaging                     23.2
pandas                        1.5.3
pandocfilters                 1.5.0
panel                         1.2.3
param                         1.13.0
parso                         0.8.3
partd                         1.4.1
patsy                         0.5.3
pexpect                       4.8.0
pickleshare                   0.7.5
Pillow                        10.0.1
pip                           23.2.1
platformdirs                  3.11.0
prometheus-client             0.17.1
prompt-toolkit                3.0.39
protobuf                      4.24.4
psutil                        5.9.5
ptxcompiler-cu11              0.7.0.post1
ptyprocess                    0.7.0
pure-eval                     0.2.2
pyarrow                       11.0.0
pycparser                     2.21
pyct                          0.5.0
pydantic                      2.1.1
pydantic_core                 2.4.0
Pygments                      2.16.1
PyJWT                         2.8.0
pylibcugraph-cu11             23.8.0
pylibraft-cu11                23.8.0
pynndescent                   0.5.10
pynvml                        11.4.1
pyparsing                     3.1.1
pyproj                        3.6.1
pyro-api                      0.1.2
pyro-ppl                      1.8.6
python-dateutil               2.8.2
python-editor                 1.0.4
python-json-logger            2.0.7
python-multipart              0.0.6
pytorch-lightning             2.0.9.post0
pytz                          2023.3.post1
pyviz_comms                   3.0.0
PyYAML                        6.0.1
pyzmq                         25.1.1
raft-dask-cu11                23.8.0
rapids_singlecell             0.9.1
readchar                      4.0.5
referencing                   0.30.2
requests                      2.31.0
rfc3339-validator             0.1.4
rfc3986-validator             0.1.1
rich                          13.6.0
rmm-cu11                      23.8.0
rpds-py                       0.10.3
scanpy                        1.9.5
scikit-learn                  1.3.1
scikit-misc                   0.3.0
scipy                         1.11.3
scvi-tools                    1.0.3
seaborn                       0.13.0
Send2Trash                    1.8.2
session-info                  1.0.0
setuptools                    68.2.2
shapely                       2.0.1
simpervisor                   1.0.0
six                           1.16.0
sniffio                       1.3.0
sortedcontainers              2.4.0
soupsieve                     2.5
sparse                        0.14.0
stack-data                    0.6.2
starlette                     0.27.0
starsessions                  1.3.0
statsmodels                   0.14.0
stdlib-list                   0.9.0
sympy                         1.12
tbb                           2021.10.0
tblib                         2.0.0
tensorstore                   0.1.45
terminado                     0.17.1
texttable                     1.7.0
threadpoolctl                 3.2.0
tinycss2                      1.2.1
toolz                         0.12.0
torch                         2.1.0+cu118
torchaudio                    2.1.0+cu118
torchmetrics                  1.2.0
torchvision                   0.16.0+cu118
tornado                       6.3.3
tqdm                          4.66.1
traitlets                     5.11.2
treelite                      3.2.0
treelite-runtime              3.2.0
triton                        2.1.0
types-python-dateutil         2.8.19.14
typing_extensions             4.8.0
uc-micro-py                   1.0.2
ucx-py-cu11                   0.33.0
umap-learn                    0.5.4
uri-template                  1.3.0
urllib3                       2.0.6
uvicorn                       0.23.2
wcwidth                       0.2.8
webcolors                     1.13
webencodings                  0.5.1
websocket-client              1.6.3
websockets                    11.0.3
wheel                         0.41.2
xarray                        2023.9.0
xyzservices                   2023.10.0
yarl                          1.9.2
zict                          3.0.0
zipp                          3.17.0

Additional context
I believe this could be solved by changing lines 136-139 in _pca.py from

    adata.uns["pca"] = {
        "variance": pca_func.explained_variance_,
        "variance_ratio": pca_func.explained_variance_ratio_,
    }

to something like:

    adata.uns["pca"] = {
        "params": {
            "zero_center": zero_center,
            "use_highly_variable": use_highly_variable
        },
        "variance": pca_func.explained_variance_,
        "variance_ratio": pca_func.explained_variance_ratio_,
    }
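
Until a fixed release is available, a possible user-side workaround is to add the missing entry by hand after running rsc.pp.pca. This is a minimal sketch under assumptions: `ensure_pca_params` is a hypothetical helper (not part of either library), it is shown on a plain dict standing in for adata.uns, and whether scanpy.tl.ingest needs anything beyond "params" has not been verified here.

```python
def ensure_pca_params(uns, zero_center=True, use_highly_variable=False):
    """Add a scanpy-style "params" entry to uns["pca"] if it is missing.

    `uns` is any dict-like mapping (a plain dict here; adata.uns is assumed
    to behave the same way). Existing entries are left untouched.
    """
    pca_info = uns.setdefault("pca", {})
    pca_info.setdefault(
        "params",
        {
            "zero_center": zero_center,
            "use_highly_variable": use_highly_variable,
        },
    )
    return uns


# Stand-in for adata.uns after rsc.pp.pca: only the variance entries exist.
# The numbers are placeholders, not real PCA output.
uns = {"pca": {"variance": [2.0, 1.0], "variance_ratio": [0.5, 0.25]}}
ensure_pca_params(uns, zero_center=True, use_highly_variable=False)
print(sorted(uns["pca"].keys()))
```

With a real AnnData object the call would be ensure_pca_params(adata.uns, ...), passing the same zero_center and use_highly_variable values that were given to rsc.pp.pca.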

Intron7 (Member) commented Nov 9, 2023

That should be an easy fix. I'll make a release early next week that fixes this.
