Scanpy Integration Not Working with 1.3.0 #119

Closed
dgodovich opened this issue Sep 13, 2023 · 2 comments

Comments

@dgodovich

Hello,

I recently updated palantir to the latest release (1.3.0) using pip install -U palantir and found that my previous notebooks do not work. I was using the scanpy integration through sc.external.tl.palantir(adata) and now get an error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[56], line 1
----> 1 sc.external.tl.palantir(adata, 
      2                         n_components=20, 
      3                         knn=30)

File ~/miniconda3/envs/sc_analysis/lib/python3.8/site-packages/scanpy/external/tl/_palantir.py:209, in palantir(adata, n_components, knn, alpha, use_adjacency_matrix, distances_key, n_eigs, impute_data, n_steps, copy)
    206     df = pd.DataFrame(adata.obsm['X_pca'], index=adata.obs_names)
    208 # Diffusion maps
--> 209 dm_res = run_diffusion_maps(
    210     data_df=df,
    211     n_components=n_components,
    212     knn=knn,
    213     alpha=alpha,
    214 )
    215 # Determine the multi scale space of the data
    216 ms_data = determine_multiscale_space(dm_res=dm_res, n_eigs=n_eigs)

TypeError: run_diffusion_maps() got an unexpected keyword argument 'data_df'

If I try to recalculate palantir results with already found diffusion maps, I get a similar error:

pr_res = sc.external.tl.palantir_results(adata, early_cell=start_cell, 
                                         ms_data = 'X_palantir_multiscale', num_waypoints=1000)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[8], line 1
----> 1 pr_res = sc.external.tl.palantir_results(adata, early_cell = start_cell, 
      2                                          ms_data='X_palantir_multiscale', num_waypoints=1000)

File ~/miniconda3/envs/sc_analysis/lib/python3.8/site-packages/scanpy/external/tl/_palantir.py:294, in palantir_results(adata, early_cell, ms_data, terminal_states, knn, num_waypoints, n_jobs, scale_components, use_early_cell_as_start, max_iterations)
    291 from palantir.core import run_palantir
    293 ms_data = pd.DataFrame(adata.obsm[ms_data], index=adata.obs_names)
--> 294 pr_res = run_palantir(
    295     ms_data=ms_data,
    296     early_cell=early_cell,
    297     terminal_states=terminal_states,
    298     knn=knn,
    299     num_waypoints=num_waypoints,
    300     n_jobs=n_jobs,
    301     scale_components=scale_components,
    302     use_early_cell_as_start=use_early_cell_as_start,
    303     max_iterations=max_iterations,
    304 )
    306 return pr_res

TypeError: run_palantir() got an unexpected keyword argument 'ms_data'

I am using the most recent scanpy release 1.9.4. Please let me know if you need any additional information.
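For a report like this, the exact installed versions can be collected with a short snippet. The following is a minimal sketch using only the standard library; the distribution names `palantir` and `scanpy` are taken from the pip command and release mentioned above, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages=("palantir", "scanpy")):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output into an issue makes it easier to tell whether a wrapper and the library it calls are out of step.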

Thank you!

@katosh
Collaborator

katosh commented Sep 14, 2023

Hello @dgodovich,

Thank you for bringing this to our attention. Given the deprecation of scanpy.external, we've aligned Palantir's functionality to be similar to the Scanpy wrapper for ease of transition.
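Both tracebacks in the report have the same shape: the Scanpy wrapper passes a keyword name (`data_df`, `ms_data`) that the installed Palantir function does not accept. One way to check this before running an analysis is to inspect the function signature. This is a minimal sketch; `accepts_keyword` is a hypothetical helper, and `run_diffusion_maps_stand_in` is an illustrative stand-in, not Palantir's real signature.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in for a function whose first parameter was renamed.
def run_diffusion_maps_stand_in(data, n_components=10, knn=30):
    return {"n_components": n_components, "knn": knn}


print(accepts_keyword(run_diffusion_maps_stand_in, "knn"))      # keyword still accepted
print(accepts_keyword(run_diffusion_maps_stand_in, "data_df"))  # old keyword name
```

Applied to the real library, `accepts_keyword(palantir.utils.run_diffusion_maps, "data_df")` would be expected to return `False` on a version that raises the error shown above.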

Default Approach:

Here is how we suggest running the diffusion maps and Palantir analysis:

import palantir

# Diffusion Maps
palantir.utils.run_diffusion_maps(adata, n_components=20, knn=30)

# Multiscale Space
palantir.utils.determine_multiscale_space(adata)

# Run Palantir
palantir.core.run_palantir(adata, early_cell=start_cell, num_waypoints=1000)

Key Differences:

  1. By default, Palantir stores its results under different adata.obsm key names than the Scanpy wrapper did.
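To see which keys a given call actually writes, the key names can be compared before and after running it. This is a minimal sketch, assuming only that each slot (`obs`, `obsm`, `obsp`, `uns`) exposes a `.keys()` method, as AnnData's containers do; `snapshot_keys` and `added_keys` are hypothetical helper names.

```python
SLOTS = ("obs", "obsm", "obsp", "uns")


def snapshot_keys(adata):
    """Record the key names currently stored in each slot."""
    return {slot: set(getattr(adata, slot).keys()) for slot in SLOTS}


def added_keys(before, after):
    """Return {slot: sorted new keys} for every slot that gained keys."""
    changes = {}
    for slot in SLOTS:
        new = after[slot] - before[slot]
        if new:
            changes[slot] = sorted(new)
    return changes


# Intended usage with a real AnnData object (not executed here):
# before = snapshot_keys(adata)
# palantir.utils.run_diffusion_maps(adata, n_components=20, knn=30)
# print(added_keys(before, snapshot_keys(adata)))
```

This avoids relying on remembered default names, which may change between releases.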

For Scanpy Naming Scheme:

If you prefer Scanpy's naming scheme, you can specify the keys explicitly:

# Diffusion Maps
palantir.utils.run_diffusion_maps(
    adata, 
    n_components=20, 
    knn=30, 
    eigvec_key="X_palantir_diff_comp",
    eigval_key="palantir_EigenValues",
    sim_key="palantir_diff_op"
)

# Multiscale Space
palantir.utils.determine_multiscale_space(
    adata,
    eigvec_key="X_palantir_diff_comp",
    out_key="X_palantir_multiscale"
)

# Run Palantir
pr_res = palantir.core.run_palantir(
    adata,
    early_cell=start_cell,
    num_waypoints=1000,
    eigvec_key="X_palantir_multiscale"
)
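If this pattern is used in several notebooks, the three calls can be bundled so the key names are defined once. The following is a sketch under that assumption: the key names and keyword arguments are copied from the calls above, while `SCANPY_STYLE_KEYS` and `run_palantir_scanpy_names` are hypothetical names introduced for illustration.

```python
# Key names copied from the explicit calls above.
SCANPY_STYLE_KEYS = {
    "eigvec_key": "X_palantir_diff_comp",
    "eigval_key": "palantir_EigenValues",
    "sim_key": "palantir_diff_op",
    "multiscale_key": "X_palantir_multiscale",
}


def run_palantir_scanpy_names(adata, early_cell, n_components=20, knn=30,
                              num_waypoints=1000):
    """Run diffusion maps, multiscale space, and Palantir with Scanpy-style keys."""
    import palantir  # imported lazily so the constants work without palantir

    keys = SCANPY_STYLE_KEYS
    palantir.utils.run_diffusion_maps(
        adata,
        n_components=n_components,
        knn=knn,
        eigvec_key=keys["eigvec_key"],
        eigval_key=keys["eigval_key"],
        sim_key=keys["sim_key"],
    )
    palantir.utils.determine_multiscale_space(
        adata,
        eigvec_key=keys["eigvec_key"],
        out_key=keys["multiscale_key"],
    )
    # run_palantir reads the multiscale space through its eigvec_key argument.
    return palantir.core.run_palantir(
        adata,
        early_cell=early_cell,
        num_waypoints=num_waypoints,
        eigvec_key=keys["multiscale_key"],
    )
```

A call such as `pr_res = run_palantir_scanpy_names(adata, early_cell=start_cell)` would then mirror the three steps above.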

Note:

In addition to returning the results, palantir.core.run_palantir now saves them in adata.obs, adata.obsm, and adata.uns under the following keys:

  • adata.obs["palantir_pseudotime"]
  • adata.obs["palantir_entropy"]
  • adata.obsm["palantir_fate_probabilities"]
  • adata.uns["palantir_waypoints"]
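A quick way to confirm that a run populated these entries is to test for each key. This is a minimal sketch: the key names are the four listed above, `missing_result_keys` is a hypothetical helper, and it assumes only that each slot supports the `in` operator for key lookup, as AnnData's `obs`, `obsm`, and `uns` do.

```python
# Result keys as listed above, grouped by the slot they are stored in.
EXPECTED_RESULT_KEYS = {
    "obs": ("palantir_pseudotime", "palantir_entropy"),
    "obsm": ("palantir_fate_probabilities",),
    "uns": ("palantir_waypoints",),
}


def missing_result_keys(adata):
    """Return a list like ["obs['palantir_entropy']"] of absent result keys."""
    missing = []
    for slot, keys in EXPECTED_RESULT_KEYS.items():
        container = getattr(adata, slot)
        missing.extend(f"{slot}[{key!r}]" for key in keys if key not in container)
    return missing


# Intended usage after palantir.core.run_palantir(adata, ...):
# assert not missing_result_keys(adata)
```

An empty list means all four entries are present; if the keys were renamed through keyword arguments, the expected names would need adjusting accordingly.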

@dgodovich
Author

I did not know scanpy was deprecating the external API. I followed the tutorial notebook and was able to reproduce my previous results with code very similar to what you provided here, so I have no issues.

I appreciate saving results in adata.obs and adata.obsm, as that saves a step later on. Generally this workflow is easier to understand as well.

My one note is that the documentation on your home page says that Palantir is fully integrated with scanpy, which is no longer the case.

Thank you for the comprehensive reply!
