Scanpy Integration Not Working with 1.3.0 #119

dgodovich · 2023-09-13T22:53:21Z

Hello,

I recently updated palantir to the latest release (1.3.0) using pip install -U palantir and found that my previous notebooks do not work. I was using the scanpy integration through sc.external.tl.palantir(adata) and now get an error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[56], line 1
----> 1 sc.external.tl.palantir(adata, 
      2                         n_components=20, 
      3                         knn=30)

File ~/miniconda3/envs/sc_analysis/lib/python3.8/site-packages/scanpy/external/tl/_palantir.py:209, in palantir(adata, n_components, knn, alpha, use_adjacency_matrix, distances_key, n_eigs, impute_data, n_steps, copy)
    206     df = pd.DataFrame(adata.obsm['X_pca'], index=adata.obs_names)
    208 # Diffusion maps
--> 209 dm_res = run_diffusion_maps(
    210     data_df=df,
    211     n_components=n_components,
    212     knn=knn,
    213     alpha=alpha,
    214 )
    215 # Determine the multi scale space of the data
    216 ms_data = determine_multiscale_space(dm_res=dm_res, n_eigs=n_eigs)

TypeError: run_diffusion_maps() got an unexpected keyword argument 'data_df'

If I try to recalculate palantir results with already found diffusion maps, I get a similar error:

pr_res = sc.external.tl.palantir_results(adata, early_cell=start_cell, 
                                         ms_data = 'X_palantir_multiscale', num_waypoints=1000)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[8], line 1
----> 1 pr_res = sc.external.tl.palantir_results(adata, early_cell = start_cell, 
      2                                          ms_data='X_palantir_multiscale', num_waypoints=1000)

File ~/miniconda3/envs/sc_analysis/lib/python3.8/site-packages/scanpy/external/tl/_palantir.py:294, in palantir_results(adata, early_cell, ms_data, terminal_states, knn, num_waypoints, n_jobs, scale_components, use_early_cell_as_start, max_iterations)
    291 from palantir.core import run_palantir
    293 ms_data = pd.DataFrame(adata.obsm[ms_data], index=adata.obs_names)
--> 294 pr_res = run_palantir(
    295     ms_data=ms_data,
    296     early_cell=early_cell,
    297     terminal_states=terminal_states,
    298     knn=knn,
    299     num_waypoints=num_waypoints,
    300     n_jobs=n_jobs,
    301     scale_components=scale_components,
    302     use_early_cell_as_start=use_early_cell_as_start,
    303     max_iterations=max_iterations,
    304 )
    306 return pr_res

TypeError: run_palantir() got an unexpected keyword argument 'ms_data'
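Both tracebacks fail the same way: the Scanpy wrapper passes a keyword (`data_df`, `ms_data`) that the installed Palantir function no longer accepts. As a minimal sketch, using only the standard library, this kind of mismatch can be checked before calling. The two functions below are hypothetical stand-ins for the old and new calling conventions, not Palantir's actual code.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for the two calling conventions; not Palantir's real code.
def old_style(data_df, n_components=10, knn=30):
    return "old"


def new_style(data, n_components=10, knn=30):
    return "new"


print(accepts_keyword(old_style, "data_df"))  # True
print(accepts_keyword(new_style, "data_df"))  # False
```

Running the same check against the installed `run_diffusion_maps` or `run_palantir` would show whether the wrapper's keyword names still match.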

I am using the most recent scanpy release 1.9.4. Please let me know if you need any additional information.

Thank you!

katosh · 2023-09-14T18:00:47Z

Hello @dgodovich,

Thank you for bringing this to our attention. Given the deprecation of scanpy.external, we've aligned Palantir's functionality to be similar to the Scanpy wrapper for ease of transition.

Default Approach:

Here is how we suggest running your diffusion maps and Palantir analysis:

import palantir

# Diffusion Maps
palantir.utils.run_diffusion_maps(adata, n_components=20, knn=30)

# Multiscale Space
palantir.utils.determine_multiscale_space(adata)

# Run Palantir
palantir.core.run_palantir(adata, early_cell=start_cell, num_waypoints=1000)

Key Differences:

The naming conventions for adata.obsm keys differ by default.

For Scanpy Naming Scheme:

If you're keen on Scanpy's naming scheme, you can explicitly specify the keys as follows:

# Diffusion Maps
palantir.utils.run_diffusion_maps(
    adata, 
    n_components=20, 
    knn=30, 
    eigvec_key="X_palantir_diff_comp",
    eigval_key="palantir_EigenValues",
    sim_key="palantir_diff_op"
)

# Multiscale Space
palantir.utils.determine_multiscale_space(
    adata,
    eigvec_key="X_palantir_diff_comp",
    out_key="X_palantir_multiscale"
)

# Run Palantir
pr_res = palantir.core.run_palantir(
    adata,
    early_cell=start_cell,
    num_waypoints=1000,
    eigvec_key="X_palantir_multiscale"
)
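If the analysis has already been run with Palantir's default keys, an alternative to re-running with explicit keys is to expose the existing arrays under the Scanpy-style names. The sketch below is only an illustration: the default key names `DM_EigenVectors` and `DM_EigenVectors_multiscaled` are an assumption not stated in this thread (check your installed version), and `alias_keys` is a hypothetical helper, not part of Palantir or Scanpy.

```python
# Assumed default obsm key names (not stated in this thread) mapped to the
# Scanpy-style names used in the calls above.
ASSUMED_DEFAULT_TO_SCANPY = {
    "DM_EigenVectors": "X_palantir_diff_comp",
    "DM_EigenVectors_multiscaled": "X_palantir_multiscale",
}


def alias_keys(store, mapping):
    """Expose each existing `old` entry of `store` under `new` as well.

    `store` is any dict-like object, such as `adata.obsm`.
    Existing `new` entries are never overwritten. Returns the keys added.
    """
    added = []
    for old, new in mapping.items():
        if old in store and new not in store:
            store[new] = store[old]
            added.append(new)
    return added


# Plain dict standing in for adata.obsm so the sketch runs without AnnData.
obsm = {"DM_EigenVectors": [[0.1, 0.2], [0.3, 0.4]]}
print(alias_keys(obsm, ASSUMED_DEFAULT_TO_SCANPY))  # ['X_palantir_diff_comp']
```

With a real AnnData object the call would be `alias_keys(adata.obsm, ASSUMED_DEFAULT_TO_SCANPY)`.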

Note:

The palantir.core.run_palantir wrapper now additionally saves the results in adata.obs, adata.obsm, and adata.uns under the following keys:

adata.obs["palantir_pseudotime"]
adata.obs["palantir_entropy"]
adata.obsm["palantir_fate_probabilities"]
adata.uns["palantir_waypoints"]
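As a rough sketch of how these stored results could be read back, the hypothetical helper below (not part of Palantir) looks up the four keys listed above and reports any that are missing. It only assumes that `adata.obs`, `adata.obsm`, and `adata.uns` support `in` and item access, so a simple stand-in object is used here in place of a real AnnData.

```python
from types import SimpleNamespace

# Result locations as listed in the note above: (AnnData attribute, key).
RESULT_KEYS = [
    ("obs", "palantir_pseudotime"),
    ("obs", "palantir_entropy"),
    ("obsm", "palantir_fate_probabilities"),
    ("uns", "palantir_waypoints"),
]


def collect_palantir_results(adata):
    """Return (found, missing): stored results by key, and locations not present."""
    found, missing = {}, []
    for attr, key in RESULT_KEYS:
        container = getattr(adata, attr)
        if key in container:
            found[key] = container[key]
        else:
            missing.append(f"{attr}[{key!r}]")
    return found, missing


# Stand-in for an AnnData object, with made-up values, so the sketch runs alone.
fake_adata = SimpleNamespace(
    obs={"palantir_pseudotime": [0.0, 0.5, 1.0]},
    obsm={},
    uns={"palantir_waypoints": ["cell_a", "cell_c"]},
)
found, missing = collect_palantir_results(fake_adata)
print(sorted(found))  # ['palantir_pseudotime', 'palantir_waypoints']
print(missing)        # ["obs['palantir_entropy']", "obsm['palantir_fate_probabilities']"]
```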

dgodovich · 2023-09-14T18:37:48Z

I did not know scanpy is deprecating the external API. I followed the tutorial notebook and was able to reproduce my previous results with very similar code to what you provided here, so I have no issues.

I appreciate saving results in adata.obs and adata.obsm, as that saves a step later on. Generally this workflow is easier to understand as well.

My one note is that the documentation on your home page says that Palantir is fully integrated with scanpy, which is no longer the case.

Thank you for the comprehensive reply!

dgodovich closed this as completed Sep 14, 2023

katosh mentioned this issue Oct 16, 2023

type KeyError with new v1.3.1: output from determine_multiscale_space #124

Closed
