PETSC error at cr.tl.initial_states #588

Closed
mehrankr opened this issue May 13, 2021 · 9 comments

Labels: bug (Something isn't working)

@mehrankr

mehrankr commented May 13, 2021

I installed cellrank in a new environment in python3.8 using

conda install -c conda-forge -c bioconda cellrank-krylov

I think the recipe needs to be updated to require the latest networkx; otherwise PAGA compatibility breaks with a matplotlib error.

This currently installs cellrank 1.3.1, and in some of the scvelo and cellrank functions, particularly

cr.tl.initial_states(adata, cluster_key='Cluster', n_jobs=1)

I get the following error:

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18194/18194 [00:19<00:00, 948.84cell/s]
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 13 Broken Pipe: Likely while reading or writing to a socket
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18194/18194 [00:16<00:00, 1134.85cell/s]
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 13 Broken Pipe: Likely while reading or writing to a socket
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
WARNING: For 1 macrostate, stationary distribution is computed

This used to happen for:

cr.tl.terminal_states(
            adata, cluster_key='Cluster', weight_connectivities=0.2)

But after changing it to:

cr.tl.terminal_states(
            adata, cluster_key='Cluster', weight_connectivities=0.2,
            model="monte_carlo",
            n_jobs=1, method='brandts', n_states=2)

it didn't happen any more.

Very surprisingly, the same issue sometimes (but not always) arises when running:

scv.tl.recover_dynamics(adata, n_jobs=1, n_top_genes=1000)

and

scv.tl.velocity(adata, mode='dynamical')

Versions:

cellrank==1.3.1 scanpy==1.7.2 anndata==0.7.6 numpy==1.20.2 numba==0.53.1 scipy==1.6.3 pandas==1.2.4 pygpcca==1.0.2 scikit-learn==0.24.2 statsmodels==0.12.2 python-igraph==0.9.1 scvelo==0.2.3 pygam==0.8.0 matplotlib==3.4.2 seaborn==0.11.1

mehrankr added the bug (Something isn't working) label on May 13, 2021
@michalk8
Collaborator

Hi @mehrankr

I believe this is the same issue as in #473 (not sure why, but in some cases, PETSc parallelization doesn't play nicely with the way we parallelize [by default through processes]).
Usually, changing the backend to cr.tl.initial_states(adata, cluster_key='Cluster', n_jobs=1, backend='threading') worked, so I'd try this first.
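
Spelled out, that suggested workaround would look like this (just a sketch; adata and cluster_key='Cluster' are taken from your original post):

import cellrank as cr

# Workaround suggested above: keep a single worker and switch joblib from the
# default process-based backend to threads, which has usually avoided the
# PETSc messages in similar reports (see #473).
cr.tl.initial_states(
    adata,                  # your AnnData object with velocities computed
    cluster_key='Cluster',
    n_jobs=1,
    backend='threading',
)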

cr.tl.initial_states(adata, cluster_key='Cluster', n_jobs=1)

Hmm, this should not really happen, esp. for n_jobs=1 (based on #473, this should be fine).

scv.tl.velocity(adata, mode='dynamical')
scv.tl.recover_dynamics(adata, n_jobs=1, n_top_genes=1000)

Very strange, since scvelo doesn't use PETSc; the parallelization we rely on here was only added in 0.2.3 (I assume PETSc has been loaded through cellrank). I will take a closer look at this function for problematic parts.

But after changing it to: ...

This is expected, since method='brandts' uses scipy under the hood, not PETSc, to get the Schur vectors.
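
For completeness, here is the scipy-based variant from your post again (a sketch; all parameter values are copied from your call above):

import cellrank as cr

# method='brandts' computes the Schur decomposition with scipy instead of
# SLEPc/PETSc, which is why the PETSc messages disappear with this call.
cr.tl.terminal_states(
    adata,
    cluster_key='Cluster',
    weight_connectivities=0.2,
    model='monte_carlo',
    n_jobs=1,
    method='brandts',
    n_states=2,
)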

@Marius1311
Collaborator

Hi @mehrankr, did these tips help you with your problem already?

@mehrankr
Author

mehrankr commented May 17, 2021

Unfortunately no, I'm still getting the same error:

In [337]:         cr.tl.initial_states(
     ...:             adata, cluster_key='Cluster', n_jobs=1,
     ...:                 backend='threading')
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18194/18194 [00:22<00:00, 808.30cell/s]
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 13 Broken Pipe: Likely while reading or writing to a socket
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18194/18194 [00:18<00:00, 959.66cell/s]
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 13 Broken Pipe: Likely while reading or writing to a socket
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[0]PETSC ERROR: to get more information on the crash.
WARNING: For 1 macrostate, stationary distribution is computed
WARNING: The following states could not be mapped uniquely: `['lup_1']`

@Marius1311
Collaborator

Mhm, @michalk8, could you look into this please?

@mehrankr
Author

Thanks for following up. Send me an email and we can arrange for passing you the loom file if needed: mkarimzadeh@vectorinstitute.ai

@michalk8
Collaborator

Hi @mehrankr,

just to be completely sure, does the code above (#588 (comment)) actually raise a Python exception (or crash the ipykernel), or does it simply print the error to the console?
Because it seems that it just prints the [0]PETSC ERROR right after the progress bar (where joblib does its parallelization), and it seems to have successfully computed the stationary distribution and mapped the cluster labels (the 2nd warning regarding lup_1 comes from this call, which is after the stationary dist. has been computed [and therefore after any PETSc usage]).

If it crashes/raises an exception, I will ping you over the email for the data. Lastly, could you please print the output of the following command?

python -c "import petsc4py; import slepc4py; print(petsc4py.__version__); print(slepc4py.__version__)"

@mehrankr
Author

Hi @michalk8,

It doesn't crash actually. It simply prints the message out.
As long as you can confirm this warning hasn't affected any of the processes and doesn't affect the results, I think we can close this.

The output is:

python -c "import petsc4py; import slepc4py; print(petsc4py.__version__); print(slepc4py.__version__)"
3.15.0
3.15.0

@michalk8
Collaborator

It doesn't crash actually. It simply prints the message out.

Thanks for confirming this. I can see the same error in our CI, as well as in jupyter's log, i.e. the code below:

import cellrank as cr

adata = cr.datasets.pancreas_preprocessed()  # example dataset shipped with cellrank
cr.tl.terminal_states(adata)
cr.tl.lineages(adata, n_jobs=1, backend='threading')

produces the same PETSc broken-pipe message (screenshot: petsc_pipe), and the results are unaffected. I see it printed to the console if using just ipykernel (screenshot: petsc_error_2).

As long as it doesn't throw an error/crash the kernel as in #473, it should be fine.

@Marius1311
Collaborator

I'm closing, as I think you guys figured out that this is not critical.
