Running UMAP hangs #53

ts387 · 2021-04-21T23:49:19Z

Hi Ellen,

I have had this issue from the start but have just been going along with PCA. Unless I skip running umap, all my cryodrgn analyze runs appear to hang at this stage (i.e. till the job exits on hitting the cluster time limit). Our system admin tells me the job isn't doing anything, and no errors pop up in the log either:

(cryodrgn) -bash-4.2$ tail -f CryoDRGN-01_vae128_big_z8-analyze_umap.out
2021-04-21 18:36:24 Saving results to /home/ts387/CryoDRGN/01_vae128_big_z8/analyze.24
2021-04-21 18:36:24 Perfoming principal component analysis...
2021-04-21 18:36:24 Explained variance ratio:
2021-04-21 18:36:24 [0.14331131 0.13593155 0.13252919 0.12811412 0.12108949 0.11471545
0.11343906 0.11086983]
2021-04-21 18:36:24 Generating volumes...
2021-04-21 18:36:24 K-means clustering...
2021-04-21 18:36:32 Generating volumes...
2021-04-21 18:36:32 Running UMAP...

I am looking at ~150,000 particles. Can you tell what might be going on, and if there's a solution or workaround? Just in case, I am attaching my slurm script as a text file. I apologise if the question is too tangential to core cryodrgn functionality.

Thanks a million!

Taha

cryodrgn analyze (--skip-vol).txt

zhonge · 2021-04-27T16:07:37Z

Very strange -- Can you try running umap on a subset of your dataset? There is a helper script in the repository, which can stride the dataset to test on ~150 datapoints:

$ python /path/to/repo/analysis_scripts/run_umap.py z.pkl --stride 1000 -o test_umap.pkl
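
For a sense of scale, here is a minimal sketch of the arithmetic behind the stride. It assumes `--stride N` keeps every N-th row of the latent array (slice-style, as in `z[::N]`); the helper name `strided_count` is made up for illustration and is not part of cryodrgn.

```python
def strided_count(n_rows: int, stride: int) -> int:
    """Number of rows kept when taking every `stride`-th row, starting at row 0."""
    # Assumed to match len(z[::stride]) for an array with n_rows rows.
    return len(range(0, n_rows, stride))


if __name__ == "__main__":
    # About 150,000 particles strided by 1000 leaves roughly 150 points,
    # which is small enough for a quick UMAP sanity check.
    print(strided_count(150_000, 1000))
```

Under that assumption, the count is the row total divided by the stride, rounded up, so a slightly larger dataset gives a slightly larger test set.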

ts387 · 2021-04-28T23:52:59Z

Hi, that does seem to work...

(cryodrgn) -bash-4.2$ python run_umap.py z.pkl --stride 1000 -o test_umap.pkl
(152, 8)

...and I get a 1.4 KB UMAP pickle file as a result. I also ran it without --stride, and that works too.

Is there a way I may use it as a substitute for the UMAP subroutine in cryodrgn analyze (for now)? So I could see UMAP PNGs marking k-means cluster centres for sampled densities; and subsequently also interact with the raw UMAP output using the Jupyter notebook. That would be super useful!

Thank you very much.

ts387 · 2021-04-29T03:21:32Z

I copied the (full set) umap.pkl file to the analysis directory, and am able to see the UMAP visualisations through Jupyter notebook.

Some of the widgets, in particular the latter interactive ones don't seem to work. However, I gather this is a known issue and partially fixed in v0.3.2?

You can close the issue after this (thanks for all your help so far — cryoDRGN is already proving to be a real asset to our projects!)

Guillawme · 2021-04-29T06:38:05Z

> Some of the widgets, in particular the latter interactive ones don't seem to work. However, I gather this is a known issue and partially fixed in v0.3.2?

This was #34. It was fixed, in a way, but you won't have picked up the fix just by updating, because it concerns which dependencies are installed in your conda environment. To benefit from the fix, you need to reinstall in a fresh environment, making sure you follow the directions in the README.

zhonge · 2021-05-01T13:20:39Z

Looking into the umap issue more -- I was able to reproduce UMAP hanging in a different installation environment (python=3.7, pytorch=1.7, umap=0.4.2, numba=0.47.0, ...)

In fact, it hangs even when running the basic example:

import umap
from sklearn.datasets import load_digits

digits = load_digits()

embedding = umap.UMAP().fit_transform(digits.data)

which only takes a few seconds in my previous installation with python=3.6, pytorch=1.1, umap=0.4.1, numba=0.48.0...

I think this is related to an underlying dependency issue in the umap package specifically with numba=0.47.0:
lmcinnes/umap#336

Can you check your numba version (conda list numba) and try installing a different version of numba?
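
The relevant versions can also be collected in one go from Python. This is a sketch, not something shipped with cryodrgn: `installed_versions` is a hypothetical helper, the distribution names (`numba`, `umap-learn`, `numpy`, `torch`) are assumptions about how the packages are registered in the environment, and `importlib.metadata` needs Python 3.8+ (on 3.7 the `importlib_metadata` backport offers the same calls).

```python
from importlib import metadata  # Python 3.8+; on 3.7, use the importlib_metadata backport


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is not installed."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match `conda list` output.
    for name, version in installed_versions(["numba", "umap-learn", "numpy", "torch"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Reading the metadata does not import the packages themselves, so the check should be safe to run even in an environment where importing them misbehaves.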

ts387 · 2021-05-02T03:41:39Z

Thanks @Guillawme.

@zhonge Our cryodrgn environment is running python 3.7.9, pytorch 1.0.0, umap-learn 0.5.1 and numba 0.52.0. Should we consider downgrading numba then?

zhonge · 2021-05-02T13:01:55Z

Does the basic example work for you? You just need to copy the above lines into a python session on the computer that you're testing.

ts387 · 2021-05-04T01:50:00Z

You mean copy and run each statement line by line? I did so. It takes a few seconds after the first and last lines, but there is no hang.

(cryodrgn) -bash-4.2$ python
Python 3.7.9 (default, Aug 31 2020, 12:42:55) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import umap
>>> from sklearn.datasets import load_digits
>>> digits = load_digits()
>>> embedding = umap.UMAP().fit_transform(digits.data)
>>> exit()
(cryodrgn) -bash-4.2$

zhonge · 2021-05-19T13:14:07Z

I was able to reproduce this on another machine, where running UMAP alone runs fine, but running UMAP during cryodrgn analyze hangs. There is some version/dependency incompatibility between pytorch/numpy/umap: I can reproduce the hang in a standalone python session if I import pytorch AND numpy before importing umap.

# runs fine
(cryodrgn) $ python
>>> from cryodrgn import analysis, utils
>>> z = utils.load_pkl('z.20.pkl')
>>> analysis.run_umap(z[::100])
array([[8.358947 , 9.226697 ],
       [4.5772686, 6.358181 ],
       [3.9687192, 6.013609 ],
       ...,
       [7.218935 , 9.058749 ],
       [3.5549293, 6.595441 ],
       [5.7769523, 8.985867 ]], dtype=float32)

# segfaults if you import torch first...
(cryodrgn) $ python
>>> import torch
>>> from cryodrgn import analysis, utils
>>> z = utils.load_pkl('z.20.pkl')
>>> analysis.run_umap(z[::100])
Segmentation fault (core dumped)

# hangs indefinitely if you import torch then numpy...
(cryodrgn) $ python
>>> import torch
>>> import numpy as np
>>> from cryodrgn import analysis, utils
>>> z = utils.load_pkl('z.20.pkl')
>>> analysis.run_umap(z[::100]) # hangs indefinitely

This particular environment has a very old version of pytorch (1.0.1), along with numpy 1.20.1, numba 0.51.2, and umap-learn 0.5.1.

I am not sure what the underlying incompatibility is right now, but you can avoid the conflicting imports and get cryodrgn analyze to complete successfully if you call the analyze.py command directly:

# instead of cryodrgn analyze...
(cryodrgn) $ python /path/to/repo/cryodrgn/commands/analyze.py [workdir] [epoch]
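
The three outcomes above (clean run, segfault, hang) can be told apart automatically by running each import order in a fresh interpreter with a time limit, so a hang is reported after seconds rather than at the cluster time limit. The following is a minimal standard-library sketch; `probe` and the 120-second limit are illustrative choices, not part of cryodrgn, and treating a negative return code as "killed by a signal" (for example SIGSEGV) assumes a POSIX system.

```python
import subprocess
import sys


def probe(code: str, timeout: float) -> str:
    """Run `code` in a fresh Python process and classify the outcome."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"  # still running at the limit: treat as a hang
    if result.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal.
        return f"killed by signal {-result.returncode}"
    return "ok" if result.returncode == 0 else f"exit code {result.returncode}"


if __name__ == "__main__":
    # The basic UMAP example from earlier in the thread, tried under
    # the import orders discussed above.
    umap_example = (
        "import umap\n"
        "from sklearn.datasets import load_digits\n"
        "umap.UMAP().fit_transform(load_digits().data)\n"
    )
    prefixes = [
        ("umap only", ""),
        ("torch first", "import torch\n"),
        ("torch then numpy", "import torch\nimport numpy as np\n"),
    ]
    for label, prefix in prefixes:
        print(label, "->", probe(prefix + umap_example, timeout=120))
```

Each case runs in its own process, so a crash or hang in one import order does not affect the next.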

ts387 · 2021-05-24T03:22:34Z

That works - thank you very much!

zhonge closed this as completed May 26, 2021
