
Is there a way to find the max GPU memory watermark? How to run locally with minimal setup #17

avilella opened this issue Aug 5, 2021 · 7 comments

@avilella

avilella commented Aug 5, 2021

According to the README.md, the memory limits are as follows:

Maximum length limits depend on the free GPU provided by Google Colab (fingers crossed):

For GPU: Tesla T4 or Tesla P100 with ~16 GB, the max length is ~1400
For GPU: Tesla K80 with ~12 GB, the max length is ~1000
To check which GPU you got, open a new code cell and type !nvidia-smi

I am interested in structures of either (a) a single chain of 240–280 aa or (b) two different chains of ~120 + ~140 aa. What would be the minimal GPU that would allow us to run this locally?

I am thinking that, given our own custom MSAs, it wouldn't need to connect to MMseqs2 or download the ~2 TB of sequence data, and could instead go straight to running the prediction from the MSA of internal data inside the Docker container?

Or am I missing something obvious that would still require Colab or something else remote?

@milot-mirdita
Collaborator

The old K40 GPUs (12 GB RAM) we have locally ran all but one CASP FM target (a 900–1000 aa one) without issues using the official pipeline, so AF2 doesn't necessarily need very new GPUs.

You might still want to poke at the Python code in the Colab, as it is a lot easier to supply your own MSAs to than the official pipeline. Ideally, we want to make the Colabs runnable on the command line as well, but we haven't started working on that yet.
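
As a rough illustration of that route (a sketch of mine, not code from the notebook; the function names below match the alphafold package as of mid-2021 and may differ in later releases), a custom A3M could be turned into a model-ready feature dict like so:

# Sketch: build an AlphaFold feature dict from a custom A3M, skipping
# the MMseqs2/database search entirely. "custom.a3m" is a placeholder path.
from alphafold.data import parsers, pipeline

with open("custom.a3m") as f:
    a3m_string = f.read()

# parse_a3m returns the aligned sequences and a per-residue deletion matrix
msa, deletion_matrix = parsers.parse_a3m(a3m_string)
query_sequence = msa[0]

feature_dict = {
    **pipeline.make_sequence_features(
        sequence=query_sequence, description="query",
        num_res=len(query_sequence)),
    **pipeline.make_msa_features(
        msas=[msa], deletion_matrices=[deletion_matrix]),
}

The resulting feature_dict could then be passed to model_runner.process_features(...) and model_runner.predict(...) the same way the notebook does.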

@avilella
Author

avilella commented Aug 5, 2021 via email

@RodenLuo

Ideally we want to make the Colabs also runnable on the command line, but haven't started working on that yet.

This is also mentioned in #20. It would be great to have either a local command-line interface or a local notebook version, so that we can run inputs with >1000 amino acids and predict complexes (dimers/trimers) of them.

I'm not that familiar with all the steps involved in the code; I mostly use it as an end-to-end tool. I tried to run the AlphaFold2_advanced notebook locally. After solving several package issues, I am now stuck at No module named 'colabfold'. I also see that the database_path entries all point to googleapis, which works fine on Colab but, I guess, less smoothly locally. I have a local version of AlphaFold2 running fine. Any hints on how to run the AlphaFold2_advanced notebook locally would be much appreciated. Thanks.

@avilella
Author

avilella commented Aug 23, 2021 via email

@milot-mirdita
Collaborator

We now have an internal version that runs on a cluster. The main issue remains that the MMseqs2 API runs on a single server and will probably not scale to multiple research groups submitting jobs.

We are still preparing databases, scripts, etc., so that people can deploy their own server. However, to use MMseqs2 the way we use it for ColabFold, we require all databases to be fully in RAM (currently 535 GB of RAM, plus some RAM for each worker process).

We can change the local ColabFold version to work with MMseqs2's usual batch mode, where the memory requirements are not as high.

If you want to run a few thousand sequences, please contact me directly (email, Twitter, etc.); I can give you access to the local version. We still need to figure out how to scale the API better, though.
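
For reference, MMseqs2's regular disk-based batch mode looks roughly like the sketch below (my illustration, driven from Python; "uniref30_db" is a placeholder database name, and the exact flags we use for ColabFold may differ):

# Disk-based batch search with MMseqs2; no in-RAM server required.
import subprocess

def mmseqs(*args):
    subprocess.run(["mmseqs", *args], check=True)

mmseqs("createdb", "queries.fasta", "queryDB")            # index the query sequences
mmseqs("search", "queryDB", "uniref30_db", "resultDB",    # iterative profile search on disk
       "tmp", "--num-iterations", "3")
mmseqs("result2msa", "queryDB", "uniref30_db",            # export the hits as MSAs
       "resultDB", "msaDB")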

@RodenLuo

Thanks! I have a local version of AlphaFold2 installed with Docker on a server. (I ran into some problems during installation, and then tried to install the non-docker version on a cluster as well, but later dropped that, since the one on the server worked fine after changing the CUDA version.)

I have 4 NVIDIA RTX A6000 GPUs and 1.0 TB of RAM on that server, but I still have not gotten AlphaFold2_advanced.ipynb to run through. I would like to predict homotrimers of a protein of more than 1000 aa (more details in issue #93 in AlphaFold2's repo). With trimer settings, the total length is more than 3000 aa.

I am facing the errors below if I run them on Colab.

Exception: Input sequence is too long: 3867 amino acids, while the maximum is 2500. Please use the full AlphaFold system for long sequences.
Exception: Input sequence is too long: 3078 amino acids, while the maximum is 2500. Please use the full AlphaFold system for long sequences.

I'm now trying to run the notebooks locally on the server. The previously mentioned No module named 'colabfold' error occurred because I was launching the notebook from within AlphaFold2's folder, which bypassed the line if not os.path.isdir("alphafold"): in the notebook. I moved the notebook to another folder. After several pip and conda installs for the missing packages, it runs. I didn't change database_path, so I suppose it is still using googleapis. I changed the max length to MAX_SEQUENCE_LENGTH = 5000 (the only line I changed, to get past the aforementioned error).
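
(For reference, the length guard in question looks roughly like this; I am reconstructing it from the error message, so the exact code in the notebook may differ, and full_sequence is my placeholder for the concatenated input:)

MAX_SEQUENCE_LENGTH = 2500  # raised to 5000 in my local copy
if len(full_sequence) > MAX_SEQUENCE_LENGTH:
  raise Exception(
      f"Input sequence is too long: {len(full_sequence)} amino acids, "
      f"while the maximum is {MAX_SEQUENCE_LENGTH}. "
      "Please use the full AlphaFold system for long sequences.")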

And now the #@title Search against genetic databases cell runs fine and plots the sequence coverage figure. However, the #@title run alphafold cell gives the error below.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_40057/1661749859.py in <module>
    151       cfg.data.eval.num_ensemble = num_ensemble
    152 
--> 153       params = data.get_model_haiku_params(name,'./alphafold/data')
    154       model_runner = model.RunModel(cfg, params, is_training=is_training)
    155       COMPILED = compiled

~/.conda/envs/AF/lib/python3.9/site-packages/alphafold/model/data.py in get_model_haiku_params(model_name, data_dir)
     37     params = np.load(io.BytesIO(f.read()), allow_pickle=False)
     38 
---> 39   return utils.flat_params_to_haiku(params)

~/.conda/envs/AF/lib/python3.9/site-packages/alphafold/model/utils.py in flat_params_to_haiku(params)
     77     if scope not in hk_params:
     78       hk_params[scope] = {}
---> 79     hk_params[scope][name] = jnp.array(array)
     80 
     81   return hk_params

~/.conda/envs/AF/lib/python3.9/site-packages/jax/_src/numpy/lax_numpy.py in array(object, dtype, copy, order, ndmin)
   3085     _inferred_dtype = object.dtype and dtypes.canonicalize_dtype(object.dtype)
   3086     lax._check_user_dtype_supported(_inferred_dtype, "array")
-> 3087     out = _device_put_raw(object, weak_type=weak_type)
   3088     if dtype: assert _dtype(out) == dtype
   3089   elif isinstance(object, (DeviceArray, core.Tracer)):

~/.conda/envs/AF/lib/python3.9/site-packages/jax/_src/lax/lax.py in _device_put_raw(x, weak_type)
   1607   else:
   1608     aval = raise_to_shaped(core.get_aval(x), weak_type=weak_type)
-> 1609     return xla.array_result_handler(None, aval)(*xla.device_put(x))
   1610 
   1611 def zeros_like_shaped_array(aval):

~/.conda/envs/AF/lib/python3.9/site-packages/jax/interpreters/xla.py in device_put(x, device)
    156   x = canonicalize_dtype(x)
    157   try:
--> 158     return device_put_handlers[type(x)](x, device)
    159   except KeyError as err:
    160     raise TypeError(f"No device_put handler for type: {type(x)}") from err

~/.conda/envs/AF/lib/python3.9/site-packages/jax/interpreters/xla.py in _device_put_array(x, device)
    164   if x.dtype is dtypes.float0:
    165     x = np.zeros(x.shape, dtype=np.dtype(bool))
--> 166   return (backend.buffer_from_pyval(x, device),)
    167 
    168 def _device_put_scalar(x, device):

RuntimeError: Resource exhausted: Out of memory while trying to allocate 2097152 bytes.

However, all 4 GPUs and the RAM are available, as shown below.

~$ nvidia-smi
Mon Aug 23 14:04:14 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A6000    Off  | 00000000:18:00.0 Off |                  Off |
| 30%   24C    P8     6W / 300W |    460MiB / 48685MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA RTX A6000    Off  | 00000000:3B:00.0 Off |                  Off |
| 30%   28C    P8    14W / 300W |    550MiB / 48685MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA RTX A6000    Off  | 00000000:86:00.0 Off |                  Off |
| 30%   26C    P8     7W / 300W |    456MiB / 48685MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA RTX A6000    Off  | 00000000:AF:00.0 Off |                  Off |
| 30%   24C    P8    17W / 300W |    452MiB / 48685MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

$ free -h
              total        used        free      shared  buff/cache   available
Mem:           1.0T        7.3G        808G         40M        191G        994G
Swap:          7.5G          0B        7.5G
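
One thing I am trying (an assumption on my part, based on general JAX behavior rather than anything confirmed in this thread) is to set the XLA/JAX memory environment variables before jax is first imported, e.g.:

# Assumption: these must be set before the first "import jax" anywhere in the notebook.
import os
os.environ["TF_FORCE_UNIFIED_MEMORY"] = "1"           # let XLA spill to host RAM via unified memory
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = "4.0"  # allow addressing ~4x the GPU memory
os.environ["CUDA_VISIBLE_DEVICES"] = "0"              # pin the process to one known-free GPU

import jax
print(jax.devices())  # sanity check: should list the GPU, not the CPU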

Any help would be much appreciated. Thanks!

@sokrypton
Owner

The advanced notebook is under active development. I would avoid trying to deploy it locally (unless you are willing to track daily bug fixes and implement them yourself). For a more stable setup, see the alphafold2_mmseqs2 notebook.
