bugfix for latest jax version (jnp.take() in version jax>=0.3.8 return nans) #513

sokrypton · 2022-06-17T05:07:52Z

AlphaFold-Multimer with templates returns NaNs, starting with jax version 0.3.8.
@YoshitakaMo traced it down to batched_gather() in alphafold/model/utils.py.

You need to change:
jnp.take(p, i, axis=axis)
to
jnp.take(p, i, axis=axis, mode="clip")
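
For context, here is a minimal sketch of how a patched gather helper might look. The function body is an assumption written for illustration (a tf.gather-style wrapper built on jax.vmap), not a verbatim copy of alphafold/model/utils.py; only the jnp.take(..., mode="clip") call comes from the fix above.

```python
import jax
import jax.numpy as jnp


def batched_gather(params, indices, axis=0, batch_dims=0):
    """Gather along `axis`, treating the first `batch_dims` dims as batch dims.

    Illustrative sketch: the body is an assumption, not the repository code.
    """
    # mode="clip" clamps out-of-range indices to the last valid position
    # instead of filling the result with NaN.
    take_fn = lambda p, i: jnp.take(p, i, axis=axis, mode="clip")
    for _ in range(batch_dims):
        take_fn = jax.vmap(take_fn)
    return take_fn(params, indices)


params = jnp.arange(6.0).reshape(2, 3)   # two batch rows, three entries each
indices = jnp.array([[0, 5], [2, 9]])    # 5 and 9 are out of range
out = batched_gather(params, indices, axis=0, batch_dims=1)
```

With the clip mode in place, the out-of-range indices resolve to the last entry of each row, so no NaN can enter the result through the gather itself.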

https://jax.readthedocs.io/en/latest/changelog.html#jax-0-3-8-april-29-2022
The bug occurs because, starting in jax 0.3.8, jnp.take() returns NaNs for out-of-bounds indices by default instead of clamping them into range.
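
The difference between the two modes can be seen in a small standalone example. This is a sketch with made-up values, not code from AlphaFold; the "fill" mode is passed explicitly so the result does not depend on which default the installed jax version uses.

```python
import jax.numpy as jnp

values = jnp.arange(5.0)            # [0., 1., 2., 3., 4.]
indices = jnp.array([0, 4, 7])      # 7 is past the end of `values`

# "fill" is the documented default mode in recent jax releases; for
# floating-point arrays it yields NaN for the out-of-bounds index.
filled = jnp.take(values, indices, axis=0, mode="fill")

# "clip" clamps index 7 to the last valid index (4) instead.
clipped = jnp.take(values, indices, axis=0, mode="clip")
```

Here `filled` ends in NaN while `clipped` repeats the last valid element, which is why a single out-of-range index in the gather is enough to contaminate downstream outputs.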

the alphafold pinned version of jax:

the output starting with jax>=0.3.8:

Htomlinson14 · 2022-09-21T16:59:18Z

Thanks for this suggestion! This has been fixed in https://github.com/deepmind/alphafold/releases/tag/v2.2.4

sokrypton mentioned this issue Jun 17, 2022

NaN values in the output coordinates sokrypton/ColabFold#243

Closed

sokrypton added a commit to steineggerlab/alphafold that referenced this issue Jun 17, 2022

bugfix: removing nans from alphafold-multimer+templates

bd696aa

google-deepmind#513

baba-hashimoto mentioned this issue Jun 17, 2022

Jax Version and pocketfft kalininalab/alphafold_non_docker#45

Closed

This was referenced Aug 4, 2022

add patch for AlphaFold v2.2.2 to fix NaN problem with jax 0.3.9 easybuilders/easybuild-easyconfigs#15874

Merged

The predicted values are nan #120

Closed

sokrypton added a commit to sokrypton/ColabDesign that referenced this issue Aug 18, 2022

fixing multimer nan bug

aab2b5b

see: google-deepmind/alphafold#513

Htomlinson14 closed this as completed Jan 16, 2023
