
todo #7

Open
31 of 47 tasks
lucidrains opened this issue May 20, 2024 · 0 comments

Comments

@lucidrains (Owner) commented May 20, 2024

  • modules

  • miscellaneous

    • f_tokenbond embedding to pairwise init (default to a single chain for starters if not passed in)
    • take care of normalization and unnormalization of atomic coordinates
    • distance labels should be derived from atom positions if not given
    • weighted rigid align module needs to account for atom_mask (variable number of atoms per batch sample)
    • sample without replacement in MSAModule
    • make sure the diffusion loss accounts for the nucleic acid / ligand masks + the bond loss during fine-tuning
    • return the entire loss breakdown for logging in eventual trainer
    • hook up the centre random augmentation
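The weighted rigid align item above amounts to a weighted Kabsch alignment; a minimal numpy sketch (the repo itself is PyTorch, and the function name here is illustrative) of how zeroing the weights of masked padding atoms handles a variable number of atoms per batch sample:

```python
import numpy as np

def weighted_rigid_align(coords, ref, weights, mask):
    """Weighted Kabsch alignment of `coords` onto `ref`.

    coords, ref: (n_atoms, 3); weights: (n_atoms,); mask: (n_atoms,) bool
    Padding atoms (mask == False) get zero weight, so samples with
    different numbers of atoms can share one fixed-size tensor.
    """
    w = (weights * mask).astype(float)
    w = w / w.sum()
    # weighted centroids of both point clouds
    mu_c = (w[:, None] * coords).sum(0)
    mu_r = (w[:, None] * ref).sum(0)
    c = coords - mu_c
    r = ref - mu_r
    # weighted covariance, then SVD for the optimal rotation
    cov = (c * w[:, None]).T @ r
    u, _, vt = np.linalg.svd(cov)
    d = np.sign(np.linalg.det(vt.T @ u.T))  # guard against reflections
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return (rot @ c.T).T + mu_r
```

Because the padding atoms carry zero weight, corrupt or garbage coordinates in the padded slots cannot perturb the recovered rotation.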
  • @lucidrains take care of

    • packed atom representation
      • given atom lengths and a sequence, do an average pool based on those lengths - atom -> token
      • given atom lengths and a sequence, expand sequence to consecutives, for token -> atom
    • fix packed atom representation when going from token level -> atom level pairwise repr
    • packed repr - make sure repeating the pairwise repr is done in one specialized function; also take care of curtailing or padding the mask through some kwarg
    • able to pass in residue indices for protein-only training, with everything else derived; test with sidechainnet
    • atom transformer attention bias needs to be calculated efficiently in the Alphafold3 module, use asserts to make sure shape is correct within local_attn fn
    • take care of residue identities / indices -> atom feats + atom bonds + attention biasing for atom transformers
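The two packed-representation ops described above (length-based average pool for atom -> token, and expansion for token -> atom) can be sketched in numpy; the repo uses PyTorch, where `repeat_interleave` and segment reductions play the same roles, and these function names are illustrative:

```python
import numpy as np

def mean_pool_atoms_to_tokens(atom_feats, atom_lens):
    """Average-pool a packed atom sequence into token features.

    atom_feats: (total_atoms, dim) - atoms for all tokens, concatenated
    atom_lens:  (num_tokens,)      - number of atoms belonging to each token
    """
    offsets = np.concatenate(([0], np.cumsum(atom_lens)))
    # sum the atoms within each token's segment, then divide by its length
    sums = np.add.reduceat(atom_feats, offsets[:-1], axis=0)
    return sums / atom_lens[:, None]

def expand_tokens_to_atoms(token_feats, atom_lens):
    """Repeat each token's features once per atom (token -> atom)."""
    return np.repeat(token_feats, atom_lens, axis=0)
```

Pooling followed by expansion is a right inverse on shapes: the expanded output has one row per atom again, with each token's pooled features broadcast to its atoms.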
  • training

    • validation and test dataset
    • add config driven training with pydantic validation for constructing trainer and base model
    • saving and loading for both base alphafold3 model as well as trainer + optimizer states
    • add trainer orchestrator config that contains many training configs and one model
    • able to reconstitute the entire training history
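As a rough illustration of the config-driven construction item, a stdlib-dataclass sketch; pydantic would replace the manual `__post_init__` checks with declarative validators, and every field name here is a hypothetical stand-in:

```python
from dataclasses import dataclass

@dataclass
class TrainerConfig:
    # hypothetical fields - stand-ins for whatever the real trainer needs
    num_steps: int = 10000
    batch_size: int = 4
    lr: float = 1.8e-3
    checkpoint_every: int = 1000

    def __post_init__(self):
        # the validation pydantic would perform on construction
        if self.num_steps <= 0:
            raise ValueError('num_steps must be positive')
        if self.lr <= 0:
            raise ValueError('lr must be positive')

def trainer_config_from_dict(raw: dict) -> TrainerConfig:
    """Construct + validate a config from e.g. a parsed YAML/JSON dict."""
    return TrainerConfig(**raw)
```

An orchestrator config holding many such training configs plus one model config would then be another dataclass (or pydantic model) composing these.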
  • datasets

    • single protein input
    • multimer input
    • multimer + nucleic acid(s) input
    • multimer + ligand input
    • other?
  • improvisations

    • add register tokens
    • improve atom transformer with some linear attention + other efficient attention tricks
    • frame averaging in place of their random aug
    • rectified flow instead of diffusion
    • add layer sharing
    • instead of all the attention biasing complexity in atom transformer, alternate between GNN (with the sparse bonds) + flash attention
    • additional conditioning on diffusion module
    • use conditionally routed attention for atom encoder and decoder
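The rectified-flow item above swaps the diffusion objective for regression onto a straight-line velocity field; a minimal numpy sketch of the training pair and Euler sampling, not the repo's implementation:

```python
import numpy as np

def rectified_flow_pair(x_data, rng):
    """Build one rectified-flow training example.

    Instead of a diffusion noise schedule, rectified flow draws noise x0,
    interpolates linearly toward the data x1 = x_data, and regresses the
    constant velocity (x1 - x0) at a uniformly random time t.
    """
    x0 = rng.normal(size=x_data.shape)      # noise sample
    t = rng.uniform()                       # random time in [0, 1]
    x_t = (1.0 - t) * x0 + t * x_data       # straight-line interpolant
    target_velocity = x_data - x0           # regression target for the model
    return x_t, t, target_velocity

def euler_sample(velocity_fn, shape, rng, steps=8):
    """Integrate dx/dt = v(x, t) from noise at t=0 to a sample at t=1."""
    x = rng.normal(size=shape)
    dt = 1.0 / steps
    for i in range(steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x
```

A useful sanity check of the interpolant: stepping from `x_t` along the target velocity for the remaining time `(1 - t)` lands exactly on the data point.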
  • cleanup

    • remove unpacked representation