Commit 86ea44b

ZiyaoLi and guolinke authored Sep 22, 2022
Refine colab notebook (dptech-corp#52)
* refine notebook
* rephrase
* rephrase
* rephrase
* rephrase
* rm output
* rephrase
* rephrase

Co-authored-by: Guolin Ke <guolin.ke@outlook.com>
1 parent 8dae0b7 commit 86ea44b

1 file changed: +15 -15 lines changed

notebooks/unifold.ipynb

@@ -6,13 +6,11 @@
 "id": "jMGcXXPabEN4"
 },
 "source": [
-"# Uni-Fold Colab\n",
+"# Uni-Fold Notebook\n",
 "\n",
-"This Colab notebook provides an online runnable version of [Uni-Fold](https://github.com/dptech-corp/Uni-Fold/) for users to predict the structure of a protein, single chain or multimer, with custom settings.\n",
+"This notebook provides protein structure prediction service of [Uni-Fold](https://github.com/dptech-corp/Uni-Fold/) as well as [UF-Symmetry](https://www.biorxiv.org/content/10.1101/2022.08.30.505833v1). Predictions of both protein monomers and multimers are supported. The homology search process in this notebook is enabled with the [MMSeqs2](https://github.com/soedinglab/MMseqs2.git) server provided by [ColabFold](https://github.com/sokrypton/ColabFold). For more consistent results with the original AlphaFold(-Multimer), please refer to the open-source repository of [Uni-Fold](https://github.com/dptech-corp/Uni-Fold/), or our convenient web server at [Hermite™](https://hermite.dp.tech/).\n",
 "\n",
-"Thanks to [MMSeqs2](https://github.com/soedinglab/MMseqs2.git) and the server provided by [ColabFold](https://github.com/sokrypton/ColabFold), the homogeneous searching in this notebook is very fast and is comparable with the original AlphaFold(-Multimer). If you want more consistent results with the original AlphaFold(-Multimer), you can use the [full open source Uni-Fold](https://github.com/dptech-corp/Uni-Fold/), or the convenient web server at [Hermite™](https://hermite.dp.tech/).\n",
-"\n",
-"Please note that this Colab notebook is not a finished product and is provided as an early-access prototype. It is provided for theoretical modeling only and caution should be exercised in its use. \n",
+"Please note that this notebook is provided as an early-access prototype, and is NOT an official product of DP Technology. It is provided for theoretical modeling only and caution should be exercised in its use. \n",
 "\n",
 "**Licenses**\n",
 "\n",
@@ -23,16 +21,15 @@
 "\n",
 "Please cite the following papers if you use this notebook:\n",
 "\n",
-"* Jumper et al. \"[Highly accurate protein structure prediction with AlphaFold.](https://doi.org/10.1038/s41586-021-03819-2)\" Nature (2021)\n",
-"* Evans et al. \"[Protein complex prediction with AlphaFold-Multimer.](https://www.biorxiv.org/content/10.1101/2021.10.04.463034v1)\" biorxiv (2021)\n",
-"* Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S and Steinegger M. \"[ColabFold: Making protein folding accessible to all.](https://www.nature.com/articles/s41592-022-01488-1)\" Nature Methods (2022) \n",
 "* Ziyao Li, Xuyang Liu, Weijie Chen, Fan Shen, Hangrui Bi, Guolin Ke, Linfeng Zhang. \"[Uni-Fold: An Open-Source Platform for Developing Protein Folding Models beyond AlphaFold.](https://www.biorxiv.org/content/10.1101/2022.08.04.502811v1)\" biorxiv (2022)\n",
 "* Ziyao Li, Shuwen Yang, Xuyang Liu, Weijie Chen, Han Wen, Fan Shen, Guolin Ke, Linfeng Zhang. \"[Uni-Fold Symmetry: Harnessing Symmetry in Folding Large Protein Complexes.](https://www.biorxiv.org/content/10.1101/2022.08.30.505833v1)\" bioRxiv (2022)\n",
-"\n",
+"* Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S and Steinegger M. \"[ColabFold: Making protein folding accessible to all.](https://www.nature.com/articles/s41592-022-01488-1)\" Nature Methods (2022)\n",
 "\n",
 "**Acknowledgements**\n",
 "\n",
-"We thank [@sokrypton](https://twitter.com/sokrypton) for many helpful suggestions to this notebook.\n"
+"The model architecture of Uni-Fold is largely based on [AlphaFold](https://doi.org/10.1038/s41586-021-03819-2) and [AlphaFold-Multimer](https://www.biorxiv.org/content/10.1101/2021.10.04.463034v1). The design of this notebook refers directly to [ColabFold](https://www.nature.com/articles/s41592-022-01488-1). We specially thank [@sokrypton](https://twitter.com/sokrypton) for his helpful suggestions to this notebook.\n",
+"\n",
+"Copyright © 2022 DP Technology. All rights reserved."
 ]
 },
 {
@@ -127,7 +124,6 @@
 "output_dir_base = \"./prediction\"\n",
 "os.makedirs(output_dir_base, exist_ok=True)\n",
 "\n",
-"\n",
 "def clean_and_validate_sequence(\n",
 " input_sequence: str, min_length: int, max_length: int) -> str:\n",
 " \"\"\"Checks that the input sequence is ok and returns a clean version of it.\"\"\"\n",
@@ -203,21 +199,25 @@
 "def add_hash(x,y):\n",
 " return x+\"_\"+hashlib.sha1(y.encode()).hexdigest()[:5]\n",
 "\n",
+"jobname = 'unifold_colab' #@param {type:\"string\"}\n",
 "\n",
 "sequence_1 = 'LILNLRGGAFVSNTQITMADKQKKFINEIQEGDLVRSYSITDETFQQNAVTSIVKHEADQLCQINFGKQHVVCTVNHRFYDPESKLWKSVCPHPGSGISFLKKYDYLLSEEGEKLQITEIKTFTTKQPVFIYHIQVENNHNFFANGVLAHAMQVSI' #@param {type:\"string\"}\n",
 "sequence_2 = '' #@param {type:\"string\"}\n",
 "sequence_3 = '' #@param {type:\"string\"}\n",
 "sequence_4 = '' #@param {type:\"string\"}\n",
 "\n",
+"#@markdown Use symmetry group `C1` for default Uni-Fold predictions.\n",
+"#@markdown Or, specify a **cyclic** symmetry group (e.g. `C4``) and\n",
+"#@markdown the sequences of the asymmetric unit (i.e. **do not copy\n",
+"#@markdown them multiple times**) to predict with UF-Symmetry.\n",
+"\n",
 "symmetry_group = 'C1' #@param {type:\"string\"}\n",
 "\n",
 "use_templates = True #@param {type:\"boolean\"}\n",
 "msa_mode = \"MMseqs2\" #@param [\"MMseqs2\",\"single_sequence\"]\n",
 "\n",
 "input_sequences = [sequence_1, sequence_2, sequence_3, sequence_4]\n",
 "\n",
-"jobname = 'unifold_colab' #@param {type:\"string\"}\n",
-"\n",
 "basejobname = \"\".join(input_sequences)\n",
 "basejobname = re.sub(r'\\W+', '', basejobname)\n",
 "target_id = add_hash(jobname, basejobname)\n",
@@ -1046,7 +1046,7 @@
 },
 "gpuClass": "standard",
 "kernelspec": {
-"display_name": "Python 3.8.10 ('ProteinMD')",
+"display_name": "Python 3.8.10 64-bit",
 "language": "python",
 "name": "python3"
 },
@@ -1056,7 +1056,7 @@
 },
 "vscode": {
 "interpreter": {
-"hash": "af92dc656850d97b5469b75c9ef2009aaa936e713f0093b069a7ff14eeb2ca8d"
+"hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1"
 }
 }
 },
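
Note on the `@@ -203,21 +199,25 @@` hunk: moving `jobname` above the sequence fields does not change how `target_id` is computed, since `jobname` is still defined before `add_hash` is called. The following is a minimal Python sketch of that derivation, with the notebook's JSON string escaping removed. `add_hash` and the three `basejobname`/`target_id` statements are taken from the diff. The wrapper `make_target_id` and the short example sequence are hypothetical and exist only to make the snippet self-contained.

```python
import hashlib
import re


def add_hash(x, y):
    # Same helper as in the notebook cell: append "_" plus the first
    # five hex digits of SHA-1(y) to x.
    return x + "_" + hashlib.sha1(y.encode()).hexdigest()[:5]


def make_target_id(jobname, input_sequences):
    # Hypothetical wrapper (not in the notebook) around the three
    # statements shown at the end of the hunk.
    basejobname = "".join(input_sequences)
    # Drop every non-word character (whitespace, punctuation) so that
    # formatting differences in the pasted sequences do not change the hash.
    basejobname = re.sub(r"\W+", "", basejobname)
    return add_hash(jobname, basejobname)


# Illustrative input only: a made-up short sequence and three empty slots,
# mirroring sequence_1..sequence_4 in the notebook form.
example_id = make_target_id("unifold_colab", ["ACDEFGHIK", "", "", ""])
print(example_id)
```

Because the hash depends only on the concatenated, cleaned sequences, the same inputs always map to the same `target_id`, while different inputs under the same `jobname` will normally get different suffixes.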
