# **Haruspex@Colab**


---

---

This is [Haruspex](https://github.com/thorn-lab/haruspex), a deep neural network trained to identify oligonucleotides and secondary structure in Cryo-EM maps.

You can use this notebook to predict secondary structure (helices, sheets and nucleotides) in any EMDB map as well as in your own Cryo-EM map, using the cloud GPUs provided by Google. To run it, please log into your **Google account** first, then follow the instructions step by step. If you get a warning when running it the first time, just press **"Run Anyway"**.

Code and Documentation: https://github.com/thorn-lab/haruspex

The details are described in our publication - please cite us if you use Haruspex for your work:

> Mostosi, P., Schindelin, H., Kollmannsberger, P., Thorn, A. **Haruspex: A Neural Network for the Automatic Identification of Oligonucleotides and Protein Secondary Structure in Cryo‐EM Maps.** (2020) *Angew. Chem. (Int. Ed.)* https://doi.org/10.1002/ange.202000421

Questions & comments about this notebook to <Philip.Kollmannsberger@uni-wuerzburg.de>









---

---



## **1) Check GPU availability**

Please change the runtime to GPU by selecting `Runtime->Change runtime type`, if you have not already done so, and then run the following cell. If the last line of the output mentions a GPU such as `Tesla-P100`, you are fine. Otherwise, please try again now or at a later time.

In [0]:
#@title *Run this cell to check GPU availability*
%tensorflow_version 1.x

from tensorflow.python.client import device_lib 
device_lib.list_local_devices()
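
The device list can be long, so it may help to filter it for GPU entries. The following is a minimal sketch: `gpu_descriptions` is a hypothetical helper name (not part of Haruspex or TensorFlow), and it assumes each entry exposes `device_type` and `physical_device_desc` attributes, as the objects returned by `device_lib.list_local_devices()` do. Stand-in objects are used here so the example runs without TensorFlow.

```python
from collections import namedtuple

def gpu_descriptions(devices):
    """Return the description of every device whose type is exactly 'GPU'."""
    return [d.physical_device_desc for d in devices if d.device_type == "GPU"]

# Stand-in objects for illustration only; in the notebook, pass
# device_lib.list_local_devices() instead of this list.
FakeDevice = namedtuple("FakeDevice", ["name", "device_type", "physical_device_desc"])
example = [
    FakeDevice("/device:CPU:0", "CPU", ""),
    FakeDevice("/device:GPU:0", "GPU", "device: 0, name: Tesla P100-PCIE-16GB"),
]

found = gpu_descriptions(example)
print(found if found else "No GPU found: change the runtime type and try again")
```

In the notebook itself, the call would be `gpu_descriptions(device_lib.list_local_devices())`; an empty list means the runtime has no GPU attached.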



---

---


## **2) Install requirements and clone Haruspex repository**




The default Colab environment already contains TensorFlow and most packages required by Haruspex. We only need to install `mrcfile` to handle maps, and then clone the Haruspex repository from GitHub, including the pre-trained network described in the paper.

In [0]:
#@title *Run this cell to install requirements and get Haruspex*
!pip install mrcfile
!git clone https://github.com/thorn-lab/haruspex
%cd haruspex



---

---


## **3) Setup map to be predicted**

There are three options. You can enter an EMDB ID, in which case the corresponding map is downloaded into this session automatically. Alternatively, you can select "Choose Files" and upload your own map. As a third option, you can connect to your Google Drive.

---

**Option 1: Use EMDB map**

Please enter an EMDB ID in the following field and run the cell to download the corresponding map from the EMDB, then proceed with step 4.



In [0]:
#@title *Enter an EMDB ID and run this cell to download the corresponding Cryo-EM map:*

EMDB = '9627' #@param {type:"string"}
Filename = "emd_"+EMDB+".map.gz"
# the map of entry <ID> is stored under structures/EMD-<ID>/map/ on the EMDB FTP server
ftp_string = "ftp://ftp.ebi.ac.uk/pub/databases/emdb/structures/EMD-"+EMDB+"/map/"+Filename
!wget $ftp_string
print("\n ---> your filename is "+Filename)
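
The filename and download address in the cell above are both derived from the EMDB ID. As a hedged sketch of that construction with some input checking added, the hypothetical helper `emdb_map_location` below returns both strings. The digit-only check and the tolerance for a pasted `EMD-` prefix are assumptions made for illustration, not something the notebook does.

```python
import re

EMDB_FTP_ROOT = "ftp://ftp.ebi.ac.uk/pub/databases/emdb/structures"

def emdb_map_location(emdb_id):
    """Return (filename, url) for an EMDB entry given its ID as a string."""
    emdb_id = emdb_id.strip()
    # Assumption: accept an ID pasted with its "EMD-" prefix and strip it.
    if emdb_id.upper().startswith("EMD-"):
        emdb_id = emdb_id[4:]
    # Assumption: what remains should consist of digits only.
    if not re.fullmatch(r"[0-9]+", emdb_id):
        raise ValueError("EMDB ID should contain digits only, got %r" % emdb_id)
    filename = "emd_" + emdb_id + ".map.gz"
    url = EMDB_FTP_ROOT + "/EMD-" + emdb_id + "/map/" + filename
    return filename, url

filename, url = emdb_map_location("EMD-9627")
print(filename)
print(url)
```

Rejecting malformed IDs before calling `wget` gives a clearer error than a failed download.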


---

**Option 2: Upload your own map**

If you want to upload your own map: run the following cell, click "Choose Files" and select a map file on your local computer to be uploaded.

In [0]:
#@title *Run this cell and click "Choose Files" to upload your own map.*
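from google.colab import files
# files.upload() returns a dict keyed by the uploaded filenames
file = files.upload()
# keep the name of the last uploaded file
for fn in file.keys(): Filename=fn
print("\n ---> your filename is "+Filename)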

---

**Option 3: Connect to your Google Drive**

If your map is stored on your Google Drive, you can connect this notebook and enter the path below.

In [0]:
#@title *Run this cell and follow instructions to connect your GDrive*
from google.colab import drive
drive.mount('/content/drive')
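
Once the Drive is mounted, the full path of the map is needed for step 4. One way to find it is to list candidate files below the mount point, as in the sketch below. `find_maps` and `MAP_SUFFIXES` are hypothetical names, and the list of file endings is an assumption about how maps are usually named, not a statement of what Haruspex accepts.

```python
from pathlib import Path

# Assumption: maps are stored with one of these endings.
MAP_SUFFIXES = (".mrc", ".map", ".mrc.gz", ".map.gz")

def find_maps(root):
    """Return the sorted full paths of all files below `root` that look like maps."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        str(p) for p in root.rglob("*")
        if p.is_file() and p.name.lower().endswith(MAP_SUFFIXES)
    )

# '/content/drive' is the mount point used in the cell above.
for path in find_maps("/content/drive"):
    print(path)
```

Searching a large Drive recursively can be slow; passing a subfolder as `root` narrows the search.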


---

---

## **4) Run prediction**

Enter the filename of the map to be predicted (as shown above), and run the following cell. If you connected your Google Drive, you need to enter the full path (`"/content/drive/...your map file..."`).

In [0]:
#@title *Enter map filename and run this cell to let Haruspex predict your map:*

Filename = "emd_9627.map.gz" #@param {type:"string"}

# this line limits TensorFlow logging to errors in newer TF 1.x versions
newstr = "tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)"

# patch the Haruspex script from the repo, which was written for an older TF version,
# by inserting the line above at line 22
!sed -i '22i$newstr' source/hpx_unet_190116.py

# run the prediction from the shell, disabling warnings via the environment
!TF_CPP_MIN_LOG_LEVEL='3' python source/hpx_unet_190116.py -n network/hpx_190116 -d map-predict "$Filename" -o .
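
Because the `sed` command inserts a line every time the cell runs, running the cell repeatedly adds the same line repeatedly. A pure-Python alternative that inserts the line only once could look like the sketch below. `insert_line_once` is a hypothetical helper, not part of Haruspex, and it is demonstrated on a throwaway file rather than on the repository script.

```python
import os
import tempfile

def insert_line_once(path, line_number, new_line):
    """Insert `new_line` before the 1-based `line_number`, unless the file
    already contains that exact line. Return True if the file was changed."""
    with open(path) as handle:
        lines = handle.read().splitlines()
    if new_line in lines:
        return False
    # list.insert appends at the end if the file has fewer lines than requested
    lines.insert(line_number - 1, new_line)
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return True

# Demonstration on a temporary file
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.py")
    with open(demo, "w") as handle:
        handle.write("import os\nprint('hi')\n")
    first = insert_line_once(demo, 2, "# inserted")
    second = insert_line_once(demo, 2, "# inserted")  # second call is a no-op
    with open(demo) as handle:
        patched = handle.read()

print(first, second)
print(patched)
```

Applied to this notebook, the call would be `insert_line_once("source/hpx_unet_190116.py", 22, newstr)`, with the line number taken from the `sed` command above.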



---

---


## **5) Download results**



Run the following cell to automatically generate and download a `.zip` archive containing the three predicted classes as `.mrc` files. Please be patient, this may take a while.

Alternatively, you can open the **Files** pane on the top left (click on the folder symbol), and download the `.mrc` files manually by **right-clicking** on them and selecting **Download**.

In [0]:
#@title *Run this cell to download results as `.zip` archive*
# collect every .mrc file in the haruspex directory into one archive
!zip -r haruspex_result.zip /content/haruspex/*.mrc
from google.colab import files
files.download("haruspex_result.zip")
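
The archive can also be built without the shell, using Python's standard `zipfile` module. The sketch below assumes that all predicted maps are `.mrc` files located directly in the given directory. `zip_mrc_files` is a hypothetical helper name.

```python
import glob
import os
import zipfile

def zip_mrc_files(directory, archive_path):
    """Write every .mrc file found directly in `directory` into `archive_path`.

    Files are stored under their base names. Returns the list of names added."""
    names = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(glob.glob(os.path.join(directory, "*.mrc"))):
            name = os.path.basename(path)
            archive.write(path, arcname=name)
            names.append(name)
    return names

# Only build the archive if the notebook's working directory exists.
if os.path.isdir("/content/haruspex"):
    print(zip_mrc_files("/content/haruspex", "haruspex_result.zip"))
```

The returned list shows which files ended up in the archive. An input map uploaded as `.mrc` into the same directory would be included as well.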