# Netsurf
Here I document the steps for obtaining the netsurfp2 predictions for the gray2018 sequences.

## Installation of netsurf
Since netsurf needs hhblits and its huge database, it is too heavy to run on a personal machine. There is a web server, but to have more control I use my own installation. A tar.gz can be obtained from the netsurf website after creating an account. The software needs tensorflow 1.14 (the requirement is stated as 1.14+, but it does not work with 1.15). The easiest way for me is to pull the tensorflow docker image and run it with singularity.

```
singularity run docker://tensorflow/tensorflow:1.14.0-py3
```

From inside the singularity image I create a python virtual environment and activate it:

```
pip install virtualenv
python -m virtualenv -p3 netsurfp2_env
source netsurfp2_env/bin/activate
```

Now I modify the setup.py file in the unpacked netsurf folder so that tensorflow is installed with version 1.14 (I just change >= to ==).
Then I install netsurf with pip (I am still inside the singularity image, with the new env activated).


```
pip install ./netsurfp-2.0
```
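
The setup.py change described above (turning `>=` into `==` for tensorflow) can also be scripted. The following is a minimal sketch, assuming the requirement appears in setup.py as a string such as `tensorflow>=1.14`; the exact wording in the file and the helper name `pin_tensorflow` are assumptions, so check the result before installing.

```python
import re


def pin_tensorflow(setup_text):
    """Change 'tensorflow>=' to 'tensorflow==' and leave everything else alone."""
    # ASSUMPTION: the requirement is written as 'tensorflow>=<version>'.
    return re.sub(r"(\btensorflow\s*)>=", r"\1==", setup_text)


# Hypothetical excerpt of a setup.py, used only to illustrate the substitution.
example = "install_requires=['numpy>=1.0', 'tensorflow>=1.14']"
print(pin_tensorflow(example))
# install_requires=['numpy>=1.0', 'tensorflow==1.14']
```

To apply it to the real file, read `netsurfp-2.0/setup.py`, pass its text through `pin_tensorflow`, and write the result back before running `pip install`.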

Now, when I want to use netsurf, I just need to start the singularity image and then activate the venv:

```
singularity run docker://tensorflow/tensorflow:1.14.0-py3
source netsurfp2_env/bin/activate
```

## Running on a single protein

I always run netsurf in hhblits mode and not in MMseqs2 mode.

```
netsurfp2 --npz <npz_output> --csv <csv_output> --hhdb <hhblits_database> hhblits netsurfp-2.0/models/hhsuite.pb <input_seq> <output_dir_for_hhblits>
```
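
To repeat the same call for several proteins, the command can be assembled in Python and passed to `subprocess`. This is only a sketch: it reuses the flags and argument order of the command above, while the function names, the `<name>_netsurf.*` output naming, and every path in the example are assumptions.

```python
import subprocess
from pathlib import Path


def build_netsurfp2_cmd(fasta, out_dir, hhdb,
                        model="netsurfp-2.0/models/hhsuite.pb"):
    """Assemble the hhblits-mode netsurfp2 call for one FASTA file."""
    fasta = Path(fasta)
    out_dir = Path(out_dir)
    stem = fasta.stem  # e.g. 'P00552' for 'P00552.fasta'
    return [
        "netsurfp2",
        "--npz", str(out_dir / f"{stem}_netsurf.npz"),
        "--csv", str(out_dir / f"{stem}_netsurf.csv"),
        "--hhdb", str(hhdb),
        "hhblits", str(model),
        str(fasta),
        str(out_dir / f"{stem}_hhblits"),  # output directory for hhblits
    ]


def run_all(fasta_files, out_dir, hhdb):
    """Run netsurfp2 once per FASTA file, stopping at the first failure."""
    for fasta in fasta_files:
        subprocess.run(build_netsurfp2_cmd(fasta, out_dir, hhdb), check=True)


# Hypothetical paths, printed only to show the resulting command line.
print(" ".join(build_netsurfp2_cmd("seqs/P00552.fasta", "out", "/db/hhblits_db")))
```

Building the command as a list avoids shell quoting problems with paths that contain spaces.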

## Inspection of the output

For convenience I save both the csv and the npz output.

```
import numpy as np

vecs = np.load('../processing/gray2018/netsurfp2/P00552_netsurf.npz')
for key in vecs:
    print(key, vecs[key].shape)
```

```
rsa (1, 264, 1)
asa (1, 264, 1)
phi (1, 264, 2)
psi (1, 264, 2)
disorder (1, 264, 2)
q3 (1, 264, 3)
q8 (1, 264, 8)
```
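
Every array has a leading axis of size 1 followed by the sequence length (264 here). As a sketch of how the `q3` array could be turned into a secondary-structure string, the example below takes the most probable class per residue. The class order H, E, C is an assumption taken from the order of the `p[q3_H]`, `p[q3_E]`, `p[q3_C]` columns in the csv and should be verified against the csv before relying on it; the function name is hypothetical.

```python
import numpy as np

# ASSUMPTION: the three q3 channels are ordered H, E, C, like the csv columns.
Q3_LABELS = "HEC"


def q3_string(q3, labels=Q3_LABELS):
    """Turn a (1, L, 3) array of q3 scores into an L-character string."""
    per_residue = np.asarray(q3)[0]      # drop the leading axis -> (L, 3)
    best = per_residue.argmax(axis=-1)   # index of the highest score per residue
    return "".join(labels[i] for i in best)


# Synthetic stand-in for vecs['q3'], three residues long.
demo = np.array([[[0.1, 0.1, 0.8],
                  [0.7, 0.2, 0.1],
                  [0.2, 0.6, 0.2]]])
print(q3_string(demo))  # CHE
```

With the real file, `q3_string(vecs['q3'])` can be compared with the `q3` column of the csv to confirm the assumed class order.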


```
import pandas as pd

pd.read_csv('../processing/gray2018/netsurfp2/P00552_netsurf.csv')
```

| Unnamed: 0 | id | seq | n | rsa | asa | q3 | p[q3_H] | p[q3_E] | p[q3_C] | q8 | ... | p[q8_H] | p[q8_I] | p[q8_B] | p[q8_E] | p[q8_S] | p[q8_T] | p[q8_C] | phi | psi | disorder |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | sp\|P00552\|KKA2_KLEPN | M | 1 | 0.822529 | 164.588128 | C | 0.000160 | 0.000145 | 0.999696 | C | ... | 0.000058 | 1.001165e-07 | 0.000102 | 0.000058 | 0.000181 | 0.000224 | 0.999350 | -87.118301 | 142.407318 | 0.943008 |
| 1 | sp\|P00552\|KKA2_KLEPN | I | 2 | 0.600963 | 111.178132 | C | 0.003962 | 0.022914 | 0.973124 | C | ... | 0.002222 | 1.387788e-05 | 0.006198 | 0.014861 | 0.005186 | 0.003495 | 0.967295 | -98.703560 | 130.798309 | 0.830139 |
| 2 | sp\|P00552\|KKA2_KLEPN | E | 3 | 0.735305 | 128.457765 | C | 0.005751 | 0.038289 | 0.955960 | C | ... | 0.002869 | 2.413134e-05 | 0.005109 | 0.025928 | 0.008985 | 0.006244 | 0.949729 | -100.015266 | 133.604431 | 0.831449 |
| 3 | sp\|P00552\|KKA2_KLEPN | Q | 4 | 0.596247 | 106.489802 | C | 0.015636 | 0.035524 | 0.948840 | C | ... | 0.007874 | 1.045063e-04 | 0.005960 | 0.027620 | 0.017990 | 0.023693 | 0.911902 | -96.384109 | 126.175262 | 0.790746 |
| 4 | sp\|P00552\|KKA2_KLEPN | D | 5 | 0.748247 | 107.822457 | C | 0.022436 | 0.014552 | 0.963013 | C | ... | 0.012470 | 1.152428e-04 | 0.003055 | 0.012657 | 0.043818 | 0.089318 | 0.827739 | -91.361374 | 61.306732 | 0.724121 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 259 | sp\|P00552\|KKA2_KLEPN | L | 260 | 0.025977 | 4.756476 | H | 0.932171 | 0.000387 | 0.067442 | H | ... | 0.862717 | 2.281900e-03 | 0.000193 | 0.000240 | 0.004548 | 0.048463 | 0.026031 | -66.348572 | -34.871666 | 0.002238 |
| 260 | sp\|P00552\|KKA2_KLEPN | D | 261 | 0.349241 | 50.325661 | H | 0.870714 | 0.001176 | 0.128111 | H | ... | 0.636169 | 1.810826e-03 | 0.000369 | 0.000721 | 0.006864 | 0.117017 | 0.030602 | -65.019424 | -31.914127 | 0.008015 |
| 261 | sp\|P00552\|KKA2_KLEPN | E | 262 | 0.536688 | 93.759315 | H | 0.746612 | 0.001392 | 0.251996 | H | ... | 0.543203 | 1.943279e-03 | 0.000493 | 0.000731 | 0.009041 | 0.225058 | 0.048271 | -69.802399 | -26.102558 | 0.009341 |
| 262 | sp\|P00552\|KKA2_KLEPN | F | 263 | 0.126805 | 25.449741 | H | 0.499056 | 0.002546 | 0.498399 | H | ... | 0.396480 | 3.465551e-03 | 0.001158 | 0.001273 | 0.034058 | 0.313743 | 0.134817 | -85.543221 | -14.046273 | 0.013830 |
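
As an example of using the per-residue columns, the sketch below lists residues whose predicted `rsa` exceeds a cutoff. It uses only the standard library so that it also runs in a bare environment. The cutoff of 0.25 and the function name are assumptions for illustration, and the sample text contains just a few columns and rows copied from the output above.

```python
import csv
import io


def exposed_residues(csv_file, rsa_cutoff=0.25):
    """Return (n, seq, q3) for residues whose predicted rsa exceeds the cutoff."""
    reader = csv.DictReader(csv_file)
    return [
        (int(row["n"]), row["seq"], row["q3"])
        for row in reader
        if float(row["rsa"]) > rsa_cutoff
    ]


# Reduced sample: a subset of the columns and rows shown above.
SAMPLE = """id,seq,n,rsa,q3
sp|P00552|KKA2_KLEPN,M,1,0.822529,C
sp|P00552|KKA2_KLEPN,I,2,0.600963,C
sp|P00552|KKA2_KLEPN,L,260,0.025977,H
sp|P00552|KKA2_KLEPN,F,263,0.126805,H
"""

print(exposed_residues(io.StringIO(SAMPLE)))
# [(1, 'M', 'C'), (2, 'I', 'C')]
```

For the real output, open the csv with `open(path, newline="")` and pass the file object to `exposed_residues`.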
