# Homology Modelling Tutorial
Astrid Brandner, Dheeraj Prakaash and Syma Khalid. Based on earlier work by Max Epstein, Bjarne Feddersen, Phil Biggin and Syma Khalid.

## Some notes about recent advancements in structure prediction

Homology modelling has been a very useful technique for many years for predicting the three-dimensional structures of proteins based on their sequence similarity/homology with proteins of known structure. In the last 2 years this field of bioinformatics has seen a major revolution with the advent of the AI-driven structure prediction tool AlphaFold2 (AF2).

AF2 has been shown to be very accurate for the prediction of single domain, single chain, globular proteins. It is currently being extended to improve its capabilities for dealing with membrane proteins, multidomain proteins and oligomeric assemblies. We do not have time to go into this in detail here, but you can read all about it here: https://alphafold.ebi.ac.uk . Indeed it now even has a facebook interface which you can have a play around with in your own time.


## Objectives

In this tutorial we will be building a homology model of the Yersinia pestis outer membrane protein, OmpA. There are quite a few homology modelling tutorials available online; most of them guide you through a very simple case of a single sequence with a known good alignment to a target with no gaps. Often, though, we are not so fortunate. This tutorial is designed to take you through a more typical scenario: generating an initial multiple sequence alignment to improve the pairwise alignment used for structure generation, dealing with missing parts of the template, generating multiple-domain models, refining loops and, finally, scoring and selecting a good model.

In general it is also worth thinking about some of the following points before one starts making models:


- In what state do we want to model our protein (e.g. open, closed or desensitized)?

- Does a structure of a phylogenetically related protein in that particular state exist?

- If so, are there multiple structures of this state?

- What structure gives the best sequence alignment?

- What if you have high sequence alignment but low resolution? Perhaps it's worth considering a different template...

- Are there any other reasons why you might not want to select a particular structure? For example, inaccurate assignment of EM density, or missing loops or regions.

We will use this Jupyter notebook to explore this. If you are following the OxCompBio course, then an introduction to Python and Jupyter notebooks is the first session.

The idea of this tutorial is that you should be able to follow it through at your own pace. If you are on a timetabled course, then there will be some demonstrator help available, but this tutorial has been designed such that you should be able to complete it without help, even if you get stuck.

**BEFORE YOU START** - Please check that all the programs are installed locally on the desktop you are using.

Open the linux Virtual Machine and search for a terminal. In the terminal type:
- `% jalview --version`
- `% pymol -c`
- `% mod9.23` 

Please raise your hand if some command does not work and let a demonstrator help you.
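
The same check can be sketched from inside the notebook. This is a minimal example that only tests whether the executables named above can be found on the PATH; it does not run them, and the name `mod9.23` is taken from the command above and may differ for your Modeller version.

```python
import shutil


def find_missing_tools(tool_names):
    """Return the tools from tool_names that are not found on the PATH."""
    return [name for name in tool_names if shutil.which(name) is None]


# Executable names as used in this tutorial; adjust them to your installation.
missing = find_missing_tools(["jalview", "pymol", "mod9.23"])
print("Missing tools:", missing if missing else "none")
```

An empty list means every program was found; anything listed should be reported to a demonstrator.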

### 1. Perform a multiple sequence alignment

- Make a directory entitled 'OmpA_model' where you can save outputs to.  This will be relative to the location of this notebook.

In [2]:
! mkdir OmpA_model

#### Get Sequences

- Go to [uniprot.org](https://www.uniprot.org/) to search for the sequences of interest.

- The gene encoding OmpA is 'ompa', so type that in to the search bar and look for OMPA_YERPE.

- This is the OmpA protein of the bacterium Yersinia pestis

- Click on the entry number Q8ZG77 and locate the FASTA sequence.

- The full-length protein has two domains. The N-terminal domain is the outer-membrane-spanning domain; it is connected via a linker to the soluble C-terminal domain, which is located in the periplasm.
- Within the sequence the first 21 residues are the signal peptide, so we can discard these residues.

- We will first build a model of the transmembrane region.

- The transmembrane region starts at residues 'APKD' and ends at the residues: 'SYRFG'

- Save the FASTA file of only the transmembrane residues

- Save the FASTA file of the C-terminal domain residues (starting from GQED)

- On both FASTA files include the first line which contains important metadata


- We'll now do a blast search to look for a suitable template sequence for modelling the membrane spanning domain. This will be a sequence for which a 3D structure exists that we can use as a template on which to base our homology model.

- Go to the [Blast](https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins) website

- In the 'choose Search set', change the Database to PDB.

- Copy and paste the sequence of the full length OmpA into the ENTER QUERY SEQUENCE box and click BLAST.

- 1BXW is a structure of the OmpA transmembrane domain from E. coli (look in the accession code column)

- 2K0L is a structure of the OmpA transmembrane domain from K. pneumoniae

- What is the % sequence identity of both of these with the Y. pestis protein? 

- Now locate the FASTA sequences of these proteins in the PDB database 

- Finally, save these two sequences (the E. coli and K. pneumoniae transmembrane domains) as FASTA files (the transmembrane domains start and end with the same residues as the Y. pestis OmpA). For each protein, save the transmembrane domain and the C-terminal domain residues as separate files.
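
Cutting a domain out of a full-length sequence by its flanking motifs, as described in the steps above, can be sketched in Python. The helper names and the toy sequence are illustrative assumptions (the toy sequence is NOT the real OmpA sequence); only the motifs 'APKD' and 'SYRFG' come from this tutorial.

```python
def slice_between(sequence, start_motif, end_motif):
    """Return the part of sequence from start_motif up to and including end_motif."""
    start = sequence.index(start_motif)
    end = sequence.index(end_motif, start) + len(end_motif)
    return sequence[start:end]


def to_fasta(header, sequence, width=60):
    """Format a header and a sequence as FASTA text with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


# Toy sequence for illustration only; paste the real UniProt sequence instead.
toy = "MKKTAIAAPKDNTWAAAASYRFGQEDAAAA"
tm = slice_between(toy, "APKD", "SYRFG")
print(to_fasta("toy_TM", tm))
```

`str.index` raises a `ValueError` if a motif is absent, which is a useful early warning that the wrong sequence was pasted.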

#### Generate the alignment

- Once you have both sequences saved, go to the [MUSCLE](https://www.ebi.ac.uk/Tools/msa/muscle) multiple sequence alignment website:

- Copy and paste in all three FASTA sequences. Remember to include the first line of each respective FASTA sequence that contains the metadata. If you want to include your email address to send the resulting file to you can, although this is not necessary.

- When the alignment is ready, right click on the 'Download Alignment File' tab and click 'Save link as'.

- Save the resulting .clw file as 'OmpA_alignment_TM.clw' to your 'OmpA_model' directory.

- Do the same for the C-terminal domain FASTA sequences and save the resulting .clw file as 'OmpA_alignment_Cter.clw' to your OmpA_model directory

Next we will visualize the alignments...

#### Visualize the alignment

- Open up Jalview (if you are on Windows you will have to click on the desktop icon) and view each alignment file in turn.

In [None]:
! jalview # linux/mac - windows users hit the desktop icon

- Close all the individual pop-up windows first and load the alignment saved in the previous step (OmpA_alignment_TM.clw). To load it: File -> Input Alignment -> From file -> OmpA_alignment_TM.clw

- Click on the 'Colour' tab and select 'Clustalx'. This will visualise your alignment by amino acid properties. 

- Scroll through your alignment. If there are any insertions or deletions ('indels'), denoted by dashes, in regions of high sequence identity, then you should scrutinize your alignment carefully and, if needed, move them when generating your final alignment. Indels commonly show up in regions of low sequence identity where flexible loops exist. These are usually less problematic but should be analysed before model generation.

- Jalview can also interface to a secondary structure prediction.

- Click on the 'Web Service' tab and then find 'JPred secondary structure prediction'.
You should be able to see red lines where alpha-helices have been predicted and green lines for beta-sheets.

- Congratulations- you've completed the MSA part of this tutorial!   

- Before you quit Jalview, save the alignment in PIR format, which is easier to manipulate in Modeller. Save it as OmpA_alignment_TM.pir in your OmpA_model directory.

- Do the same for OmpA_alignment_Cter.clw - save as: OmpA_alignment_Cter.pir
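
The visual checks above (how similar two aligned sequences are, and where the indels sit) can also be sketched numerically. This is a minimal example on two toy aligned rows, not on the real alignment. Note that percent identity has several definitions; here it is computed over columns where neither row has a gap, which is an assumption and will not necessarily match the value reported by BLAST.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def gap_runs(aligned):
    """Return (start, end) column indices (0-based, end exclusive) of each run of '-'."""
    runs, start = [], None
    for i, ch in enumerate(aligned):
        if ch == "-" and start is None:
            start = i
        elif ch != "-" and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(aligned)))
    return runs


# Toy aligned rows for illustration only.
row_a = "APKD-NTW"
row_b = "APRDGNTW"
print(round(percent_identity(row_a, row_b), 1), gap_runs(row_a))
```

Gap runs that fall inside stretches of otherwise identical columns are the ones worth a closer look in Jalview.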



### 2.  Build a model of the transmembrane (N-ter) domain of OmpA first

- For multidomain proteins like the OmpA, it may help to just try and build a model of a single domain in the first instance.  

- Homology modelling at its simplest just needs a template and the sequence to be modelled. We could have made an alignment with OmpA from a single bacterium. However, if you have other sequences from related proteins, the quality of the alignment is usually much better, which is why we did the multiple sequence alignment above.

- Ok let's focus on the transmembrane domain first.


- Which structure should we use as the template for our homology model of the Y. pestis OmpA transmembrane domain? Remember, we have two options: (1) the structure from E. coli or (2) the structure from K. pneumoniae. Let's take a look at both structures. Go to [rcsb](http://www.rcsb.org) and search for 1bxw and 2k0l.
- Visualise both structures in Pymol.

In [None]:
! pymol ~/Downloads/2k0l.pdb ~/Downloads/1bxw.pdb

- 2k0l is an ensemble of 20 structures determined by NMR. You can cycle through the structures using the controls on the right hand side
- You will see that some of the beta strands that make up the barrel are splayed somewhat and in some cases the periplasmic loops/turns are located in the region where the membrane would be.
- This is not the case for 1bxw, which is an X-ray structure.
- Therefore, despite the K. pneumoniae OmpA having slightly higher sequence identity with the Y. pestis protein, the structure of the E. coli transmembrane domain is probably more representative of the protein in a membrane environment, and thus this (1bxw) is the one we will use.


To begin making a homology model, we first need an alignment file that we can use with the homology modelling package, Modeller. First, copy your .pir file into a new file with the extension .ali inside OmpA_model:

In [None]:
! cp OmpA_model/OmpA_alignment_TM.pir OmpA_model/pairwise_alignment_TM.ali

- Although we copied the file and called it pairwise_alignment_TM.ali, its contents are not yet in the format that Modeller expects.


- You should have a file with the entries for the Y. pestis, E. coli and the K. pneumoniae 
- Delete the entry for the K. pneumoniae protein
    
- You will also need to replace the line starting with SP under the OMPA_ECOLI entry with the following string :- 

    `structureX:OmpATM.pdb:START:A:END:A::: 0.00: 0.00`
  
  
- Similarly replace the line starting with SP under the OMPA_YERPE with this:-

    `sequence:seq_YPTM::::::: 0.00: 0.00`
  
  
- These lines contain information about whether the sequence is from the template (structure) or is the sequence we want to model. Colon (:) characters separate the fields (full details can be found [here](https://salilab.org/modeller/8v2/manual/node176.html)). The key ones for us to worry about today are fields 3, 4, 5 and 6 for the template sequence: these are the starting residue, the chain letter (A in our case), the last residue and its chain letter (again just A for us). At the moment we have placeholders (START, END) for the residue numbers, which we have to replace later. You might think we already know the first and last residue of the template from the UniProt sequence but, as we will see, this often does not work directly because the crystal structure has some bits missing.

    
- Take time with this step as it's the most important. Mistakes made at this point carry through to the final model.
    
- If you are worried about this (or want to cheat!) you can look at the format of one we prepared earlier in the backup directory (note that this does not quite match; we will take you through how to work out how long the alignment is in the section below, i.e. how to work out what to put in place of END).
    
- We are going to have to edit this file further later.
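
To make the field layout described above concrete, here is a small sketch that builds and parses the ten colon-separated fields of a PIR header line. The helper names are illustrative assumptions and are not part of Modeller; only the first six fields are labelled because those are the ones used in this tutorial.

```python
def pir_header(kind, code, first_res="", first_chain="", last_res="", last_chain=""):
    """Build the second line of a PIR entry: ten fields separated by nine colons."""
    fields = [kind, code, str(first_res), first_chain, str(last_res), last_chain,
              "", "", " 0.00", " 0.00"]
    return ":".join(fields)


def parse_pir_header(line):
    """Split a PIR header line into ten fields and label the six used here."""
    fields = line.strip().split(":")
    if len(fields) != 10:
        raise ValueError("expected 10 colon-separated fields, got %d" % len(fields))
    keys = ["kind", "code", "first_res", "first_chain", "last_res", "last_chain"]
    return dict(zip(keys, (f.strip() for f in fields)))


# Reproduces the two header lines given in the steps above.
print(pir_header("structureX", "OmpATM.pdb", "START", "A", "END", "A"))
print(pir_header("sequence", "seq_YPTM"))
```

Parsing your edited header with `parse_pir_header` is a quick way to catch a missing or extra colon before Modeller does.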

In [None]:
! vim OmpA_model/pairwise_alignment_TM.ali # linux/mac

In [None]:
! notepad OmpA_model/pairwise_alignment_TM.ali # windows

In [None]:
! cat backup/single_subunit/alpha_pairwise_alignment_TM.ali

- Once you are satisfied with your alignment you can focus on generating a 3D model. To do this we need to process the 1bxw.pdb file a bit. Firstly open the file with Pymol




In [None]:
! pymol OmpA_model/1bxw.pdb

- Within pymol, type:

`select A, chain A and not HETATM`


- Then on the right menu click 'A' and then 'copy to object'.

- Save the resulting obj01 as OmpATM.pdb (careful to select .pdb output)

- You can also use the following command to get the sequence to cut and paste

`print(cmd.get_fastastr('obj01'))`  

- You can cut and paste that (just the sequence, not the line beginning with >) to a file; let's call that OmpA_structure_sequence.fasta. Before you quit PyMOL, take a look at just obj01 (chain A of the protein) to check that there are no missing sections (breaks) in the structure.

- When protein structures are solved experimentally (e.g. using X-ray crystallography or cryo-electron microscopy), certain flexible motifs or loops are often removed, because without doing this structure determination may be difficult or even impossible. As such, the UniProt sequence that you have for the E. coli OmpA transmembrane domain may not precisely match the sequence of the X-ray structure of OmpA that we are using as a template. Do you see a difference between the UniProt sequence and the PDB one for E. coli OmpA?

- Remember that for Y. pestis we have only 171 residues, as we will skip the initial 'M' that is present in the E. coli structure.

In [None]:
 ! tr -d '\n\r' < OmpA_model/OmpA_structure_sequence.fasta |wc -m
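
As a cross-check on the sequence copied from PyMOL, the residues actually present in a structure can be read from the CA atoms of the ATOM records. The sketch below assumes standard fixed-column PDB formatting and the 20 standard amino acids; `atom_line` is a hypothetical helper used only to build toy records, so that the example runs without a structure file.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def atom_line(serial, name, resname, chain, resseq):
    """Build a toy fixed-column PDB ATOM record (all coordinates zero)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, resname, chain, resseq, 0.0, 0.0, 0.0, 1.0, 0.0)


def sequence_from_pdb(pdb_text, chain="A"):
    """One-letter sequence of a chain, read from the CA atoms of ATOM records."""
    residues, seen = [], set()
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA" or line[21] != chain:
            continue
        res_id = line[22:27]  # residue number plus insertion code
        if res_id in seen:    # skip alternate locations of the same residue
            continue
        seen.add(res_id)
        residues.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(residues)


# Toy records for illustration; read OmpATM.pdb from disk for the real case.
toy_pdb = "\n".join([
    atom_line(1, " N  ", "ALA", "A", 1),
    atom_line(2, " CA ", "ALA", "A", 1),
    atom_line(3, " CA ", "PRO", "A", 2),
    atom_line(4, " CA ", "LYS", "B", 1),
])
seq = sequence_from_pdb(toy_pdb)
print(seq, len(seq))
```

The length printed here is the number to compare with the residue count of the template row in your alignment.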

- If you haven't done so, make those edits to pairwise_alignment_TM.ali now using your favourite editor or vim

In [None]:
! vim OmpA_model/pairwise_alignment_TM.ali # linux/mac

In [None]:
! notepad OmpA_model/pairwise_alignment_TM.ali # windows

### Your file should look something like this:

<br>>P1;OmpATM<br>
structureX:OmpATM:0:A:172:A::: 0.00: 0.00 <br>
MAPKDNTWYTGAKLGWSQYHDTGLI-----NNNGPTHENKLGAGAFGGYQVNPYVGFEMGYDWLGRMPYKGSV<br>
ENGAYKAQGVQLTAKLGYPITDDLDIYTRLGGMVWRADTYSNVYG-----KNHDTGVSPVFAGGVEYAITPE<br>
IATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFG*<br>


<br>>P1;seq_YPTM<br>
sequence:seq_YPTM:0::171:::: 0.00: 0.00<br>
-APKDNTWYTGGKLGWSQYQDTGSI----INNDGPTHKDQLGAGAFFGYQANQYLGFEMGYDWLGRMPYKGDI<br>
NNGAFKAQGVQLAAKLSYPVAQDLDVYTRLGGLVWRADAKGSFDGGLDRASGHDTGVSPLVALGAEYAWTKN<br>
WATRMEYQWVNNIGDRETVGARPDNGLLSVGVSYRFG*<br>
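
A few basic consistency checks on a file like the one above can be sketched in Python before running Modeller: every entry should end with `*`, all aligned rows should have the same length (gaps included), and each header should contain nine colons. The parser below is a minimal illustration written for this tutorial, not Modeller's own reader, and the toy entries are hypothetical.

```python
def read_pir(text):
    """Parse PIR text into {code: (header_line, aligned_sequence_without_star)}."""
    entries = {}
    code, header, seq_lines = None, None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">P1;"):
            code, header, seq_lines = line[4:], None, []
        elif code is not None and header is None:
            header = line
        elif code is not None:
            seq_lines.append(line)
            if line.endswith("*"):
                entries[code] = (header, "".join(seq_lines)[:-1])
                code = None
    if code is not None:
        raise ValueError("entry %s has no terminating '*'" % code)
    return entries


def check_alignment(entries):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    lengths = {code: len(seq) for code, (_, seq) in entries.items()}
    if len(set(lengths.values())) > 1:
        problems.append("aligned lengths differ: %s" % lengths)
    for code, (header, _) in entries.items():
        if header.count(":") != 9:
            problems.append("%s: header should contain 9 colons" % code)
    return problems


# Toy two-entry alignment for illustration only.
toy_pir = """>P1;templ
structureX:templ:1:A:4:A::: 0.00: 0.00
APK-D*

>P1;target
sequence:target::::::: 0.00: 0.00
APKGD*
"""
print(check_alignment(read_pir(toy_pir)))
```

These checks do not judge the quality of the alignment; they only catch formatting slips of the kind that make Modeller stop with an error.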



- Now we should be ready to run modeller.

In [None]:
from modeller import *
from modeller.automodel import *    # Load the automodel class

log.verbose()
env = environ()

# directories for input atom files
env.io.atom_files_directory = ['.', './OmpA_model']

a = automodel(env,
              alnfile  = 'OmpA_model/pairwise_alignment_TM.ali',     # alignment filename
              knowns   = 'OmpATM',              # codes of the templates
              sequence = 'seq_YPTM')              # code of the target - we can call this what we like

a.starting_model= 1                 # index of the first model
a.ending_model  = 1                 # index of the last model; since start and end are both 1,
                                    # only one model will be generated
a.make()                            # do comparative modeling


- You can now fire up Pymol to check the structure looks OK - visual inspection is incredibly important!

In [None]:
!pymol seq_YPTM.B99990001.pdb

That concludes this part of the tutorial.  

### 3.  Build a model of the full length OmpA protein

- Next we will investigate how to make a model of the complete protein.  We will retain the N-ter transmembrane domain we have made already using the E. coli structure as a template.

- We will do this in two stages. Firstly we will create a model of the C-terminal domain, and then secondly we will link the transmembrane domain with the C-terminal via an overlapping residue (Gly).

- First let's create a new directory to do this in; let's call it Cter and put it inside OmpA_model

In [None]:
! mkdir -p OmpA_model/Cter

### The C-terminal domain ###
 
- Now let's look for a suitable structure to use as a template.

- If you have already saved the sequence of the C-ter domain of Y.pestis OmpA in the previous part as suggested, go straight to the [Blast](https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins) website. 
- If not, go back to UniProt to retrieve the sequence of Y. pestis OmpA (Q8ZG77) and download the FASTA sequence. Edit the file by discarding all the residues from the beginning up to and including the 'SYRF' motif. Now you are ready to go to [Blast](https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins).

- In the 'choose Search set', change the Database to PDB.

- Copy and paste the sequence of the C-ter Y. pestis OmpA into the ENTER QUERY SEQUENCE box and click BLAST.

- You will see that there is a structure of the OmpA C-ter from K. pneumoniae with accession code 5NHX.

- Download this structure from [rcsb](http://www.rcsb.org)

- You may note that there is also an ensemble of structures from E. coli, but here we will stick with the K. pneumoniae structure.


- As previously, use PyMOL to ensure we only have protein residues of chain A in the pdb file.
- Save the resultant file as OmpACter.pdb

- Let's take a look at the sequence of this structure (use Pymol as previously)

- Are there any residues missing? Are there any residues with alternative conformations?

- Within pymol, type:

`select A, chain A and not HETATM`

- Then on the right menu click 'A' and then 'copy to object'.

- Use the command from the previous section to get the sequence to cut and paste

`print(cmd.get_fastastr('obj01'))`  

- Does this sequence length match the one you downloaded from the PDB database? You might have seen that there are alternative conformations for certain residues. We will need to fix this before going further with the modelling.

- Still within pymol, type:

`remove not (alt ''+A)`

- Print out the sequence after getting rid of alternative conformation 'B':

`print(cmd.get_fastastr('obj01'))`  

- You can cut and paste that (just the sequence, not the line beginning with >) to a file; let's call that OmpA_structureCT_sequence.fasta. Before you quit PyMOL, take a look at just obj01 (chain A of the protein with only one alternate conformation) to check that there are no missing sections (breaks) in the structure. Save the modified structure (obj01) as a pdb file, as you did in the previous section.
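
The effect of removing alternative conformations can also be sketched outside PyMOL. In a fixed-column PDB file the alternate location flag sits in column 17, so keeping only atoms whose flag is blank or 'A' mirrors the selection used above. This is a minimal illustration on toy records; `atom_line` is a hypothetical helper, not part of any library.

```python
def atom_line(serial, name, altloc, resname, chain, resseq):
    """Build a toy fixed-column PDB ATOM record (all coordinates zero)."""
    return "ATOM  %5d %-4s%1s%3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, altloc, resname, chain, resseq, 0.0, 0.0, 0.0, 1.0, 0.0)


def keep_altloc(pdb_text, keep="A"):
    """Keep atoms whose altLoc flag (column 17) is blank or equal to keep."""
    kept = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and line[16] not in (" ", keep):
            continue  # drop other alternate locations such as 'B'
        kept.append(line)
    return "\n".join(kept) + "\n"


# Toy residue with one ordinary atom and one atom modelled in two conformations.
toy_altloc = "\n".join([
    atom_line(1, " CA ", " ", "SER", "A", 5),
    atom_line(2, " CB ", "A", "SER", "A", 5),
    atom_line(3, " CB ", "B", "SER", "A", 5),
])
cleaned = keep_altloc(toy_altloc)
print(cleaned)
```

Unlike some tools, this sketch leaves the 'A' flag in place on the atoms it keeps.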
 

- Now let's follow the same procedure as previously:
- Copy the OmpA_alignment_Cter.pir file to pairwise_alignment_Cter.ali

In [None]:
! cp OmpA_model/OmpA_alignment_Cter.pir OmpA_model/Cter/pairwise_alignment_Cter.ali

- From the pairwise_alignment_Cter.ali file, remove the entry for the E.coli protein
- Further modify the alignment file by altering the starting lines for the structure and sequence entries as previously and work out the number of residues, to add the start and end residue numbers. 
- Remember we are adding an extra G at the beginning of the C-ter sequence to overlap later with the transmembrane structure.
- If you are getting stuck with this part, please ask a demonstrator for help.

- Modify the Modeller script below to accommodate your naming if needed.

In [None]:
from modeller import *
from modeller.automodel import *    # Load the automodel class

log.verbose()
env = environ()

# directories for input atom files
env.io.atom_files_directory = ['.']

a = automodel(env,
              alnfile  = 'pairwise_alignment_Cter.ali',     # alignment filename
              knowns   = 'OmpACter', # codes of the templates - ie your "cleaned up" pdb above
              sequence = 'seq_YPC',  # code of the target 
              assess_methods=(assess.DOPE,
                              assess.GA341))  # score each model with DOPE and GA341

a.starting_model= 1                 # index of the first model
a.ending_model  = 10                # index of the last model
                                    # (determines how many models to calculate)
a.make()                            # do comparative modeling
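
Because ten models are built and scored, it helps to rank them. After `a.make()`, the `automodel` object keeps a list `a.outputs` with one dictionary per model, including the keys `'name'`, `'failure'` and, when DOPE assessment is requested, `'DOPE score'`. The sketch below ranks such a list; the values shown are hypothetical stand-ins so that the example runs without Modeller.

```python
def rank_models(outputs, key="DOPE score"):
    """Return successfully built models sorted by key (lowest, i.e. best, first)."""
    ok_models = [m for m in outputs if m["failure"] is None]
    return sorted(ok_models, key=lambda m: m[key])


# Hypothetical values in the same shape as Modeller's a.outputs list.
toy_outputs = [
    {"name": "seq_YPC.B99990001.pdb", "failure": None, "DOPE score": -11500.0},
    {"name": "seq_YPC.B99990002.pdb", "failure": None, "DOPE score": -12100.0},
    {"name": "seq_YPC.B99990003.pdb", "failure": "hypothetical error", "DOPE score": 0.0},
]
for model in rank_models(toy_outputs):
    print(model["name"], model["DOPE score"])
```

In the notebook you would call `rank_models(a.outputs)` in the cell after the script above. DOPE scores are only comparable between models of the same sequence, and a low score does not replace visual inspection.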

- Let's take a look at the resulting structures in PyMOL

### 4. Building the full protein ###

Now we have two different models for the transmembrane domain and the C-ter domain. We will need to put them together in a single pdb structure.

- First, we will concatenate all the separate Modeller pdb files into one multi-pdb file. In the directory where you obtained the C-ter model type:

In [None]:
!cat *B99*pdb | sed 's/END/ENDMDL/g' > seq_YPC_models.pdb 
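
A Python variant of this concatenation is sketched below. Unlike the `sed` one-liner, it also writes a `MODEL` record before each structure, which is the usual layout of a multi-model PDB file; whether your viewer needs those records is not something this tutorial states, so treat it as an optional alternative. The toy input strings are placeholders for the contents of the individual model files.

```python
def combine_models(pdb_texts):
    """Wrap each PDB text in MODEL/ENDMDL records and join them into one file."""
    parts = []
    for number, text in enumerate(pdb_texts, start=1):
        # Drop blank lines and the per-file END record
        body = [ln for ln in text.splitlines() if ln.strip() and ln.split()[0] != "END"]
        parts.append("MODEL     %4d" % number)
        parts.extend(body)
        parts.append("ENDMDL")
    parts.append("END")
    return "\n".join(parts) + "\n"


# Toy stand-ins for the text of two model files.
toy_models = ["REMARK model one\nEND\n", "REMARK model two\nEND\n"]
combined = combine_models(toy_models)
print(combined)
```

For the real files you could read each path returned by `sorted(glob.glob("*B99*pdb"))` and write the combined text to seq_YPC_models.pdb.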

Now we will open all the C-ter models in PyMOL alongside the transmembrane domain model.

In [None]:
!pymol seq_YPC_models.pdb seq_YPTM.B99990001.pdb

- On your pymol terminal write:

`intra_fit seq_YPC_models and resi 1` 

`align resi 176 and seq_YPTM.B99990001, resi 1 and seq_YPC_models`

Have a look at the structures and save a frame that you think is a good structure (Hint: are there clashes? Is the C-ter domain positioned in a region that will be occupied by a membrane?)

- Save the C-ter structure from the frame you choose. File -> Save Molecule -> Select seq_YPC_models and the "object's current" option. Save it as fit_seqYPC.pdb.
- Do the same for the TM domain saving it as fit_seqYPTM.pdb and close pymol.
- On your terminal type:

In [None]:
!cat fit_seqYPTM.pdb  fit_seqYPC.pdb > fit_model.pdb

This command concatenates both pdb files. Now you need to edit the resulting file with the text editor of your choice:
- Look for the 'END' in the middle of the file and delete it, as well as the line containing OXT immediately above it.
- Now we need to make sure we have a single, correct GLY residue at the junction. Delete the N, CA, C and O atoms of the overlapping glycine from the C-ter model (i.e. the ones corresponding to residue "1").
- Save your modified file as prep_Full_OmpA_YP.pdb
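
The manual edits above can be sketched as a small Python function, which is handy if you want to repeat the merge for several frames. It assumes fixed-column PDB records, a single OXT atom in the TM file and the overlapping glycine numbered 1 in the C-ter file; `atom_line` is a hypothetical helper that builds toy records so the example runs on its own.

```python
BACKBONE = {"N", "CA", "C", "O"}


def atom_line(serial, name, resname, chain, resseq):
    """Build a toy fixed-column PDB ATOM record (all coordinates zero)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, resname, chain, resseq, 0.0, 0.0, 0.0, 1.0, 0.0)


def merge_domains(tm_text, cter_text, overlap_resseq=1):
    """Join two PDB texts, keeping the shared Gly only once.

    From the TM part, drop END and the terminal OXT atom; from the C-ter part,
    drop the N, CA, C and O atoms of the overlapping residue.
    """
    merged = []
    for line in tm_text.splitlines():
        if line.startswith("END"):
            continue
        if line.startswith("ATOM") and line[12:16].strip() == "OXT":
            continue
        merged.append(line)
    for line in cter_text.splitlines():
        if (line.startswith("ATOM") and int(line[22:26]) == overlap_resseq
                and line[12:16].strip() in BACKBONE):
            continue
        merged.append(line)
    return "\n".join(merged) + "\n"


# Toy inputs: the last Gly of the TM domain and the first two C-ter residues.
names = [" N  ", " CA ", " C  ", " O  "]
tm_toy = "\n".join([atom_line(i + 1, n, "GLY", "A", 171) for i, n in enumerate(names)]
                   + [atom_line(5, " OXT", "GLY", "A", 171), "END"]) + "\n"
cter_toy = "\n".join([atom_line(i + 1, n, "GLY", "A", 1) for i, n in enumerate(names)]
                     + [atom_line(5, " N  ", "GLN", "A", 2),
                        atom_line(6, " CA ", "GLN", "A", 2), "END"]) + "\n"
merged_toy = merge_domains(tm_toy, cter_toy)
print(merged_toy)
```

The residue and atom numbers in the merged text are still those of the two source files, so a renumbering step is needed afterwards.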

In order to visualize the structure properly, we will need to renumber all the atoms so that the pdb file has atoms and residues with consecutive numbers (we don't have missing residues anymore!). To do this, we will use a coordinate editor from the gromacs molecular dynamics package. On your terminal type:

In [None]:
! gmx editconf -f prep_Full_OmpA_YP.pdb -resnr 1 -o Full_OmpA_YP.pdb
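
If GROMACS is not available, consecutive renumbering can be sketched in a few lines of Python. This is an illustration of the idea rather than a drop-in replacement for `gmx editconf`: it only rewrites the atom serial (columns 7-11) and residue number (columns 23-26) of ATOM/HETATM records and assumes that consecutive residues never share the same name, chain and number. `atom_line` is a hypothetical helper for building toy records.

```python
def atom_line(serial, name, resname, chain, resseq):
    """Build a toy fixed-column PDB ATOM record (all coordinates zero)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, resname, chain, resseq, 0.0, 0.0, 0.0, 1.0, 0.0)


def renumber(pdb_text, first_residue=1):
    """Renumber atoms and residues consecutively, keeping fixed PDB columns."""
    out, serial, resnum, previous = [], 0, first_residue - 1, None
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            residue_key = line[17:27]  # resName, chain, resSeq, iCode
            if residue_key != previous:
                resnum += 1
                previous = residue_key
            serial += 1
            line = line[:6] + "%5d" % serial + line[11:22] + "%4d" % resnum + line[26:]
        out.append(line)
    return "\n".join(out) + "\n"


# Toy input mimicking the junction: residue 171 followed by residue 2.
toy_junction = "\n".join([
    atom_line(1300, " N  ", "GLY", "A", 171),
    atom_line(1301, " CA ", "GLY", "A", 171),
    atom_line(5, " N  ", "GLN", "A", 2),
    atom_line(6, " CA ", "GLN", "A", 2),
    "END",
])
renumbered = renumber(toy_junction)
print(renumbered)
```

Whichever tool you use, open the result in PyMOL and check that the sequence view shows a single unbroken chain.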

Now you have your final model: Full_OmpA_YP.pdb

### 5. Assessing the quality of your final model

- Create a new directory called quality_check and copy your final model with both domains into it.

- In order to have a compatible pdb file we need to make sure we have no atoms with occupancy 0. On your terminal, making sure that you are in the new directory, type: 

`sed -i  "s/\ \ 0\.00/\ \ 0\.10/" Full_OmpA_YP.pdb`
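
A pattern such as the one above matches the first `  0.00` on each line, wherever it occurs. A column-aware variant is sketched below: it only touches the occupancy field (columns 55-60 of ATOM/HETATM records in the fixed-column PDB format), so coordinates that happen to be 0.00 are left alone. The value 0.10 follows the command above; `atom_line` is a hypothetical helper for building a toy record.

```python
def atom_line(serial, name, resname, chain, resseq, occupancy):
    """Build a toy fixed-column PDB ATOM record (all coordinates zero)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, resname, chain, resseq, 0.0, 0.0, 0.0, occupancy, 0.0)


def fix_zero_occupancy(pdb_text, new_value=0.10):
    """Replace an occupancy of 0.00 (columns 55-60) with new_value."""
    out = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and line[54:60].strip() == "0.00":
            line = line[:54] + "%6.2f" % new_value + line[60:]
        out.append(line)
    return "\n".join(out) + "\n"


# Toy records: one atom with zero occupancy, one with full occupancy.
toy_occ = "\n".join([
    atom_line(1, " CA ", "GLY", "A", 1, 0.0),
    atom_line(2, " CA ", "ALA", "A", 2, 1.0),
])
fixed = fix_zero_occupancy(toy_occ)
print(fixed)
```

In the toy output the zero coordinates and the B-factor are unchanged; only the occupancy of the first atom becomes 0.10.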

- Next we will upload this model to the QMEAN server.

 - The QMEAN (Qualitative Model Energy ANalysis) scoring function derives both global (whole-model) and local (per-residue) quality estimates. For more information on the scoring function see: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2703985/

 - Go to the QMEAN server website via:

  https://swissmodel.expasy.org/qmean/

 - Select just the default option for quality assessment. (If we were assessing only the TM domain, you would be better off choosing 'QMEANBrane' from the three method options available.)

 - Note that you can give several models as an input and use the server to help you choose the best models. We will check our final full model but you are welcome to check e.g your 10 C-ter models after you are done with today's practical. Upload your structure by clicking the 'select coordinate file' option and click submit once these have been uploaded. Feel free to add your email if you'd like a copy of the results sent to you.

 - In the output you should see QMEAN scores for each model as well as a breakdown of the local quality.

 - By clicking Project Archive download we can obtain a .zip with a breakdown of the results.

 - For each model you should be able to scroll over the local quality score, which will simultaneously display the amino acid residues on a 3D projection of your model in the panel to the right. If you uploaded several pdb models, select the model with the highest QMEAN4 value for the next step, keeping the QMEAN tab open. In our case, we will just continue the analysis with our full-length model.
    
- We will now check the model's backbone torsions by generating a Ramachandran plot.

 - Go to the following website and upload your structure:

    http://molprobity.biochem.duke.edu/index.php 
    
 - Follow the website instructions to generate a Ramachandran analysis.

 - Are there any violations and if so, why might they be occurring there?
 - Open up your .pdb file in pymol or VMD to check them in the structure. 

 - Do these regions also correspond to low local quality scores in QMEAN?
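
A Ramachandran plot is a scatter of the backbone torsion angles phi (C of the previous residue, N, CA, C) and psi (N, CA, C, N of the next residue). The sketch below shows how a single torsion angle is computed from four atom positions in plain Python; it illustrates the geometry and is not the method used by MolProbity. The coordinates in the example are toy points, not atoms from the model.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees defined by four points (e.g. C-N-CA-C for phi)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of b0 and b2 perpendicular to the central bond b1
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy points: the last point is rotated a quarter turn about the central bond.
angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 1))
```

Applying this to each residue's backbone atoms gives the (phi, psi) pairs that the server plots against the allowed regions.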

### 6. Refining Loops


- Due to the uncertainty of how to model flexible loops, you may need to try a few different loop conformations before you obtain something that makes sense.

- Select your final model that you worked on previously.

- You do not need a .ali file here, just modify the Modeller script below.

- A modeller objective function will be displayed in the pdb files as before.

- Open all these structures in PyMOL to see the different loop conformation outputs. In this case we are refining the linker between the TM domain and the C-ter. It is technically not a loop, but it is a more meaningful region to refine in the context of this tutorial.

In [None]:
# Loop refinement of an existing model
from modeller import *
from modeller.automodel import *

log.verbose()
env = environ()

# directories for input atom files
env.io.atom_files_directory = ['.']  # needed if you put your coordinate files somewhere

# Create a new class based on 'loopmodel' so that we can redefine
# select_loop_atoms (necessary)
class MyLoop(loopmodel):
    # This routine picks the residues to be refined by loop modeling
    def select_loop_atoms(self):
        # Residue range to refine (here the linker between the two domains)
        return selection(self.residue_range('177:A', '195:A'))  # You will have to CHANGE THIS FOR YOUR MODEL -
                                                                # these refer to the Full_OmpA_YP.pdb file -
                                                                # so change for your template that you call below... 

m = MyLoop(env,
           inimodel='Full_OmpA_YP.pdb', # initial model of the target - CHANGE TO YOUR TEMPLATE PDB
           sequence='Full_OmpA_YP')          # code of the target

m.loop.starting_model= 1           # index of the first loop model 
m.loop.ending_model  = 10          # index of the last loop model
m.loop.md_level = refine.very_fast # loop refinement method; this yields
                                   # models quickly but of low quality;
                                   # use refine.slow for better models

m.make()


### 7. Compare your full model with the AlphaFold model

- Go to [uniprot.org](https://www.uniprot.org/) and search OMPA_YERPE.
- Scroll down to the Structure section and download the AlphaFold structure
- Open your model and the AlphaFold structure together in PyMOL
- Align them by typing the following command in PyMOL (adapt the molecules' names if needed to match the names in the object list displayed in the upper right corner):
`align Full_OmpA_YP, AF-Q8ZG77-F1-model_v4`
- What RMSD value did you get (displayed on pymol's terminal)? Where are the main differences in the structures?
- Can you think of a better strategy to structurally compare these 2 models?
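
For reference, the RMSD that PyMOL reports is the root-mean-square distance between matched atom pairs after superposition. The sketch below computes that quantity for two coordinate lists that are assumed to be already superposed and listed in the same atom order; it performs no fitting itself, and the coordinates are toy values.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum((xa - xb) ** 2
                for a, b in zip(coords_a, coords_b)
                for xa, xb in zip(a, b))
    return math.sqrt(total / len(coords_a))


# Toy coordinates: the second set is the first shifted by 1 along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(set_a, set_b))
```

Because the function works on any list of matched atoms, it can be applied to a subset of residues as easily as to the whole protein.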


### THE END. Congratulations, you are now ready to build your own homology models!

### APPENDIX I

**If you want to run the practical locally on your own computer later**

- Most of the software necessary can be installed very simply under the conda framework. Conda is available on Windows, Linux and Mac.

- It will be necessary to obtain the (free) license key for modeller (see https://salilab.org/modeller/ for details).

- This is the first step if you have not already installed conda (or miniconda) on your laptop - Rather than repeating the instructions here, simply head over to https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html and pick up the relevant instructions for your operating system.

- First create and activate an environment (you might have already done this if you did other parts of this course - in which case just activate it)


- `% conda create --name OxHomology`


- `% conda activate OxHomology`


- Next, install the following packages with conda

  `% conda config --env --add channels defaults` 
  
  `% conda config --env --add channels bioconda` 
  
  `% conda config --env --add channels conda-forge` 
  
  `% conda install notebook`

  `% conda install -c salilab modeller`    <-- Note - you will need to get a (free) license key - do this in advance of commencing the tutorial!
  
  `% conda install jalview`   <-- if this gives problems - go to installers at http://www.jalview.org/builds/nextrel/Web_Installers/install.htm
 
  `% conda install -c anaconda openjdk` 
  
  `% conda install -c schrodinger pymol-bundle`  <-- Note - this is the schrodinger version - a freeware version is available depending on your OS.
  
  `% conda install -c binstar unxutils`  <-- **windows users only**
  
  
- Windows users might like to install Notepad++, which has better manipulation features than the default Notepad (and will be needed for the last section)

- Don't forget to edit config.py (under your modeller-9.25/modlib/modeller/ directory) so that the license entry contains your license key.

### APPENDIX II

**Some basic Linux commands to use in the terminal**

To learn on your own: https://ubuntu.com/tutorials/command-line-for-beginners#1-overview

The most common commands:

ls      -->  lists files in current directory

pwd     -->  "print working directory", shows the absolute path of the directory you are currently in

cd dir1 -->  "change directory" to the directory called "dir1"

cd ..   -->  move up one directory  

cd ../../  --> move up 2 directories

cd      -->  change directory to your home (login) directory

mkdir new_folder --> creates folder called new_folder

rm file1 --> deletes file named "file1"

mv file1 file2 --> renames file1 to file2

cp ../file1 .  --> copies file1, located one directory above to your current directory


cp -rp ../dir1 . --> copies directory dir1, located one directory above to your current directory

