Skip to content

Commit

Permalink
Merge branch 'master' of github.com:hng/BiomolecularStructures
Browse files Browse the repository at this point in the history
  • Loading branch information
gp0 committed Mar 29, 2015
2 parents eb4bb5f + e243afc commit b95d5a6
Show file tree
Hide file tree
Showing 2 changed files with 135 additions and 14 deletions.
66 changes: 56 additions & 10 deletions docs/modeller.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ gen_modeller_script(name::String)

Generates Julia templates for MODELLER usage with Julia.

name: The name of the script (minus the extension). Possible values: "align2d", "build_profile", "compare", "evaluate_model", "model-single", "plot_profiles".
*name:* The name of the script (minus the extension). Possible values: "align2d", "build_profile", "compare", "evaluate_model", "model-single", "plot_profiles".

These scripts are based on the basic example scripts from the [tutorial](https://salilab.org/modeller/tutorial/basic.html) on the MODELLER website.
Scripts are generated in the current working directory. You can find all scripts that can be generated in `src/MODELLER/modeller-basic-example-julia`.
Expand All @@ -24,15 +24,25 @@ gen_modeller_script("build_profile")
This module also provides a few simple functions that provide common MODELLER tasks. These are again adapted from the MODELLER basic example. The given examples calls should work inside the modeller-basic-example directory.

```julia
build_profile(;seq_database_file::String = "", seq_database_format::String="PIR", alignment_file::String = "", alignment_format::String = "PIR", output_name::String = "build_profile", output_profile_format::String="TEXT", output_alignment_format::String="PIR")
build_profile(;seq_database_file::String = "", seq_database_format::String="PIR", sequence_file::String = "",
sequence_format::String = "PIR", output_name::String = "build_profile", output_profile_format::String="TEXT", output_alignment_format::String="PIR")
```

Note: this function is using keyword arguments
Searches a given sequence database for a given sequence (file) and writes hits to a profile and a alignment file with the given name/path and format.

**Example:**
Note: this function is using keyword arguments.

```julia
build_profile(seq_database_file="pdb_95.pir", alignment_file="TvLDH.ali")
compare(pdbs)
```
Prints out a table with the similarities between the given structures and a dendrogram.

*pdbs:* Array of pairs of pdb-files and chains that should be compared.

**Example:**
```julia
compare((("1b8p", "A"), ("1bdm", "A"), ("1civ", "A"),
("5mdh", "A"), ("7mdh", "A"), ("1smk", "A")))
```

```julia
Expand All @@ -48,11 +58,11 @@ align2d("1bdm.pdb", ("FIRST:A","LAST:A"), "1bdmA", "1bdm.pdb", "TvLDH.ali", "TvL
```julia
model_single(alnf::String, known_structure::String, seq::String)
```
alnf: path to alignment file
*alnf:* path to alignment file

known_structure: path or name of known pdb structure
*known_structure:* path or name of known pdb structure

seq: sequence
*seq:* sequence

**Example:**

Expand All @@ -63,16 +73,52 @@ model_single("TvLDH-1bdmA.ali", "1bdmA", "TvLDH")
```julia
evaluate_model(pdbfile::String, outputfile::String = "")
```
pdbfile: path to pdb file
*pdbfile:* path to pdb file

outputfile: optional path to output file. Defaults to pdbfile+".profile"
*outputfile:* optional path to output file. Defaults to pdbfile+".profile"

**Example:**

```julia
evaluate_model("TvLDH.B99990002.pdb", "TvLDH.profile")
```

```julia
plot_profiles(alignment_file::String, template_profile::String, template_sequence::String, model_profile::String, model_sequence::String, plot_file::String = "dope_profile.png")
```

Plots a two profiles with data from a corresponding alignment file (for both structures).

*alignment_file:* path to alignment file

*template_profile:* path to profile that was used as the template

*template_profile:* sequence name from *template_profile* that should be used

*model_profile:* path to model profile

*model_profile:* sequence name from *model_profile* that should be used

*plot_file:* (optional) path where created plot should be saved. Default: dope_profile.png

**Example:**

```julia
plot_profiles("TvLDH-1bdmA.ali", "1bdmA.profile", "1bdmA", "TvLDH.profile", "TvLDH")
```

## Usage / Usecase
This usecase is again based on the [tutorial](https://salilab.org/modeller/tutorial/basic.html) on the MOELLER website. You should download the corresponding files and call the julia functions with the julia interactive prompt inside the tutorial example folder.

```julia
julia> build_profile(seq_database_file="pdb_95.pir", sequence_file="TvLDH.ali")
```

Searches the sequence database ``pdb_95.pir`` for the sequence(s) in the sequence file ``TvLDH.ali``. Creates a profile (``build_profile.prf"``) and an alignment file with the database matches (``build_profile.ali``) in the current folder.

## Background
...

##References

* N. Eswar, M. A. Marti-Renom, B. Webb, M. S. Madhusudhan, D. Eramian, M. Shen, U. Pieper, A. Sali. Comparative Protein Structure Modeling With MODELLER. Current Protocols in Bioinformatics, John Wiley & Sons, Inc., Supplement 15, 5.6.1-5.6.30, 2006.
Expand Down
83 changes: 79 additions & 4 deletions src/MODELLER/modeller.jl
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,9 @@
"

module Modeller
export gen_modeller_script, build_profile, model_single, evaluate_model, align2d
export gen_modeller_script, build_profile, compare, model_single, evaluate_model, align2d, plot_profiles
using PyCall
using PyPlot

# Generator for MODELLER julia scripts.
# These scripts are based on the basic example scripts from the MODELLER website.
Expand All @@ -18,8 +19,8 @@ export gen_modeller_script, build_profile, model_single, evaluate_model, align2
end
end

function build_profile(;seq_database_file::String = "", seq_database_format::String="PIR", alignment_file::String = "", alignment_format::String = "PIR", output_name::String = "build_profile", output_profile_format::String="TEXT", output_alignment_format::String="PIR")
seq_database_file_out = string(split(seq_database_file, "."), ".bin")
function build_profile(;seq_database_file::String = "", seq_database_format::String="PIR", sequence_file::String = "", sequence_format::String = "PIR", output_name::String = "build_profile", output_profile_format::String="TEXT", output_alignment_format::String="PIR")
seq_database_file_out = string(seq_database_file, ".bin")

pyinitialize()
@pyimport modeller
Expand All @@ -45,7 +46,7 @@ export gen_modeller_script, build_profile, model_single, evaluate_model, align2

#-- Read in the target sequence/alignment
aln = modeller.alignment(env)
aln[:append](file=alignment_file, alignment_format=alignment_format, align_codes="ALL")
aln[:append](file=sequence_file, alignment_format=sequence_format, align_codes="ALL")

#-- Convert the input sequence/alignment into
# profile format
Expand All @@ -67,6 +68,22 @@ export gen_modeller_script, build_profile, model_single, evaluate_model, align2

end

function compare(pdbs)
@pyimport modeller

env = modeller.environ()
aln = modeller.alignment(env)
for (pdb, chain) in pdbs
m = modeller.model(env, file=pdb, model_segment=("FIRST:$chain", "LAST:$chain"))
aln[:append_model](m, atom_files=pdb, align_codes=string(pdb, chain))
end
aln[:malign]()
aln[:malign3d]()
aln[:compare_structures]()
aln[:id_table](matrix_file="family.mat")
env[:dendrogram](matrix_file="family.mat", cluster_cut=-1.0)
end

function model_single(alnf::String, known_structure::String, seq::String)

@pyimport modeller
Expand Down Expand Up @@ -122,4 +139,62 @@ export gen_modeller_script, build_profile, model_single, evaluate_model, align2
aln[:write](file=string(outputname, ".ali"), alignment_format="PIR")
aln[:write](file=string(outputname, ".pap"), alignment_format="PAP")
end

function plot_profiles(alignment_file::String, template_profile::String, template_sequence::String, model_profile::String, model_sequence::String, plot_file::String = "dope_profile.png")

@pyimport modeller

function r_enumerate(seq)
"""Enumerate a sequence in reverse order"""
# Note that we don"t use reversed() since Python 2.3 doesn"t have it
num = pybuiltin(:len)(seq) - 1
while num >= 0
produce((num, get(seq, num)))
num -= 1
end
end

function get_profile(profile_file, seq)
"""Read `profile_file` into a Python array, and add gaps corresponding to
the alignment sequence `seq`."""
# Read all non-comment and non-blank lines from the file:
f = open(profile_file)
vals = Any[]
for line in eachline(f)
if line[1] != '#' && length(line) > 10
spl = split(line)
push!(vals, float(last(spl)))
end
end
close(f)
# Insert gaps into the profile corresponding to those in seq:
taskAtr = @task r_enumerate(seq[:residues])
for x in taskAtr
n = x[1]
res = x[2]
for gap in range(1, res[:get_leading_gaps]())
# Julia arrays start from 1 so we have to add 1 to the position
insert!(vals, n+1, nothing)
end
end
# Add a gap at position "0", so that we effectively count from 1:
insert!(vals, 1, nothing)
return vals
end

e = modeller.environ()
a = modeller.alignment(e, file=alignment_file)

template = get_profile(template_profile, get(a, template_sequence))
model = get_profile(model_profile, get(a, model_sequence))

# Plot the template and model profiles in the same plot for comparison:
PyPlot.figure(1, figsize=(10,6))
PyPlot.xlabel("Alignment position")
PyPlot.ylabel("DOPE per-residue score")
PyPlot.plot(model, color="red", linewidth=2, label="Model")
PyPlot.plot(template, color="green", linewidth=2, label="Template")
PyPlot.legend()
PyPlot.savefig(plot_file, dpi=65)
end
end

0 comments on commit b95d5a6

Please sign in to comment.