# Pfam API

- [Types](#Types)
- [Methods](#Methods)

In [1]:
using MIToS.Pfam

In [2]:
?MIToS.Pfam



No documentation found.

`MIToS.Pfam` is of type `Module`:

**Summary:**

```julia
type Module <: Any
```

**Fields:**

```julia
name   :: Symbol
parent :: Any
```


<div class="panel panel-info">
    <div class="panel-heading">
        <strong>Julia help mode</strong>
    </div>
    <div class="panel-body">
        <p>If you type <code>?</code> at the beginning of a Julia REPL line, you enter Julia help mode. In this mode, Julia prints the help or <strong>documentation</strong> of the entered element. This is a convenient way of getting information about MIToS functions, types, etc. from within Julia.</p>
    </div>
</div>


## Types

In [3]:
?MIToS.Pfam.Stockholm

No documentation found.

**Summary:**

```julia
immutable MIToS.MSA.Stockholm <: MIToS.Utils.Format
```



## Methods

In [5]:
?MIToS.Pfam.downloadpfam

Downloads a gzipped Stockholm full alignment for the given `pfamcode`. The extension of the downloaded file is `.stockholm.gz` by default. The `filename` can be changed, but it must end in `.gz`.


In [6]:
?MIToS.Pfam.getseq2pdb

Generates a `Dict{ASCIIString, Vector{Tuple{ASCIIString,ASCIIString}}}` from a Pfam `msa`. Keys are sequence IDs and each value is a list of tuples containing a PDB code and a chain.

```
julia> getseq2pdb(msa)
Dict{ASCIIString,Array{Tuple{ASCIIString,ASCIIString},1}} with 1 entry:
  "F112_SSV1/3-112" => [("2VQC","A")]

```
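As a rough illustration of how such a dictionary can be consumed, the sketch below builds one by hand from the entry shown above and walks over it. It assumes a current Julia version, so it uses `String` where the help output shows `ASCIIString`; the variable names `seq2pdb` and `pdbcodes` are made up for the example.

```julia
# Hand-built stand-in for the result of getseq2pdb(msa); `String` replaces
# the older `ASCIIString` type shown in the help output.
seq2pdb = Dict("F112_SSV1/3-112" => [("2VQC", "A")])

# Collect every distinct PDB code linked to any sequence in the alignment.
pdbcodes = String[]
for (seqid, structures) in seq2pdb
    for (pdbcode, chain) in structures
        println(seqid, " -> ", pdbcode, " chain ", chain)
        pdbcode in pdbcodes || push!(pdbcodes, pdbcode)
    end
end
```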


In [7]:
?MIToS.Pfam.msacolumn2pdbresidue

This function returns a `Dict{Int64,ASCIIString}` with **MSA column numbers on the input file** as keys and PDB residue numbers (`""` for missing values) as values. The mapping is performed using SIFTS. This function needs correct *ColMap* and *SeqMap* annotations. It checks that the residues in the sequence correspond to those in SIFTS and throws a warning if there are differences. If you are working with a **downloaded Pfam MSA without modifications**, you should `read` it using `generatemapping=true` and `useidcoordinates=true`.

If you don't indicate the path to the `siftsfile` used in the mapping, this function downloads the SIFTS file to the current folder.

If you don't indicate the Pfam accession number (`pfamid`), this function tries to read the *AC* file annotation.


In [8]:
?MIToS.Pfam.hasresidues

Returns a `BitVector` with a `true` for each column that has a PDB residue.
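The idea can be sketched in plain Julia as follows. This is not the MIToS implementation: `columns_with_residue` is a hypothetical helper, the data is made up, and `String` is used in place of `ASCIIString` on the assumption of a current Julia version. The mapping mimics a column-to-residue dictionary in which `""` marks a missing residue.

```julia
# Hypothetical helper, not part of MIToS: flags the columns that map to a PDB residue.
# `col2res` mimics a column => residue-number mapping, with "" for a missing residue.
function columns_with_residue(col2res::Dict{Int,String}, columns::Vector{Int})
    mask = falses(length(columns))            # BitVector initialised to false
    for (i, col) in enumerate(columns)
        # A column counts only if it is in the mapping and its residue is not "".
        mask[i] = get(col2res, col, "") != ""
    end
    return mask
end

col2res = Dict(1 => "10", 2 => "", 3 => "12")  # made-up mapping
mask = columns_with_residue(col2res, [1, 2, 3, 4])
```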


In [9]:
?MIToS.Pfam.msacontacts

This function takes an `AnnotatedMultipleSequenceAlignment` with correct *ColMap* annotations and two dicts:

1. The first is an `OrderedDict{ASCIIString,PDBResidue}` from PDB residue number to `PDBResidue`.

1. The second is a `Dict{Int,ASCIIString}` from **MSA column number on the input file** to PDB residue number.

This returns a `PairwiseListMatrix{Float64,false}` of `0.0` and `1.0` where `1.0` indicates a residue contact (an inter-residue distance less than or equal to 6.05 ångströms between any pair of heavy atoms). `NaN` indicates a missing value.
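The contact criterion itself can be illustrated with a small self-contained sketch. It does not use MIToS types: residues are represented here as plain vectors of heavy-atom coordinates, and `Atom`, `distance` and `contact_value` are hypothetical names introduced only for this example.

```julia
# Hypothetical sketch of the contact criterion, not the MIToS implementation.
# A residue is a vector of heavy-atom coordinates (x, y, z) in ångströms.
const Atom = NTuple{3,Float64}

distance(a::Atom, b::Atom) = sqrt(sum((a[i] - b[i])^2 for i in 1:3))

# 1.0 for a contact, 0.0 for no contact, NaN when either residue is missing.
function contact_value(res_a, res_b; cutoff = 6.05)
    (res_a === nothing || res_b === nothing) && return NaN
    # Smallest distance over every pair of heavy atoms from the two residues.
    closest = minimum(distance(a, b) for a in res_a for b in res_b)
    return closest <= cutoff ? 1.0 : 0.0
end

r1 = Atom[(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
r2 = Atom[(7.0, 0.0, 0.0)]    # 5.5 Å from the second atom of r1
r3 = Atom[(20.0, 0.0, 0.0)]   # far from every atom of r1
```

With this made-up data, `contact_value(r1, r2)` gives `1.0`, `contact_value(r1, r3)` gives `0.0`, and `contact_value(r1, nothing)` gives `NaN`.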


In [10]:
?MIToS.Pfam.msaresidues

This function takes an `AnnotatedMultipleSequenceAlignment` with correct *ColMap* annotations and two dicts:

1. The first is an `OrderedDict{ASCIIString,PDBResidue}` from PDB residue number to `PDBResidue`.

1. The second is a `Dict{Int,ASCIIString}` from MSA column number **on the input file** to PDB residue number.

This returns an `OrderedDict{Int,PDBResidue}` from input column number (ColMap) to `PDBResidue`. Residues on inserts are not included.


In [11]:
?MIToS.Pfam.getcontactmasks

This function takes the output of `msacontacts`, or its list of contacts `contact_list`, with 1.0 for true contacts and 0.0 for non-contacts (NaN or other numbers for missing values). It returns two `BitVector`s, the first with `true` where `contact_list` is 1.0 and the second with `true` where `contact_list` is 0.0. These are useful for AUC calculations.
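The masking logic can be sketched in a few lines of plain Julia. `contact_masks` is a hypothetical stand-in written for this example, not the MIToS function, and the input values are made up.

```julia
# Hypothetical sketch of the masking logic (not MIToS code).
function contact_masks(contact_list::Vector{Float64})
    true_contacts  = contact_list .== 1.0   # BitVector: true where a contact exists
    false_contacts = contact_list .== 0.0   # BitVector: true where there is no contact
    return true_contacts, false_contacts
end

contacts = [1.0, 0.0, NaN, 1.0, 0.0]        # NaN marks a missing value
true_mask, false_mask = contact_masks(contacts)
```

Because `NaN` compares unequal to both 1.0 and 0.0, missing values end up `false` in both masks and so drop out of any later calculation.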


In [12]:
?MIToS.Pfam.AUC

Returns the Area Under a ROC (Receiver Operating Characteristic) Curve (AUC) of the `scores` for `msacontact` prediction. The `scores` and `msacontact` lists are matched (inner join) by their labels (i.e. the column numbers in the file). `msacontact` should have 1.0 for true contacts and 0.0 for non-contacts (NaN or other numbers for missing values).

Returns the Area Under a ROC (Receiver Operating Characteristic) Curve (AUC) of the `scores` for `true_contacts` prediction. `scores`, `true_contacts` and `false_contacts` should have the same number of elements and `false_contacts` should be `true` where there is no contact.

Returns the Area Under a ROC (Receiver Operating Characteristic) Curve (AUC) of the `scores_list` for `true_contacts` prediction. The three vectors should have the same length and `false_contacts` should be `true` where there is no contact.
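To make the relation between scores, masks and AUC concrete, here is a minimal sketch that computes the AUC with the pairwise (Mann-Whitney) formulation: the fraction of (contact, non-contact) pairs in which the contact has the higher score, with ties counted as one half. This is an assumption about how an AUC can be computed, not a description of what MIToS does internally, and `auc_sketch` is a hypothetical name.

```julia
# Hypothetical AUC sketch using the pairwise (Mann-Whitney) formulation.
function auc_sketch(scores::Vector{Float64},
                    true_contacts::BitVector,
                    false_contacts::BitVector)
    length(scores) == length(true_contacts) == length(false_contacts) ||
        throw(ArgumentError("the three vectors must have the same length"))
    pos = scores[true_contacts]    # scores of true contacts
    neg = scores[false_contacts]   # scores of non-contacts
    (isempty(pos) || isempty(neg)) && return NaN
    wins = 0.0
    for p in pos, n in neg
        # A contact scored above a non-contact counts 1, a tie counts 0.5.
        wins += p > n ? 1.0 : (p == n ? 0.5 : 0.0)
    end
    return wins / (length(pos) * length(neg))
end

scores         = [0.9, 0.2, 0.5, 0.7, 0.8]   # made-up scores
true_contacts  = BitVector([true, false, false, true, false])
false_contacts = BitVector([false, true, false, false, true])
auc = auc_sketch(scores, true_contacts, false_contacts)   # 3 of 4 pairs ranked correctly: 0.75
```

The third score belongs to neither mask, so it is ignored, which mirrors how missing values are left out of the calculation.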
