Update to 0.1.10
Major update that includes a number of big changes
1) Improvement to plotting functions. Linear composition analysis now dynamically alters text size and line thickness to reflect sequence length, allowing high-resolution figures to be generated without the need for extensive post-processing

2) General linear plotting functions now dynamically set the edge width on bars such that when sequences are long the edge width tends to zero. This fixes the issue where, for longer sequences, the bar edges would overwhelm the total bar width, meaning bar color was lost

3) New PPII propensity scales added (Kallenbach and Creamer)

4) New sequence shuffling function (get_shuffle()) that allows a sequence to be efficiently shuffled. In addition to efficient shuffling, an arbitrary number of residues can be 'frozen', so shuffling occurs over a subset of residues. Useful when we wish to hold key motifs fixed but shuffle the remainder of the sequence.

5) The get_Omega parameter was updated with the appropriate reference to the newly published work

6) Additional tests added
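
The frozen-residue shuffling described in item 4 can be illustrated with a minimal sketch. The function name `shuffle_with_frozen`, its `frozen` and `seed` parameters, and the use of 0-based positions are assumptions made for illustration only; the actual signature and implementation of get_shuffle() are not shown on this page.

```python
import random


def shuffle_with_frozen(sequence, frozen=(), seed=None):
    """Shuffle `sequence`, leaving the 0-based positions in `frozen` untouched.

    Hypothetical helper for illustration; not the localCIDER get_shuffle() API.
    """
    rng = random.Random(seed)
    frozen = set(frozen)

    # positions that are allowed to move, and the residues currently there
    free_positions = [i for i in range(len(sequence)) if i not in frozen]
    free_residues = [sequence[i] for i in free_positions]

    # shuffle only the free residues, then write them back into the free slots
    rng.shuffle(free_residues)
    shuffled = list(sequence)
    for position, residue in zip(free_positions, free_residues):
        shuffled[position] = residue

    return "".join(shuffled)


# hold the 'PP' motif at positions 4 and 5 fixed, shuffle everything else
print(shuffle_with_frozen("MKRSPPYEEDLL", frozen=[4, 5], seed=1))
```

Shuffling only the free residues and writing them back into the free slots keeps the overall composition unchanged, which is usually what is wanted when comparing patterning parameters between a sequence and its shuffles.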
alexholehouse committed Nov 25, 2016
1 parent 4a43232 commit 3675a5f
Showing 30 changed files with 343 additions and 104 deletions.
11 changes: 10 additions & 1 deletion CHANGES.txt
@@ -25,4 +25,13 @@


0.1.9
*
* Local compositional analysis has been added to help identify local regions which are enriched in a particular type of amino acid. Proteins are complex heteropolymers, so plotting the local density of each amino acid with a sliding window is very difficult to visualize and interpret. To overcome this challenge, we take a two-pronged approach: firstly, amino acids are grouped by physicochemical properties. Secondly, we fit the noisy local density data with a univariate spline to give a smoothed description of the local density along the sequence. Combined, this allows us to generate easy-to-read plots that illustrate the local density of residues at any given location along the sequence.

* The Omega patterning parameter (previously kappa_proline) is re-defined as per the manuscript by Martin & Holehouse *et al.*

* A generalized patterning parameter for considering 2 or 3 letter alphabets is provided (kappa_X), allowing users to ask general questions about sequence patterning.

* The figure generation code has been further optimised, as has code documentation
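
The first prong of the local compositional analysis (grouping residues, then measuring their density in a sliding window) can be sketched as follows. The function name `local_density`, the window size, and the example residue group are assumptions for illustration and are not taken from localCIDER; the spline smoothing step is deliberately left out so the sketch has no dependencies.

```python
def local_density(sequence, residues, window=5):
    """Fraction of positions in each full window occupied by `residues`.

    Hypothetical helper: returns one value per window start position.
    """
    members = set(residues)

    # 1 where the residue belongs to the group of interest, 0 elsewhere
    hits = [1 if aa in members else 0 for aa in sequence]

    # slide a fixed-size window along the sequence and average the hits
    return [sum(hits[i:i + window]) / float(window)
            for i in range(len(sequence) - window + 1)]


# density of negatively charged residues (an illustrative group) along a toy sequence
print(local_density("EEEEKKKK", "DE", window=4))
```

The raw windowed values are noisy for real sequences, which is why the changelog entry describes fitting them with a univariate spline before plotting.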

If there are sequence features you would like to see added to localCIDER please don't hesitate to get in touch - we're always looking for new features to add and to further grow localCIDER as a general purpose protein analysis framework.

2 changes: 1 addition & 1 deletion localcider/__init__.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
2 changes: 1 addition & 1 deletion localcider/backend/__init__.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
2 changes: 1 addition & 1 deletion localcider/backend/backendtools.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
4 changes: 2 additions & 2 deletions localcider/backend/config.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
@@ -54,4 +54,4 @@


# DO NOT CHANGE ANYTHING BELOW THIS LINE
VERSION = "0.1.9"
VERSION = "0.1.10"
2 changes: 1 addition & 1 deletion localcider/backend/data/__init__.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
97 changes: 93 additions & 4 deletions localcider/backend/data/aminoacids.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
@@ -260,7 +260,6 @@ def get_PPII_Hilser():
Returns an amino acid dictionary with the PPII propensity of
each residue as calculated by Elam et al [1] (Taken from [2]).
[1] - Elam WA, Schrank TP, Campagnolo AJ, Hilser VJ. Evolutionary
conservation of the polyproline II conformation surrounding intrinsically
disordered phosphorylation sites.
@@ -293,6 +292,89 @@ def get_PPII_Hilser():
'LYS': 0.56,
'ARG': 0.38}

def get_PPII_Creamer():
    """
    Returns an amino acid dictionary with the PPII propensity of
    each residue as calculated by Rucker et al [1] (Taken from [2]).

    Note that Trp and Tyr do not have values reported by Rucker
    et al., so we followed the convention used by Tomasso et al.
    and set both to the mean value (0.58)

    [1] - Rucker, A.L., Pager, C.T., Campbell, M.N., Qualls, J.E.,
    and Creamer, T.P. (2003). Host-guest scale of left-handed
    polyproline II helix formation. Proteins 53, 68-75.

    [2] - Tomasso, M. E., Tarver, M. J., Devarajan, D. & Whitten, S. T.
    Hydrodynamic Radii of Intrinsically Disordered Proteins Determined
    from Experimental Polyproline II Propensities.
    PLoS Comput. Biol. 12, e1004686 (2016).
    """
    return {'ILE': 0.50,
            'VAL': 0.49,
            'LEU': 0.58,
            'PHE': 0.58,
            'CYS': 0.55,
            'MET': 0.55,
            'ALA': 0.61,
            'GLY': 0.58,
            'THR': 0.53,
            'SER': 0.58,
            'TRP': 0.58,
            'TYR': 0.58,
            'PRO': 0.67,
            'HIS': 0.55,
            'GLU': 0.61,
            'GLN': 0.66,
            'ASP': 0.63,
            'ASN': 0.55,
            'LYS': 0.59,
            'ARG': 0.61}


def get_PPII_Kallenbach():
    """
    Returns an amino acid dictionary with the PPII propensity of
    each residue as calculated by Shi et al [1] (Taken from [2]).

    Note that Gly and Pro do not have values reported by Shi et al.,
    so we followed the convention used by Tomasso et al. and
    set Gly = 0.5 and Pro = 1.0.

    [1] - Shi, Z., Chen, K., Liu, Z., Ng, A., Bracken, W.C., and
    Kallenbach, N.R. (2005). Polyproline II propensities from
    GGXGG peptides reveal an anticorrelation with beta-sheet scales.
    Proc. Natl. Acad. Sci. U. S. A. 102, 17964-17968.

    [2] - Tomasso, M. E., Tarver, M. J., Devarajan, D. & Whitten, S. T.
    Hydrodynamic Radii of Intrinsically Disordered Proteins Determined
    from Experimental Polyproline II Propensities.
    PLoS Comput. Biol. 12, e1004686 (2016).
    """
    return {'ILE': 0.519,
            'VAL': 0.743,
            'LEU': 0.574,
            'PHE': 0.639,
            'CYS': 0.557,
            'MET': 0.498,
            'ALA': 0.818,
            'GLY': 0.500,
            'THR': 0.553,
            'SER': 0.774,
            'TRP': 0.764,
            'TYR': 0.630,
            'PRO': 1.000,
            'HIS': 0.428,
            'GLU': 0.684,
            'GLN': 0.654,
            'ASP': 0.552,
            'ASN': 0.667,
            'LYS': 0.581,
            'ARG': 0.638}




@@ -345,7 +427,9 @@ def build_amino_acids_skeleton():
# get a dictionary of 3 letter to KD hydrophobicity
KD_Dict = get_KD_shifted()

PPII_Dict = get_PPII_Hilser()
PPII_Dict_Hilser = get_PPII_Hilser()
PPII_Dict_Creamer = get_PPII_Creamer()
PPII_Dict_Kallenbach = get_PPII_Kallenbach()

# build the initial skeleton of amino acids
skeleton=[["Alanine", "ALA", "A"],
@@ -380,7 +464,12 @@ def build_amino_acids_skeleton():
res.append(charge_Dict[res[1]])

# update the residue with the PPII content
res.append(PPII_Dict[res[1]])
res.append(PPII_Dict_Hilser[res[1]])
res.append(PPII_Dict_Creamer[res[1]])
res.append(PPII_Dict_Kallenbach[res[1]])

# each residue is now defined by a list of length 8 which are
# [full name, 3 letter, one letter, Hydrophobicity, charge, PPII_Hilser, PPII_Creamer, PPII_Kallenbach]

return skeleton

2 changes: 1 addition & 1 deletion localcider/backend/data/buildHighComplexitySequences.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
2 changes: 1 addition & 1 deletion localcider/backend/data/highComplexitySequences.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
2 changes: 1 addition & 1 deletion localcider/backend/keyfile.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
2 changes: 1 addition & 1 deletion localcider/backend/localciderExceptions.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
47 changes: 41 additions & 6 deletions localcider/backend/plotting.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
@@ -645,12 +645,15 @@ def show_linearComplexity(complexityVector, complexityType, seqlen, getFig=False
"""

n_bars = len(complexityVector[0,:])
LW = __get_bar_edge_width(n_bars)

# first generate the bar-plot and save the list of bars
barlist = plt.bar(complexityVector[0,:],
complexityVector[1,:],
width=1,
linewidth=1.0,
linewidth=LW,
edgecolor='k',
color='#A8A8A8')

@@ -690,7 +693,17 @@ def show_linearComplexity(complexityVector, complexityType, seqlen, getFig=False
##
# Functions below allow construction of the various Linear Sequence Plots - i.e. should
# be considered internal to this file
##
#
def __get_bar_edge_width(n_bars):
    if n_bars < 110:
        LW = 1.1
    else:
        LW = (220 - n_bars)/(220.0)
        if LW < 0:
            LW=0

    return LW
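
To see how this rule behaves across sequence lengths, the sketch below restates it as a standalone function and prints the edge width for a few bar counts. The name `bar_edge_width` is an assumption for illustration; the thresholds (110 and 220) are the ones in the diff. Note that with these thresholds the width steps from 1.1 to 0.5 at 110 bars before ramping linearly to zero at 220.

```python
def bar_edge_width(n_bars):
    """Edge line width for a bar plot with `n_bars` bars.

    Standalone restatement of the rule in the diff, for illustration only.
    """
    if n_bars < 110:
        return 1.1

    # linear ramp that reaches zero at 220 bars, clamped so it never goes negative
    return max(0.0, (220 - n_bars) / 220.0)


for n in (50, 110, 165, 220, 400):
    print(n, bar_edge_width(n))
```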


#...................................................................................#
def __build_linear_plot(
@@ -711,18 +724,32 @@ def __build_linear_plot(
"""




# define the width of the bar edges - when we're plotting lots of bars the bar
# edges can become overwhelming so the width gets linearly reduced to zero between
# 110 and 220 bars (empirically derived)
n_bars = len(data[0,:])
LW = __get_bar_edge_width(n_bars)

# plot the data
barlist = plt.bar( data[0,:],
data[1,:],
width=1,
linewidth=1.1,
linewidth=LW,
edgecolor='k',
color='#A8A8A8')

# this is really inefficient but means we have a consistent
#
if setPositiveNegativeBars:
if setPositiveNegativeBars:
for bar in xrange(0, len(barlist)):

# connects +/- bars *if* the bar edge width is zero
if data[1, bar] == 0 and LW == 0:
barlist[bar].set_linewidth(0.2)

if data[1, bar] < 0:
barlist[bar].set_color('r')
barlist[bar].set_edgecolor('k')
@@ -736,10 +763,18 @@ def __build_linear_plot(
plt.plot([0, len(barlist)], [-0.26, -0.26],
color='k', linewidth=1.5, linestyle="--")


# set Y lims
plt.ylim(ylimits)
plt.xlim([1, len(data[0, :])])

# set general font properties first
font = {'family' : 'Bitstream Vera Sans',
'weight' : 'normal',
'size' : 14}
matplotlib.rc('font', **font)

# then set of axis labels and title
axes_pro = FontProperties()
axes_pro.set_size('large')
axes_pro.set_weight('bold')
@@ -748,7 +783,7 @@ def __build_linear_plot(
plt.xlabel(xlabel, fontproperties=axes_pro)
plt.ylabel(ylabel, fontproperties=axes_pro)

axes_pro.set_size('x-large')
axes_pro.set_size('large')
plt.title(title, fontproperties=axes_pro)

# return plot object
6 changes: 3 additions & 3 deletions localcider/backend/residue.py
@@ -4,7 +4,7 @@
!--------------------------------------------------------------------------!
! This file is part of localCIDER. !
! !
! Version 0.1.9 !
! Version 0.1.10 !
! !
! Copyright (C) 2014 - 2016 !
! The localCIDER development team (current and former contributors) !
@@ -54,10 +54,10 @@ class Residue:
Residue class which defines the properties for each residue.
"""

def __init__(self, name, letterCode3, letterCode1, hydropathy, charge, PPII):
def __init__(self, name, letterCode3, letterCode1, hydropathy, charge, PPII_Hilser, PPII_Creamer, PPII_Kallenbach):
self.name = name # full name [glycine]
self.letterCode3 = letterCode3 # 3 letter code [gly]
self.letterCode1 = letterCode1 # 1 letter code [g]
self.hydropathy = hydropathy # hydropathy score
self.charge = charge # charge
self.PPII = PPII # PPII propensity (Hilser)
self.PPII = {'hilser':PPII_Hilser, 'creamer':PPII_Creamer, 'kallenbach': PPII_Kallenbach}
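
With this change `PPII` is a dictionary keyed by scale name rather than a single number, so callers select a scale explicitly. The sketch below shows that lookup pattern. `ResidueSketch` and `mean_PPII` are hypothetical stand-ins written for illustration, not localCIDER code; the lowercase keys and the LYS/ARG values are copied from this commit.

```python
class ResidueSketch(object):
    """Stand-in for the Residue class, keeping only the fields needed here."""

    def __init__(self, letterCode3, PPII_Hilser, PPII_Creamer, PPII_Kallenbach):
        self.letterCode3 = letterCode3
        # same key names as the dictionary built in the diff
        self.PPII = {'hilser': PPII_Hilser,
                     'creamer': PPII_Creamer,
                     'kallenbach': PPII_Kallenbach}


def mean_PPII(residues, scale):
    """Average PPII propensity of `residues` on the named scale."""
    return sum(r.PPII[scale] for r in residues) / float(len(residues))


# values for LYS and ARG as they appear in aminoacids.py in this commit
lys = ResidueSketch('LYS', 0.56, 0.59, 0.581)
arg = ResidueSketch('ARG', 0.38, 0.61, 0.638)

for scale in ('hilser', 'creamer', 'kallenbach'):
    print(scale, mean_PPII([lys, arg], scale))
```

Any existing code that read `residue.PPII` as a number would need to be updated to index by scale, for example `residue.PPII['hilser']` to keep the previous behaviour.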
