Molecule properties and activities are most often represented as continuous, real values (e.g., EC<sub>50</sub>, IC<sub>50</sub>, LD<sub>50</sub>, LogP).  Usually, for graphs, I convert these values into discrete classes (e.g., active/inactive, toxic/non-toxic, soluble/insoluble).  When making a new graph, I can then represent a molecule's property or activity with a small set of colors (red for active, green for inactive, etc.).  Pretty straightforward stuff.  However, coloring a compound by its real value is a little more difficult; I would need a color gradient.  For example, the most active compounds would be red and the least active would be green.  Molecules that fall in between would be a mix of red and green: more red for more active, more green for less active.  
<!-- TEASER_END -->
As far as I could tell, none of the plotting packages I use (Bokeh, matplotlib, Plotly, seaborn) have this feature in their color libraries.  At least, I could not find anything that lets you build a gradient from any two colors you choose.  The closest thing I found was the ability to import a "spectrum" of colors: a discrete set of blues, for example, that goes from dark blue to light blue.  However, a discrete set is not suitable for the continuum of values that a LogP or LD<sub>50</sub> could take.  

I took to the internet and found a great [blog post](http://bsou.io/posts/color-gradients-with-python) by Ben Southgate, which I'll summarize here.  In it, he lays out a framework for using [Bezier curves](https://en.wikipedia.org/wiki/B%C3%A9zier_curve) to create a color gradient between any number of colors.  Unfortunately, the code he wrote generates a discrete set of color values, so in this post I'll change it to accept any real value between $0$ and $1$.  

First, let's think about color as an RGB value, which is simply a vector in 3-dimensional space `[R, G, B]`, and let's plot that out.  

In [3]:
import plotly.offline as py  # plotly.plotly is deprecated (moved to chart-studio); the offline module needs no account
import plotly.graph_objs as go


red = [255, 0, 0]
green = [0, 255, 0]
blue = [0, 0, 255]

colors = [red, green, blue]

data = []

for color in colors:
    rgb = 'rgb({0}, {1}, {2})'.format(*color)
    data.append(
        go.Scatter3d(
            # Scatter3d expects array-like coordinates, so wrap each value in a list
            x=[color[0]],
            y=[color[1]],
            z=[color[2]],
            mode='markers',
            marker=dict(
                color=rgb,
                size=12,
                symbol='circle',
                line=dict(color=rgb, width=1),
            ),
        )
    )

layout = go.Layout(showlegend=False)
fig = go.Figure(data=data, layout=layout)
py.iplot(fig)


So now, in 3D color space, it's easy to see that you could generate an infinite number of colors from red to green by simply drawing a line from the red circle at $[255, 0, 0]$ to the green circle at $[0, 255, 0]$.  If we define a variable $t \in [0,1]$, it can be thought of as the fraction of the distance traveled along the line from red to green.  When $t = 0.5$, we're halfway from red to green.  If we do this linearly, we get an equation like this:  

$$color(t) = [255, 0, 0] + t\times([0, 255, 0]-[255, 0, 0])$$

This is an easy function to code using numpy.

In [37]:
import numpy as np

def linear_color(t, colors):
    """ returns a color at distance t from color 1 to color 2 
    
    t: floating point between [0,1]
    colors: list of two colors    
    """
    colors = list(map(np.asarray, colors))
    return colors[0] + t * (colors[1] - colors[0])
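
To use a function like this with real data, the raw values first have to be mapped onto $[0,1]$. Here is a minimal sketch of that step, assuming plain min-max scaling (one of several possible choices) and a handful of made-up LogP values. The names `lerp_color`, `scale_to_unit`, and `to_rgb_string` are hypothetical, and the interpolation is repeated in pure Python so the block runs on its own.

```python
def lerp_color(t, colors):
    """Pure-Python version of the linear interpolation between two colors."""
    start, end = colors
    return [s + t * (e - s) for s, e in zip(start, end)]

def scale_to_unit(values):
    """Min-max scale values to [0, 1] (assumes the values are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def to_rgb_string(color):
    """Format an RGB vector as an 'rgb(r, g, b)' string with integer channels."""
    r, g, b = (int(round(c)) for c in color)
    return 'rgb({0}, {1}, {2})'.format(r, g, b)

red = [255, 0, 0]
green = [0, 255, 0]

# Made-up LogP values, for illustration only
logp = [-1.0, 0.5, 2.0, 3.5, 5.0]
ts_scaled = scale_to_unit(logp)
logp_colors = [to_rgb_string(lerp_color(t, [red, green])) for t in ts_scaled]
```

The lowest value lands on pure red and the highest on pure green; which end counts as "most active" depends on the property, so the scaled values may need to be flipped with `1 - t`.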

Any molecule activity, when scaled to $[0,1]$, would then have a color on this gradient line. We can see what that would look like for one hundred values traveling from red to green.

In [38]:
ts = np.linspace(0, 1, 100)

colors_red_to_green = [linear_color(t, [red, green]) for t in ts]
line_data = []

for color in colors_red_to_green:
    rgb = 'rgb({0}, {1}, {2})'.format(*color)
    line_data.append(
        go.Scatter3d(
            # Scatter3d expects array-like coordinates, so wrap each value in a list
            x=[color[0]],
            y=[color[1]],
            z=[color[2]],
            mode='markers',
            marker=dict(
                color=rgb,
                size=12,
                symbol='circle',
                line=dict(color=rgb, width=1),
            ),
        )
    )

layout = go.Layout(showlegend=False)
fig = go.Figure(data=data+line_data, layout=layout)
py.iplot(fig)

The problem is that the linear gradient is pretty ugly for inconclusive compounds, that is, molecules roughly halfway between active and inactive.  Let's say we want these to take on a nicer blue color to show moderate activity.  We can do that by taking a detour in our 3D space: instead of traveling linearly from red to green, let's go around the ugly colors and approach the blue region.  To do that, we can use a Bezier curve.  

A Bezier curve is a smooth curve whose shape is dictated by a set of control points.  As an example, we can use red, blue, and green as our control points.  What the Bezier curve essentially does is draw two lines, one from red to blue and the other from blue to green.  Let $t$ mark a point on each of these lines, starting at $0$ (the red point for the first line and the blue point for the second).  Now draw a third line between those two moving points and take the point a fraction $t$ along it; the Bezier curve is the path traced by that point as $t$ travels from $0$ to $1$.  This is much easier to understand when demonstrated visually, so [here](https://www.jasondavies.com/animated-bezier/) is a nice interactive demo I found that shows how a Bezier curve is drawn for an arbitrary number of control points.  Mathematically, the equation looks like this for $n+1$ control colors, where $P_i$ is the $i$-th control color:

$$B(t) = \sum\limits_{i=0}^n b_{i,n} (t)P_{i}$$

where $b_{i,n}(t)$ is the Bernstein coefficient (more formally, the Bernstein basis polynomial), defined as 

$$b_{i,n}(t) = \bigg(\frac{n!}{i!(n-i)!}\bigg)t^i(1-t)^{n-i}$$

So, what that would look like for our three control points `[red, blue, green]`, would be this:

$$ B(t) = b_{0,2}(t)[255, 0, 0] + b_{1,2}(t)[0, 0, 255] + b_{2,2}(t)[0, 255, 0] $$
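
To make the weights concrete, here is a small standard-library check of the three Bernstein terms at the midpoint $t = 0.5$. The function name `bernstein` is a stand-in for illustration only, and `math.comb` assumes Python 3.8 or newer.

```python
import math

def bernstein(t, i, n):
    """Bernstein term b_{i,n}(t) = C(n, i) * t**i * (1 - t)**(n - i)."""
    return math.comb(n, i) * t**i * (1 - t)**(n - i)

red, blue, green = [255, 0, 0], [0, 0, 255], [0, 255, 0]
control = [red, blue, green]
n = len(control) - 1  # degree 2 for three control colors

t = 0.5
weights = [bernstein(t, i, n) for i in range(n + 1)]  # [0.25, 0.5, 0.25]

# Weighted sum of the control colors, channel by channel
midpoint = [sum(w * p[k] for w, p in zip(weights, control)) for k in range(3)]
# midpoint is [63.75, 63.75, 127.5]: half blue, with a quarter each of red and green
```

The weights always sum to $1$, so every point on the curve is a blend of the control colors. Note that the curve is pulled toward blue without ever reaching pure blue.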

Using these equations, we can write two functions that can calculate a color for any point $t$ given any number of control points. 

In [41]:
import math

def bernstein_coefficient(t, i, n):
    """ return the Bernstein coefficient b_{i,n}(t) for a Bezier curve
    
    t: floating point between [0,1]
    i: int between [0,n]
    n: degree of the curve (number of control points minus one)
    """
    # binomial coefficient n! / (i! (n-i)!)
    term1 = math.factorial(n)/(math.factorial(i)*math.factorial(n-i))
    term2 = ((1-t)**(n-i))*(t**i)
    return term1*term2

def bezier_point(t, control_points):
    """ Return RGB values for the point at t on a Bezier curve defined by a list of control points
    
    t: floating point between [0,1]
    control_points: list of control points, which should be in three dimensions (R,G,B)
    """
    control_points = list(map(np.asarray, control_points))
    n = len(control_points) - 1
    # weight each control point by its Bernstein coefficient
    result = [
        bernstein_coefficient(t, i, n)*pnt for i, pnt in enumerate(control_points)
    ]
    # sum the weighted points channel by channel and truncate to integer RGB values
    return [int(sum(pnt)) for pnt in zip(*result)]
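
The geometric picture described earlier (interpolate along each segment between neighboring control colors, then interpolate between the resulting points until only one is left) is known as de Casteljau's algorithm. The sketch below is not part of the original code: `lerp`, `de_casteljau`, and `bernstein_sum` are hypothetical names, and the block is meant only as a cross-check that the construction and the Bernstein sum give the same color. It uses the standard library alone and assumes Python 3.8 or newer for `math.comb`.

```python
import math

def lerp(p, q, t):
    """Point a fraction t of the way from p to q."""
    return [a + t * (b - a) for a, b in zip(p, q)]

def de_casteljau(t, control_points):
    """Evaluate a Bezier curve at t by repeated linear interpolation."""
    points = [list(map(float, p)) for p in control_points]
    while len(points) > 1:
        # Replace each neighboring pair with the point a fraction t along their segment
        points = [lerp(p, q, t) for p, q in zip(points, points[1:])]
    return points[0]

def bernstein_sum(t, control_points):
    """The same curve evaluated from the Bernstein form, for comparison."""
    n = len(control_points) - 1
    weights = [math.comb(n, i) * t**i * (1 - t)**(n - i) for i in range(n + 1)]
    return [sum(w * p[k] for w, p in zip(weights, control_points)) for k in range(3)]

red, blue, green = [255, 0, 0], [0, 0, 255], [0, 255, 0]
quarter_geometric = de_casteljau(0.25, [red, blue, green])
quarter_algebraic = bernstein_sum(0.25, [red, blue, green])
```

Both calls should agree up to floating-point error. The Bernstein form is convenient when the weights themselves are of interest, while the repeated-interpolation form mirrors the animation linked above.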

Finally, the Bezier curve for our three control points would look something like this:

In [40]:
colors_red_to_green = [bezier_point(t, [red, blue, green]) for t in ts]

curve_data = []

for color in colors_red_to_green:
    rgb = 'rgb({0}, {1}, {2})'.format(*color)
    curve_data.append(
        go.Scatter3d(
            # Scatter3d expects array-like coordinates, so wrap each value in a list
            x=[color[0]],
            y=[color[1]],
            z=[color[2]],
            mode='markers',
            marker=dict(
                color=rgb,
                size=12,
                symbol='circle',
                line=dict(color=rgb, width=1),
            ),
        )
    )

layout = go.Layout(showlegend=False)
fig = go.Figure(data=data+curve_data, layout=layout)
py.iplot(fig)

We are left with a much nicer looking color gradient using a Bezier curve.  Many thanks to [Nastassia Pouradier Duteil](https://sites.google.com/site/nastassiapouradierduteil/) for working through the math with me.  