Molecule properties and activities are most often represented as continuous, real values (e.g., EC<sub>50</sub>, IC<sub>50</sub>, LD<sub>50</sub>, LogP).  For plotting, I usually convert these values into discrete classes (e.g., active/inactive, toxic/non-toxic, soluble/insoluble).  That way I can represent a molecule's property or activity with a small set of colors (say, red for active and green for inactive), which is easy enough whenever I have an interesting plot to make.  Coloring by the compound's real value is a little more difficult, because I'd need a color scale.  For example, the most active compounds would be red and the least active green, and molecules that fall in between would be a mix of red and green.  

As far as I can tell, none of the plotting packages I use (Bokeh, matplotlib, Plotly) offer this directly in their color libraries.  Most let you import a color spectrum, such as a discrete set of blues, but that isn't suitable for the infinite number of values that a LogP or LD<sub>50</sub> could take.  

So I took to the internet and found a great [blog post](http://bsou.io/posts/color-gradients-with-python) by Ben Southgate.  Unfortunately, the code he wrote still generates a discrete set of color values, but he lays out how to use [Bezier curves](https://en.wikipedia.org/wiki/B%C3%A9zier_curve) to solve this problem, which I'll summarize here.  

First, let's think about color as an RGB value, which is simply a vector in 3-dimensional space `[R, G, B]`.  So, let's plot that out.  

In [1]:
import plotly.plotly as py
import plotly.graph_objs as go


red = [255, 0, 0]
green = [0, 255, 0]
blue = [0, 0, 255]

colors = [red, green, blue]

data = []

for color in colors:
    # Build the 'rgb(r, g, b)' string once and reuse it for fill and outline
    rgb = 'rgb({0}, {1}, {2})'.format(color[0], color[1], color[2])
    data.append(
        go.Scatter3d(
            # Scatter3d expects sequences, so wrap each coordinate in a list
            x=[color[0]],
            y=[color[1]],
            z=[color[2]],
            mode='markers',
            marker=dict(
                color=rgb,
                size=12,
                symbol='circle',
                line=dict(
                    color=rgb,
                    width=1
                )
            )
        )
    )

fig = go.Figure(data=data)
py.iplot(fig)



So now, being in 3D color space and all, it's pretty easy to see that you could generate an infinite number of colors from red to green by simply drawing a line from the red circle located at `[255, 0, 0]` to the green circle at `[0, 255, 0]`.  If we define a variable $t \in [0,1]$, it can be thought of as the fraction of the distance traveled along the line from red to green.  When $t = 0.5$, we're halfway from red to green.  If we do this linearly, we get an equation like this:  

$$color(t) = [255, 0, 0] + t\times([0, 255, 0]-[255, 0, 0])$$
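
As a minimal sketch of the equation above, the linear gradient can be computed channel by channel. The helper name `lerp_color` is made up for illustration and does not come from any plotting library; `int()` truncation is an assumption about how to get back to whole RGB values.

```python
def lerp_color(t, start, end):
    """Linearly interpolate between two RGB colors for t in [0, 1]."""
    # color(t) = start + t * (end - start), applied channel by channel;
    # int() truncates toward zero
    return [int(s + t * (e - s)) for s, e in zip(start, end)]


red = [255, 0, 0]
green = [0, 255, 0]

print(lerp_color(0.0, red, green))  # [255, 0, 0]
print(lerp_color(0.5, red, green))  # [127, 127, 0]
print(lerp_color(1.0, red, green))  # [0, 255, 0]
```

Because `t` is a float, any scaled activity value gets its own color rather than being snapped to a fixed palette.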

And that line would look something like this:

Any molecule activity, when scaled to $[0,1]$, would then have a color on this gradient line.  The problem is that the colors in the middle of the line, the ones for inconclusive activity, are pretty ugly.  Let's say we wanted these to take on a nicer blue color to show moderate activity.  We can do that if we take a detour in our 3D space.  Instead of traveling linearly from red to green, let's go around the ugly colors and approach the blue space.  To do that, we can use a Bezier curve.  A Bezier curve is a smooth parametric curve defined by a list of control points: it starts at the first point, ends at the last, and is pulled toward the points in between (generally without passing through them).
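
For reference, here is the standard definition of a Bezier curve of degree $n$ with control points $P_0, \dots, P_n$ (the symbols $B$ and $P_i$ are notation introduced here for illustration):

$$B(t) = \sum_{i=0}^{n} \binom{n}{i} (1-t)^{n-i} t^{i} P_i, \quad t \in [0,1]$$

With three control points (red, then blue, then green), $n = 2$ and this becomes:

$$color(t) = (1-t)^2 [255, 0, 0] + 2(1-t)t\,[0, 0, 255] + t^2 [0, 255, 0]$$

At $t = 0.5$ this gives $[63.75, 63.75, 127.5]$, a bluish color, where the straight line from red to green gives $[127.5, 127.5, 0]$.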

In [4]:
import math

import numpy as np


def bernstein_constant(t, i, n):
    """Return the Bernstein coefficient for term i of a degree-n Bezier curve.

    t: float in [0, 1]
    i: int in [0, n]
    n: degree of the curve (number of control points minus one)
    """
    # Binomial coefficient "n choose i"; integer division keeps it exact
    term1 = math.factorial(n) // (math.factorial(i) * math.factorial(n - i))
    term2 = ((1 - t) ** (n - i)) * (t ** i)
    return term1 * term2


def bezier_point(t, control_points):
    """Return the RGB values of the point at t on the Bezier curve defined by control_points.

    t: float in [0, 1]
    control_points: list of control points, each in three dimensions (R, G, B)
    """
    control_points = list(map(np.asarray, control_points))
    n = len(control_points) - 1
    # Weight each control point by its Bernstein coefficient
    result = [bernstein_constant(t, i, n) * pnt for i, pnt in enumerate(control_points)]
    # Sum the weighted points per channel and truncate to integers
    return [int(sum(pnt)) for pnt in zip(*result)]

In [8]:
import numpy as np

ts = np.linspace(0, 1, 100)

# Control points in order: start at red, bend toward blue, end at green
line = [bezier_point(t, [red, blue, green]) for t in ts]

In [9]:
scatter = [
    go.Scatter3d(
        # Scatter3d expects sequences, so wrap each coordinate in a list
        x=[pnt[0]],
        y=[pnt[1]],
        z=[pnt[2]],
        mode='markers',
        marker=dict(
            color='rgb({0}, {1}, {2})'.format(pnt[0], pnt[1], pnt[2]),
            size=12,
            symbol='circle',
            line=dict(
                color='rgb({0}, {1}, {2})'.format(pnt[0], pnt[1], pnt[2]),
                width=1
            )
        )
    )
    for pnt in line
]

In [11]:
# `data` holds the red, green, and blue corner traces built in the first cell
fig = go.Figure(data=data + scatter)
py.iplot(fig, filename='simple-3d-scatter')


In [None]:
# Sanity check: at t = 1 the curve should land on the last control point (green)
bernstein_constant(1, 0, 2)*np.asarray(red)+bernstein_constant(1, 1, 2)*np.asarray(blue)+bernstein_constant(1, 2, 2)*np.asarray(green)
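
To tie this back to molecule activities, here is a minimal, self-contained sketch of how real values could be mapped onto the curve. The names `bezier_color` and `activity_colors` and the activity values are hypothetical, chosen for illustration; min-max scaling is one assumed way to get values into $[0,1]$, and `math.comb` needs Python 3.8 or newer.

```python
import math


def bezier_color(t, control_points):
    """Point at t in [0, 1] on the Bezier curve defined by RGB control points."""
    n = len(control_points) - 1
    channels = [0.0, 0.0, 0.0]
    for i, point in enumerate(control_points):
        # Bernstein weight for control point i
        weight = math.comb(n, i) * (1 - t) ** (n - i) * t ** i
        for c in range(3):
            channels[c] += weight * point[c]
    # Truncate to whole RGB values
    return [int(v) for v in channels]


def activity_colors(values, control_points):
    """Min-max scale values to [0, 1] and map each to an 'rgb(...)' string."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values are equal
    colors = []
    for v in values:
        r, g, b = bezier_color((v - lo) / span, control_points)
        colors.append('rgb({0}, {1}, {2})'.format(r, g, b))
    return colors


red, green, blue = [255, 0, 0], [0, 255, 0], [0, 0, 255]

# Hypothetical activity values, for illustration only
activities = [4.0, 6.0, 8.0]
print(activity_colors(activities, [red, blue, green]))
# ['rgb(255, 0, 0)', 'rgb(63, 63, 127)', 'rgb(0, 255, 0)']
```

In this sketch the lowest value maps to the first control point, so reversing the list of control points flips which end of the scale is red. The resulting strings are in the same `'rgb(r, g, b)'` form used for the marker colors above.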