### Making Plots

As you've seen in recent lessons, data science leans on data visualizations to draw inferences about our data and to make sense of the math we apply to it.  We saw how plotting data can display the relationship between x and y variables, and how changing the y-intercept or slope affects a regression line.  

In this lesson, let's explore the Plotly library, which allows us to create data visualizations with Python.  As we do so, pay careful attention to the data type that our methods require: whether they are dictionaries or arrays, or arrays of dictionaries.  Ok, let's go!

### Working through a linear regression 

To get started with plotly, first install the library on your computer.  In Jupyter, you can do so by executing the cell below.

In [None]:
!pip install plotly

If plotly is already on your computer, pip will tell you that the requirement is already satisfied.  That's ok - nothing broke.

The next step is to import the plotly library. 

In [11]:
import plotly

If we plot offline, we do not need to provide a login.  So we use `plotly.offline` to draw our first plot with the line below.

In [13]:
plotly.offline.iplot([
    {}
])

Let's take another look at that line.
```python
plotly.offline.iplot([
    {}
])
```

We reference the `plotly` library, which we imported above.  Then we pass the `iplot` method an array, which has a dictionary in it.  That dictionary can represent a scatter chart, a line chart, or other types of charts.  Another name for these charts is traces.  We'll use the two words interchangeably.  We pass the dictionary inside an array because we can have more than one chart in the same graph - for example, a scatter plot underneath a line plot.  

Now, for the dictionary that represents a chart, we should begin to provide some information.  We can do so by giving our plot some data.  Below we plot four points.  Notice that we provide the x and y coordinates in two separate attributes of the dictionary.  Change around the data to get a feel for how it works.

In [19]:
trace = {'x': [1, 2, 3, 4], 'y': [1, 2, 3, 4]}

plotly.offline.iplot([
    trace
])

The cell above produces a line chart, but that is just the default.  We can change it by setting the `mode` to `markers`.  Let's also change the color of the markers while we are at it.  

In [22]:
trace = {'x': [1, 2, 3, 4], 'y': [1, 2, 3, 4], 'mode': 'markers', 'marker': {'color': 'rgba(255, 182, 193, .9)'}}

plotly.offline.iplot([
    trace
])
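
Plotly's scatter traces also accept a combined mode, `lines+markers`, which draws both the connecting line and the individual points.  Here is a minimal sketch that reuses the same data; the variable name `trace_both` is only for illustration.

```python
# Same four points, drawn as a line with a marker at each point.
trace_both = {
    'x': [1, 2, 3, 4],
    'y': [1, 2, 3, 4],
    'mode': 'lines+markers',
    'marker': {'color': 'rgba(255, 182, 193, .9)'},
}

# In a notebook with plotly installed, display it the same way as before:
# plotly.offline.iplot([trace_both])
```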

Now remember we said that we can add more than one trace to a given graph.  Let's do that now.  We'll keep the first trace largely the same by using the same data, and color of markers.  We'll add a name to our trace of 'Some dots', simply by adding a name attribute.

In the second trace, we provide some new data and set the color to blue.  Because we do not specify a mode, it defaults to connecting the points as a line.  We name this trace "Our nice line".

In [30]:
trace0 = {'x': [1, 2, 3, 4], 'y': [1, 2, 3, 4], 'mode': 'markers', 'marker': {'color': 'rgba(255, 182, 193, .9)'}, 'name': 'Some dots'}
trace1 = {'x': [1.5, 2.5, 3.5, 4.5], 'y': [3, 5, 7, 9], 'marker': {'color': 'blue'}, 'name': 'Our nice line'}

plotly.offline.iplot([
    trace0, trace1
])
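
When a graph has several traces, it can help to build each dictionary with a small helper so that the keys stay consistent.  The sketch below is only an illustration: `make_trace` is a hypothetical helper, not part of plotly, and it uses only the keys we have already seen (`x`, `y`, `mode`, `marker`, and `name`).

```python
def make_trace(x, y, name, color, mode=None):
    """Build a trace dictionary (hypothetical helper, not a plotly function)."""
    trace = {'x': list(x), 'y': list(y), 'marker': {'color': color}, 'name': name}
    # Only set 'mode' when asked, so plotly falls back to its default otherwise.
    if mode is not None:
        trace['mode'] = mode
    return trace

dots = make_trace([1, 2, 3, 4], [1, 2, 3, 4], 'Some dots',
                  'rgba(255, 182, 193, .9)', mode='markers')
line = make_trace([1.5, 2.5, 3.5, 4.5], [3, 5, 7, 9], 'Our nice line', 'blue')

# The array of traces we would hand to iplot.
traces = [dots, line]
```

In a notebook with plotly installed, passing `traces` to `plotly.offline.iplot` should draw the same two-trace graph as the cell above.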

### Working with types

So far, we have only worked with either scatter charts or line charts.  The two charts are really quite similar -- line charts connect points with a line -- and plotly treats them as such.  There are other types of charts.

We can make a bar chart, for example, simply by specifying in our dictionary that the `type` is `bar`.

In [39]:
trace0 = {'type': 'bar', 'x': ['bobby', 'susan', 'eli', 'malcolm'], 'y': [3, 5, 7, 9], 'marker': {'color': 'blue'}, 'name': 'Our nice line'}

plotly.offline.iplot([
    trace0
])

Another way to create a bar chart is to use the constructor provided by plotly.  It's not too bad.  First, we import the `graph_objs` module from plotly.  Then we call the `Bar` constructor. 

In [45]:
from plotly import graph_objs 

bar_chart = graph_objs.Bar(
            x=['bobby', 'susan', 'eli', 'malcolm'],
            y=[3, 5, 7, 9]
    )

bar_chart

{'type': 'bar', 'x': ['bobby', 'susan', 'eli', 'malcolm'], 'y': [3, 5, 7, 9]}

We refer to `graph_objs.Bar` as a constructor because it literally constructs python dictionaries with a key of `type` that equals `bar`.  Then, we can pass this dictionary to our `iplot` method to display our bar chart.

In [46]:
bar_chart = graph_objs.Bar(
            x=['bobby', 'susan', 'eli', 'malcolm'],
            y=[3, 5, 7, 9]
    )


plotly.offline.iplot([
    bar_chart
])

There are constructors for making other charts as well.  

In [47]:
graph_objs.Scatter()

{'type': 'scatter'}

In [48]:
graph_objs.Pie()

{'type': 'pie'}

And of course, we can always use the dictionary constructor to create our dictionaries.

In [54]:
pie_trace = dict(type="pie", labels=["chocolate", "vanilla", "strawberry"], values=[10, 5, 15])

plotly.offline.iplot([
    pie_trace
])
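
To make the equivalence concrete, here is a short sketch showing that a dictionary literal and the `dict` constructor produce the same trace.  The variable names `literal_trace` and `keyword_trace` are only for illustration.

```python
# Two ways to write the same pie trace.
literal_trace = {
    'type': 'pie',
    'labels': ['chocolate', 'vanilla', 'strawberry'],
    'values': [10, 5, 15],
}
keyword_trace = dict(type='pie',
                     labels=['chocolate', 'vanilla', 'strawberry'],
                     values=[10, 5, 15])

# Dictionaries compare equal when they hold the same keys and values.
print(literal_trace == keyword_trace)  # prints True
```

Which form to use is a matter of taste: the literal mirrors the output plotly displays, while the keyword form saves typing quotes around each key.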

### Modifying a Chart Layout

So far we have seen how to specify attributes of traces or charts, which display our data.  Now let's see how to modify the overall layout in our chart.

Note that the format of our traces will not change.

In [56]:
trace_of_data = {'x': [1.5, 2.5, 3.5, 4.5], 'y': [3, 5, 7, 9], 'marker': {'color': 'blue'}, 'name': 'Our nice line'}

# plotly.offline.iplot([
#     trace_of_data
# ])

However, instead of passing our `iplot` function an array of traces, we pass a dictionary with two keys: a `data` key, whose value is an array of traces, and a `layout` key, whose value is a dictionary representing our layout.  For example, a layout that only sets a title looks like this:

{'title': 'Scatter Plot'}

In [63]:
layout = {'title': 'Scatter Plot'}
trace_of_data = {'x': [1.5, 2.5, 3.5, 4.5], 'y': [3, 5, 7, 9], 'marker': {'color': 'blue'}, 'name': 'Our nice line'}

figure = {'data': [trace_of_data], 'layout': layout}

plotly.offline.iplot(figure)

Above, we only used the `layout` to specify our chart's title.  Let's now also use it to set the range of our x axis and y axis.  Currently, we are allowing plotly to choose the ranges automatically, but we can also specify them ourselves.  Let's change it so that the x and y axes both have the same range.

In [65]:
layout = {'title': 'Scatter Plot', 'xaxis': {'range': [1, 10]}, 'yaxis': {'range': [1, 10]}}
trace_of_data = {'x': [1.5, 2.5, 3.5, 4.5], 'y': [3, 5, 7, 9], 'marker': {'color': 'blue'}, 'name': 'Our nice line'}

figure = {'data': [trace_of_data], 'layout': layout}

plotly.offline.iplot(figure)
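
Since a figure is just a dictionary with `data` and `layout` keys, we can assemble it with a small function.  The sketch below assumes a hypothetical helper named `build_figure`, which is not part of plotly; it only uses the `title`, `xaxis`, and `yaxis` keys shown above.

```python
def build_figure(traces, title, x_range=None, y_range=None):
    """Assemble a figure dictionary (hypothetical helper, not a plotly function)."""
    layout = {'title': title}
    # Leave an axis out of the layout to let plotly pick its range automatically.
    if x_range is not None:
        layout['xaxis'] = {'range': list(x_range)}
    if y_range is not None:
        layout['yaxis'] = {'range': list(y_range)}
    return {'data': list(traces), 'layout': layout}

trace_of_data = {'x': [1.5, 2.5, 3.5, 4.5], 'y': [3, 5, 7, 9],
                 'marker': {'color': 'blue'}, 'name': 'Our nice line'}

auto_figure = build_figure([trace_of_data], 'Scatter Plot')
fixed_figure = build_figure([trace_of_data], 'Scatter Plot',
                            x_range=[1, 10], y_range=[1, 10])
```

Either dictionary can then be passed to `plotly.offline.iplot` in a notebook, just like the `figure` variable above.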

### Working with Functions

Often, when working with graphs we are representing data that is a function of an input.  As described in previous sections, we write functions that produce these output values for us.

For example, we can describe a line as $y = 1.3x + 400$.  It's not too difficult for us to then visually represent this as a line.  We can start by translating this formula into a function.

In [67]:
def y(x):
    return 1.3*x + 400

y(30)

439.0

This function returns a y-value for whatever x-value we pass in.  

Imagine we want to plot this line for x values between 30 and 50.  First, we will need a set of x values between 30 and 50.  We can create this with Python's `range`, and call the result `x_values`.  Because `range` stops just before its second argument, we pass 51 to include 50.

In [72]:
x_values = list(range(30, 51, 1))
x_values

[30,
 31,
 32,
 33,
 34,
 35,
 36,
 37,
 38,
 39,
 40,
 41,
 42,
 43,
 44,
 45,
 46,
 47,
 48,
 49,
 50]

Now to produce the values of the line for that set of x_values, we need the corresponding `y_values`.  We can do this by iterating through our x values with the `map` function.

In [74]:
y_values = list(map(y, x_values))
y_values

[439.0,
 440.3,
 441.6,
 442.9,
 444.2,
 445.5,
 446.8,
 448.1,
 449.4,
 450.7,
 452.0,
 453.3,
 454.6,
 455.9,
 457.2,
 458.5,
 459.8,
 461.1,
 462.4,
 463.7,
 465.0]
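
A list comprehension is a common alternative to `map` for this step.  The sketch below redefines `y` so that it runs on its own, and the names `mapped` and `comprehended` are only for illustration.

```python
def y(x):
    return 1.3 * x + 400

x_values = list(range(30, 51))

# Two equivalent ways to compute a y-value for every x-value.
mapped = list(map(y, x_values))
comprehended = [y(x) for x in x_values]

print(mapped == comprehended)  # prints True
```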

Ok, now we have an array of `x_values` and an array of `y_values`.  We can pass these to a line graph.

In [77]:
scatter_trace = graph_objs.Scatter(x=x_values, y=y_values)

layout = {'title': 'Regression Line'}
fig = {'data': [scatter_trace], 'layout': layout}

plotly.offline.iplot(fig)


There it is: a nice regression line.  Note that we can do this in one fell swoop with the following code:

In [80]:
x_values = list(range(30, 51, 1))

scatter_trace = graph_objs.Scatter(x=x_values, y=list(map(y, x_values)))

layout = {'title': 'Regression Line'}
fig = {'data': [scatter_trace], 'layout': layout}

plotly.offline.iplot(fig)
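
To compare lines with different slopes or y-intercepts, we can wrap these steps in a function.  This is a sketch under stated assumptions: `line_trace` is a hypothetical helper, not part of plotly, and it returns a plain dictionary rather than using the `graph_objs.Scatter` constructor.

```python
def line_trace(slope, intercept, x_values, name='Regression Line'):
    """Return a line trace for y = slope * x + intercept (hypothetical helper)."""
    y_values = [slope * x + intercept for x in x_values]
    return {'type': 'scatter', 'mode': 'lines',
            'x': list(x_values), 'y': y_values, 'name': name}

x_values = list(range(30, 51))

original = line_trace(1.3, 400, x_values, name='y = 1.3x + 400')
steeper = line_trace(2.0, 400, x_values, name='y = 2x + 400')

# Both traces go in the same figure, so the two lines share one graph.
fig = {'data': [original, steeper], 'layout': {'title': 'Regression Lines'}}
```

In a notebook with plotly installed, `plotly.offline.iplot(fig)` should draw both lines, which makes the effect of changing the slope easy to see.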


### Summary

In this section we saw how we can use Plotly's library to create data visualizations.  We create different traces to represent our data, with each trace represented as a dictionary which is passed to our `iplot` method.  We saw we can have multiple traces displayed in the chart, as the traces are wrapped in an array.  We saw that even when we use constructors like `graph_objs.Bar` to create a chart, all this does is create a dictionary which is then passed to our `iplot` method.  Then we moved on to modifying the layout of our charts, which is also just a Python dictionary.  

We then ended the section by showing how to display lines with our charts.  We do so by creating some initial data for our $x$ values, and then using a function to map through this data and produce corresponding $y$ values.  Then we plot these points using these two arrays of coordinates.