# Visualize car data with Brunel

The Brunel Visualization language makes it easy to build interactive charts and diagrams that you can deploy rapidly. This notebook contains the steps and code to get you started with visualizing data with Brunel.

Some familiarity with Python is recommended. This notebook runs on Python.

You will use data about cars to graph the relationships between various properties, for example, how horsepower affects gas mileage. The cars data set was used for the 1983 American Statistical Association Data Exposition. This data set was collected by Ernesto Ramos and David Donoho and obtained from StatLib.

## Table of contents

This notebook has the following sections:
1. [Load the data](#data_set)
1. [Visualize the data](#visualize)
1. [Modify the DataFrame to highlight specific data](#highlight)
1. [Summary and next steps](#summary)

<a id="data_set"></a>
## 1. Load the data 
The car data is a freely available data set on the Watson Studio home page.

1. Go to the <a href="https://dataplatform.cloud.ibm.com/exchange/public/entry/view/c81e9be8daf6941023b9dc86f303053b" target="_blank">Car performance data</a> card on the Watson Studio home page.
1. Click the link button.
1. Hover over the link button next to the access key to display the link.
1. Double-click the link to select it, copy the link, and click **Close**.
1. In the cell below, replace the **LINK-TO-DATA** string in the `read_csv()` call with the link.

 
Run the next cell to import the pandas and Brunel libraries, load the data into a pandas DataFrame, and display the first six rows of data:

In [1]:
import pandas as pd
import brunel

# Replace LINK-TO-DATA with the link to the data set
cars = pd.read_csv("LINK-TO-DATA")

cars.head(6)

| Unnamed: 0 | mpg | cylinders | engine | horsepower | weight | acceleration | year | origin | name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 70 | American | chevrolet chevelle malibu |
| 1 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 70 | American | buick skylark 320 |
| 2 | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 70 | American | plymouth satellite |
| 3 | 16.0 | 8 | 304.0 | 150.0 | 3433 | 12.0 | 70 | American | amc rebel sst |
| 4 | 17.0 | 8 | 302.0 | 140.0 | 3449 | 10.5 | 70 | American | ford torino |
| 5 | 15.0 | 8 | 429.0 | 198.0 | 4341 | 10.0 | 70 | American | ford galaxie 500 |
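
If you do not have the data link at hand, the following sketch rebuilds the six rows shown above as a small DataFrame so you can inspect column types before plotting. It assumes that the `Unnamed: 0` column is a row index saved into the CSV file, which is why it passes `index_col=0`; the name `sample` is used only for this illustration.

```python
import io

import pandas as pd

# The six rows displayed above, written back as CSV text.
# The empty first header cell is what pandas reports as "Unnamed: 0".
sample_csv = """,mpg,cylinders,engine,horsepower,weight,acceleration,year,origin,name
0,18.0,8,307.0,130.0,3504,12.0,70,American,chevrolet chevelle malibu
1,15.0,8,350.0,165.0,3693,11.5,70,American,buick skylark 320
2,18.0,8,318.0,150.0,3436,11.0,70,American,plymouth satellite
3,16.0,8,304.0,150.0,3433,12.0,70,American,amc rebel sst
4,17.0,8,302.0,140.0,3449,10.5,70,American,ford torino
5,15.0,8,429.0,198.0,4341,10.0,70,American,ford galaxie 500
"""

# Assumption: the first column is a saved row index, so use it as the index.
sample = pd.read_csv(io.StringIO(sample_csv), index_col=0)

print(sample.dtypes)        # column types that Brunel will see
print(sample.isna().sum())  # missing values per column
```

The same `index_col=0` argument could be added to the `read_csv()` call on the full data set if you prefer not to carry the extra column.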


<a id="visualize"></a>
## 2. Visualize the data
You'll create some charts and diagrams with Brunel commands.

Each call to Brunel has a simple format: whether you write the command on a single line or across several lines, the parts are concatenated and the result is interpreted as one command.

Here are some of the rules for using Brunel that you'll need in this notebook:
 * __DataFrame__: Use the `data` command to specify the pandas DataFrame. 
 * __Chart type__: Use commands like `chord` and `treemap` to specify a chart type. If you don't specify a type, the default chart type is a scatter plot. 
 * __Chart definition__: Use the `x` and `y` commands to specify the data to include on the x-axis and the y-axis.
 * __Styling__: Use commands like `color`, `tooltip`, and `label` to control the styling of the graph.
 * __Size__: Use the `width` and `height` key-value pairs to specify the size of the graph. The key-value pairs must be preceded by two colons and separated by a comma, for example: `:: width=800, height=300`
 
See detailed documentation on the Brunel Visualization language at <a href="http://brunel.mybluemix.net/docs/" target="_blank">Introduction to Brunel</a>.
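
To make the rules above concrete, here is a minimal sketch that assembles a command string from its parts in plain Python. The helper `brunel_command` is hypothetical and is not part of the Brunel library; it only illustrates how the data reference, the chart elements, and the size options are joined into one command.

```python
def brunel_command(data_name, elements, width=None, height=None):
    """Join Brunel command parts into one string (hypothetical helper)."""
    # The data command comes first, followed by the chart elements.
    parts = ["data('{}')".format(data_name)] + list(elements)
    command = " ".join(parts)

    # Size options follow two colons and are separated by a comma.
    options = []
    if width is not None:
        options.append("width={}".format(width))
    if height is not None:
        options.append("height={}".format(height))
    if options:
        command += " :: " + ", ".join(options)
    return command


command = brunel_command(
    "cars", ["x(mpg)", "y(horsepower)", "color(origin)"], width=800, height=300
)
print(command)
```

The printed string is what you would type after the `%brunel` magic in a notebook cell.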

### Scatter plots
Run the next cell to show the relationship between the miles per gallon and the horsepower of the vehicles in a scatter plot. The color identifies the origin of the vehicles. 

In [2]:
%brunel data('cars') x(mpg) y(horsepower) color(origin) :: width=800, height=300

<IPython.core.display.Javascript object>

Put your cursor over the chart and scroll to zoom in and out. When you zoom in, you can pan across the chart by clicking and dragging. 
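
A chart shows the shape of a relationship; a correlation coefficient gives a single number to go with it. The sketch below computes the Pearson correlation between `mpg` and `horsepower` with pandas, using only the six rows displayed in section 1, so the value is illustrative and not a result for the full data set.

```python
import pandas as pd

# mpg and horsepower values from the six rows shown in section 1.
sample = pd.DataFrame(
    {
        "mpg": [18.0, 15.0, 18.0, 16.0, 17.0, 15.0],
        "horsepower": [130.0, 165.0, 150.0, 150.0, 140.0, 198.0],
    }
)

# Series.corr uses the Pearson method by default.
r = sample["mpg"].corr(sample["horsepower"])
print(round(r, 2))
```

With the full DataFrame loaded, the equivalent call would be `cars["mpg"].corr(cars["horsepower"])`.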

Run the next cell to show the relationship between the horsepower and the weight of the cars in a scatter plot. The color is based on the origin of the cars. The tooltips show the name of the cars. 

In [3]:
%brunel data('cars') x(horsepower) y(weight) color(origin) tooltip(name) :: width=800, height=300

<IPython.core.display.Javascript object>

### Chord plot
Run the next cell to show a chord plot that relates the origin of the cars to their year of manufacture. The `x` and `y` commands link each origin to each year. The size of the segments is based on the number of cars. The color is based on the origin of the cars.

In [4]:
%brunel data('cars') x(origin) y(year) chord size(#count) color(origin) :: width=500, height=400

<IPython.core.display.Javascript object>
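
As a rough pandas counterpart to `size(#count)`, the sketch below counts rows for each combination of origin and year. It assumes that `#count` is a per-group row count, and it uses a tiny made-up DataFrame named `toy`; the rows and the origin labels other than "American" are placeholders, not values taken from the cars data set.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the cars data set.
toy = pd.DataFrame(
    {
        "origin": ["American", "American", "European", "Japanese", "Japanese", "Japanese"],
        "year": [70, 70, 70, 71, 71, 70],
    }
)

# Number of rows for every (origin, year) pair.
counts = toy.groupby(["origin", "year"]).size()
print(counts)
```

Running the same `groupby` on `cars` gives the numbers behind the segment sizes under that assumption.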

### Treemap
A treemap can show many dimensions as recursively divided rectangles.

Run the next cell to show a treemap that groups vehicles by their origin, year of manufacture, and number of cylinders. The color indicates the average gas mileage of the vehicles in each block. The numbers in each block are the number of cylinders. The size of the blocks reflects the number of vehicles in the category. The tooltips show all the information.

In [5]:
%brunel data('cars') treemap x(origin, year, cylinders) color(mpg) mean(mpg) size(#count) label(cylinders) tooltip(#all) :: width=900, height=600

<IPython.core.display.Javascript object>
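
The treemap combines a grouping with two aggregations: the mean gas mileage and the number of vehicles per block. The sketch below reproduces that summary in pandas on a small made-up DataFrame named `toy`; the rows are placeholders for illustration, and the named-aggregation syntax assumes pandas 0.25 or later.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the cars data set.
toy = pd.DataFrame(
    {
        "origin": ["American", "American", "American", "Other", "Other"],
        "year": [70, 70, 71, 70, 70],
        "cylinders": [8, 8, 8, 4, 4],
        "mpg": [18.0, 16.0, 15.0, 26.0, 24.0],
    }
)

# One row per treemap block: mean mpg (the color) and row count (the size).
summary = (
    toy.groupby(["origin", "year", "cylinders"])
    .agg(mean_mpg=("mpg", "mean"), count=("mpg", "size"))
    .reset_index()
)
print(summary)
```

Replacing `toy` with `cars` gives a table you can compare against the tooltips in the chart.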

<a id="highlight"></a>
## 3. Modify the DataFrame to highlight specific data
You can modify or add to the DataFrame to show data in different ways. In the following example, you define a function that takes a string and tests whether it contains one of a set of substrings, ignoring case. The function is mapped over the `name` column to create a new column, `Type`, that holds "Ford" or "Buick" for the cars whose names match and is empty for all other cars. 

In [6]:
def identify(x, search): 
    for y in search: 
        if y.lower() in x.lower(): return y
    return None

cars['Type'] = cars.name.map(lambda x: identify(x, ["Ford", "Buick"]))
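
To see what the new column contains, the following sketch applies the same function to a few car names taken from the rows displayed in section 1. The Series `names` and the variable `matched` exist only for this illustration.

```python
import pandas as pd


def identify(x, search):
    """Return the first search term found in x (case-insensitive), else None."""
    for y in search:
        if y.lower() in x.lower():
            return y
    return None


# Car names taken from the first rows of the data set.
names = pd.Series(
    ["chevrolet chevelle malibu", "buick skylark 320", "ford torino", "ford galaxie 500"]
)
types = names.map(lambda x: identify(x, ["Ford", "Buick"]))

# Names that match neither term are left empty and are dropped here.
matched = types.dropna().tolist()
print(matched)  # ['Buick', 'Ford', 'Ford']
```

Because the function returns the search term and not the full name, every matching car is labeled with exactly "Ford" or "Buick", which gives the chart two color categories.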

Run the next cell to create a scatter chart that plots gas mileage versus engine size. The Buick cars have blue dots and the Ford cars have red dots. The Brunel command is split into two chart definitions that are combined by the overlay operator (a plus sign). Both chart definitions set the x-axis, the y-axis, and the color to the same values but set the style to different values. The first chart definition sets the style of the dots and the second definition sets the style of the words in the legend. The last line of the command sets the width and height of the chart.

In [7]:
%%brunel data('cars') x(engine) y(mpg) color(Type)  style('size:50%; fill:#eee') +
     x(engine) y(mpg) color(Type) style('text {font-size:14; font-weight:bold; fill:darker}') 
     :: width=800, height=800

<IPython.core.display.Javascript object>

<a id="summary"></a>
## 4. Summary and next steps
You explored different types of charts and formatting and learned how you can use the pandas DataFrame to refine your charts. Try changing the formatting of these charts, or creating your own. 

For more information about the Brunel Visualization language, see __<a href="http://brunel.mybluemix.net/docs/" target="_blank">Introduction to Brunel</a>__.

Also read the Brunel blog at __<a href="http://brunelvis.org/" target="_blank">Working Vis - Perspectives on Actionable Visualization</a>__.

### Authors

**Dan Rope** and **Graham Wills** are visualization architects. They created the Brunel visualization language.

Copyright © 2017-2019 IBM. This notebook and its source code are released under the terms of the MIT License.
