# Getting started with Notebooks in Watson Studio

This notebook contains the steps and code to get you started with importing data in a Notebook environment and visualizing data with Brunel.

This notebook runs on Python 3.

You will use data about cars to graph the relationships between various properties, for example, how horsepower affects gas mileage. The cars data set was used for the 1983 American Statistical Association Data Exposition. This data set was collected by Ernesto Ramos and David Donoho and obtained from StatLib.

## Table of contents

This notebook has the following sections:
1. [Load the data](#data_set)
1. [Visualize the data](#visualize)
1. [Modify the DataFrame to highlight specific data](#highlight)
1. [Summary and next steps](#summary)

<a id="data_set"></a>
## 1. Load the data to Object Store bucket
To start exploring the dataset, upload `cars.csv` (available in the downloaded .zip file) to the Cloud Object Storage bucket associated with your project. Select the `1010` icon in the top right menu, then select `Files`. You can either browse for the file in your local filesystem or drag and drop it into the side palette. After the upload completes, the file appears in the side panel.

## Insert to code: pandas DataFrame
`pandas` is an open source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Importing the dataset into your Python 3 notebook is straightforward. Go to the next empty cell in the notebook, open the `Files` tab in the side palette, and click `insert to code` below your dataset file. Select `Insert pandas DataFrame`. This generates the Python 3 code required to fetch your dataset from the Cloud Object Storage (COS) bucket and load it into the notebook environment.

In [None]:

import sys
import types
import pandas as pd
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share your notebook.
client_d63757cebaec46879c8378101b770c3a = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='3vcwuy0rNWugEN1fXsPko0vd19jbtJAoYyrr_W6LikrT',
    ibm_auth_endpoint="https://iam.eu-de.bluemix.net/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3.eu-geo.objectstorage.service.networklayer.com')

body = client_d63757cebaec46879c8378101b770c3a.get_object(Bucket='watsonstudiominihandson-donotdelete-pr-nwgf2ftsyg1qcu',Key='cars.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

df_data_1 = pd.read_csv(body)
df_data_1.head()



`head()` returns the first `n` rows for the object based on position. It is useful for quickly exploring the dataset. Take a moment to analyze the feature columns in the dataset.
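
As a minimal sketch of how `read_csv` and `head(n)` behave, the following cell reads a CSV from an in-memory file-like object instead of Cloud Object Storage. The column names mirror ones used later in this notebook, but the three rows are made-up placeholder values, not records from `cars.csv`.

```python
import io
import pandas as pd

# Made-up rows for illustration only; not taken from cars.csv.
csv_text = """name,mpg,horsepower,origin
car a,30.0,70,Japan
car b,15.0,150,USA
car c,25.0,90,Europe
"""

# read_csv accepts any file-like object, which is why the generated code
# above can pass the object storage response body directly.
sample = pd.read_csv(io.StringIO(csv_text))

first_two = sample.head(2)  # first 2 rows by position
print(first_two)
print(sample.dtypes)        # column types inferred by pandas
```

Calling `head()` without an argument returns the first five rows, so on a three-row frame like this one it returns every row.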

In [None]:
# Important: assign the generated DataFrame to `cars` for easy use in the rest of the notebook.
# The generated code above names it `df_data_1`; replace that name if yours differs.
cars = df_data_1

<a id="visualize"></a>
## 2. Visualize the data
Data visualization is a quick, easy way to understand data and convey concepts. You'll create some charts and diagrams with Brunel commands. Brunel is a highly succinct language for defining interactive data visualizations based on tabular data. See the detailed documentation on the Brunel Visualization language at __<a href="http://brunel.mybluemix.net/docs/" target="_blank">Introduction to Brunel</a>__. If you are familiar with the Matplotlib Python plotting library, feel free to use that to generate plots instead.

### Scatter plots
Run the next cell to show the relationship between the miles per gallon and the horsepower of the vehicles in a scatter plot. The color identifies the origin of the vehicles. 

In [None]:
import brunel
%brunel data('cars') x(mpg) y(horsepower) color(origin) :: width=800, height=300

The chart shows a general trend: as the horsepower of a car increases, its fuel efficiency declines.
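
One way to put a number on a trend like this is a correlation coefficient. The sketch below uses pandas' `Series.corr` (Pearson by default) on a handful of made-up values chosen only to illustrate the calculation; they are not taken from `cars.csv`.

```python
import pandas as pd

# Made-up values for illustration only; not taken from cars.csv.
sample = pd.DataFrame({
    "mpg": [35.0, 30.0, 25.0, 20.0, 15.0],
    "horsepower": [60, 75, 95, 130, 165],
})

# Pearson correlation between the two columns; a negative value means
# mpg tends to fall as horsepower rises.
r = sample["mpg"].corr(sample["horsepower"])
print(round(r, 3))
```

To run the same check on the real data, replace `sample` with `cars`; this assumes the `mpg` and `horsepower` columns were loaded as numeric types.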

Put your cursor over the chart and scroll to zoom in and out. When you zoom in, you can pan across the chart by clicking and dragging. 

Run the next cell to show the relationship between the horsepower and the weight of the cars in a scatter plot. The color is based on the origin of the cars. The tooltips show the name of the cars. 

In [None]:
%brunel data('cars') x(horsepower) y(weight) color(origin) tooltip(name) :: width=800, height=300
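
For readers who prefer Matplotlib, which this section mentions as an alternative, a rough equivalent of the scatter plot above might look like the following. This is a sketch under assumptions: the rows are made-up placeholder values, and unlike the Brunel chart it has no tooltips or zooming.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only; not taken from cars.csv.
sample = pd.DataFrame({
    "horsepower": [70, 90, 150, 165, 95, 60],
    "weight": [2100, 2400, 3700, 4200, 2600, 1900],
    "origin": ["Japan", "Europe", "USA", "USA", "Europe", "Japan"],
})

fig, ax = plt.subplots(figsize=(8, 3))
# Draw one scatter series per origin so each gets its own color and legend entry.
for origin, group in sample.groupby("origin"):
    ax.scatter(group["horsepower"], group["weight"], label=origin)
ax.set_xlabel("horsepower")
ax.set_ylabel("weight")
ax.legend(title="origin")
```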

### Chord plot
Run the next cell to show a chord plot that correlates the origin and number of cars produced per year. The x and y commands specify that the origin is mapped to the year of manufacture. The size of the segments is based on the number of cars. The color is based on the origin of the cars.

In [None]:
%brunel data('cars') x(origin) y(year) chord size(#count) color(origin) :: width=500, height=400
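
To see the kind of counts that sit behind a chart like this, you can tabulate origin against year in pandas. The sketch below uses `pd.crosstab` on made-up rows; it is an assumption that this table matches what Brunel computes for `size(#count)`, but it shows the same idea of counting cars per origin and year.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from cars.csv.
sample = pd.DataFrame({
    "origin": ["USA", "USA", "Japan", "Europe", "USA", "Japan"],
    "year": [70, 70, 70, 71, 71, 71],
})

# Rows are origins, columns are years, cells are the number of cars.
counts = pd.crosstab(sample["origin"], sample["year"])
print(counts)
```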

<a id="highlight"></a>
## 3. Modify the DataFrame to highlight specific data
You can modify or add to the DataFrame to show data in different ways. In the following example, you define a function that takes a string and tests whether it contains one of a set of substrings, ignoring case. The function is mapped over the `name` column to create a new `Type` column that holds "Ford" or "Buick" for the names that match and `None` for all others.

In [None]:
def identify(x, search): 
    for y in search: 
        if y.lower() in x.lower(): return y
    return None

cars['Type'] = cars.name.map(lambda x: identify(x, ["Ford", "Buick"]))

In [None]:
cars.head()
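
To check how `identify` behaves before applying it to the whole column, you can call it on a few strings directly. The model names below are hypothetical and serve only to exercise the function.

```python
def identify(x, search):
    """Return the first entry of `search` found in `x`, ignoring case."""
    for y in search:
        if y.lower() in x.lower():
            return y
    return None

# Hypothetical model names, used only to exercise the function.
names = ["ford torino", "Buick Skylark 320", "toyota corolla"]
labels = [identify(n, ["Ford", "Buick"]) for n in names]
print(labels)  # ['Ford', 'Buick', None]
```

The returned label uses the capitalization from the search list, not from the input string, so every matching row gets a consistent category value.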

Run the next cell to create a scatter chart that plots gas mileage versus engine size. The Buick cars have blue dots and the Ford cars have red dots. The Brunel command is split into two chart definitions that are combined by the overlay operator (a plus sign). The last line of the command sets the width and height of the chart.

In [None]:
%%brunel data('cars') x(engine) y(mpg) color(Type)  style('size:50%; fill:#eee') +
     x(engine) y(mpg) color(Type) style('text {font-size:14; font-weight:bold; fill:darker}') 
     :: width=800, height=300

## Well done! You already have some wonderful findings from the dataset.
You are now ready to share it with your team. Click the `Share` icon in the top left toolbar. You can share the notebook link, and everyone with the link will be able to open the notebook and look at your work. Depending on the content of your notebook, you can choose what to share, from `Only text and Output` to `All content, including code`.

Feel free to explore other options in Watson Studio, for example, saving versions of your notebook, commenting on the notebook, or inviting other users to collaborate in your project. You can find information about how to do all that in the __<a href="https://eu-de.dataplatform.cloud.ibm.com/docs/content/getting-started/welcome-main.html?audience=wdp&context=analytics" target="_blank">Docs</a>__.

<a id="summary"></a>
## 4. Summary and next steps
You learned how to bring your dataset into a notebook environment, create and format different types of charts, and use a pandas DataFrame to refine your charts. Try changing the formatting of these charts, or creating your own.

Now you are ready to explore the __<a href="https://eu-de.dataplatform.cloud.ibm.com/community?context=analytics&format=notebook" target="_blank">Watson Studio Community</a>__ for notebooks. It contains a huge selection of curated learning material.

Copyright © 2017, 2018 IBM. This notebook and its source code are released under the terms of the MIT License.
