## PRACTICE NOTEBOOK

`Welcome to the practice notebook.` 
* It includes high-level overviews of some of the topics we've already discussed.

* Items that start with a number are exercises to be completed. 

* Follow the instructions listed under **`Tasks`**.

* Start by running the first cell below to set up the data.

#### RUN THE CELL BELOW BEFORE STARTING

In [1]:
import numpy as np
import pandas as pd

from bokeh.plotting import ColumnDataSource


# Import the necessary datasets
building_data = pd.read_csv('../datasets/Building_Energy_and_Water_Use_Metrics.csv', index_col=0)

building_cds = ColumnDataSource(building_data)
building_data.head(2)

Unnamed: 0,Property_Name,Address,ZIP,Property_Type,Gross_Sq_Ft,Property_Uses,Site_EUI,EnergyStar_Score,EnergyStar_Certified,Year_Built,GHG_Emissions,GHG_Intensity,Site_Energy_Use,Percent_Electr,Percent_Gas,Percent_Steam,built_before
0,50 West Broadway,50 West Broadway,2127,Multifamily Housing,250755.0,"Multifamily Housing, Parking",39.3,97,,2008,585.0,2.9,"* 7,898,243",48%,52%,0%,Built after 1950
1,Boston Trinity Academy,17 Hale Street,2136,K-12 School,53000.0,K-12 School,50.4,98,,1956,142.4,2.7,"* 2,673,498",0%,100%,0%,Built after 1950


## 1.0 Basic Plot with no Data
* Review:
    * figure()
    * output_notebook() or output_file()
    * show()
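
As a hedged illustration of how these pieces fit together outside a notebook, the sketch below builds an empty figure and writes it to an HTML file. The file name `empty_plot.html`, the temporary directory, and the use of `save()` in place of `show()` are assumptions made so the snippet runs without opening a browser. In the notebook you would call `output_notebook()` and `show()` instead.

```python
import os
import tempfile

from bokeh.io import output_file, reset_output, save
from bokeh.plotting import figure

# Hypothetical output location, chosen only for this sketch
html_path = os.path.join(tempfile.mkdtemp(), "empty_plot.html")

# A figure with no glyphs renders as an empty set of axes
empty_fig = figure(title="Empty figure")

# output_file() tells bokeh where to write; save() writes the file
# without opening a browser window
output_file(html_path, title="Empty figure")
saved_path = save(empty_fig)

# Clear the output settings so later cells can choose their own output mode
reset_output()
```

bokeh may warn that the figure has no renderers; that is expected for a plot with no data.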
    
#### Tasks:
1. Import the appropriate libraries
2. Create a **`figure()`** object named **`p`**
3. Output the figure to the notebook

In [5]:
# create a basic plot figure from scratch and output to notebook

## 2.0 Plotting Using NumPy Arrays
* NumPy is a widely used package for scientific computing in Python
* It provides a very useful multidimensional array object, the **`ndarray`**
* bokeh can take input in the form of an **`ndarray`** object
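
A small sketch of the `ndarray` object described above, using made-up values rather than the exercise data. The variable names are illustrative only.

```python
import numpy as np

# Five evenly spaced values from 0 to 1 (illustrative data only)
samples = np.linspace(0, 1, 5)

# Arithmetic applies element by element and returns another ndarray
squared = samples ** 2

print(type(samples))   # <class 'numpy.ndarray'>
print(samples.shape)   # (5,)
print(squared)
```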

#### Tasks:
1. Create a blank **`figure()`**
2. Plot a circle glyph with x and y as input values
3. Call **`output_notebook()`** and **`show()`** to display the plot
4. Print the type of x and y using **`type()`**

In [7]:
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
from bokeh.io import output_notebook, show

# create a figure

# x and y are numpy arrays
x = np.linspace(0, 10, 101)
y = np.exp(x)

# plot a circle glyph with x and y

# output the notebook and show

# print the type of x and y

## 3.0 Using Pandas DataFrames
* pandas is a Python package for working with relational data
* It is built on top of the NumPy library.
* The pandas DataFrame is its primary data structure
    * analogous in appearance to an Excel workbook or an R data frame
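
To make the relationship between pandas and NumPy concrete, here is a minimal sketch with a made-up two-row table (not the building data). The names `people` and `heights` are hypothetical.

```python
import numpy as np
import pandas as pd

# Illustrative table; the names and values are made up
people = pd.DataFrame({"name": ["Ann", "Bo"], "height": [68, 70]})

# Selecting one column returns a pandas Series
heights = people["height"]

# The values of a numeric Series are stored in a NumPy array
height_values = heights.values

print(type(heights))
print(type(height_values))
```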
    
#### Tasks:
1. Use the **`GHG_Emissions`** column from the **`building_data`** DataFrame for the x-value
2. Use the **`Gross_Sq_Ft`** column from the **`building_data`** DataFrame for the y-value
3. Create a circle glyph with the x and y values
4. Output to notebook and show the plot
5. Print the type of the x and y values

In [8]:
from bokeh.plotting import figure
from bokeh.io import output_notebook, show

# create the x-value below here

# create the y-value below here


p = figure(plot_width=500,
           plot_height=300,
           x_axis_label='Greenhouse Gas Emissions',
           y_axis_label='Gross Square Feet')

# create a circle glyph

# output the notebook and show

# print the type of the x and y

## Bokeh's ColumnDataSource
##### *Review*
* Behind the scenes, bokeh converts inputs such as NumPy arrays and pandas columns into its main data structure, the **`ColumnDataSource`**.
* **`ColumnDataSource`** has a **`data`** attribute that maps a string name to a sequence of data.
    * For a pandas DataFrame, the string name is the column name and the sequence of data is the values from that column

## Equivalent DataFrame and ColumnDataSource for comparison
##### *Review*

In [26]:
# review material
table = pd.DataFrame(data=[['Greg', 2, 68], ['Tim', 4, 70]],
                     columns=['name', 'number', 'height'])

print(table)

   name  number  height
0  Greg       2      68
1   Tim       4      70


In [25]:
# review material
table = ColumnDataSource(data={
    'name': ['Greg', 'Tim'],
    'number': [2, 4],
    'height': [68, 70],
})

table.data

{'height': [68, 70], 'name': ['Greg', 'Tim'], 'number': [2, 4]}

* Benefits of the **`ColumnDataSource`**:
    * Can be used to link selections between plots
    * Can be used to create extra hover tooltips
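
Both benefits can be sketched with the small `name`/`number`/`height` table from the review cells above. This is a minimal illustration rather than the only way to do it: the names `shared_source`, `left` and `right` are hypothetical, and `scatter` is used because it draws circle markers by default.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# One shared data source (same toy table as the review cells)
shared_source = ColumnDataSource(data={
    'name': ['Greg', 'Tim'],
    'number': [2, 4],
    'height': [68, 70],
})

# Extra hover tooltips: "@column" looks up that column for the hovered point
hover = HoverTool(tooltips=[('name', '@name'), ('height', '@height')])

# Two figures that each offer a selection tool
left = figure(tools='box_select,reset')
right = figure(tools='box_select,reset')
left.add_tools(hover)

# Both glyphs read from the same source, so a selection made in one
# figure is reflected in the other
left_glyph = left.scatter(x='number', y='height', source=shared_source)
right_glyph = right.scatter(x='height', y='number', source=shared_source)
```

Placing `left` and `right` in a layout and showing them lets you try the linked selection interactively.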

## Transform a Pandas DataFrame to a ColumnDataSource
##### *Review*

In [2]:
# pass the pandas DataFrame building_data to ColumnDataSource function
building_cds = ColumnDataSource(building_data)

building_cds.data.keys() # the keys are the DataFrame's column headers, plus 'index' for the DataFrame index

dict_keys(['Property_Name', 'Address', 'ZIP', 'Property_Type', 'Gross_Sq_Ft', 'Property_Uses', 'Site_EUI', 'EnergyStar_Score', 'EnergyStar_Certified', 'Year_Built', 'GHG_Emissions', 'GHG_Intensity', 'Site_Energy_Use', 'Percent_Electr', 'Percent_Gas', 'Percent_Steam', 'built_before', 'index'])
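
The `'index'` key at the end of the output above comes from the DataFrame's index. A minimal sketch with the toy table from the earlier review cell (not the building data) shows the same behavior; `toy_table` and `toy_cds` are hypothetical names.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

toy_table = pd.DataFrame(data=[['Greg', 2, 68], ['Tim', 4, 70]],
                         columns=['name', 'number', 'height'])

toy_cds = ColumnDataSource(toy_table)

# Every DataFrame column becomes a key, and the unnamed index is
# stored under the key 'index'
print(sorted(toy_cds.data.keys()))

# Each value is a sequence with one entry per row
print(list(toy_cds.data['height']))
```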

## 4.0 Plotting with the ColumnDataSource

#### Tasks:
1. Create a ColumnDataSource object by passing the **`building_data`** DataFrame into the function
    * name the ColumnDataSource object **`building_cds`**
2. Add a label to the x-axis with the **`x_axis_label`** parameter within **`figure()`**
3. Add a label to the y-axis with the **`y_axis_label`** parameter within **`figure()`**
4. Pass the dictionary keys as inputs to the x and y parameters of the circle glyph
    * **`GHG_Emissions`** and **`Gross_Sq_Ft`** are keys from the ColumnDataSource
    * set the source parameter equal to the ColumnDataSource object **`building_cds`**
    * NOTE you are pulling data from the ColumnDataSource object and NOT the pandas DataFrame

In [12]:
from bokeh.plotting import figure
from bokeh.io import output_notebook, show

# create a ColumnDataSource object


# Set up the figure
p = figure(plot_width=500,
           plot_height=300,
          )

# create a circle glyph following the above instructions


# output notebook and show


print(type(building_cds))

<class 'bokeh.models.sources.ColumnDataSource'>


## 5.0 Color Mapping
* You can color points based on categorical values
* **`from bokeh.models import CategoricalColorMapper`**
    * `CategoricalColorMapper` inputs:
        * factors
        * palette
* To apply the mapper, pass a dictionary with two keys to the glyph's color property:
    * **`field`** - the name of the column to map
    * **`transform`** - the color mapper to apply to that column's values
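
The pieces listed above can be sketched on a made-up data set (the column `group` and all `toy_` names are hypothetical, not part of the building data). The sketch uses `scatter`, which draws circle markers by default.

```python
from bokeh.models import CategoricalColorMapper, ColumnDataSource
from bokeh.plotting import figure

# Toy data with a categorical 'group' column (made up for illustration)
toy_source = ColumnDataSource(data={
    'x': [1, 2, 3, 4],
    'y': [4, 3, 2, 1],
    'group': ['a', 'b', 'a', 'b'],
})

# Each factor is paired with the palette entry at the same position
toy_mapper = CategoricalColorMapper(factors=['a', 'b'],
                                    palette=['red', 'blue'])

toy_fig = figure()

# 'field' names the column to map; 'transform' is the mapper that
# turns each category into a color
toy_glyph = toy_fig.scatter(x='x', y='y', source=toy_source,
                            color={'field': 'group', 'transform': toy_mapper})
```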
    
#### Tasks:
1. Import CategoricalColorMapper from bokeh.models
2. Set up the figure p_basic with **`plot_width`**, **`plot_height`**, **`x_axis_label`** and **`y_axis_label`**.
3. Set the palette parameter within CategoricalColorMapper to two different colors
4. Set the **`transform`** key to the color_mapper object.

In [12]:
from bokeh.plotting import figure
from bokeh.io import output_notebook, show
# import CategoricalColorMapper

building_cds = ColumnDataSource(building_data)

# Set up the figure
p_basic = figure(# set plot width,
    # set plot height,
    # set x axis label,
    # set y axis label,
)

# Create the CategoricalColorMapper object
color_mapper = CategoricalColorMapper(factors=['Built after 1950', 'Built before 1950'],
                                      palette = # set the color palette here e.g. palette=['red', 'blue']
                                      )

p_basic.circle(x='GHG_Emissions',
               y='Gross_Sq_Ft',
               source=building_cds,
               color={'field':'built_before', 
                      'transform': }, # transform the values by the color_mapper object
               legend='built_before')

output_notebook()
show(p_basic)

print(type(building_cds))

## Color Palette Structure
##### *Review*
* The palettes themselves are just lists of hexadecimal RGB color strings
* Palette groups such as **`Colorblind`** are indexed by the number of colors, and the smallest size is 3

In [10]:
from bokeh.palettes import Colorblind, viridis
print(Colorblind[3])
print(viridis(6))

['#0072B2', '#E69F00', '#F0E442']
['#440154', '#404387', '#29788E', '#22A784', '#79D151', '#FDE724']
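
A short sketch of the structure behind `Colorblind[3]`, assuming only what the output above shows: the palette group behaves like a dictionary keyed by palette size.

```python
from bokeh.palettes import Colorblind

# Colorblind is a dict keyed by the number of colors in each palette
sizes = sorted(Colorblind.keys())
print(sizes)

# Looking up a size returns that many hex color strings
three_colors = list(Colorblind[3])
print(three_colors)
```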


## 5.1 Using an Imported Color Palette

#### Tasks:
1. For the **`palette`** parameter within CategoricalColorMapper, pass the **`Colorblind`** palette with any integer value from **3** to **8**
2. Following the structure from exercise 5.0, fill out the circle glyph with the appropriate properties

In [13]:
from bokeh.plotting import figure
from bokeh.io import output_notebook, show
from bokeh.models import CategoricalColorMapper
from bokeh.palettes import Colorblind

building_cds = ColumnDataSource(building_data)

p_cat = figure(plot_width=500,
               plot_height=300,
               x_axis_label='Greenhouse Gas Emissions',
               y_axis_label='Gross Square Feet')



color_mapper = CategoricalColorMapper(factors=['Built after 1950', 'Built before 1950'],
                                      palette =  # color palette values e.g. Colorblind[3]
                                      )

p_cat.circle() # add x, y, color and legend values to the circle glyph

output_notebook()
show(p_cat)

print(type(building_cds))

### Larger Palettes
##### *Review*
The bokeh.palettes module also has some larger palettes with 256 colors. 
* The large palettes available are shown below:

<img src="../presentation/images/large-palettes.png">
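
As a hedged example of working with one of the large palettes, the sketch below uses `Viridis256`, chosen here only for illustration. It also uses the lowercase `viridis()` helper from the earlier review cell to sample a smaller palette.

```python
from bokeh.palettes import Viridis256, viridis

# A 256-color palette is a plain sequence of hex color strings
print(len(Viridis256))
print(Viridis256[0], Viridis256[-1])

# The lowercase helper returns a palette with the requested number of colors
ten_colors = viridis(10)
print(len(ten_colors))
```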

## Bokeh Layouts
##### *Review*
* **`row()`** - arranges plots & menu objects side by side in a row
* **`column()`** - stacks plots & menu objects vertically in a column
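
A minimal sketch of both layout functions, including nesting one inside the other. The helper `make_plot` and its placeholder data are hypothetical and exist only to give each layout its own figures.

```python
from bokeh.layouts import column, row
from bokeh.plotting import figure


def make_plot(title):
    """Return a small placeholder figure (hypothetical helper for this sketch)."""
    fig = figure(title=title)
    fig.scatter([1, 2, 3], [3, 1, 2])
    return fig


# row() places its children side by side
side_by_side = row(make_plot('A'), make_plot('B'))

# column() stacks its children vertically
stacked = column(make_plot('C'), make_plot('D'))

# Layouts nest: a row of two plots on top, a single plot underneath
combined = column(row(make_plot('E'), make_plot('F')), make_plot('G'))
```

Passing any of these layout objects to `show()` displays the arranged plots.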

## 6.0 Row Layout
#### Tasks:
1. Add p_basic, p_cat and p_scale in a row

In [37]:
from bokeh.layouts import row, column

layout = row() # add p_basic, p_cat and p_scale in a row

show(layout)

## 6.1 Column Layout
#### Tasks:
1. Add p_basic, p_cat and p_scale in a column to a layout object
2. Show layout

In [38]:
from bokeh.layouts import row, column

# add p_basic, p_cat and p_scale in a column

# show output


## 6.2 Combination Layout
#### Tasks:
1. Add p_basic, p_cat in the first row and add p_scale alone on a second row
2. Show layout

In [13]:
from bokeh.layouts import row, column

# add (p_basic, p_cat) in the first row and add p_scale alone on a second row

# show output

## 7.0 Creating grid plots
* The benefit of a grid plot is that you have one toolbar for all the plots

#### Tasks:
1. Pass a list of lists into gridplot()
2. Make sure the lists are of the same length
3. If there is a blank space in the grid you must pass None into the list
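
The list-of-lists shape described in these tasks can be sketched with placeholder figures. The helper `make_plot` and the name `cells` are hypothetical; the exercises below use `p_basic`, `p_cat` and `p_scale` instead.

```python
from bokeh.layouts import gridplot
from bokeh.plotting import figure


def make_plot(title):
    """Return a small placeholder figure (hypothetical helper for this sketch)."""
    fig = figure(title=title)
    fig.scatter([1, 2, 3], [3, 1, 2])
    return fig


# Each inner list is one row of the grid; both rows have two entries,
# and None marks the empty bottom-right cell
cells = [
    [make_plot('top left'), make_plot('top right')],
    [make_plot('bottom left'), None],
]

grid = gridplot(cells)
```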

In [46]:
from bokeh.layouts import gridplot

# create grid_layout below



output_notebook()
show(grid_layout)

## 7.1 Allow for scaling of the plots
#### Tasks:
1. Use the sizing_mode attribute and set it to 'scale_width'

In [41]:
from bokeh.layouts import gridplot

grid_layout = gridplot(
    [[p_basic, p_cat], [p_scale, None]]) # modify the sizing_mode attribute

output_notebook()
show(grid_layout)

## 7.2 Changing the Toolbar Location
#### Tasks:
1. Use the sizing_mode attribute and set it to 'scale_width'
2. Use **`toolbar_location`** to change the location of the toolbar

In [42]:
from bokeh.layouts import gridplot

grid_layout = gridplot(
    children=[[p_basic, p_cat], [p_scale, None]],
            # modify the sizing_mode attribute,
            # modify the toolbar_location attribute
)  

output_notebook()
show(grid_layout)