<a href="https://colab.research.google.com/github/mincloud1501/Bokeh/blob/master/googlesheet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Getting set up

Install Bokeh with `pip install bokeh`.
To test the setup, run the next cell. You should see output similar to the following.

In [140]:
from IPython import __version__ as ipython_version
from pandas import __version__ as pandas_version
from bokeh import __version__ as bokeh_version
print("IPython - %s" % ipython_version)
print("Pandas - %s" % pandas_version)
print("Bokeh - %s" % bokeh_version)

IPython - 5.5.0
Pandas - 0.25.3
Bokeh - 1.0.4


# Basic plotting with Bokeh


In [0]:
# Import NumPy
import numpy as np

# Import pandas
import pandas as pd

# Import figure and show from bokeh.plotting
from bokeh.plotting import figure, show

# Import output_file and output_notebook from bokeh.io
from bokeh.io import output_file, output_notebook

output_notebook()

# Google Sheets

Our examples below use the open-source [`gspread`](https://github.com/burnash/gspread) library for interacting with Google Sheets.

First, install the package using `pip`.

In [142]:
!pip install --upgrade --quiet gspread
!pip install --upgrade oauth2client
!pip install PyOpenSSL
!pip install -U -q PyDrive

Requirement already up-to-date: oauth2client in /usr/local/lib/python3.6/dist-packages (4.1.3)


Import the library, authenticate, and create the interface to Sheets.


In [0]:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

import gspread

auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

gc = gspread.authorize(GoogleCredentials.get_application_default())


In [0]:
sh = gc.create('My Test')

In [168]:
from os import path
from google.colab import drive
notebook_dir_name = 'Colab Notebooks'
drive.mount('/content/drive')
# Use the absolute mount point so the path does not depend on the working directory
notebook_base_dir = path.join('/content/drive/My Drive', notebook_dir_name)

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [172]:
with open('/content/drive/My Drive/foo.txt', 'w') as f:
  f.write('Hello Google Drive!')
!cat /content/drive/My\ Drive/foo.txt

Hello Google Drive!

## Downloading data from a sheet into Python as a Pandas DataFrame


In [0]:
worksheet = gc.open('My Test').sheet1

# get_all_values gives a list of rows.
data = worksheet.get_all_values()

headers = data.pop(0)

df = pd.DataFrame(data, columns=headers)
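
The header-then-rows pattern above can be sketched without a live sheet. This is a minimal illustration, assuming the cells arrive as strings (as they do from `get_all_values()` by default); the sample rows and the helper name `rows_to_dataframe` are hypothetical and not part of gspread or pandas.

```python
import pandas as pd

# Hypothetical rows, shaped like the list of lists returned by get_all_values()
rows = [
    ['country', 'fertility', 'female literacy'],
    ['A', '1.8', '90.5'],
    ['B', '2.7', '50.8'],
]


def rows_to_dataframe(rows, numeric_columns=()):
    """Build a DataFrame from a list of rows whose first row is the header."""
    if not rows:
        # An empty sheet has no header row to split off
        return pd.DataFrame()
    headers, *records = rows  # does not modify the input list
    df = pd.DataFrame(records, columns=headers)
    for col in numeric_columns:
        # Cells are strings; convert them so they can be plotted as numbers
        df[col] = pd.to_numeric(df[col], errors='coerce')
    return df


df = rows_to_dataframe(rows, numeric_columns=['fertility', 'female literacy'])
```

Cells that cannot be parsed as numbers become `NaN` because of `errors='coerce'`, which may or may not be what you want for your data.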

## Plotting with glyphs
https://docs.bokeh.org/en/latest/docs/user_guide/plotting.html


In [179]:
plot = figure(plot_width=800, plot_height=500, tools='pan,box_zoom')
plot.circle([10,20,33,4,5, 100], [8,6,5,2,3, 300])
output_file('circles.html')
show(plot)

## What are glyphs?
* Visual shapes
  * circles, squares, triangles
  * rectangles, lines, wedges
* With properties attached to data
  * coordinates (x,y)
  * size, color, transparency

### Glyph properties
  * Lists, arrays, sequences of values
  * Single fixed values

```
plot = figure()
plot.circle(x=10, y=[2,5,8,12], size=[10,20,30,40])
```
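
As a rough sketch of how the two kinds of values combine, the snippet below mixes a fixed `x` and a fixed color with sequence-valued `y` and `size`. The variable names are illustrative only, and the inspection of `data_source` at the end is just one way to see where the sequences end up.

```python
from bokeh.plotting import figure

plot = figure()

# x and color are single fixed values shared by every marker;
# y and size are sequences with one entry per marker
renderer = plot.circle(x=10, y=[2, 5, 8, 12], size=[10, 20, 30, 40], color='navy')

# Sequence-valued properties are stored as columns of the glyph's data source
sizes = list(renderer.data_source.data['size'])
ys = list(renderer.data_source.data['y'])
```

Here four circles share the same x-coordinate and color, while each one gets its own y-coordinate and size.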

### Markers
https://docs.bokeh.org/en/latest/docs/gallery/markers.html

## A simple scatter plot

In this example, you're going to make a scatter plot of female literacy vs fertility using data from the European Environmental Agency. This dataset highlights that countries with low female literacy have high birthrates. The x-axis data has been loaded for you as fertility and the y-axis data has been loaded as female_literacy.

Your job is to create a figure, assign x-axis and y-axis labels, and plot female_literacy vs fertility using the circle glyph.

After you have created the figure, in this exercise and the ones to follow, play around with it! Explore the tools in the toolbar to the right of the plot, such as "Pan", "Box Zoom", and "Wheel Zoom". You can click on the question mark sign for more details on any of these tools.

Note: You may have to scroll down to view the lower portion of the figure.

* Import the `figure` function from `bokeh.plotting`, and the `output_file` and `show` functions from `bokeh.io`.
* Create the figure `p` with `figure()`, passing it two parameters: `x_axis_label` and `y_axis_label`.
* Add a circle glyph to the figure `p` using `p.circle()`, where the inputs are, in order, the x-axis data and the y-axis data.
* Use the `output_file()` function to specify the name `'fert_lit.html'` for the output file.
* Create and display the output file by calling `show()` and passing in the figure `p`.

In [0]:
sh1 = gc.create('My Test1')

In [0]:
worksheet = gc.open('My Test1').sheet1

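# get_all_values gives a list of rows; every cell comes back as a string.
data = worksheet.get_all_values()

# The first row holds the column headers
headers = data.pop(0)

# Build the DataFrame
df = pd.DataFrame(data, columns=headers)

# Convert the numeric columns from strings to numbers so they plot on numeric axes
numeric_cols = ['population', 'fertility', 'female literacy']
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors='coerce')

# Extract the columns used in the plots
population = df['population']
fertility = df['fertility']
female_literacy = df['female literacy']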



In [186]:

# Create the figure: p
p = figure(x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)')

# Add a circle glyph to the figure p
p.circle(fertility, female_literacy)

# Call the output_file() function and specify the name of the file
output_file('fert_lit.html')

# Display the plot
show(p)

## Customizing your scatter plots

The three most important arguments to customize scatter glyphs are color, size, and alpha. Bokeh accepts colors as hexadecimal strings, tuples of RGB values between 0 and 255, and any of the 147 CSS color names. Size values are supplied in screen units (pixels), so a marker keeps the same on-screen size when you zoom.

The alpha parameter controls transparency. It takes in floating point numbers between 0.0, meaning completely transparent, and 1.0, meaning completely opaque.
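
The hexadecimal and RGB-tuple color forms described above encode the same information. The sketch below shows the correspondence with a small helper; `rgb_to_hex` is a hypothetical name written for this illustration and is not a Bokeh function.

```python
def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of 0-255 integers to a '#rrggbb' string."""
    r, g, b = rgb
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError('RGB channels must be between 0 and 255')
    return '#{:02x}{:02x}{:02x}'.format(r, g, b)


# The CSS name 'red', the tuple (255, 0, 0), and '#ff0000' are the same color
red_as_tuple = (255, 0, 0)
red_as_hex = rgb_to_hex(red_as_tuple)
```

Any of the three forms can be passed as the `color` argument of a glyph method such as `p.circle()`.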

In this exercise, you'll plot female literacy vs fertility for Africa and Latin America as red and blue circle glyphs, respectively.

* Using the Latin America data (`fertility_latinamerica` and `female_literacy_latinamerica`), add a blue circle glyph of `size=20` and `alpha=0.8` to the figure `p`. To do this, you will need to specify the `color`, `size` and `alpha` keyword arguments inside `p.circle()`.
* Using the Africa data (`fertility_africa` and `female_literacy_africa`), add a red circle glyph of `size=20` and `alpha=0.8` to the figure `p`.





In [187]:
fertility = df['fertility']
female_literacy = df['female literacy']

# Select rows by continent and the column of interest in a single .loc call
fertility_latinamerica = df.loc[df['continent'] == 'LAT', 'fertility']
female_literacy_latinamerica = df.loc[df['continent'] == 'LAT', 'female literacy']

fertility_africa = df.loc[df['continent'] == 'AF', 'fertility']
female_literacy_africa = df.loc[df['continent'] == 'AF', 'female literacy']

# Create the figure: p
p = figure(x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)')

# Add a blue circle glyph to the figure p
p.circle(fertility_latinamerica, female_literacy_latinamerica, color='blue', size=20, alpha=0.8)

# Add a red circle glyph to the figure p
p.circle(fertility_africa, female_literacy_africa, color='red', size=20, alpha=0.8)

# Specify the name of the file
output_file('fert_lit_separate_colors.html')

# Display the plot
show(p)
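
When there are more than two groups, repeating the selection and the glyph call by hand gets tedious. One possible variation, sketched here with hypothetical sample rows standing in for the sheet data, loops over a mapping from continent code to color; the names `continent_colors` and `glyphs` are illustrative.

```python
import pandas as pd
from bokeh.plotting import figure

# Hypothetical sample rows standing in for the data read from the sheet
df = pd.DataFrame({
    'continent': ['LAT', 'AF', 'AF', 'LAT'],
    'fertility': [2.1, 5.0, 4.3, 2.4],
    'female literacy': [90.0, 55.0, 61.0, 93.0],
})

# One color per continent code
continent_colors = {'LAT': 'blue', 'AF': 'red'}

p = figure(x_axis_label='fertility (children per woman)',
           y_axis_label='female_literacy (% population)')

# Add one circle glyph per continent and keep the returned renderers
glyphs = {}
for code, color in continent_colors.items():
    subset = df.loc[df['continent'] == code]
    glyphs[code] = p.circle(subset['fertility'], subset['female literacy'],
                            color=color, size=20, alpha=0.8)
```

Calling `show(p)` afterwards displays the figure as in the cell above.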

## Lines



In [188]:
x = [1, 2, 3, 4, 5]
y = [8, 6, 5, 2, 3]
plot = figure()
plot.line(x, y, line_width=3)
output_file('line.html')
show(plot)

## Lines and markers

In [189]:
plot = figure()
plot.line(x, y, line_width=2)
plot.circle(x, y, fill_color='white', size=10)
output_file('line_markers.html')
show(plot)

## Patches
* Useful for showing geographic regions
* Data given as "list of lists"
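
In the "list of lists" layout, each inner list of `xs` pairs with the inner list of `ys` at the same position to describe one polygon, and any per-patch property such as the fill color needs one entry per polygon. The check below is a small sketch of that rule; `check_patches` is a hypothetical helper, not part of Bokeh.

```python
def check_patches(xs, ys, fill_colors):
    """Return the number of patches, or raise ValueError if the data is inconsistent."""
    if not (len(xs) == len(ys) == len(fill_colors)):
        raise ValueError('xs, ys and fill_colors need one entry per patch')
    for i, (px, py) in enumerate(zip(xs, ys)):
        # Every vertex needs both an x and a y coordinate
        if len(px) != len(py):
            raise ValueError('patch %d has mismatched x and y lengths' % i)
    return len(xs)


# Three polygons: a rectangle, a triangle, and a four-sided shape
xs = [[1, 1, 2, 2], [2, 2, 4], [2, 2, 3, 3]]
ys = [[2, 5, 5, 2], [3, 5, 5], [2, 3, 4, 2]]
colors = ['red', 'blue', 'green']

n_patches = check_patches(xs, ys, colors)
```

Running a check like this before calling `plot.patches()` can catch a stray extra color or a missing vertex early.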


In [190]:
xs = [ [1,1,2,2], [2,2,4], [2,2,3,3] ]
ys = [ [2,5,5,2], [3,5,5], [2,3,4,2] ]

plot = figure()
# One fill color per patch: three patches, three colors
plot.patches(xs, ys, fill_color=['red', 'blue', 'green'], line_color='white')
output_file('patches.html')

show(plot)

