# A basic histogram

Create a simple histogram with the **Histogram()** function. Once again, we'll use the fertility data set, here to plot the distribution of female literacy across countries.

In [1]:
# Import Histogram, output_notebook, and show from bokeh.charts
from bokeh.charts import Histogram, output_notebook, show

# Import pandas as pd
import pandas as pd

# Import the fertility.csv data: data
data = pd.read_csv("fertility.csv", encoding='latin2')

In [2]:
data.head()

Unnamed: 0,Country,Continent,female literacy,fertility,population
0,Chine,ASI,90.5,1.769,1324655000
1,Inde,ASI,50.8,2.682,1139964932
2,USA,NAM,99.0,2.077,304060000
3,Indonesie,ASI,88.8,2.132,227345082
4,Bresil,LAT,90.2,1.827,191971506


In [3]:
# Make the Histogram: p
p = Histogram(data, 'female literacy', title='Female Literacy', bins=40)

# Set axis labels
p.xaxis.axis_label = 'Female Literacy (% population)'
p.yaxis.axis_label = 'Number of Countries'


# Send output to the notebook and display the plot
output_notebook()
show(p)
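
To make the effect of the `bins` argument concrete, here is a minimal pure-Python sketch of equal-width binning, the counting step that a histogram is built on. It uses only the five literacy values visible in the `data.head()` output; the helper name `bin_counts` is hypothetical, and the exact bin-edge rules used inside `Histogram()` may differ.

```python
def bin_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 if every value is identical.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # The maximum value would land on index `bins`, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Female literacy for the five rows shown by data.head()
literacy = [90.5, 50.8, 99.0, 88.8, 90.2]
counts = bin_counts(literacy, bins=5)
print(counts)  # [1, 0, 0, 1, 3]
```

Raising `bins` narrows each interval, which is why the 40-bin plot shows finer detail than a 10-bin plot of the same column would.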

# Generating multiple histograms at once

Let's make a separate histogram for each of the 6 continents, each with 10 bins.

To do this with Bokeh charts, we pass a column name to the **color** parameter of the **Histogram()** function. Here, the **'Continent'** column holds the continent abbreviation, which is used to group the female literacy values by continent.



In [4]:
# Import Histogram, output_notebook, and show from bokeh.charts
from bokeh.charts import Histogram, output_notebook, show

# Import pandas as pd
import pandas as pd

# Import the fertility.csv data: data
data = pd.read_csv("fertility.csv", encoding='latin2')

# Make a Histogram: p
p = Histogram(data, 'female literacy', title='Female Literacy',
              color='Continent', legend='top_left', bins=10)

# Set axis labels
p.xaxis.axis_label = 'Female Literacy (% population)'
p.yaxis.axis_label = 'Number of Countries'


# Send output to the notebook and display the plot
output_notebook()
show(p)
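
The grouping that the **color** parameter triggers can be pictured as two steps: split the values by the grouping column, then bin each group. The sketch below does this in plain Python for the five rows shown by `data.head()`. Sharing one set of bin edges across all groups is an assumption made here so the groups stay comparable; it is not confirmed to be what `Histogram()` does internally.

```python
from collections import defaultdict

# (continent, female literacy) pairs for the five rows shown by data.head()
rows = [("ASI", 90.5), ("ASI", 50.8), ("NAM", 99.0), ("ASI", 88.8), ("LAT", 90.2)]

# Step 1: collect the values of each group, keyed by the grouping column.
groups = defaultdict(list)
for continent, literacy in rows:
    groups[continent].append(literacy)

# Step 2: bin each group separately. Assumption: all groups share the same
# edges, computed over the full column, so the histograms are comparable.
bins = 10
lo = min(v for _, v in rows)
hi = max(v for _, v in rows)
width = (hi - lo) / bins

counts = {}
for continent, values in groups.items():
    per_bin = [0] * bins
    for v in values:
        # Clamp the overall maximum into the last bin.
        per_bin[min(int((v - lo) / width), bins - 1)] += 1
    counts[continent] = per_bin

for continent, per_bin in sorted(counts.items()):
    print(continent, per_bin)
```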

# A basic box plot

In this exercise, you'll make a box plot of female literacy per continent by passing **values='female literacy'** and **label='Continent'** to the **BoxPlot()** function. Note that the column name contains a space, not an underscore.



In [5]:
# Import BoxPlot, output_notebook, and show from bokeh.charts
from bokeh.charts import BoxPlot, output_notebook, show

# Import pandas as pd
import pandas as pd

# Import the fertility.csv data: data
data = pd.read_csv("fertility.csv", encoding='latin2')

# Make a box plot: p
p = BoxPlot(data, values='female literacy', label='Continent',
            title='Female Literacy (grouped by Continent)',
            legend='bottom_right')

# Set the y axis label
p.yaxis.axis_label = 'Female literacy (% population)'

# Send output to the notebook and display the plot
output_notebook()
show(p)
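
For readers who want to see what a box summarizes, the following sketch computes the quartiles and flags outliers with the standard library. It treats the five literacy values from `data.head()` as a single sample purely for illustration; in the plot, the same summary would be computed per continent. The 1.5 × IQR outlier rule is the common Tukey convention and is an assumption here, since the exact rule `BoxPlot()` applies is not stated in this notebook.

```python
import statistics

# Female literacy for the five rows shown by data.head()
literacy = [90.5, 50.8, 99.0, 88.8, 90.2]

# Quartiles: the box spans q1 to q3 and the line inside it marks the median.
q1, median, q3 = statistics.quantiles(literacy, n=4, method="inclusive")
iqr = q3 - q1

# Assumption: Tukey's 1.5 * IQR rule decides which points count as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = sorted(v for v in literacy if v < lower_fence or v > upper_fence)

print(q1, median, q3)
print(outliers)
```

With so few points the box is very narrow, so both the lowest and the highest value fall outside the fences.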


# Color different groups differently

Like the **Histogram()** function, the **BoxPlot()** function takes a **color** parameter, which we can use to color each continent's box separately.

In this exercise, you'll distinguish between the six continents by setting the **color** parameter of the **BoxPlot()** function to **'Continent'**.



In [6]:
# Make a box plot: p
p = BoxPlot(data, values='female literacy', label='Continent',
            color='Continent',
            title='Female Literacy (grouped by Continent)',
            legend='bottom_right')

# Set the y axis label
p.yaxis.axis_label = 'Female literacy (% population)'

# Send output to the notebook and display the plot
output_notebook()
show(p)
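
Coloring by a column amounts to giving every distinct value in that column its own color. The sketch below shows one simple way to build such a mapping; the palette is hypothetical and the order-of-first-appearance rule is an assumption, not a description of how `BoxPlot()` picks its colors.

```python
from itertools import cycle

# Hypothetical palette; any list of CSS color names or hex codes would do.
palette = ["steelblue", "darkorange", "seagreen", "crimson", "purple", "gray"]

# Continent codes for the five rows shown by data.head()
continents = ["ASI", "ASI", "NAM", "ASI", "LAT"]

# Give each distinct group the next palette color, in order of first appearance.
color_for = {}
colors = cycle(palette)
for continent in continents:
    if continent not in color_for:
        color_for[continent] = next(colors)

# Every row then looks up the color of its group.
row_colors = [color_for[c] for c in continents]
print(color_for)
```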

# A basic scatter plot


The high-level **Scatter()** function in bokeh.charts is similar to the marker plots from bokeh.plotting, but it is the better choice when you want to:
- Draw different markers automatically based on groups in the data.
- Choose colors automatically based on groups in the data.
- Work directly with a pandas DataFrame.

In this exercise, you'll make a simple scatter plot with population on the x axis and female literacy on the y axis.


In [7]:
# Import Scatter, output_notebook, and show from bokeh.charts
from bokeh.charts import Scatter, output_notebook, show

# Import pandas as pd
import pandas as pd

# Import HoverTool from bokeh.models
from bokeh.models import HoverTool


# Import the fertility.csv data: data
data = pd.read_csv("fertility.csv", encoding='latin2')

In [8]:
data.head()

Unnamed: 0,Country,Continent,female literacy,fertility,population
0,Chine,ASI,90.5,1.769,1324655000
1,Inde,ASI,50.8,2.682,1139964932
2,USA,NAM,99.0,2.077,304060000
3,Indonesie,ASI,88.8,2.132,227345082
4,Bresil,LAT,90.2,1.827,191971506


In [9]:
# Define tooltips that show the country name on hover
tooltips = [('Country', '@Country')]

# Make a scatter plot: p
p = Scatter(data, x='population', y='female literacy',
            title='Female Literacy vs Population', tooltips=tooltips)

# Set the x-axis label
p.xaxis.axis_label = 'Population'

# Set the y-axis label
p.yaxis.axis_label = 'Female Literacy'

# Send output to the notebook and display the plot
output_notebook()
show(p)
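
Each tooltip is a `(label, template)` pair, where `@Country` stands for the value of the `Country` column at the hovered point. In Bokeh's hover tooltips, a column name that contains spaces is written with braces, as in `@{female literacy}`. The sketch below imitates that substitution for a single row; `render_tooltip` is a hypothetical helper for illustration, not part of Bokeh, and the real rendering happens in the browser.

```python
import re


def render_tooltip(tooltips, row):
    """Fill each '@column' or '@{column name}' placeholder from one data row."""
    def lookup(match):
        # group(1) is the braced form, group(2) the bare form.
        column = match.group(1) or match.group(2)
        return str(row[column])

    pattern = re.compile(r"@\{([^}]+)\}|@(\w+)")
    return [f"{label}: {pattern.sub(lookup, template)}"
            for label, template in tooltips]


# First row shown by data.head()
row = {"Country": "Chine", "Continent": "ASI", "female literacy": 90.5}

tooltips = [("Country", "@Country"), ("Literacy", "@{female literacy}")]
print(render_tooltip(tooltips, row))  # ['Country: Chine', 'Literacy: 90.5']
```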