## A simple scatter plot

In this example, you're going to make a scatter plot of female literacy vs fertility using data from the [European Environment Agency](https://www.eea.europa.eu/data-and-maps/figures/correlation-between-fertility-and-female-education). The dataset shows that countries with low female literacy tend to have high birth rates. The x-axis data has been loaded for you as `fertility` and the y-axis data as `female_literacy`.

Your job is to create a figure, assign x-axis and y-axis labels, and plot `female_literacy` vs `fertility` using the circle glyph.

After you have created the figure, in this exercise and the ones that follow, play around with it! Explore the tools available in the toolbar to the right of the plot, such as "Pan", "Box Zoom", and "Wheel Zoom". Click the question mark icon for more details on any of these tools.
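
The toolbar mentioned above can also be configured when the figure is created. The following is a minimal sketch, assuming the `tools` argument of `figure()` is given a comma-separated string of tool names; the particular selection of tools is only an illustration and is not part of the exercise.

```python
from bokeh.plotting import figure

# Request a specific set of toolbar tools by name when creating the figure.
p = figure(
    x_axis_label="fertility (children per woman)",
    y_axis_label="female_literacy (% population)",
    tools="pan,box_zoom,wheel_zoom,reset",
)

# List the tool types that ended up on the figure's toolbar.
tool_names = [type(tool).__name__ for tool in p.toolbar.tools]
print(tool_names)
```

Leaving out `tools` keeps Bokeh's default toolbar, which is what the exercise relies on.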

    # Import figure from bokeh.plotting
    from bokeh.plotting import ____

    # Import output_file and show from bokeh.io
    from bokeh.io import ____, ____

    # Create the figure: p
    p = ____(____='fertility (children per woman)', ____='female_literacy (% population)')

    # Add a circle glyph to the figure p
    p.circle(____, ____)

    # Call the output_file() function and specify the name of the file


    # Display the plot



In [1]:
import pandas as pd 
from bokeh.plotting import figure
from bokeh.io import output_file, show
import numpy as np
# bokeh.charts is only available in older Bokeh releases; only df_from_json is used below
from bokeh.charts.utils import df_from_json

In [2]:
data = pd.read_excel("../data/TREND01-5G-educ-fertility-bubbles.xls")
data.head()

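|   | Country | Continent | female literacy | fertility | population |
|---|---------|-----------|-----------------|-----------|------------|
| 0 | Chine | ASI | 90.5 | 1.769 | 1324655000.0 |
| 1 | Inde | ASI | 50.8 | 2.682 | 1139965000.0 |
| 2 | USA | NAM | 99.0 | 2.077 | 304060000.0 |
| 3 | Indonésie | ASI | 88.8 | 2.132 | 227345100.0 |
| 4 | Brésil | LAT | 90.2 | 1.827 | 191971500.0 |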


In [3]:
data.tail()

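|     | Country | Continent | female literacy | fertility | population |
|-----|---------|-----------|-----------------|-----------|------------|
| 177 | Antilles néerlandaises |  | 96.3 |  |  |
| 178 | Iles Caïmanes |  | 99.0 |  |  |
| 179 | Seychelles |  | 92.3 |  |  |
| 180 | Territoires autonomes palestiniens |  | 90.9 |  |  |
| 181 | WORLD | WORLD | 77.0 |  |  |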


In [4]:
p = figure(x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)')

In [5]:
data.shape

(182, 5)

In [6]:
data.dropna(subset=['fertility', 'population','female literacy'], inplace=True)

In [7]:
data.shape

(162, 5)
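
The drop from 182 to 162 rows comes from `dropna(subset=...)`, which removes every row that has a missing value in any of the listed columns. The following toy sketch illustrates the behaviour; the frame and its values are made up for the example and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Toy frame (made-up values) with a gap in a different column on each of three rows.
toy = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "female literacy": [90.5, 50.8, np.nan, 77.0],
    "fertility": [1.8, np.nan, 2.1, 2.7],
    "population": [1.0e6, 2.0e6, 3.0e6, np.nan],
})

# Keep only the rows where all three plotted columns are present.
cleaned = toy.dropna(subset=["fertility", "population", "female literacy"])
print(toy.shape, cleaned.shape)
```

Unlike the cell above, this version returns a new frame rather than using `inplace=True`, so the original stays available for comparison.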

In [8]:
df = df_from_json(data)

In [9]:
print(df)

None


In [10]:
data = data.fillna('')

In [11]:
# Extract the two plotted columns as NumPy arrays
fertility = data['fertility'].values
female_lit = data['female literacy'].values

In [12]:
p.circle(fertility,female_lit)

<bokeh.models.renderers.GlyphRenderer at 0x16c2af9a0b8>

In [13]:
output_file("fert_lit.html")

In [14]:
show(p)
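
The cells above can be condensed into one self-contained sketch. It uses only the five rows printed by `data.head()`, wraps the plotting steps in a hypothetical helper named `make_scatter`, and draws the markers with `p.scatter(..., marker="circle")`. That call is an assumed stand-in for `p.circle()`, chosen because it should behave the same way across a wider range of Bokeh versions.

```python
from bokeh.plotting import figure

# The five rows shown by data.head() above.
fertility = [1.769, 2.682, 2.077, 2.132, 1.827]
female_literacy = [90.5, 50.8, 99.0, 88.8, 90.2]


def make_scatter(x, y):
    """Return a figure with labelled axes and one circle-marker scatter."""
    p = figure(
        x_axis_label="fertility (children per woman)",
        y_axis_label="female_literacy (% population)",
    )
    # Assumption: scatter with marker="circle" stands in for p.circle().
    p.scatter(x, y, marker="circle")
    return p


p = make_scatter(fertility, female_literacy)
print(len(p.renderers))
```

To view the result, call `output_file("fert_lit.html")` and `show(p)` as in the cells above.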

## A scatter plot with different shapes

By calling multiple glyph functions on the same figure object, we can overlay multiple data sets in the same figure.

In this exercise, you will plot female literacy vs fertility for two different regions, Africa and Latin America. Each set of x and y data has been loaded separately for you as `fertility_africa`, `female_literacy_africa`, `fertility_latinamerica`, and `female_literacy_latinamerica`.

Your job is to plot the Latin America data with the `circle()` glyph, and the Africa data with the `x()` glyph.

`figure` has already been imported for you from `bokeh.plotting`.

#### Instructions
   - Create the figure `p` with the `figure()` function. It takes two parameters here: `x_axis_label` and `y_axis_label`.
   - Add a circle glyph to the figure `p` using `p.circle()`, where the inputs are the x and y data from Latin America: `fertility_latinamerica` and `female_literacy_latinamerica`.
   - Add an x glyph to the figure `p` using `p.x()`, where the inputs are the x and y data from Africa: `fertility_africa` and `female_literacy_africa`.
   - The code to specify the output file name and display the figure has been written for you, so after adding the x glyph, hit 'Submit Answer' to view the figure.



    # Create the figure: p
    p = ____(____='fertility', ____='female_literacy (% population)')

    # Add a circle glyph to the figure p


    # Add an x glyph to the figure p


    # Specify the name of the file
    output_file('fert_lit_separate.html')

    # Display the plot
    show(p)


In [15]:
data.Continent.unique()

array(['ASI', 'NAM', 'LAT', 'AF', 'EUR', 'OCE'], dtype=object)

In [28]:
latinamerica = data[data['Continent'] == 'LAT']
latinamerica.shape

(24, 5)

In [29]:
africa = data[data['Continent'] == 'AF']
africa.shape

(49, 5)
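
The two selections above use boolean masks on the `Continent` column. The following toy sketch shows the same pattern on a small made-up frame, together with a `groupby` alternative that splits all regions in one pass. The frame, its values, and the `by_region` name are assumptions for illustration only.

```python
import pandas as pd

# Toy frame (made-up values) using the same continent codes as the dataset.
toy = pd.DataFrame({
    "Country": ["P", "Q", "R", "S"],
    "Continent": ["LAT", "AF", "AF", "EUR"],
    "fertility": [2.1, 4.8, 5.2, 1.5],
})

# Boolean-mask selection, as in the cells above.
latinamerica = toy[toy["Continent"] == "LAT"]
africa = toy[toy["Continent"] == "AF"]

# Alternative: split once and look regions up by their code.
by_region = {code: frame for code, frame in toy.groupby("Continent")}
print(latinamerica.shape, africa.shape, sorted(by_region))
```

The mask form is simpler when only one or two regions are needed. The dictionary form avoids repeating the comparison for every region.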

In [23]:
# Extract the Latin America columns as NumPy arrays
lat_fertility = latinamerica['fertility'].values
lat_female_lit = latinamerica['female literacy'].values

In [24]:
# Extract the Africa columns as NumPy arrays
af_fertility = africa['fertility'].values
af_female_lit = africa['female literacy'].values

In [25]:
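# Start from a fresh figure so the glyphs from the first exercise are not carried over
p = figure(x_axis_label='fertility', y_axis_label='female_literacy (% population)')
p.circle(lat_fertility,lat_female_lit)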

<bokeh.models.renderers.GlyphRenderer at 0x16c31abc978>

In [26]:
p.x(af_fertility,af_female_lit)

<bokeh.models.renderers.GlyphRenderer at 0x16c31abc898>

In [27]:
output_file('fert_lit_separate.html')

show(p)
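
Overlaying two data sets on one figure, as described at the start of this section, can be sketched in a self-contained form. The sample values are made up and merely stand in for the two regions. `p.scatter(..., marker=...)` is an assumed stand-in for `p.circle()` and `p.x()`, chosen because it should work across a wider range of Bokeh versions.

```python
from bokeh.plotting import figure

# Made-up sample values standing in for the two regions.
fertility_latinamerica = [1.8, 2.3, 2.9]
female_literacy_latinamerica = [90.2, 93.0, 84.6]
fertility_africa = [4.9, 5.4, 6.1]
female_literacy_africa = [48.8, 60.3, 22.8]

# A single figure object receives both glyph calls.
p = figure(
    x_axis_label="fertility (children per woman)",
    y_axis_label="female_literacy (% population)",
)

# Each call adds one renderer on top of the existing ones.
p.scatter(fertility_latinamerica, female_literacy_latinamerica, marker="circle")
p.scatter(fertility_africa, female_literacy_africa, marker="x")

print(len(p.renderers))
```

Because each glyph call appends a renderer to the same figure, reusing a figure from an earlier exercise would keep its old markers as well.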