# Plotting Data Lab

### Learning Objectives

* Understand the components of a point in a graph: an $x$ value and a $y$ value
* Understand how to plot a point on a graph from the point's $x$ and $y$ values
* Get a sense of how to use a graphing library, like Plotly, to answer questions about our data
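
To make the first two objectives concrete, here is a minimal sketch of points as $(x, y)$ pairs, using three cities and populations from the data below. The names `points`, `point_xs`, and `point_ys` are illustrative only and are not used elsewhere in the lab.

```python
# Each point is an (x, y) pair: x is the label along the horizontal axis,
# y is the height along the vertical axis.
points = [('Solta', 1700), ('Greenville', 84554), ('Buenos Aires', 13591863)]

# zip(*points) regroups the pairs into all the x values and all the y values.
xs, ys = zip(*points)
point_xs = list(xs)
point_ys = list(ys)

print(point_xs)
print(point_ys)
```

A graphing library generally wants the data in this second shape: one list of $x$ values and one list of $y$ values, where the values at the same index belong to the same point.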

### Working again with our travel data

Let's again get our travel data from our Excel spreadsheet.

In [2]:
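# xlrd lets pandas read the spreadsheet; note that xlrd 2.0+ only reads .xls,
# so newer setups need openpyxl installed to read .xlsx files
!pip install xlrd
import pandas
file_name = './cities.xlsx'
travel_df = pandas.read_excel(file_name)
# convert the DataFrame into a list of dictionaries, one per row
cities = travel_df.to_dict('records')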

Collecting xlrd
  Downloading xlrd-1.1.0-py2.py3-none-any.whl (108kB)
Installing collected packages: xlrd
Successfully installed xlrd-1.1.0


In [3]:
cities

[{'Area': 59, 'City': 'Solta', 'Country': 'Croatia', 'Population': 1700},
 {'Area': 68, 'City': 'Greenville', 'Country': 'USA', 'Population': 84554},
 {'Area': 4758,
  'City': 'Buenos Aires',
  'Country': 'Argentina',
  'Population': 13591863},
 {'Area': 3750,
  'City': 'Los Cabos',
  'Country': 'Mexico',
  'Population': 287651},
 {'Area': 33,
  'City': 'Walla Walla Valley',
  'Country': 'USA',
  'Population': 32237},
 {'Area': 200, 'City': 'Marakesh', 'Country': 'Morocco', 'Population': 928850},
 {'Area': 491,
  'City': 'Albuquerque',
  'Country': 'New Mexico',
  'Population': 559277},
 {'Area': 8300,
  'City': 'Archipelago Sea',
  'Country': 'Finland',
  'Population': 60000},
 {'Area': 672,
  'City': 'Iguazu Falls',
  'Country': 'Argentina',
  'Population': 0},
 {'Area': 27, 'City': 'Salina Island', 'Country': 'Italy', 'Population': 4000},
 {'Area': 2731571, 'City': 'Toronto', 'Country': 'Canada', 'Population': 630},
 {'Area': 3194,
  'City': 'Pyeongchang',
  'Country': 'South Korea'

### Visualizing our data with graphs

As we can see, in our list of cities, each city has a population number.  Our first task will be to display the populations of our first three cities in a bar chart.

First we load the Plotly library into our notebook and initialize its offline mode.

In [4]:
import plotly

plotly.offline.init_notebook_mode(connected=True)
# use offline mode to avoid initial registration

Now the next step is to build a trace.  As we know, a trace is a dictionary with a key of `x` and a key of `y`.  We will set up our trace to look like the following: `trace_first_three_pops = {'x': x_values, 'y': y_values}`.

First define `x_values` so that it is a list of the names of the first three cities.  Use what we learned about accessing information from lists and dictionaries to assign `x_values` equal to the first three city names.

In [12]:
x_values = [cities[0]['City'], cities[1]['City'], cities[2]['City']]
x_values

['Solta', 'Greenville', 'Buenos Aires']

Now use list and dictionary accessors to set `y_values` equal to the first three populations.

In [10]:
y_values = [cities[0]['Population'], cities[1]['Population'], cities[2]['Population']]
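
Indexing each city by hand works for three cities but gets tedious for more. As an alternative sketch, a slice plus a list comprehension builds the same lists. The `sample_cities` list here is a stand-in copied from the first three records printed above, so that the example runs on its own; in the notebook you would slice `cities` directly.

```python
# Stand-in for the first three records of `cities` (copied from the output above).
sample_cities = [
    {'Area': 59, 'City': 'Solta', 'Country': 'Croatia', 'Population': 1700},
    {'Area': 68, 'City': 'Greenville', 'Country': 'USA', 'Population': 84554},
    {'Area': 4758, 'City': 'Buenos Aires', 'Country': 'Argentina',
     'Population': 13591863},
]

# Slice off the first three dictionaries, then pull one key out of each.
first_three = sample_cities[:3]
city_names = [city['City'] for city in first_three]
populations = [city['Population'] for city in first_three]

print(city_names)
print(populations)
```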

Now let's plot our data.

In [13]:
trace_first_three_pops = {'x': x_values, 'y': y_values}


plotly.offline.iplot([trace_first_three_pops])

Note that by default, Plotly treats a trace with no `'type'` key as a scatter trace and draws it as a line.  Let's make our trace a bar trace by setting the key of `'type'` equal to `'bar'`.  We can continue to use the lists of `x_values` and `y_values` that we defined above in our new trace.  We can also label each bar with its city's name by setting the key of `'text'` equal to a list of the city names.  Assign a list of our first three cities to the key of `'text'`.

In [24]:
bar_trace_first_three_pops = {'type': 'bar', 'text': x_values, 'x': x_values, 'y': y_values}

In [36]:
bar_trace_first_three_pops['type'] # 'bar'

'bar'

In [26]:
plotly.offline.iplot([bar_trace_first_three_pops])
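
Because a trace is an ordinary dictionary, the pattern above can be wrapped in a small function. The sketch below is one way to do that; `build_trace` and `sample_cities` are hypothetical names introduced for illustration and are not part of Plotly or of this lab's data loading.

```python
def build_trace(records, x_key, y_key, trace_type='bar'):
    """Build a trace dictionary from a list of dictionaries (hypothetical helper)."""
    x_values = [record[x_key] for record in records]
    y_values = [record[y_key] for record in records]
    # Reuse the x values as the text label for each point or bar.
    return {'type': trace_type, 'x': x_values, 'y': y_values, 'text': x_values}


# Stand-in for the first two records of `cities` (copied from the output above).
sample_cities = [
    {'Area': 59, 'City': 'Solta', 'Country': 'Croatia', 'Population': 1700},
    {'Area': 68, 'City': 'Greenville', 'Country': 'USA', 'Population': 84554},
]

pops_trace = build_trace(sample_cities, 'City', 'Population')
print(pops_trace)
```

In the notebook, the resulting dictionary can be passed to `plotly.offline.iplot` inside a list, exactly like the hand-built trace.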

Ok, now let's plot two different traces side by side.  Create another trace called `bar_trace_first_three_areas` that is like our `bar_trace_first_three_pops`, except that its `y` values are a list of areas.  We will plot it next to `bar_trace_first_three_pops` in the plot below.

In [34]:
areas = [cities[0]['Area'], cities[1]['Area'], cities[2]['Area']]
print(areas)
# a bar trace like bar_trace_first_three_pops, but with areas as the y values
bar_trace_first_three_areas = {'type': 'bar', 'x': x_values, 'y': areas, 'text': x_values}

[59, 68, 4758]


In [37]:
# plot the two bar traces together
plotly.offline.iplot([bar_trace_first_three_pops, bar_trace_first_three_areas])
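
When two traces share a plot, it helps to name them so the legend can tell them apart. The sketch below builds a full figure dictionary with a `'data'` list and a `'layout'` dictionary. The `'name'` key and `'barmode': 'group'` are standard Plotly attributes; the variable names are illustrative, and the values are copied from the outputs above so that the block runs without the spreadsheet.

```python
names = ['Solta', 'Greenville', 'Buenos Aires']

# 'name' sets the legend entry for each trace.
pops_trace = {'type': 'bar', 'name': 'Population', 'x': names,
              'y': [1700, 84554, 13591863]}
areas_trace = {'type': 'bar', 'name': 'Area', 'x': names,
               'y': [59, 68, 4758]}

# A figure is a dictionary holding the traces plus layout options.
# 'barmode': 'group' places bars for the same city next to each other.
figure = {
    'data': [pops_trace, areas_trace],
    'layout': {'barmode': 'group'},
}

# In the notebook, display it with: plotly.offline.iplot(figure)
print(figure['layout'])
```

Keep in mind that population and area are on very different scales here, so the area bars will look tiny next to the population bars.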

### Summary

In this section, we saw how to use data visualizations to better understand our data.  The process has three steps.  First, we import Plotly and initialize its offline mode:


    import plotly

    plotly.offline.init_notebook_mode(connected=True)

Then we define a trace, which is a Python dictionary.

    trace = {'x': [], 'y': [], 'text': [], 'type': 'bar'}
    
Finally, we display our trace by passing it, inside a list, to the following function:

    plotly.offline.iplot([trace])
    
Easy peasy, quick and easy!