# "Hello, World of Data!": Your ABC Blocks in Data Handling, Analysis, and Visualization  
## Section 5: Visualization

The results of analyses are often best understood and communicated through visualizations.  Many tools can generate charts and graphs (e.g., spreadsheet software, Tableau, Gephi).  In Python alone, visualization packages with different strengths and weaknesses abound.  Choose the tool that most effectively delivers the story in your data.

This quick follow-along demo will explore some Python libraries that represent solutions to different visualization needs.  This is by no means a crash course on how to create the most effective visualizations--that is a whole course by itself.  Rather, this is meant as an introduction to help you get started on choosing the right tools for the visuals that you want.

Let's get started!

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

Notice that we import `matplotlib`.  Although we won't call `matplotlib` directly, the plotting functions in pandas and seaborn are both built on top of it.
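
Because pandas plotting is built on matplotlib, a pandas plot call returns a matplotlib `Axes` object that can be refined with ordinary matplotlib methods.  Here is a minimal sketch of that idea; the `toy` DataFrame and its values are made up for illustration and are not the workshop data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values for illustration only; not taken from datasets/Cities.csv.
toy = pd.DataFrame({"latitude": [10.0, 35.0, 60.0],
                    "temperature": [27.0, 15.0, 4.0]})

# pandas draws the scatter plot with matplotlib and returns the Axes it drew on.
ax = toy.plot.scatter(x="latitude", y="temperature")

# Any matplotlib Axes method can then adjust the figure.
ax.set_title("Temperature vs. latitude (toy data)")
ax.set_xlabel("Latitude (degrees)")
```

The same pattern should apply to the plots later in this section: keep the returned `Axes` if you want to tweak titles, labels, or limits afterwards.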

In [None]:
cities = pd.read_csv("datasets/Cities.csv")
countries = pd.read_csv("datasets/Countries.csv")

In [None]:
cities.head()

In [None]:
countries.head()

### Simple and fast static visualization: pandas 
Basic plotting can be done directly within pandas.  Refer to the pandas documentation for a full description of the types of plots you can make using pandas.
https://pandas.pydata.org/pandas-docs/stable/visualization.html  

Here is a simple scatter plot.

In [None]:
cities.plot.scatter(x="latitude", y="temperature")

In [None]:
# Build a pandas.Series of populations indexed by country name for the pie chart
ser_pie = pd.Series(list(countries.population), 
                    index=list(countries.country))
ser_pie.head()

In [None]:
ser_pie.plot(kind="pie", figsize=(8,8))

### More customized static visualization: seaborn  
Direct plotting using pandas is very convenient.  However, some graphs need several customizations to be effective, and a plotting library that makes customization easy is an advantage in that scenario.  Python has several such libraries.  Here is an entertaining review of these libraries.  
https://dsaber.com/2016/10/02/a-dramatic-tour-through-pythons-data-visualization-landscape-including-ggplot-and-altair/

In this demo, we will use seaborn.  For a full API description of seaborn, some tutorials, and a ton of inspiration, go to seaborn.pydata.org.

In [None]:
import seaborn as sns
# sns.set_palette(sns.color_palette("RdBu_r"))
sns.lmplot(x="latitude", y="temperature", data=cities)

Here, a scatter plot and its regression line are drawn in the same panel.  All of it is executed in a single line of code!

Let's look at other types of plots that seaborn makes extremely easy to do.

In [None]:
sns.jointplot(x="longitude", y="latitude", data=cities, kind="kde")

In [None]:
sns.violinplot(x="EU", y="population", hue="coastline", data=countries, 
               cut=0, split=True, palette="Set1", inner="stick")

Seaborn has an extensive list of plotting and styling functions that make plotting life easy.  The API is worth checking out.
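
As a small taste of those styling functions, the sketch below sets a global style and color palette before drawing a plot.  The `toy` DataFrame is made up for illustration, and the particular style and palette names are just one possible choice.

```python
import pandas as pd
import seaborn as sns

# Made-up values for illustration only.
toy = pd.DataFrame({"group": ["a", "a", "b", "b"],
                    "value": [1.0, 2.0, 3.0, 5.0]})

sns.set_style("whitegrid")              # global look: white background with grid lines
palette = sns.color_palette("Set1", 2)  # two colors from the "Set1" palette
sns.set_palette(palette)                # make them the default color cycle

# Axes-level seaborn functions return the matplotlib Axes they drew on.
ax = sns.boxplot(x="group", y="value", data=toy)
ax.set_title("Toy box plot")
```

Style and palette settings made this way are global, so they should also affect plots drawn later in the same session.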

### Interactive Plotting with Plotly
We begin by importing the plotly libraries. 

<b>Important!</b> In order to plot with Plotly, we highly encourage you to [set up your own account in plotly](https://plot.ly/accounts/login/?action=login). Afterwards, please change the default username and API key below to your own username and API key. You can [find your API key here](https://plot.ly/settings/api).

Note that each account can create a limited number of charts daily. To prevent the default account (currently set to ibtingzon3) from reaching its daily limit, it is advisable that you use your own API keys. 

In [None]:
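import plotly
# Note: this is the API of plotly versions before 4.0.  In plotly 4 and later, the online
# features live in the separate chart_studio package (chart_studio.tools, chart_studio.plotly).
plotly.tools.set_credentials_file(username='ibtingzon3', api_key='J6darIF7bKasACTNzeBQ')
import plotly.plotly as py
import plotly.graph_objs as go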

In [None]:
# Create a trace
trace = go.Scatter(
    x = cities['latitude'],
    y = cities['temperature'],
    text = cities['city'],
    mode = 'markers'
)

data = [trace]

# Create layout
layout = go.Layout(
    title = "Temperature vs Latitude Scatterplot",
    xaxis = dict(title="Latitude"),
    yaxis = dict(title="Temperature"),
    width = 800,
    height = 500,
    hovermode = "closest"
)

fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='scatterplot-basic')

In [None]:
eu_countries = countries[countries.EU == 'yes']
eu_countries_sorted = eu_countries.sort_values('population', ascending=True)

trace = go.Bar(
    x = eu_countries_sorted['country'],
    y = eu_countries_sorted['population']
)

data = [trace]
layout = go.Layout(
    title = "Population of Countries in the EU",
    xaxis = dict(title="Countries"),
    yaxis = dict(title="Population (millions)"),
    width = 800,
    height = 500
)

fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='bar-basic')

In [None]:
codes = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2014_world_gdp_with_codes.csv')
countries_codes = countries.merge(codes, left_on='country', right_on='COUNTRY', how='outer')

data = [ dict(
        type = 'choropleth',
        locations = countries_codes['CODE'],
        z = countries_codes['population'],
        text = countries_codes['country'],
        autocolorscale = True,
        colorbar = dict(
            title = 'Population (millions)'),
        ) ]

layout = dict(
    title = 'Population of European Countries',
    geo = dict(
        showframe = False,
        projection = dict(
            type = 'mercator'
        )
    )
)

fig = dict(data=data, layout=layout )
py.iplot(fig, filename='choropleth')
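
The choropleth above depends on matching country names between two tables, so a name spelled differently in either table silently ends up without a code.  One way to check for this, sketched here with made-up frames (`left`, `right`, and their contents are illustrative only), is the `indicator` option of `merge`:

```python
import pandas as pd

# Made-up frames for illustration; the values are not from the workshop files.
left = pd.DataFrame({"country": ["France", "Czech Republic"],
                     "population": [1.0, 2.0]})
right = pd.DataFrame({"COUNTRY": ["France", "Czechia"],
                      "CODE": ["FRA", "CZE"]})

# indicator=True adds a "_merge" column saying which table each row came from.
merged = left.merge(right, left_on="country", right_on="COUNTRY",
                    how="outer", indicator=True)

# Rows found only in the left table have no code and would not appear on the map.
unmatched = merged.loc[merged["_merge"] == "left_only", "country"].tolist()
print(unmatched)
```

Running the same check on `countries` and `codes` would list any country names that need to be reconciled by hand.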

### <font color="green">Your Turn: World Cup data</font>  
Draw any type of plot not shown above, using `datasets/Players.csv` or `datasets/Teams.csv`.  We recommend consulting the API documentation of pandas and/or seaborn.