<table>
<tr>
<td><img src="https://i.imgur.com/BqJgyzB.png" width="350px"/></td>
<td><img src="https://i.imgur.com/ttYzMwD.png" width="350px"/></td>
<td><img src="https://i.imgur.com/WLmzj41.png" width="350px"/></td>
<td><img src="https://i.imgur.com/LjRTbCn.png" width="350px"/></td>
</tr>
<tr>
<td style="font-weight:bold; font-size:16px;">Scatter Plot</td>
<td style="font-weight:bold; font-size:16px;">Choropleth</td>
<td style="font-weight:bold; font-size:16px;">Heatmap</td>
<td style="font-weight:bold; font-size:16px;">Surface Plot</td>
</tr>
<tr>
<td>go.Scatter()</td>
<td>go.Choropleth()</td>
<td>go.Heatmap()</td>
<td>go.Surface()</td>
</tr>
<!--
<tr>
<td>Good for interval and some nominal categorical data.</td>
<td>Good for interval and some nominal categorical data.</td>
<td>Good for nominal and ordinal categorical data.</td>
<td>Good for ordinal categorical and interval data.</td>
</tr>
-->
</table>

# Introduction to plotly

So far in this tutorial we have been using `seaborn` and `pandas`, two mature libraries whose plotting tools are built on `matplotlib`. All three focus on building "static" visualizations: visualizations that have no moving parts. In other words, all of the plots we've built thus far could appear in a dead-tree journal article.

The web unlocks a lot of possibilities when it comes to interactivity and animations. There are a number of plotting libraries available which try to provide these features.

In this section we will examine `plotly`, an open-source plotting library and one of the most popular of them.

In [1]:
import pandas as pd
import numpy as np
reviews = pd.read_csv("data/winemag-data-130k-v2.csv.zip", index_col=0)
reviews.head(3)

,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm


`plotly` provides both online and offline modes. The latter injects the `plotly` source code directly into the notebook; the former does not. I recommend using `plotly` in offline mode the vast majority of the time.

The following cell imports the offline entry points and initializes notebook mode:

In [2]:
from plotly.offline import init_notebook_mode, iplot, plot
import plotly
import plotly.graph_objs as go
init_notebook_mode(connected=True)
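
The two entry points imported above differ mainly in where the figure ends up: `iplot` draws into the notebook's output cell, while `plot` writes a standalone HTML file. The sketch below illustrates this under a few assumptions: the figure is written as a plain dictionary (the same `data` plus `layout` structure used later in this notebook), and `demo_fig`, `render`, and `demo.html` are hypothetical names chosen for illustration.

```python
# A figure is a list of traces ("data") plus an optional "layout".
demo_fig = {
    "data": [{"type": "scatter", "x": [1, 2, 3], "y": [4, 1, 9], "mode": "markers"}],
    "layout": {"title": "demo"},
}


def render(fig, inline=True, filename="demo.html"):
    """Draw `fig` in the notebook, or write it to a standalone HTML file."""
    # Imported lazily so that defining this helper does not require plotly.
    from plotly.offline import iplot, plot

    if inline:
        # Assumes init_notebook_mode() has already been called in this notebook.
        iplot(fig)
        return None
    # auto_open=False stops plotly from opening a browser tab.
    return plot(fig, filename=filename, auto_open=False)
```

In a notebook, `render(demo_fig)` should draw the chart in the output cell, and `render(demo_fig, inline=False)` should return the location of the written file.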

Now use `plotly` with `winemag-data-130k-v2.csv.zip` to create the following plots (your notebook must contain one of each):

- Scatter Plot
- Choropleth Plot
- Heatmap
- Surface Plot

## Scatter Plot

In [3]:

trace = go.Scatter(
    x=reviews['points'],
    y=reviews['price'],
    mode='markers'
)
data = [trace]
layout = go.Layout(title='POINTS VS. PRICE', xaxis=dict(title='points'), yaxis=dict(title='price'))

# Write the figure to a standalone HTML file; use iplot instead to embed it in the notebook.
plotly.offline.plot({'data': data, 'layout': layout}, filename='scatter.html')

'scatter.html'
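
As the file name suggests, the dataset holds on the order of 130k reviews, and plotting every row can make the HTML file heavy and slow to interact with. One common workaround, sketched here under assumptions, is to drop the rows that have no price and plot a random sample. The helper name `sample_for_scatter`, the sample size of 5000, and the synthetic `fake_reviews` frame are all illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd


def sample_for_scatter(df, n=5000, seed=0):
    """Drop rows without a price, then take at most n random rows."""
    priced = df.dropna(subset=["price"])
    return priced.sample(n=min(n, len(priced)), random_state=seed)


# Synthetic stand-in for `reviews` so the sketch runs on its own.
rng = np.random.default_rng(0)
fake_reviews = pd.DataFrame({
    "points": rng.integers(80, 101, size=20_000),
    "price": rng.choice([np.nan, 10.0, 25.0, 60.0], size=20_000),
})

subset = sample_for_scatter(fake_reviews, n=5000)

# With the real data, the trace would then be built from the sample:
# trace = go.Scatter(x=subset["points"], y=subset["price"], mode="markers")
```

A fixed `seed` keeps the sample, and therefore the plot, reproducible between runs.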

## Choropleth Plot

In [4]:
avg_prices = list(reviews.groupby('country')['price'].mean())

In [5]:
country_names = list(reviews.groupby('country')['price'].mean().index)

In [6]:
trace = go.Choropleth(
            locations=country_names,
            locationmode='country names',
            colorscale='Portland',
            text=country_names,
            z=avg_prices,
            colorbar=dict(title="World Countries and their average wine prices")
)
data = [trace]

plotly.offline.plot(data, filename='Choropleth.html')



'Choropleth.html'
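
The country names and average prices can also come from a single grouped Series, so the `locations` and `z` arguments cannot drift out of order. The sketch below shows this on a small made-up frame; `toy` and its values are invented for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    "country": ["US", "US", "Italy", "Italy", "France", None],
    "price": [10.0, 30.0, 20.0, None, 50.0, 5.0],
    "title": ["a", "b", "c", "d", "e", "f"],
})

# Select the numeric column before aggregating, so text columns are never averaged.
# Rows with a missing country are dropped by groupby; missing prices are skipped by mean.
mean_price = toy.groupby("country")["price"].mean()

locations = mean_price.index.tolist()   # country names, sorted alphabetically
z = mean_price.tolist()                 # average price in the same order
```

Both lists are read from the same `mean_price` object, so entry `i` of `locations` always matches entry `i` of `z`.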

## Heatmap

In [7]:
# Keep only the top-scoring wines (98 to 100 points), count the titles per
# country and score, and pivot the counts into a country x points table for the heatmap.

count = reviews[reviews['points'] > 97].groupby(['country', 'points'])['title'].count().reset_index()
countries = count['country'].unique()
points = count['points'].unique()
o = count.pivot(index='country', columns='points', values='title').fillna(0)
o

points,98,99,100
country,,,
Australia,5.0,2.0,1.0
Austria,1.0,0.0,0.0
France,21.0,8.0,8.0
Germany,1.0,0.0,0.0
Italy,12.0,9.0,4.0
Portugal,3.0,2.0,2.0
Spain,1.0,0.0,0.0
US,33.0,12.0,4.0


In [8]:
trace = go.Heatmap(
            y=points,
            x=countries,
            z=np.array(o).T
)

data = [trace]
layout = go.Layout(title='POINTS vs HIGHEST COUNTRIES', xaxis=dict(title='COUNTRIES'), yaxis=dict(title='POINTS'))

plotly.offline.plot({'data': data, 'layout': layout}, filename='Heatmap.html')


'Heatmap.html'
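
A heatmap's axis labels can be read straight from the pivoted table rather than from separate `unique()` calls, which keeps the labels and the `z` matrix aligned by construction. This is a minimal sketch on invented data; `toy` and `counts` are hypothetical names, and the orientation (countries on x, points on y) mirrors the cell above.

```python
import pandas as pd

toy = pd.DataFrame({
    "country": ["US", "US", "US", "France", "France", "Italy", "Spain"],
    "points": [98, 98, 100, 99, 98, 100, 98],
    "title": ["a", "b", "c", "d", "e", "f", "g"],
})

# Rows: country, columns: points; combinations that never occur become 0.
counts = (
    toy.groupby(["country", "points"])["title"]
       .count()
       .unstack(fill_value=0)
)

x = counts.index.tolist()     # countries, one per heatmap column
y = counts.columns.tolist()   # point scores, one per heatmap row
z = counts.values.T           # z[i][j] is the count for y[i] and x[j]
```

Because `x`, `y`, and `z` all derive from `counts`, reordering or filtering the table changes all three consistently.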

## Surface Plot

In [18]:
count = reviews.groupby(['points', 'price'])['country'].count().reset_index()
price = np.sort(count['price'].unique())
points = np.sort(count['points'].unique())
o = count.pivot(index='points', columns='price', values='country').fillna(0)
o

price,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0,13.0,...,1100.0,1125.0,1200.0,1300.0,1500.0,1900.0,2000.0,2013.0,2500.0,3300.0
points,,,,,,,,,,,,,,,,,,,,,
80,0.0,2.0,3.0,10.0,23.0,20.0,39.0,20.0,40.0,30.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
81,0.0,4.0,2.0,20.0,31.0,47.0,80.0,44.0,69.0,34.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
82,1.0,3.0,12.0,26.0,63.0,73.0,173.0,84.0,128.0,96.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
83,1.0,14.0,18.0,70.0,109.0,138.0,289.0,150.0,236.0,198.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
84,5.0,10.0,26.0,94.0,183.0,202.0,546.0,313.0,523.0,457.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
85,2.0,8.0,32.0,88.0,196.0,295.0,784.0,398.0,659.0,548.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
86,2.0,2.0,11.0,68.0,141.0,269.0,652.0,422.0,749.0,671.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
87,0.0,3.0,9.0,41.0,99.0,180.0,491.0,355.0,782.0,662.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
88,0.0,0.0,5.0,12.0,34.0,78.0,256.0,162.0,447.0,477.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
89,0.0,0.0,0.0,3.0,9.0,17.0,77.0,66.0,164.0,195.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [19]:
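trace = go.Surface(
            x=price,     # columns of o (390 distinct prices)
            y=points,    # rows of o (21 point scores)
            z=o.values   # shape (len(y), len(x)) = (21, 390)
)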
data = [trace]

plotly.offline.plot(data, filename='Surface.html')



'Surface.html'

In [11]:
points


array([ 80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,  91,  92,
        93,  94,  95,  96,  97,  98,  99, 100], dtype=int64)

In [12]:
o.values.shape

(21, 390)
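
The shape printed above matters because, for a two-dimensional `z`, `go.Surface` (like `go.Heatmap`) reads rows as positions along `y` and columns as positions along `x`. A small check like the one below, run before plotting, catches swapped axes. The helper `check_grid` and the stand-in arrays are hypothetical; only the sizes 21 and 390 are taken from the output above.

```python
import numpy as np


def check_grid(x, y, z):
    """Return z as an array, raising if its shape is not (len(y), len(x))."""
    z = np.asarray(z)
    expected = (len(y), len(x))
    if z.shape != expected:
        raise ValueError(
            f"z has shape {z.shape}, expected {expected} (len(y), len(x))"
        )
    return z


points_axis = np.arange(80, 101)          # 21 point scores, 80 through 100
price_axis = np.linspace(4, 3300, 390)    # 390 stand-in price values
grid = np.zeros((21, 390))                # same shape as o.values

# Passes: rows follow points (y), columns follow price (x).
checked = check_grid(x=price_axis, y=points_axis, z=grid)
```

Calling `check_grid(x=points_axis, y=price_axis, z=grid)` raises a `ValueError`, which is the signal that the axes need to be swapped or `z` transposed.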