# Overview
This project follows [Siddharth Yadav's kernel](https://www.kaggle.com/thebrownviking20/intermediate-visualization-tutorial-using-plotly) to demonstrate the popular visualization library plotly on the Pokemon dataset. It covers simple plots as well as several less common visualization techniques.

# Why plotly
Plotly provides a wide range of plot types, and its figures are interactive by default (hover, zoom, pan), which makes it a strong choice for exploring data in a notebook.

# Pokemon dataset

This data set includes 800 Pokemon, including their number, name, first and second type, and basic stats: HP, Attack, Defense, Special Attack, Special Defense, and Speed. It has been of great use when teaching statistics to kids.

The data as described by Myles O'Neill is:

'#': ID for each pokemon

Name: Name of each pokemon

Type 1: Each pokemon has a type, this determines weakness/resistance to attacks

Type 2: Some pokemon are dual type and have 2

HP: hit points, or health, defines how much damage a pokemon can withstand before fainting

Attack: the base modifier for normal attacks (eg. Scratch, Punch)

Defense: the base damage resistance against normal attacks

SP Atk: special attack, the base modifier for special attacks (e.g. fire blast, bubble beam)

SP Def: the base damage resistance against special attacks

Speed: determines which pokemon attacks first each round

In [1]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from plotly import tools
import chart_studio.plotly as py
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.figure_factory as ff
from IPython.display import HTML, Image

# Import, modify dataset

Read the Pokemon dataset with pandas, then fill in the missing (NaN) values.

In [2]:
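# '__file__' is quoted on purpose: notebooks do not define __file__, so this resolves
# against the current working directory and then steps up to its parent folder
dirPath = os.path.dirname(os.path.dirname(os.path.realpath('__file__')))
pokemon = pd.read_csv(os.path.join(dirPath, "data", "pokemon.csv"))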
pokemon

Unnamed: 0,#,Name,Type 1,Type 2,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,80,82,83,100,100,80,1,False
3,4,Mega Venusaur,Grass,Poison,80,100,123,122,120,80,1,False
4,5,Charmander,Fire,,39,52,43,60,50,65,1,False
...,...,...,...,...,...,...,...,...,...,...,...,...
795,796,Diancie,Rock,Fairy,50,100,150,100,150,50,6,True
796,797,Mega Diancie,Rock,Fairy,50,160,110,160,110,110,6,True
797,798,Hoopa Confined,Psychic,Ghost,80,110,60,150,130,70,6,True
798,799,Hoopa Unbound,Psychic,Dark,80,160,60,170,130,80,6,True


In [30]:
dirPath

'/Users/bumjoonpark/Desktop/pokemon_project/Pokemon_project'

In [3]:
pokemon.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 800 entries, 0 to 799
Data columns (total 12 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   #           800 non-null    int64 
 1   Name        799 non-null    object
 2   Type 1      800 non-null    object
 3   Type 2      414 non-null    object
 4   HP          800 non-null    int64 
 5   Attack      800 non-null    int64 
 6   Defense     800 non-null    int64 
 7   Sp. Atk     800 non-null    int64 
 8   Sp. Def     800 non-null    int64 
 9   Speed       800 non-null    int64 
 10  Generation  800 non-null    int64 
 11  Legendary   800 non-null    bool  
dtypes: bool(1), int64(8), object(3)
memory usage: 69.7+ KB


In [4]:
# Almost half of the Type 2 attribute is empty but it's because many pokemon have only one type. Still we will fill this with 'Blank'
pokemon = pokemon.fillna(value={'Type 2':'Blank'})
pokemon.head()

Unnamed: 0,#,Name,Type 1,Type 2,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
0,1,Bulbasaur,Grass,Poison,45,49,49,65,65,45,1,False
1,2,Ivysaur,Grass,Poison,60,62,63,80,80,60,1,False
2,3,Venusaur,Grass,Poison,80,82,83,100,100,80,1,False
3,4,Mega Venusaur,Grass,Poison,80,100,123,122,120,80,1,False
4,5,Charmander,Fire,Blank,39,52,43,60,50,65,1,False


In [5]:
pokemon = pokemon.rename(index=str, columns={"#": "Number"})
pokemon = pokemon.fillna(value={'Name':'Primeape'})
# pokemon['Name'][62] = "Primeape" 
pokemon[60:65]

Unnamed: 0,Number,Name,Type 1,Type 2,HP,Attack,Defense,Sp. Atk,Sp. Def,Speed,Generation,Legendary
60,61,Golduck,Water,Blank,80,82,78,95,80,85,1,False
61,62,Mankey,Fighting,Blank,40,80,35,35,45,70,1,False
62,63,Primeape,Fighting,Blank,65,105,60,60,70,95,1,False
63,64,Growlithe,Fire,Blank,55,70,45,70,50,60,1,False
64,65,Arcanine,Fire,Blank,90,110,80,100,80,95,1,False


# Distplot

A distplot shows the univariate distribution of a set of observations. It combines a histogram with a kernel density estimate (KDE) curve and a rug plot.
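
To make these three layers concrete, here is a minimal standard-library sketch of the ingredients: a density-normalized histogram, a Gaussian kernel density estimate, and the raw values used for the rug. The sample values, `bin_size`, and `bandwidth` are made up for illustration, and the helper names are hypothetical. `ff.create_distplot` chooses its own KDE settings, so this illustrates the idea rather than its exact algorithm.

```python
import math

# Hypothetical sample standing in for a stat column such as HP
sample = [45, 60, 80, 80, 39, 58, 78, 78, 44, 59, 79, 79]
bin_size = 10

def histogram_density(values, bin_size):
    """Return {bin_start: density} so that the bar areas sum to 1."""
    counts = {}
    for v in values:
        start = (v // bin_size) * bin_size
        counts[start] = counts.get(start, 0) + 1
    n = len(values)
    return {start: c / (n * bin_size) for start, c in sorted(counts.items())}

def gaussian_kde(values, x, bandwidth):
    """Average of Gaussian bumps centred on each observation."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

density = histogram_density(sample, bin_size)
kde_at_60 = gaussian_kde(sample, 60, bandwidth=8.0)  # bandwidth is an arbitrary choice
rug = sorted(sample)  # the rug plot simply marks every raw value

print(density)
print(round(kde_at_60, 4))
```

Because both the histogram and the KDE are normalized to unit area, they can share one y-axis.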

### HP distplot

In [6]:
fig = ff.create_distplot([pokemon.HP],['HP'],bin_size=5)
iplot(fig, filename='HP Distplot')

### Distplot of all pokemon stats

In [7]:
hist_data = [pokemon['HP'],pokemon['Attack'],pokemon['Defense'],pokemon['Sp. Atk'],pokemon['Sp. Def'],pokemon['Speed']]
group_labels = list(pokemon.iloc[:,4:10].columns)

fig = ff.create_distplot(hist_data, group_labels, bin_size=5)
iplot(fig, filename='Distplot of all pokemon stats')

In [8]:
hist_data = [pokemon['Attack'],pokemon['Defense']]
group_labels = ['Attack','Defense']

fig2 = ff.create_distplot(hist_data, group_labels, bin_size=5)
iplot(fig2, filename='Distplot of attack and defense')

### Distplot of Generation
Generation only takes the integer values 1 to 6, and pokemon are spread fairly evenly across them.

In [9]:
# Generation only takes the values 1 to 6, so use one bin per generation
fig = ff.create_distplot([pokemon.Generation],['Generation'],bin_size=1)
iplot(fig, filename='Generation Distplot')

# Boxplots
A boxplot is a simple way of summarizing a distribution: a rectangle spans the first to the third quartile (the middle 50% of the data), and a line inside the box marks the median. Whiskers extend from the box toward the smallest and largest values that are not flagged as outliers.

**Definitions**
* Median (50th percentile, Q2) = middle value of the sorted data set; 50% of the data are less than the median
* 25th percentile = quartile 1 (Q1), the lower quartile
* 75th percentile = quartile 3 (Q3), the upper quartile
* Height of box = IQR = interquartile range = Q3 - Q1
* Whiskers = extend to the most extreme data points within 1.5 * IQR below Q1 and above Q3
* Outliers = points more than 1.5 * IQR below Q1 or above Q3 (the common convention)
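
The definitions above can be sketched with the standard library as follows. The values are hypothetical, and `method="inclusive"` is just one common way to interpolate quartiles. Plotly's default quartile method may give slightly different numbers, so treat this as an illustration of the definitions rather than a reproduction of plotly's output.

```python
import statistics

# Hypothetical stat values; 150 is deliberately extreme
values = [35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 150]

# method="inclusive" interpolates linearly between data points
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Fences sit 1.5 * IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points still inside the fences
inside = [v for v in values if lower_fence <= v <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(q1, median, q3, iqr)
print(whisker_low, whisker_high, outliers)
```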

**Elements**
* trace = box
* y = data we want to visualize with box plot
* marker = color

![title](/Users/bumjoonpark/Desktop/pokemon_project/Pokemon_project/data/boxplot.jpg)
[image source](http://www.ni.com/tutorial/3047/en/)

### Boxplots of all stats

In [10]:
trace0 = go.Box(y = pokemon['HP'], name = 'HP')
trace1 = go.Box(y = pokemon['Attack'], name = 'Attack')
trace2 = go.Box(y = pokemon['Defense'], name = 'Defense')
trace3 = go.Box(y = pokemon['Sp. Atk'], name = 'Sp. Atk')
trace4 = go.Box(y = pokemon['Sp. Def'], name = 'Sp. Def')
trace5 = go.Box(y = pokemon['Speed'], name = 'Speed')
data = [trace0, trace1, trace2, trace3, trace4, trace5]
iplot(data)

### Customized boxplots

* trace 0 : Add the mean.
* trace 1 : Add the mean and standard deviation.
* trace 2 : Draw all the points.
* trace 3 : Whiskers only.
* trace 4 : Highlight suspected outliers.
* trace 5 : Whiskers and outliers.

In [11]:
trace0 = go.Box(
    y= pokemon["HP"],
    boxmean = True,
    name = "HP(with Mean)"
)
trace1 = go.Box(
    y= pokemon["Attack"],
    boxmean = 'sd',
    name = "Attack(Mean and SD)"
)
trace2 = go.Box(
    y = pokemon["Defense"],
    jitter = 0.5,
    pointpos = -2,
    boxpoints = 'all',
    name = "Defense(All points)"
)
trace3 = go.Box(
    y = pokemon["Sp. Atk"],
    boxpoints = False,  # hide all sample points so only the box and whiskers are drawn
    name = "Sp. Atk(Only Whiskers)"
)
trace4 = go.Box(
    y = pokemon["Sp. Def"],
    boxpoints = 'suspectedoutliers',
    marker = dict(
        outliercolor = 'rgba(219, 64, 82, 0.6)',
        line = dict(
            outliercolor = 'rgba(219, 64, 82, 0.6)', 
            outlierwidth = 2
        )
    ),
    line = dict(
        color = 'rgb(8, 81, 156)'
    ),
    name = 'Sp. Def(Suspected Outliers)'
)
trace5 = go.Box(
    y = pokemon["Speed"],
    boxpoints = 'outliers',
    line = dict(
        color = 'rgb(107, 174, 214)'
    ),
    name = "Speed(Whiskers and Outliers)"
)

layout = go.Layout(
    title = "Boxplot with customized outliers"
)

data = [trace0, trace1, trace2, trace3, trace4, trace5]
fig = go.Figure(data = data, layout = layout)
iplot(fig, filename = "Customized Boxplot")

# Radar charts
A radar chart is a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point. The relative position and angle of the axes is typically uninformative. 

Source : [Radar chart(Wikipedia)](https://en.wikipedia.org/wiki/Radar_chart)
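
The radar cells in this section repeat the first stat at the end of both `r` and `theta`, which closes the outline of the polygon. A minimal sketch of that step is shown here. `closed_polygon` and `example_stats` are hypothetical names, and the stat values are made up rather than taken from the dataset.

```python
# Stat order used as the angular axis
STAT_NAMES = ['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed']

def closed_polygon(stats):
    """Return (r, theta) lists with the first point repeated at the end."""
    r = [stats[name] for name in STAT_NAMES]
    theta = list(STAT_NAMES)
    # Repeat the first vertex so the outline returns to where it started
    return r + r[:1], theta + theta[:1]

# Hypothetical stat values, not taken from the dataset
example_stats = {'HP': 70, 'Attack': 90, 'Defense': 60,
                 'Sp. Atk': 110, 'Sp. Def': 80, 'Speed': 100}
r, theta = closed_polygon(example_stats)

print(r)
print(theta)
```

The two returned lists can be passed as `r` and `theta` to `go.Scatterpolar`, in the same way as the hand-written lists in the cells of this section.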


### Visualizing stats of a single pokemon

In [12]:
poke1 = pokemon[pokemon['Name'] == 'Charizard']

data = [go.Scatterpolar(
    r = [poke1['HP'].values[0],poke1['Attack'].values[0],poke1['Defense'].values[0],
       poke1['Sp. Atk'].values[0], poke1['Sp. Def'].values[0],poke1['Speed'].values[0],
       poke1['HP'].values[0]],
    theta = ['HP','Attack','Defense','Sp. Atk','Sp. Def','Speed', 'HP'],
    fill = 'toself',
    name = poke1.Name.values[0]
)]

layout = go.Layout(
  polar = dict(
    radialaxis = dict(
      visible = True,
      range = [0, 250]
    )
  ),
  showlegend = True,
  title = "Stats of {}".format(poke1.Name.values[0])
)
fig = go.Figure(data=data, layout=layout)
iplot(fig, filename = "Single Pokemon stats")

### Comparing stats of two pokemon

In [13]:
def radar(poke):
    poke1 = pokemon[pokemon["Name"] == poke]
    data = [go.Scatterpolar(
        r = [poke1['HP'].values[0],poke1['Attack'].values[0],poke1['Defense'].values[0],
           poke1['Sp. Atk'].values[0], poke1['Sp. Def'].values[0],poke1['Speed'].values[0],
           poke1['HP'].values[0]],
        theta = ['HP','Attack','Defense','Sp. Atk','Sp. Def','Speed', 'HP'],
        fill = 'toself',
        name = poke1.Name.values[0]
    )]

    layout = go.Layout(
      polar = dict(
        radialaxis = dict(
          visible = True,
          range = [0, 250]
        )
      ),
      showlegend = True,
      title = "Stats of {}".format(poke1.Name.values[0])
    )
    fig = go.Figure(data=data, layout=layout)
    iplot(fig, filename = "Pokemon stats")
    
radar("Mega Mewtwo X")
radar("Garchomp")

In [14]:
# Define a method to compare two pokemons 
def compare2pokemon(pokemon1, pokemon2): 
    p1 = pokemon[pokemon.Name == pokemon1]
    p2 = pokemon[pokemon.Name == pokemon2]
    
    trace0 = go.Scatterpolar(
        r = [p1['HP'].values[0], p1.Attack.values[0], p1.Defense.values[0], p1['Sp. Atk'].values[0], p1['Sp. Def'].values[0], p1.Speed.values[0], p1.HP.values[0]], 
        theta = ['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'HP'],
        fill = 'toself',
        name = p1.Name.values[0]
    )
    trace1 = go.Scatterpolar(
        r = [p2['HP'].values[0], p2.Attack.values[0], p2.Defense.values[0], p2['Sp. Atk'].values[0], p2['Sp. Def'].values[0], p2.Speed.values[0], p2.HP.values[0]], 
        theta = ['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed', 'HP'],
        fill = 'toself',
        name = p2.Name.values[0]
    )
    layout = go.Layout(
        polar = dict(
            radialaxis = dict(
                visible = True,
                range = [0, 250]
            )
        ),
        showlegend = True,
        title = "{} vs {}".format(pokemon1, pokemon2)
    )
    data = [trace0, trace1]
    fig = go.Figure(data = data, layout = layout)
    iplot(fig, filename = "Comparison of two pokemon")

compare2pokemon("Mega Charizard X", "Garchomp")

# Scatterplot

A scatter plot (also called a scatterplot, scatter graph, scatter chart, scattergram, or scatter diagram) is a type of plot or mathematical diagram using Cartesian coordinates to display values for typically two variables for a set of data. If the points are coded (color/shape/size), one additional variable can be displayed. The data are displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis.

[source: wikipedia](https://en.wikipedia.org/wiki/Scatter_plot)

### 2D scatterplot 
Encode Speed as the marker color to show a third variable on the 2D plot.

In [15]:
trace0 = go.Scatter(
    x = pokemon["Defense"],
    y = pokemon["Attack"],
    mode = 'markers',
    marker = dict(
        size = 16,
        color = pokemon["Speed"], 
        colorscale = 'Electric',
        showscale = True
    ),
    text = pokemon["Name"]
)
data = [trace0]
layout = go.Layout(
    paper_bgcolor = 'rgba(0, 0, 0, 1)',
    plot_bgcolor = 'rgba(0, 0, 0, 1)',
    showlegend = False,
    font=dict(family='Courier New, monospace', size=10, color='#ffffff'),
    title="Scatter plot of Defense vs Attack with Speed as colorscale"
)
fig = go.Figure(data = data, layout = layout)
iplot(fig, filename = "Scatterplot")

### 3D scatterplot
Add an additional z-axis to make the plot 3D.
Then add colorscale with the HP for 4D.

In [16]:
trace0 = go.Scatter3d(
    x=pokemon["Speed"],
    y=pokemon["Attack"],
    z=pokemon["Defense"],
    mode='markers',
    marker=dict(
        size=4,
        color = pokemon["HP"], 
        colorscale = 'Electric',
        line=dict(
            color='rgba(217, 217, 217, 0.14)',
            width=0.5
        ),
        opacity=1,
        showscale = True
    ),
    text = pokemon["Name"]
)
data = [trace0]
layout = go.Layout(
    margin=dict(
        l=0,
        r=0,
        b=0,
        t=0
    ),
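    # 3D axes are configured under `scene`, not the top-level xaxis/yaxis
    scene=dict(
        xaxis=dict(title="Speed"),
        yaxis=dict(title="Attack"),
        zaxis=dict(title="Defense")
    ),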
    title = "Speed vs Attack vs Defense"
)
fig = go.Figure(data=data, layout=layout)
iplot(fig, filename='3d-scatter')

# Contour Plot
Contour plots (sometimes called level plots) show a three-dimensional surface on a two-dimensional plane. Two predictor variables, X and Y, are plotted on the horizontal and vertical axes, and a response variable Z is drawn as contours. These contours are sometimes called z-slices or iso-response values.

This type of graph is widely used in cartography, where contour lines on a topographic map connect points of equal elevation. Many other disciplines use contour graphs, including astronomy, meteorology, and physics. Contour lines commonly show altitude (such as the height of geographical features), but they can also be used to show density, brightness, or electric potential.

[source](https://www.statisticshowto.com/contour-plots/)
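
As a small illustration of the X, Y, Z layout described above, the sketch below builds a `z` grid with one row per `y` value and one column per `x` value, which is the arrangement `go.Contour` expects for a 2D `z`. The grid coordinates, the `response` surface, and the contour `level` are all made up for the example.

```python
import math

# Hypothetical grid coordinates
x = [-2, -1, 0, 1, 2]
y = [-1, 0, 1]

def response(xv, yv):
    """Illustrative surface: a single smooth bump centred on the origin."""
    return math.exp(-(xv ** 2 + yv ** 2) / 2)

# One row per y value, one column per x value
z = [[response(xv, yv) for xv in x] for yv in y]

# A contour level is a constant z value; cells at or above it lie inside that contour
level = 0.5
inside = [[value >= level for value in row] for row in z]

print(len(z), len(z[0]))
print(inside[1])
```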

### Contour plot for distribution of bug pokemon

In [17]:
trace_contour = go.Contour(
    x = ['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed'],
    z = pokemon[pokemon["Type 1"] == 'Bug'].iloc[:, 4:10].values, 
    colorscale = 'Jet',
)
data = [trace_contour]
layout = go.Layout(
    title = "Distribution of Bug pokemon(Contour chart)",
    width = 600,
    height = 800
)

fig = go.Figure(data = data, layout = layout)
iplot(fig, filename = 'bug-contour')

### Contour plot 
This plot depicts the density (and distribution) of Defense, Sp. Atk, Sp. Def, and Speed for each generation of pokemon, against their HP and Attack.

In [18]:
gen1 = go.Contour(
    x = pokemon[pokemon["Generation"] == 1].iloc[:, 4].values,
    y = pokemon[pokemon["Generation"] == 1].iloc[:, 5].values,
    z = pokemon[pokemon["Generation"] == 1].iloc[:, 6:10].values,
    name = "Generation 1",
    showscale = False,
)
gen2 = go.Contour(
    x = pokemon[pokemon["Generation"] == 2].iloc[:, 4].values,
    y = pokemon[pokemon["Generation"] == 2].iloc[:, 5].values,
    z = pokemon[pokemon["Generation"] == 2].iloc[:, 6:10].values,
    name = "Generation 2",
    showscale = False,
)
gen3 = go.Contour(
    x = pokemon[pokemon["Generation"] == 3].iloc[:, 4].values,
    y = pokemon[pokemon["Generation"] == 3].iloc[:, 5].values,
    z = pokemon[pokemon["Generation"] == 3].iloc[:, 6:10].values,
    name = "Generation 3",
    showscale = False,
)
gen4 = go.Contour(
    x = pokemon[pokemon["Generation"] == 4].iloc[:, 4].values,
    y = pokemon[pokemon["Generation"] == 4].iloc[:, 5].values,
    z = pokemon[pokemon["Generation"] == 4].iloc[:, 6:10].values,
    name = "Generation 4",
    showscale = False,
)
gen5 = go.Contour(
    x = pokemon[pokemon["Generation"] == 5].iloc[:, 4].values,
    y = pokemon[pokemon["Generation"] == 5].iloc[:, 5].values,
    z = pokemon[pokemon["Generation"] == 5].iloc[:, 6:10].values,
    name = "Generation 5",
    showscale = False,
)
gen6 = go.Contour(
    x = pokemon[pokemon["Generation"] == 6].iloc[:, 4].values,
    y = pokemon[pokemon["Generation"] == 6].iloc[:, 5].values,
    z = pokemon[pokemon["Generation"] == 6].iloc[:, 6:10].values,
    name = "Generation 6",
    showscale = False,
)
# plotly.tools.make_subplots and go.Margin are deprecated; use their current replacements
from plotly.subplots import make_subplots

fig = make_subplots(rows = 1, cols = 6, subplot_titles=('Generation 1', 'Generation 2', 'Generation 3','Generation 4','Generation 5','Generation 6'), shared_yaxes = True)
fig.append_trace(gen1, 1, 1)
fig.append_trace(gen2, 1, 2)
fig.append_trace(gen3, 1, 3)
fig.append_trace(gen4, 1, 4)
fig.append_trace(gen5, 1, 5)
fig.append_trace(gen6, 1, 6)

fig['layout'].update(height = 600,
                     width = 800,
                     title = 'Contour subplots for different generations',
                     paper_bgcolor='rgba(0,0,0,1)',
                     plot_bgcolor='rgba(0,0,0,1)',
                     font=dict(size=12, color='#ffffff'),
                     showlegend=True,
                     margin=go.layout.Margin(l=50, r=50, b=100, t=100, pad=4),
                     xaxis=dict(domain=[0, 0.1]),
                     xaxis2=dict(domain=[0.15, 0.30]),
                     xaxis3=dict(domain=[0.35, 0.45]),
                     xaxis4=dict(domain=[0.5, 0.6]),
                     xaxis5=dict(domain=[0.65, 0.75]),
                     xaxis6=dict(domain=[0.85, 1])
)
iplot(fig, filename = 'contour-subplots')

# Bubble plot
A bubble plot is a scatterplot where a third dimension is added: the value of an additional numeric variable is represented through the size of the dots.

You need 3 numerical variables as input: one is represented by the X axis, one by the Y axis, and one by the dot size.

[source](https://www.data-to-viz.com/graph/bubble.html)
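
When size encodes a variable, plotly's `sizeref` controls how data values map to marker pixels. The sketch below applies the rule of thumb from plotly's bubble-chart documentation for `sizemode='area'`: `sizeref = 2 * max(size) / (desired maximum diameter ** 2)`. The `sizes` values and `desired_max_diameter` are hypothetical, and `diameter_px` is an assumption about how area mode scales markers, included only to show that the largest bubble lands on the target diameter.

```python
import math

# Hypothetical size values (e.g. HP) and a target diameter in pixels
sizes = [39, 58, 78, 90, 115]
desired_max_diameter = 40

# Rule of thumb from the plotly bubble-chart documentation for sizemode='area'
sizeref = 2.0 * max(sizes) / (desired_max_diameter ** 2)

def diameter_px(size, sizeref):
    """Approximate rendered diameter when sizemode='area' (an assumption, see above)."""
    return 2.0 * math.sqrt(size / (2.0 * sizeref))

diameters = [round(diameter_px(s, sizeref), 1) for s in sizes]

print(round(sizeref, 5))
print(diameters)
```

Under the same rule, a denominator of 3000 corresponds to a largest bubble of roughly 55 pixels across, since the square root of 3000 is about 54.8.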

### Bubble plot
Attack is on the x-axis, Defense is on the y-axis, and HP sets the bubble size, for the Fire pokemon of each generation.

In [19]:
sizeref = 2.*max(pokemon['HP'])/(3000)

trace0 = go.Scatter(
    x = pokemon['Attack'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 1],
    y = pokemon['Defense'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 1],
    mode = 'markers',
    name = 'Generation 1',
    text = pokemon['Name'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 1],
    marker = dict(
        symbol = 'circle',
        sizemode = 'area',
        size = pokemon['HP'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 1],
        sizeref = sizeref,
        line = dict(width = 2),
    )
)

trace1 = go.Scatter(
    x = pokemon['Attack'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 2],
    y = pokemon['Defense'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 2],
    mode = 'markers',
    name = 'Generation 2',
    text = pokemon['Name'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 2],
    marker = dict(
        symbol = 'circle',
        sizemode = 'area',
        size = pokemon['HP'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 2],
        sizeref = sizeref,
        line = dict(width = 2),
    )
)

trace2 = go.Scatter(
    x = pokemon['Attack'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 3],
    y = pokemon['Defense'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 3],
    mode = 'markers',
    name = 'Generation 3',
    text = pokemon['Name'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 3],
    marker = dict(
        symbol = 'circle',
        sizemode = 'area',
        size = pokemon['HP'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 3],
        sizeref = sizeref,
        line = dict(width = 2),
    )
)

trace3 = go.Scatter(
    x = pokemon['Attack'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 4],
    y = pokemon['Defense'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 4],
    mode = 'markers',
    name = 'Generation 4',
    text = pokemon['Name'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 4],
    marker = dict(
        symbol = 'circle',
        sizemode = 'area',
        size = pokemon['HP'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 4],
        sizeref = sizeref,
        line = dict(width = 2),
    )
)

trace4 = go.Scatter(
    x = pokemon['Attack'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 5],
    y = pokemon['Defense'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 5],
    mode = 'markers',
    name = 'Generation 5',
    text = pokemon['Name'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 5],
    marker = dict(
        symbol = 'circle',
        sizemode = 'area',
        size = pokemon['HP'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 5],
        sizeref = sizeref,
        line = dict(width = 2),
    )
)

trace5 = go.Scatter(
    x = pokemon['Attack'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 6],
    y = pokemon['Defense'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 6],
    mode = 'markers',
    name = 'Generation 6',
    text = pokemon['Name'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 6],
    marker = dict(
        symbol = 'circle',
        sizemode = 'area',
        size = pokemon['HP'][pokemon['Type 1'] == 'Fire'][pokemon['Generation'] == 6],
        sizeref = sizeref,
        line = dict(width = 2),
    )
)

data = [trace0, trace1, trace2, trace3, trace4, trace5]
layout = go.Layout(
    title = 'Attack vs Defense of Fire pokemon over generations',
    xaxis = dict(
        title = 'Attack',
        gridcolor = 'rgb(255, 255, 255)',
        range = [0, 200]
    ),
    yaxis = dict(
        title = 'Defense',
        gridcolor = 'rgb(243, 243, 243)',
        range = [0, 200]
    ),
    paper_bgcolor = 'rgb(243, 243, 243)',
    plot_bgcolor = 'rgb(243, 243, 243)',
)

fig = go.Figure(data = data, layout = layout)
iplot(fig, filename = "bubble_attack_defense.png")


# Treemap
A treemap is a visualization that displays hierarchically organized data as a set of nested rectangles, parent elements being tiled with their child elements. The sizes and colors of rectangles are proportional to the values of the data points they represent.

[source](https://docs.anychart.com/Basic_Charts/Treemap_Chart)

**Treemaps are often used in R kernels due to their interactivity. Python offers squarify to do the same. We can combine squarify and pyplot to plot treemaps.**
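
Before any rectangles are laid out, the values have to be rescaled so that their total matches the area of the canvas. The sketch below shows that rescaling step in plain Python. `normalize_to_area` is a hypothetical stand-in written for this example, not squarify's own code, and the counts are made up.

```python
def normalize_to_area(values, width, height):
    """Scale values so that they sum to width * height."""
    total = float(sum(values))
    area = float(width * height)
    return [v * area / total for v in values]

# Hypothetical category counts, sorted in descending order
counts = [50, 30, 15, 5]
areas = normalize_to_area(counts, width=50, height=50)

# Each category's share of the canvas equals its share of the total count
shares = [a / (50 * 50) for a in areas]

print(areas)
print(shares)
```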

In [20]:
list(pokemon['Type 1'])

['Grass',
 'Grass',
 'Grass',
 'Grass',
 'Fire',
 'Fire',
 'Fire',
 'Fire',
 'Fire',
 'Water',
 'Water',
 'Water',
 'Water',
 'Bug',
 'Bug',
 'Bug',
 'Bug',
 'Bug',
 'Bug',
 'Bug',
 'Normal',
 'Normal',
 'Normal',
 'Normal',
 'Normal',
 'Normal',
 'Normal',
 'Normal',
 'Poison',
 'Poison',
 'Electric',
 'Electric',
 'Ground',
 'Ground',
 'Poison',
 'Poison',
 'Poison',
 'Poison',
 'Poison',
 'Poison',
 'Fairy',
 'Fairy',
 'Fire',
 'Fire',
 'Normal',
 'Normal',
 'Poison',
 'Poison',
 'Grass',
 'Grass',
 'Grass',
 'Bug',
 'Bug',
 'Bug',
 'Bug',
 'Ground',
 'Ground',
 'Normal',
 'Normal',
 'Water',
 'Water',
 'Fighting',
 'Fighting',
 'Fire',
 'Fire',
 'Water',
 'Water',
 'Water',
 'Psychic',
 'Psychic',
 'Psychic',
 'Psychic',
 'Fighting',
 'Fighting',
 'Fighting',
 'Grass',
 'Grass',
 'Grass',
 'Water',
 'Water',
 'Rock',
 'Rock',
 'Rock',
 'Fire',
 'Fire',
 'Water',
 'Water',
 'Water',
 'Electric',
 'Electric',
 'Normal',
 'Normal',
 'Normal',
 'Water',
 'Water',
 'Poison',
 'Poison',


In [21]:
import squarify

# these values define the coordinate system for the returned rectangles
# the values will range from x to x + width and y to y + height
x = 0
y = 0
# Area = 2500
width = 50
height = 50
# squarify lays out rectangles best when values are sorted in descending order (and positive);
# value_counts() returns the counts already sorted, with the type labels kept in step
type_counts = pokemon['Type 1'].value_counts()
type_list = list(type_counts.index)
values = type_counts.tolist()

normed = squarify.normalize_sizes(values, width, height)
rects = squarify.squarify(normed, x, y, width, height)

# Choose colors from http://colorbrewer2.org/ under "Export"
color_brewer = ['#2D3142','#4F5D75','#BFC0C0','#F2D7EE','#EF8354','#839788','#EEE0CB','#BAA898','#BFD7EA','#685044','#E9AFA3','#99B2DD','#F9DEC9','#3A405A','#494949','#FF5D73','#7C7A7A','#CF5C36','#EFC88B']
shapes = []
annotations = []
counter = 0
# count : 0 ~ n - 1

for r in rects:
    shapes.append(
        dict(
            type = 'rect',
            x0 = r['x'],
            y0 = r['y'],
            x1 = r['x'] + r['dx'],
            y1 = r['y'] + r['dy'],
            line = dict(width = 2),
            fillcolor = color_brewer[counter]
        )
    )
    annotations.append(
        dict(
            x = r['x'] + (r['dx']/2),
            y = r['y'] + (r['dy']/2),
            text = "{} - {}".format(type_list[counter], values[counter]), 
            showarrow = False
        )
    )
    counter = counter + 1
    counter = counter % len(color_brewer)

trace0 = go.Scatter(
    x = [r['x'] + (r['dx']/2) for r in rects],
    y = [r['y'] + (r['dy']/2) for r in rects],
    text = [str(v) for v in values],
    mode = 'text'
)

data = [trace0]

layout = dict(
    height=700, 
    width=700,
    xaxis=dict(showgrid=False,zeroline=False),
    yaxis=dict(showgrid=False,zeroline=False),
    shapes=shapes,
    annotations=annotations,
    hovermode='closest',
    font=dict(color="#FFFFFF")
)

# With hovertext
figure = dict(data=data, layout=layout)
iplot(figure, filename='squarify-treemap')

# Bullet chart

Bullet charts are a variation of a bar chart developed by Stephen Few as a replacement for gauges and meters.

Bullet charts have the following advantages over gauges:

* Space saving – they require less real estate and can be oriented horizontally or vertically, depending on the space available.
* They can display multiple measures.
* They are easier to read and more informative.


The bullet graph consists of five primary components:

* Performance measure – the actual value of the metric.
* Comparative measures – one or two comparative or target values against which the performance is compared.
* Qualitative scale – bands that indicate the state of the metric, i.e. good, satisfactory, or bad.
* Quantitative scale – a linear axis for reading off the value of the metric.
* Text label – a label for the metric.

Information can be shown in multiple formats using Bullet charts.
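
As a rough guide to how these components map onto the dictionaries passed to `ff.create_bullet` in this notebook, the sketch below builds one entry. `bullet_row` is a hypothetical helper, and the value and maximum are made up. The 50%, 75%, and 55% fractions follow the ones used in the notebook's own bullet chart function.

```python
def bullet_row(label, value, stat_max):
    """Build one bullet-chart entry in the layout used by this notebook."""
    return {
        "label": label,                                         # text label
        "sublabel": value,                                      # shown under the label
        "range": [stat_max * 0.5, stat_max * 0.75, stat_max],   # qualitative bands
        "performance": [value, value],                          # performance measure bar
        "point": [stat_max * 0.55],                             # comparative marker
    }

# Hypothetical value and maximum, for illustration only
row = bullet_row("Speed", 90, 180)
print(row)
```

Building each entry through one helper keeps the six stats consistent and avoids repeating the same expression for every column.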

![title](~/Desktop/pokemondata/bullet1.png)
[source](http://visualbi.com/blogs/business-intelligence/dashboards/bullet-charts-use-cases/)

In [22]:
# Function for bullet chart
def checkpokemonperformance(name):
    x = pokemon[pokemon["Name"] == name]
    data = (
      {"label": "HP", "sublabel": x["HP"].values[0],
       "range": [max(pokemon["HP"])*0.5, max(pokemon["HP"])*0.75, max(pokemon["HP"])], "performance": [x["HP"].values[0], x["HP"].values[0]], "point": [max(pokemon['HP'])*0.55]},
      {"label": "Attack", "sublabel": x["Attack"].values[0],
       "range": [max(pokemon["Attack"])*0.5, int(max(pokemon["Attack"])*0.75), max(pokemon["Attack"])], "performance": [x["Attack"].values[0],x["Attack"].values[0]], "point": [max(pokemon["Attack"])*0.55]},
      {"label": "Defense", "sublabel": x["Defense"].values[0],
       "range": [max(pokemon["Defense"])*0.5, int(max(pokemon["Defense"])*0.75), max(pokemon["Defense"])], "performance": [x["Defense"].values[0],x["Defense"].values[0]], "point": [max(pokemon["Defense"])*0.55]},
      {"label": "Sp. Atk", "sublabel": x["Sp. Atk"].values[0],
       "range": [max(pokemon["Sp. Atk"])*0.5, int(max(pokemon["Sp. Atk"])*0.75), max(pokemon["Sp. Atk"])], "performance": [x["Sp. Atk"].values[0],x["Sp. Atk"].values[0]], "point": [max(pokemon["Sp. Atk"])*0.55]},
      {"label": "Sp. Def", "sublabel": x["Sp. Def"].values[0],
       "range": [max(pokemon["Sp. Def"])*0.5, int(max(pokemon["Sp. Def"])*0.75), max(pokemon["Sp. Def"])], "performance": [x["Sp. Def"].values[0],x["Sp. Def"].values[0]], "point": [max(pokemon["Sp. Def"])*0.55]},
      {"label": "Speed", "sublabel": x["Speed"].values[0],
       "range": [max(pokemon["Speed"])*0.5, int(max(pokemon["Speed"])*0.75), max(pokemon["Speed"])], "performance": [x["Speed"].values[0],x["Speed"].values[0]], "point": [max(pokemon["Speed"])*0.55]}
    )
    
    fig = ff.create_bullet(
        data, titles='label', subtitles='sublabel', markers='point',
        measures='performance', ranges='range', orientation='v', width=800, height=800
    )
    iplot(fig, filename='Bullet chart')

In [23]:
checkpokemonperformance("Mewtwo")

In [24]:
checkpokemonperformance("Regigigas")

In [26]:
checkpokemonperformance("Kyogre")

In [29]:
checkpokemonperformance("Rayquaza")

# Scatterplot matrix
A scatter plot matrix is a grid (or matrix) of scatter plots used to visualize bivariate relationships between combinations of variables. Each scatter plot in the matrix visualizes the relationship between a pair of variables, allowing many relationships to be explored in one chart.

[source](https://pro.arcgis.com/en/pro-app/help/analysis/geoprocessing/charts/scatter-plot-matrix.htm)
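
The grid structure can be sketched by enumerating every pair of variables. In the example below, the assignment of columns to the x-axis and rows to the y-axis is an assumed convention for illustration. The point is that the diagonal panels pair a variable with itself, which is why the `diag` option replaces them with a univariate plot such as a box plot or histogram.

```python
from itertools import product

# Column names as they appear in the dataset
columns = ['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed']

# Panel (row, col) plots columns[col] on x against columns[row] on y (assumed convention)
panels = {(r, c): (columns[c], columns[r])
          for r, c in product(range(len(columns)), repeat=2)}

diagonal = [key for key in panels if key[0] == key[1]]
off_diagonal = [key for key in panels if key[0] != key[1]]

print(len(panels), len(diagonal), len(off_diagonal))
print(panels[(0, 1)])
```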

### Scatterplot matrix of attributes with boxplots

In [48]:
fig = ff.create_scatterplotmatrix(pokemon.iloc[:,4:11], index='Generation', diag='box', size=2, height=800, width=800)
iplot(fig, filename ='Scatterplotmatrix')

In [46]:
fig = ff.create_scatterplotmatrix(pokemon.iloc[:,[2,4,5,6,7,8,9,10]], index='Type 1', diag='histogram', size=2, height=800, width=800)
iplot(fig, filename ='Scatterplotmatrix')

# Violin plot

A violin plot is a method of plotting numeric data. It is similar to a box plot, with a rotated kernel density plot added on each side.

A violin plot is more informative than a plain box plot. In fact while a box plot only shows summary statistics such as mean/median and interquartile ranges, the violin plot shows the full distribution of the data. The difference is particularly useful when the data distribution is multimodal (more than one peak). In this case a violin plot clearly shows the presence of different peaks, their position and relative amplitude. This information could not be represented with a simple box plot which only reports summary statistics. The inner part of a violin plot usually shows the mean (or median) and the interquartile range. In other cases, when the number of samples is not too high, the inner part can show all sample points (with a dot or a line for each sample).

[source](https://en.wikipedia.org/wiki/Violin_plot)
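
The multimodal case can be illustrated with a small made-up sample. In the sketch below, the quartiles describe a box centred on 50, yet no observation lies anywhere near that value. The binned counts reveal the two clusters that a density outline would show. The sample, the bin width, and the distance threshold are all hypothetical.

```python
import statistics
from collections import Counter

# Hypothetical bimodal sample: one cluster near 20, another near 80
sample = [18, 19, 20, 20, 21, 22, 78, 79, 80, 80, 81, 82]

q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")

# Count observations in bins of width 20 to expose the shape
bins = Counter((v // 20) * 20 for v in sample)

# Observations within 10 units of the median
near_median = [v for v in sample if abs(v - median) <= 10]

print(q1, median, q3)
print(sorted(bins.items()))
print(near_median)
```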

### Violin plot of all stats

In [49]:
data = []
for i in range(4,10):
    trace = {
            "type": 'violin',
            # x0 labels the violin's position; a scalar "x" is not a valid data array
            "x0": list(pokemon.columns)[i],
            "y": pokemon.iloc[:,i],
            "name": list(pokemon.columns)[i],
            "box": {
                "visible": True
            },
            "meanline": {
                "visible": True
            }
        }
    data.append(trace)
        
fig = {
    "data": data,
    "layout" : {
        "title": "Violin plot of all stats",
        "yaxis": {
            "zeroline": False,
        }
    }
}

iplot(fig, filename='violin', validate = False)