# Commonly used visualizations in data analytics

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

In this notebook we present some of the most commonly used visualizations in data analytics.

## Line Charts

A line chart (or line graph) displays information as a series of data points, called 'markers', connected by straight line segments.
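
To make the definition concrete, here is a minimal sketch that plots a handful of points with visible markers joined by straight segments. The names `x_few` and `y_few` are illustrative and not used elsewhere in this notebook.

```python
import numpy as np
import matplotlib.pyplot as plt

# Only a few points, so the individual markers stay visible
x_few = np.linspace(0, 2 * np.pi, 12)
y_few = np.sin(x_few)

fig, ax = plt.subplots()
# marker='o' draws each data point; linestyle='-' joins neighbours with straight segments
line, = ax.plot(x_few, y_few, marker='o', linestyle='-')
```

With many closely spaced points (as in the cells that follow) the straight segments blend into what looks like a smooth curve.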

In [None]:
# Generating data
x = np.linspace(-np.pi, np.pi, 200)
sine, cosine = np.sin(x), np.cos(x)

In [None]:
fig, ax = plt.subplots()
ax.plot(x, sine)
ax.plot(x, cosine)

In [None]:
fig, ax = plt.subplots()
ax.plot(x, sine, color='red')
ax.plot(x, cosine, color='#110013')

### Linestyles
Line styles are used about as often as colors. Matplotlib provides a few predefined line styles, listed in the table below, and also supports custom dash patterns for more advanced use. [Here](http://matplotlib.org/1.3.0/examples/lines_bars_and_markers/line_demo_dash_control.html) is an example of a custom dash pattern.

linestyle          | description
-------------------|------------------------------
'-'                | solid
'--'               | dashed
'-.'               | dashdot
':'                | dotted
'None'             | draw nothing
' '                | draw nothing
''                 | draw nothing

In [None]:
fig, ax = plt.subplots()
ax.plot(x, sine, color='red', linestyle='--', linewidth=2.5)
ax.plot(x, cosine, color='#110013', linestyle=':', linewidth=2.5);
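
As a rough sketch of the custom dash patterns mentioned above, the `dashes` keyword and the tuple form of `linestyle` both take a sequence of alternating drawn and gap lengths. The specific lengths and the name `t` are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(-np.pi, np.pi, 200)

fig, ax = plt.subplots()
# dashes=[on, off, on, off]: a long dash, a gap, a short dash, a gap, repeated
custom, = ax.plot(t, np.sin(t), color='red', linewidth=2.5, dashes=[10, 5, 2, 5])
# The same idea through linestyle=(offset, (on, off, ...))
ax.plot(t, np.cos(t), color='#110013', linewidth=2.5, linestyle=(0, (1, 3)))
```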

### Other attributes
Almost every plot has many attributes that can be modified to make the lines and markers suit your needs. For many plotting functions, matplotlib cycles through a set of colors for each dataset you plot, but you are free to state explicitly which color each plot should use. For [`plt.plot()`](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot), you can combine the color, line style, and marker in a single format string such as `'r--'`. [`plt.scatter()`](http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.scatter) does not take a format string; it uses keyword arguments such as `c` and `marker` instead.


| Property               | Value Type |
|------------------------|------------|
| alpha                  | float |
| color or c             | any matplotlib color |
| drawstyle              | [ 'default' 'steps' 'steps-pre' 'steps-mid' 'steps-post' ] |
| linestyle or ls        | [ '-' '--' '-.' ':' 'None' ' ' '' ] and any drawstyle in combination with a linestyle, e.g. 'steps--'. |
| linewidth or lw        | float value in points |
| marker                 | [ 0 1 2 3 4 5 6 7 'o' 'd' 'D' 'h' 'H' '' 'None' ' ' `None` '8' 'p' ',' '+' 'x' '.' 's' '\*' '\_' '&#124;' '1' '2' '3' '4' 'v' '<' '>' '^' ] |
| markeredgecolor or mec | any matplotlib color |
| markeredgewidth or mew | float value in points |
| markerfacecolor or mfc | any matplotlib color |
| markersize or ms       | float |
| visible                | [`True` `False`] |

In [None]:
fig, ax = plt.subplots()
ax.plot(x, sine, color='red', linestyle='-', linewidth=10, alpha=0.25)
ax.plot(x, cosine, color='#110013', linestyle=':', linewidth=2.5)
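
The following sketch illustrates two points from this section: format strings passed to `plot`, and the automatic color cycle when no color is given. The variable names are illustrative only.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(-np.pi, np.pi, 200)

fig, ax = plt.subplots()
# 'r--' means red, dashed line
red_dashed, = ax.plot(t, np.sin(t), 'r--')
# 'k:o' means black, dotted line, circle markers (every 10th point to keep markers readable)
black_dotted, = ax.plot(t[::10], np.cos(t[::10]), 'k:o')
# No color given: matplotlib picks successive colors from its cycle
auto1, = ax.plot(t, 0.5 * np.sin(t))
auto2, = ax.plot(t, 0.5 * np.cos(t))
```

Format strings are compact for quick exploration; explicit keyword arguments such as `color=` and `linestyle=` tend to read better in code that other people will maintain.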

#### Go to exercise!

## Bar plots

In [None]:
# Top 10 countries in the Rio Olympics
countries = ['USA','GBR','CHN','RUS','GER','JPN','FRA','KOR','ITA','AUS']
gold = [46,27,26,19,17,12,10,9,8,8]
silver = [37,23,18,18,10,8,18,3,12,11]
bronze = [38,17,26,19,15,21,14,9,8,10]

In [None]:
np.arange(10)

In [None]:
fig, ax = plt.subplots()
ax.bar(np.arange(10), gold);

In [None]:
fig, ax = plt.subplots()
ax.bar(np.arange(10), gold)

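# Bars are centered on their x positions by default (matplotlib 2.0+),
# so place one tick at the center of each bar
ax.set_xticks(np.arange(10));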

In [None]:
fig, ax = plt.subplots()
ax.bar(np.arange(10), gold)

ax.set_xticks(np.arange(10))  # one tick at the center of each bar
ax.set_xticklabels(countries);

In [None]:
fig, ax = plt.subplots()
ax.bar(np.arange(10), gold, color="#FFDF00")

ax.set_xticks(np.arange(10))  # one tick at the center of each bar
ax.set_xticklabels(countries);

In [None]:
fig, ax = plt.subplots()
ax.bar(np.arange(10), gold, color="#FFDF00")

ax.set_xticks(np.arange(10))  # one tick at the center of each bar
ax.set_xticklabels(countries)

ax.set_title('Gold Medals at Rio', size=14)
ax.set_xlabel('Country')
ax.set_ylabel('Number of golds');

In [None]:
fig, ax = plt.subplots(figsize=(10,5.5))
ax.bar(np.arange(10), gold, color="#FFDF00", width=0.4, label='Gold')
ax.bar(np.arange(10)+0.4, silver, color="#C0C0C0", width=0.4, label='Silver')

# Gold bars are centered at x and silver bars at x + 0.4,
# so the middle of each pair is at x + 0.2
ax.set_xticks(np.arange(10) + 0.2)
ax.set_xticklabels(countries)

ax.set_title('Gold and Silver Medals at Rio', size=16)
ax.set_xlabel('Country', size=14)
ax.set_ylabel('Number of medals', size=14)
ax.legend(loc='upper right');

In [None]:
# In Python 3, zip returns a lazy iterator; wrap it in list() to inspect the tuples
list(zip(np.arange(10), gold, silver))

In [None]:
fig, ax = plt.subplots(figsize=(10,5.5))
ax.bar(np.arange(10), gold, color="#FFDF00", width=0.4, label='Gold')
ax.bar(np.arange(10)+0.4, silver, color="#C0C0C0", width=0.4, label='Silver')

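# The middle of each gold/silver pair is at x + 0.2
ax.set_xticks(np.arange(10) + 0.2)
ax.set_xticklabels(countries)

# 'pos' avoids shadowing the x array defined in the line chart section
for pos, g, s in zip(np.arange(10), gold, silver):
    ax.text(pos, g + 0.5, str(g), ha='center')        # annotating the golds
    ax.text(pos + 0.4, s + 0.5, str(s), ha='center')  # annotating the silvers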

ax.set_title('Gold and Silver Medals at Rio', size=16)
ax.set_xlabel('Country', size=14)
ax.set_ylabel('Number of medals', size=14)
ax.legend(loc='upper right');

#### Go to exercise!

## Histograms

In [None]:
iqs = np.random.normal(loc=100, scale=10, size=300)

In [None]:
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12,4))

ax[0].hist(iqs, bins=18)
ax[0].set_title('Frequency histogram')

# 'normed' was removed from hist in newer matplotlib releases; 'density' is its replacement
ax[1].hist(iqs, bins=18, density=True)
ax[1].set_title('Density histogram');
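
To show what separates the two panels, this sketch uses `np.histogram` to compute both versions on a sample and checks that the bar heights in a frequency histogram sum to the sample size, while the bar areas in a density histogram sum to 1. The seed and the names `sample`, `counts`, and `density` are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
sample = rng.normal(loc=100, scale=10, size=300)

counts, edges = np.histogram(sample, bins=18)
density, _ = np.histogram(sample, bins=18, density=True)
widths = np.diff(edges)  # width of each bin

# Frequency histogram: bar heights add up to the number of observations
total = counts.sum()
# Density histogram: bar areas (height * width) add up to 1
area = (density * widths).sum()
```

The density values are simply the counts divided by `sample.size * widths`, which makes histograms of samples with different sizes directly comparable.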

#### Go to exercise!

## Scatter plots

In [None]:
fig, ax = plt.subplots()
ax.scatter(gold, silver, marker='o')

ax.set_title('Gold vs. Silver Medals at Rio', size=16)
ax.set_xlabel('Gold medals', size=14)
ax.set_ylabel('Silver medals', size=14);

In [None]:
fig, ax = plt.subplots()
ax.scatter(gold, silver, marker='o', color='blue', label='Gold vs Silver')
ax.scatter(gold, bronze, marker='D', color='red', label='Gold vs Bronze')

ax.set_title('Gold vs. Silver and Bronze Medals at Rio', size=16)
ax.set_xlabel('Gold medals', size=14)
ax.set_ylabel('Silver and bronze medals', size=14)
ax.legend(loc='upper left');