# Tutorial 02: Data Visualization

This notebook contains Python examples for data visualization.

Note: The notebook (.ipynb or .html) can be downloaded from Google Classroom.

Data visualization gives us a clear idea of what the information means by placing it in visual context through maps or graphs. Visual form is more natural for the human mind to comprehend, which makes it easier to identify trends, patterns, and outliers within large data sets.

To execute the code, click on the corresponding cell and press SHIFT+ENTER.

Original notebook by <a href="https://www.sunilghimire.com.np" target = '_blank'>Sunil Ghimire</a>, Herald College Kathmandu, 2021. <br>

💬 **Stay Connected**

[<img align="left" alt="Sunil | Website" width="22px" src="https://www.freepnglogos.com/uploads/logo-website-png/logo-website-website-logo-png-transparent-background-background-15.png" style = "margin-right:5px;" />](https://sunilghimire.com.np)

[<img align="left" alt="Sunil Linkedin" width="22px" src="https://www.freepnglogos.com/uploads/linkedin-basic-round-social-logo-png-13.png" style = "margin-right:5px;" />](https://www.linkedin.com/in/ghimiresunil/)

[<img align="left" alt="Sunil Github" width="22px" src="https://image.flaticon.com/icons/png/512/2111/2111425.png" style = "margin-right:5px;" target = '_blank' />](https://github.com/sunil-gh)


## Plotly and Cufflinks 

Plotly is a library that allows you to create interactive plots that you can use in dashboards or websites (you can save them as HTML files or static images). Cufflinks connects Plotly with pandas so that you can call plots directly off of a DataFrame.

## Installation

In order for this all to work, you'll need to install plotly, plus cufflinks to call plots directly off of a pandas DataFrame. This tutorial installs both libraries with <b>pip</b>. Run the following in a notebook cell, or drop the leading <code>!</code> to run the same commands at your command line/terminal:

<code> !pip install plotly </code> <br>
<code> !pip install cufflinks </code>

<b> NOTE: pip must install into the same Python installation that runs your notebook. If you have more than one installation of Python on your computer, the packages may end up in the wrong one and the imports below will fail. </b>
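If you are unsure which Python your notebook is using, a quick check like the sketch below can help. It uses only the standard library; the helper name `installed_version` is ours, not part of any package.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# The interpreter that runs this cell; pip must install into the same one.
print(sys.executable)

for name in ("plotly", "cufflinks"):
    print(name, installed_version(name) or "not installed")
```

If either package shows up as not installed even though pip reported success, pip most likely installed it into a different Python than the one printed above.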

## Imports and Set-up 

In [1]:
import pandas as pd
import numpy as np
import cufflinks as cf
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

from plotly import __version__
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

print(__version__) # requires version >= 1.9.0

4.14.3


## For Notebooks

In [2]:
init_notebook_mode(connected=True)

## For Offline Use

In [3]:
cf.go_offline()

## Random Data

In [4]:
df = pd.DataFrame(np.random.randn(100,4),columns='A B C D'.split())
df.head()

|   | A | B | C | D |
|---|---|---|---|---|
| 0 | 0.870348 | 2.399264 | -1.061792 | -0.903242 |
| 1 | -2.659676 | 1.545015 | 0.412722 | -1.814039 |
| 2 | 0.702158 | 1.583214 | -0.217561 | 0.313427 |
| 3 | 0.802098 | -0.222987 | -0.489724 | -2.459307 |
| 4 | -1.258352 | 0.510112 | -0.800927 | 0.823925 |
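Because `np.random.randn` is not seeded here, the numbers you get will differ from the ones shown and will change on every run. If you want repeatable data, one option is a seeded generator, sketched below. The seed value 42 and the name `df_seeded` are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# A seeded generator returns the same "random" numbers on every run.
rng = np.random.default_rng(seed=42)  # 42 is an arbitrary choice
df_seeded = pd.DataFrame(rng.standard_normal((100, 4)), columns=list("ABCD"))

print(df_seeded.shape)  # (100, 4)
```

The frame has the same shape and column names as `df`, so it can stand in for `df` in any of the plotting cells.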


In [5]:
df2 = pd.DataFrame({'Category':['A','B','C'],'Values':[32,43,50]})
df2.head()

|   | Category | Values |
|---|---|---|
| 0 | A | 32 |
| 1 | B | 43 |
| 2 | C | 50 |


In [6]:
df.iplot()

## Using Cufflinks and iplot()

The `kind` argument of `iplot()` selects the type of plot. The available kinds include:

1. scatter
2. bar
3. box
4. spread
5. ratio
6. heatmap
7. surface
8. histogram
9. bubble

## Scatter Plot

In [7]:
df.iplot(kind='scatter',x='A',y='B',size=11,mode='markers') # mode='markers' draws points only, without connecting lines

## Bar Plot

In [8]:
df2.iplot(kind='bar',x='Category',y='Values')

## Count of Non-Null Values per Column

In [9]:
df.count().iplot(kind='bar')
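`df.count()` returns the number of non-null entries in each column, so for the random `df` above every bar has height 100. The sketch below uses a small made-up frame with missing values (the name `demo` is ours) to show what is being counted.

```python
import numpy as np
import pandas as pd

# Small illustrative frame with two missing values in column 'B'.
demo = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [np.nan, 5.0, np.nan]})

counts = demo.count()  # non-null entries per column
print(counts)  # column A has 3 values, column B has 1
```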

## Box Plots

In [10]:
df.iplot(kind='box')
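A box plot summarizes each column by its quartiles and flags unusually distant points. The sketch below computes those numbers with pandas for a small made-up series, using the common 1.5 × IQR rule for outliers. Plotly computes its quartiles with its own method, so the exact box edges in the chart may differ slightly from these values.

```python
import pandas as pd

values = pd.Series([1, 2, 3, 4, 5, 20])  # 20 is a deliberate outlier

# Quartiles: the box spans q1 to q3, with a line at the median.
q1, median, q3 = values.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Points beyond 1.5 * IQR from the quartiles are conventionally drawn as outliers.
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(outliers.tolist())  # [20]
```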

## 3D Surface

In [11]:
df3 = pd.DataFrame({'x':[1,2,3,4,5],'y':[10,20,30,20,10],'z':[5,4,3,2,1]})
df3.iplot(kind='surface',colorscale='rdylbu')
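As far as we can tell, `kind='surface'` reads the whole DataFrame as a grid of heights, with the index along one axis and the columns along the other, rather than treating columns named `x`, `y`, and `z` as coordinates. Under that assumption, a smoother surface can be built from a regular grid, as in the sketch below. The function, the grid resolution, and the name `grid` are illustrative choices.

```python
import numpy as np
import pandas as pd

# Grid coordinates: 25 points along each axis (arbitrary resolution).
x = np.linspace(-3, 3, 25)
y = np.linspace(-3, 3, 25)
xx, yy = np.meshgrid(x, y)

# Height at every grid point; the function is just an illustrative choice.
zz = np.sin(np.sqrt(xx**2 + yy**2))

# Rows are indexed by y, columns by x, and each cell holds a height.
grid = pd.DataFrame(zz, index=y, columns=x)
print(grid.shape)  # (25, 25)
```

With Cufflinks set up as in the cells above, `grid.iplot(kind='surface', colorscale='rdylbu')` should render this grid as a rippled surface.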

## Spread

In [12]:
df[['A','B']].iplot(kind='spread')
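A spread plot shows the two series together with the difference between them. Assuming the difference is taken as the first column minus the second, the quantity being plotted can be computed directly, as in this sketch with made-up values.

```python
import pandas as pd

demo = pd.DataFrame({"A": [3.0, 1.0, 4.0], "B": [1.0, 1.5, 2.0]})

# The spread is the row-by-row difference between the two columns.
spread = demo["A"] - demo["B"]
print(spread.tolist())  # [2.0, -0.5, 2.0]
```

Negative values mark the rows where the second column is larger than the first.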

## Histogram

In [13]:
df.iplot(kind='hist')
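A histogram splits the range of the values into bins and counts how many points land in each one. The sketch below does the same counting with NumPy for a single seeded sample. The choice of 10 bins is arbitrary, and Plotly picks its own bins by default, so the bars in the chart will not necessarily match these counts.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so the result is repeatable
sample = rng.standard_normal(100)

# Split the value range into 10 equal-width bins and count the points in each.
counts, edges = np.histogram(sample, bins=10)

print(counts.sum())  # 100: every point falls in exactly one bin
print(len(edges))    # 11: ten bins have eleven edges
```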

## Bubble Plot

In [14]:
df.iplot(kind='bubble',x='A',y='B',size='C')

## Scatter Matrix

In [15]:
df.scatter_matrix() # Similar to sns.pairplot()

## Thank You