# **Chart_Studio Plotly + Cufflinks**
___

***Topics***



*   Histogram by group data *(Example: how many houses have between 0 and 4 bedrooms)*
*   **Overlay Histogram** *(Example: comparison of multiple histograms in the same chart)*
*   Simple Bar Plot



In [None]:
!pip install cufflinks chart_studio plotly

**Import List**

In [2]:
# plotly imports
import plotly.graph_objs as go
import chart_studio.plotly as py
# Cufflinks wrapper on plotly
import cufflinks as cf

import pandas as pd
import numpy as np

%matplotlib inline

# Options for pandas
pd.options.display.max_columns = 30

# Display all cell outputs
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'

## Initializations

**This function is required to render Plotly figures in Google Colab.**
Call both of the following lines in **every cell** that contains visualization code:

`configure_plotly_browser_state()`

`init_notebook_mode(connected=False)`

In [3]:
def configure_plotly_browser_state():
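  # Import display and HTML explicitly so the function does not rely on notebook globals
  from IPython.display import display, HTML
  # Load require.js and point it at the Plotly CDN so Colab can render the figure
  display(HTML('''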
        <script src="/static/components/requirejs/require.js"></script>
        <script>
          requirejs.config({
            paths: {
              base: '/static/base',
              plotly: 'https://cdn.plot.ly/plotly-latest.min.js?noext',
            },
          });
        </script>
        '''))
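
Since the helper above is only needed on Colab, the call could be guarded so the same notebook also runs locally. The sketch below is one possible approach, not part of the original notebook: `running_in_colab` is a hypothetical helper name, and it assumes that Colab kernels have the `google.colab` package loaded at startup.

```python
import sys


def running_in_colab() -> bool:
    """Return True if the google.colab package is already loaded.

    Assumption: Colab kernels import google.colab at startup, so its
    presence in sys.modules is used here as a heuristic, not a guarantee.
    """
    return "google.colab" in sys.modules


# Typical use inside a plotting cell (helper names from this notebook):
# if running_in_colab():
#     configure_plotly_browser_state()
#     init_notebook_mode(connected=False)
print(running_in_colab())
```

Outside Colab the function returns `False`, so the Colab-specific setup would be skipped.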

In [4]:
from plotly.offline import iplot, init_notebook_mode
init_notebook_mode(connected=True)
cf.go_offline(connected=True)

# Set global theme
cf.set_config_file(world_readable=True, theme='pearl')

## DATA

In [5]:
df = pd.read_csv('kc_house_data.csv')
df.head()

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,condition,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
0,7129300520,10/13/2014,221900.0,3,1.0,1180,5650,1.0,0,0,3,7,1180,0,1955,0,98178,47.5112,-122.257,1340,5650
1,6414100192,12/9/2014,538000.0,3,2.25,2570,7242,2.0,0,0,3,7,2170,400,1951,1991,98125,47.721,-122.319,1690,7639
2,5631500400,2/25/2015,180000.0,2,1.0,770,10000,1.0,0,0,3,6,770,0,1933,0,98028,47.7379,-122.233,2720,8062
3,2487200875,12/9/2014,604000.0,4,3.0,1960,5000,1.0,0,0,5,7,1050,910,1965,0,98136,47.5208,-122.393,1360,5000
4,1954400510,2/18/2015,510000.0,3,2.0,1680,8080,1.0,0,0,3,8,1680,0,1987,0,98074,47.6168,-122.045,1800,7503


In [15]:
df.describe()

Unnamed: 0,id,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,condition,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
count,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0,21597.0
mean,4580474000.0,540296.6,3.3732,2.115826,2080.32185,15099.41,1.494096,0.007547,0.234292,3.409825,7.657915,1788.596842,291.725008,1970.999676,84.464787,98077.951845,47.560093,-122.213982,1986.620318,12758.283512
std,2876736000.0,367368.1,0.926299,0.768984,918.106125,41412.64,0.539683,0.086549,0.76639,0.650546,1.1732,827.759761,442.6678,29.375234,401.821438,53.513072,0.138552,0.140724,685.230472,27274.44195
min,1000102.0,78000.0,1.0,0.5,370.0,520.0,1.0,0.0,0.0,1.0,3.0,370.0,0.0,1900.0,0.0,98001.0,47.1559,-122.519,399.0,651.0
25%,2123049000.0,322000.0,3.0,1.75,1430.0,5040.0,1.0,0.0,0.0,3.0,7.0,1190.0,0.0,1951.0,0.0,98033.0,47.4711,-122.328,1490.0,5100.0
50%,3904930000.0,450000.0,3.0,2.25,1910.0,7618.0,1.5,0.0,0.0,3.0,7.0,1560.0,0.0,1975.0,0.0,98065.0,47.5718,-122.231,1840.0,7620.0
75%,7308900000.0,645000.0,4.0,2.5,2550.0,10685.0,2.0,0.0,0.0,4.0,8.0,2210.0,560.0,1997.0,0.0,98118.0,47.678,-122.125,2360.0,10083.0
max,9900000000.0,7700000.0,33.0,8.0,13540.0,1651359.0,3.5,1.0,4.0,5.0,13.0,9410.0,4820.0,2015.0,2015.0,98199.0,47.7776,-121.315,6210.0,871200.0


## **Histogram by Group Data**

In [None]:
####### These Lines are important for Google Colab ######
configure_plotly_browser_state()
init_notebook_mode(connected=False)
####### Without these lines figure will be null ######

df[['bedrooms', 'bathrooms', 'condition']].iplot(
    kind='hist',
    linecolor='green',
    bins=10,
    histnorm='percent',
    bargap=0.1,
    opacity=0.8,
    barmode='group',
    xTitle='House Info',
    yTitle='Percentage (%)',
    title='Housing Criteria')
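
To make the `histnorm='percent'` setting concrete, here is a dependency-free sketch of the arithmetic: values are placed in equal-width bins and each bin count is divided by the total number of values. The function name `percent_histogram` and the sample list are hypothetical, and Plotly chooses its own bin edges, so the bars in the chart may not match these numbers exactly.

```python
def percent_histogram(values, bins):
    """Equal-width binning followed by percent normalisation."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin instead of a new one.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(values)
    return [100 * c / total for c in counts]


# Hypothetical sample standing in for df['bedrooms'].
bedrooms = [3, 3, 2, 4, 3, 5, 4, 3, 2, 6]
print(percent_histogram(bedrooms, bins=4))  # [20.0, 40.0, 20.0, 20.0]
```

Because each column is normalised on its own, the bars of every column add up to 100%, which is what makes columns with different scales comparable in one chart.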

## **Overlay Histogram**

In [None]:
####### These Lines are important for Google Colab ######
configure_plotly_browser_state()
init_notebook_mode(connected=False)
####### Without these lines figure will be null ######

df[['bedrooms', 'bathrooms']].iplot(
    kind='hist',
    bins=10,
    linecolor='black',
    opacity=0.8,
    histnorm='percent',
    barmode='overlay',
    xTitle='House',
    yTitle='(%)',
    title='Compare bedroom and bathroom')

⚠️*Comment: Compare multiple columns by plotting them in the same histogram chart.*

## **Bar Plot**

In [None]:
####### These Lines are important for Google Colab ######
configure_plotly_browser_state()
init_notebook_mode(connected=False)
####### Without these lines figure will be null ######

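# Count houses per bedroom value; the chart shows frequencies, not prices
df.groupby('bedrooms')['bathrooms'].count().iplot(
    kind='bar', xTitle='Number of bedrooms', yTitle='Total Frequency of houses', linecolor='black', title='Number of houses by bedroom count')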
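
One detail worth knowing about the aggregation used here: counting a column after `groupby` skips missing values in that column, whereas `size()` counts rows. The `describe()` output above reports 21597 values for every column, so the two should agree on this dataset. The small frame below is hypothetical and only illustrates the difference.

```python
import pandas as pd

# Hypothetical mini-dataset with one missing bathrooms value.
houses = pd.DataFrame({
    "bedrooms": [3, 3, 2, 4, 3],
    "bathrooms": [1.0, 2.25, 1.0, 3.0, None],
})

# count() ignores the missing bathrooms entry in the 3-bedroom group.
via_count = houses.groupby("bedrooms")["bathrooms"].count()
# size() counts every row in each group, regardless of missing values.
via_size = houses.groupby("bedrooms").size()

print(via_count.loc[3], via_size.loc[3])  # 2 3
```

If the goal is simply "number of houses per bedroom count", `size()` states that intent more directly.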