<span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:28px; color:#FBFAFC; ">Plotly Tutorial 📊📈</span>

1. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Libraries and Utilities</span>](#1)
2. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Load and Check Data</span>](#2)
3. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Scatter Plots</span>](#3)
4. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Bubble Charts</span>](#4)
5. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">3D Scatter Plots</span>](#5)
6. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Line Charts</span>](#6)
7. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Bar Charts</span>](#7)
8. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Pie Charts</span>](#8)
9. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Histograms</span>](#9)
10. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Distplots</span>](#10)
11. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Heatmaps</span>](#11)
12. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Box Plots</span>](#12)
13. [<span style="font-weight: bold; font-family:Verdana; font-size:16px; color:#DC1010; ">Subplots</span>](#13)

<span style="font-weight: bold; font-family:Verdana; font-size:18px; color:#DC1010; ">What is Plotly?</span>

Plotly is an interactive, open-source plotting library that supports over 40 unique chart types covering a wide range of statistical, financial, geographic, scientific, and 3-dimensional use-cases. A visualization built with Plotly has three main elements:
- **Data:** The traces to draw. Each trace specifies the chart type and the variables mapped to its axes.
- **Layout:** Title, axis labels, fonts, axis ranges, etc.
- **Figure:** The object that combines the data and the layout. Displaying the figure renders the visualization.

<a id = "1"></a><h1 id="Libraries and Utilities"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Libraries and Utilities</span></h1>

In [None]:
!pip install plotly
!pip install chart_studio
!pip install cufflinks

In [None]:
import pandas as pd
import numpy as np
import random
import matplotlib.pyplot as plt
%matplotlib inline

import plotly.graph_objs as go
import plotly.offline as pyo
import plotly.figure_factory as ff
import plotly.express as px
from plotly import tools
from plotly.subplots import make_subplots

import chart_studio.plotly as py
import cufflinks as cf

from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected = True)
cf.go_offline();
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

<a id = "2"></a><h1 id="Load and Check Data"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Load and Check Data</span></h1>

In [None]:
df = pd.read_csv('/kaggle/input/students-performance-in-exams/StudentsPerformance.csv')
df.head()

<a id = "3"></a><h1 id="Scatter Plots"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Scatter Plots</span></h1>

- Scatter plots allow the comparison of two variables for a set of data.
- The trend of the scatter points can suggest a correlation between the two variables.
- We can create scatter plots with <code>go.Scatter</code>.
- Set the title of the graph with <code>title</code> in **layout**.
- The <code>x</code> and <code>y</code> parameters inside the **title** dictionary set the position of the title, as fractions (0 to 1) of the figure width and height.
- We can also specify the size of the output with <code>width</code> and <code>height</code>.
- Finally, pass the data and layout to <code>go.Figure</code> to build the figure.

In [None]:
data = [go.Scatter(x=df['reading score'],
                   y=df['writing score'],
                   mode='markers',
                   marker = dict(size=12,
                                 color='rgb(0,189,255)',
                                 symbol= 'diamond',
                                 opacity = 0.8,
                                 line={'color':'black',
                                       'width':1.5}))]

layout = go.Layout(title=dict(text='Reading Score & Writing Score',
                              y=0.88,
                              x=0.5,
                              xanchor= 'center',
                              yanchor= 'top'),
                              xaxis={'title':'Reading Score'},
                              yaxis=dict(title='Writing Score'),
                              hovermode='closest')

fig = go.Figure(data=data, layout=layout)
iplot(fig)

In [None]:
trace_male = (go.Scatter(x=df[df['gender']=='male']['math score'],
                         showlegend=True,text='Male',
                         y = df[df['gender']=='male']['writing score'],
                         name='Male',
                         mode='markers',
                         marker = dict(color= 'cornflowerblue',
                                       size=9,
                                       opacity = 0.55)))

trace_female = (go.Scatter(x=df[df['gender']=='female']['math score'],
                           showlegend=True,text='Female',
                           y = df[df['gender']=='female']['writing score'],
                           name='Female',
                           mode='markers',
                           marker = dict(color='darkorange',
                                         size=9,
                                         opacity = 0.55)))
        
data=[trace_male,trace_female]

layout= go.Layout(title='Math Score & Writing Score',
                  xaxis = dict(title='Math Score'),
                  yaxis=dict(title='Writing Score'),
                  width=700,
                  height=450,
                  template='plotly_white')

fig = go.Figure(data=data,layout=layout)   
iplot(fig)

In [None]:
data = [go.Scatter(x = df['reading score'],
                   y = df['writing score'],
                   mode = 'markers',
                   text=df['math score'],
                   marker=dict(size=10,
                               color = df['math score'],
                               showscale=True,
                               colorbar=dict(title='Math Score'),
                               opacity=0.65))]

layout = go.Layout(title=dict(text='Reading Score - Writing Score - Math Score',
                              y=0.9,
                              x=0.5,
                              xanchor= 'center',
                              yanchor= 'top'),
                              xaxis = dict(title='Reading Score'),
                              yaxis =dict(title='Writing Score'),
                   width=700,
                   height=450,
                   template='plotly_dark')

fig = go.Figure(data=data,layout=layout)
iplot(fig)

<span style="font-weight: bold; font-family:Verdana; font-size:18px; color:#DC1010; ">Scatter Plots by Using For Loop</span>

In [None]:
data = []
for i in df['parental level of education'].unique():
    data.append(go.Scatter(x=df[df['parental level of education']==i]['reading score'],
                           y=df[df['parental level of education']==i]['math score'],
                           mode='markers',
                           name=str(i),
                           showlegend = True,
                           marker = dict(size = 8,
                                          opacity = 0.7)))

layout = go.Layout(title='Scores by Level of Education',
                  xaxis = dict(title='Reading Score'),
                  yaxis=dict(title='Math Score'),
                  width=700,
                  height=450,
                  template='plotly_white')

fig = go.Figure(data=data, layout = layout)
fig.show()

<a id = "4"></a><h1 id="Bubble Charts"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Bubble Charts</span></h1>

- A bubble chart is a type of chart that displays **three dimensions** of data.
- Bubble charts can be considered a variation of the scatter plot, in which the data points are replaced with bubbles.
- Map the third variable to the marker <code>size</code> to turn a scatter plot into a bubble chart.

In [None]:
data = [go.Scatter(x = df['reading score'],
                   y = df['writing score'],
                   mode = 'markers',
                   text = df['math score'],
                   marker=dict(size=df['math score']*0.3,
                            color = '#FFAE00',
                            showscale=False,
                            opacity=0.6,
                            line=dict(color='black', 
                                      width=0.9)))]

layout = go.Layout(title='Reading Score - Writing Score - Math Score',
                   xaxis = dict(title='Reading Score'),
                   yaxis =dict(title='Writing Score'),
                   width=700,
                   height=450,
                   template='plotly_white')

fig = go.Figure(data=data,layout=layout)
iplot(fig)

In [None]:
fig = make_subplots(rows=1,
                    cols=2,
                    shared_yaxes=True,
                    subplot_titles=("Male", "Female"))
m_color = ['#360DEF','#3A23A2','#14CED1','#0CAFFA','#1FE716','#A8B0E8']
f_color = ['#FA0606','#E8BAE3','#E61FE6','#FA8B06','#ECEF2A','#C707FA']
m=0
f=0
for i in df['parental level of education'].unique():
    fig.add_trace(go.Scatter(x=df.loc[(df['gender']=='male') & (df['parental level of education']==i)]['reading score'],
                            y=df.loc[(df['gender']=='male') & (df['parental level of education']==i)]['math score'],showlegend=True,
                             mode='markers', name=str(i)+' '+'(M)',text=df.loc[(df['gender']=='male') & (df['parental level of education']==i)]['writing score'],
                            marker = dict(color = m_color[m],
                                          size = df.loc[(df['gender']=='male') & (df['parental level of education']==i)]['writing score']*0.3,
                                          opacity = 0.5)),row=1,col=1)
    m=m+1

for i in df['parental level of education'].unique():
    fig.add_trace(go.Scatter(x=df.loc[(df['gender']=='female') & (df['parental level of education']==i)]['reading score'],
                            y=df.loc[(df['gender']=='female') & (df['parental level of education']==i)]['math score'],showlegend=True,
                             mode='markers', name=str(i)+' '+'(F)',text=df.loc[(df['gender']=='female') & (df['parental level of education']==i)]['writing score'],
                            marker = dict(color = f_color[f],
                                          size = df.loc[(df['gender']=='female') & (df['parental level of education']==i)]['writing score']*0.3,
                                          opacity = 0.5)),row=1,col=2)
    f=f+1

# No row/col arguments: the update applies to the axes of both subplots.
fig.update_xaxes(title_text="Reading Score",
                 gridwidth=1,
                 gridcolor='LightGray')
# The y values are math scores; writing scores only set the bubble size.
fig.update_yaxes(title_text="Math Score",
                 gridwidth=1,
                 gridcolor='LightGray')

fig.update_layout(title=dict(text='Scores by Level of Education',
                             y=0.9,
                             x=0.5,
                             xanchor= 'center',
        yanchor= 'top'),
                  template = 'plotly',
                  legend = dict(font = dict(size = 10)))

iplot(fig)

<a id = "5"></a><h1 id="3D Scatter Plots"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">3D Scatter Plots</span></h1>

In [None]:
data = go.Scatter3d(x = df['reading score'],
                    y = df['writing score'],
                    z = df['math score'],
                    mode='markers', marker=dict(color=df['math score'],
                                                colorscale='Viridis',
                                                showscale=True,
                                                colorbar=dict(title='Math Score'),
                                                opacity=0.8))

layout = go.Layout(title=dict(text='Scores',y=0.9,
                              x=0.5,
                              xanchor= 'center',
                              yanchor= 'top'),
                   scene = dict(xaxis = dict(title='Reading Score'),
                                yaxis = dict(title = 'Writing Score'),
                                zaxis = dict(title='Math Score')),
                   template='plotly_dark')

fig = go.Figure(data=data,layout=layout)
iplot(fig)

<a id = "6"></a><h1 id="Line Charts"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Line Charts</span></h1>

- A line chart displays a series of data points (markers) connected by line segments.
- It is similar to a scatter plot except that the measurement points are ordered (typically by their x-axis value) and joined with straight line segments.
- It is often used to visualize a trend in data over intervals of time, known as a time series.
- To create a line chart, set the <code>mode</code> parameter to <code>'lines'</code>.
- As with markers in scatter plots, you can set line properties such as color and width with the <code>line</code> parameter.

In [None]:
students = np.random.randint(25,60,10)
from datetime import date, time, datetime
dates = []
for i in range(10):
    dates.append(date(year=2020+i, month=1, day=1))
students_dataset = pd.DataFrame(dates,columns=['Date'])
students_dataset['students'] = students
students_dataset.head()

In [None]:
data = go.Scatter(x=students_dataset['Date'],
                  y=students_dataset['students'],
                  mode='lines',
                  name='students',
                  line = dict(color='#FF2F01',width=4))

layout = go.Layout(title={'text': "Number of Students by Years",
                          'y':0.9,
                          'x':0.5,
                          'xanchor': 'center',
                          'yanchor': 'top'},
                   xaxis = dict(title='Year'),
                   yaxis =dict(title='Student'))

fig = go.Figure(data=data, layout=layout)
iplot(fig)

In [None]:
fig = make_subplots(rows=1, cols=2,shared_yaxes=True,subplot_titles=("2020-2024", "2025-2029"))

fig.add_trace(go.Scatter(x=students_dataset['Date'][0:5], y=students_dataset['students'][0:5],mode='lines',
                         showlegend=False,name='students20-24',line = dict(color='#18FF01',width=4)),row=1,col=1)
                                      
fig.add_trace(go.Scatter(x=students_dataset['Date'][5:10], y=students_dataset['students'][5:10],mode='lines',
                         showlegend=False, name='students25-29',line = dict(color='#01AAFF',width=4)),row=1,col=2)

fig.update_yaxes(title_text="Students", row=1, col=1)

fig.update_layout(title=dict(text='Number of Students by Years',y=0.9,x=0.5,
                             xanchor= 'center',yanchor= 'top'),template = 'plotly')

iplot(fig)

<a id = "7"></a><h1 id="Bar Charts"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Bar Charts</span></h1>

- A bar chart presents **categorical data** with rectangular bars whose heights (or lengths) are proportional to the values they represent.
- Build a bar chart with <code>go.Bar</code>.
- Use <code>text</code> to display values on each bar.
- Bars can be styled with <code>marker</code>.

In [None]:
data = go.Bar(x = df.groupby('gender').agg({'math score':'mean'}).reset_index()['gender'],
              y = df.groupby('gender').agg({'math score':'mean'}).reset_index()['math score'],
              width=[0.5, 0.5],marker = dict(color = 'cornflowerblue',line_color = 'black',line_width=3))

layout = go.Layout(title='Avg Math Scores by Gender',
                   xaxis = dict(title='Gender'),
                   yaxis =dict(title='Math Score'),
                   width=450,height=450,template = 'plotly_white')

fig=go.Figure(data=data, layout=layout)
fig.update_yaxes(range=[0,100])
iplot(fig)

In [None]:
data = go.Bar(x = df.groupby('gender').agg({'reading score':'mean'}).reset_index()['gender'],
              y = df.groupby('gender').agg({'reading score':'mean'}).reset_index()['reading score'],
              width=[0.5, 0.5],
              text =round(df.groupby('gender').agg({'reading score':'mean'}).reset_index()['reading score'],2),
              textposition= 'outside',
              marker = dict(color = 'deeppink',
                            line_color = 'black',line_width=3))

layout = go.Layout(title='Avg Reading Scores by Gender',
                   xaxis = dict(title='Gender'),
                   yaxis =dict(title='Reading Score'),
                  width=450,height=450,template='plotly_white')

fig=go.Figure(data=data, layout=layout)
fig.update_yaxes(range=[0,100])
iplot(fig)

<span style="font-weight: bold; font-family:Verdana; font-size:18px; color:#DC1010; ">Grouped Bar Charts</span>

- A grouped bar chart extends the bar chart, plotting numeric values for levels of two categorical variables instead of one.
- Bars are grouped by position for levels of one categorical variable, with color indicating the secondary category level within each group.
- Use <code>barmode</code> to define the type of the bar chart.

In [None]:
trace1 = go.Bar(x = df.groupby('gender').agg({'reading score':'mean'}).reset_index()['gender'],
                text =round(df.groupby('gender').agg({'reading score':'mean'}).reset_index()['reading score'],2),
                textposition= 'auto',
                y = df.groupby('gender').agg({'reading score':'mean'}).reset_index()['reading score'],
                name = 'Reading Score',marker=dict(color='#06F5E3'))

trace2 = go.Bar(x = df.groupby('gender').agg({'writing score':'mean'}).reset_index()['gender'],
                text =round(df.groupby('gender').agg({'writing score':'mean'}).reset_index()['writing score'],2),
                textposition= 'auto',
                y = df.groupby('gender').agg({'writing score':'mean'}).reset_index()['writing score'],
                name = 'Writing Score',marker=dict(color='#FEAD00'))

trace3 = go.Bar(x = df.groupby('gender').agg({'math score':'mean'}).reset_index()['gender'],
                text =round(df.groupby('gender').agg({'math score':'mean'}).reset_index()['math score'],2),
                textposition= 'auto',
                y = df.groupby('gender').agg({'math score':'mean'}).reset_index()['math score'],
                name = 'Math Score',marker=dict(color='#CC00FE'))

layout = go.Layout(title={'text': "Avg Scores by Gender",'y':0.9,'x':0.45,
                          'xanchor': 'center','yanchor': 'top'},
                  barmode = 'group',legend=dict(x=0,y=1.0,bgcolor='rgba(255, 255, 255, 0)',
                                                bordercolor='rgba(255, 255, 255, 0)'),
                  xaxis = dict(title='Gender'),
                  yaxis =dict(title='Score'),template='plotly_dark')

data = [trace1,trace2,trace3]
fig = go.Figure(data=data,layout=layout)
fig.update_yaxes(range=[0,100])
iplot(fig)

<span style="font-weight: bold; font-family:Verdana; font-size:18px; color:#DC1010; ">Stacked Bar Charts</span>

In [None]:
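# numeric_only=True skips the text columns, which recent pandas versions refuse to average
parental_avg = pd.DataFrame(df.groupby(['parental level of education']).mean(numeric_only=True))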
parental_avg = parental_avg.reset_index()

trace1 = go.Bar(x = parental_avg['parental level of education'],
                y = parental_avg['math score'], name='math score',
               marker=dict(color ='#F2E80C',opacity = 0.7))

trace2 = go.Bar(x = parental_avg['parental level of education'],
                y = parental_avg['reading score'], name='reading score',
               marker=dict(color='#44F20C',opacity = 0.7))

trace3 = go.Bar(x = parental_avg['parental level of education'],
                y = parental_avg['writing score'], name='writing score',
               marker=dict(color='#F20CE1',opacity = 0.7))

layout = go.Layout(title = 'Avg Scores by Level of Education', barmode = 'stack',
                  xaxis = dict(title='Level of Education'),
                   yaxis =dict(title='Score'))
data = [trace1,trace2,trace3]
fig = go.Figure(data=data,layout=layout)
iplot(fig)

<a id = "8"></a><h1 id="Pie Charts"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Pie Charts</span></h1>

- A pie chart is a circular statistical graphic, which is divided into slices to illustrate numerical proportion.
- In a pie chart, the arc length of each slice is proportional to the quantity it represents. Although it is named for its resemblance to a sliced pie, it can be presented in several variations.
- Create a pie chart with <code>go.Pie</code>.

In [None]:
colors = ['#28F20C', '#0CF2F2', '#F27F0C', '#F20C52']

fig = go.Figure(data=[go.Pie(labels=df['race/ethnicity'].value_counts().keys(),
                             values=df['race/ethnicity'].value_counts()[0:5].values)])
fig.update_traces(hoverinfo='value', textinfo='label', textfont_size=16,textposition ='auto',showlegend=False,
                  marker=dict(colors=colors))

fig.update_layout(title={'text': "Race/Ethnicity Groups",'y':0.9,'x':0.5,'xanchor': 'center','yanchor': 'top'},
                 template='simple_white')
iplot(fig)

In [None]:
colors = ['#14CFE8', '#E814C1']

fig = go.Figure(data=[go.Pie(labels=df['lunch'].value_counts().keys(),
                             values=df['lunch'].value_counts()[0:5].values,
                             pull=[0, 0.2])])

fig.update_traces(hoverinfo='label', textinfo='percent', textfont_size=20,textposition ='auto',
                  marker=dict(colors=colors, line=dict(color='black', width=1.5)))

fig.update_layout(title={'text': "Percentages of Lunch Types",'y':0.86,'x':0.45,
                         'xanchor': 'center','yanchor': 'top'},
                  template='plotly_white')

iplot(fig)

<span style="font-weight: bold; font-family:Verdana; font-size:18px; color:#DC1010; ">Donut Charts</span>

- A donut chart is a pie chart with a hole in the center.
- Use <code>hole</code> to define the hole. It takes a fraction of the radius between 0 and 1, so **larger values** produce **bigger holes**.

In [None]:
colors = ['#D7DD19', '#6FDD19', '#19DDA5', '#195ADD','#A219DD','#DD1984']

fig = go.Figure(data=[go.Pie(labels=df['parental level of education'].value_counts().keys(),
                             values=df['parental level of education'].value_counts()[0:6].values)])

fig.update_traces(hoverinfo='label', textinfo='value',hole = 0.35, textfont_size=22,textposition ='auto',
                  marker=dict(colors=colors,line=dict(color='beige', width=1.5)))

fig.update_layout(title={'text': "Parental Level of Education",'y':0.88,'x':0.45,
                         'xanchor': 'center','yanchor': 'top'},
                          template='plotly_dark')

iplot(fig)

<a id = "9"></a><h1 id="Histograms"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Histograms</span></h1>

- A histogram shows the overall distribution of a **continuous feature**.
- In graph objects, <code>go.Histogram</code> creates a histogram.
- To create a histogram, we divide the entire range of values of the continuous feature into a series of intervals.
- These intervals are known as **"bins"**.
- Set the start, end, and bin width with the <code>start</code>, <code>end</code>, and <code>size</code> keys of <code>xbins</code>.
- Change the bin <code>size</code> to get either more or less detail.
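
To see what binning does to the data, the following sketch counts made-up scores in bins of width 5 between 0 and 100, using NumPy instead of Plotly. It only illustrates the idea of bins; Plotly performs its own binning when it draws a histogram.

```python
import numpy as np

scores = np.array([12, 47, 51, 53, 68, 69, 70, 88, 99])  # made-up scores

# Bin edges 0, 5, 10, ..., 100, mirroring start=0, end=100, size=5.
edges = np.arange(0, 101, 5)
counts, _ = np.histogram(scores, bins=edges)

# Print only the non-empty bins.
for left, count in zip(edges[:-1], counts):
    if count:
        print(f'[{left}, {left + 5}): {count}')
```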

In [None]:
data = [go.Histogram(x= df['math score'],
                     xbins = dict(start = 0,end =100,size =5),
                    marker=dict(color='#FFE400',line=dict(color='black', width=2)))]

layout = go.Layout(title='Math Scores Histogram',
                   xaxis = dict(title='Score'),
                   yaxis =dict(title='Frequency'),
                  width=700,height=450, template = 'simple_white')

fig = go.Figure(data = data, layout = layout)

iplot(fig)

In [None]:
fig = go.Figure()
fig.add_trace(go.Histogram(x=df[df['gender']=='male']['reading score'],
                           xbins = dict(start = 0,end =100,size =5),name='Male',
                          marker=dict(color = '#0891EF', opacity = 0.5)))
fig.add_trace(go.Histogram(x=df[df['gender']=='female']['reading score'],
                           xbins = dict(start = 0,end =100,size =5),name='Female',
                          marker =dict(color ='#FF00E0', opacity = 0.5)))

fig.update_layout(title='Reading Scores Histogram',barmode='overlay',
                  xaxis = dict(title='Score'),
                   yaxis =dict(title='Frequency'),
                  width=700,height=450)
iplot(fig)

<a id = "10"></a><h1 id="Distplots"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Distplots</span></h1>

- Distribution Plots, or Distplots, typically layer three plots on top of one another.
- The first is a histogram, where each data point is placed inside a bin of similar values.
- The second is a rug plot - marks are placed along the x-axis for every data point, which lets you see the distribution of values inside each bin.
- Lastly, Distribution plots often include a "kernel density estimate", or KDE line that tries to describe the shape of the distribution.
- Use <code>create_distplot</code> to define a distplot.
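
The KDE line can be pictured as the average of one small bell curve per data point. Below is a minimal NumPy sketch of that idea with a Gaussian kernel. The function name `gaussian_kde`, the made-up scores and the hand-picked bandwidth are all assumptions for illustration, and this is not how `create_distplot` is implemented internally.

```python
import numpy as np

def gaussian_kde(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated on `grid`."""
    samples = np.asarray(samples, dtype=float)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (len(samples) * bandwidth)

scores = [45, 52, 58, 61, 63, 70, 74, 88]   # made-up scores
grid = np.linspace(0, 150, 601)
density = gaussian_kde(scores, grid, bandwidth=5.0)  # bandwidth chosen by hand

# A density should integrate to roughly 1 over a wide enough grid.
step = grid[1] - grid[0]
print(round(float(density.sum() * step), 3))
```

A smaller bandwidth gives a bumpier curve that follows individual points, while a larger one gives a smoother curve.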

In [None]:
hist_data = []
group_labels=[]
for i in range(len(df['race/ethnicity'].unique())):
    hist_data.append(df[df['race/ethnicity'] == df['race/ethnicity'].unique()[i]]['math score'])
    group_labels.append(df['race/ethnicity'].unique()[i])

fig = ff.create_distplot(hist_data, group_labels, bin_size= 5)

fig.update_layout(title={'text': "Math Scores Distplot",'y':0.85,'x':0.48,'xanchor': 'center',
        'yanchor': 'top'},barmode='overlay',template='plotly_white')

iplot(fig)

<a id = "11"></a><h1 id="Heatmaps"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Heatmaps</span></h1>

- Heatmaps allow the visualization of **3 features**.
- Categorical or continuous features along the x and y axes make up a grid, and a 3rd continuous feature is displayed through color.
- Continuous x and y axes are separated into intervals to form the grid.
- **Categorical** features can also be placed on the **x** and **y** axes.
- Use <code>go.Heatmap</code> to define a heatmap.

In [None]:
data = [go.Heatmap(x=df['gender'],
                   y= df['parental level of education'],
                   z = df['math score'].values.tolist(),
                   colorscale = 'Plasma')]

layout = go.Layout(title={'text': "Gender & Level of Education",'y':0.9,'x':0.56,'xanchor': 'center',
        'yanchor': 'top'},
                   xaxis = dict(title='Gender'),
                   yaxis =dict(title='Level of Education'),
                   width=700,height=450,template='plotly_white')

fig = go.Figure(data = data, layout = layout)
iplot(fig)

<a id = "12"></a><h1 id="Box Plots"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Box Plots</span></h1>

- Box plots visualize the distribution of a **continuous numerical** feature through its quartiles.
- Separate the data by a categorical feature to compare the continuous feature across categories.
- Create a box plot with <code>go.Box</code>.
- Quartiles separate the data into four equal parts.
- **Q1**, the **first quartile**, is the 25th percentile.
- **Q2 (the median)** is the 50th percentile: 50% of the scores lie below it.
- Finally, **Q3**, the 75th percentile, is the midpoint between the median (Q2) and the highest value of the distribution.
- Hovering over the plot displays the median, max and min values, and the quartiles.
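
The quartiles described above can be computed by hand to check what a box plot shows. This sketch uses made-up scores and NumPy's default percentile rule, which is an assumption: Plotly may use a slightly different quartile method, so the hover values can differ a little.

```python
import numpy as np

scores = np.array([40, 55, 60, 62, 70, 75, 81, 90, 98])  # made-up scores

# NumPy's default (linear interpolation) percentile rule.
q1, q2, q3 = np.percentile(scores, [25, 50, 75])
iqr = q3 - q1  # interquartile range: the height of the box

print(f'Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}')
```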

In [None]:
data = go.Box(y=df['math score'],name = 'Math Score',marker_color='#F513C1')
layout = go.Layout(title={'text': "Math Score",'y':0.9,'x':0.5,'xanchor': 'center',
        'yanchor': 'top'}, width = 450, height=450)
fig = go.Figure(data = data, layout=layout)
iplot(fig)

In [None]:
data = [go.Box(x =df['reading score'],showlegend=False, name = 'Reading Score'),
        go.Box(x=df['writing score'],showlegend=False, name = 'Writing Score'),
       go.Box(x=df['math score'],showlegend=False, name = 'Math Score')]

layout = go.Layout(title={'text': "Scores",'y':0.9,'x':0.5,'xanchor': 'center',
        'yanchor': 'top'}, width = 700, height=450,template='plotly_dark')
fig = go.Figure(data = data, layout = layout)
iplot(fig)

In [None]:
fig = make_subplots(rows=1,cols=2,shared_yaxes=True,subplot_titles=("Male", "Female"))

fig.add_trace(go.Box(y =df[df['gender']=='male']['writing score'],showlegend=False,
                     name = 'Writing Score',marker_color='#1760E1'),row=1,col=1)
fig.add_trace(go.Box(y =df[df['gender']=='male']['math score'],showlegend=False ,
                     name = 'Math Score',marker_color='#17E160'),row=1,col=1)
fig.add_trace(go.Box(y =df[df['gender']=='male']['reading score'],showlegend=False ,
                     name = 'Reading Score',marker_color='#E1E117'),row=1,col=1)

fig.add_trace(go.Box(y =df[df['gender']=='female']['writing score'],showlegend=False,
                     name = 'Writing Score',marker_color='#1760E1'),row=1,col=2)
fig.add_trace(go.Box(y =df[df['gender']=='female']['math score'] ,showlegend=False,
                     name = 'Math Score',marker_color='#17E160'),row=1,col=2)
fig.add_trace(go.Box(y =df[df['gender']=='female']['reading score'],showlegend=False ,
                     name = 'Reading Score',marker_color='#E1E117'),row=1,col=2)

fig.update_layout(title={'text': "Scores by Gender",'y':0.9,'x':0.5,'xanchor': 'center',
        'yanchor': 'top'}, width = 700, height=450,template='plotly')      
iplot(fig)

<a id = "13"></a><h1 id="Subplots"><span class="label label-default" style="background-color:#DC1010; border-radius:12px; font-weight: bold; font-family:Verdana; font-size:22px; color:#FBFAFC; ">Subplots</span></h1>

- Subplots provide a way to draw **multiple plots** on a **single figure**.
- Use <code>make_subplots</code> to create the figure, and define its grid with <code>rows</code> and <code>cols</code>.
- Set the title of each plot with <code>subplot_titles</code>.

In [None]:
colors = ['#237DE3','#23E37D','#E35BDB','#E3885B','#5BE3E1','#C27CED']

fig = make_subplots(rows=1,cols=2,
                    subplot_titles=('Countplot',
                                    'Percentages'),
                    specs=[[{"type": "xy"},
                            {'type':'domain'}]])

fig.add_trace(go.Bar( y = df['race/ethnicity'].value_counts().values.tolist(), 
                      x = df['race/ethnicity'].value_counts().index, 
                      text=df['race/ethnicity'].value_counts().values.tolist(),
                      textfont=dict(size=15),
                      name = 'race/ethnicity',
                      textposition = 'auto',
                      showlegend=False,
                      marker=dict(color = colors,
                                  line=dict(color='white',
                                            width=1.5))),
              row = 1, col = 1)

fig.add_trace(go.Pie(labels=df['race/ethnicity'].value_counts().keys(),
                     values=df['race/ethnicity'].value_counts().values,
                     textfont = dict(size = 16),
                     textposition='auto',
                     showlegend = False,
                     name = 'race/ethnicity',
                     marker=dict(colors = colors)),
              row = 1, col = 2)

fig.update_layout(title={'text': 'Race/Ethnicity',
                         'y':0.9,
                         'x':0.5,
                         'xanchor': 'center',
                         'yanchor': 'top'},
                  template='plotly_white')

iplot(fig)

In [None]:
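# Average the numeric score columns per group. numeric_only=True skips the
# text columns, which pandas 2.0+ would otherwise reject with a TypeError.
tpc = df.groupby('test preparation course').mean(numeric_only=True)
tpc['tpc'] = tpc.index
tpc = tpc.reset_index()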
fig = make_subplots(rows=2,
                    cols=2,
                    subplot_titles=("Math & Writing Scores",
                                    "Math Scores",
                                    "Writing Scores",
                                    "Avg Scores"))

fig.add_trace((go.Scatter(x=df[df['test preparation course']=='none']['math score'],
                          showlegend=False,
                          text='None',
                   y = df[df['test preparation course']=='none']['writing score'],
                          name='None',
                          mode='markers',
                  marker = dict(color= 'cornflowerblue',
                                size=8,
                                opacity = 0.6))),row=1,col=1)

fig.add_trace((go.Scatter(x=df[df['test preparation course']=='completed']['math score'],
                          showlegend=False,
                          text='Completed',
                   y = df[df['test preparation course']=='completed']['writing score'],
                          name='Completed',
                          mode='markers',
                  marker = dict(color= 'darkorange',
                                size=8,
                                opacity = 0.6))),row=1,col=1)

fig.add_trace(go.Histogram(x=df[df['test preparation course']=='none']['math score'],
                           showlegend=False,
                           xbins = dict(start = 0,end =100,size =5),
                           name='None',
                          marker=dict(color = 'cornflowerblue')),row=1,col=2)
fig.add_trace(go.Histogram(x=df[df['test preparation course']=='completed']['math score'],
                           showlegend=False,
                           xbins = dict(start = 0,end =100,size =5),
                           name='Completed',
                          marker =dict(color ='darkorange')),row=1,col=2)

# This panel is titled "Writing Scores", so plot the writing scores here.
fig.add_trace(go.Violin(y =df[df['test preparation course']=='none']['writing score'],
                        showlegend=False,
                        name = 'None',
                        marker_color='#55EAE8'),row=2,col=1)
fig.add_trace(go.Violin(y =df[df['test preparation course']=='completed']['writing score'],
                        showlegend=False,
                        name = 'Completed',
                        marker_color='#EA5555'),row=2,col=1)

fig.add_trace(go.Bar(x = tpc['tpc'],name='Math Score',
                     y = tpc['math score'],
                     showlegend=False,
                     text =round(tpc['math score'],1),
                     textposition= 'auto',marker = dict(color = '#B900FF',
                                                        opacity=0.7)),row=2,col=2)

fig.add_trace(go.Bar(x = tpc['tpc'],name='Writing Score',
                     y = tpc['writing score']
                     ,showlegend=False,
                     text =round(tpc['writing score'],1),
                     textposition= 'auto',marker = dict(color = '#F7FA10',
                                                        opacity=0.7)),row=2,col=2)

fig.add_trace(go.Bar(x = tpc['tpc'],name='Reading Score',
                     y = tpc['reading score'],
                     showlegend=False,
                     text =round(tpc['reading score'],1),
                     textposition= 'auto',
                     marker = dict(color = '#35CB29',
                                   opacity=0.7)),row=2,col=2)

fig.update_xaxes(title_text="Math", row=1, col=1)
fig.update_yaxes(title_text="Writing", row=1, col=1)
fig.update_yaxes(title_text="Frequency", row=1, col=2)
fig.update_xaxes(title_text="Test Preparation Course", row=2, col=1)
fig.update_xaxes(title_text="Test Preparation Course", row=2, col=2)
fig.update_yaxes(range = ([0,100]), row=2, col=2)
fig.update_layout(title={'text': "Students Performance In Exams",
                         'y':0.9,
                         'x':0.5,
                         'xanchor': 'center',
                         'yanchor': 'top'},
                  template='plotly_white')

iplot(fig)

<span style="font-weight: bold; font-family:Verdana; font-size:18px; color:#DC1010; ">BONUS</span>

In [None]:
def create_scatter(variable):
    # numeric_only=True averages only the score columns; pandas 2.0+ raises a
    # TypeError when asked to average the text columns.
    tpc = df.groupby(variable).mean(numeric_only=True)
    tpc['tpc'] = tpc.index
    tpc = tpc.reset_index()
    tpc = tpc.drop(tpc.columns[0],axis=1)
    fig = go.Figure()
    # Pick two distinct score columns at random for the x and y axes
    # (the last column, 'tpc', holds the group labels and is excluded).
    b_list = random.sample(list(tpc.columns[:-1]), 2)
        
    for a in df[variable].unique():
        fig.add_trace((go.Scatter(x=df[df[variable]==a][b_list[0]],showlegend=False,text=a,
                       y = df[df[variable]==a][b_list[1]],name=a, mode='markers')))
                
    fig.update_xaxes(title_text=b_list[0].title())
    fig.update_yaxes(title_text=b_list[1].title())            
    fig.update_layout(title={'text': variable.title(),'y':0.9,'x':0.5,'xanchor': 'center','yanchor': 'top'},
                     template='plotly_white')
    
    return iplot(fig)

def create_bar(variable):
    tpc = df.groupby(variable).mean(numeric_only=True)  # average the score columns only
    tpc['tpc'] = tpc.index
    tpc = tpc.reset_index()
    tpc = tpc.drop(tpc.columns[0],axis=1)
    fig = go.Figure()
    for i in tpc.columns[:tpc.shape[1]-1]:
        
        fig.add_trace(go.Bar(x = tpc['tpc'],name=str(i),y = tpc[str(i)],
                             showlegend=False,text =round(tpc[(i)],2),
                     textposition= 'auto'))
    fig.update_layout(title={'text': variable.title(),'y':0.9,'x':0.5,
                             'xanchor': 'center','yanchor': 'top'},
                 template='simple_white')
    
    return iplot(fig)

def create_histogram(variable,size=10):
    tpc = df.groupby(variable).mean(numeric_only=True)  # average the score columns only
    tpc['tpc'] = tpc.index
    tpc = tpc.reset_index()
    tpc = tpc.drop(tpc.columns[0],axis=1)
    fig = go.Figure()
    b=random.choices(tpc.columns[:tpc.shape[1]-1])[0]
    
    for i in df[variable].unique():
        
        fig.add_trace(go.Histogram(x=df[df[variable]==i][b],name=str(i),
                                   showlegend=True,marker =dict(opacity = 0.6),
                                  xbins = dict(start = df[b].min(),end =df[b].max(),size =size)))
    fig.update_xaxes(title_text=b.title())
    fig.update_yaxes(title_text='Frequency')
    fig.update_layout(title={'text': variable.title(),'y':0.9,'x':0.5,'xanchor': 'center','yanchor': 'top'},
                 barmode='overlay',template='plotly_white')
    
    return iplot(fig)

def create_box(variable):
    tpc = df.groupby(variable).mean(numeric_only=True)  # average the score columns only
    tpc['tpc'] = tpc.index
    tpc = tpc.reset_index()
    tpc = tpc.drop(tpc.columns[0],axis=1)
    fig = go.Figure()
    b=random.choices(tpc.columns[:tpc.shape[1]-1])[0]
    
    for i in df[variable].unique():
        
        fig.add_trace(go.Box(y=df[df[variable]==i][b],name=str(i),showlegend=False))
    fig.update_yaxes(title_text=b.title()) 
    fig.update_layout(title={'text': variable.title(),'y':0.9,'x':0.5,'xanchor': 'center','yanchor': 'top'},
                 template='plotly_white')
    
    return iplot(fig)

def create_violin(variable):
    tpc = df.groupby(variable).mean(numeric_only=True)  # average the score columns only
    tpc['tpc'] = tpc.index
    tpc = tpc.reset_index()
    tpc = tpc.drop(tpc.columns[0],axis=1)
    fig = go.Figure()
    b=random.choices(tpc.columns[:tpc.shape[1]-1])[0]
    
    for i in df[variable].unique():
        
        fig.add_trace(go.Violin(y=df[df[variable]==i][b],name=str(i),showlegend=False))
    fig.update_yaxes(title_text=b.title())
    fig.update_layout(title={'text': variable.title(),
                             'y':0.9,'x':0.5,'xanchor': 'center','yanchor': 'top'},
                 template='plotly_white')
    
    return iplot(fig)

def create_pie(variable):
    
    # Count each category once and reuse the result for labels and values.
    counts = df[variable].value_counts()
    fig = go.Figure(data=[go.Pie(labels=counts.index,
                                 values=counts.values)])
    fig.update_traces(hoverinfo='value', textinfo='label', textfont_size=16,
                      textposition ='auto',showlegend=False)
    
    fig.update_layout(title={'text': variable.title(),
                             'y':0.9,'x':0.5,'xanchor': 'center','yanchor': 'top'},
                     template='simple_white')
    iplot(fig)

def create_dist(variable):
    hist_data = []
    group_labels=[]
    tpc = df.groupby(variable).mean(numeric_only=True)  # average the score columns only
    tpc['tpc'] = tpc.index
    tpc = tpc.reset_index()
    tpc = tpc.drop(tpc.columns[0],axis=1)
    b=random.choices(tpc.columns[:tpc.shape[1]-1])[0]
    # Collect one series of the chosen score column per group.
    for group in df[variable].unique():
        hist_data.append(df[df[variable] == group][b])
        group_labels.append(group)

    fig = ff.create_distplot(hist_data, group_labels, bin_size= 5)

    fig.update_layout(title={'text': b.title(),'y':0.85,'x':0.48,'xanchor': 'center',
            'yanchor': 'top'},barmode='overlay',template='plotly_white')

    iplot(fig)    

create_scatter('race/ethnicity')

In [None]:
create_box('parental level of education')

In [None]:
create_histogram('lunch',size=5)

In [None]:
create_pie('gender')

In [None]:
create_bar('race/ethnicity')