## Importing required libraries

Built on top of the Plotly JavaScript library (plotly.js), plotly enables Python users to create beautiful interactive web-based visualizations that can be displayed in Jupyter notebooks, saved to standalone HTML files, or served as part of pure Python-built web applications using Dash. The plotly Python library is sometimes referred to as "plotly.py" to differentiate it from the JavaScript library.

In [1]:
import pandas as pd
import numpy as np
import chart_studio.plotly as py
import cufflinks as cf
#for datasets
import seaborn as sns
import plotly.express as px
%matplotlib inline

#make plotly work in Jupyter notebooks
from plotly.offline import download_plotlyjs, init_notebook_mode,plot,iplot
init_notebook_mode(connected=True)

#for everything to work offline
cf.go_offline()

# BASICS

In [2]:
arr_1=np.random.randn(50,4)
df_1=pd.DataFrame(arr_1,columns=['A','B','C','D'])
df_1.head()
df_1.iplot()

# Line Plots

In [3]:
#for customized graphs/plots
import plotly.graph_objects as go
df_stocks = px.data.stocks()
# keys in `labels` must be column names, not the axis letters
px.line(df_stocks,x='date',y='GOOG',labels={'date':'Date','GOOG':'Price'})

In [4]:
# with several y columns the y-axis is labelled 'value', so that is the key to rename
px.line(df_stocks,x='date',y=['GOOG','AAPL'],labels={'date':'Date','value':'Price'},title='Apple vs Google!')

In [5]:
fig=go.Figure()
fig.add_trace(go.Scatter(x=df_stocks.date,y=df_stocks.AAPL,mode='lines',name='Apple'))
fig.add_trace(go.Scatter(x=df_stocks.date,y=df_stocks.AMZN,mode='lines+markers',name='Amazon'))
fig.add_trace(go.Scatter(x=df_stocks.date, y=df_stocks.GOOG,mode='lines+markers', name='Google',line=dict(color='firebrick', width=2, dash='dashdot')))

In [7]:
import plotly.express as px

fig = px.line(x=["a","b","c"], y=[1,3,2], title="sample figure")
print(fig)
fig.show()

Figure({
    'data': [{'hovertemplate': 'x=%{x}<br>y=%{y}<extra></extra>',
              'legendgroup': '',
              'line': {'color': '#636efa', 'dash': 'solid'},
              'mode': 'lines',
              'name': '',
              'orientation': 'v',
              'showlegend': False,
              'type': 'scatter',
              'x': [a, b, c],
              'xaxis': 'x',
              'y': array([1, 3, 2]),
              'yaxis': 'y'}],
    'layout': {'legend': {'tracegroupgap': 0},
               'template': '...',
               'title': {'text': 'sample figure'},
               'xaxis': {'anchor': 'y', 'domain': [0.0, 1.0], 'title': {'text': 'x'}},
               'yaxis': {'anchor': 'x', 'domain': [0.0, 1.0], 'title': {'text': 'y'}}}
})


In [8]:
import pandas as pd

df = pd.DataFrame({
  "Fruit": ["Apples", "Oranges", "Bananas", "Apples", "Oranges", "Bananas"],
  "Contestant": ["Alex", "Alex", "Alex", "Jordan", "Jordan", "Jordan"],
  "Number Eaten": [2, 1, 3, 1, 3, 2],
})

import plotly.express as px

fig = px.bar(df, x="Fruit", y="Number Eaten", color="Contestant", barmode="group")
fig.show()

import plotly.graph_objects as go

fig = go.Figure()
for contestant, group in df.groupby("Contestant"):
    fig.add_trace(go.Bar(x=group["Fruit"], y=group["Number Eaten"], name=contestant,
      hovertemplate="Contestant=%s<br>Fruit=%%{x}<br>Number Eaten=%%{y}<extra></extra>"% contestant))
fig.update_layout(legend_title_text = "Contestant")
fig.update_xaxes(title_text="Fruit")
fig.update_yaxes(title_text="Number Eaten")
fig.show()

In [9]:
# Import required libraries
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

In [10]:
# Read the airline data into pandas dataframe
airline_data =  pd.read_csv('https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DV0101EN-SkillsNetwork/Data%20Files/airline_data.csv', 
                            encoding = "ISO-8859-1",
                            dtype={'Div1Airport': str, 'Div1TailNum': str, 
                                   'Div2Airport': str, 'Div2TailNum': str})

In [11]:
# Preview the first 5 lines of the loaded data 
airline_data.head()

Unnamed: 0.1,Unnamed: 0,Year,Quarter,Month,DayofMonth,DayOfWeek,FlightDate,Reporting_Airline,DOT_ID_Reporting_Airline,IATA_CODE_Reporting_Airline,...,Div4WheelsOff,Div4TailNum,Div5Airport,Div5AirportID,Div5AirportSeqID,Div5WheelsOn,Div5TotalGTime,Div5LongestGTime,Div5WheelsOff,Div5TailNum
0,1295781,1998,2,4,2,4,1998-04-02,AS,19930,AS,...,,,,,,,,,,
1,1125375,2013,2,5,13,1,2013-05-13,EV,20366,EV,...,,,,,,,,,,
2,118824,1993,3,9,25,6,1993-09-25,UA,19977,UA,...,,,,,,,,,,
3,634825,1994,4,11,12,6,1994-11-12,HP,19991,HP,...,,,,,,,,,,
4,1888125,2017,3,8,17,4,2017-08-17,UA,19977,UA,...,,,,,,,,,,


In [12]:
# Randomly sample 500 data points. Set random_state to 42 so the same rows are drawn on every run.
data = airline_data.sample(n=500, random_state=42)

In [13]:
# Get the shape of the trimmed data
data.shape

(500, 110)
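A small check of why `random_state` matters, using a made-up frame in place of the airline table (the column values here are hypothetical): two calls with the same seed return the same rows.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the airline table.
df = pd.DataFrame({"Distance": np.arange(1000), "DepTime": np.arange(1000) % 2400})

first = df.sample(n=5, random_state=42)
second = df.sample(n=5, random_state=42)

# Same seed, same rows, same order.
print(first.equals(second))
print(first.shape)
```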

#### Idea: How departure time changes with respect to airport distance

In [16]:
# First we create a figure using go.Figure and add a trace to it through go.Scatter
fig = go.Figure(data=go.Scatter(x=data['Distance'], y=data['DepTime'], mode='markers', marker=dict(color='red')))

In [17]:
# Update the layout through `update_layout`: add a title to the plot and titles to the x and y axes.
fig.update_layout(title='Distance vs Departure Time', xaxis_title='Distance', yaxis_title='DepTime')
# Display the figure
fig.show()

## 2. Line Plot


#### Idea: Extract average monthly arrival delay time and see how it changes over the year.

In [30]:
# Group the data by Month and compute average over arrival delay time.
line_data = data.groupby('Month')['ArrDelay'].mean().reset_index()
line_data

Unnamed: 0,Month,ArrDelay
0,1,2.232558
1,2,2.6875
2,3,10.868421
3,4,6.229167
4,5,-0.27907
5,6,17.310345
6,7,5.088889
7,8,3.121951
8,9,9.081081
9,10,1.2
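The grouping step can be checked by hand on a tiny frame. The values below are invented for illustration; note that `mean()` skips missing values rather than treating them as zero.

```python
import pandas as pd

# Hypothetical delays: two flights in month 1, three in month 2 (one missing).
toy = pd.DataFrame({
    "Month": [1, 1, 2, 2, 2],
    "ArrDelay": [10.0, 20.0, -5.0, 5.0, None],
})

# Same pattern as line_data: one row per month with the mean delay.
monthly = toy.groupby("Month")["ArrDelay"].mean().reset_index()
print(monthly)
```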


#### To do:

*   Create a line plot with the month on the x-axis and the computed average delay time on the y-axis. Update the plot title
    and the x-axis and y-axis titles.

*   Hint: A scatter plot and a line plot differ only in the value of the `mode` parameter.


In [34]:
# mode flags are joined with '+'; 'lines+markers' draws the line and the points
fig = go.Figure(data=go.Scatter(x=line_data['Month'], y=line_data['ArrDelay'], mode='lines+markers', marker=dict(color='green')))
fig.update_layout(title='Month vs Average Flight Delay Time', xaxis_title='Month', yaxis_title='ArrDelay')
fig.show()

In [35]:
# Group the data by destination state. Compute the total number of flights to each state
bar_data=data.groupby(['DestState'])['Flights'].sum().reset_index()
bar_data.head()

Unnamed: 0,DestState,Flights
0,AK,4.0
1,AL,3.0
2,AZ,8.0
3,CA,68.0
4,CO,20.0


In [37]:
# Use plotly express bar chart function px.bar. Provide input data, x and y axis variable, and title of the chart.
# This will give total number of flights to the destination state.
fig = px.bar(bar_data, x="DestState", y="Flights", title='Total number of flights to the destination state')
fig.show()

# Bubble chart

In [38]:
# Group the data by reporting airline and get number of flights
bub_data = data.groupby('Reporting_Airline')['Flights'].sum().reset_index()

In [39]:
bub_data.head()

Unnamed: 0,Reporting_Airline,Flights
0,9E,5.0
1,AA,57.0
2,AS,14.0
3,B6,10.0
4,CO,12.0


**To do**

*   Create a bubble chart using the `bub_data` with x-axis being reporting airline and y-axis being flights.
*   Provide a title for the chart.
*   Set the size of each bubble from the number of flights. Use the `size` parameter.
*   Set the name shown in the hover tooltip to the `Reporting_Airline` column using the `hover_name` parameter.


In [43]:
# Create bubble chart here

fig = px.scatter(bub_data, x="Reporting_Airline", y="Flights", size="Flights",hover_name="Reporting_Airline", title='Reporting Airline vs Number of Flights', size_max=60)
fig.show()

# Histogram


In [50]:
# Set missing values to 0
data['ArrDelay'] = data['ArrDelay'].fillna(0)

**To do**

*   Use px.histogram and pass the dataset.
*   Pass `ArrDelay` to x parameter.


In [51]:
# Create histogram here
fig=px.histogram(data,x="ArrDelay")
fig.show()
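The fill step before the histogram changes what gets counted: every missing delay lands in the bin containing 0. The sketch below, on invented values, compares that with dropping missing values instead; which choice is appropriate depends on what a missing `ArrDelay` means in the data.

```python
import pandas as pd

# Hypothetical delays with two missing entries.
delays = pd.Series([-3.0, 12.0, None, 45.0, None], name="ArrDelay")

filled = delays.fillna(0)   # missing values become 0 and are counted
dropped = delays.dropna()   # missing values are left out entirely

zeros_after_fill = int((filled == 0).sum())
print(zeros_after_fill, len(filled), len(dropped))
```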

# Pie Chart


#### Idea: Proportion of distance group by month (month indicated by numbers)

In [53]:
# Use px.pie function to create the chart. Input dataset. 
# Values parameter will set values associated to the sector. 'Month' feature is passed to it.
# labels for the sector are passed to the `names` parameter.
fig = px.pie(data, values='Month', names='DistanceGroup', title='Distance group proportion by month')
fig.show()
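To see what each sector of the pie above represents: rows that share a `names` label are combined into one sector, and their `values` are added up. With `values='Month'` that means each sector is a sum of month numbers. The sketch below reproduces that arithmetic with pandas on a made-up frame.

```python
import pandas as pd

# Hypothetical rows: one per flight, with its distance group and month number.
toy = pd.DataFrame({
    "DistanceGroup": [1, 1, 2, 3, 3, 3],
    "Month": [1, 12, 6, 2, 3, 4],
})

# Sum of month numbers per distance group, then each group's share of the total.
sector_totals = toy.groupby("DistanceGroup")["Month"].sum()
sector_share = sector_totals / sector_totals.sum()
print(sector_share.round(3))
```

Because month numbers are summed, a December row weighs twelve times as much as a January row; passing a count column such as `Flights` to `values` would give shares by number of flights instead.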

# Sunburst Charts


#### Idea: Hierarchical view in the order of month and destination state, holding the number of flights as the value


**To do**

*   Create sunburst chart using `px.sunburst`.
*   Define hierarchy of sectors from root to leaves in `path` parameter. Here, we go from `Month` to `DestStateName` feature.
*   Set sector values in the `values` parameter. Here, we can pass in the `Flights` feature.
*   Show the figure.


In [56]:
# Create sunburst chart here
fig=px.sunburst(data,path=['Month','DestStateName'],values='Flights')
fig.show()