# **Visualization with Plotly**


## Objectives

Perform exploratory data analysis and feature engineering using `Pandas`, `Matplotlib` and `Plotly`.

*   Exploratory Data Analysis


__Colors available for 'color_continuous_scale'__ : brbg, bluyl, rdylbu, earth, prgn, speed, sunsetdark, armyrose

__Basic Colors__ : 'gray', 'blue', 'white', 'lightgreen', 'pink', 'black', 'green', 'red', 'lightgray', 'lightred', 'darkblue', 'darkred', 'purple', 'orange', 'darkpurple', 'lightblue', 'cadetblue', 'beige', 'darkgreen'

### Import Libraries and Define Auxiliary Functions


We will import the following libraries for the lab:


In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import plotly.express as px
import plotly.graph_objects as go

## Exploratory Data Analysis


First, let's read the SpaceX dataset into a Pandas dataframe and print its first five rows.


In [2]:
df=pd.read_csv("https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DS0321EN-SkillsNetwork/datasets/dataset_part_2.csv")
df.head(5)

| | FlightNumber | Date | BoosterVersion | PayloadMass | Orbit | LaunchSite | Outcome | Flights | GridFins | Reused | Legs | LandingPad | Block | ReusedCount | Serial | Longitude | Latitude | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 2010-06-04 | Falcon 9 | 6104.959412 | LEO | CCAFS SLC 40 | None None | 1 | False | False | False | | 1.0 | 0 | B0003 | -80.577366 | 28.561857 | 0 |
| 1 | 2 | 2012-05-22 | Falcon 9 | 525.0 | LEO | CCAFS SLC 40 | None None | 1 | False | False | False | | 1.0 | 0 | B0005 | -80.577366 | 28.561857 | 0 |
| 2 | 3 | 2013-03-01 | Falcon 9 | 677.0 | ISS | CCAFS SLC 40 | None None | 1 | False | False | False | | 1.0 | 0 | B0007 | -80.577366 | 28.561857 | 0 |
| 3 | 4 | 2013-09-29 | Falcon 9 | 500.0 | PO | VAFB SLC 4E | False Ocean | 1 | False | False | False | | 1.0 | 0 | B1003 | -120.610829 | 34.632093 | 0 |
| 4 | 5 | 2013-12-03 | Falcon 9 | 3170.0 | GTO | CCAFS SLC 40 | None None | 1 | False | False | False | | 1.0 | 0 | B1004 | -80.577366 | 28.561857 | 0 |


First, let's see how the `FlightNumber` (indicating the continuous launch attempts) and `PayloadMass` variables affect the launch outcome.

We can plot <code>FlightNumber</code> vs. <code>PayloadMass</code> and overlay the outcome of the launch. We see that as the flight number increases, the first stage is more likely to land successfully. The payload mass is also important; it seems that the more massive the payload, the less likely the first stage will return.


We see that different launch sites have different success rates. <code>CCAFS LC-40</code> has a success rate of 60%, while <code>KSC LC-39A</code> and <code>VAFB SLC 4E</code> have a success rate of 77%.


In [3]:
fig = px.scatter(df, x="FlightNumber", y="PayloadMass", color="Class", size='PayloadMass', hover_data=['PayloadMass'], color_continuous_scale='rdylgn')
fig.update_layout(title='Flight Number vs. Payload Mass', xaxis_title='Flight Number', yaxis_title='Payload Mass (Kg)')
fig.show()
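
The per-site success rates mentioned earlier can be computed rather than read off a plot. The following is a minimal sketch on made-up data: the frame `toy` and the site names `Site A` and `Site B` are hypothetical stand-ins for `df` and its real launch sites.

```python
import pandas as pd

# Hypothetical toy data standing in for `df` (two made-up sites)
toy = pd.DataFrame({
    "LaunchSite": ["Site A", "Site A", "Site B", "Site B", "Site B", "Site B"],
    "Class":      [1, 0, 1, 1, 1, 0],
})

# Class is 0/1, so its mean per site is that site's success rate
site_success = toy.groupby("LaunchSite")["Class"].mean()
print(site_success)
```

On the real dataframe the equivalent call would be `df.groupby("LaunchSite")["Class"].mean()`.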

Next, let's drill down to each site and visualize its detailed launch records.


### TASK 1: Visualize the relationship between Flight Number and Launch Site


Use the function <code>catplot</code> to plot <code>FlightNumber</code> vs. <code>LaunchSite</code>: set the <code>x</code> parameter to <code>FlightNumber</code>, the <code>y</code> parameter to <code>LaunchSite</code>, and the <code>hue</code> parameter to <code>Class</code>. The cells below use Plotly's <code>px.scatter</code> instead, where the <code>color</code> argument plays the role of <code>hue</code>.


Now try to explain the patterns you found in the Flight Number vs. Launch Site scatter plots.


In [4]:
fig = px.scatter(df, x="FlightNumber", y="LaunchSite", color="Class", hover_data=['PayloadMass'], color_continuous_scale='rdylgn')
fig.update_layout(title='Flight Number vs. Launch Site', xaxis_title='Flight Number', yaxis_title='Launch Site')
fig.show()

In [5]:
fig = px.scatter(df, x="FlightNumber", y="LaunchSite", color="Class", size='PayloadMass', hover_data=['PayloadMass'], color_continuous_scale='rdylgn', opacity=.6)
fig.update_layout(title='Flight Number vs. Launch Site', xaxis_title='Flight Number', yaxis_title='Launch Site')
fig.show()

In [6]:
fig = px.scatter(df, x="FlightNumber", y="LaunchSite", color="Class", size='PayloadMass', hover_data=['PayloadMass'], color_continuous_scale='rdylgn', opacity=.6, facet_row="Class",height=400)
fig.update_layout(title='Flight Number vs. Launch Site', xaxis_title='Flight Number', yaxis_title='Launch Site')
fig.show()

### Explain the patterns found in the Flight Number vs. Launch Site plots

> - **As the flight number increases (beyond about 40), the success rate increases.**
> - _But there is no clear pattern showing that the effect of flight number on launch success depends on the launch site._
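
The "after flight 40" observation can be checked numerically by splitting launches at that threshold and comparing mean outcomes. This is a sketch on hypothetical data: `toy` and its values are invented for illustration, and the threshold of 40 is taken from the note above.

```python
import pandas as pd

# Hypothetical toy data standing in for `df`
toy = pd.DataFrame({
    "FlightNumber": [10, 20, 30, 45, 50, 60],
    "Class":        [0, 1, 0, 1, 1, 1],
})

# Label each launch by whether it came after flight 40
phase = (toy["FlightNumber"] > 40).map({False: "flights 1-40", True: "flights 41+"})

# Mean of the 0/1 Class column per phase is the success rate in that phase
rate_by_phase = toy.groupby(phase)["Class"].mean()
print(rate_by_phase)
```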



***

### TASK 2: Visualize the relationship between Payload and Launch Site


We also want to observe if there is any relationship between launch sites and their payload mass.


Now try to explain any patterns you found in the Payload vs. Launch Site scatter plots.


In [7]:
fig = px.scatter(df, x="PayloadMass", y="LaunchSite", color="Class", size='PayloadMass', hover_data=['PayloadMass'],  color_continuous_scale='rdylgn', opacity=.6)
fig.update_layout(title='Payload Vs. Launch Site', xaxis_title='Payload Mass (kg)', yaxis_title='Launch Site')
fig.show()

In [8]:
fig = px.scatter(df, x="PayloadMass", y="LaunchSite", color="Class", size='PayloadMass', facet_row="Class", hover_data=['PayloadMass'],  color_continuous_scale='rdylgn', opacity=.6, height=400)
fig.update_layout(title='Payload Vs. Launch Site', xaxis_title='Payload Mass (kg)', yaxis_title='Launch Site')
fig.show()

### Explain the patterns found in the Payload vs. Launch Site plots

> **For payloads heavier than 8000 kg, the success rate is higher.** _But there is no clear pattern showing that the effect of payload mass on launch success depends on the launch site._
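
One way to test the payload observation is to bin `PayloadMass` and compare the success rate per bin. The sketch below uses invented data; the frame `toy`, the bin edges, and the labels are assumptions chosen only so that 8000 kg is one of the boundaries.

```python
import pandas as pd

# Hypothetical toy data standing in for `df`
toy = pd.DataFrame({
    "PayloadMass": [500, 3000, 6000, 9000, 10000, 15000],
    "Class":       [0, 1, 0, 1, 1, 1],
})

# Bin payloads into (0, 4000], (4000, 8000] and (8000, inf)
bins = [0, 4000, 8000, float("inf")]
labels = ["<=4000", "4000-8000", ">8000"]
payload_bin = pd.cut(toy["PayloadMass"], bins=bins, labels=labels)

# Success rate within each payload bin
rate_by_payload = toy.groupby(payload_bin, observed=True)["Class"].mean()
print(rate_by_payload)
```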

***

### TASK 3: Visualize the relationship between success rate and orbit type


Next, we want to visually check if there is any relationship between success rate and orbit type.


Let's create a `bar chart` for the success rate of each orbit.


Analyze the plotted bar chart and try to find which orbits have a high success rate.


In [9]:
xh =  df.groupby(['Orbit'], as_index=False)['Class'].mean()
xh.sort_values(['Class'], inplace=True)
xh

| | Orbit | Class |
|---|---|---|
| 8 | SO | 0.0 |
| 2 | GTO | 0.518519 |
| 4 | ISS | 0.619048 |
| 6 | MEO | 0.666667 |
| 7 | PO | 0.666667 |
| 5 | LEO | 0.714286 |
| 10 | VLEO | 0.857143 |
| 3 | HEO | 1.0 |
| 1 | GEO | 1.0 |
| 0 | ES-L1 | 1.0 |


In [10]:
fig = px.bar(xh, x='Orbit', y='Class', hover_data=['Orbit', 'Class'], color='Class', height=400, color_continuous_scale='teal')
fig.update_layout(title='Success Rate vs. Orbit Type', xaxis_title='Orbit', yaxis_title='Success Rate' )
fig.show()

### Explain the patterns: which orbits have a high success rate

> **ES-L1, GEO, HEO and SSO have the highest success rates.** _SO has the lowest._
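
A success rate is easier to interpret next to the number of launches behind it, since a rate of 0 or 1 may rest on very few launches. The sketch below shows one way to get both at once with `agg`; the frame `toy` and its rows are made up and do not reproduce the real counts.

```python
import pandas as pd

# Hypothetical toy data standing in for `df`
toy = pd.DataFrame({
    "Orbit": ["GEO", "GTO", "GTO", "GTO", "GTO", "LEO", "LEO"],
    "Class": [1, 1, 0, 1, 0, 1, 1],
})

# Success rate (mean) and number of launches (count) per orbit
orbit_summary = toy.groupby("Orbit")["Class"].agg(["mean", "count"])
print(orbit_summary.sort_values("mean"))
```

In this toy frame, `GEO` reaches a rate of 1.0 from a single launch, which is weaker evidence than the same rate from many launches.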

***

### TASK 4: Visualize the relationship between FlightNumber and Orbit type


For each orbit, we want to see if there is any relationship between FlightNumber and Orbit type.


In [11]:
fig = px.scatter(df, x="Orbit", y="FlightNumber", color="Class", size='PayloadMass', hover_data=['PayloadMass'], height=600, color_continuous_scale='rdylgn', opacity=.6)
fig.update_layout(title='FlightNumber Vs. Orbit type', xaxis_title='Orbit', yaxis_title='Flight Number')
fig.show()

In [12]:
fig = px.scatter(df, y="Orbit", x="FlightNumber", color="Class", size='PayloadMass', hover_data=['PayloadMass'], height=500, color_continuous_scale='rdylgn', opacity=.6)
fig.update_layout(title='FlightNumber Vs. Orbit Type', yaxis_title='Orbit', xaxis_title='Flight Number')
fig.show()

You should see that in the LEO orbit, success appears to be related to the flight number; on the other hand, there seems to be no relationship between flight number and success in the GTO orbit.
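
A rough numeric counterpart to this visual reading is the correlation between `FlightNumber` and `Class` within each orbit. The following sketch uses invented values arranged so that one orbit shows a positive relationship and the other shows none; `toy` and `corr_by_orbit` are illustrative names, not part of the lab.

```python
import pandas as pd

# Hypothetical toy data standing in for `df`
toy = pd.DataFrame({
    "Orbit":        ["LEO", "LEO", "LEO", "LEO", "GTO", "GTO", "GTO", "GTO"],
    "FlightNumber": [1, 2, 3, 4, 5, 6, 7, 8],
    "Class":        [0, 0, 1, 1, 0, 1, 1, 0],
})

# Pearson correlation between flight number and outcome, computed per orbit
corr_by_orbit = {
    orbit: group["FlightNumber"].corr(group["Class"])
    for orbit, group in toy.groupby("Orbit")
}
print(corr_by_orbit)
```

With a binary outcome and few launches per orbit, such a correlation is only a coarse indicator and should be read alongside the scatter plots.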


***

### TASK 5: Visualize the relationship between Payload and Orbit type


Similarly, we can plot Payload vs. Orbit scatter charts to reveal the relationship between payload and orbit type.


In [13]:
fig = px.scatter(df, x="Orbit", y="PayloadMass", color="Class", size='PayloadMass', hover_data=['PayloadMass'], color_continuous_scale='rdylgn', height=600, opacity=.6)
fig.update_layout(title='Payload Vs. Orbit Type', xaxis_title='Orbit', yaxis_title='Payload Mass')
fig.show()

In [14]:
fig = px.scatter(df, y="Orbit", x="PayloadMass", color="Class", size='PayloadMass', hover_data=['PayloadMass'], color_continuous_scale='rdylgn', height=450, opacity=.6)
fig.update_layout(title='Payload Vs. Orbit Type', yaxis_title='Orbit', xaxis_title='Payload Mass')
fig.show()

In [15]:
fig = px.scatter(df, x="Orbit", y="PayloadMass", color="Class", facet_row="Class", size='PayloadMass', hover_data=['PayloadMass'], color_continuous_scale='rdylgn', height=800)
fig.update_layout(title='Payload Vs. Orbit type', xaxis_title='Orbit', yaxis_title='Payload Mass')
fig.show()

In [16]:
fig = px.scatter(df, x="Orbit", y="PayloadMass", color="Class", facet_col="Class", size='PayloadMass', hover_data=['PayloadMass'], color_continuous_scale='rdylgn', height=500, width=1000)
fig.update_layout(title='Payload Vs. Orbit type', xaxis_title='Orbit', yaxis_title='Payload Mass (Kg)')
fig.show()

You should observe that heavy payloads have a negative influence on GTO orbits and a positive influence on Polar, LEO and ISS orbits.


### TASK 6: Visualize the launch success yearly trend


You can plot a line chart with <code>Year</code> on the x axis and the average success rate on the y axis to see the launch success trend over time.


The following function extracts the year from each date:


In [17]:
# Extract the year (as a string) from each "YYYY-MM-DD" date.
# Returns a fresh list on every call, so re-running the cell does not accumulate values.
def Extract_year(dates):
    return [date.split("-")[0] for date in dates]
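
As an alternative to splitting strings, pandas can parse the dates and expose the year directly. This is a sketch on a hypothetical three-row frame `toy`; note that it yields integer years, whereas `Extract_year` returns strings.

```python
import pandas as pd

# Hypothetical toy data standing in for the `Date` column of `df`
toy = pd.DataFrame({"Date": ["2010-06-04", "2012-05-22", "2013-03-01"]})

# Parse the strings as datetimes, then take the year component
toy["year"] = pd.to_datetime(toy["Date"]).dt.year
print(toy)
```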

In [18]:
# Add the extracted year as a column and compute the average success rate per year
df['year']=Extract_year(df["Date"])
df_groupby_year=df.groupby("year",as_index=False)["Class"].mean()

In [19]:
fig = px.line(df_groupby_year, x="year", y="Class", text="year", height=400)
fig.update_layout(title='Space X Rocket Success Rates', xaxis_title='Year', yaxis_title='Success Rate')
fig.update_traces(textposition="bottom right")
fig.show()

In [20]:
fig = px.line(df_groupby_year, x="year", y="Class", text="year")
fig.update_layout(title='Space X Rocket Success Rates', xaxis_title='Year', yaxis_title='Success Rate', hovermode='x unified')
fig.update_traces(textposition="bottom right")
fig.show()

You can observe that the success rate kept increasing from 2013 until 2020.
