# Space Missions Exploration

<hr>

In [None]:
import numpy as np
import pandas as pd

In [None]:
import plotly.express as px
import plotly.graph_objects as go

from plotly.subplots import make_subplots

In [None]:
space_data = pd.read_csv("../input/all-space-missions-from-1957/Space_Corrected.csv")

## Data
<br>

**The data contains the details of all Space Missions from October 1957 - August 2020.**<br>**Let's take a look at the first few rows in the data.**

In [None]:
space_data.head()

Let's drop the unnamed columns and rename a few columns for convenience.

In [None]:
space_data = space_data.drop(["Unnamed: 0", "Unnamed: 0.1"], axis=1)

space_data.rename(columns={"Company Name": "Company", "Datum": "Date & Time", "Detail": "Rocket Name",
                           "Status Rocket": "Rocket Status", " Rocket": "Mission Cost", "Status Mission": "Mission Status"},
                  inplace=True)

In [None]:
space_data.head()

**There are 4324 rows and 7 columns in the data. Each row corresponds to a mission.**<br><br>**The column details :**

* **Company** - name of the space company
* **Location** - location of the launch
* **Date & Time** - date and time of the launch
* **Rocket Name** - name of the rocket
* **Rocket Status** - status of the rocket
* **Mission Cost** - cost of the mission in million $
* **Mission Status** - status of the mission

In [None]:
space_data["Date & Time"] = pd.to_datetime(space_data["Date & Time"], utc=True).dt.tz_localize(None)

space_data["Rocket Status"] = space_data["Rocket Status"].replace({"StatusActive": "Active", "StatusRetired": "Retired"})

space_data["Mission Cost"] = space_data["Mission Cost"].str.replace(",", "").astype(float)
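
The cleaning steps above can be tried on a couple of hand-written rows. This is only a minimal sketch: the two sample rows are made up to imitate the raw columns and are not taken from the dataset.

```python
import pandas as pd

# Two hypothetical rows that imitate the raw "Date & Time" and "Mission Cost" columns.
sample = pd.DataFrame({
    "Date & Time": ["Fri Aug 07, 2020 05:12 UTC", "Thu Aug 06, 2020 04:01 UTC"],
    "Mission Cost": ["1,160.0", None],
})

# Parse as UTC, then drop the timezone so the column is timezone-naive.
sample["Date & Time"] = pd.to_datetime(sample["Date & Time"], utc=True).dt.tz_localize(None)

# Remove thousands separators before converting to float; missing costs stay NaN.
sample["Mission Cost"] = sample["Mission Cost"].str.replace(",", "").astype(float)

print(sample.dtypes)
```

If the real column mixes several date layouts, newer pandas versions may need an explicit `format` argument; that depends on the installed version and is not something the notebook itself states.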

<hr>

## You are now entering the Universe of Graphs.......

You can take a quick look at the [Plotly Graph Instructions](https://www.kaggle.com/swashbuckler1/plotly-graph-instructions) notebook for instructions on how to interact with the different plotly graphs.

In [None]:
comp_mission_count = space_data["Company"].value_counts()
comp_mission_count = comp_mission_count[comp_mission_count >= 50]

In [None]:
comp_mission_count_df = pd.DataFrame({"Company": comp_mission_count.index, "No. of Missions": comp_mission_count.values})

comp_mission_count_df.sort_values(by=["Company"], inplace=True)
comp_mission_count_df.reset_index(drop=True, inplace=True)

comp_mission_count_df.loc[comp_mission_count_df.index.max() + 1] = ["Others", space_data.shape[0] - comp_mission_count_df["No. of Missions"].sum()]
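
The "Others" bucket built above can be illustrated on a tiny made-up series. The company labels and the threshold of 3 are assumptions chosen for the sketch; the notebook itself uses a threshold of 50.

```python
import pandas as pd

# Hypothetical company labels; "A" and "B" are frequent, the rest are rare.
companies = pd.Series(["A"] * 5 + ["B"] * 3 + ["C", "D"])

counts = companies.value_counts()
major = counts[counts >= 3]  # keep categories at or above the threshold

summary = pd.DataFrame({"Company": major.index, "No. of Missions": major.values})
summary = summary.sort_values("Company").reset_index(drop=True)

# Everything below the threshold is folded into a single "Others" row.
summary.loc[len(summary)] = ["Others", len(companies) - summary["No. of Missions"].sum()]

print(summary)
```

Folding the long tail into one slice keeps the pie chart readable while the slice sizes still add up to the total number of rows.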

In [None]:
fig = go.Figure(data=[go.Pie(labels=comp_mission_count_df["Company"],
                             values=comp_mission_count_df["No. of Missions"],
                             textinfo='label+percent',
                             textposition="inside",
                    hovertemplate="%{label}<br>No. of Missions: %{value}<br>Proportion: %{percent}<extra></extra>"
                            )])

fig.update_layout(title={"text": "Frequency of Missions by Space Company", "x": 0.5},
                  legend_title="Company")

fig.show()

**Facts :**
* **RVSN USSR** has launched the most missions, with **1777**.<br><br>
* **Arianespace**, **CASC** and **General Dynamics** have each launched over **250** missions.<br><br>
* **NASA** and **VKS RF** have each launched over **200** missions.<br><br>
* **RVSN USSR** has launched a staggering **41.1%** of all missions, which is more than **Arianespace**, **CASC**, **General Dynamics**,<br> **NASA** and **VKS RF** combined.
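
The 41.1% share quoted above follows directly from two figures given in this notebook (1777 missions out of 4324 rows). A quick check:

```python
# Figures quoted in this notebook: 1777 RVSN USSR missions out of 4324 rows.
rvsn_missions = 1777
total_missions = 4324

share = rvsn_missions / total_missions * 100
print(f"{share:.1f}%")
```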

<hr>

In [None]:
# The last comma-separated token of "Location" is treated as the country.
# "New Mexico" is a US state, so it is folded into "USA" before counting.
loc_launch = space_data["Location"].str.split(", ").str[-1].replace({"New Mexico": "USA"}).value_counts()

In [None]:
country_code_dict = {"Russia": "RUS", "USA": "USA", "Kazakhstan": "KAZ", "France": "FRA", "China": "CHN", "Japan": "JPN",
                     "India": "IND", "New Zealand": "NZL", "Iran": "IRN", "Israel": "ISR", "Kenya": "KEN",
                     "Australia": "AUS", "North Korea": "PRK", "Brazil": "BRA", "South Korea": "KOR"}

In [None]:
loc_launch_df = pd.DataFrame({"Country": loc_launch.index,
                             "No. of Launches": loc_launch.values, 
                              "Country Code": loc_launch.index.map(country_code_dict),})

loc_launch_df.dropna(axis=0, subset=["Country Code"], inplace=True)
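
The country extraction and ISO-code lookup can be tried on a few made-up location strings. The strings and the two-entry lookup below are illustrative assumptions, not rows from the dataset.

```python
import pandas as pd

# Hypothetical location strings in the same "site, region, country" shape as the data.
locations = pd.Series([
    "LC-39A, Kennedy Space Center, Florida, USA",
    "Site 31/6, Baikonur Cosmodrome, Kazakhstan",
    "SLC-4E, Vandenberg AFB, California, USA",
    "Launch Platform, Pacific Ocean",
])

# The last comma-separated token is treated as the country.
country = locations.str.split(", ").str[-1]
counts = country.value_counts()

# Illustrative subset of the ISO-3 lookup; unmapped labels become NaN and are dropped.
iso3 = {"USA": "USA", "Kazakhstan": "KAZ"}
table = pd.DataFrame({
    "Country": counts.index,
    "No. of Launches": counts.values,
    "Country Code": counts.index.map(iso3),
}).dropna(subset=["Country Code"])

unmapped = len(locations) - table["No. of Launches"].sum()
print(table)
print("Unmapped launches:", unmapped)
```

Labels such as sea or ocean platforms have no country code, which is why they drop out of the map and are only reported as a combined count.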

In [None]:
fig = go.Figure(data=go.Choropleth(
    locations = loc_launch_df['Country Code'],
    z = loc_launch_df['No. of Launches'],
    text = loc_launch_df['Country'],
    reversescale=True,
    marker_line_color='black',
    marker_line_width=1,
    hovertemplate="%{text}<br>No. of Missions: %{z}<br><extra></extra>",
    colorbar_title = 'No. of Missions',
))

fig.update_layout(
    title={"text": "Frequency of Missions by Location", "x": 0.5},
    geo=dict(
        showframe=False,
        showcoastlines=False,
        showocean=True,
        oceancolor="cornflowerblue",
        showland=True,
        landcolor="mediumspringgreen",
        projection_type='orthographic'
    )
)

annot_text = f"A few regions<br>which haven't been marked on the map<br>have together launched {len(space_data) - loc_launch_df['No. of Launches'].sum()} missions."

fig.add_annotation(
        x=0.86,
        y=0,
        xref="paper",
        yref="paper",
        text=annot_text,
        font=dict(
            size=13,
            color="#ffffff"
            ),
        borderpad=5,
        bgcolor="#000000",
        opacity=0.7
    )


fig.show()

**Facts :**
* **Russia** and **USA** have launched the most missions, with over **1300** each.<br><br>
* **Kazakhstan** has launched over **700** missions.<br><br>
* **France** has launched over **300** missions while **China** and **Japan** have launched over **250** and over **100** missions respectively.<br><br>
* **India** has launched over **70** missions while all the remaining locations have launched fewer than **15** missions each.<br><br>
* A few other regions like **Pacific Ocean**, **Barents Sea**, **Gran Canaria** etc., which haven't been marked on the map, have together launched **44** missions.

<hr>

In [None]:
mission_date = pd.DataFrame({"Year": space_data["Date & Time"].dt.year,
                            "Month": space_data["Date & Time"].dt.month})

In [None]:
# Map months 1-3 to "Q1", 4-6 to "Q2", 7-9 to "Q3" and 10-12 to "Q4".
mission_date["Quarter"] = "Q" + ((mission_date["Month"] - 1) // 3 + 1).astype(str)
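
The month-to-quarter mapping can be checked against the quarter accessor that pandas provides on datetime columns. The four dates below are made up, one per quarter.

```python
import pandas as pd

# Hypothetical launch dates, one per quarter.
dates = pd.Series(pd.to_datetime(["2020-01-15", "2020-05-30", "2020-08-07", "2020-12-01"]))

# Months 1-3 -> Q1, 4-6 -> Q2, 7-9 -> Q3, 10-12 -> Q4.
by_arithmetic = "Q" + ((dates.dt.month - 1) // 3 + 1).astype(str)

# pandas also exposes the quarter number directly.
by_accessor = "Q" + dates.dt.quarter.astype(str)

print(by_arithmetic.tolist())
print(by_accessor.tolist())
```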

In [None]:
year_launches = mission_date["Year"].value_counts().sort_index()

In [None]:
fig = go.Figure(data=[
    go.Scatter(x=year_launches.index, y=year_launches.values,
                    mode='lines+markers', line=dict(color='black'),
               hovertemplate="Year: %{x}<br>No. of Missions: %{y}<br><extra></extra>"
            )
])

fig.update_layout(title={"text": "Frequency of Missions by Year", "x": 0.5},
                 xaxis_title="Year", yaxis_title="No. of Missions", 
                  hovermode="x", plot_bgcolor="mediumspringgreen")

fig.update_xaxes(showgrid=False, tickvals=list(range(1960,2021, 5)),
                 linecolor="black", linewidth=2, mirror=True)
fig.update_yaxes(showgrid=False, zeroline=False, linecolor="black", linewidth=2, mirror=True)

fig.show()

**Facts :**

* The year **1957**, in which the **first ever space mission** was launched, saw the fewest missions, with only **3**.<br><br>
* Each year from **1965** to **1978** saw around **100** space missions, including **119** missions in **1971**, the **most missions** ever launched in a single year.<br><br>
* From **1979** to **2015**, the number of missions each year was lower than it was from **1965** to **1978**.<br><br>
* From **2016 to 2019**, the number of missions each year is around **100**, and **2020** will most likely keep pace once it ends:<br> over **60** missions have been launched as of the first week of **August**.<br><br>
* In **2018**, a whopping **117** missions were launched, the most in a year since **1971**.

<hr>

In [None]:
month_dict = {1: "January", 2: "February", 3: "March", 4: "April", 5: "May", 6: "June",
              7: "July", 8: "August", 9: "September", 10: "October", 11: "November", 12: "December"}

In [None]:
month_launches = mission_date["Month"].value_counts().sort_index()

In [None]:
fig = go.Figure(data=[
    go.Scatter(x=month_launches.index.map(month_dict), 
               y=month_launches.values, mode='markers', 
               marker=dict(size=month_launches.values//10, color=month_launches.values, showscale=True,
                           colorscale="hot", reversescale=True, line_width=2, line_color="black"),
               hovertemplate="Month: %{x}<br>No. of Missions: %{y}<br><extra></extra>"
            )
])

fig.update_layout(title={"text": "Frequency of Missions by Month", "x": 0.5},
                 xaxis_title="Month", yaxis_title="No. of Missions", plot_bgcolor="#E3F2FD")

fig.update_xaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)

fig.show()

**Facts :**

* **December** has seen the most missions, while **January**, the very next month, has surprisingly seen the fewest.

<hr>

In [None]:
month_year_launches = pd.pivot_table(index="Year", columns="Month", values="Quarter", aggfunc=len, data=mission_date)

month_year_launches = month_year_launches.fillna(0).astype(int)
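
A year-by-month count table like the one above can also be built with `groupby`, `size` and `unstack`, which should give the same counts as the pivot. The sketch below uses made-up (year, month) pairs in place of `mission_date`.

```python
import pandas as pd

# Hypothetical (year, month) pairs standing in for mission_date.
toy = pd.DataFrame({
    "Year":  [1971, 1971, 1971, 2018, 2018],
    "Month": [12,   12,   4,    12,   1],
})

# Count rows per (Year, Month) and spread months across the columns;
# combinations that never occur are filled with 0.
counts = toy.groupby(["Year", "Month"]).size().unstack(fill_value=0)

print(counts)
```

Each cell of the result is what the heatmap colours: the number of launches in that month of that year.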

In [None]:
fig = go.Figure(data=go.Heatmap(
                    z=month_year_launches.values,
                    x=month_year_launches.columns,
                    y=month_year_launches.index,
                    colorbar={"title": "No. of Missions"},
                    colorscale="dense",
                    hovertemplate='%{x}, %{y}<br>No. of Missions: %{z}<extra></extra>'))

fig.update_xaxes(tickvals = list(month_dict.keys()), ticktext = list(month_dict.values()),
                 linecolor="black", linewidth=2, mirror=True)
fig.update_yaxes(tickvals=list(range(1960,2021, 5)), linecolor="black", linewidth=2, mirror=True)

fig.update_layout(title={"text": "Frequency of Missions by Month and Year", "x": 0.5},
                  xaxis_title="Month", yaxis_title="Year", height=600)


fig.show()

**Facts :**

* The most missions in a single month were launched in **December 1971**, with **18**.<br><br>
* At least **15** missions were launched in each of **April 1968**, **October 1970**, **April 1975**, **September 1975**, **July 1976**, **September 1977**, <br>**December 2018** and **December 2019**.

<hr>

In [None]:
quarter_year_launches = pd.pivot_table(index="Year", columns="Quarter", values="Month", aggfunc=len, data=mission_date)
quarter_year_launches = quarter_year_launches.fillna(0).astype(int)

quarter_year_launches = quarter_year_launches.stack()
quarter_year_launches.name = "No. of Missions"

In [None]:
quarter_year_launches_df = quarter_year_launches.reset_index()

quarter_year_launches_df.insert(0, "Quarter-Year",
                                quarter_year_launches_df["Quarter"] + ", " + quarter_year_launches_df["Year"].astype(str))

quarter_year_launches_df.drop(["Year", "Quarter"], axis=1, inplace=True)

In [None]:
fig = go.Figure(data=[
    go.Scatter(x=quarter_year_launches_df["Quarter-Year"], y=quarter_year_launches_df["No. of Missions"],
                    mode='lines', line=dict(color='#0277BD'),
               hovertemplate="No. of Missions: %{y}<br><extra></extra>"
            )
])

fig.update_layout(title={"text": "Frequency of Missions by Quarter and Year", "x": 0.5},
                 xaxis_title="Quarter-Year", yaxis_title="No. of Missions", plot_bgcolor="#FFF176",
                 hovermode="x unified"
                 )

fig.update_xaxes(showgrid=False, 
                 tickvals = quarter_year_launches_df["Quarter-Year"].iloc[3::8],
                 ticks="outside", linecolor="black", linewidth=2)
fig.update_yaxes(showgrid=False, zeroline=False, linecolor="black", linewidth=2)

fig.show()

**Facts :**

* The most missions in a single quarter were launched in **Q4-2018**, with **39**.<br><br>
* At least **35** missions were launched in each of **Q4-1970**, **Q2-1971**, **Q4-1971**, **Q4-1973**, **Q2-1975** and **Q3-1977**.

<hr>

In [None]:
rocket_status = space_data["Rocket Status"].value_counts()

In [None]:
colors = ["#FF7043", "#00E676"]

fig = go.Figure(data=[go.Pie(labels=rocket_status.index,
                             values=rocket_status.values,
                             textinfo='label+percent',
                             textfont=dict(family="sans serif", size=18, color="black"),
                    hovertemplate="Rocket Status: %{label}<br>Frequency: %{value}<br>Proportion: %{percent}<extra></extra>",
                    marker=dict(colors=colors, line=dict(color='#000000', width=1.5))
                            )])
fig.update_layout(title={"text": "Frequency of Missions by Rocket Status", "x": 0.5},
                  legend_title="Rocket Status")

fig.show()

**Facts :**

* Out of all rockets launched, over **80%** of the rockets are **retired** while less than **20%** of the rockets are still **active**.

<hr>

In [None]:
mission_status = space_data["Mission Status"].value_counts()

In [None]:
colors = ["#00E676", "#FF6E40", "#FF9E80", "#000000"]

fig = go.Figure(data=[go.Pie(labels=mission_status.index,
                             values=mission_status.values,
                             textinfo='label+percent',
                             textposition="inside",
                             textfont=dict(family="sans serif", size=15),
                    hovertemplate="Mission Status: %{label}<br>Frequency: %{value}<br>Proportion: %{percent}<extra></extra>",
                    marker=dict(colors=colors)
                            )])

fig.update_layout(title={"text": "Frequency of Missions by Mission Status", "x": 0.5},
                  legend_title="Mission Status")

fig.show()

**Facts :**

* Out of all the missions, around **90%** of the missions have been **successful**.<br><br>
* Around **8%** of the total missions have **failed**.<br><br>
* Around **2%** of the total missions have **partially failed** and exactly **4** missions have **failed before launch**.

<hr>

In [None]:
comp_loc = pd.DataFrame({"Company": space_data["Company"],
                        "Location": space_data["Location"].str.split(", ").str[-1]})

comp_loc = comp_loc.groupby(by=["Location", "Company"]).size()
comp_loc.name = "No. of Launches"

In [None]:
comp_loc_df = comp_loc.reset_index()

In [None]:
labels = []
labels.extend(comp_loc_df["Location"].unique())
labels.extend(comp_loc_df["Company"])

parents = []
parents.extend([""] * comp_loc_df["Location"].nunique())
parents.extend(comp_loc_df["Location"])

ids = []
ids.extend(comp_loc_df["Location"].unique())
ids.extend(comp_loc_df["Location"] + " - " + comp_loc_df["Company"])
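
The three parallel lists above define a two-level hierarchy: one root node per location and one leaf per (location, company) pair. A minimal sketch with made-up counts shows the shape. The ids combine location and company because the same company can appear under more than one location (RVSN USSR launches from both Russia and Kazakhstan), so the company name alone would not be unique.

```python
import pandas as pd

# Hypothetical (location, company) counts in the same shape as comp_loc_df.
toy = pd.DataFrame({
    "Location": ["India", "USA", "USA"],
    "Company": ["ISRO", "NASA", "SpaceX"],
    "No. of Launches": [3, 2, 1],
})

roots = list(toy["Location"].unique())

# Root nodes come first and have an empty parent; leaves point at their location.
ids = roots + list(toy["Location"] + " - " + toy["Company"])
labels = roots + list(toy["Company"])
parents = [""] * len(roots) + list(toy["Location"])

for node_id, label, parent in zip(ids, labels, parents):
    print(f"{node_id!r:20} label={label!r:10} parent={parent!r}")
```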

In [None]:
hovertext = [] 
hovertext.extend(
    
      "<b>" + comp_loc_df.groupby("Location").size().index + "</b><br>" + 
     "Total Missions: " + comp_loc_df.groupby("Location")["No. of Launches"].sum().astype(str) + "<br>" +
     "No. of Companies hosted: " + comp_loc_df.groupby("Location").size().astype(str)
                 
)



hovertext.extend(
    
    "<b>" + comp_loc_df["Company"] + "</b> (" + comp_loc_df["Location"] + ")" + "<br>" +
    "No. of Missions: " + comp_loc_df["No. of Launches"].astype(str)
                
)



In [None]:
fig =go.Figure(go.Sunburst(
    ids=ids,
    labels=labels,
    parents=parents,
    hoverinfo="text",
    hovertext=hovertext,
))

fig.update_layout(title={"text": "Statistics by Location", "x": 0.5}, height=650)

fig.show()

* **The graph describes the total no. of missions launched from each location.**
* **Additionally, for each location, the no. of companies hosted and the no. of missions launched by each company from that location are described.**

**Facts :**

* **Russia** has launched the most missions and has hosted **8** different space companies.<br>**RVSN USSR** has launched a staggering **1198** missions from Russia, followed by **VKS RF** with about **157** missions.<br>The remaining **6** space companies have together launched **40** missions from **Russia**.<br><br>

* **USA** has hosted the most space companies, with **16**.<br>**General Dynamics** has launched the most missions for USA with **251**, followed by **NASA** with **203** missions.<br>**US Air Force**, **ULA**, **Boeing**, **Martin Marietta** and **SpaceX** have each launched at least **100** missions.<br><br>

* **Kazakhstan** has hosted about **10** different space companies.<br>**RVSN USSR** has launched the most missions for Kazakhstan with **579**, followed by **Roscosmos** with **47** missions.<br><br>

* For **China**, most of its missions were launched by **CASC** with **250** missions.<br><br>

* For **India**, all its missions were launched by **ISRO**.

<hr>

In [None]:
top_15_comp = space_data["Company"].value_counts().head(15).index

In [None]:
top_15_comp_data = space_data.loc[space_data["Company"].isin(top_15_comp)].copy()

In [None]:
top_15_comp_rock_stat = pd.pivot_table(data=top_15_comp_data, index="Company", columns="Rocket Status",
                                       values="Mission Cost", aggfunc=len)

top_15_comp_rock_stat = top_15_comp_rock_stat.fillna(0).astype(int)

In [None]:
fig = go.Figure(data=[
    
    go.Bar(name='Retired',
           x=top_15_comp_rock_stat.index, y=top_15_comp_rock_stat["Retired"],
           text=top_15_comp_rock_stat.sum(axis=1),
           hovertemplate="<b>%{x}</b><br><br>Rockets launched: %{text}<br>Retired: %{y}<extra></extra>",
           marker_color="#FF7043"),
    
    go.Bar(name='Active',
           x=top_15_comp_rock_stat.index, y=top_15_comp_rock_stat["Active"],
           text=top_15_comp_rock_stat.sum(axis=1),
           hovertemplate="<b>%{x}</b><br><br>Rockets launched: %{text}<br>Active: %{y}<extra></extra>",
           marker_color="#00E676")   
    
])

fig.update_layout(barmode='stack')

fig.update_layout(title={"text": "Rocket Status of Top 15 Space Companies", "x": 0.5},
                 xaxis_title="Company", yaxis_title="No. of Rockets",
                 legend_title="Rocket Status", plot_bgcolor="white")

fig.update_xaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)

fig.show()

* **The 15 companies with the most missions are termed the 'Top 15'.**

**Facts :**

* The rockets of **RVSN USSR**, **General Dynamics**, **NASA**, **US Air Force**, **Martin Marietta** and **Lockheed** are all **retired**.<br><br>

* **Boeing** which has launched **136** rockets has only **1 active** rocket.<br><br>

* **CASC** has the most **active** rockets with **211**, followed by **Arianespace** with **114 active** rockets and **ULA** with **87 active** rockets.<br><br>

* **ISRO** has exactly **50 active** rockets out of its **76** rockets launched.

<hr>

In [None]:
top_15_comp_miss_stat = pd.pivot_table(data=top_15_comp_data, index="Company", columns="Mission Status",
                                       values="Mission Cost", aggfunc=len)

top_15_comp_miss_stat = top_15_comp_miss_stat.fillna(0).astype(int)

In [None]:
top_15_comp_miss_stat["Total Missions"] = top_15_comp_miss_stat.sum(axis=1)

top_15_comp_miss_stat["Success %"] = top_15_comp_miss_stat["Success"]/top_15_comp_miss_stat["Total Missions"] * 100

top_15_comp_miss_stat["Failure %"] = (
     
    (top_15_comp_miss_stat["Failure"]+top_15_comp_miss_stat["Partial Failure"]+top_15_comp_miss_stat["Prelaunch Failure"])/top_15_comp_miss_stat["Total Missions"] * 100
)

top_15_comp_miss_stat[["Success %", "Failure %"]] = top_15_comp_miss_stat[["Success %", "Failure %"]].round(2)

In [None]:
fig = go.Figure(data=[
    go.Bar(name='Success',
           x=top_15_comp_miss_stat.index, y=top_15_comp_miss_stat["Success"],
           text=top_15_comp_miss_stat["Total Missions"],
           meta=top_15_comp_miss_stat["Success %"],
           hovertemplate="<b>%{x}</b><br><br>Total no. of missions: %{text}<br>Successful Missions: %{y}<br>Success Rate: %{meta}%<extra></extra>",
           marker_color="#00E676"),
    
    go.Bar(name='Failure',
           x=top_15_comp_miss_stat.index, 
           y=top_15_comp_miss_stat[["Failure", "Partial Failure", "Prelaunch Failure"]].sum(axis=1),
           text=top_15_comp_miss_stat["Total Missions"],
           meta=top_15_comp_miss_stat["Failure %"],
           hovertemplate="<b>%{x}</b><br><br>Total no. of missions: %{text}<br>Failed Missions: %{y}<br>Failure Rate: %{meta}%<extra></extra>",
           marker_color="#FF7043"),
    
])

fig.update_layout(barmode='stack')

fig.update_layout(title={"text": "Mission Status of Top 15 Space Companies", "x": 0.5},
                 xaxis_title="Company", yaxis_title="No. of Missions",
                 legend_title="Mission Status", plot_bgcolor="white")

fig.update_xaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)

fig.show()

* **Since very few missions have "partially failed" or "failed before launch", they have been counted as "failed missions".**

**Facts :**

* **ULA**, which has launched **140** missions, has a mind-blowing **success rate** of **99.29%**.<br><br>

* **Arianespace**, which has launched **279** missions, has a **success rate** of **96.42%**.<br><br>

* **Boeing**, which has launched **136** missions, is just behind **Arianespace** with a **success rate** of **96.32%**.<br><br>

* **MHI**, which has launched **84** missions, has a **success rate** of **95.24%**.<br><br>

* **SpaceX**, **Lockheed**, **VKS RF**, **CASC**, **NASA** and **RVSN USSR** each have a **success rate** of at least **90%**.<br><br>

* **RVSN USSR**, which has launched a whopping **1777** missions, has a success rate of **90.83%**, which is amazing.<br><br>

* Although **RVSN USSR** has **163** failed missions, its **failure rate** of **9.17%** is lower than that of **US Air Force**, **General Dynamics**, **ISRO** etc.
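
The success and failure rates quoted above come from dividing status counts by the total per company, with every non-success outcome counted as a failure. The sketch below reproduces that calculation on made-up counts (the two companies are hypothetical) and then checks one quoted figure.

```python
import pandas as pd

# Hypothetical per-company status counts (not the real figures).
status = pd.DataFrame(
    {"Success": [18, 9], "Failure": [1, 0], "Partial Failure": [1, 0], "Prelaunch Failure": [0, 1]},
    index=["Company A", "Company B"],
)

failure_cols = ["Failure", "Partial Failure", "Prelaunch Failure"]
total = status.sum(axis=1)

rates = pd.DataFrame({
    "Total Missions": total,
    "Success %": (status["Success"] / total * 100).round(2),
    # Every non-success outcome is counted as a failure.
    "Failure %": (status[failure_cols].sum(axis=1) / total * 100).round(2),
})

print(rates)

# Sanity check on a figure quoted in the facts: 163 failures out of 1777 missions.
print(round(163 / 1777 * 100, 2))
```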

<hr>

In [None]:
def max_missions_year(data):
    # Count missions per company once; sorting by name makes ties resolve alphabetically.
    counts = data["Company"].value_counts().sort_index()
    return pd.Series([counts.idxmax(), counts.max()], index=["Company", "No. of Missions"])

In [None]:
year_max_miss_comp = space_data.groupby(space_data["Date & Time"].dt.year).apply(max_missions_year)
year_max_miss_comp.index.name = "Year"
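
Picking the busiest company per year can also be done without `apply`, by counting per (year, company) and keeping the row with the largest count in each year. This is an alternative sketch on made-up data, not the approach the notebook uses.

```python
import pandas as pd

# Hypothetical launches: (year, company) pairs.
toy = pd.DataFrame({
    "Year":    [1962, 1962, 1962, 1963, 1963, 1963],
    "Company": ["US Air Force", "US Air Force", "RVSN USSR",
                "RVSN USSR", "RVSN USSR", "US Air Force"],
})

# Count launches per (year, company), then keep the largest count in each year.
counts = toy.groupby(["Year", "Company"]).size().rename("No. of Missions").reset_index()
top = counts.loc[counts.groupby("Year")["No. of Missions"].idxmax()].set_index("Year")

print(top)
```

As with `idxmax` on name-sorted counts, a tie within a year resolves to the company that comes first alphabetically.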

In [None]:
colors_dict = {"RVSN USSR": "#FF5252", "US Navy": "#1565C0", "US Air Force": "#4DD0E1", "VKS RF": "#B388FF",
               "Lockheed": "#8BC34A", "Arianespace": "#FF9800", "Boeing": "#42A5F5",
               "CASC": "#FFD600", "ULA": "#00E676", "SpaceX": "#616161"}

In [None]:
colors = year_max_miss_comp["Company"].map(colors_dict)
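
Mapping company names through a dictionary yields NaN for any name that is missing from it. If that ever happens, a fallback colour can be supplied. The grey fallback and the "Company X" label below are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical subset of the colour lookup; "Company X" is deliberately missing.
palette = {"RVSN USSR": "#FF5252", "CASC": "#FFD600"}
winners = pd.Series(["RVSN USSR", "Company X", "CASC"])

# map() yields NaN for keys that are not in the dictionary;
# fillna() swaps those for a neutral fallback colour (an assumption, not in the notebook).
bar_colors = winners.map(palette).fillna("#9E9E9E")

print(bar_colors.tolist())
```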

In [None]:
fig = go.Figure(data=[
    
    go.Bar(
           x=year_max_miss_comp.index, y=year_max_miss_comp["No. of Missions"],
           text=year_max_miss_comp["Company"],
           hovertemplate="<b>%{x}</b><br><br>Company: %{text}<br>No. of Missions: %{y}<extra></extra>",
           marker_color=colors)
    
])


fig.update_layout(title={"text": "Companies with the most no. of missions by Year", "x": 0.5},
                 xaxis_title="Year", yaxis_title="No. of Missions", plot_bgcolor="white")

fig.update_xaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True, tickvals=list(range(1960,2021, 5)))
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True, zerolinecolor="#616161", zerolinewidth=1)

fig.show()

* **The graph describes the space company that launched the most missions in each year and the number of missions it launched in that year.**

**Facts :**

* **US Air Force** launched the most missions for **4 consecutive years** from **1959** to **1962**.<br>In **1962**, the company launched **41** missions, its highest in a year.<br><br>

* **RVSN USSR** has launched the most missions in a year **30** times, including a staggering **29 consecutive years** from **1963** to **1991**.<br>The company has launched at least **50 missions per year 17** times in total.<br>In **1977**, the company launched **97** missions, the highest by any company in any year.<br><br>

* **VKS RF** launched the most missions for **3 consecutive years** from **1992** to **1994**.<br><br>

* **Boeing** has launched the most missions in a year **5** times.<br><br>

* **Arianespace** has launched the most missions in a year **7** times, including **3 consecutive years** from **2000** to **2002**.<br><br>

* **ULA** has launched the most missions in a year **7** times, including **6 consecutive years** from **2009** to **2014**.<br><br>

* **CASC** has launched the most missions in a year **5** times, including **3 consecutive years** from **2018** to **2020**.

<hr>

In [None]:
top_5_comp = top_15_comp[:5]

In [None]:
top_5_comp_data = top_15_comp_data.loc[top_15_comp_data["Company"].isin(top_5_comp)].copy()

In [None]:
top_5_comp_data["Year"] = top_5_comp_data["Date & Time"].dt.year

In [None]:
top_5_comp_miss = pd.pivot_table(data=top_5_comp_data, index="Year", columns="Company", values="Mission Cost", aggfunc=len)

In [None]:
fig = go.Figure()

for comp in top_5_comp_miss.columns:
    fig.add_trace(go.Scatter(x=top_5_comp_miss.index, y=top_5_comp_miss[comp],
                    mode='lines', name=comp,
                    meta=comp,
                    hovertemplate="<b>%{meta}</b> - %{y}<extra></extra>"                    
                ))
    
fig.update_layout(title={"text": "Frequency of Missions of Top 5 Companies by Year", "x": 0.5},
                 xaxis_title="Year", yaxis_title="No. of Missions",
                 hovermode="x unified", legend_title="Company", plot_bgcolor="white"
                 )

fig.update_xaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True, tickvals=list(range(1960,2021, 5)))
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True, zerolinecolor="#616161", zerolinewidth=1)

fig.show()

* **The 5 companies with the most missions are termed the 'Top 5'.**<br>

* **The graph describes the number of missions launched by the Top 5 Space Companies in each year from 1957 to 2020.**

**Facts :**

* It is obvious that **RVSN USSR** is way ahead of the other space companies from **1962** to **1992**.<br><br>

* After **RVSN USSR's** final mission in **1992**, **Arianespace** launched the most missions among the other space companies from **1993** to **2002**.<br>The company again launched the most missions from **2009** to **2015**.<br><br>

* From **2016** to **2020**, **CASC** has launched more missions than the other space companies.<br><br>

* After **2011**, only **Arianespace** and **CASC** have been **active** among the **Top 5** space companies in terms of launching missions.

<hr>

In [None]:
top_5_comp_data["Month"] = top_5_comp_data["Date & Time"].dt.month

In [None]:
top_5_comp_miss_month = top_5_comp_data.groupby(["Company", "Month"]).size().unstack(level=0)

In [None]:
fig = go.Figure()

for comp in top_5_comp_miss_month.columns:
    fig.add_trace(go.Scatter(x=top_5_comp_miss_month.index, y=top_5_comp_miss_month[comp],
                    mode='lines', name=comp,
                    meta=comp,
                    hovertemplate="<b>%{meta}</b> - %{y}<extra></extra>"                    
                ))
    
fig.update_layout(title={"text": "Frequency of Missions of Top 5 Companies by Month", "x": 0.5},
                 xaxis_title="Month", yaxis_title="No. of Missions",
                 hovermode="x unified", legend_title="Company", plot_bgcolor="white"
                 )

fig.update_xaxes(tickvals=list(month_dict.keys()), ticktext=list(month_dict.values()),
                showgrid=False, linecolor="black", linewidth=2, mirror=True)
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)

fig.show()

**Facts :**

* **RVSN USSR** has launched more missions in **December** than in any other month, followed by **October**.<br>The company has launched the fewest missions in **January**.<br><br>

* Similarly, **Arianespace** and **General Dynamics** have each launched the most missions in **December** and the fewest in **January**.<br><br>

* **CASC** has launched the most missions in **November** and the fewest in **February**.<br><br>

* **NASA** has launched the most missions in **November** and the fewest in **August**.<br><br>

* For each of the top 5 space companies, the busiest month is either **November** or **December**.<br><br>

* For each of the top 5 space companies except **NASA**, the quietest month is either **January** or **February**.<br>For **NASA**, it is **August**.

<hr>

In [None]:
rock_miss_status = pd.pivot_table(data=space_data, index="Mission Status", columns="Rocket Status",
                                  values="Mission Cost", aggfunc=len)

# Fill empty status combinations with 0 so the integer cast cannot fail on NaN.
rock_miss_status = rock_miss_status.fillna(0).astype(int)

In [None]:
colors = ["#FF6E40", "#FF9E80", "#000000", "#00E676"]

fig = make_subplots(
    rows=1, cols=2,
    specs=[[{"type": "domain"}, {"type": "domain"}]],
    subplot_titles=("Active", "Retired"),
    horizontal_spacing=0.15
)

for i, rock_stat in enumerate(rock_miss_status.columns):
    fig.add_trace(go.Pie(labels=rock_miss_status.index,
                        values=rock_miss_status[rock_stat],
                        textinfo='label+percent',
                        textposition="inside",
                        textfont=dict(family="sans serif", size=15),
                        meta=rock_stat,
                        hovertemplate="<b>%{meta}</b><br><br>Mission Status: %{label}<br>Frequency: %{value}<br>Proportion: %{percent}<extra></extra>",
                        marker=dict(colors=colors)),
                  row=1, col=i+1)




fig.update_layout(title={"text": "Mission Status of Active and Retired rockets", "x": 0.5, "y": 0.95},
                  legend_title="Mission Status")


fig.show()

**Facts :**

* Around **93%** of the missions flown on **active** rockets were **successful**.<br>Around **5%** of them **failed**.<br>The remaining missions on **active** rockets either **partially failed** or **failed before launch**.<br><br>

* Around **89%** of the missions flown on **retired** rockets were **successful**.<br>Around **8.5%** of them **failed**.<br>The remaining missions on **retired** rockets either **partially failed** or **failed before launch**.
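
Percentages like these can be read off a cross-tabulation that is normalised within each rocket status. The sketch below uses `pd.crosstab` on made-up rows; it is an alternative to the pivot used in the notebook, not the same code.

```python
import pandas as pd

# Hypothetical missions with a rocket status and a mission status each.
toy = pd.DataFrame({
    "Rocket Status":  ["Active"] * 4 + ["Retired"] * 5,
    "Mission Status": ["Success", "Success", "Success", "Failure",
                       "Success", "Success", "Success", "Success", "Failure"],
})

# normalize="columns" turns each rocket-status column into shares that sum to 1.
share = pd.crosstab(toy["Mission Status"], toy["Rocket Status"], normalize="columns") * 100

print(share.round(1))
```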

<hr>

In [None]:
top_5_loc = loc_launch.index[:5]

In [None]:
top_5_loc_data = space_data.loc[space_data["Location"].str.split(", ").str[-1].isin(top_5_loc)].copy()

top_5_loc_data["Location"] = top_5_loc_data["Location"].str.split(", ").str[-1]

In [None]:
top_5_loc_rock_stat = pd.pivot_table(data=top_5_loc_data, index="Location", columns="Rocket Status",
                       values="Mission Cost", aggfunc=len)

top_5_loc_rock_stat = top_5_loc_rock_stat.fillna(0).astype(int)

top_5_loc_rock_stat["Total Missions"] = top_5_loc_rock_stat.sum(axis=1)

In [None]:
fig = go.Figure(data=[
    
    go.Bar(name='Retired',
           x=top_5_loc_rock_stat.index,
           y=top_5_loc_rock_stat["Retired"],
           hovertemplate="<b>%{x}</b><br><br>No. of Retired rockets: %{y}<extra></extra>",
           marker_color="#FF7043",
           width=[0.6]*len(top_5_loc_rock_stat)),
    
    go.Bar(name='Active',
           x=top_5_loc_rock_stat.index,
           y=top_5_loc_rock_stat["Active"],
           hovertemplate="<b>%{x}</b><br><br>No. of Active rockets: %{y}<extra></extra>", 
           marker_color="#00E676",
           width=[0.6]*len(top_5_loc_rock_stat))
       
])

fig.update_layout(barmode='stack')

fig.update_layout(title={"text": "Rocket Status of Top 5 Locations", "x": 0.5},
                 xaxis_title="Location", yaxis_title="No. of Missions",
                 legend_title="Rocket Status", plot_bgcolor="white")

fig.update_xaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)

fig.show()

* **The 5 locations with the most missions are termed the 'Top 5'.**

**Facts :**

* **China** has the most **active** rockets, followed by **USA**.<br><br>

* **Russia** has the fewest **active** rockets, followed by **Kazakhstan**.<br><br>

* **Russia** also has the most **retired** rockets, followed by **USA**.<br><br>

* **China** has the fewest **retired** rockets, followed by **France**.
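
The 'Top 5' selection and the Active/Retired counts can be sketched on a toy frame. This is only an illustration: the rows below are hypothetical, the ranking is assumed to be by mission count, and `pd.crosstab` is swapped in for the `pivot_table(..., aggfunc=len)` call used in the notebook because it counts rows directly and fills missing pairs with 0.

```python
import pandas as pd

# Toy data (hypothetical rows, not taken from the real dataset)
toy = pd.DataFrame({
    "Location": [
        "Site A, Florida, USA", "Site B, California, USA", "Site C, Russia",
        "Site D, China", "Site E, China", "Site F, China",
    ],
    "Rocket Status": ["Active", "Retired", "Retired", "Active", "Active", "Retired"],
})

# Host location = last comma-separated token of the launch site
toy["Host"] = toy["Location"].str.split(", ").str[-1]

# Keep the n locations with the most missions (n=2 here, 5 in the notebook)
top_n = toy["Host"].value_counts().index[:2]
top = toy[toy["Host"].isin(top_n)]

# Count rows per (location, rocket status); absent pairs become 0
status_counts = pd.crosstab(top["Host"], top["Rocket Status"])
status_counts["Total Missions"] = status_counts.sum(axis=1)
print(status_counts)
```

Because the counts come from rows rather than from a value column, missing mission costs do not affect them.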

<hr>

In [None]:
top_5_loc_miss_stat = pd.pivot_table(data=top_5_loc_data, index="Location", columns="Mission Status",
                       values="Mission Cost", aggfunc=len)

top_5_loc_miss_stat = top_5_loc_miss_stat.fillna(0).astype(int)

In [None]:
top_5_loc_miss_stat["Total Missions"] = top_5_loc_miss_stat.sum(axis=1)

top_5_loc_miss_stat["Success %"] = (top_5_loc_miss_stat["Success"]/top_5_loc_miss_stat["Total Missions"]) * 100

top_5_loc_miss_stat["Failure %"] = (
    (top_5_loc_miss_stat[["Failure", "Partial Failure", "Prelaunch Failure"]].sum(axis=1)/top_5_loc_miss_stat["Total Missions"]) * 100
)

top_5_loc_miss_stat[["Success %", "Failure %"]] = top_5_loc_miss_stat[["Success %", "Failure %"]].round(2)

In [None]:
fig = go.Figure(data=[
    go.Bar(name='Success',
           x=top_5_loc_miss_stat.index, y=top_5_loc_miss_stat["Success"],
           text=top_5_loc_miss_stat["Total Missions"],
           meta=top_5_loc_miss_stat["Success %"],
           hovertemplate="<b>%{x}</b><br><br>Total no. of missions: %{text}<br>Successful Missions: %{y}<br>Success Rate: %{meta}%<extra></extra>", 
           marker_color="#00E676",
           width=[0.6]*len(top_5_loc_miss_stat)),
    
    go.Bar(name='Failure',
           x=top_5_loc_miss_stat.index, 
           y=top_5_loc_miss_stat[["Failure", "Partial Failure", "Prelaunch Failure"]].sum(axis=1),
           text=top_5_loc_miss_stat["Total Missions"],
           meta=top_5_loc_miss_stat["Failure %"],
           hovertemplate="<b>%{x}</b><br><br>Total no. of missions: %{text}<br>Failed Missions: %{y}<br>Failure Rate: %{meta}%<extra></extra>",
           marker_color="#FF6E40",
           width=[0.6]*len(top_5_loc_miss_stat)),
    
])

fig.update_layout(barmode='stack')

fig.update_layout(title={"text": "Mission Status of Top 5 Locations", "x": 0.5},
                 xaxis_title="Location", yaxis_title="No. of Missions",
                 legend_title="Mission Status", plot_bgcolor="white")

fig.update_xaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)

fig.show()

**Facts :**

* **France** has the highest **success rate** with **94.06%** of its missions being successful.<br><br>

* In spite of launching a whopping **1395** missions, **Russia** has a **success rate** of **93.41%**, which is amazing.<br><br>

* **China** has a **success rate** of **90.67%**.<br><br>

* **USA** and **Kazakhstan** have **success rates** of **88.24%** and **86.73%** respectively.
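
As a rough sketch of how these rates are derived, the success rate is the share of missions with status `Success`, and the failure rate pools the three failure categories. The data below is hypothetical, and the `reindex` step is an added safeguard (not in the notebook) so that a status column still exists when a category never occurs.

```python
import pandas as pd

# Toy data (hypothetical rows, not taken from the real dataset)
toy = pd.DataFrame({
    "Location": ["USA"] * 4 + ["France"] * 5,
    "Mission Status": [
        "Success", "Success", "Failure", "Partial Failure",
        "Success", "Success", "Success", "Success", "Prelaunch Failure",
    ],
})

statuses = ["Success", "Failure", "Partial Failure", "Prelaunch Failure"]

# One row per location, one column per status; missing statuses become 0
counts = pd.crosstab(toy["Location"], toy["Mission Status"]).reindex(
    columns=statuses, fill_value=0
)

total = counts.sum(axis=1)
rates = pd.DataFrame({
    "Total Missions": total,
    "Success %": (counts["Success"] / total * 100).round(2),
    # Failure, Partial Failure and Prelaunch Failure are pooled together
    "Failure %": (counts[statuses[1:]].sum(axis=1) / total * 100).round(2),
})
print(rates)
```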

<hr>

In [None]:
top_5_loc_data["Year"] = top_5_loc_data["Date & Time"].dt.year

In [None]:
top_5_loc_miss = pd.pivot_table(data=top_5_loc_data, index="Year", columns="Location",
                                values="Mission Cost", aggfunc=len)

In [None]:
fig = go.Figure()

for loc in top_5_loc_miss.columns:
    fig.add_trace(go.Scatter(x=top_5_loc_miss.index, y=top_5_loc_miss[loc],
                    mode='lines', name=loc,
                    meta=loc,
                    hovertemplate="<b>%{meta}</b> - %{y}<extra></extra>"                    
                ))
    
fig.update_layout(title={"text": "Frequency of Missions of Top 5 Locations by Year", "x": 0.5},
                 xaxis_title="Year", yaxis_title="No. of Missions",
                 hovermode="x unified", legend_title="Host Location",
                 plot_bgcolor="white"
                 )

fig.update_xaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True, tickvals=list(range(1960,2021, 5)))
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True, zerolinecolor="#616161", zerolinewidth=1)

fig.show()

* **The graph describes the number of missions launched by the Top 5 Locations in each year from 1957 to 2020.**

**Facts :**

* **Russia** was way ahead of the other 4 locations from **1968** to **1991**.<br><br>

* **USA** launched the most missions in each year from **1958** to **1963**.<br>Again, **USA** launched the most missions in each year from **1992** to **2017**.<br><br>

* **China** has launched the most missions in each year from **2018** to **2020**.<br><br>

* **2012** is the only year in which **Kazakhstan** didn't launch a single mission.<br><br>

* **France** has never launched the most missions in a year among the top 5 locations in spite of having the **highest success rate**.
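
Facts like these can also be checked in code rather than read off the chart. The sketch below uses hypothetical rows and `pd.crosstab`, which stores 0 for a year in which a location launched nothing (the notebook's pivot table leaves `NaN` there instead).

```python
import pandas as pd

# Toy data (hypothetical rows, not taken from the real dataset)
toy = pd.DataFrame({
    "Year": [2010, 2010, 2011, 2012, 2012, 2012],
    "Location": ["USA", "Kazakhstan", "Kazakhstan", "USA", "USA", "China"],
})

# Rows = years, columns = locations, cells = mission counts (0 where none)
per_year = pd.crosstab(toy["Year"], toy["Location"])

# Years in which a given location launched nothing
zero_years = per_year.index[per_year["Kazakhstan"] == 0].tolist()

# Location with the most missions in each year
# (on a tie, idxmax returns the first column, i.e. alphabetical order here)
leader = per_year.idxmax(axis=1)

print(zero_years)
print(leader)
```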

<hr>

In [None]:
year_miss_stat = (
    space_data.groupby(space_data["Date & Time"].dt.year).apply(lambda data: data["Mission Status"].value_counts()).unstack()
                 )

In [None]:
year_miss_stat = year_miss_stat.fillna(0).astype(int)
year_miss_stat.index.name = "Year"

year_miss_stat.loc[:, "Failure"] = year_miss_stat[["Failure", "Partial Failure", "Prelaunch Failure"]].sum(axis=1)

year_miss_stat = year_miss_stat.drop(["Partial Failure", "Prelaunch Failure"], axis=1)

year_miss_stat["Total Missions"] = year_miss_stat.sum(axis=1)
year_miss_stat["Failure %"] = (year_miss_stat["Failure"]/year_miss_stat["Total Missions"]) * 100
year_miss_stat["Success %"] = (year_miss_stat["Success"]/year_miss_stat["Total Missions"]) * 100

year_miss_stat[["Success %", "Failure %"]] = year_miss_stat[["Success %", "Failure %"]].round(2)

In [None]:
fig = go.Figure(data=[
    
    go.Bar(name='Success',
           x=year_miss_stat.index,
           y=year_miss_stat["Success"],
           text=year_miss_stat["Total Missions"],
           meta=year_miss_stat["Success %"],
           hovertemplate="<b>%{x}</b><br><br>Total no. of Missions: %{text}<br>Successful Missions: %{y}<br>Success Rate: %{meta}%<extra></extra>",
           marker_color="#00E676"),
    
    go.Bar(name='Failure',
           x=year_miss_stat.index,
           y=year_miss_stat["Failure"],
           text=year_miss_stat["Total Missions"],
           meta=year_miss_stat["Failure %"],
           hovertemplate="<b>%{x}</b><br><br>Total no. of Missions: %{text}<br>Failed Missions: %{y}<br>Failure Rate: %{meta}%<extra></extra>",
           marker_color="#FF6E40")
       
])

fig.update_layout(barmode='stack')

fig.update_layout(title={"text": "Mission Status by Year", "x": 0.5},
                 xaxis_title="Year", yaxis_title="No. of Missions",
                 legend_title="Mission Status", plot_bgcolor="#fff")

fig.update_xaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True, tickvals=list(range(1960,2021, 5)))
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)


fig.show()

**Facts :**

* The year **2018** saw the most **successful** missions, with **113** of the 117 missions launched that year being successful.<br>The **success rate** of the year was **96.58%**.<br><br>

* The years **1971**, **1975**, **1976**, **1977**, **2018** and **2019** each saw at least **100** successful missions.<br>The years **1976** and **1977** had a **success rate** of over **95%**.<br>The years **1975** and **2019** had a **success rate** of over **90%**, while the year **1971** had a **success rate** of around **88%**.<br><br>

* In terms of **success rate**, the year **1983** was the most successful with a **success rate** of **98.48%**.<br>**65** out of **66** missions launched in the year were **successful**.<br><br>

* The years **1973**, **1975**, **1976**, **1977**, **2018** and **2019** each launched over **100** missions with a **success rate** of at least **90%**.
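
The distinction between "most successful missions" and "highest success rate" is easy to mix up, so here is a minimal sketch on hypothetical data. The thresholds are scaled down for the toy frame (5 missions and 80%, versus 100 missions and 90% in the facts above).

```python
import pandas as pd

# Toy data (hypothetical rows): 1983 has 4/4 successes, 2018 has 5/6
toy = pd.DataFrame({
    "Year": [1983] * 4 + [2018] * 6,
    "Mission Status": ["Success"] * 4 + ["Success"] * 5 + ["Failure"],
})

# Anything that is not "Success" counts as a failure of some kind
toy["Succeeded"] = toy["Mission Status"].eq("Success")

grouped = toy.groupby("Year")["Succeeded"]
by_year = pd.DataFrame({"Total Missions": grouped.size(), "Success": grouped.sum()})
by_year["Success %"] = (by_year["Success"] / by_year["Total Missions"] * 100).round(2)

# Year with the most successes vs. year with the best rate
most_successes = by_year["Success"].idxmax()
best_rate = by_year["Success %"].idxmax()

# Years that clear both a volume threshold and a rate threshold
busy_and_reliable = by_year[
    (by_year["Total Missions"] >= 5) & (by_year["Success %"] >= 80)
]
print(by_year)
```

A low-volume year can top the rate ranking while a high-volume year tops the count ranking, which is why the two are reported separately.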

<hr>

In [None]:
def max_miss_loc_comp(data):
    # Host location is the last comma-separated token of the launch site
    loc = data["Location"].str.split(", ").str[-1]
    # sort_index() makes ties resolve alphabetically
    loc_counts = loc.value_counts().sort_index()
    max_loc = loc_counts.idxmax()
    max_loc_miss = loc_counts.max()
    # Busiest company within the leading location
    max_loc_data = data[loc == max_loc]
    comp_counts = max_loc_data["Company"].value_counts().sort_index()
    max_loc_comp = comp_counts.idxmax()
    max_loc_comp_miss = comp_counts.max()
    
    return pd.Series({"Location": max_loc, "No. of Missions": max_loc_miss,
                  "Company": max_loc_comp, "Missions": max_loc_comp_miss})

In [None]:
year_max_miss_loc_comp = space_data.groupby(space_data["Date & Time"].dt.year).apply(max_miss_loc_comp)

In [None]:
colors_dict = {"Russia": "#FF5252", "USA": "#1565C0",
               "Kazakhstan": "#00BFA5", "France": "#B388FF",
               "China": "#FDD835"}

In [None]:
colors = year_max_miss_loc_comp["Location"].map(colors_dict)

In [None]:
fig = go.Figure(data=[
    
    go.Bar(
           x=year_max_miss_loc_comp.index,
           y=year_max_miss_loc_comp["No. of Missions"],
           text=year_max_miss_loc_comp["Location"],
           meta=np.transpose([year_max_miss_loc_comp["Company"], year_max_miss_loc_comp["Missions"]]),
           hovertemplate="<b>%{x}</b><br><br>Location: %{text}<br>No. of Missions: %{y}<br><br>%{meta[0]} has launched<br>the most no. of missions for<br>%{text} in the year with %{meta[1]} missions<extra></extra>",
           marker_color=colors
    )
    
])


fig.update_layout(title={"text": "Locations and their corresponding Companies with the most no. of missions by Year", "x": 0.5},
                 xaxis_title="Year", yaxis_title="No. of Missions", plot_bgcolor="white")

fig.update_xaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True, tickvals=list(range(1960,2021, 5)))
fig.update_yaxes(showgrid=False, linecolor="black", linewidth=2, mirror=True)

fig.show()

* **The graph shows the location that launched the most missions in each year and the company that launched the most missions for that location in that year.**

**Facts :**

* **USA** has launched the most missions in a year **33** times.<br>**Boeing** and **ULA** have launched the most missions for **USA** in a year **11** and **10** times respectively.<br>In **2017**, **USA** launched the most missions, with **SpaceX** launching **18** of the **30** missions launched by USA.<br><br>

* **Russia** has launched the most missions in a year **23** times.<br>Unsurprisingly, **RVSN USSR** launched the most missions for **Russia** all those times.<br>Another mind-blowing fact is that **RVSN USSR** launched all the missions for **Russia** from **1969** to **1990**.<br><br>

* **Kazakhstan** has launched the most missions in a year **5** times.<br>**RVSN USSR** launched the most missions for **Kazakhstan** all those times.<br><br>

* **China** has launched the most missions in a year **3** times.<br>**CASC** launched the most missions for **China** all those times.<br>In **2018**, **China** launched the most missions, with **CASC** launching **37** of the **39** missions launched by China.
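
The "number of times a location led a year" figures can be sketched as below. The rows are hypothetical and `leader` is an illustrative helper, not the notebook's function, though it follows the same idea: pick the busiest location per year, then the busiest company within it, breaking ties alphabetically via `sort_index()`.

```python
import pandas as pd

# Toy data (hypothetical rows, not taken from the real dataset)
toy = pd.DataFrame({
    "Year": [2017, 2017, 2017, 2018, 2018, 2018],
    "Location": ["USA", "USA", "China", "China", "China", "USA"],
    "Company": ["SpaceX", "SpaceX", "CASC", "CASC", "CASC", "ULA"],
})

def leader(group):
    # sort_index() makes ties resolve alphabetically rather than by row order
    loc_counts = group["Location"].value_counts().sort_index()
    top_loc = loc_counts.idxmax()
    in_top_loc = group.loc[group["Location"] == top_loc, "Company"]
    comp_counts = in_top_loc.value_counts().sort_index()
    return pd.Series({
        "Location": top_loc, "No. of Missions": loc_counts.max(),
        "Company": comp_counts.idxmax(), "Missions": comp_counts.max(),
    })

# One summary row per year
leaders = pd.DataFrame({year: leader(g) for year, g in toy.groupby("Year")}).T

# How many years each location led
years_led = leaders["Location"].value_counts()
print(leaders)
print(years_led)
```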

<hr>

In [None]:
def top_comp_loc(data):
    # `data` holds all missions of one location
    tot_miss = len(data)
    # sort_index() makes ties resolve alphabetically
    comp_counts = data["Company"].value_counts().sort_index()
    max_comp = comp_counts.idxmax()
    miss = comp_counts.max()
    max_comp_data = data[data["Company"] == max_comp]
    # Share of the top company's missions with status "Success", in percent
    succ_rate = round(sum(max_comp_data["Mission Status"] == "Success")/len(max_comp_data)*100, 2)
    
    return pd.Series({"Total Missions": tot_miss, "Company": max_comp, "Missions": miss, "Success %": succ_rate})

In [None]:
loc_top_comp = space_data.groupby(space_data["Location"].str.split(", ").str[-1]).apply(top_comp_loc)

loc_top_comp = loc_top_comp.reset_index()

In [None]:
labels = []
labels.extend(loc_top_comp["Location"])
labels.extend(loc_top_comp["Company"])

parents = []
parents.extend([""] * len(loc_top_comp))
parents.extend(loc_top_comp["Location"])

ids = []
ids.extend(loc_top_comp["Location"])
ids.extend(loc_top_comp["Location"] + " - " + loc_top_comp["Company"])

In [None]:
hovertext = [] 
hovertext.extend(
    "<b>" + loc_top_comp["Location"] + "</b><br>" + "Total Missions: " + loc_top_comp["Total Missions"].astype(str)
)

hovertext.extend(
    (
        "<b>" + loc_top_comp["Company"] + "</b> (" + loc_top_comp["Location"] + ")<br>" + 
        "No. of Missions: "+ loc_top_comp["Missions"].astype(str) + "<br>" +
        "Success rate: " + loc_top_comp["Success %"].astype(str) + "%"
    )
)

In [None]:
fig = go.Figure(go.Sunburst(
    ids=ids,
    labels=labels,
    parents=parents,
    hoverinfo="text",
    hovertext=hovertext
))

fig.update_layout(title={"text": "Top Space Company for each Location", "x": 0.5}, height=650)

fig.show()

* **The space company that has launched the most missions for a location is termed the 'Top Company' of that location.**

**Facts :**

* Unsurprisingly, **RVSN USSR** is the top space company for **Russia** as well as **Kazakhstan**.<br>The company has launched a whopping **1198** missions for **Russia** and **579** missions for **Kazakhstan**.<br>It has a **success rate** of **93.41%** for **Russia** and **85.49%** for **Kazakhstan**.<br><br>

* **General Dynamics** is the top space company for **USA**. The company has launched **251** missions for **USA** with a **success rate** of **80.88%**.<br><br>

* **CASC** is the top space company for **China**. The company has launched **250** missions for **China** with a **success rate** of exactly **92%**.<br><br>

* For **France**, **Arianespace** has launched the most missions with **277** missions and has a staggering **96.39% success rate**.<br><br>

* **MHI** has launched the most missions for **Japan**, with **84** missions and a **success rate** of **95.24%**.<br><br>

* For **India**, **ISRO** has launched all **76** of the country's missions with a **success rate** of **82.89%**.
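
For reference, the 'Top Company' lookup can also be written without a custom `apply` function. The sketch below is one possible alternative on hypothetical rows, not the notebook's method: it aggregates per (location, company) pair first and then keeps the busiest company of each location.

```python
import pandas as pd

# Toy data (hypothetical rows, not taken from the real dataset)
toy = pd.DataFrame({
    "Location": ["India"] * 4 + ["Japan"] * 3,
    "Company": ["ISRO"] * 4 + ["MHI", "MHI", "JAXA"],
    "Mission Status": ["Success", "Success", "Success", "Failure",
                       "Success", "Success", "Failure"],
})

# Missions and successes per (location, company)
stats = (
    toy.assign(Succeeded=toy["Mission Status"].eq("Success"))
       .groupby(["Location", "Company"])["Succeeded"]
       .agg(["size", "sum"])
       .rename(columns={"size": "Missions", "sum": "Successes"})
       .reset_index()
)
stats["Success %"] = (stats["Successes"] / stats["Missions"] * 100).round(2)

# Keep the busiest company per location
top_company = (
    stats.sort_values(["Location", "Missions"], ascending=[True, False])
         .drop_duplicates("Location")
         .set_index("Location")
)
print(top_company)
```

Note that the success rate shown is the top company's own rate at that location, not the rate of the location as a whole.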

<hr>

## Hope you enjoyed the Exploration. Thank You!!

<hr>