# Introduction

<center><img src="https://i.imgur.com/9hLRsjZ.jpg" height=400></center>

This dataset was scraped from [nextspaceflight.com](https://nextspaceflight.com/launches/past/?page=1) and includes all the space missions since the beginning of the Space Race between the USA and the Soviet Union in 1957!

### Install Package with Country Codes

In [316]:
%pip install iso3166

Note: you may need to restart the kernel to use updated packages.


### Upgrade Plotly

Run the cell below if you are working with Google Colab.

In [317]:
%pip install --upgrade plotly

Note: you may need to restart the kernel to use updated packages.


### Import Statements

In [318]:
import numpy as np
import pandas as pd
import plotly.express as px
import matplotlib.pyplot as plt
import seaborn as sns
import math
import pycountry

# These might be helpful:
from iso3166 import countries
from datetime import datetime, timedelta

### Notebook Presentation

In [319]:
pd.options.display.float_format = '{:,.2f}'.format

### Load the Data

In [320]:
df_data = pd.read_csv('mission_launches.csv')

# Preliminary Data Exploration

* What is the shape of `df_data`? 
* How many rows and columns does it have?
* What are the column names?
* Are there any NaN values or duplicates?
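
The questions above map onto a handful of pandas calls. The following is a minimal sketch on a small made-up frame; `sample` and its values are hypothetical and only mirror a few column names of `df_data`.

```python
import pandas as pd

# Hypothetical sample that mimics a few columns of the dataset.
sample = pd.DataFrame({
    "Organisation": ["SpaceX", "CASC", "NASA", "NASA"],
    "Price": ["50.0", None, "450.0", "450.0"],
    "Mission_Status": ["Success", "Success", "Failure", "Failure"],
})

n_rows, n_cols = sample.shape              # (rows, columns)
column_names = sample.columns.tolist()     # column names as a list
nan_per_column = sample.isna().sum()       # missing values per column
n_duplicates = sample.duplicated().sum()   # rows that repeat an earlier row

print(n_rows, n_cols, column_names)
print(nan_per_column)
print(n_duplicates)
```

The same calls can be run on `df_data` directly.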

In [321]:
df_data.shape

(4324, 9)

In [322]:
df_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4324 entries, 0 to 4323
Data columns (total 9 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   Unnamed: 0.1    4324 non-null   int64 
 1   Unnamed: 0      4324 non-null   int64 
 2   Organisation    4324 non-null   object
 3   Location        4324 non-null   object
 4   Date            4324 non-null   object
 5   Detail          4324 non-null   object
 6   Rocket_Status   4324 non-null   object
 7   Price           964 non-null    object
 8   Mission_Status  4324 non-null   object
dtypes: int64(2), object(7)
memory usage: 304.2+ KB


## Data Cleaning - Check for Missing Values and Duplicates

Consider removing columns containing junk data. 
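
One possible way to drop such columns is sketched below. It assumes that the junk columns are the ones whose names start with `Unnamed` (these look like leftover index columns); the `sample` frame is hypothetical.

```python
import pandas as pd

# Hypothetical frame with two leftover index columns.
sample = pd.DataFrame({
    "Unnamed: 0.1": [0, 1],
    "Unnamed: 0": [0, 1],
    "Organisation": ["SpaceX", "CASC"],
    "Price": ["50.0", "29.75"],
})

# Collect every column whose name starts with "Unnamed" and drop them together.
junk_columns = [col for col in sample.columns if col.startswith("Unnamed")]
cleaned = sample.drop(columns=junk_columns)

print(cleaned.columns.tolist())
```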

In [323]:
df_data = df_data.drop_duplicates()

In [324]:
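# Note: dropna() removes every row with a missing value. Only 964 of the 4,324 rows
# have a Price, so the rest of the analysis covers that subset only.
df_data = df_data.dropna()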

## Descriptive Statistics

In [325]:
df_data.describe()

Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0
count,964.0,964.0
mean,858.49,858.49
std,784.21,784.21
min,0.0,0.0
25%,324.75,324.75
50%,660.5,660.5
75%,1112.0,1112.0
max,4020.0,4020.0


# Number of Launches per Company

Create a chart that shows the number of space mission launches by organisation.

In [326]:
df_data

Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,Organisation,Location,Date,Detail,Rocket_Status,Price,Mission_Status
0,0,0,SpaceX,"LC-39A, Kennedy Space Center, Florida, USA","Fri Aug 07, 2020 05:12 UTC",Falcon 9 Block 5 | Starlink V1 L9 & BlackSky,StatusActive,50.0,Success
1,1,1,CASC,"Site 9401 (SLS-2), Jiuquan Satellite Launch Ce...","Thu Aug 06, 2020 04:01 UTC",Long March 2D | Gaofen-9 04 & Q-SAT,StatusActive,29.75,Success
3,3,3,Roscosmos,"Site 200/39, Baikonur Cosmodrome, Kazakhstan","Thu Jul 30, 2020 21:25 UTC",Proton-M/Briz-M | Ekspress-80 & Ekspress-103,StatusActive,65.0,Success
4,4,4,ULA,"SLC-41, Cape Canaveral AFS, Florida, USA","Thu Jul 30, 2020 11:50 UTC",Atlas V 541 | Perseverance,StatusActive,145.0,Success
5,5,5,CASC,"LC-9, Taiyuan Satellite Launch Center, China","Sat Jul 25, 2020 03:13 UTC","Long March 4B | Ziyuan-3 03, Apocalypse-10 & N...",StatusActive,64.68,Success
...,...,...,...,...,...,...,...,...,...
3855,3855,3855,US Air Force,"SLC-4W, Vandenberg AFB, California, USA","Fri Jul 29, 1966 18:43 UTC",Titan IIIB | KH-8,StatusRetired,59.0,Success
3971,3971,3971,US Air Force,"SLC-20, Cape Canaveral AFS, Florida, USA","Thu May 06, 1965 15:00 UTC",Titan IIIA | LES 2 & LCS 1,StatusRetired,63.23,Success
3993,3993,3993,US Air Force,"SLC-20, Cape Canaveral AFS, Florida, USA","Thu Feb 11, 1965 15:19 UTC",Titan IIIA | LES 1,StatusRetired,63.23,Success
4000,4000,4000,US Air Force,"SLC-20, Cape Canaveral AFS, Florida, USA","Thu Dec 10, 1964 16:52 UTC",Titan IIIA | Transtage 2,StatusRetired,63.23,Success


In [327]:
missions_per_org = df_data.groupby(["Organisation"])["Mission_Status"].count().reset_index(name="Missions_Count")
missions_per_org

       Organisation  Missions_Count
0       Arianespace              96
1            Boeing               7
2              CASC             158
3               EER               1
4               ESA               1
5          Eurockot              13
6            ExPace               1
7               ILS              13
8              ISRO              67
9              JAXA               3
10        Kosmotras              22
11         Lockheed               8
12              MHI              37
13  Martin Marietta               9
14             NASA             149
15         Northrop              83
16        RVSN USSR               2
17       Rocket Lab              13
18        Roscosmos              23
19           Sandia               1
20           SpaceX              99
21              ULA              98
22     US Air Force              26
23           VKS RF              33
24     Virgin Orbit               1

In [328]:
fig = px.bar(missions_per_org, x="Organisation", y="Missions_Count")
fig.show()

# Number of Active versus Retired Rockets

How many rockets are active compared to those that are decommissioned? 

In [329]:
active_counts = (df_data['Rocket_Status'] == "StatusActive").sum()
inactive_counts = (df_data['Rocket_Status'] == "StatusRetired").sum()

In [330]:
abs(active_counts - inactive_counts)

np.int64(208)

# Distribution of Mission Status

How many missions were successful?
How many missions failed?

In [331]:
success_counts = (df_data['Mission_Status'] == "Success").sum()
success_counts

np.int64(910)

In [332]:
failed_counts = (df_data['Mission_Status'] == "Failure").sum()
failed_counts

np.int64(36)
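
The two counts above cover only the `Success` and `Failure` labels, while the data also contains other labels such as `Partial Failure`. A minimal sketch of listing every status with its count and share, using a hypothetical `mission_status` series in place of `df_data['Mission_Status']`:

```python
import pandas as pd

# Hypothetical stand-in for df_data['Mission_Status'].
mission_status = pd.Series(
    ["Success", "Success", "Success", "Failure", "Partial Failure"],
    name="Mission_Status",
)

status_counts = mission_status.value_counts()                      # absolute counts per label
status_share = mission_status.value_counts(normalize=True) * 100   # share of each label in percent

print(status_counts)
print(status_share)
```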

# How Expensive are the Launches? 

Create a histogram and visualise the distribution. The price column is given in USD millions (careful of missing values). 

In [333]:
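# Price is still stored as text at this point, so convert it to numbers before plotting.
price_millions = pd.to_numeric(df_data["Price"].astype(str).str.replace(",", ""), errors="coerce")

fig = px.histogram(x=price_millions, labels={"x": "Price (USD millions)"})

fig.update_layout(
    yaxis_title="Mission Counts"
)

fig.show()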

# Use a Choropleth Map to Show the Number of Launches by Country

* Create a choropleth map using [the plotly documentation](https://plotly.com/python/choropleth-maps/)
* Experiment with [plotly's available colours](https://plotly.com/python/builtin-colorscales/). I quite like the sequential colour `matter` on this map. 
* You'll need to extract a `country` feature as well as change the country names that no longer exist.

Wrangle the Country Names

You'll need to use a 3 letter country code for each country. You might have to change some country names.

* Russia is the Russian Federation
* New Mexico should be USA
* Yellow Sea refers to China
* Shahrud Missile Test Site should be Iran
* Pacific Missile Range Facility should be USA
* Barents Sea should be Russian Federation
* Gran Canaria should be USA


You can use the iso3166 package to convert the country names to Alpha3 format.
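
The renaming rules listed above can be applied before any code lookup. The sketch below uses plain Python only; `COUNTRY_RENAMES` and `extract_country` are hypothetical helper names, and the last example location is made up for illustration.

```python
# Renames taken from the list above. Keys are the last comma-separated
# token of a Location string.
COUNTRY_RENAMES = {
    "Russia": "Russian Federation",
    "New Mexico": "USA",
    "Yellow Sea": "China",
    "Shahrud Missile Test Site": "Iran",
    "Pacific Missile Range Facility": "USA",
    "Barents Sea": "Russian Federation",
    "Gran Canaria": "USA",
}


def extract_country(location: str) -> str:
    """Return the normalised country name for a launch-site string."""
    last_token = location.split(",")[-1].strip()
    return COUNTRY_RENAMES.get(last_token, last_token)


print(extract_country("LC-39A, Kennedy Space Center, Florida, USA"))
print(extract_country("Site 200/39, Baikonur Cosmodrome, Kazakhstan"))
print(extract_country("Example Launch Barge, Yellow Sea"))  # hypothetical location
```

The normalised names can then be passed to whichever package performs the alpha-3 lookup.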

In [334]:
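def get_iso_alpha_3(country_name: str):
    """Return the ISO 3166-1 alpha-3 code for a country name, or None if unknown."""
    name = country_name.strip()
    # Names that pycountry cannot resolve by name, mapped by hand (see the list above).
    manual_mapping = {
        "Gran Canaria": "USA",
        "Pacific Missile Range Facility": "USA",
        "New Mexico": "USA",
        "Yellow Sea": "CHN",
        "Barents Sea": "RUS",
        "Shahrud Missile Test Site": "IRN",
        "USA": "USA",
        "Russia": "RUS"
    }
    if name in manual_mapping:
        return manual_mapping[name]
    country = pycountry.countries.get(name=name)
    return country.alpha_3 if country is not None else None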

In [335]:
# Take the last comma-separated token of the location and strip the leading space.
df_data['Country'] = df_data["Location"].apply(lambda x: x.split(",")[-1].strip())
df_data['Country_ISO'] = df_data["Country"].apply(get_iso_alpha_3)

In [336]:

df_launches = df_data.groupby(["Country_ISO"])["Organisation"].count().reset_index(name="Launches")


In [337]:

fig = px.choropleth(data_frame=df_launches, locations='Country_ISO', color='Launches',
                    title="Launches per Country")

fig.show()

# Use a Choropleth Map to Show the Number of Failures by Country


In [338]:
# Keep only outright failures (rows marked "Partial Failure" are not counted here).
df_failures_by_iso = df_data[df_data['Mission_Status'] == "Failure"].groupby("Country_ISO")["Organisation"].count().reset_index(name="Failures")

df_failures_by_iso


Unnamed: 0,Country_ISO,Failures
0,CHN,4
1,FRA,2
2,IND,5
3,KAZ,2
4,NZL,2
5,RUS,1
6,USA,20


In [339]:
fig = px.choropleth(data_frame=df_failures_by_iso, locations='Country_ISO', color='Failures',
                        title="Failures per Country"
                          )

fig.show()

# Create a Plotly Sunburst Chart of the countries, organisations, and mission status. 

In [340]:
df_data.head()

Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,Organisation,Location,Date,Detail,Rocket_Status,Price,Mission_Status,Country,Country_ISO
0,0,0,SpaceX,"LC-39A, Kennedy Space Center, Florida, USA","Fri Aug 07, 2020 05:12 UTC",Falcon 9 Block 5 | Starlink V1 L9 & BlackSky,StatusActive,50.0,Success,USA,USA
1,1,1,CASC,"Site 9401 (SLS-2), Jiuquan Satellite Launch Ce...","Thu Aug 06, 2020 04:01 UTC",Long March 2D | Gaofen-9 04 & Q-SAT,StatusActive,29.75,Success,China,CHN
3,3,3,Roscosmos,"Site 200/39, Baikonur Cosmodrome, Kazakhstan","Thu Jul 30, 2020 21:25 UTC",Proton-M/Briz-M | Ekspress-80 & Ekspress-103,StatusActive,65.0,Success,Kazakhstan,KAZ
4,4,4,ULA,"SLC-41, Cape Canaveral AFS, Florida, USA","Thu Jul 30, 2020 11:50 UTC",Atlas V 541 | Perseverance,StatusActive,145.0,Success,USA,USA
5,5,5,CASC,"LC-9, Taiyuan Satellite Launch Center, China","Sat Jul 25, 2020 03:13 UTC","Long March 4B | Ziyuan-3 03, Apocalypse-10 & N...",StatusActive,64.68,Success,China,CHN


In [341]:
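# 1 marks any outcome other than 'Success'; this flag is reused in the failure analysis further down.
df_data['Mission_Status_Bool'] = df_data["Mission_Status"].apply(lambda x: 0 if x == 'Success' else 1)

# Without a `values` argument each row counts as one mission, so every ring shows mission counts.
fig = px.sunburst(
    df_data,
    path=["Country", "Organisation", "Mission_Status"],
    title="Missions by Country, Organisation and Mission Status"
)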

fig.show()

# Analyse the Total Amount of Money Spent by Organisation on Space Missions

In [342]:
df_data['Price'] = df_data["Price"].apply(lambda x: str(x).replace(",", "")).astype(float)

df_data.groupby(["Organisation"])['Price'].sum()



Organisation
Arianespace       16,345.00
Boeing             1,241.00
CASC               6,340.26
EER                   20.00
ESA                   37.00
Eurockot             543.40
ExPace                28.30
ILS                1,320.00
ISRO               2,177.00
JAXA                 168.00
Kosmotras            638.00
Lockheed             280.00
MHI                3,532.50
Martin Marietta      721.40
NASA              76,280.00
Northrop           3,930.00
RVSN USSR         10,000.00
Rocket Lab            97.50
Roscosmos          1,187.50
Sandia                15.00
SpaceX             5,444.00
ULA               14,798.00
US Air Force       1,550.92
VKS RF             1,548.90
Virgin Orbit          12.00
Name: Price, dtype: float64

# Analyse the Amount of Money Spent by Organisation per Launch

In [343]:
df_data.groupby(["Organisation"])['Price'].mean()

Organisation
Arianespace         170.26
Boeing              177.29
CASC                 40.13
EER                  20.00
ESA                  37.00
Eurockot             41.80
ExPace               28.30
ILS                 101.54
ISRO                 32.49
JAXA                 56.00
Kosmotras            29.00
Lockheed             35.00
MHI                  95.47
Martin Marietta      80.16
NASA                511.95
Northrop             47.35
RVSN USSR         5,000.00
Rocket Lab            7.50
Roscosmos            51.63
Sandia               15.00
SpaceX               54.99
ULA                 151.00
US Air Force         59.65
VKS RF               46.94
Virgin Orbit         12.00
Name: Price, dtype: float64

# Chart the Number of Launches per Year

In [344]:
df_data

Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,Organisation,Location,Date,Detail,Rocket_Status,Price,Mission_Status,Country,Country_ISO,Mission_Status_Bool
0,0,0,SpaceX,"LC-39A, Kennedy Space Center, Florida, USA","Fri Aug 07, 2020 05:12 UTC",Falcon 9 Block 5 | Starlink V1 L9 & BlackSky,StatusActive,50.00,Success,USA,USA,0
1,1,1,CASC,"Site 9401 (SLS-2), Jiuquan Satellite Launch Ce...","Thu Aug 06, 2020 04:01 UTC",Long March 2D | Gaofen-9 04 & Q-SAT,StatusActive,29.75,Success,China,CHN,0
3,3,3,Roscosmos,"Site 200/39, Baikonur Cosmodrome, Kazakhstan","Thu Jul 30, 2020 21:25 UTC",Proton-M/Briz-M | Ekspress-80 & Ekspress-103,StatusActive,65.00,Success,Kazakhstan,KAZ,0
4,4,4,ULA,"SLC-41, Cape Canaveral AFS, Florida, USA","Thu Jul 30, 2020 11:50 UTC",Atlas V 541 | Perseverance,StatusActive,145.00,Success,USA,USA,0
5,5,5,CASC,"LC-9, Taiyuan Satellite Launch Center, China","Sat Jul 25, 2020 03:13 UTC","Long March 4B | Ziyuan-3 03, Apocalypse-10 & N...",StatusActive,64.68,Success,China,CHN,0
...,...,...,...,...,...,...,...,...,...,...,...,...
3855,3855,3855,US Air Force,"SLC-4W, Vandenberg AFB, California, USA","Fri Jul 29, 1966 18:43 UTC",Titan IIIB | KH-8,StatusRetired,59.00,Success,USA,USA,0
3971,3971,3971,US Air Force,"SLC-20, Cape Canaveral AFS, Florida, USA","Thu May 06, 1965 15:00 UTC",Titan IIIA | LES 2 & LCS 1,StatusRetired,63.23,Success,USA,USA,0
3993,3993,3993,US Air Force,"SLC-20, Cape Canaveral AFS, Florida, USA","Thu Feb 11, 1965 15:19 UTC",Titan IIIA | LES 1,StatusRetired,63.23,Success,USA,USA,0
4000,4000,4000,US Air Force,"SLC-20, Cape Canaveral AFS, Florida, USA","Thu Dec 10, 1964 16:52 UTC",Titan IIIA | Transtage 2,StatusRetired,63.23,Success,USA,USA,0


In [345]:
# Dates that cannot be parsed become NaT (errors='coerce').
df_data["Datetime"] = pd.to_datetime(df_data["Date"], errors='coerce')
df_data["Year"]= df_data["Datetime"].dt.year
df_data['Month'] = df_data["Datetime"].dt.month_name()



In [346]:
df_launch_per_year = df_data.groupby('Year')["Organisation"].count().reset_index(name="Launches")
df_launch_per_year

Unnamed: 0,Year,Launches
0,1964.0,2
1,1965.0,2
2,1966.0,3
3,1967.0,7
4,1968.0,10
5,1969.0,8
6,1970.0,1
7,1971.0,2
8,1972.0,2
9,1973.0,1


In [347]:
fig = px.line(df_launch_per_year, x='Year', y="Launches")

fig.update_layout(
    xaxis=dict(
        tickmode='linear',  
        dtick=5 
    )
)

fig.show()

# Chart the Number of Launches Month-on-Month until the Present

Which month has seen the highest number of launches of all time? Superimpose a rolling average on the month-on-month time series chart. 
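
A minimal sketch of the month-on-month series and its rolling average follows. The timestamps in `launch_times` are hypothetical stand-ins for the parsed `Datetime` column, and the 3-month window is an arbitrary choice.

```python
import pandas as pd

# Hypothetical launch timestamps standing in for df_data["Datetime"].
launch_times = pd.to_datetime([
    "2020-01-05", "2020-01-20", "2020-02-11",
    "2020-04-02", "2020-04-15", "2020-04-30",
])

# One entry per launch, then sum per calendar month ("MS" = month start).
# Months without any launch appear with a count of 0.
monthly = pd.Series(1, index=launch_times).resample("MS").sum()

# Rolling average over 3 months; the first two values are NaN by construction.
rolling_avg = monthly.rolling(window=3).mean()

busiest_month = monthly.idxmax()

print(monthly)
print(rolling_avg)
print(busiest_month)
```

Both series share the same index, so they can be drawn on one chart as two traces.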

# Launches per Month: Which months are most popular and least popular for launches?

Some months have better weather than others. Which time of year seems to be best for space missions?

In [348]:
# Count launches per calendar month (all 12 months) and sort from fewest to most launches.
df_launch_per_month = df_data.groupby('Month')["Organisation"].count().reset_index(name="Launches")
df_launch_per_month = df_launch_per_month.sort_values(by="Launches").reset_index(drop=True)


In [349]:
fig = px.bar(df_launch_per_month, x="Month", y="Launches")
fig.show()

# How has the Launch Price varied Over Time? 

Create a line chart that shows the average price of rocket launches over time. 

In [350]:
df_price_per_year = df_data.groupby(['Year'])['Price'].mean().reset_index(name="Avg_Price")
df_price_per_year

Unnamed: 0,Year,Avg_Price
0,1964.0,63.23
1,1965.0,63.23
2,1966.0,59.0
3,1967.0,216.29
4,1968.0,279.2
5,1969.0,609.5
6,1970.0,1160.0
7,1971.0,1160.0
8,1972.0,1160.0
9,1973.0,1160.0


In [351]:
fig = px.line(df_price_per_year, x='Year', y="Avg_Price", title='Avg Price per Year')
fig.show()

# Chart the Number of Launches over Time by the Top 10 Organisations. 

How has the dominance of launches changed over time between the different players? 

In [352]:
df_data_by_top_10_org = df_data.groupby(['Organisation'])['Date'].count().reset_index(name="Launches").sort_values(by="Launches", ascending=False).head(10)
df_data_by_top_10_org

Unnamed: 0,Organisation,Launches
2,CASC,158
14,NASA,149
20,SpaceX,99
21,ULA,98
0,Arianespace,96
15,Northrop,83
8,ISRO,67
12,MHI,37
23,VKS RF,33
22,US Air Force,26


In [353]:
fig = px.bar(df_data_by_top_10_org, x="Organisation", y="Launches")
fig.show()
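
The bar chart above shows totals only. To follow the top organisations over time, the yearly counts can be restricted to those organisations first. A minimal sketch, assuming a hypothetical `sample` frame with `Organisation` and `Year` columns and `top_n = 2` instead of 10:

```python
import pandas as pd

# Hypothetical data; the notebook would use df_data instead.
sample = pd.DataFrame({
    "Organisation": ["NASA", "NASA", "NASA", "SpaceX", "SpaceX", "ESA"],
    "Year": [1969, 1969, 2019, 2019, 2020, 2019],
})

top_n = 2  # the task in this notebook asks for 10
top_orgs = sample["Organisation"].value_counts().head(top_n).index

# Launches per year for the top organisations only.
launches_over_time = (
    sample[sample["Organisation"].isin(top_orgs)]
    .groupby(["Year", "Organisation"])
    .size()
    .reset_index(name="Launches")
)

print(launches_over_time)
```

The result is in long format, so it could be passed to `px.line` with `x="Year"`, `y="Launches"` and `color="Organisation"`.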

# Cold War Space Race: USA vs USSR

The Cold War lasted from the start of the dataset up until 1991. 

In [354]:
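# Select the Cold War years first so that df_data keeps the modern country codes.
df_coldwar_data = df_data[(df_data["Year"] <= 1991)].sort_values(by="Datetime").reset_index(drop=True)
# Launch sites in present-day Kazakhstan and Russia belonged to the USSR during this period.
df_coldwar_data["Country_ISO"] = df_coldwar_data["Country_ISO"].replace({"KAZ": "USSR", "RUS": "USSR"})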
df_coldwar_data

Unnamed: 0.2,Unnamed: 0.1,Unnamed: 0,Organisation,Location,Date,Detail,Rocket_Status,Price,Mission_Status,Country,Country_ISO,Mission_Status_Bool,Datetime,Year,Month
0,4020,4020,US Air Force,"SLC-20, Cape Canaveral AFS, Florida, USA","Tue Sep 01, 1964 15:00 UTC",Titan IIIA | Transtage 1,StatusRetired,63.23,Failure,USA,USA,1,1964-09-01 15:00:00+00:00,1964.00,September
1,4000,4000,US Air Force,"SLC-20, Cape Canaveral AFS, Florida, USA","Thu Dec 10, 1964 16:52 UTC",Titan IIIA | Transtage 2,StatusRetired,63.23,Success,USA,USA,0,1964-12-10 16:52:00+00:00,1964.00,December
2,3993,3993,US Air Force,"SLC-20, Cape Canaveral AFS, Florida, USA","Thu Feb 11, 1965 15:19 UTC",Titan IIIA | LES 1,StatusRetired,63.23,Success,USA,USA,0,1965-02-11 15:19:00+00:00,1965.00,February
3,3971,3971,US Air Force,"SLC-20, Cape Canaveral AFS, Florida, USA","Thu May 06, 1965 15:00 UTC",Titan IIIA | LES 2 & LCS 1,StatusRetired,63.23,Success,USA,USA,0,1965-05-06 15:00:00+00:00,1965.00,May
4,3855,3855,US Air Force,"SLC-4W, Vandenberg AFB, California, USA","Fri Jul 29, 1966 18:43 UTC",Titan IIIB | KH-8,StatusRetired,59.00,Success,USA,USA,0,1966-07-29 18:43:00+00:00,1966.00,July
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,1750,1750,NASA,"LC-39B, Kennedy Space Center, Florida, USA","Wed Jun 05, 1991 13:24 UTC",Space Shuttle Columbia | STS-40,StatusRetired,450.00,Success,USA,USA,0,1991-06-05 13:24:00+00:00,1991.00,June
96,1743,1743,Northrop,"NB-52B Carrier, Edwards AFB, California, USA","Sun Jul 21, 1991 17:33 UTC",Pegasus/HAPS | 7 Microsats,StatusRetired,40.00,Partial Failure,USA,USA,1,1991-07-21 17:33:00+00:00,1991.00,July
97,1741,1741,NASA,"LC-39A, Kennedy Space Center, Florida, USA","Fri Aug 02, 1991 15:02 UTC",Space Shuttle Atlantis | STS-43,StatusRetired,450.00,Success,USA,USA,0,1991-08-02 15:02:00+00:00,1991.00,August
98,1732,1732,NASA,"LC-39A, Kennedy Space Center, Florida, USA","Thu Sep 12, 1991 23:11 UTC",Space Shuttle Discovery | STS-48,StatusRetired,450.00,Success,USA,USA,0,1991-09-12 23:11:00+00:00,1991.00,September


## Create a Plotly Pie Chart comparing the total number of launches of the USSR and the USA

Hint: Remember to include former Soviet Republics like Kazakhstan when analysing the total number of launches. 

In [355]:

df_ussr_and_usa_data = df_coldwar_data[(df_coldwar_data["Country_ISO"] == "USSR") | (df_coldwar_data["Country_ISO"] == "USA")].groupby('Country_ISO')["Date"].count().reset_index(name="Launches")

df_ussr_and_usa_data


Unnamed: 0,Country_ISO,Launches
0,USA,89
1,USSR,2


In [356]:
fig = px.pie(df_ussr_and_usa_data, values='Launches', names='Country_ISO', title='USSR vs USA')
fig.show()

## Create a Chart that Shows the Total Number of Launches Year-On-Year by the Two Superpowers

In [357]:
superpowers = ["USA", "RUS"]
df_superpowers = df_data[df_data['Country_ISO'].isin(superpowers)]

# Group by year and superpower, and count launches
df_grouped_top2 = df_superpowers.groupby(['Year', 'Country_ISO']).size().reset_index(name='Launches')

df_grouped_top2

Unnamed: 0,Year,Country_ISO,Launches
0,1964.00,USA,2
1,1965.00,USA,2
2,1966.00,USA,3
3,1967.00,USA,7
4,1968.00,USA,10
...,...,...,...
64,2018.00,USA,29
65,2019.00,RUS,5
66,2019.00,USA,19
67,2020.00,RUS,1


In [358]:
fig = px.bar(
    df_grouped_top2,
    x='Year',
    y='Launches',
    color='Country_ISO',
    title='Year-On-Year Launches by Superpowers (USA vs RUS)',
    labels={'Launches': 'Total Launches'}
)

fig.show()

## Chart the Total Number of Mission Failures Year on Year.

In [359]:
df_grouped_failures = df_superpowers.loc[df_superpowers['Mission_Status_Bool'] == 1].groupby(['Year', 'Mission_Status']).size().reset_index(name='Failure_Launches')
df_grouped_failures

Unnamed: 0,Year,Mission_Status,Failure_Launches
0,1964.0,Failure,1
1,1967.0,Partial Failure,1
2,1968.0,Partial Failure,1
3,1986.0,Failure,1
4,1990.0,Failure,1
5,1991.0,Partial Failure,1
6,1993.0,Failure,1
7,1994.0,Failure,1
8,1994.0,Partial Failure,1
9,1995.0,Failure,2


In [360]:
fig = px.bar(
    df_grouped_failures,
    x='Year',
    y='Failure_Launches',
    color='Mission_Status',
    title='Year-On-Year Mission Failures (USA and RUS)',
    labels={'Failure_Launches': 'Failed Launches'}
)

fig.show()

## Chart the Percentage of Failures over Time

Did failures go up or down over time? Did the countries get better at minimising risk and improving their chances of success over time? 

In [361]:
df_failure_percentage = df_data.groupby('Year').agg(
    Total_Launches=('Mission_Status', 'count'),
    Failed_Launches=('Mission_Status_Bool', 'sum')
).reset_index()

df_failure_percentage["Percentage"] = (df_failure_percentage['Failed_Launches'] / df_failure_percentage['Total_Launches']) * 100 

df_failure_percentage


Unnamed: 0,Year,Total_Launches,Failed_Launches,Percentage
0,1964.0,2,1,50.0
1,1965.0,2,0,0.0
2,1966.0,3,0,0.0
3,1967.0,7,1,14.29
4,1968.0,10,1,10.0
5,1969.0,8,0,0.0
6,1970.0,1,0,0.0
7,1971.0,2,0,0.0
8,1972.0,2,0,0.0
9,1973.0,1,0,0.0


In [362]:
fig = px.line(df_failure_percentage, x='Year', y="Percentage", title='Failure % per year')
fig.show()

# For Every Year Show which Country was in the Lead in terms of Total Number of Launches (up to and including 2020)

Do the results change if we only look at the number of successful launches? 

In [363]:
df_launches_per_year_for_org = df_data.groupby(['Organisation', 'Year']).size().reset_index(name='Count_per_year')

In [364]:
fig = px.bar(df_launches_per_year_for_org, x="Year", y="Count_per_year", color="Organisation")
fig.show()
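
The chart above is broken down by organisation. One way to read "in the lead" at country level is the running total of launches per country. The sketch below takes that interpretation as an assumption and uses a hypothetical `sample` frame with `Country_ISO` and `Year` columns.

```python
import pandas as pd

# Hypothetical data; the notebook would use df_data instead.
sample = pd.DataFrame({
    "Country_ISO": ["USA", "USA", "RUS", "RUS", "RUS", "RUS", "USA", "USA", "USA"],
    "Year":        [2018, 2018, 2018, 2019, 2019, 2019, 2019, 2020, 2020],
})

# Launches per year and country as a Year x Country table (missing pairs become 0).
per_year = sample.groupby(["Year", "Country_ISO"]).size().unstack(fill_value=0)

# Running total per country, then the country with the largest total in each year.
# On a tie, idxmax returns the first column in column order.
cumulative = per_year.cumsum()
leader = cumulative.idxmax(axis=1)

print(cumulative)
print(leader)
```

To answer the follow-up question, the same steps can be applied after filtering the frame to rows where `Mission_Status` equals `Success`.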

# Create a Year-on-Year Chart Showing the Organisation Doing the Most Number of Launches

Which organisation was dominant in the 1970s and 1980s? Which organisation was dominant in 2018, 2019 and 2020? 

In [365]:
fig.update_xaxes(
    dtick=1
)
fig.show()
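
To name the dominant organisation for each year rather than read it off the stacked bars, the yearly counts can be reduced to one row per year. A minimal sketch on a hypothetical `sample` frame:

```python
import pandas as pd

# Hypothetical data; the notebook would use df_data instead.
sample = pd.DataFrame({
    "Organisation": ["NASA", "NASA", "ULA", "SpaceX", "SpaceX", "ULA"],
    "Year": [1985, 1985, 2019, 2019, 2019, 2020],
})

counts = sample.groupby(["Year", "Organisation"]).size().reset_index(name="Launches")

# Sort so the busiest organisation comes first within each year,
# then keep only that first row per year.
top_per_year = (
    counts.sort_values(["Year", "Launches"], ascending=[True, False])
    .drop_duplicates(subset="Year")
    .reset_index(drop=True)
)

print(top_per_year)
```

Ties within a year are resolved by whichever row happens to come first after sorting, so a year with two equally active organisations shows only one of them.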