# The Final Frontier
by [David Roberts](https://www.kaggle.com/davidroberts13) in Aug 2020

This notebook explores the second space race we are currently in. More specifically, I'm interested in the relationship between the traditional government agencies and the burgeoning entrepreneurial space explorers such as Elon Musk, Richard Branson, and Jeff Bezos.

![](https://i.imgur.com/AxjZ7pL.jpg)

### Table of Contents

* [1. Introduction](#chapter1)
    * [1.1 Load and Check Data](#section_1_1)
* [2. Feature Engineering](#Chapter2)
    * [2.1 Country Isolation](#section_2_1)
    * [2.2 Organization Elaboration](#section_2_2)
    * [2.3 Date & Time Extraction](#section_2_3)
* [3. Visualizations](#chapter3)
    * [3.1 Top 5's and 10's](#section_3_1)
    * [3.2 Bars Bars Bars](#section_3_2)
    * [3.3 Temporal Visualizations](#section_3_3)
* [4. Conclusion](#chapter4)
   

# 1. Introduction <a class="anchor" id="chapter1"></a>


   We are at the dawn of a second Space Race. The first was a technological arms race between the Cold War rivals, the Soviet Union and the United States, to achieve firsts in spaceflight. This original race ran from 1955 to 1975, with the US achieving the first Moon landing and moonwalk in 1969. After the US had won the race to the Moon, the Soviet Union shifted its efforts towards an orbital space station. It accomplished this when a three-man crew successfully docked with [Salyut 1](https://en.wikipedia.org/wiki/Salyut_1), the Soviet laboratory, and remained aboard for a record 22 days. This race left in its wake many wonderful technologies we could not function without to this day, technologies I'm sure you are using right now, like camera phones, computer mice, and scratch-resistant lenses, just to name a few. The race also helped Cold War tensions 'thaw' for a time during the period known as [Détente](https://en.wikipedia.org/wiki/D%C3%A9tente). 

   Many years went by with these organizations and many others quietly working in the background, with only the most momentous achievements reaching the general public. That changed with the birth of the second space race, known as the [Billionaire Space Race](https://en.wikipedia.org/wiki/Billionaire_space_race). These men are at the leading edge of what is being referred to as the New Space Age. This is no longer a matter of countries going head to head: in New Space, private organizations are leading the charge. I'm extremely interested to see how these private organizations stack up against the state-run goliaths of the earlier era. 

   Below I will explore this relationship between state and private organizations. This includes adding data from my own research on the organizations themselves. 

## 1.1 Load and Check Data <a class="anchor" id="section_1_1"></a>

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib as plt # additional data visualization (note: pyplot is reached as plt.pyplot below)
import matplotlib.animation as animation
import seaborn as sns # additional data visualization
from datetime import datetime # date and time manipulation
from datetime import date

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session
from plotly.offline import init_notebook_mode, iplot
import plotly.figure_factory as ff
import cufflinks
cufflinks.go_offline()
cufflinks.set_config_file(world_readable=True, theme='pearl')
import plotly.graph_objs as go
import plotly
from plotly import tools
import plotly.express as px
from scipy.stats import boxcox
init_notebook_mode(connected=True)
pd.set_option('display.max_columns', 100)

In [None]:
#Importing Data 
df=pd.read_csv("/kaggle/input/all-space-missions-from-1957/Space_Corrected.csv")
df=df.drop(['Unnamed: 0','Unnamed: 0.1'], axis=1)
print(df.info())
print(df.describe())
df.head()

We now have a sense of our variables and the first few observations of each. We know that we are working with roughly 4300 observations of 6 variables. The one variable that could be more explicit is Rocket, which is the cost of the rocket in millions, not adjusted for inflation.

In [None]:
# get all the unique values in the 'Location' column
Locations = df['Location'].unique()

# sort them alphabetically and then take a closer look
Locations.sort()
Locations

# Wonderful, everything seems to be in order

# 2.Feature Engineering <a class="anchor" id="Chapter2"></a>


## 2.1 Country Issolation <a class="anchor" id="section_2_1"></a>


In [None]:
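#creating new column 'Country of Launch' from the text after the last comma
#n is passed by keyword because recent pandas versions no longer accept it positionally
new=df["Location"].str.rsplit(',', n=1, expand=True) 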
df["Country of Launch"]= new[1]
df.head(3)

In [None]:
df['Country of Launch'].unique()
#spotting an issue with our approach: it looks like not all the location strings end in a country
#we will need to isolate these problem locations for further inspection

As you can see above, many countries came out as expected, but there are some issues. If you look closely, there is a 'country' named ' New Mexico', which we can safely assume belongs to ' USA'. There is also something called ' Pacific Missile Range Facility', which is less obvious, so we will have to do some independent research to find the proper Country of Launch value for entries like it. First we need to group the problem locations.

In [None]:
#Creating a list of the bad locations named badloc
badloc=[' Shahrud Missile Test Site',' New Mexico', ' Yellow Sea',
       ' Pacific Missile Range Facility', ' Pacific Ocean',
        ' Barents Sea', ' Gran Canaria']

#Creating a new df with only the bad locations so we can fix them 
df_badloc=df.loc[df['Country of Launch'].isin(badloc)]

print(df_badloc.count())

df_badloc.head(1)

#Looks like all the bad locations are now isolated and 
#we have 48 issue cells that will need correction 


This is a straightforward operation, but it posed a tougher problem than I expected. Upon further research, the observation ' Pacific Ocean' (in Country of Launch) turned out to be a converted floating oil-drilling platform owned and operated by [Sea Launch](https://en.wikipedia.org/wiki/Sea_Launch), a multinational organization with representation from the US private sector and Russia. From 1995 to 2010, ownership was split between Boeing (40%), the Russian company Energia (25%), the Norwegian company Aker Solutions (20%), and a Ukrainian company named Yuzhnoye (15%). In 2010, however, Russia purchased the vast majority of the company, bringing its total ownership to 95%, with Boeing and Aker Solutions each holding 2.5%. The company recently changed hands again and is now wholly owned by the Russian airline S7. Since it is hard to lump it into any one country, and the majority of its launches happened during the 1995-2010 window, it will remain its own observation for later examination.

In [None]:
#replacing all the badloc with the newly researched data

df['Country of Launch'] = df['Country of Launch'].replace([' Shahrud Missile Test Site'],' Iran')
df['Country of Launch'] = df['Country of Launch'].replace([' New Mexico'],' USA')
df['Country of Launch'] = df['Country of Launch'].replace([' Yellow Sea'],' China')
df['Country of Launch'] = df['Country of Launch'].replace([' Pacific Missile Range Facility'],' USA')
df['Country of Launch'] = df['Country of Launch'].replace([' Barents Sea'],' Russia')
df['Country of Launch'] = df['Country of Launch'].replace([' Gran Canaria'],' USA')
df['Country of Launch'] = df['Country of Launch'].replace([' Pacific Ocean'],' Sea Launch')
df.head(3)


If you have a good eye, you will notice an issue with the observations in the 'Country of Launch' column! We have successfully built a column of country names using just a handful of methods [.rsplit(), .replace(), .loc[], .isin()], but this left a whitespace character at the beginning of each country string. We will now need to remove this whitespace.

In [None]:
#Removing the white space in the country strings using the .str.lstrip() method,
#which strips leading whitespace only


df['Country of Launch']=df['Country of Launch'].str.lstrip()
df['Country of Launch'].unique()

Now we have a more useable column with no erroneous data for our 'Country of Launch'
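
For reference, the split, strip, and replace steps from this section can be condensed into one small helper. This is only a sketch on a made-up toy frame: the four location strings and the `fixes` lookup are illustrative stand-ins rather than the real data, and `country_of_launch` is a hypothetical helper name.

```python
import pandas as pd

# Toy stand-in for the 'Location' column (illustrative rows, not the real data)
toy = pd.DataFrame({"Location": [
    "LC-39A, Kennedy Space Center, Florida, USA",
    "Site 1/5, Baikonur Cosmodrome, Kazakhstan",
    "Spaceport America, New Mexico",
    "LP Odyssey, Kiritimati Launch Area, Pacific Ocean",
]})

# Hypothetical lookup for trailing tokens that are not countries
fixes = {"New Mexico": "USA", "Pacific Ocean": "Sea Launch"}


def country_of_launch(location: pd.Series, fixes: dict) -> pd.Series:
    """Take the text after the last comma, trim whitespace, then apply the fixes."""
    last = location.str.rsplit(",", n=1).str[-1].str.strip()
    return last.replace(fixes)


toy["Country of Launch"] = country_of_launch(toy["Location"], fixes)
print(toy["Country of Launch"].tolist())
```

Stripping before replacing means the lookup keys do not need the leading space that the raw split leaves behind.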

## 2.2 Organization Elaboration <a class="anchor" id="section_2_2"></a>

In [None]:
#Now let's do the same country-based grouping for the companies to see which nation is doing the best
print(df['Company Name'].nunique())
df['Company Name'].unique()

#After looking up the companies to find their country of origin and whether they are private or public,
#I ran into trouble with the following companies. This was normally because the acronym gave too little
#information, or the name was too short or too common to look up.

issue_comp=[ 'IRGC','i-Space','KCST',
       'KARI','MITT','EER','RAE', 'UT', 'AMBA',
       "Arme de l'Air"]  #uses the cleaned-up name, since the filter below runs after the rename
#now let's plug in our list of problem companies and return a new DF that will show 
#us some more information on the companies in question 

#during this process I found some near-duplicates: CASC and CASIC are both Chinese state-owned
#aerospace corporations, so both will be named CASIC for ease of viewing. 
df['Company Name']=df['Company Name'].replace(['CASC'],'CASIC') #fold CASC into the CASIC label

df['Company Name']=df['Company Name'].replace(["Arm??e de l'Air"],"Arme de l'Air") 
#Arme de l'Air (Armée de l'Air) is a French military organization and should not have the '?' inside of it. 
#Wrapping the string in double quotes lets it contain an apostrophe without escaping 

df_Company_issue=df.loc[df['Company Name'].isin(issue_comp)]
df_Company_issue.head(3)

Now it's time to do a little additional research to figure out where all these problem companies hail from. After about 30 minutes of googling and jotting things down, we are ready to insert each company's country of origin.

In [None]:
#Mapping every organization name to a country for our new column 'Companys Country of Origin'

df['Companys Country of Origin']=df['Company Name']
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['SpaceX'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['CASIC'],'China')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Roscosmos'],'Russia')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['ULA'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['JAXA'],'Japan')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Northrop'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['ExPace'],'China')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['IAI'],'Israel')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Rocket Lab'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Virgin Orbit'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['VKS RF'],'Russia')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['MHI'],'Japan')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['IRGC'],'Iran')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Arianespace'],'Multi')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['ISA'],'Israel')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Blue Origin'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['ISRO'],'India')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Exos'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['ILS'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['i-Space'],'China')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['OneSpace'],'China')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Landspace'],'China')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Eurockot'],'Germany')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Land Launch'],'Multi')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['CASIC'],'China')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['KCST'],'North Korea')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Sandia'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Kosmotras'],'Russia')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Khrunichev'],'Russia')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Sea Launch'],'Multi')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['KARI'],'South Korea')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['ESA'],'Multi')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['NASA'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Boeing'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['ISAS'],'Japan')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['SRC'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['MITT'],'Russia')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Lockheed'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['AEB'],'Brazil')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Starsem'],'Russia')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['RVSN USSR'],'Russia')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['EER'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['General Dynamics'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Martin Marietta'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Yuzhmash'],'Ukraine')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['Douglas'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['ASI'],'Italy')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['US Air Force'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['CNES'],'France')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['CECLES'],'Multi')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['RAE'],'England')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['UT'],'Japan')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['OKB-586'],'Ukraine')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['AMBA'],'USA')
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(["Arme de l'Air"],'France')  #no literal double quotes in the key, so it matches the cleaned company name
df['Companys Country of Origin']=df['Companys Country of Origin'].replace(['US Navy'],'USA')


#Now let's test to make sure it worked
print(df['Companys Country of Origin'].unique())
df

Another interesting column we are going to add is Private or State Run. This information was readily available while I was looking into each company's country of origin, so I jotted it down. This category does come with its issues, however. While setting up this column you run into organizations like Sea Launch, a multinational organization founded by Boeing and several other organizations, some of which are state-run. That makes Sea Launch a mix of private and state funding. Yet it is run as a private organization, so it will be deemed Private. The same reasoning applies to all companies.

In [None]:
df['Private or State Run']=df['Company Name']
df['Private or State Run']=df['Private or State Run'].replace(['SpaceX'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['CASIC'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Roscosmos'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['ULA'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['JAXA'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Northrop'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['ExPace'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['IAI'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Rocket Lab'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['Virgin Orbit'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['VKS RF'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['MHI'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['IRGC'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Arianespace'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['ISA'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Blue Origin'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['ISRO'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Exos'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['ILS'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['i-Space'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['OneSpace'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['Landspace'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['Eurockot'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Land Launch'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['CASIC'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['KCST'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Sandia'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Kosmotras'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Khrunichev'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Sea Launch'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['KARI'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['ESA'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['NASA'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Boeing'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['ISAS'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['SRC'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['MITT'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Lockheed'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['AEB'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Starsem'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['RVSN USSR'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['EER'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['General Dynamics'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['Martin Marietta'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['Yuzhmash'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['Douglas'],'P')
df['Private or State Run']=df['Private or State Run'].replace(['ASI'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['US Air Force'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['CNES'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['CECLES'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['RAE'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['UT'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['OKB-586'],'S')
df['Private or State Run']=df['Private or State Run'].replace(['AMBA'],'S')
df['Private or State Run']=df['Private or State Run'].replace(["Arme de l'Air"],'S')
df['Private or State Run']=df['Private or State Run'].replace(['US Navy'],'S')


#Now let's test to make sure it worked
print(df['Private or State Run'].unique())


I'm pretty sure there must be a better way to add this information that requires less typing on my end, but as of now this is the most sophisticated way I know to add information like this to a data frame. If you know of a better way, please comment below. I'm eager to learn a more effective workflow!!
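
One possible lower-typing approach, sketched here rather than offered as the definitive answer, is to keep the research in a single dictionary and derive both columns with `Series.map`. The `ORG_INFO` table below is a hypothetical, deliberately tiny excerpt of the lookups above, and 'Mystery Co' is a made-up name included to show what happens to an organization that is missing from the table.

```python
import pandas as pd

# Hypothetical excerpt of the research: name -> (country, 'P' private or 'S' state-run)
ORG_INFO = {
    "SpaceX": ("USA", "P"),
    "NASA": ("USA", "S"),
    "Roscosmos": ("Russia", "S"),
    "Arianespace": ("Multi", "P"),
}

toy = pd.DataFrame(
    {"Company Name": ["SpaceX", "NASA", "Roscosmos", "Arianespace", "Mystery Co"]}
)

# Split the single table into one lookup per target column
country = {name: info[0] for name, info in ORG_INFO.items()}
sector = {name: info[1] for name, info in ORG_INFO.items()}

toy["Companys Country of Origin"] = toy["Company Name"].map(country)
toy["Private or State Run"] = toy["Company Name"].map(sector)

# Names missing from the lookup come back as NaN, which makes gaps easy to spot
unmapped = toy.loc[toy["Companys Country of Origin"].isna(), "Company Name"].tolist()
print(unmapped)
```

Unlike a chain of `.replace()` calls, `.map()` never leaves the raw company name behind in the new column, so a forgotten organization shows up as a missing value instead of posing as a country.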

## 2.3 Date & Time Extraction <a class="anchor" id="section_2_3"></a>

Now we must break up the column currently known as Datum into its respective elements, mainly Year, Month, Day (of the month), Date, and Time. This is done with pandas rather than the datetime module imported at the top of this notebook:
pd.to_datetime() reads strings with date and time information and converts them to datetime values, and .dt.strftime() then formats each piece as a string.

In [None]:
df['DateTime']=df['Datum']
df.pop('Datum')
df['DateTime'] =  pd.to_datetime(df['DateTime'], infer_datetime_format=True,utc=True)
df['Year'] = df['DateTime'].dt.strftime('%Y')
df['Month'] = df['DateTime'].dt.strftime('%m')
df['Day'] = df['DateTime'].dt.strftime('%d')
df['Date'] = df['DateTime'].dt.strftime('%d/%m/%Y')
df['Time'] = df['DateTime'].dt.strftime('%H:%M')
df.head(1)
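
As a small self-contained illustration of the same idea, the sketch below parses two made-up timestamps and pulls out numeric parts with the `.dt` accessor instead of formatting strings. The exact `format` string is an assumption about how 'Datum' values are written, and `DayOfYear` is an extra column that the cell above does not create.

```python
import pandas as pd

# Made-up timestamps, assumed to be written in the same style as the 'Datum' column
toy = pd.DataFrame(
    {"Datum": ["Fri Aug 07, 2020 05:12 UTC", "Wed Jul 16, 1969 13:32 UTC"]}
)

# An explicit format avoids relying on format inference
toy["DateTime"] = pd.to_datetime(
    toy["Datum"], format="%a %b %d, %Y %H:%M UTC", utc=True
)

# .dt exposes numeric components directly, so no later pd.to_numeric is needed
toy["Year"] = toy["DateTime"].dt.year
toy["Month"] = toy["DateTime"].dt.month
toy["Day"] = toy["DateTime"].dt.day
toy["DayOfYear"] = toy["DateTime"].dt.dayofyear
toy["Time"] = toy["DateTime"].dt.strftime("%H:%M")

print(toy[["Year", "Month", "Day", "DayOfYear", "Time"]])
```

Keeping Year as an integer from the start means comparisons such as `Year >= 2000` work without a conversion step.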

Below I'm just saving this as a new CSV. I uploaded it as a new [Kaggle Dataset](https://www.kaggle.com/davidroberts13/one-small-step-for-data) in case anyone would like to use it.

In [None]:
df.to_csv('Global Space Launches.csv',index=False)

# 3.Visualizations <a class="anchor" id="chapter3"></a>


## 3.1 Top 5's and 10's <a class="anchor" id="section_3_1"></a>
*This first section of visualizations only takes into account the raw number of launches not the quality, size, intention, or cost. We will get to those that we can later on.* 

In [None]:
#Top 10 countries with the most launches [VISUALIZATION]
plt.style.use('seaborn-darkgrid')
sns.set(rc={'figure.figsize':(11.7,8.27)})
ax=df['Country of Launch'].value_counts()[:10].iplot(kind='bar',
                                                     xTitle='Country of Launch',
                                                     yTitle='Number of Launches',
                                                  title='Space Launch:<br>Top 10 Countries of Launch',
                                                    color='blue')

plt.pyplot.show()

Kazakhstan stands out here. It is not known as a technological giant, so there must be something more going on. Let's make the same graph, but by each company's country of origin.

In [None]:
#Top 5 countries with the most launches by company origin [VISUALIZATION]
plt.style.use('seaborn-darkgrid')
sns.set(rc={'figure.figsize':(11.7,8.27)})

ax=df['Companys Country of Origin'].value_counts()[:5].iplot(kind='bar',
                                                             xTitle="Company's Country of Origin",
                                                             yTitle='Number of Launches',
                                                  title ='Space Launch:<br>Top 5 Countries of Organizations',
                                                            color='green')
plt.pyplot.show()

We can see that our hunch about Kazakhstan was correct: all of its launches are attributed to Russian organizations. This lets us see the true relationship between the nations' launch counts.

Remember that Multi is a group of multinational organizations, best exemplified by the European Space Agency (ESA), an organization made up of 22 nations. 

In [None]:
#Top 10 Organizations with the most launches

plt.style.use('seaborn-darkgrid')
sns.set(rc={'figure.figsize':(11.7,8.27)})
ax=df['Company Name'].value_counts()[:10].iplot(kind='bar',
                                                xTitle='Company Name',
                                                yTitle='Number of Launches',
                                                  title='Space Launch:<br>Top 10 Organizations (all time)')

plt.pyplot.show()

Having launched more rockets than the 9 runners-up put together, the top organization makes Russia's determination to master space clear. 

In [None]:
df1=df[df["Companys Country of Origin"].isin(['Russia','USA','Multi','China','Japan'])]
plt.style.use('seaborn-darkgrid')
sns.set(rc={'figure.figsize':(11.7,8.27)})
ax=sns.countplot(x='Companys Country of Origin', hue='Private or State Run',data=df1)
ax.set_xlabel("Company's Country of Origin")
ax.set_ylabel('Number of Launches')
ax.set_title('Top 5 Countries by Total Launches, Private vs State-Run')
plt.pyplot.legend(loc="upper right", title='Private (P) or State-Run (S)')
plt.pyplot.show()

In [None]:
#Private vs State Volume of launches
plt.style.use('seaborn-darkgrid')
sns.set(rc={'figure.figsize':(11.7,8.27)})
ax=df['Private or State Run'].value_counts()[:5].iplot(kind='bar',
                                                       xTitle='Private vs State',
                                                       yTitle='Number of Launches',
                                                  title='Space Launch:<br>Private vs State',
                                                     color='purple')
plt.pyplot.show()

This would make us think that the private sector is still far behind its state-run counterpart but as we continue we will see this might not be the case. 

## 3.2 Bars Bars Bars <a class="anchor" id="section_3_2"></a>
*Let's take a look at how our top 5 countries perform in the new space race, roughly 1990 to the present. We will evaluate them using the 'Status Mission' column.*

In [None]:
#assign returns a new frame, which avoids a SettingWithCopyWarning on the filtered df1
df1=df1.assign(Year=pd.to_numeric(df1['Year']))
df1_1990 = df1[df1['Year']>=(1990)]
df1_1990.head(1)

In [None]:
sns.set_palette("magma",3)

grouped = df1_1990.groupby(['Companys Country of Origin','Status Mission'])['Companys Country of Origin'].count().unstack()
grouped = grouped.sort_index(ascending=False)  #sort_index is not in-place, so keep the result
grouped.iplot(kind='barh',xTitle='Number of Launches', yTitle='Top 5 Countries',
              title='Mission Status by Country (1990-2020)')
plt.pyplot.show()

We see that, regardless of nation, the vast majority of missions are successful.
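
To put a number on "high success rate", the counts can be turned into a rate per country. The sketch below runs on a made-up toy frame; the status labels it uses ('Success', 'Failure', 'Partial Failure') are assumed to match the values in the 'Status Mission' column.

```python
import pandas as pd

# Toy data: country of the launching organization and mission outcome (made up)
toy = pd.DataFrame({
    "Companys Country of Origin": ["USA", "USA", "USA", "USA", "Russia", "Russia"],
    "Status Mission": ["Success", "Success", "Success", "Failure",
                       "Success", "Partial Failure"],
})

# Rows are countries, columns are statuses, cells are launch counts
counts = pd.crosstab(toy["Companys Country of Origin"], toy["Status Mission"])

# Share of launches per country that ended in 'Success'
success_rate = (counts["Success"] / counts.sum(axis=1)).sort_values(ascending=False)
print(success_rate)
```

A rate makes countries with very different launch volumes directly comparable, which the stacked counts alone do not.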

## 3.3 Temporal Visualizations <a class="anchor" id="section_3_3"></a>
*We will now examine this information over time*

In [None]:
#Let's make a new DF that is home to our top 5 overall organizations by launch volume
df=df.reset_index()
df1=df.set_index('Company Name')
df1=df1.loc[['RVSN USSR','Arianespace','CASIC','General Dynamics','NASA']]
df1.head(1)

In [None]:
#Same top 5 selection, stored as df2 for the plots below
#(the earlier reset_index call on df already added the 'index' column used for counting)
df2=df.set_index('Company Name')
df2=df2.loc[['RVSN USSR','Arianespace','CASIC','General Dynamics','NASA']]
df2.tail(1)

In [None]:
#Viz of the Top 5 organizations by total launch count 
plt.style.use('seaborn-darkgrid')
sns.set_palette("tab20",5)
#iplot draws with plotly, so the matplotlib axes object (ax) is not passed in
df2.groupby(['Year','Company Name']).count()['index'].unstack().iplot(xTitle='Year',
                                                                      yTitle='Number of Launches',
                                                                     title='Top 5 Organizations by launches<br>Launches by Year',
                                                                     )
plt.pyplot.show()

We can now see the true volume of Russian launches in comparison to the US. I guess it goes to show that quality beats quantity. Remember that the first Space Race ran from 1955 to 1975. This is all interesting, but what we originally set out to do was explore the second Space Race, and all this old data is skewing our findings. Let's drill in and look specifically at the data after the turn of the century.

In [None]:
#Data frame post turn of the century 
df['Year']=pd.to_numeric(df['Year'])
df_2000 = df[df['Year']>=(2000)]
df_2000.head(1)

In [None]:
#A visualization showing how each country's organizations compare in total launch output
#sns.set(style="white", context="talk")
plt.style.use('seaborn-darkgrid')
sns.set_palette("tab20",12)
df_2000.groupby(['Year','Companys Country of Origin']).count()['index'].unstack().iplot(xTitle='Year',
                                                                                       yTitle='Number of Launches',
                                                                                       title='Launches by Country<br>by Year',
                                                                                      )
plt.pyplot.show()

In [None]:
plt.style.use('seaborn-darkgrid')
sns.set_palette("viridis",2)
df.groupby(['Year','Private or State Run']).count()['index'].unstack().iplot(xTitle='Year',
                                                                            yTitle='Number of Launches',
                                                                            title='Private vs State-Run Organizations<br>Launches by Year'
                                                                            )

plt.pyplot.show()

We can see a trend start to become established as we cross into the 21st century. Private launches start to win out, and this new space race between the two sectors starts to take off around 2015. That is, until efforts came to a halt recently, most likely due to the Covid-19 pandemic. 
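
One way to check the "private launches start to win out" reading is to look at the private share of launches per year instead of raw counts. This is a sketch on made-up numbers, assuming the 'P'/'S' coding introduced earlier and a numeric Year column.

```python
import pandas as pd

# Made-up toy data: one row per launch
toy = pd.DataFrame({
    "Year": [2000, 2000, 2000, 2000, 2019, 2019, 2019, 2019],
    "Private or State Run": ["S", "S", "S", "P", "P", "P", "P", "S"],
})

# normalize="index" turns each year's counts into fractions that sum to 1
share = pd.crosstab(toy["Year"], toy["Private or State Run"], normalize="index")
private_share = share["P"]
print(private_share)
```

A share above 0.5 in a given year means private organizations launched more often than state-run ones that year, regardless of how many launches there were overall.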

In [None]:
plt.style.use('seaborn-darkgrid')
sns.set_palette("plasma",2)
df_2000.groupby(['Year','Private or State Run']).count()['index'].unstack().iplot(xTitle='Year',
                                                                            yTitle='Number of Launches',
                                                                            title='Private vs State-Run Organizations<br>Launches by Year'
                                                                            )

plt.pyplot.show()

As we drill down into the recent years, we see a large uptick in launches that was unfortunately nipped in the bud by the pandemic. On the current trajectory, we are not sure when this race will get back underway, but just because we don't see launches right now doesn't mean that the teams involved aren't working tirelessly to bring us the next wave of technological wonderment. 

# 4. Conclusion <a class="anchor" id="chapter4"></a>

All in all, we found a rapidly growing trend in the private sector, mostly since the turn of the century. Even more recently, in the last 5 years, we have seen a substantial uptick in overall space efforts. Unfortunately, this growing intensity was quenched by the coronavirus pandemic imposing limitations on the industry, but I think it is safe to say we will continue to see this new space race blossom over the coming years.

I just want to say thank you, and I appreciate you if you made it this far through my first Kaggle Dataset. I look forward to many more. Again, I'm a certified newbie trying to learn as fast as I can, so all comments and suggestions are encouraged!!

-Have a good one