# SpaceX Falcon 9 First Stage Landing Prediction
This notebook predicts whether the Falcon 9 first stage will land successfully. SpaceX advertised Falcon 9 rocket launches on its website at a cost of 62 million dollars. Because a first stage that lands can be reused, determining whether it will land lets us estimate the cost of a launch.

## 1. Data Collection
- Request launch data from the SpaceX API
- Clean and preprocess the requested data

In [5]:
## Imports
import requests 
import pandas as pd 
import numpy as np 
import datetime


## We need to use the API again to gather specific data for each ID number
1. getBoosterVersion --> Get information on the rocket booster names
2. getLaunchSite --> Get the name of the launch site being used, its longitude, and its latitude
3. getPayloadData --> Get the mass of each payload and the orbit it is going to
4. getCoreData --> Get data on the first-stage core (landing outcome, reuse, block, and serial)

* **From <code>cores</code> we would like to learn the outcome of the landing, the type of landing, the number of flights with that core, whether grid fins were used, whether the core is reused, whether legs were used, the landing pad used, the block of the core (a number used to separate versions of cores), the number of times this specific core has been reused, and the serial of the core.**
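
The outcome label stored for each launch is the landing success flag and the landing type joined into one string. A minimal sketch of that step on a hand-written core entry follows; the sample values and the helper name `outcome_label` are illustrative assumptions, not real API output.

```python
def outcome_label(core: dict) -> str:
    """Join landing success and landing type into a single label."""
    return f"{core['landing_success']} {core['landing_type']}"


# Hypothetical core entry using the keys that getCoreData reads.
sample_core = {
    "core": "example-core-id",
    "flight": 2,
    "gridfins": True,
    "legs": True,
    "reused": True,
    "landing_success": True,
    "landing_type": "ASDS",
    "landpad": "example-landpad-id",
}

print(outcome_label(sample_core))
# A launch with no landing attempt carries None in both fields.
print(outcome_label({"landing_success": None, "landing_type": None}))
```

Keeping both fields in one string makes it easy to count distinct outcomes later, at the cost of having to split the label again to recover the success flag.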

In [44]:
# Takes the dataset and uses the rocket column to call the API and append the data to the list
def getBoosterVersion(data):
    for x in data['rocket']:
        # Skip rows with no rocket ID
        if x:
            response = requests.get("https://api.spacexdata.com/v4/rockets/"+str(x)).json()
            BoosterVersion.append(response['name'])

In [45]:
# Takes the dataset and uses the launchpad column to call the API and append the data to the list
def getLaunchSite(data):
    for x in data['launchpad']:
        # Skip rows with no launchpad ID
        if x:
            response = requests.get("https://api.spacexdata.com/v4/launchpads/"+str(x)).json()
            Longitude.append(response['longitude'])
            Latitude.append(response['latitude'])
            LaunchSite.append(response['name'])

In [46]:
# Takes the dataset and uses the payloads column to call the API and append the data to the lists
def getPayloadData(data):
    for load in data['payloads']:
        # Skip rows with no payload ID
        if load:
            response = requests.get("https://api.spacexdata.com/v4/payloads/"+load).json()
            PayloadMass.append(response['mass_kg'])
            Orbit.append(response['orbit'])

In [47]:
# Takes the dataset and uses the cores column to call the API and append the data to the lists
def getCoreData(data):
    for core in data['cores']:
        if core['core'] is not None:
            # Details that live on the core itself come from the cores endpoint
            response = requests.get("https://api.spacexdata.com/v4/cores/"+core['core']).json()
            Block.append(response['block'])
            ReusedCount.append(response['reuse_count'])
            Serial.append(response['serial'])
        else:
            # Keep the lists aligned when the launch has no core ID
            Block.append(None)
            ReusedCount.append(None)
            Serial.append(None)
        # Per-launch details are already present in the launch record
        Outcome.append(str(core['landing_success'])+' '+str(core['landing_type']))
        Flights.append(core['flight'])
        GridFins.append(core['gridfins'])
        Reused.append(core['reused'])
        Legs.append(core['legs'])
        LandingPad.append(core['landpad'])
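
The helpers above append to module-level lists and issue one request per row, so a skipped row can leave the lists out of step with the DataFrame. One possible alternative is sketched below for the booster lookup only: it returns a list, caches repeated IDs, and keeps a `None` placeholder for missing IDs. The names `fetch_json` and `booster_versions`, the cache, and the 10-second timeout are assumptions and are not part of the original notebook.

```python
from typing import Callable, Iterable, List, Optional

API_ROOT = "https://api.spacexdata.com/v4"


def fetch_json(url: str, timeout: float = 10.0) -> dict:
    """Fetch a URL and return the decoded JSON body, raising on HTTP errors."""
    import requests  # imported lazily so booster_versions can be tested offline

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.json()


def booster_versions(
    rocket_ids: Iterable[Optional[str]],
    fetch: Callable[[str], dict] = fetch_json,
) -> List[Optional[str]]:
    """Return one booster name per rocket ID, with None where the ID is missing."""
    cache = {}
    names = []
    for rocket_id in rocket_ids:
        if not rocket_id:
            names.append(None)  # placeholder keeps the result aligned with the rows
            continue
        if rocket_id not in cache:
            # Each distinct rocket ID is requested only once.
            cache[rocket_id] = fetch(f"{API_ROOT}/rockets/{rocket_id}")["name"]
        names.append(cache[rocket_id])
    return names
```

With this variant the call would read `BoosterVersion = booster_versions(data['rocket'])`. Passing a stand-in for `fetch` allows the function to be exercised without network access.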

In [12]:
spacex_url = "https://api.spacexdata.com/v4/launches/past"

In [13]:
response = requests.get(spacex_url)

In [None]:
print(response.content)

## Request and parse the SpaceX launch data using the GET request

In [15]:
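## To make the requested JSON results more consistent, we will use the following static response object for this project
static_json_url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DS0321EN-SkillsNetwork/datasets/API_call_spacex_api.json"
## Request the static copy so that the cells below parse it instead of the live API response
response = requests.get(static_json_url)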

In [16]:
response.status_code

200

In [17]:
data = pd.json_normalize(response.json())
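
`pd.json_normalize` flattens nested dictionaries into dotted column names but leaves lists as they are, which is why `cores` and `payloads` still need unpacking in a later cell. A small illustration on made-up records follows; the sample values are assumptions shaped loosely like the API response, not real output.

```python
import pandas as pd

# Two made-up launch records.
sample_launches = [
    {
        "flight_number": 1,
        "rocket": "rocket-a",
        "payloads": ["payload-a"],
        "cores": [{"core": "core-a", "flight": 1}],
        "links": {"webcast": "https://example.com/1"},
    },
    {
        "flight_number": 2,
        "rocket": "rocket-b",
        "payloads": ["payload-b", "payload-c"],
        "cores": [{"core": "core-b", "flight": 1}],
        "links": {"webcast": "https://example.com/2"},
    },
]

sample = pd.json_normalize(sample_launches)

print(sample.columns.tolist())       # the nested "links" dict becomes "links.webcast"
print(type(sample.loc[0, "cores"]))  # list-valued fields stay as Python lists
```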

In [23]:
data.head(5)

Unnamed: 0,rocket,payloads,launchpad,cores,flight_number,date_utc,data


In [25]:
# Let's take a subset of our dataframe, keeping only the features we want plus flight_number and date_utc.
data = data[['rocket', 'payloads', 'launchpad', 'cores', 'flight_number', 'date_utc']]

# Remove rows with multiple cores (Falcon rockets with two extra boosters) and rows with multiple payloads on a single rocket.
data = data[data['cores'].map(len)==1]
data = data[data['payloads'].map(len)==1]

# Since payloads and cores are now lists of size 1, extract the single value from each list and replace the feature with it.
data['cores'] = data['cores'].map(lambda x : x[0])
data['payloads'] = data['payloads'].map(lambda x : x[0])

# Convert date_utc to a datetime and keep only the date, dropping the time
data['date'] = pd.to_datetime(data['date_utc']).dt.date

# Using the date we will restrict the dates of the launches
data = data[data['date'] <= datetime.date(2020, 11, 13)]
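
The effect of these filters is easier to see on a toy frame. The sketch below applies the same steps to three made-up rows (all values are assumptions); only the first row has exactly one core and one payload.

```python
import datetime

import pandas as pd

toy = pd.DataFrame({
    "flight_number": [1, 2, 3],
    "cores": [[{"core": "a"}], [{"core": "b"}, {"core": "c"}], [{"core": "d"}]],
    "payloads": [["p1"], ["p2"], ["p3", "p4"]],
    "date_utc": [
        "2020-01-01T12:00:00.000Z",
        "2020-02-01T12:00:00.000Z",
        "2020-03-01T12:00:00.000Z",
    ],
})

# Keep rows whose lists hold exactly one element, then unpack that element.
single = toy[(toy["cores"].map(len) == 1) & (toy["payloads"].map(len) == 1)].copy()
single["cores"] = single["cores"].map(lambda x: x[0])
single["payloads"] = single["payloads"].map(lambda x: x[0])

# Drop the time of day and apply the same cutoff date as the notebook.
single["date"] = pd.to_datetime(single["date_utc"]).dt.date
single = single[single["date"] <= datetime.date(2020, 11, 13)]

print(single[["flight_number", "cores", "payloads", "date"]])
```

The `.copy()` after filtering is a choice made for this sketch: it makes the later column assignments operate on an independent frame rather than on a slice of `toy`.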

In [49]:
BoosterVersion = []
PayloadMass = []
Orbit = []
LaunchSite = []
Outcome = []
Flights = []
GridFins = []
Reused = []
Legs = []
LandingPad = []
Block = []
ReusedCount = []
Serial = []
Longitude = []
Latitude = []

In [50]:
BoosterVersion

[]

In [51]:
getBoosterVersion(data)

In [52]:
BoosterVersion[0:5]

[]

In [53]:
getLaunchSite(data)

In [54]:
getPayloadData(data)

In [55]:
getCoreData(data)

In [56]:
launch_dict = {
    'FlightNumber': list(data['flight_number']),
    'Date': list(data['date']),
    'BoosterVersion': BoosterVersion,
    'PayloadMass': PayloadMass,
    'Orbit': Orbit,
    'LaunchSite': LaunchSite,
    'Outcome': Outcome,
    'Flights': Flights,
    'GridFins': GridFins,
    'Reused': Reused,
    'Legs': Legs,
    'LandingPad': LandingPad,
    'Block': Block,
    'ReusedCount': ReusedCount,
    'Serial': Serial,
    'Longitude': Longitude,
    'Latitude': Latitude,
}
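
`pd.DataFrame` raises a `ValueError` when the lists in a dictionary have different lengths, which can happen here if one of the helpers skips a row. A small check before building the frame might look like the following; the helper name `length_mismatches` and the example values are hypothetical.

```python
def length_mismatches(columns: dict) -> dict:
    """Return {name: length} for columns whose length differs from the first column."""
    lengths = {name: len(values) for name, values in columns.items()}
    if not lengths:
        return {}
    expected = next(iter(lengths.values()))
    return {name: n for name, n in lengths.items() if n != expected}


# Made-up columns: BoosterVersion is one entry short.
example = {
    "FlightNumber": [1, 2, 3],
    "BoosterVersion": ["Falcon 9", "Falcon 9"],
    "Orbit": ["LEO", "ISS", "GTO"],
}

print(length_mismatches(example))
```

Calling `length_mismatches(launch_dict)` and getting an empty dictionary back would indicate that every list lines up with `FlightNumber`.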

In [57]:
df = pd.DataFrame(launch_dict)

In [58]:
df.head()

Unnamed: 0,FlightNumber,Date,BoosterVersion,PayloadMass,Orbit,LaunchSite,Outcome,Flights,GridFins,Reused,Legs,LandingPad,Block,ReusedCount,Serial,Longitude,Latitude


## Filtering the DataFrame to only include Falcon 9 Launches

In [59]:
# BoosterVersion is a column of df (built from launch_dict), not of data
data_falcon9 = df[df["BoosterVersion"] != "Falcon 1"].copy()

In [60]:
data_falcon9.loc[:, "FlightNumber"] = list(range(1, data_falcon9.shape[0]+1))
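
The filter-and-renumber step can be tried on a toy frame first. This is a sketch with made-up values: it drops the Falcon 1 rows and then resets `FlightNumber` so that it counts from 1 again.

```python
import pandas as pd

toy = pd.DataFrame({
    "FlightNumber": [1, 4, 6, 8],
    "BoosterVersion": ["Falcon 1", "Falcon 1", "Falcon 9", "Falcon 9"],
})

# Copy the filtered rows so the assignment below does not touch `toy`.
falcon9 = toy[toy["BoosterVersion"] != "Falcon 1"].copy()
falcon9.loc[:, "FlightNumber"] = list(range(1, falcon9.shape[0] + 1))

print(falcon9)
```

Note that the row index still carries the original positions (2 and 3 here); only the `FlightNumber` column is renumbered.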

## Data Wrangling

In [61]:
data_falcon9.isnull().sum()

In [62]:
# Replace missing payload masses with the mean payload mass
payload_mass_mean = data_falcon9["PayloadMass"].mean()
data_falcon9["PayloadMass"] = data_falcon9["PayloadMass"].fillna(payload_mass_mean)
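
Mean imputation is easy to verify on a tiny example. In the sketch below (values are made up), `Series.mean()` skips the missing entry, so the mean of 1000 and 3000 fills the gap.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({"PayloadMass": [1000.0, np.nan, 3000.0]})

mean_mass = toy["PayloadMass"].mean()  # NaN values are ignored by default
toy["PayloadMass"] = toy["PayloadMass"].fillna(mean_mass)

print(toy)
```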

In [63]:
data_falcon9.to_csv("dataset_part_1.csv", index=False)