# **Predictive Modeling for Rocket Landing Success**  
### *A Machine Learning Approach Using SpaceX Falcon 9 Data* 

## **Data Collection API**

### Installing Essential Python Libraries

**`requests`**: Simplifies sending HTTP requests.

**`pandas`**: Tools for data analysis and manipulation.

**`numpy`**: Library for working with arrays and performing mathematical operations.

In [68]:
# !pip install requests 
# !pip install pandas
# !pip install numpy

### Import Libraries

In [69]:
import requests
import pandas as pd
import numpy as np

import datetime
# Why No `pip install` for `datetime`?
# It's a standard Python library, which means it is included with Python by default. 
# There's no need to install it using `pip`, just import it directly and use it.


### Setting to Display All Columns And Data in a Dataframe

In [70]:
# Setting this option will print all columns of a dataframe
pd.set_option('display.max_columns', None)

# Setting this option will display the full contents of each cell instead of truncating long values
pd.set_option('display.max_colwidth', None)

### Define Auxiliary Functions that help us use the SpaceX API and extract data

#### Function to Get Booster Version from Dataset

The function below extracts the booster name for each entry in the `rocket` column of a dataset and appends it to the global `BoosterVersion` list, which must be defined before the function is called. It works as follows:

1. **Iterates through each value in the `rocket` column**: For each rocket ID, it sends a request to the SpaceX API.
2. **Makes a request to the SpaceX API**: Uses the rocket identifier to fetch additional information about the rocket.
3. **Extracts the booster name**: The booster name is retrieved from the JSON response of the API.
4. **Appends the booster name to the list**: The booster name is added to the `BoosterVersion` list.

In [71]:
def getBoosterVersion(data):
    for x in data['rocket']:
        if x:
            response = requests.get("https://api.spacexdata.com/v4/rockets/"+str(x)).json()
            BoosterVersion.append(response['name'])
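
Because many launches share the same rocket ID, the lookup above sends identical requests repeatedly. The following is a minimal sketch, not the notebook's method, of a variant that requests each distinct ID once and returns a list instead of appending to a global. The names `booster_names`, `fetch_json`, and `fake_fetch`, and the IDs `rocket-a` and `rocket-b`, are hypothetical; `fake_fetch` stands in for the API so the example runs offline.

```python
from typing import Callable, Dict, Iterable, List, Optional

ROCKETS_URL = "https://api.spacexdata.com/v4/rockets/"


def booster_names(
    rocket_ids: Iterable[Optional[str]],
    fetch_json: Callable[[str], dict],
) -> List[Optional[str]]:
    """Return one booster name per rocket ID, requesting each distinct ID once."""
    cache: Dict[str, str] = {}
    names: List[Optional[str]] = []
    for rocket_id in rocket_ids:
        if not rocket_id:
            names.append(None)  # keep the output aligned with the input rows
            continue
        if rocket_id not in cache:
            cache[rocket_id] = fetch_json(ROCKETS_URL + str(rocket_id))["name"]
        names.append(cache[rocket_id])
    return names


# Hypothetical offline stand-in for the API (assumption: not real SpaceX IDs).
FAKE_ROCKETS = {"rocket-a": {"name": "Falcon 1"}, "rocket-b": {"name": "Falcon 9"}}
requested = []


def fake_fetch(url: str) -> dict:
    requested.append(url)
    return FAKE_ROCKETS[url.rsplit("/", 1)[-1]]


names = booster_names(["rocket-a", "rocket-a", None, "rocket-b"], fake_fetch)
print(names)           # four entries, one per input row
print(len(requested))  # number of simulated requests
```

In the notebook, `fetch_json` could be `lambda url: requests.get(url).json()`. Unlike `getBoosterVersion`, this sketch appends `None` for a missing ID so the result keeps one entry per row.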

#### Function to Get Launch Site Information from Dataset

This code block defines a function that extracts launch site information using the `launchpad` identifier from the dataset. The function works as follows:

1. **Iterates through each value in the `launchpad` column**: For each launchpad ID, it sends a request to the SpaceX API.
2. **Makes a request to the SpaceX API**: Uses the launchpad identifier to fetch additional information about the launch site.
3. **Extracts relevant information**: The launch site name, longitude, and latitude are retrieved from the API's JSON response.
4. **Appends the data to the lists**: The launch site name, longitude, and latitude are added to the `LaunchSite`, `Longitude`, and `Latitude` lists, respectively.

In [72]:
def getLaunchSite(data):
    for x in data['launchpad']:
        if x:
            response = requests.get("https://api.spacexdata.com/v4/launchpads/"+str(x)).json()
            Longitude.append(response['longitude'])
            Latitude.append(response['latitude'])
            LaunchSite.append(response['name'])



#### Function to Fetch Payload Data

The `getPayloadData` function uses the `payloads` column of a dataset to call the SpaceX API and fetch information about the payload associated with each launch. The extracted data is stored in the global lists `PayloadMass` and `Orbit`. The function works as follows:

1. **Iterates through each value in the `payloads` column**: Each value is the unique identifier of a payload.
2. **Makes a request to the SpaceX API**: For each identifier, an HTTP GET request is sent to the base URL `https://api.spacexdata.com/v4/payloads/` followed by the identifier.
3. **Extracts the payload data**: The JSON response from the API is processed to extract:
   - `mass_kg`: Payload mass in kilograms, added to the `PayloadMass` list.
   - `orbit`: Orbital destination of the payload, added to the `Orbit` list.

In [73]:
# Function to fetch payload data
def getPayloadData(data):
    for load in data['payloads']:
        if load:
            response = requests.get("https://api.spacexdata.com/v4/payloads/" + load).json()
            PayloadMass.append(response['mass_kg'])
            Orbit.append(response['orbit'])


#### Function to Retrieve Rocket Core Data

This code block defines a function that extracts and stores additional data related to rocket cores from the SpaceX API. It uses the `cores` column from a dataset to retrieve detailed information about each rocket core. The function works as follows:

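1. **Checks if a core ID exists**: If the entry has a valid identifier (`core['core']`), a request is made to the SpaceX API; otherwise `None` is appended to `Block`, `ReusedCount`, and `Serial` so the lists stay aligned with the rows.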
2. **Fetches core information**: The API is called to retrieve details about the core, such as the block, reuse count, and serial number.
3. **Appends data to lists**: The fetched data is added to the corresponding lists:
   - `Block`: The block number, indicating the core's version.
   - `ReusedCount`: The number of times the core has been reused.
   - `Serial`: The serial number of the core.
4. **Extracts additional information from the `cores` column**:
   - **Outcome**: The landing outcome (success and type).
   - **Flights**: The number of flights for that core.
   - **GridFins**: Whether grid fins (aerodynamic control surfaces) were used.
   - **Reused**: Whether the core has been reused.
   - **Legs**: Whether legs were used for landing.
   - **LandingPad**: The identifier of the landing pad used.

In [119]:
# Takes the dataset and uses the cores column to call the API and append the data to the lists
def getCoreData(data):
    for core in data['cores']:
        if core['core'] is not None:
            response = requests.get("https://api.spacexdata.com/v4/cores/"+core['core']).json()
            Block.append(response['block'])
            ReusedCount.append(response['reuse_count'])
            Serial.append(response['serial'])
        else:
            Block.append(None)
            ReusedCount.append(None)
            Serial.append(None)
        Outcome.append(str(core['landing_success'])+' '+str(core['landing_type']))
        Flights.append(core['flight'])
        GridFins.append(core['gridfins'])
        Reused.append(core['reused'])
        Legs.append(core['legs'])
        LandingPad.append(core['landpad'])
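
The fields in step 4 come straight from the core dictionary and need no API call. As a minimal sketch, the hypothetical helper `core_fields` below shows how one `cores` entry maps to those fields, including how the `Outcome` string is built. The first dictionary is copied from the first launch in the dataset; the second uses illustrative values and is an assumption.

```python
def core_fields(core: dict) -> dict:
    """Map one entry of the `cores` column to the fields read directly from it."""
    return {
        # Same string as str(landing_success) + ' ' + str(landing_type)
        "Outcome": f"{core['landing_success']} {core['landing_type']}",
        "Flights": core["flight"],
        "GridFins": core["gridfins"],
        "Reused": core["reused"],
        "Legs": core["legs"],
        "LandingPad": core["landpad"],
    }


# Core entry of the first launch in the dataset (no landing attempt).
first_core = {
    "core": "5e9e289df35918033d3b2623", "flight": 1, "gridfins": False,
    "legs": False, "reused": False, "landing_attempt": False,
    "landing_success": None, "landing_type": None, "landpad": None,
}

# Illustrative entry (assumption: made-up values for a successful landing).
landed_core = dict(first_core, landing_attempt=True, landing_success=True,
                   landing_type="ASDS", gridfins=True, legs=True)

first_record = core_fields(first_core)
landed_record = core_fields(landed_core)
print(first_record["Outcome"])   # None None
print(landed_record["Outcome"])  # True ASDS
```

A launch without a landing attempt therefore yields the outcome string `None None`, which has to be treated as its own category when the outcome is later turned into a label.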


### Define the url to access data

In [75]:
spacex_url="https://api.spacexdata.com/v4/launches/past"

In [103]:
import requests

spacex_url = "https://api.spacexdata.com/v4/launches/past"
response = requests.get(spacex_url)

target_id = "5eb87d4dffd86e000604b38e"

# Find the launch with the specified id
target_launch = None
for launch in response.json():
    if launch['id'] == target_id:
        target_launch = launch
        break

print(target_launch)



{'fairings': None, 'links': {'patch': {'small': 'https://images2.imgbox.com/98/cc/UJd0SS73_o.png', 'large': 'https://images2.imgbox.com/03/3d/LzQWXPfy_o.png'}, 'reddit': {'campaign': 'https://www.reddit.com/r/spacex/comments/iwb8bl/crew1_launch_campaign_thread/', 'launch': 'https://www.reddit.com/r/spacex/comments/ju7fxv/rspacex_crew1_official_launch_coast_docking/', 'media': 'https://www.reddit.com/r/spacex/comments/judv0r/rspacex_crew1_media_thread_photographer_contest/', 'recovery': None}, 'flickr': {'small': [], 'original': ['https://live.staticflickr.com/65535/50618376646_8f52c31fc4_o.jpg', 'https://live.staticflickr.com/65535/50618376731_43ddaab1b8_o.jpg', 'https://live.staticflickr.com/65535/50618376671_ba4e60af7c_o.jpg', 'https://live.staticflickr.com/65535/50618376351_ecfdee4ab2_o.jpg', 'https://live.staticflickr.com/65535/50618727917_01e579c4d9_o.jpg', 'https://live.staticflickr.com/65535/50618355216_2872d1fe98_o.jpg', 'https://live.staticflickr.com/65535/50618354801_ff3e7228

#### Verifying Request Success with Status Code 200

In [104]:
response.status_code

200

### Decode Response and Convert It to a Pandas DataFrame
The following cells process the response returned by the SpaceX API:

1. **Request to the API**: A request has already been sent to the specified URL to fetch the data.
2. **Status Code Check**: The request is considered successful if the response status code is 200.
3. **Decode the Response**: If the request is successful, the response body is decoded from JSON into Python objects using `.json()`.
4. **Convert to DataFrame**: The decoded response (a list of launch records) is then converted into a Pandas DataFrame using `pd.json_normalize()`, which flattens nested fields into columns and makes the data easier to manipulate and analyze.

In [105]:
if response.status_code == 200:
    response = response.json()
    #print(response)
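
Note that when the status code is not 200, the cell above leaves `response` undecoded and the next cell fails with a less obvious error. A minimal sketch of a stricter check is shown here; `decode_launches` and `FakeResponse` are hypothetical names, and `FakeResponse` only imitates the `status_code` attribute and `json()` method of a `requests` response so the example runs offline.

```python
class FakeResponse:
    """Hypothetical stand-in exposing the two members the notebook uses."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload


def decode_launches(response):
    """Return the decoded body, or fail loudly if the request did not succeed."""
    if response.status_code != 200:
        raise RuntimeError(
            f"SpaceX API request failed with status {response.status_code}"
        )
    return response.json()


launches = decode_launches(FakeResponse(200, [{"flight_number": 1}]))
print(launches)

try:
    decode_launches(FakeResponse(503, None))
    failure = None
except RuntimeError as exc:
    failure = str(exc)
print(failure)
```

With a real `requests` response, calling `response.raise_for_status()` before `response.json()` gives a similar guarantee.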

In [106]:
data = pd.json_normalize(response)
data.head(3)

Unnamed: 0,static_fire_date_utc,static_fire_date_unix,net,window,rocket,success,failures,details,crew,ships,capsules,payloads,launchpad,flight_number,name,date_utc,date_unix,date_local,date_precision,upcoming,cores,auto_update,tbd,launch_library_id,id,fairings.reused,fairings.recovery_attempt,fairings.recovered,fairings.ships,links.patch.small,links.patch.large,links.reddit.campaign,links.reddit.launch,links.reddit.media,links.reddit.recovery,links.flickr.small,links.flickr.original,links.presskit,links.webcast,links.youtube_id,links.article,links.wikipedia,fairings
0,2006-03-17T00:00:00.000Z,1142554000.0,False,0.0,5e9d0d95eda69955f709d1eb,False,"[{'time': 33, 'altitude': None, 'reason': 'merlin engine failure'}]",Engine failure at 33 seconds and loss of vehicle,[],[],[],[5eb0e4b5b6c3bb0006eeb1e1],5e9e4502f5090995de566f86,1,FalconSat,2006-03-24T22:30:00.000Z,1143239400,2006-03-25T10:30:00+12:00,hour,False,"[{'core': '5e9e289df35918033d3b2623', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}]",True,False,,5eb87cd9ffd86e000604b32a,False,False,False,[],https://images2.imgbox.com/94/f2/NN6Ph45r_o.png,https://images2.imgbox.com/5b/02/QcxHUb5V_o.png,,,,,[],[],,https://www.youtube.com/watch?v=0a_00nJ_Y88,0a_00nJ_Y88,https://www.space.com/2196-spacex-inaugural-falcon-1-rocket-lost-launch.html,https://en.wikipedia.org/wiki/DemoSat,
1,,,False,0.0,5e9d0d95eda69955f709d1eb,False,"[{'time': 301, 'altitude': 289, 'reason': 'harmonic oscillation leading to premature engine shutdown'}]","Successful first stage burn and transition to second stage, maximum altitude 289 km, Premature engine shutdown at T+7 min 30 s, Failed to reach orbit, Failed to recover first stage",[],[],[],[5eb0e4b6b6c3bb0006eeb1e2],5e9e4502f5090995de566f86,2,DemoSat,2007-03-21T01:10:00.000Z,1174439400,2007-03-21T13:10:00+12:00,hour,False,"[{'core': '5e9e289ef35918416a3b2624', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}]",True,False,,5eb87cdaffd86e000604b32b,False,False,False,[],https://images2.imgbox.com/f9/4a/ZboXReNb_o.png,https://images2.imgbox.com/80/a2/bkWotCIS_o.png,,,,,[],[],,https://www.youtube.com/watch?v=Lk4zQ2wP-Nc,Lk4zQ2wP-Nc,https://www.space.com/3590-spacex-falcon-1-rocket-fails-reach-orbit.html,https://en.wikipedia.org/wiki/DemoSat,
2,,,False,0.0,5e9d0d95eda69955f709d1eb,False,"[{'time': 140, 'altitude': 35, 'reason': 'residual stage-1 thrust led to collision between stage 1 and stage 2'}]",Residual stage 1 thrust led to collision between stage 1 and stage 2,[],[],[],"[5eb0e4b6b6c3bb0006eeb1e3, 5eb0e4b6b6c3bb0006eeb1e4]",5e9e4502f5090995de566f86,3,Trailblazer,2008-08-03T03:34:00.000Z,1217734440,2008-08-03T15:34:00+12:00,hour,False,"[{'core': '5e9e289ef3591814873b2625', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}]",True,False,,5eb87cdbffd86e000604b32c,False,False,False,[],https://images2.imgbox.com/6c/cb/na1tzhHs_o.png,https://images2.imgbox.com/4a/80/k1oAkY0k_o.png,,,,,[],[],,https://www.youtube.com/watch?v=v0w9p3U8860,v0w9p3U8860,http://www.spacex.com/news/2013/02/11/falcon-1-flight-3-mission-summary,https://en.wikipedia.org/wiki/Trailblazer_(satellite),
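
The dotted column names in this output, such as `links.patch.small`, come from flattening nested dictionaries, while list values such as `cores` are kept whole. The following pure-Python sketch imitates that behavior for a single record to make it concrete. It is an illustration only, not pandas' implementation; the function `flatten` and the sample values are assumptions.

```python
def flatten(record: dict, parent: str = "", sep: str = ".") -> dict:
    """Flatten nested dicts into dotted keys; leave lists and scalars untouched."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Illustrative launch record (assumption: shortened, made-up values).
launch = {
    "flight_number": 1,
    "fairings": None,
    "links": {"patch": {"small": "s.png", "large": "l.png"}},
    "cores": [{"core": "c1", "flight": 1}],
}

flat_launch = flatten(launch)
for column, value in flat_launch.items():
    print(column, "->", value)
```

This also suggests why the table has both `fairings.*` columns and a plain `fairings` column: a record whose `fairings` value is a dictionary is expanded, while a record whose value is `None` is kept under the original name.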


### SpaceX Launch Data Preprocessing



#### Feature Selection

In [107]:
# A subset of the DataFrame is created, keeping only the necessary columns, 
# such as `rocket`, `payloads`, `launchpad`, `cores`, `flight_number`, and `date_utc`.

data = data[['rocket', 'payloads', 'launchpad', 'cores', 'flight_number', 'date_utc']]
data.head(2)

Unnamed: 0,rocket,payloads,launchpad,cores,flight_number,date_utc
0,5e9d0d95eda69955f709d1eb,[5eb0e4b5b6c3bb0006eeb1e1],5e9e4502f5090995de566f86,"[{'core': '5e9e289df35918033d3b2623', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}]",1,2006-03-24T22:30:00.000Z
1,5e9d0d95eda69955f709d1eb,[5eb0e4b6b6c3bb0006eeb1e2],5e9e4502f5090995de566f86,"[{'core': '5e9e289ef35918416a3b2624', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}]",2,2007-03-21T01:10:00.000Z


#### Removing Rows with Multiple Cores and Payloads

In [108]:
# Rows with more than one core (for Falcon rockets with 2 extra boosters) or more than one payload 
# in a single launch are removed. This ensures that only launches with a single core and payload are kept.

data = data[data['cores'].map(len) == 1]
data = data[data['payloads'].map(len) == 1]

#### Extract values from Lists

In [100]:
# Since the `cores` and `payloads` columns contain lists of size 1, 
# the single value in each list is extracted and the column is replaced with that value.

data['cores'] = data['cores'].map(lambda x : x[0])
data['payloads'] = data['payloads'].map(lambda x : x[0])
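
The filtering and extraction steps matter because `getCoreData` reads `core['core']`, which only works on a dictionary. Here is a minimal pure-Python sketch of the same two steps on made-up rows (the IDs `c1` to `c4` are hypothetical). It also shows the error raised when the extraction step is skipped.

```python
# Rows shaped like values of the `cores` column (assumption: made-up IDs).
rows = [
    [{"core": "c1", "flight": 1}],                       # single-core launch
    [{"core": "c2"}, {"core": "c3"}, {"core": "c4"}],    # three-core launch
]

# Step 1: keep only launches with exactly one core.
kept = [cores for cores in rows if len(cores) == 1]

# Step 2: replace each one-element list with the element itself.
unwrapped = [cores[0] for cores in kept]
print(unwrapped[0]["core"])

# Skipping step 2 leaves a list, and indexing a list with a string fails.
try:
    kept[0]["core"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

If `getCoreData` raises `list indices must be integers or slices, not str`, the most likely cause is that this extraction cell was not run on the current `data`.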

#### Converting UTC dates

In [109]:
# The `date_utc` column is converted to a `datetime` datatype, and only the date 
# (without the time) is extracted.

data['date'] = pd.to_datetime(data['date_utc']).dt.date
data['date'].head(3)

0    2006-03-24
1    2007-03-21
3    2008-09-28
Name: date, dtype: object

In [110]:
data.shape

(172, 7)

#### Restrict the dates of the launches

In [112]:
data = data[data['date'] <= datetime.date(2020, 11, 13)]
data.shape

(94, 7)
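
To make the conversion and the cutoff concrete, the sketch below performs the same two steps with the standard library only. It assumes timestamps in the format shown in `date_utc`; `launch_date` is a hypothetical helper, the first two timestamps are taken from the dataset, and the third is a made-up value after the cutoff.

```python
import datetime

CUTOFF = datetime.date(2020, 11, 13)


def launch_date(date_utc: str) -> datetime.date:
    """Parse a timestamp such as '2006-03-24T22:30:00.000Z' and drop the time."""
    parsed = datetime.datetime.strptime(date_utc, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.date()


timestamps = [
    "2006-03-24T22:30:00.000Z",
    "2007-03-21T01:10:00.000Z",
    "2021-01-01T00:00:00.000Z",  # made-up value, later than the cutoff
]

dates = [launch_date(stamp) for stamp in timestamps]
kept_dates = [day for day in dates if day <= CUTOFF]
print(kept_dates)
```

The comparison is inclusive, so a launch dated exactly 2020-11-13 is kept.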

### Extract Additional Information from SpaceX Launches

In this step, specific features are identified to be extracted from the `rocket`, `payloads`, `launchpad`, and `cores` columns.

Additionally, global lists are created to store the extracted data, which will be used to construct a new DataFrame.

#### Data to Extract:
1. **Rocket**:
   - Retrieve the booster name.

2. **Payload**:
   - Extract the payload mass.
   - Identify the orbit it is headed to.

3. **Launchpad**:
   - Retrieve the name of the launch site.
   - Obtain the longitude and latitude of the launchpad.

4. **Cores**:
   - Retrieve the landing outcome and landing type.
   - Number of flights performed with that core.
   - Whether gridfins were used.
   - Whether the core was reused.
   - Whether legs were used for landing.
   - The landing pad used.
   - The block number of the core (version of the core).
   - The number of times this specific core was reused.
   - The serial number of the core.

#### Global Lists to Store Data:
The following lists are initialized to store extracted data:


In [113]:
BoosterVersion = []
PayloadMass = []
Orbit = []
LaunchSite = []
Outcome = []
Flights = []
GridFins = []
Reused = []
Legs = []
LandingPad = []
Block = []
ReusedCount = []
Serial = []
Longitude = []
Latitude = []

#### Applying Functions to Populate Global Variables
In this step, the previously defined functions are used to populate the global lists that store data fetched from the API. For example, the `getBoosterVersion` function is applied to populate the `BoosterVersion` list.

Behavior of the `BoosterVersion` variable:

- **Before application**: The `BoosterVersion` list is empty because no data has been fetched yet (`BoosterVersion = []`).
- **After application**: The `getBoosterVersion` function is called with the `data` DataFrame as input, which populates the `BoosterVersion` list with the booster names associated with the launches.

In [114]:
# Call the getBoosterVersion function
getBoosterVersion(data)


Check that the list has been updated:

In [115]:
BoosterVersion[0:10]


['Falcon 1',
 'Falcon 1',
 'Falcon 1',
 'Falcon 1',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9']

We can apply the rest of the functions:

In [116]:
# Call getLaunchSite
getLaunchSite(data)
LaunchSite[:10]

['Kwajalein Atoll',
 'Kwajalein Atoll',
 'Kwajalein Atoll',
 'Kwajalein Atoll',
 'CCSFS SLC 40',
 'CCSFS SLC 40',
 'CCSFS SLC 40',
 'VAFB SLC 4E',
 'CCSFS SLC 40',
 'CCSFS SLC 40']

In [120]:
# Call getCoreData
getCoreData(data)

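TypeError: list indices must be integers or slices, not str

This error occurs because each value in the `cores` column is still a one-element list at this point rather than a dictionary, so `core['core']` indexes a list with a string. The execution counters indicate that the cell that extracts the single value from each list (In [100]) was run before `data` was rebuilt (In [106]) and was not run again afterwards. Re-running that cell before calling `getCoreData` converts each value to a dictionary.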

#### Creation of a Launch Data Dictionary

In [92]:
launch_dictionary = {
    'FlightNumber': list(data['flight_number']),
    'Date': list(data['date']),
    'BoosterVersion': BoosterVersion,
    'PayloadMass': PayloadMass,
    'Orbit': Orbit,
    'LaunchSite': LaunchSite,
    'Outcome': Outcome,
    'Flights': Flights,
    'GridFins': GridFins,
    'Reused': Reused,
    'Legs': Legs,
    'LandingPad': LandingPad,
    'Block': Block,
    'ReusedCount': ReusedCount,
    'Serial': Serial,
    'Longitude': Longitude,
    'Latitude': Latitude
}


In [93]:
launch_dictionary

{'FlightNumber': [1,
  2,
  4,
  5,
  6,
  8,
  10,
  11,
  12,
  13,
  14,
  15,
  16,
  17,
  18,
  19,
  20,
  22,
  23,
  24,
  25,
  26,
  27,
  28,
  29,
  30,
  32,
  33,
  34,
  35,
  36,
  37,
  38,
  39,
  40,
  41,
  42,
  43,
  44,
  45,
  46,
  47,
  48,
  49,
  50,
  51,
  52,
  53,
  54,
  57,
  58,
  59,
  60,
  61,
  63,
  64,
  65,
  66,
  67,
  68,
  69,
  70,
  71,
  72,
  73,
  74,
  76,
  78,
  79,
  80,
  82,
  83,
  84,
  85,
  86,
  87,
  88,
  89,
  90,
  91,
  92,
  93,
  94,
  95,
  96,
  97,
  98,
  100,
  101,
  102,
  103,
  104,
  105,
  106],
 'Date': [datetime.date(2006, 3, 24),
  datetime.date(2007, 3, 21),
  datetime.date(2008, 9, 28),
  datetime.date(2009, 7, 13),
  datetime.date(2010, 6, 4),
  datetime.date(2012, 5, 22),
  datetime.date(2013, 3, 1),
  datetime.date(2013, 9, 29),
  datetime.date(2013, 12, 3),
  datetime.date(2014, 1, 6),
  datetime.date(2014, 4, 18),
  datetime.date(2014, 7, 14),
  datetime.date(2014, 8, 5),
  datetime.date(2014, 9,

#### Create a DataFrame from launch_dictionary

In [121]:
launch_df = pd.DataFrame(launch_dictionary)

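ValueError: All arrays must be of the same length

This error is raised because the lists in `launch_dictionary` do not all have one entry per row of `data`: `getCoreData` stopped with the `TypeError` shown earlier, and `getPayloadData` is not called in the cells shown, so the lists they populate are shorter than `FlightNumber` and `Date`. Every list must have the same length before the DataFrame can be built.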
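
A quick way to locate the offending columns is to compare the length of every list before building the DataFrame. The sketch below assumes a dictionary of lists like `launch_dictionary`; the helper `mismatched_columns` and the sample values are hypothetical.

```python
def mismatched_columns(columns: dict, expected: int) -> dict:
    """Return {column name: length} for every list whose length differs from `expected`."""
    return {
        name: len(values)
        for name, values in columns.items()
        if len(values) != expected
    }


# Illustrative dictionary (assumption: shortened, made-up contents).
sample = {
    "FlightNumber": [1, 2, 4],
    "BoosterVersion": ["Falcon 1", "Falcon 1", "Falcon 1"],
    "PayloadMass": [],   # never populated
    "Block": [],         # population failed part-way
}

problems = mismatched_columns(sample, expected=len(sample["FlightNumber"]))
print(problems)
```

In the notebook, `mismatched_columns(launch_dictionary, len(data))` would list the columns that still need to be populated; an empty result means `pd.DataFrame(launch_dictionary)` should no longer raise the length error.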