<p style="text-align:center">
    <a href="https://skills.network" target="_blank">
    <img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/assets/logos/SN_web_lightmode.png" width="200" alt="Skills Network Logo">
    </a>
</p>


# **SpaceX Falcon 9 First Stage Landing Prediction**


# Lab 1: Collecting the data


Estimated time needed: **45** minutes


In this capstone, we will predict whether the Falcon 9 first stage will land successfully. SpaceX advertises Falcon 9 rocket launches on its website at a cost of 62 million dollars; other providers cost upward of 165 million dollars each. Much of the savings comes from SpaceX being able to reuse the first stage. Therefore, if we can determine whether the first stage will land, we can determine the cost of a launch. This information can be used if an alternate company wants to bid against SpaceX for a rocket launch. In this lab, you will collect data from an API and make sure it is in the correct format. The following is an example of a successful landing:


![](https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/lab_v2/images/landing_1.gif)


Several examples of an unsuccessful landing are shown here:


![](https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/lab_v2/images/crash.gif)


Most unsuccessful landings are planned: SpaceX performs a controlled landing in the ocean.


## Objectives


In this lab, you will make a GET request to the SpaceX API. You will also do some basic data wrangling and formatting.

- Request to the SpaceX API
- Clean the requested data


----


## Import Libraries and Define Auxiliary Functions


We will import the following libraries into the lab:


In [1]:
import sys, requests, numpy as np, pandas as pd, datetime

Below we will define a series of helper functions that will help us use the API to extract information using identification numbers in the launch data.

From the <code>rocket</code> column we would like to learn the booster name.


In [2]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

From the <code>launchpad</code> we would like to know the name of the launch site being used, the longitude, and the latitude.


In [3]:
# Step 3: Define helper functions to get additional data from the API
def getBoosterVersion(data):
    for x in data['rocket']:
        if x:
            response = requests.get("https://api.spacexdata.com/v4/rockets/"+str(x)).json()
            BoosterVersion.append(response['name'])


def getLaunchSite(data):
    for x in data['launchpad']:
        if x:
            response = requests.get("https://api.spacexdata.com/v4/launchpads/"+str(x)).json()
            Longitude.append(response['longitude'])
            Latitude.append(response['latitude'])
            LaunchSite.append(response['name'])


def getPayloadData(data):
    for load in data['payloads']:
        if load:
            response = requests.get("https://api.spacexdata.com/v4/payloads/"+load).json()
            PayloadMass.append(response['mass_kg'])
            Orbit.append(response['orbit'])


def getCoreData(data):
    for core in data['cores']:
        if core['core'] is not None:
            response = requests.get("https://api.spacexdata.com/v4/cores/"+core['core']).json()
            Block.append(response['block'])
            ReusedCount.append(response['reuse_count'])
            Serial.append(response['serial'])
        else:
            Block.append(None)
            ReusedCount.append(None)
            Serial.append(None)
        Outcome.append(str(core['landing_success'])+' '+str(core['landing_type']))
        Flights.append(core['flight'])
        GridFins.append(core['gridfins'])
        Reused.append(core['reused'])
        Legs.append(core['legs'])
        LandingPad.append(core['landpad'])


From the <code>payload</code> we would like to learn the mass of the payload and the orbit that it is going to.


In [4]:
spacex_url = "https://api.spacexdata.com/v4/launches/past"
response = requests.get(spacex_url)

if response.status_code == 200:
    json_data = response.json()
    data = pd.json_normalize(json_data)
else:
    print(f"Failed to fetch data. Status code: {response.status_code}")

From <code>cores</code> we would like to learn the outcome of the landing, the type of the landing, the number of flights with that core, whether gridfins were used, whether the core is reused, whether legs were used, the landing pad used, the block of the core (a number used to separate versions of cores), the number of times this specific core has been reused, and the serial of the core.


In [5]:
data = data[['rocket', 'payloads', 'launchpad', 'cores', 'flight_number', 'date_utc']]

Now let's start requesting rocket launch data from the SpaceX API with the following URL:


In [6]:
data = data[data['cores'].map(len) == 1]
data = data[data['payloads'].map(len) == 1]

In [7]:
data['cores'] = data['cores'].map(lambda x: x[0])
data['payloads'] = data['payloads'].map(lambda x: x[0])

Check the content of the response


In [8]:
data['date'] = pd.to_datetime(data['date_utc']).dt.date
data = data[data['date'] <= datetime.date(2020, 11, 13)]

You should see that the response contains a large amount of information about SpaceX launches. Next, let's try to discover some more relevant information for this project.


### Task 1: Request and parse the SpaceX launch data using the GET request


To make the requested JSON results more consistent, we will use the following static response object for this project:


In [9]:
# static_json_url = 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DS0321EN-SkillsNetwork/datasets/API_call_spacex_api.json'

We should see that the request was successful, with a 200 status response code.
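
The status check described here can be wrapped in a small helper. The following is a minimal sketch rather than the lab's own code: `json_or_raise` and `FakeResponse` are hypothetical names, and `FakeResponse` only stands in for the object returned by `requests.get` so that the example runs without network access.

```python
class FakeResponse:
    """Hypothetical stand-in for a requests response (illustration only)."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload


def json_or_raise(response):
    """Return the decoded JSON body, or raise if the status code is not 200."""
    if response.status_code != 200:
        raise RuntimeError(f"Request failed with status code {response.status_code}")
    return response.json()


# Illustrative payload, not real API output
ok_body = json_or_raise(FakeResponse(200, [{"flight_number": 1}]))
print(ok_body)
```

Raising on failure, instead of only printing a message, stops later cells from running against a variable that was never assigned.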


In [10]:
# response.status_code

In [11]:
BoosterVersion = []
PayloadMass = []
Orbit = []
LaunchSite = []
Outcome = []
Flights = []
GridFins = []
Reused = []
Legs = []
LandingPad = []
Block = []
ReusedCount = []
Serial = []
Longitude = []
Latitude = []

Now we decode the response content as JSON using <code>.json()</code> and turn it into a Pandas dataframe using <code>pd.json_normalize()</code>.
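
To see what flattening does, here is a small sketch on made-up records. The field names and values below are assumptions chosen for illustration, not real SpaceX data.

```python
import pandas as pd

# Hypothetical records shaped loosely like nested API output
records = [
    {"flight_number": 1, "links": {"webcast": "a"}, "payloads": ["p1"]},
    {"flight_number": 2, "links": {"webcast": "b"}, "payloads": ["p2", "p3"]},
]

flat = pd.json_normalize(records)

# Nested dict keys become dotted column names; lists stay as list objects
print(sorted(flat.columns))
print(flat.loc[1, "payloads"])
```

Because list-valued fields are left as lists, columns such as `cores` and `payloads` still need the separate unwrapping step used in this lab.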


In [12]:
getBoosterVersion(data)
getLaunchSite(data)
getPayloadData(data)
getCoreData(data)

Using the dataframe <code>data</code>, print the first 5 rows:


In [13]:
data['BoosterVersion'] = BoosterVersion
data['LaunchSite'] = LaunchSite
data['PayloadMass'] = PayloadMass
data['Orbit'] = Orbit
data['Outcome'] = Outcome
data['Flights'] = Flights
data['GridFins'] = GridFins
data['Reused'] = Reused
data['Legs'] = Legs
data['LandingPad'] = LandingPad
data['Block'] = Block
data['ReusedCount'] = ReusedCount
data['Serial'] = Serial
data['Longitude'] = Longitude
data['Latitude'] = Latitude

You will notice that a lot of the data are IDs. For example, the rocket column has no information about the rocket, just an identification number.

We will now use the API again to get information about the launches using the IDs given for each launch. Specifically we will be using columns <code>rocket</code>, <code>payloads</code>, <code>launchpad</code>, and <code>cores</code>.
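
The helper functions in this lab issue one request per row, although many launches share the same rocket ID. One possible refinement, sketched here under assumptions, is to cache each lookup by ID. `make_cached_lookup` and `fake_fetch_rocket` are hypothetical names; the fake fetcher replaces the real API call so the sketch runs offline.

```python
def make_cached_lookup(fetch):
    """Wrap a fetch(item_id) function so each distinct ID is fetched only once."""
    cache = {}

    def lookup(item_id):
        if item_id not in cache:
            cache[item_id] = fetch(item_id)
        return cache[item_id]

    return lookup


# Hypothetical stand-in for an API call; it records how often it is invoked
calls = []


def fake_fetch_rocket(rocket_id):
    calls.append(rocket_id)
    is_falcon9 = rocket_id == "5e9d0d95eda69973a809d1ec"
    return {"name": "Falcon 9" if is_falcon9 else "Falcon 1"}


get_rocket = make_cached_lookup(fake_fetch_rocket)
rocket_ids = [
    "5e9d0d95eda69955f709d1eb",
    "5e9d0d95eda69973a809d1ec",
    "5e9d0d95eda69973a809d1ec",
]
names = [get_rocket(rid)["name"] for rid in rocket_ids]
print(names, len(calls))
```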


In [14]:
data_falcon9 = data[data['BoosterVersion'] == 'Falcon 9']

* From the <code>rocket</code> we would like to learn the booster name

* From the <code>payload</code> we would like to learn the mass of the payload and the orbit that it is going to

* From the <code>launchpad</code> we would like to know the name of the launch site being used, the longitude, and the latitude.

* **From <code>cores</code> we would like to learn the outcome of the landing, the type of the landing, the number of flights with that core, whether gridfins were used, whether the core is reused, whether legs were used, the landing pad used, the block of the core (a number used to separate versions of cores), the number of times this specific core has been reused, and the serial of the core.**

The data from these requests will be stored in lists and will be used to create a new dataframe.
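
As a rough illustration of the fields taken directly from a <code>cores</code> entry, the sketch below extracts them from a single dictionary. `summarize_core` is a hypothetical helper, not part of the lab; the sample entry is copied from the flight 11 row in this notebook's output.

```python
def summarize_core(core):
    """Pull the per-launch fields out of one entry of the cores column."""
    return {
        # Same string the lab builds: landing_success followed by landing_type
        "Outcome": f"{core['landing_success']} {core['landing_type']}",
        "Flights": core["flight"],
        "GridFins": core["gridfins"],
        "Reused": core["reused"],
        "Legs": core["legs"],
        "LandingPad": core["landpad"],
    }


core = {
    "core": "5e9e289ff359180ae23b262d", "flight": 1, "gridfins": False,
    "legs": False, "reused": False, "landing_attempt": True,
    "landing_success": False, "landing_type": "Ocean", "landpad": None,
}
print(summarize_core(core)["Outcome"])  # False Ocean
```

The block, reuse count, and serial are not in this dictionary; they come from the separate request to the cores endpoint.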


In [15]:
data_falcon9 = data[data['BoosterVersion'] == 'Falcon 9']


In [16]:
# Take an explicit copy so that later assignments do not trigger pandas' SettingWithCopyWarning
data_falcon9 = data[data['BoosterVersion'] == 'Falcon 9'].copy()

data_falcon9.loc[:, 'FlightNumber'] = list(range(1, data_falcon9.shape[0] + 1))

print("Filtered Data for Falcon 9 launches:")
print(data_falcon9.head())

Filtered Data for Falcon 9 launches:
                      rocket                  payloads  \
5   5e9d0d95eda69973a809d1ec  5eb0e4b7b6c3bb0006eeb1e7   
7   5e9d0d95eda69973a809d1ec  5eb0e4bab6c3bb0006eeb1ea   
9   5e9d0d95eda69973a809d1ec  5eb0e4bbb6c3bb0006eeb1ed   
10  5e9d0d95eda69973a809d1ec  5eb0e4bbb6c3bb0006eeb1ee   
11  5e9d0d95eda69973a809d1ec  5eb0e4bbb6c3bb0006eeb1ef   

                   launchpad  \
5   5e9e4501f509094ba4566f84   
7   5e9e4501f509094ba4566f84   
9   5e9e4501f509094ba4566f84   
10  5e9e4502f509092b78566f87   
11  5e9e4501f509094ba4566f84   

                                                                                                                                                                                                cores  \
5      {'core': '5e9e289ef359185f2b3b2628', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}   
7      {'core': '5e

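These functions append their outputs to the global list variables defined above. Let's take a look at the <code>BoosterVersion</code> variable. In a fresh session the list is empty before we apply <code>getBoosterVersion</code>; in this run the helper functions were already called in an earlier cell, so the list is already populated: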


In [18]:
BoosterVersion

['Falcon 1',
 'Falcon 1',
 'Falcon 1',
 'Falcon 1',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',
 'Falcon 9',

Now, let's apply the <code>getBoosterVersion</code> function to get the booster version:


In [19]:
# Call getBoosterVersion
getBoosterVersion(data)

The list has now been updated:


In [20]:
BoosterVersion[0:5]

['Falcon 1', 'Falcon 1', 'Falcon 1', 'Falcon 1', 'Falcon 9']
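
Because the helpers append to module-level lists, calling one a second time on the same data extends the list rather than refreshing it, so the lists need to be reset before the helpers are re-run. The sketch below mimics that pattern; `append_names` and the sample values are hypothetical.

```python
BoosterVersion = []  # module-level list, as in this lab


def append_names(names):
    """Hypothetical helper that mimics the append-to-global pattern."""
    for name in names:
        BoosterVersion.append(name)


sample = ["Falcon 1", "Falcon 9"]  # illustrative values only
append_names(sample)
append_names(sample)         # a second call appends the same names again
print(len(BoosterVersion))   # 4, not 2

BoosterVersion.clear()       # reset before re-running the helper
append_names(sample)
print(len(BoosterVersion))   # 2
```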

We can apply the rest of the functions here:


In [21]:
# Call getLaunchSite
getLaunchSite(data)

In [22]:
# Call getPayloadData
getPayloadData(data)

In [23]:
# Call getCoreData
getCoreData(data)

Finally, let's construct our dataset using the data we have obtained. We combine the columns into a dictionary.


In [24]:
# Filter for Falcon 9 launches
data_falcon9 = data[data['rocket'] == '5e9d0d95eda69973a809d1ec']  # Falcon 9 rocket ID


In [25]:
# Clear existing global variables
BoosterVersion = []
PayloadMass = []
Orbit = []
LaunchSite = []
Outcome = []
Flights = []
GridFins = []
Reused = []
Legs = []
LandingPad = []
Block = []
ReusedCount = []
Serial = []
Longitude = []
Latitude = []

# Apply functions to data_falcon9
getBoosterVersion(data_falcon9)
getLaunchSite(data_falcon9)
getPayloadData(data_falcon9)
getCoreData(data_falcon9)


In [26]:
launch_dict = {
    'FlightNumber': list(data_falcon9['flight_number']),
    'Date': list(data_falcon9['date']),
    'BoosterVersion': BoosterVersion,
    'PayloadMass': PayloadMass,
    'Orbit': Orbit,
    'LaunchSite': LaunchSite,
    'Outcome': Outcome,
    'Flights': Flights,
    'GridFins': GridFins,
    'Reused': Reused,
    'Legs': Legs,
    'LandingPad': LandingPad,
    'Block': Block,
    'ReusedCount': ReusedCount,
    'Serial': Serial,
    'Longitude': Longitude,
    'Latitude': Latitude
}

Then, we need to create a Pandas dataframe from the dictionary <code>launch_dict</code>.
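
Building a dataframe from a dictionary of lists only works when every list has the same length, so it can help to check that first. The sketch below uses a hypothetical helper, `check_equal_lengths`, and a three-row toy dictionary in place of the full <code>launch_dict</code>.

```python
import pandas as pd


def check_equal_lengths(columns):
    """Return the common length of all lists, or raise if they differ."""
    lengths = {key: len(values) for key, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    return next(iter(lengths.values()), 0)


# Illustrative subset only; the real launch_dict has 17 keys
toy_dict = {
    "FlightNumber": [6, 8, 10],
    "BoosterVersion": ["Falcon 9", "Falcon 9", "Falcon 9"],
    "PayloadMass": [None, 525.0, 677.0],
}
n_rows = check_equal_lengths(toy_dict)
toy_df = pd.DataFrame(toy_dict)
print(n_rows, toy_df.shape)
```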


In [27]:
for key, value in launch_dict.items():
    print(f"{key}: {len(value)}")

FlightNumber: 90
Date: 90
BoosterVersion: 90
PayloadMass: 90
Orbit: 90
LaunchSite: 90
Outcome: 90
Flights: 90
GridFins: 90
Reused: 90
Legs: 90
LandingPad: 90
Block: 90
ReusedCount: 90
Serial: 90
Longitude: 90
Latitude: 90


In [28]:
launch_data_df = pd.DataFrame(launch_dict)
launch_data_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 90 entries, 0 to 89
Data columns (total 17 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   FlightNumber    90 non-null     int64  
 1   Date            90 non-null     object 
 2   BoosterVersion  90 non-null     object 
 3   PayloadMass     85 non-null     float64
 4   Orbit           90 non-null     object 
 5   LaunchSite      90 non-null     object 
 6   Outcome         90 non-null     object 
 7   Flights         90 non-null     int64  
 8   GridFins        90 non-null     bool   
 9   Reused          90 non-null     bool   
 10  Legs            90 non-null     bool   
 11  LandingPad      64 non-null     object 
 12  Block           90 non-null     int64  
 13  ReusedCount     90 non-null     int64  
 14  Serial          90 non-null     object 
 15  Longitude       90 non-null     float64
 16  Latitude        90 non-null     float64
dtypes: bool(3), float64(3), int64(4), obj

Show the summary of the dataframe


In [29]:
# Display a statistical summary of the DataFrame
print(launch_data_df.describe())


       FlightNumber   PayloadMass    Flights      Block  ReusedCount  \
count     90.000000     85.000000  90.000000  90.000000    90.000000   
mean      56.477778   6123.547647   1.788889   3.500000     3.188889   
std       29.232977   4870.916417   1.213172   1.595288     4.194417   
min        6.000000    350.000000   1.000000   1.000000     0.000000   
25%       32.250000   2482.000000   1.000000   2.000000     0.000000   
50%       55.500000   4535.000000   1.000000   4.000000     1.000000   
75%       82.750000   9600.000000   2.000000   5.000000     4.000000   
max      106.000000  15600.000000   6.000000   5.000000    13.000000   

        Longitude   Latitude  
count   90.000000  90.000000  
mean   -86.366477  29.449963  
std     14.149518   2.141306  
min   -120.610829  28.561857  
25%    -80.603956  28.561857  
50%    -80.577366  28.561857  
75%    -80.577366  28.608058  
max    -80.577366  34.632093  


In [30]:
print(data.head())

                     rocket                  payloads  \
0  5e9d0d95eda69955f709d1eb  5eb0e4b5b6c3bb0006eeb1e1   
1  5e9d0d95eda69955f709d1eb  5eb0e4b6b6c3bb0006eeb1e2   
3  5e9d0d95eda69955f709d1eb  5eb0e4b7b6c3bb0006eeb1e5   
4  5e9d0d95eda69955f709d1eb  5eb0e4b7b6c3bb0006eeb1e6   
5  5e9d0d95eda69973a809d1ec  5eb0e4b7b6c3bb0006eeb1e7   

                  launchpad  \
0  5e9e4502f5090995de566f86   
1  5e9e4502f5090995de566f86   
3  5e9e4502f5090995de566f86   
4  5e9e4502f5090995de566f86   
5  5e9e4501f509094ba4566f84   

                                                                                                                                                                                            cores  \
0  {'core': '5e9e289df35918033d3b2623', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}   
1  {'core': '5e9e289ef35918416a3b2624', 'flight': 1, 'gridfins': False, 'leg

Finally, we will remove the Falcon 1 launches, keeping only the Falcon 9 launches. Filter the <code>data</code> dataframe using the <code>BoosterVersion</code> column to only keep the Falcon 9 launches. Save the filtered data to a new dataframe called <code>data_falcon9</code>.
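
The filtering step can be sketched on a small frame as follows. The rows are placeholders rather than the lab's full dataset.

```python
import pandas as pd

# Illustrative frame with placeholder rows
data = pd.DataFrame({
    "flight_number": [1, 2, 6, 8],
    "BoosterVersion": ["Falcon 1", "Falcon 1", "Falcon 9", "Falcon 9"],
})

# A boolean mask keeps the Falcon 9 rows; .copy() makes the result independent of `data`
data_falcon9 = data[data["BoosterVersion"] == "Falcon 9"].copy()

# Renumber the remaining launches starting from 1
data_falcon9["FlightNumber"] = list(range(1, len(data_falcon9) + 1))
print(data_falcon9)
```

Taking a copy before adding the new column keeps the original `data` frame unchanged.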


### Task 2: Filter the dataframe to only include `Falcon 9` launches


In [31]:
print(data.columns)

print(data.head())

Index(['rocket', 'payloads', 'launchpad', 'cores', 'flight_number', 'date_utc',
       'date', 'BoosterVersion', 'LaunchSite', 'PayloadMass', 'Orbit',
       'Outcome', 'Flights', 'GridFins', 'Reused', 'Legs', 'LandingPad',
       'Block', 'ReusedCount', 'Serial', 'Longitude', 'Latitude'],
      dtype='object')
                     rocket                  payloads  \
0  5e9d0d95eda69955f709d1eb  5eb0e4b5b6c3bb0006eeb1e1   
1  5e9d0d95eda69955f709d1eb  5eb0e4b6b6c3bb0006eeb1e2   
3  5e9d0d95eda69955f709d1eb  5eb0e4b7b6c3bb0006eeb1e5   
4  5e9d0d95eda69955f709d1eb  5eb0e4b7b6c3bb0006eeb1e6   
5  5e9d0d95eda69973a809d1ec  5eb0e4b7b6c3bb0006eeb1e7   

                  launchpad  \
0  5e9e4502f5090995de566f86   
1  5e9e4502f5090995de566f86   
3  5e9e4502f5090995de566f86   
4  5e9e4502f5090995de566f86   
5  5e9e4501f509094ba4566f84   

                                                                                                                                                             

In [32]:
# Caution: 'payloads' was already reduced to a single ID string in an earlier cell,
# so len() now measures the 24-character ID rather than a list, and this filter removes every row
data = data[data['payloads'].map(len) == 1]

# Keep 'cores' entries that are dicts; replace anything else with None
data['cores'] = data['cores'].map(lambda x: x if isinstance(x, dict) else None)

# The result is therefore an empty DataFrame
print(data.head())


Empty DataFrame
Columns: [rocket, payloads, launchpad, cores, flight_number, date_utc, date, BoosterVersion, LaunchSite, PayloadMass, Orbit, Outcome, Flights, GridFins, Reused, Legs, LandingPad, Block, ReusedCount, Serial, Longitude, Latitude]
Index: []


In [33]:
print(data.columns)

Index(['rocket', 'payloads', 'launchpad', 'cores', 'flight_number', 'date_utc',
       'date', 'BoosterVersion', 'LaunchSite', 'PayloadMass', 'Orbit',
       'Outcome', 'Flights', 'GridFins', 'Reused', 'Legs', 'LandingPad',
       'Block', 'ReusedCount', 'Serial', 'Longitude', 'Latitude'],
      dtype='object')


In [34]:
print(data.head())

Empty DataFrame
Columns: [rocket, payloads, launchpad, cores, flight_number, date_utc, date, BoosterVersion, LaunchSite, PayloadMass, Orbit, Outcome, Flights, GridFins, Reused, Legs, LandingPad, Block, ReusedCount, Serial, Longitude, Latitude]
Index: []


In [35]:
data_falcon9.loc[:, 'FlightNumber'] = list(range(1, data_falcon9.shape[0]+1))
data_falcon9

Unnamed: 0,rocket,payloads,launchpad,cores,flight_number,date_utc,date,BoosterVersion,LaunchSite,PayloadMass,Orbit,Outcome,Flights,GridFins,Reused,Legs,LandingPad,Block,ReusedCount,Serial,Longitude,Latitude,FlightNumber
5,5e9d0d95eda69973a809d1ec,5eb0e4b7b6c3bb0006eeb1e7,5e9e4501f509094ba4566f84,"{'core': '5e9e289ef359185f2b3b2628', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}",6,2010-06-04T18:45:00.000Z,2010-06-04,Falcon 9,CCSFS SLC 40,,LEO,None None,1,False,False,False,,1.0,0,B0003,-80.577366,28.561857,1
7,5e9d0d95eda69973a809d1ec,5eb0e4bab6c3bb0006eeb1ea,5e9e4501f509094ba4566f84,"{'core': '5e9e289ef35918f39c3b262a', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}",8,2012-05-22T07:44:00.000Z,2012-05-22,Falcon 9,CCSFS SLC 40,525.0,LEO,None None,1,False,False,False,,1.0,0,B0005,-80.577366,28.561857,2
9,5e9d0d95eda69973a809d1ec,5eb0e4bbb6c3bb0006eeb1ed,5e9e4501f509094ba4566f84,"{'core': '5e9e289ff3591884e03b262c', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}",10,2013-03-01T19:10:00.000Z,2013-03-01,Falcon 9,CCSFS SLC 40,677.0,ISS,None None,1,False,False,False,,1.0,0,B0007,-80.577366,28.561857,3
10,5e9d0d95eda69973a809d1ec,5eb0e4bbb6c3bb0006eeb1ee,5e9e4502f509092b78566f87,"{'core': '5e9e289ff359180ae23b262d', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': True, 'landing_success': False, 'landing_type': 'Ocean', 'landpad': None}",11,2013-09-29T16:00:00.000Z,2013-09-29,Falcon 9,VAFB SLC 4E,500.0,PO,False Ocean,1,False,False,False,,1.0,0,B1003,-120.610829,34.632093,4
11,5e9d0d95eda69973a809d1ec,5eb0e4bbb6c3bb0006eeb1ef,5e9e4501f509094ba4566f84,"{'core': '5e9e289ff35918862c3b262e', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}",12,2013-12-03T22:41:00.000Z,2013-12-03,Falcon 9,CCSFS SLC 40,3170.0,GTO,None None,1,False,False,False,,1.0,0,B1004,-80.577366,28.561857,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
101,5e9d0d95eda69973a809d1ec,5ef6a4600059c33cee4a829e,5e9e4502f509094188566f88,"{'core': '5ef670f10059c33cee4a826c', 'flight': 2, 'gridfins': True, 'legs': True, 'reused': True, 'landing_attempt': True, 'landing_success': True, 'landing_type': 'ASDS', 'landpad': '5e9e3032383ecb6bb234e7ca'}",102,2020-09-03T12:46:00.000Z,2020-09-03,Falcon 9,KSC LC 39A,15600.0,VLEO,True ASDS,2,True,True,True,5e9e3032383ecb6bb234e7ca,5.0,12,B1060,-80.603956,28.608058,86
102,5e9d0d95eda69973a809d1ec,5ef6a48e0059c33cee4a829f,5e9e4502f509094188566f88,"{'core': '5e9e28a7f3591817f23b2663', 'flight': 3, 'gridfins': True, 'legs': True, 'reused': True, 'landing_attempt': True, 'landing_success': True, 'landing_type': 'ASDS', 'landpad': '5e9e3032383ecb6bb234e7ca'}",103,2020-10-06T11:29:00.000Z,2020-10-06,Falcon 9,KSC LC 39A,15600.0,VLEO,True ASDS,3,True,True,True,5e9e3032383ecb6bb234e7ca,5.0,13,B1058,-80.603956,28.608058,87
103,5e9d0d95eda69973a809d1ec,5ef6a4d50059c33cee4a82a1,5e9e4502f509094188566f88,"{'core': '5e9e28a6f35918c0803b265c', 'flight': 6, 'gridfins': True, 'legs': True, 'reused': True, 'landing_attempt': True, 'landing_success': True, 'landing_type': 'ASDS', 'landpad': '5e9e3032383ecb6bb234e7ca'}",104,2020-10-18T12:25:00.000Z,2020-10-18,Falcon 9,KSC LC 39A,15600.0,VLEO,True ASDS,6,True,True,True,5e9e3032383ecb6bb234e7ca,5.0,12,B1051,-80.603956,28.608058,88
104,5e9d0d95eda69973a809d1ec,5ef6a4ea0059c33cee4a82a2,5e9e4501f509094ba4566f84,"{'core': '5ef670f10059c33cee4a826c', 'flight': 3, 'gridfins': True, 'legs': True, 'reused': True, 'landing_attempt': True, 'landing_success': True, 'landing_type': 'ASDS', 'landpad': '5e9e3033383ecbb9e534e7cc'}",105,2020-10-24T15:31:00.000Z,2020-10-24,Falcon 9,CCSFS SLC 40,15600.0,VLEO,True ASDS,3,True,True,True,5e9e3033383ecbb9e534e7cc,5.0,12,B1060,-80.577366,28.561857,89


## Data Wrangling


We can see below that some of the rows are missing values in our dataset.


In [36]:
data_falcon9.isnull().sum()

Unnamed: 0,0
rocket,0
payloads,0
launchpad,0
cores,0
flight_number,0
date_utc,0
date,0
BoosterVersion,0
LaunchSite,0
PayloadMass,5


Before we can continue we must deal with these missing values. The <code>LandingPad</code> column will retain None values to represent when landing pads were not used.


### Task 3: Dealing with Missing Values


Calculate the mean of the <code>PayloadMass</code> column using <code>.mean()</code>. Then use the <code>.replace()</code> function to replace the `np.nan` values in that column with the mean you calculated.
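
A minimal sketch of this imputation on a toy column (the values are illustrative only):

```python
import numpy as np
import pandas as pd

# Illustrative values only
df = pd.DataFrame({"PayloadMass": [525.0, np.nan, 677.0]})

# .mean() skips NaN by default, so the mean is taken over the known values
payload_mass_mean = df["PayloadMass"].mean()

# Assign the result back to the column instead of modifying a selection in place
df["PayloadMass"] = df["PayloadMass"].replace(np.nan, payload_mass_mean)

print(df["PayloadMass"].isnull().sum())
```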


In [37]:
# Calculate the mean of the PayloadMass column
payload_mass_mean = data_falcon9['PayloadMass'].mean()

# Replace the np.nan values with the mean (assign back rather than using inplace on a column selection)
data_falcon9['PayloadMass'] = data_falcon9['PayloadMass'].replace(np.nan, payload_mass_mean)

# Display the updated DataFrame to verify
print(data_falcon9.head())

                      rocket                  payloads  \
5   5e9d0d95eda69973a809d1ec  5eb0e4b7b6c3bb0006eeb1e7   
7   5e9d0d95eda69973a809d1ec  5eb0e4bab6c3bb0006eeb1ea   
9   5e9d0d95eda69973a809d1ec  5eb0e4bbb6c3bb0006eeb1ed   
10  5e9d0d95eda69973a809d1ec  5eb0e4bbb6c3bb0006eeb1ee   
11  5e9d0d95eda69973a809d1ec  5eb0e4bbb6c3bb0006eeb1ef   

                   launchpad  \
5   5e9e4501f509094ba4566f84   
7   5e9e4501f509094ba4566f84   
9   5e9e4501f509094ba4566f84   
10  5e9e4502f509092b78566f87   
11  5e9e4501f509094ba4566f84   

                                                                                                                                                                                                cores  \
5      {'core': '5e9e289ef359185f2b3b2628', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}   
7      {'core': '5e9e289ef35918f39c3b262a', 'flight': 1,

In [38]:
## Question 1
# Year in the first row of column `static_fire_date_utc`
spacex_url = "https://api.spacexdata.com/v4/launches/past"
response = requests.get(spacex_url)

# Check if the request was successful
if response.status_code == 200:
    # Step 3: Convert the response JSON to a DataFrame
    data = pd.json_normalize(response.json())

    # Parse the column as datetimes
    data['static_fire_date_utc'] = pd.to_datetime(data['static_fire_date_utc'])

    # Get year
    first_row_year = data.loc[0, 'static_fire_date_utc'].year if pd.notnull(data.loc[0, 'static_fire_date_utc']) else None
else:
    first_row_year = None

first_row_year

2006

You should see the number of missing values of the <code>PayloadMass</code> column change to zero.


Now we should have no missing values in our dataset except for in <code>LandingPad</code>.
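
One way to confirm this is to list the columns that still contain missing values. The helper name `columns_with_missing` and the toy frame below are assumptions for illustration.

```python
import pandas as pd


def columns_with_missing(df):
    """Return the names of columns that contain at least one missing value."""
    counts = df.isnull().sum()
    return counts[counts > 0].index.tolist()


# Illustrative frame: PayloadMass already filled, LandingPad left as None
df = pd.DataFrame({
    "PayloadMass": [525.0, 601.0, 677.0],
    "LandingPad": [None, None, "5e9e3032383ecb6bb234e7ca"],
})
print(columns_with_missing(df))
```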


We can now export it to a <b>CSV</b> for the next section, but to make the answers consistent, in the next lab we will provide data in a pre-selected date range.


In [39]:
data_falcon9.to_csv('dataset_part_1.csv', index=False)


## Authors


<a href="https://www.linkedin.com/in/joseph-s-50398b136/">Joseph Santarcangelo</a> has a PhD in Electrical Engineering. His research focused on using machine learning, signal processing, and computer vision to determine how videos impact human cognition. Joseph has been working for IBM since he completed his PhD.


<!--## Change Log
-->


<!--

|Date (YYYY-MM-DD)|Version|Changed By|Change Description|
|-|-|-|-|
|2020-09-20|1.1|Joseph|get result each time you run|
|2020-09-20|1.1|Azim |Created Part 1 Lab using SpaceX API|
|2020-09-20|1.0|Joseph |Modified Multiple Areas|
-->


Copyright © 2021 IBM Corporation. All rights reserved.
