# Collecting and Wrangling Data from the SpaceX API

### Project Description

In this capstone project, we aim to predict whether the Falcon 9 first stage will successfully land. 

SpaceX advertises Falcon 9 rocket launches at a cost of 62 million USD, whereas other providers charge upwards of 165 million USD per launch. A significant portion of SpaceX’s cost savings comes from its ability to reuse the first stage of the rocket.

By predicting whether the first stage will land, we can estimate the cost of a launch, since a recovered first stage can be reused. This information can be valuable to other companies that want to compete with SpaceX for launch contracts.

In this lab, we will:
- Collect data from the SpaceX API.
- Ensure the data is properly formatted for analysis.

## Goal / Objectives

### Objective
Estimate the cost of a launch by predicting whether SpaceX can reuse the first stage of its rocket.

### Tasks
- Retrieve data from the SpaceX API.
- Perform basic data wrangling and formatting.
- Clean and preprocess the retrieved data.

## Import Libraries and Define Auxiliary Functions

In [64]:
import json
import requests
import logging
import pandas as pd
import numpy as np
import datetime
from tqdm import tqdm

# Setting this option will print all columns of a dataframe
pd.set_option('display.max_columns', None)
# Setting this option will print all of the data in a feature
pd.set_option('display.max_colwidth', None)

## Request and parse the SpaceX launch data using the GET request

In [65]:
# Function for calling the API with exception handling.
# Returns an empty DataFrame if the request fails or the body is not valid JSON.
def load_API_call_space(static_url):
    try:
        response = requests.get(static_url, timeout=10)
        # Raise an HTTPError for 4xx/5xx status codes
        response.raise_for_status()
        try:
            json_data = response.json()
            return pd.DataFrame(json_data)
        except json.decoder.JSONDecodeError as json_error:
            logging.error(f"JSON decoding error: {json_error}")
            return pd.DataFrame()
    except requests.exceptions.RequestException as e:
        logging.error(f"Request error: {e}")

    return pd.DataFrame()

In [66]:
url = 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DS0321EN-SkillsNetwork/datasets/API_call_spacex_api.json'
data = load_API_call_space(url)
data.sample(2)

Unnamed: 0,fairings,links,static_fire_date_utc,static_fire_date_unix,tbd,net,window,rocket,success,details,crew,ships,capsules,payloads,launchpad,auto_update,failures,flight_number,name,date_utc,date_unix,date_local,date_precision,upcoming,cores,id
41,"{'reused': False, 'recovery_attempt': False, 'recovered': False, 'ships': []}","{'patch': {'small': 'https://images2.imgbox.com/1b/40/Ouyy9Neh_o.png', 'large': 'https://images2.imgbox.com/3b/6c/d5ulGpoh_o.png'}, 'reddit': {'campaign': 'https://www.reddit.com/r/spacex/comments/69hhkm/bulgariasat1_launch_campaign_thread/', 'launch': 'https://www.reddit.com/r/spacex/comments/6isph2/welcome_to_the_rspacex_bulgariasat1_official/', 'media': 'https://www.reddit.com/r/spacex/comments/6iuj1z/rspacex_bulgariasat1_media_thread_videos_images/', 'recovery': 'https://www.reddit.com/r/spacex/comments/6k3kop/b10292_bulgariasat_1_recovery_thread/'}, 'flickr': {'small': [], 'original': ['https://farm5.staticflickr.com/4216/35496028185_ac5456195f_o.jpg', 'https://farm5.staticflickr.com/4278/35496027525_9ab9d90417_o.jpg', 'https://farm5.staticflickr.com/4277/35496026875_fd25c46934_o.jpg', 'https://farm5.staticflickr.com/4257/35496026065_02fe65754b_o.jpg', 'https://farm5.staticflickr.com/4289/35491530485_5a4d0f39ae_o.jpg', 'https://farm5.staticflickr.com/4279/35491529875_1e35ee0a1e_o.jpg', 'https://farm5.staticflickr.com/4230/34681559323_53f05581ca_o.jpg']}, 'presskit': 'http://www.spacex.com/sites/spacex/files/bulgariasat1presskit.pdf', 'webcast': 'https://www.youtube.com/watch?v=Y8mLi-rRTh8', 'youtube_id': 'Y8mLi-rRTh8', 'article': 'https://en.wikipedia.org/wiki/BulgariaSat-1', 'wikipedia': 'https://en.wikipedia.org/wiki/BulgariaSat-1'}",2017-06-15T22:25:00.000Z,1497566000.0,False,False,7200.0,5e9d0d95eda69973a809d1ec,True,Second time a booster will be reused: Second flight of B1029 after the Iridium mission of January 2017. 
The satellite will be the first commercial Bulgarian-owned communications satellite and it will provide television broadcasts and other communications services over southeast Europe.,[],"[5ea6ed2e080df4000697c906, 5ea6ed2f080df4000697c90b, 5ea6ed2f080df4000697c90c, 5ea6ed30080df4000697c913]",[],[5eb0e4c4b6c3bb0006eeb20f],5e9e4502f509094188566f88,True,[],42,BulgariaSat-1,2017-06-23T19:10:00.000Z,1498245000,2017-06-23T15:10:00-04:00,hour,False,"[{'core': '5e9e28a3f359189e3a3b2645', 'flight': 2, 'gridfins': True, 'legs': True, 'reused': True, 'landing_attempt': True, 'landing_success': True, 'landing_type': 'ASDS', 'landpad': '5e9e3032383ecb6bb234e7ca'}]",5eb87d04ffd86e000604b353
50,,"{'patch': {'small': 'https://images2.imgbox.com/ea/12/8vVzlOeL_o.png', 'large': 'https://images2.imgbox.com/1b/30/oP1DBQ6b_o.png'}, 'reddit': {'campaign': 'https://www.reddit.com/r/spacex/comments/7bxg5a/crs13_launch_campaign_thread/', 'launch': 'https://www.reddit.com/r/spacex/comments/7j725w/rspacex_crs13_official_launch_discussion_updates/', 'media': 'https://www.reddit.com/r/spacex/comments/7j6oxz/rspacex_crs13_media_thread_videos_images_gifs/', 'recovery': None}, 'flickr': {'small': [], 'original': ['https://farm5.staticflickr.com/4591/38372264594_8140bd943d_o.png', 'https://farm5.staticflickr.com/4546/39051469552_13703e6b2e_o.jpg', 'https://farm5.staticflickr.com/4682/39051469662_55c55150c0_o.jpg', 'https://farm5.staticflickr.com/4565/25215551218_2597838c1a_o.jpg', 'https://farm5.staticflickr.com/4680/39051469812_b6f802fc9d_o.jpg', 'https://farm5.staticflickr.com/4517/27304331429_59b9d6c1d4_o.jpg']}, 'presskit': 'http://www.spacex.com/sites/spacex/files/crs13presskit12_11.pdf', 'webcast': 'https://www.youtube.com/watch?v=OPHbqY9LHCs', 'youtube_id': 'OPHbqY9LHCs', 'article': 'https://spaceflightnow.com/2017/12/15/spacexs-50th-falcon-rocket-launch-kicks-off-station-resupply-mission/', 'wikipedia': 'https://en.wikipedia.org/wiki/SpaceX_CRS-13'}",2017-12-06T20:00:00.000Z,1512590000.0,False,False,0.0,5e9d0d95eda69973a809d1ec,True,Will reuse the Dragon capsule previously flown on CRS-6 and will reuse the booster from CRS-11.,[],[5ea6ed30080df4000697c912],[5e9e2c5cf359188bfb3b266b],[5eb0e4c5b6c3bb0006eeb218],5e9e4501f509094ba4566f84,True,[],51,CRS-13,2017-12-15T15:36:00.000Z,1513352160,2017-12-15T10:36:00-05:00,hour,False,"[{'core': '5e9e28a3f3591856803b264a', 'flight': 2, 'gridfins': True, 'legs': True, 'reused': True, 'landing_attempt': True, 'landing_success': True, 'landing_type': 'RTLS', 'landpad': '5e9e3032383ecb267a34e7c7'}]",5eb87d0effd86e000604b35c
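Several columns in the output above, such as `fairings` and `links`, hold nested dictionaries. As an optional alternative to passing the JSON straight to `pd.DataFrame`, the following minimal sketch shows how `pd.json_normalize` could flatten such fields into dotted column names. The `sample_records` list is a hand-made, abbreviated stand-in for the API response, not real output.

```python
import pandas as pd

# Hypothetical, abbreviated record shaped like one launch in the response above.
sample_records = [
    {
        "flight_number": 42,
        "name": "BulgariaSat-1",
        "fairings": {"reused": False, "recovery_attempt": False, "recovered": False},
    }
]

# json_normalize expands nested dictionaries into columns named "parent.child".
flat = pd.json_normalize(sample_records)
print(sorted(flat.columns))
```

Whether flattening is worthwhile depends on which nested fields the later analysis needs; this lab only keeps a handful of top-level columns.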


In [75]:
# Keep only the features we need, plus flight_number and date_utc.
data = data[['rocket', 'payloads', 'launchpad', 'cores', 'flight_number', 'date_utc']]
data.head()

Unnamed: 0,rocket,payloads,launchpad,cores,flight_number,date_utc
0,5e9d0d95eda69955f709d1eb,5eb0e4b5b6c3bb0006eeb1e1,5e9e4502f5090995de566f86,"{'core': '5e9e289df35918033d3b2623', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}",1,2006-03-24T22:30:00.000Z
1,5e9d0d95eda69955f709d1eb,5eb0e4b6b6c3bb0006eeb1e2,5e9e4502f5090995de566f86,"{'core': '5e9e289ef35918416a3b2624', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}",2,2007-03-21T01:10:00.000Z
3,5e9d0d95eda69955f709d1eb,5eb0e4b7b6c3bb0006eeb1e5,5e9e4502f5090995de566f86,"{'core': '5e9e289ef3591855dc3b2626', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}",4,2008-09-28T23:15:00.000Z
4,5e9d0d95eda69955f709d1eb,5eb0e4b7b6c3bb0006eeb1e6,5e9e4502f5090995de566f86,"{'core': '5e9e289ef359184f103b2627', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}",5,2009-07-13T03:35:00.000Z
5,5e9d0d95eda69973a809d1ec,5eb0e4b7b6c3bb0006eeb1e7,5e9e4501f509094ba4566f84,"{'core': '5e9e289ef359185f2b3b2628', 'flight': 1, 'gridfins': False, 'legs': False, 'reused': False, 'landing_attempt': False, 'landing_success': None, 'landing_type': None, 'landpad': None}",6,2010-06-04T18:45:00.000Z


In [71]:
# Remove rows with multiple cores (Falcon rockets with 2 extra boosters)
# and rows with multiple payloads on a single rocket.
data = data[data['cores'].map(len)==1]
# .copy() gives an independent frame, so later column assignments do not trigger SettingWithCopyWarning
data = data[data['payloads'].map(len)==1].copy()

In [72]:
# Since payloads and cores are lists of size 1 we will also extract the single value in the list and replace the feature.
data['cores'] = data['cores'].map(lambda x : x[0])
data['payloads'] = data['payloads'].map(lambda x : x[0])

In [73]:
# Convert date_utc to a datetime and keep only the date, dropping the time
data['date'] = pd.to_datetime(data['date_utc']).dt.date

In [74]:
# Keep only launches on or before 13 November 2020
data = data[data['date'] <= datetime.date(2020, 11, 13)]
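To make the effect of the filtering cells above easier to check, here is a minimal sketch that bundles the same steps into one function and runs it on a small, made-up table. The function name `wrangle_launches` and the toy rows (including the placeholder IDs) are illustrative assumptions, not part of the original notebook.

```python
import datetime
import pandas as pd

def wrangle_launches(df, cutoff):
    """Apply the filtering steps from the cells above to a launch DataFrame."""
    # Keep launches with exactly one core and exactly one payload.
    df = df[df["cores"].map(len) == 1]
    df = df[df["payloads"].map(len) == 1].copy()
    # Unwrap the single-element lists.
    df["cores"] = df["cores"].map(lambda x: x[0])
    df["payloads"] = df["payloads"].map(lambda x: x[0])
    # Parse the UTC timestamp and keep only the calendar date.
    df["date"] = pd.to_datetime(df["date_utc"]).dt.date
    # Drop launches after the cutoff date.
    return df[df["date"] <= cutoff]

# Hypothetical toy rows; the IDs are placeholders, not real API identifiers.
toy = pd.DataFrame({
    "flight_number": [1, 2, 3],
    "cores": [[{"core": "c1"}], [{"core": "c2"}, {"core": "c3"}], [{"core": "c4"}]],
    "payloads": [["p1"], ["p2"], ["p3"]],
    "date_utc": [
        "2017-06-23T19:10:00.000Z",
        "2018-02-06T20:45:00.000Z",
        "2021-01-08T02:15:00.000Z",
    ],
})

clean = wrangle_launches(toy, datetime.date(2020, 11, 13))
print(clean[["flight_number", "payloads", "date"]])
```

In this toy table, the second row is dropped because it has two cores and the third because its date falls after the cutoff, so only the first launch remains.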

### Data Requirements from the SpaceX API

From the **rocket**, we would like to learn:
- The **booster name**

From the **payload**, we would like to learn:
- The **mass** of the payload
- The **orbit** that it is going to

From the **launchpad**, we would like to know:
- The **name of the launch site** being used
- The **longitude** of the launch site
- The **latitude** of the launch site

From **cores**, we would like to learn:
- The **outcome** of the landing
- The **type** of the landing
- The **number of flights** with that core
- Whether **gridfins** were used
- Whether the core is **reused**
- Whether **legs** were used
- The **landing pad** used
- The **block** of the core (a number used to separate versions of cores)
- The **number of times** this specific core has been **reused**
- The **serial** of the core

The data from these requests will be stored in lists and will be used to create a new **DataFrame** for further analysis.
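The collection step described above might look roughly like the following sketch. It assumes each ID can be resolved to a detail record through a callable; `fetch_detail`, the `fake_db` lookup table, the output column names, and the detail field names (`name`, `mass_kg`, `orbit`, `longitude`, `latitude`) are illustrative assumptions that this section does not confirm. In practice `fetch_detail` would wrap an HTTP request to the API. The core fields used here are the ones visible in the `cores` column shown earlier.

```python
import pandas as pd

def collect_launch_features(df, fetch_detail):
    """Resolve IDs and unpack the embedded core dict into lists, then a DataFrame.

    fetch_detail(kind, identifier) is a hypothetical callable returning a dict.
    """
    features = {
        "FlightNumber": [], "BoosterVersion": [], "PayloadMass": [], "Orbit": [],
        "LaunchSite": [], "Longitude": [], "Latitude": [], "Outcome": [],
        "Flights": [], "GridFins": [], "Reused": [], "Legs": [], "LandingPad": [],
    }
    for _, launch in df.iterrows():
        rocket = fetch_detail("rockets", launch["rocket"])
        payload = fetch_detail("payloads", launch["payloads"])
        pad = fetch_detail("launchpads", launch["launchpad"])
        core = launch["cores"]  # already unwrapped to a single dict

        features["FlightNumber"].append(launch["flight_number"])
        features["BoosterVersion"].append(rocket.get("name"))
        features["PayloadMass"].append(payload.get("mass_kg"))
        features["Orbit"].append(payload.get("orbit"))
        features["LaunchSite"].append(pad.get("name"))
        features["Longitude"].append(pad.get("longitude"))
        features["Latitude"].append(pad.get("latitude"))
        # Combine the success flag and landing type into one outcome label.
        features["Outcome"].append(f"{core['landing_success']} {core['landing_type']}")
        features["Flights"].append(core["flight"])
        features["GridFins"].append(core["gridfins"])
        features["Reused"].append(core["reused"])
        features["Legs"].append(core["legs"])
        features["LandingPad"].append(core["landpad"])
    return pd.DataFrame(features)

# Placeholder detail records standing in for API responses (values are made up).
fake_db = {
    ("rockets", "r1"): {"name": "Falcon 9"},
    ("payloads", "p1"): {"mass_kg": 1000.0, "orbit": "LEO"},
    ("launchpads", "l1"): {"name": "Example Site", "longitude": -80.0, "latitude": 28.0},
}

launches = pd.DataFrame({
    "rocket": ["r1"], "payloads": ["p1"], "launchpad": ["l1"], "flight_number": [42],
    "cores": [{"core": "c1", "flight": 2, "gridfins": True, "legs": True,
               "reused": True, "landing_attempt": True, "landing_success": True,
               "landing_type": "ASDS", "landpad": "lp1"}],
})

features = collect_launch_features(launches, lambda kind, key: fake_db[(kind, key)])
print(features.T)
```

Passing the lookup in as a callable keeps the sketch runnable offline. The block, reuse count, and serial of the core are not in the embedded core dict, so they would need a separate lookup by core ID and are omitted here.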