# **SpaceX Falcon 9 First Stage Landing Prediction**

## Exploratory Data Analysis with SQL

In this part of the project, we will perform some Exploratory Data Analysis (EDA) using SQL by querying data stored in a PostgreSQL database. 

This analysis will give us additional understanding of the data.

## Objectives

In this Python notebook we will:

1.  Understand the SpaceX dataset
2.  Load the dataset from the corresponding table in a PostgreSQL database
3.  Execute SQL queries to answer questions that guide our analysis

***

First, let's import the packages required for this lab.

In [1]:
import psycopg2
from config import config
from sqlalchemy.engine.url import URL
%load_ext sql

The SpaceX dataset has been loaded into a local PostgreSQL database using the CSV output ('dataset_part_2.csv') from part 3 of the project: 'Data Wrangling'.
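The loading step itself is not shown in this notebook. As a rough illustration only, the sketch below loads CSV rows into a table with Python's standard library, using an in-memory SQLite database as a stand-in for PostgreSQL so that it runs without a server. The table name `spacextbl_demo`, the reduced column list, and the in-memory CSV string are assumptions made for the example; the two data rows are copied from the query output shown later in this notebook.

```python
import csv
import io
import sqlite3

# Two rows copied from this notebook's query output (subset of columns).
# A real run would open 'dataset_part_2.csv' instead of this string.
sample_csv = io.StringIO(
    "flight_id,date_,launch_site,payload,payload_mass_kg\n"
    "4,2010-06-04,CCAFS LC-40,Dragon Spacecraft Qualification Unit,0\n"
    '5,2010-12-08,CCAFS LC-40,"Dragon demo flight C1, two CubeSats, '
    'barrel of Brouere cheese",0\n'
)

# SQLite stands in for PostgreSQL here (assumption for a runnable demo)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spacextbl_demo ("
    "flight_id INTEGER, date_ TEXT, launch_site TEXT, "
    "payload TEXT, payload_mass_kg INTEGER)"
)

# csv.DictReader handles the quoted payload field that contains commas
csv_rows = list(csv.DictReader(sample_csv))
conn.executemany(
    "INSERT INTO spacextbl_demo VALUES "
    "(:flight_id, :date_, :launch_site, :payload, :payload_mass_kg)",
    csv_rows,
)

loaded_rows = conn.execute(
    "SELECT flight_id, payload FROM spacextbl_demo ORDER BY flight_id"
).fetchall()
conn.close()

for flight_id, payload in loaded_rows:
    print(flight_id, payload)
```

The quoted payload for flight 5 survives as a single field, which is the main reason to prefer the `csv` module over splitting lines on commas.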

We will be connecting to the local database server using a custom config function.
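The `config` module is not included in this notebook. A common pattern for such a helper is to read an INI file with `configparser` and return one section as a dictionary. The sketch below is a guess at that pattern, not the author's actual code: the file name `database.ini`, the section name `postgresql`, and the placeholder password are all assumptions. The keys are chosen to match the keyword arguments of `URL.create`, and the host, port, database, and user are taken from the connection string printed later in the notebook.

```python
from configparser import ConfigParser
import os
import tempfile


def config(filename="database.ini", section="postgresql"):
    """Return the key/value pairs of one INI section as a dict of strings."""
    parser = ConfigParser()
    # parser.read() returns the list of files it managed to parse
    if not parser.read(filename):
        raise FileNotFoundError(f"Could not read {filename}")
    if not parser.has_section(section):
        raise KeyError(f"Section [{section}] not found in {filename}")
    return dict(parser.items(section))


# Demo with a throwaway file; the password is a placeholder
sample_ini = """[postgresql]
drivername = postgresql+psycopg2
host = localhost
port = 5432
database = mlprojects
username = postgres
password = change-me
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    ini_path = os.path.join(tmp_dir, "database.ini")
    with open(ini_path, "w") as handle:
        handle.write(sample_ini)
    demo_params = config(ini_path)

print(demo_params["host"], demo_params["database"])
```

Keeping credentials in a separate file like this lets the notebook be shared without exposing them. Note that `configparser` returns every value as a string, including the port.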

## Connecting to PostgreSQL

In [None]:
# Read connection parameters
params = config()
# Create the connection string
url = URL.create(**params)
print(url)

Sensitive code cells have been removed.

Let's check our connection by looking at a few rows from the table.

In [4]:
%sql select * from spacextbl limit 5;

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
5 rows affected.


flight_id,date_,time_,booster_version,launch_site,payload,payload_mass_kg,orbit,customer,mission_outcome,landing_outcome
4,2010-06-04,18:45:00,F9 v1.0 B0003,CCAFS LC-40,Dragon Spacecraft Qualification Unit,0,LEO,SpaceX,Success,Failure (parachute)
5,2010-12-08,15:43:00,F9 v1.0 B0004,CCAFS LC-40,"Dragon demo flight C1, two CubeSats, barrel of Brouere cheese",0,LEO (ISS),NASA (COTS) NRO,Success,Failure (parachute)
6,2012-05-22,07:44:00,F9 v1.0 B0005,CCAFS LC-40,Dragon demo flight C2,525,LEO (ISS),NASA (COTS),Success,No attempt
7,2012-10-08,00:35:00,F9 v1.0 B0006,CCAFS LC-40,SpaceX CRS-1,500,LEO (ISS),NASA (CRS),Success,No attempt
8,2013-03-01,15:10:00,F9 v1.0 B0007,CCAFS LC-40,SpaceX CRS-2,677,LEO (ISS),NASA (CRS),Success,No attempt


## Data Exploration

##### Let's take a look at the names of the unique launch sites in the space missions

In [5]:
%sql select distinct launch_site from spacextbl;

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
4 rows affected.


launch_site
CCAFS SLC-40
KSC LC-39A
CCAFS LC-40
VAFB SLC-4E


##### Next, let's display 5 records where the launch site name begins with the string 'CCA'

In [6]:
%sql select * from SPACEXTBL where launch_site like 'CCA%' limit 5

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
5 rows affected.


flight_id,date_,time_,booster_version,launch_site,payload,payload_mass_kg,orbit,customer,mission_outcome,landing_outcome
4,2010-06-04,18:45:00,F9 v1.0 B0003,CCAFS LC-40,Dragon Spacecraft Qualification Unit,0,LEO,SpaceX,Success,Failure (parachute)
5,2010-12-08,15:43:00,F9 v1.0 B0004,CCAFS LC-40,"Dragon demo flight C1, two CubeSats, barrel of Brouere cheese",0,LEO (ISS),NASA (COTS) NRO,Success,Failure (parachute)
6,2012-05-22,07:44:00,F9 v1.0 B0005,CCAFS LC-40,Dragon demo flight C2,525,LEO (ISS),NASA (COTS),Success,No attempt
7,2012-10-08,00:35:00,F9 v1.0 B0006,CCAFS LC-40,SpaceX CRS-1,500,LEO (ISS),NASA (CRS),Success,No attempt
8,2013-03-01,15:10:00,F9 v1.0 B0007,CCAFS LC-40,SpaceX CRS-2,677,LEO (ISS),NASA (CRS),Success,No attempt
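In a `LIKE` pattern, `%` matches any run of characters, so `'CCA%'` selects every value that starts with `CCA`. As a small cross-check outside the database, the same prefix test can be sketched in plain Python over the four launch sites returned by the earlier `DISTINCT` query; `str.startswith` is used here as an analogue of the SQL pattern, not as a replacement for it.

```python
# Launch site names copied from the DISTINCT query output above
launch_sites = ["CCAFS SLC-40", "KSC LC-39A", "CCAFS LC-40", "VAFB SLC-4E"]

# Equivalent of: launch_site LIKE 'CCA%' (case-sensitive prefix match)
cca_sites = [site for site in launch_sites if site.startswith("CCA")]

print(cca_sites)  # ['CCAFS SLC-40', 'CCAFS LC-40']
```

Two of the four sites share the prefix, which is why the `LIKE` filter can return rows from more than one launch site even though the first five rows all show `CCAFS LC-40`.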


##### What is the total payload mass carried by boosters on launches for the customer NASA (CRS)?

In [7]:
%sql select sum(payload_mass_kg) as Total from spacextbl where customer = 'NASA (CRS)'

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
1 rows affected.


total
45596


##### What is the average payload mass carried by booster version F9 v1.1?

In [8]:
%sql select avg(payload_mass_kg) as Average_Payload from spacextbl where booster_version like 'F9 v1.1%'

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
1 rows affected.


average_payload
2534.6666666666665


##### On what date was the first successful landing on a ground pad achieved?

In [9]:
%sql select min(date_) from spacextbl where landing_outcome = 'Success (ground pad)'

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
1 rows affected.


min
2015-12-22


##### Has SpaceX carried out any launches on a Friday?

In [10]:
%sql select * from spacextbl where EXTRACT(DOW from date_) = 5 limit 5

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
5 rows affected.


flight_id,date_,time_,booster_version,launch_site,payload,payload_mass_kg,orbit,customer,mission_outcome,landing_outcome
4,2010-06-04,18:45:00,F9 v1.0 B0003,CCAFS LC-40,Dragon Spacecraft Qualification Unit,0,LEO,SpaceX,Success,Failure (parachute)
8,2013-03-01,15:10:00,F9 v1.0 B0007,CCAFS LC-40,SpaceX CRS-2,677,LEO (ISS),NASA (CRS),Success,No attempt
12,2014-04-18,19:25:00,F9 v1.1,CCAFS LC-40,SpaceX CRS-3,2296,LEO (ISS),NASA (CRS),Success,Controlled (ocean)
25,2016-03-04,23:35:00,F9 FT B1020,CCAFS LC-40,SES-9,5271,GTO,SES,Success,Failure (drone ship)
26,2016-04-08,20:43:00,F9 FT B1021.1,CCAFS LC-40,SpaceX CRS-8,3136,LEO (ISS),NASA (CRS),Success,Success (drone ship)
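PostgreSQL's `EXTRACT(DOW ...)` numbers the days from Sunday = 0 to Saturday = 6, so Friday is 5. Python's `datetime` uses different conventions (`weekday()` starts at Monday = 0, `isoweekday()` at Monday = 1), which is easy to mix up. The sketch below re-checks the five dates from the output above; the helper name `postgres_dow` is made up for this example.

```python
from datetime import date

# Launch dates copied from the query output above
launch_dates = [
    date(2010, 6, 4),
    date(2013, 3, 1),
    date(2014, 4, 18),
    date(2016, 3, 4),
    date(2016, 4, 8),
]


def postgres_dow(day):
    """Map a date to PostgreSQL's DOW numbering (Sunday=0 ... Saturday=6)."""
    # isoweekday() runs Monday=1 ... Sunday=7, so Sunday wraps around to 0
    return day.isoweekday() % 7


dow_values = [postgres_dow(day) for day in launch_dates]

for day, dow in zip(launch_dates, dow_values):
    print(day.isoformat(), dow)
```

Every date maps to 5, which agrees with the filter used in the query.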


##### What are the names of the boosters that landed successfully on a drone ship and carried a payload mass greater than 4000 kg but less than 6000 kg?

In [11]:
%sql select booster_version from spacextbl where landing_outcome = 'Success (drone ship)' and (payload_mass_kg between 4000 and 6000)

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
4 rows affected.


booster_version
F9 FT B1022
F9 FT B1026
F9 FT B1021.2
F9 FT B1031.2
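One detail worth keeping in mind: the question asks for a strict range, while SQL's `BETWEEN` includes both endpoints. The two forms differ only when a payload mass is exactly 4000 or 6000. The sketch below shows the difference on a few made-up rows, using an in-memory SQLite database as a stand-in for PostgreSQL; the table `demo` and its values are hypothetical and are not taken from the SpaceX data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (booster TEXT, payload_mass_kg INTEGER)")
# Hypothetical rows chosen to sit on and around the range boundaries
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [("A", 4000), ("B", 4001), ("C", 5999), ("D", 6000), ("E", 6500)],
)

# BETWEEN keeps rows equal to either endpoint
inclusive = [row[0] for row in conn.execute(
    "SELECT booster FROM demo "
    "WHERE payload_mass_kg BETWEEN 4000 AND 6000 ORDER BY booster"
)]

# Strict comparison drops rows equal to either endpoint
strict = [row[0] for row in conn.execute(
    "SELECT booster FROM demo "
    "WHERE payload_mass_kg > 4000 AND payload_mass_kg < 6000 ORDER BY booster"
)]
conn.close()

print(inclusive)  # ['A', 'B', 'C', 'D']
print(strict)     # ['B', 'C']
```

If no booster in the real table carried exactly 4000 kg or 6000 kg, both forms return the same rows.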


##### What is the total number of successful and failed mission outcomes?

In [12]:
%sql select mission_outcome, COUNT(*) as Total from spacextbl group by mission_outcome

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
4 rows affected.


mission_outcome,total
Success (payload status unclear),1
Success,98
Success,1
Failure (in flight),1
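The output lists `Success` twice, which suggests that the two labels differ in a way that is not visible here, for example through trailing whitespace. That is an assumption; the output alone does not confirm the cause. The sketch below reproduces the effect on made-up rows in an in-memory SQLite database (a stand-in for PostgreSQL) and shows how grouping on `TRIM(...)` would merge such labels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (mission_outcome TEXT)")
# Hypothetical labels: the third value carries a trailing space
conn.executemany(
    "INSERT INTO demo VALUES (?)",
    [("Success",), ("Success",), ("Success ",), ("Failure (in flight)",)],
)

# Grouping on the raw text treats 'Success' and 'Success ' as different
raw_counts = conn.execute(
    "SELECT mission_outcome, COUNT(*) FROM demo "
    "GROUP BY mission_outcome ORDER BY mission_outcome"
).fetchall()

# Grouping on the trimmed text merges them
trimmed_counts = conn.execute(
    "SELECT TRIM(mission_outcome), COUNT(*) FROM demo "
    "GROUP BY TRIM(mission_outcome) ORDER BY 1"
).fetchall()
conn.close()

print([(repr(label), n) for label, n in raw_counts])
print(trimmed_counts)
```

Printing the labels with `repr` makes stray whitespace visible, which is a quick way to test the assumption against the real table.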


##### What are the names of the booster versions that have carried the maximum payload mass?

In [13]:
# Using a subquery
%sql select Distinct (booster_version) from spacextbl where payload_mass_kg = (select max(payload_mass_kg) from spacextbl)

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
12 rows affected.


booster_version
F9 B5 B1048.4
F9 B5 B1048.5
F9 B5 B1049.4
F9 B5 B1049.5
F9 B5 B1049.7
F9 B5 B1051.3
F9 B5 B1051.4
F9 B5 B1051.6
F9 B5 B1056.4
F9 B5 B1058.3
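Comparing against a `MAX(...)` subquery returns every row that ties for the maximum, which is why several booster versions appear above. An `ORDER BY ... LIMIT 1` query would return only one of them. The sketch below contrasts the two on made-up rows in an in-memory SQLite database (a stand-in for PostgreSQL); the booster names `X1` to `X3` and their masses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (booster_version TEXT, payload_mass_kg INTEGER)")
# Hypothetical rows: X2 and X3 tie for the maximum payload mass
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    [("X1", 500), ("X2", 900), ("X3", 900)],
)

# Subquery form: keeps all rows that share the maximum
all_ties = [row[0] for row in conn.execute(
    "SELECT DISTINCT booster_version FROM demo "
    "WHERE payload_mass_kg = (SELECT MAX(payload_mass_kg) FROM demo) "
    "ORDER BY booster_version"
)]

# LIMIT form: keeps only the first row after sorting
single_row = [row[0] for row in conn.execute(
    "SELECT booster_version FROM demo "
    "ORDER BY payload_mass_kg DESC, booster_version LIMIT 1"
)]
conn.close()

print(all_ties)    # ['X2', 'X3']
print(single_row)  # ['X2']
```

For a question phrased in the plural ("names of the booster versions"), the subquery form is the safer choice because it does not silently drop ties.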


##### Let's see the failed landing_outcomes on drone ships, their booster versions, and launch site names in 2015.

In [14]:
%sql select booster_version, launch_site from spacextbl where landing_outcome = 'Failure (drone ship)' and EXTRACT(YEAR from date_) = 2015

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
2 rows affected.


booster_version,launch_site
F9 v1.1 B1012,CCAFS LC-40
F9 v1.1 B1015,CCAFS LC-40


##### Lastly, let us rank the count of landing outcomes (such as Failure (drone ship) or Success (ground pad)) between the dates 2010-06-04 and 2017-03-20, in descending order.

In [15]:
%sql select landing_outcome, count(*) as Count from spacextbl where date_ between '2010-06-04' and '2017-03-20' group by landing_outcome order by Count DESC

 * postgresql+psycopg2://postgres:***@localhost:5432/mlprojects
8 rows affected.


landing_outcome,count
No attempt,10
Failure (drone ship),5
Success (drone ship),5
Success (ground pad),3
Controlled (ocean),3
Uncontrolled (ocean),2
Failure (parachute),2
Precluded (drone ship),1
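The query sorts the outcomes by count but does not attach an explicit rank, and several outcomes are tied. As a minimal sketch, the counts above can be given rank numbers in plain Python using the same convention as SQL's `RANK()` window function (tied rows share a rank and the following rank is skipped). The function name `competition_rank` is made up for this example, and it assumes the rows are already sorted by count in descending order.

```python
# (landing_outcome, count) pairs copied from the query output above
outcome_counts = [
    ("No attempt", 10),
    ("Failure (drone ship)", 5),
    ("Success (drone ship)", 5),
    ("Success (ground pad)", 3),
    ("Controlled (ocean)", 3),
    ("Uncontrolled (ocean)", 2),
    ("Failure (parachute)", 2),
    ("Precluded (drone ship)", 1),
]


def competition_rank(rows):
    """Rank rows sorted by count descending; ties share a rank, then skip."""
    ranked = []
    previous_count = None
    current_rank = 0
    for position, (label, count) in enumerate(rows, start=1):
        # A new count value starts a new rank equal to the row position
        if count != previous_count:
            current_rank = position
            previous_count = count
        ranked.append((current_rank, label, count))
    return ranked


ranked_outcomes = competition_rank(outcome_counts)
for rank, label, count in ranked_outcomes:
    print(rank, label, count)
```

With these counts the ranks come out as 1, 2, 2, 4, 4, 6, 6, 8, so the two drone ship outcomes share second place.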
