# Introduction

In this section, I'll perform some exploratory data analysis on the SpaceX launch data that I scraped from Wikipedia. This time the analysis is done in SQL. To query the data, I connect to a SQLite database file I created and then use Pandas to load the CSV file into that database as a new table.

## Setting Up Environment

In [1]:
%load_ext sql

In [2]:
import sqlite3
import csv
import pandas as pd
import numpy as np

# Open the SQLite database file (created if it does not exist) and get a cursor
con = sqlite3.connect("datasets/SpaceXDB.db")
cur = con.cursor()

In [3]:
%sql sqlite:///datasets/SpaceXDB.db

In [4]:
data = 'datasets/launch_data_falcon9_wiki.csv'
df = pd.read_csv(data)

# Load the DataFrame into SpaceXDB as table SPACEXTBL, replacing any existing copy
df.to_sql("SPACEXTBL", con, if_exists='replace', index=False, method="multi")

121
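Before querying, it can help to confirm that the load behaved as expected. The following is a minimal sketch of such a check using only the standard-library `sqlite3` module. It runs against an in-memory stand-in rather than `datasets/SpaceXDB.db`, and the three-column, three-row table is a hypothetical miniature of `SPACEXTBL` built for illustration.

```python
import sqlite3

# In-memory stand-in for datasets/SpaceXDB.db so the sketch runs anywhere.
con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical miniature of SPACEXTBL with three of its columns.
cur.execute(
    'CREATE TABLE SPACEXTBL ("Flight No." INTEGER, "Launch site" TEXT, "Payload mass" INTEGER)'
)
cur.executemany(
    "INSERT INTO SPACEXTBL VALUES (?, ?, ?)",
    [(1, "CCAFS", 0), (2, "CCAFS", 0), (3, "CCAFS", 525)],
)
con.commit()

# Row count: on the real table this should match the number of rows in the CSV.
row_count = cur.execute("SELECT COUNT(*) FROM SPACEXTBL").fetchone()[0]

# PRAGMA table_info yields one row per column:
# (cid, name, declared type, notnull, default value, pk).
columns = [(row[1], row[2]) for row in cur.execute('PRAGMA table_info("SPACEXTBL")')]

print(row_count)
print(columns)
```

On the real database, the same two queries can be run through the `cur` cursor created earlier. The declared types are worth a glance, because a numeric column that was loaded as text will affect aggregates later on.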

## Querying the Database with SQL

Now the database is set up and the table has been loaded in. The `121` above is the value returned by `to_sql`, the number of rows written.

### Task 1

##### Display the names of the unique launch sites in the space mission.

In [5]:
%%sql
SELECT DISTINCT "Launch site"
FROM SPACEXTBL;

 * sqlite:///datasets/SpaceXDB.db
Done.


Launch site
CCAFS
VAFB
Cape Canaveral
KSC
CCSFS
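The list contains several labels that may refer to the same place: CCSFS is the abbreviation adopted when Cape Canaveral Air Force Station (CCAFS) was renamed Cape Canaveral Space Force Station, and whether the plain "Cape Canaveral" entries belong with them is an assumption I have not verified. A possible next step is to count launches per label. The sketch below shows the query on an in-memory stand-in table; the site labels come from the output above, but the rows and counts are hypothetical.

```python
import sqlite3

# In-memory stand-in for SPACEXTBL; only the two columns needed here.
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE SPACEXTBL ("Flight No." INTEGER, "Launch site" TEXT)')

# Hypothetical rows: real site labels, made-up counts.
con.executemany(
    "INSERT INTO SPACEXTBL VALUES (?, ?)",
    [(1, "CCAFS"), (2, "CCAFS"), (3, "VAFB"), (4, "KSC"), (5, "CCSFS")],
)

# Count launches per label, most frequent first, ties broken alphabetically.
site_counts = con.execute(
    '''
    SELECT "Launch site", COUNT(*) AS launches
    FROM SPACEXTBL
    GROUP BY "Launch site"
    ORDER BY launches DESC, "Launch site"
    '''
).fetchall()

print(site_counts)
```

The `SELECT` statement can be pasted unchanged into a `%%sql` cell to run against the real table.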


### Task 2

##### Display 5 records where launch sites begin with the string 'CCA'

In [6]:
%%sql
SELECT *
FROM SPACEXTBL
WHERE "Launch site" LIKE 'CCA%'
LIMIT 5;

 * sqlite:///datasets/SpaceXDB.db
Done.


Flight No.,Launch site,Payload,Payload mass,Orbit,Customer,Launch outcome,Version Booster,Booster landing,Date,Time
1,CCAFS,Dragon Spacecraft Qualification Unit,0,LEO,SpaceX,Success,F9 v1.0B0003.1,Failure,4 June 2010,18:45
2,CCAFS,Dragon,0,LEO,NASA (COTS) NRO,Success,F9 v1.0B0004.1,Failure,8 December 2010,15:43
3,CCAFS,Dragon,525,LEO,NASA (COTS),Success,F9 v1.0B0005.1,No attempt,22 May 2012,07:44
4,CCAFS,SpaceX CRS-1,4700,LEO,NASA (CRS),Success,F9 v1.0B0006.1,No attempt,8 October 2012,00:35
5,CCAFS,SpaceX CRS-2,4877,LEO,NASA (CRS),Success,F9 v1.0B0007.1,No attempt,1 March 2013,15:10
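The same filter can also be run from Python with the pattern bound as a parameter, which sidesteps quoting questions altogether. This is a minimal sketch on an in-memory stand-in table with hypothetical rows; the `prefix` variable is a name introduced for illustration. It also adds an `ORDER BY`, since `LIMIT` without one does not guarantee which rows come back.

```python
import sqlite3

# In-memory stand-in for SPACEXTBL with hypothetical rows.
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE SPACEXTBL ("Flight No." INTEGER, "Launch site" TEXT)')
con.executemany(
    "INSERT INTO SPACEXTBL VALUES (?, ?)",
    [(1, "CCAFS"), (2, "VAFB"), (3, "CCSFS"), (4, "CCAFS")],
)

prefix = "CCA"  # hypothetical variable holding the prefix to match

# The ? placeholder receives the pattern; % matches any run of characters.
rows = con.execute(
    'SELECT * FROM SPACEXTBL WHERE "Launch site" LIKE ? ORDER BY "Flight No." LIMIT 5',
    (prefix + "%",),
).fetchall()

print(rows)
```

Note that the 'CCA' prefix matches CCAFS but not CCSFS, so launches recorded under the newer label are excluded by this filter.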


### Task 3

##### Display the total payload mass carried by boosters launched by NASA (CRS)

In [9]:
%%sql
SELECT Customer, SUM("Payload mass") AS Total_Payload_Carried
FROM SPACEXTBL
WHERE Customer = 'NASA (CRS)'
GROUP BY Customer;

 * sqlite:///datasets/SpaceXDB.db
Done.


Customer,Total_Payload_Carried
NASA (CRS),48.0
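This total deserves a second look: the two NASA (CRS) rows shown in Task 2 alone carry 4700 and 4877, so 48.0 is too small. SQLite's `SUM` returns a floating-point value whenever any input is not an integer, so the `.0` hints that at least some "Payload mass" values are not stored as integers. How they are actually stored is not visible here, so the sketch below is only an assumption-laden illustration: it builds an in-memory table with one hypothetical comma-formatted text value, inspects storage classes with `typeof()`, and cleans the values before summing.

```python
import sqlite3

# In-memory stand-in; "Payload mass" has no declared type, so values keep
# whatever storage class they are inserted with.
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE SPACEXTBL (Customer TEXT, "Payload mass")')

# Hypothetical rows: one mass stored as an integer, one as comma-formatted text.
con.executemany(
    "INSERT INTO SPACEXTBL VALUES (?, ?)",
    [("NASA (CRS)", 4700), ("NASA (CRS)", "4,877")],
)

# typeof() reports the storage class of each stored value.
storage = con.execute(
    '''
    SELECT typeof("Payload mass") AS storage_class, COUNT(*)
    FROM SPACEXTBL
    GROUP BY storage_class
    ORDER BY storage_class
    '''
).fetchall()

# Strip thousands separators and cast to INTEGER before summing.
total = con.execute(
    '''
    SELECT SUM(CAST(REPLACE("Payload mass", ',', '') AS INTEGER))
    FROM SPACEXTBL
    WHERE Customer = 'NASA (CRS)'
    '''
).fetchone()[0]

print(storage)
print(total)
```

If the `typeof()` query on the real table reports `text` rows, cleaning the column in Pandas before calling `to_sql` (for example with `pd.to_numeric`) would be the more durable fix; the right cleaning step depends on what the raw values actually look like.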
