# 🚀 SpaceX Falcon 9 First Stage Landing Prediction

## 1. Introduction

The goal of this project is to **predict whether the Falcon 9 first stage will land successfully**.

Falcon 9 is a **reusable rocket** developed by the company SpaceX. It consists of two main parts (stages):

- **First stage** is responsible for liftoff and for pushing the rocket through the initial ascent
- **Second stage** ignites after the first stage separates and is in charge of placing the payload into orbit

The key point is that the **first stage must land back on Earth** so it can be reused.
SpaceX advertises a Falcon 9 launch at a cost of $62M, while other providers can charge upwards of $165M per launch. A big part of the cost savings comes from being able to reuse the first stage.

If we can identify the factors that influence a successful landing, we can build a model to predict the outcome, which helps with planning, saving money, and improving future missions.

As part of the IBM course, I completed labs on collecting data from the [SpaceX API](https://api.spacexdata.com/v4/launches/past) and from a [Wikipedia article](https://en.wikipedia.org/wiki/List_of_Falcon_9_and_Falcon_Heavy_launches) on Falcon 9 launches. This notebook skips the collection step and uses the prepared `space-data.csv` file as the main data source.
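Since everything downstream depends on that file, here is a minimal sketch of how the prepared CSV could be loaded. The helper name `load_launch_data` is hypothetical, and the sketch assumes `space-data.csv` sits in the notebook's working directory; it makes no assumptions about the column names.

```python
from pathlib import Path

import pandas as pd


def load_launch_data(csv_path="space-data.csv"):
    """Read the prepared launch dataset into a DataFrame.

    `csv_path` defaults to the file name used in this notebook; the
    location (current working directory) is an assumption.
    """
    path = Path(csv_path)
    if not path.exists():
        # Fail early with a clear message instead of a long pandas traceback
        raise FileNotFoundError(f"Expected the prepared dataset at {path.resolve()}")
    return pd.read_csv(path)


# Only attempt the load when the file is actually present
if Path("space-data.csv").exists():
    df = load_launch_data()
    print(df.shape)    # (rows, columns)
    print(df.dtypes)   # quick look at the column types
```

Checking for the file up front makes a wrong working directory obvious right away, which is a common stumbling block when a notebook is moved between machines.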

## 2. Importing the libraries

We begin by importing the libraries used to analyze the data and build the prediction model:
- `pandas` for data manipulation and cleaning
- `numpy` for numerical operations
- `matplotlib` and `seaborn` for exploratory data analysis (EDA) and visualization
- `scikit-learn` for ready-made machine learning algorithms and supporting tools

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Scikit-learn modules:
# Preprocessing allows us to standardize our data
from sklearn import preprocessing
# Allows us to split our data into training and testing data
from sklearn.model_selection import train_test_split
# Allows us to test parameters of classification algorithms and find the best one
from sklearn.model_selection import GridSearchCV
# Logistic Regression classification algorithm
from sklearn.linear_model import LogisticRegression
# Support Vector Machine classification algorithm
from sklearn.svm import SVC
# Decision Tree classification algorithm
from sklearn.tree import DecisionTreeClassifier
# K Nearest Neighbors classification algorithm
from sklearn.neighbors import KNeighborsClassifier
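To show how these imports typically fit together, the sketch below runs a small standardize, split, and grid-search workflow. It uses randomly generated stand-in data rather than the SpaceX dataset, and the feature count, split ratio, and parameter grid are illustrative assumptions, not the settings used later in this notebook.

```python
import numpy as np
from sklearn import preprocessing
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (NOT the SpaceX dataset): 90 samples, 4 features
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 4))
# Binary label loosely driven by the first two features plus noise
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=90) > 0).astype(int)

# Hold out 20% of the rows for testing, keeping the class balance similar
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2, stratify=y
)

# Fit the scaler on the training rows only, then apply it to both splits
scaler = preprocessing.StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# Try a few regularization strengths with 5-fold cross-validation
param_grid = {"C": [0.01, 0.1, 1]}
search = GridSearchCV(LogisticRegression(), param_grid, cv=5)
search.fit(X_train_std, y_train)

print("best params:", search.best_params_)
print("test accuracy:", search.score(X_test_std, y_test))
```

Fitting the scaler on the training split alone keeps information from the test rows out of the model. The same pattern applies to `SVC`, `DecisionTreeClassifier`, and `KNeighborsClassifier` by swapping the estimator and its parameter grid.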