# IBM-Data-Science-Capstone-SpaceX

# Introduction

![](https://drop.ndtv.com/images/homepage/spacex_rocket_falcon_heavy_650_636535850757153131.gif?downsize=773:435)

## Background
SpaceX advertises Falcon 9 rocket launches at a cost of 62 million USD, significantly lower than the 165 million USD or more charged by other providers. This cost reduction is largely due to the reusability of the rocket's first stage. By predicting whether the first stage will land successfully, we can estimate the cost of a launch. To achieve this, I use publicly available data and apply machine learning models.

## Explore
* Analyze how payload mass, launch sites, orbit types, and flight numbers influence outcomes.
* Examine the rate of successful landings over time to identify trends.
* Identify the best predictive model for successful landings using binary classification.

## Executive Summary
**Summary of Methodologies**

To identify patterns in landing outcomes, the following approaches were used:

* **Data Collection:** using API GET requests and web scraping
* **Data Wrangling**
* **Exploratory Data Analysis:** using SQL and data visualization
* **Interactive Visualization:** using Folium and Plotly Dash
* **Machine Learning Prediction**

**Summary of Results**

* **Exploratory Data Analysis results**
* **Visualization insights in screenshots**
* **Machine Learning Prediction Model** that demonstrated the best performance across evaluation metrics

## Results

### Exploratory Data Analysis:

* **Orbit Success Rates:**
    * SSO Orbit: 100% success rate.
    * VLEO, LEO, ISS, MEO Orbits: 60-90% success rate.
    * LEO, ISS, PO Orbits: Success correlates with higher flight numbers and greater payloads.
    * GTO Orbit: Success remains unpredictable.
* **Trends and Sites:**
    * Success rates increased steadily from 2013 to 2020.
    * KSC LC-39A has the highest success rate.

* **Booster Version:**
    * FT Booster: Highest success rate overall.
    * B4 Booster: Performs best at KSC LC-39A.

* **Visual Insights:**
    * All launch sites are located near the Equator and along coastal areas.
    * They are placed far from urban areas, highways, and railways to minimize risks from potential launch failures, while still being close enough for logistical support.

* **Predictive Analysis Results**
    * The Decision Tree Classifier is the best-performing model for predicting launch outcomes.

# Methodology

## Data Collection - API
* **Request data** from SpaceX API (rocket launch data)
* **Decode response** using `.json()` and convert to a dataframe using `pd.json_normalize()`
* **Request information** about the launches from SpaceX API using custom functions
* **Create dictionary** from the data
* **Create dataframe** from the dictionary
* **Filter dataframe** to contain only Falcon 9 launches
* **Replace missing values** of Payload Mass with calculated .mean()
* **Export data** to csv file
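
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the endpoint URL, the column names `BoosterVersion` and `PayloadMass`, and the helper names `fetch_launches` and `clean_launches` are assumptions, and the sample rows are made up so the sketch runs offline.

```python
import pandas as pd

# Assumed endpoint; the project may have used a different URL or API version.
SPACEX_URL = "https://api.spacexdata.com/v4/launches/past"


def fetch_launches(url: str = SPACEX_URL) -> pd.DataFrame:
    """Request launch records and flatten the JSON response into a dataframe."""
    import requests  # imported here so the offline sample below runs without it

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return pd.json_normalize(response.json())


def clean_launches(df: pd.DataFrame) -> pd.DataFrame:
    """Keep Falcon 9 rows and fill missing payload mass with the column mean.

    Assumes hypothetical columns 'BoosterVersion' and 'PayloadMass'.
    """
    falcon9 = df[df["BoosterVersion"] == "Falcon 9"].copy()
    mean_mass = falcon9["PayloadMass"].mean()
    falcon9["PayloadMass"] = falcon9["PayloadMass"].fillna(mean_mass)
    return falcon9.reset_index(drop=True)


# Made-up sample rows standing in for the API response.
sample = pd.DataFrame(
    {
        "BoosterVersion": ["Falcon 1", "Falcon 9", "Falcon 9", "Falcon 9"],
        "PayloadMass": [20.0, 500.0, None, 1500.0],
    }
)
cleaned = clean_launches(sample)

# Export step: with no path, to_csv returns the CSV text; pass a filename to write a file.
csv_text = cleaned.to_csv(index=False)
```

Computing the mean after filtering means the fill value reflects only Falcon 9 payloads, not other rockets.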

## Data Collection - Web Scraping
* **Request data** (Falcon 9 launch data) from Wikipedia
* **Create BeautifulSoup object** from HTML response
* **Extract column names** from HTML table header
* **Collect data** from parsing HTML tables
* **Create dictionary** from the data
* **Create dataframe** from the dictionary
* **Export data** to csv file
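
A minimal sketch of the parsing steps, assuming a simple table layout. The HTML snippet, its column names, and the helper `table_to_dataframe` are placeholders for illustration; the real Wikipedia tables are more irregular and need extra cleaning.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up HTML standing in for the downloaded page.
html = """
<table class="wikitable">
  <tr><th>Flight No.</th><th>Launch site</th><th>Payload</th></tr>
  <tr><td>1</td><td>Site A</td><td>Payload A</td></tr>
  <tr><td>2</td><td>Site B</td><td>Payload B</td></tr>
</table>
"""


def table_to_dataframe(html_text: str) -> pd.DataFrame:
    """Parse the first HTML table into a dataframe via a column dictionary."""
    soup = BeautifulSoup(html_text, "html.parser")
    table = soup.find("table")

    # Column names come from the header cells.
    columns = [th.get_text(strip=True) for th in table.find_all("th")]
    data = {name: [] for name in columns}

    for row in table.find_all("tr"):
        cells = row.find_all("td")
        if len(cells) != len(columns):
            continue  # skips the header row and any malformed rows
        for name, cell in zip(columns, cells):
            data[name].append(cell.get_text(strip=True))

    return pd.DataFrame(data)


launch_df = table_to_dataframe(html)
```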

## Data Wrangling
* **Convert outcomes** into 1 for a successful landing and 0 for an unsuccessful landing
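
One way to do this conversion is sketched below. The outcome labels and the `Outcome` and `Class` column names are assumptions about how the raw data is encoded, and the sample rows are made up.

```python
import pandas as pd

# Made-up sample; labels are assumed to follow a "<result> <landing type>" pattern.
df = pd.DataFrame(
    {"Outcome": ["True ASDS", "False Ocean", "None None", "True RTLS", "False ASDS"]}
)

# Outcomes treated as unsuccessful landings in this sketch.
bad_outcomes = {"False ASDS", "False Ocean", "False RTLS", "None ASDS", "None None"}

# 1 = successful landing, 0 = unsuccessful landing.
df["Class"] = (~df["Outcome"].isin(bad_outcomes)).astype(int)

# The mean of a 0/1 column is the overall landing success rate.
success_rate = df["Class"].mean()
```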

## EDA with Visualization
* **Create charts** to analyze relationships and show comparisons

## EDA with SQL
* **Query the data** to understand more about the data
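
As an illustration of the kind of query involved, the sketch below loads a few made-up rows into an in-memory SQLite database and summarizes them per launch site. The table name, column names, and the use of SQLite are assumptions; the project may have used a different database and schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE launches ("
    "launch_site TEXT, payload_mass_kg REAL, landing_class INTEGER)"
)

# Made-up rows for illustration only.
rows = [
    ("Site A", 500.0, 0),
    ("Site A", 2500.0, 1),
    ("Site B", 4000.0, 1),
    ("Site B", 6000.0, 1),
]
conn.executemany("INSERT INTO launches VALUES (?, ?, ?)", rows)

# Launch count, average payload, and landing success rate per site.
query = """
SELECT launch_site,
       COUNT(*)             AS launches,
       AVG(payload_mass_kg) AS avg_payload_kg,
       AVG(landing_class)   AS success_rate
FROM launches
GROUP BY launch_site
ORDER BY launch_site
"""
summary = conn.execute(query).fetchall()
conn.close()
```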

## Maps with Folium
* **Create maps** to visualize launch sites, view launch outcomes, and measure distances to nearby features such as coastlines, highways, railways, and cities
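
Distances between a launch site and a nearby feature can be computed before drawing them on the map. The sketch below uses the haversine great-circle formula; it does not use Folium itself, the coordinates are hypothetical, and the project may have calculated distances differently.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Hypothetical coordinates for a launch site and a point on the coastline.
launch_site = (28.5, -80.6)
coastline_point = (28.5, -80.5)
distance_km = haversine_km(*launch_site, *coastline_point)
```

The resulting value can then be shown as a label on a map marker or along a line drawn between the two points.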

## Dashboard with Plotly Dash
* **Create dashboard** containing:
    * Pie chart showing successful launches
    * Scatter chart showing Payload Mass vs. Success Rate by Booster Version

## Predictive Analytics
* **Create** NumPy array from the Class column
* **Standardize** the data with StandardScaler. Fit and transform the data.
* **Split** the data using train_test_split
* **Create** a GridSearchCV object with cv=10 for parameter optimization
* **Apply** GridSearchCV on different algorithms: logistic regression (LogisticRegression()), support vector machine (SVC()), decision tree (DecisionTreeClassifier()), K-Nearest Neighbor (KNeighborsClassifier())
* **Calculate** accuracy on the test data using .score() for all models
* **Assess** the confusion matrix for all models
* **Identify** the best model using `jaccard_score`, `f1_score`, and accuracy
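
The workflow above can be sketched as follows. The data here is synthetic (generated with `make_classification`) and the parameter grids are deliberately small assumptions, so the scores say nothing about the real launch data or about which model the project found best.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score, jaccard_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the launch features and the Class column.
X, y = make_classification(n_samples=200, n_features=8, random_state=2)
y = np.asarray(y)

# Standardize, then split, following the order of the steps listed above.
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=2
)

# Assumed, intentionally small grids; the project's grids may differ.
candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"C": [0.01, 0.1, 1]}),
    "svm": (SVC(), {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10]}),
    "tree": (DecisionTreeClassifier(random_state=2), {"max_depth": [2, 4, 6]}),
    "knn": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, cv=10)  # 10-fold cross-validation
    search.fit(X_train, y_train)
    y_pred = search.predict(X_test)
    results[name] = {
        "best_params": search.best_params_,
        "accuracy": search.score(X_test, y_test),
        "jaccard": jaccard_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "confusion": confusion_matrix(y_test, y_pred),
    }

# Rank by test accuracy here; Jaccard and F1 are kept for comparison.
best_model = max(results, key=lambda n: results[n]["accuracy"])
```

Scaling before the split lets test-set statistics influence the scaler. Wrapping the scaler and estimator in a scikit-learn `Pipeline` and passing that to `GridSearchCV` avoids this.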

# Conclusion

* **Orbit Insights**
    * SSO Orbit demonstrates a perfect 100% success rate, highlighting its reliability.
    * For VLEO, LEO, ISS, and MEO orbits, success rates vary between 60-90%, indicating a need for further refinement in these categories.
    * LEO, ISS, and PO orbits show a positive correlation between success rates and higher flight numbers as well as greater payloads.
    * GTO orbit success remains inconsistent, making predictions challenging.
      
* **Trends and Launch Sites**
    * The success rate has steadily increased from 2013 to 2020, reflecting advancements in technology and operations.
    * Among all sites, KSC LC-39A has the highest success rate, making it the most reliable.

* **Booster Performance**
    * The FT Booster version has the highest overall success rate across all orbits.
    * The B4 Booster performs exceptionally well when launched from KSC LC-39A.

* **Visual and Strategic Insights**
    * All launch sites are positioned close to the Equator and coastal areas, which lets rockets benefit from Earth's rotational speed and keeps flight paths over open water.
    * Sites are located far from urban centers, highways, and railways, reducing potential risks from failed launches while remaining logistically viable.
      
* **Predictive Modeling**
    * The Decision Tree Classifier emerged as the most effective model for predicting launch outcomes, supported by superior evaluation metrics.
      

* **Overall Key Takeaways**
    * The steady improvement in success rates indicates strong growth in space launch capabilities.
    * Strategic placement of sites and advancements in booster technology significantly contribute to mission success.
    * Data-driven predictive models offer promising accuracy for future mission planning.






