<img src="Images/SpaceX.jpg" width="5000" height="5000" alt="image description">

# SpaceX - Satellite Failure Prediction
<hr style="border-top: 2px solid black;">

**Introduction:** SpaceX provides satellite-based services and technology to customers worldwide. As demand for satellite internet and communications grows, the company must keep its satellite fleet reliable and long-lived while minimizing operational costs. One of its most notable ventures is the Starlink program, which aims to provide high-speed internet access to remote and underserved areas through a network of thousands of satellites in low Earth orbit. Starlink's goal is to offer internet access to individuals, businesses, and organizations that would otherwise lack a high-speed connection, helping to close the digital divide.

**Business Problem:** Through its Starlink program, SpaceX faces the challenge of maintaining the integrity of its satellite fleet. Replacing failed satellites is expensive, and failures can reduce the availability and quality of service for customers. To address this, the company aims to predict which satellites are at risk of failure so that it can take preventative measures and avoid costly replacements. Using machine learning techniques, we can analyze telemetry data, weather data, and other relevant information to identify patterns and trends that indicate a satellite's likelihood of failure. This will enable the company to address potential issues proactively and maintain continuity of service while reducing costs.

**Project Overview:** This project applies data analysis and machine learning to the problem of keeping SpaceX's satellite fleet reliable. As the lead Data Engineer and Data Scientist, my role is to design and implement a data pipeline that ingests, processes, and analyzes data sources such as telemetry, weather, and satellite configuration data. Using these data, I will build predictive models that estimate the likelihood of satellite failure from factors such as satellite age, orbital characteristics, and environmental conditions. These predictions will enable SpaceX to adopt preventative measures and optimize its fleet, improving reliability, efficiency, and cost-effectiveness. Ultimately, the project aims to enhance the customer experience and contribute to the company's profitability.

**Data Collection:** The satellite and telemetry data used in this project were obtained from Kaggle (https://www.kaggle.com/), a platform that hosts a wide range of datasets. The weather data were sourced from the National Oceanic and Atmospheric Administration (NOAA) (https://www.noaa.gov/). The data span several years, which provides a large and diverse dataset for training the predictive models. These data are used solely for personal projects aimed at improving my machine learning skills.

In [1]:
!ls Data

SpaceX Satellite Dataset.csv
iot_telemetry_data.csv
weather.csv
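
Since the project draws on several CSV files in one folder, it can be convenient to load them all in one pass. The following is a minimal sketch, not the notebook's own method: the helper name `load_csv_folder` is hypothetical, and the demonstration writes two tiny files with made-up placeholder columns to a temporary folder so that the block runs without the real data.

```python
from pathlib import Path
import tempfile

import pandas as pd


def load_csv_folder(folder):
    """Read every *.csv file in `folder` into a dict keyed by file name without extension."""
    return {
        path.stem: pd.read_csv(path)
        for path in sorted(Path(folder).glob("*.csv"))
    }


# Demonstration on a throwaway folder; the file contents are made-up placeholders,
# not the real weather or telemetry schemas.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "weather.csv").write_text("a,b\n1,2\n")
    Path(tmp, "iot_telemetry_data.csv").write_text("x\n5\n6\n")
    frames = load_csv_folder(tmp)

# Summarize what was loaded: (rows, columns) per file
shapes = {name: frame.shape for name, frame in frames.items()}
print(shapes)
```

In the notebook itself, the equivalent call would be `load_csv_folder("Data")`, which returns one DataFrame per file in the listing above.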


In [2]:
import pandas as pd

In [3]:
# Load the satellite catalog and preview the first five rows
df = pd.read_csv('Data/SpaceX Satellite Dataset.csv')
df.head()

| | Unnamed: 0 | Satellite ID(Fake) | Current Official Name of Satellite | Country/Org of UN Registry | Country of Operator/Owner | Users | Class of Orbit | Type of Orbit | Longitude of GEO (degrees) | Perigee (km) | ... | Period (minutes) | Launch Mass (kg.) | Date of Launch | Expected Lifetime (yrs.) | Contractor | Country of Contractor | Launch Site | Launch Vehicle | COSPAR Number | NORAD Number |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | Starlink-1007 | USA | USA | Commercial | LEO | Non-Polar Inclined | 0.0 | 559.0 | ... | 95.9 | 227.0 | 2019-11-11 | NaN | SpaceX | USA | Cape Canaveral | Falcon 9 | 2019-074A | 44713.0 |
| 1 | 1 | 2 | Starlink-1008 | USA | USA | Commercial | LEO | Non-Polar Inclined | 0.0 | 549.0 | ... | 95.6 | 227.0 | 2019-11-11 | NaN | SpaceX | USA | Cape Canaveral | Falcon 9 | 2019-074B | 44714.0 |
| 2 | 2 | 3 | Starlink-1009 | USA | USA | Commercial | LEO | Non-Polar Inclined | 0.0 | 549.0 | ... | 95.5 | 227.0 | 2019-11-11 | NaN | SpaceX | USA | Cape Canaveral | Falcon 9 | 2019-074C | 44715.0 |
| 3 | 3 | 4 | Starlink-1010 | USA | USA | Commercial | LEO | Non-Polar Inclined | 0.0 | 533.0 | ... | 95.3 | 227.0 | 2019-11-11 | NaN | SpaceX | USA | Cape Canaveral | Falcon 9 | 2019-074D | 44716.0 |
| 4 | 4 | 5 | Starlink-1011 | USA | USA | Commercial | LEO | Non-Polar Inclined | 0.0 | 548.0 | ... | 95.6 | 227.0 | 2019-11-11 | NaN | SpaceX | USA | Cape Canaveral | Falcon 9 | 2019-074E | 44717.0 |
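
The preview shows an `Unnamed: 0` column that simply repeats the row index, which typically happens when a CSV was saved together with its index. A minimal sketch of one way to avoid the extra column, assuming that is the cause here, is to pass `index_col=0` to `pd.read_csv`. The inline CSV text is a tiny stand-in built from two of the rows above, not the real file.

```python
import io

import pandas as pd

# Stand-in for the real file: the first header cell is empty because the
# index was written out as an unnamed column.
csv_text = (
    ",Satellite ID(Fake),Current Official Name of Satellite\n"
    "0,1,Starlink-1007\n"
    "1,2,Starlink-1008\n"
)

# Default read: pandas names the empty header "Unnamed: 0"
default_read = pd.read_csv(io.StringIO(csv_text))

# Treat the first column as the index instead of a data column
indexed_read = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(default_read.columns))
print(list(indexed_read.columns))
```

An alternative with the same effect on an already loaded frame is `df.drop(columns='Unnamed: 0')`.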


In [4]:
# List every column name, including those elided from the preview
print(df.columns)

Index(['Unnamed: 0', 'Satellite ID(Fake)',
       'Current Official Name of Satellite', 'Country/Org of UN Registry',
       'Country of Operator/Owner', 'Users', 'Class of Orbit', 'Type of Orbit',
       'Longitude of GEO (degrees)', 'Perigee (km)', 'Apogee (km)',
       'Eccentricity', 'Inclination (degrees)', 'Period (minutes)',
       'Launch Mass (kg.)', 'Date of Launch', 'Expected Lifetime (yrs.)',
       'Contractor', 'Country of Contractor', 'Launch Site', 'Launch Vehicle',
       'COSPAR Number', 'NORAD Number'],
      dtype='object')
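
Satellite age, one of the factors named in the project overview, does not appear among these columns, but it can be derived from `Date of Launch`. The following is a minimal sketch, assuming age is measured in years from launch to a chosen reference date. The helper name `add_age_years`, the output column `Age (yrs.)`, and the reference date `2023-01-01` are hypothetical choices for illustration, and the two-row frame stands in for the real dataset.

```python
import pandas as pd

# Two rows copied from the preview, standing in for the full dataset
sample = pd.DataFrame(
    {
        "Current Official Name of Satellite": ["Starlink-1007", "Starlink-1008"],
        "Date of Launch": ["2019-11-11", "2019-11-11"],
    }
)


def add_age_years(frame, as_of):
    """Return a copy of `frame` with a hypothetical 'Age (yrs.)' column.

    Age is the number of days between launch and `as_of`, divided by 365.25.
    """
    out = frame.copy()
    launch = pd.to_datetime(out["Date of Launch"])
    out["Age (yrs.)"] = (pd.Timestamp(as_of) - launch).dt.days / 365.25
    return out


# The reference date is an arbitrary example, not taken from the project
aged = add_age_years(sample, as_of="2023-01-01")
print(aged)
```

Fixing the reference date explicitly, rather than using the current date, keeps the feature reproducible between runs.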
