## DATA EXPLORATION

This document covers the data exploration phase of the project. It describes the procedures used to analyze the data set selected for the project.
The data set describes the attributes of satellites launched by different countries. It aims to include all active satellites so that the data is as relevant as possible for research and analysis.
The major data attributes analyzed as part of this deliverable are:

1. Number of Satellite Launches each Year
2. Number of Satellites by Country
3. Purpose of Satellites
4. Expected Lifetime of Satellites
5. Satellite Lifetime vs. Launch Year
6. Satellite Positions

In [3]:
import numpy as np
import pandas as pd
import xlrd


In [4]:
# importing some plotly modules

import chart_studio
import plotly.graph_objs as go
from plotly.offline import iplot, init_notebook_mode

# importing cufflinks and enabling plotly offline mode

import cufflinks
cufflinks.go_offline(connected=True)
init_notebook_mode(connected=True)

import plotly.express as px

In [41]:
import plotly.graph_objects as go

In [5]:
data=pd.read_excel('UCS-Satellite-Database-4-1-2020.xls')

In [6]:
data.head()

Unnamed: 0,"Name of Satellite, Alternate Names",Country/Org of UN Registry,Country of Operator/Owner,Operator/Owner,Users,Purpose,Detailed Purpose,Class of Orbit,Type of Orbit,Number of Satellite,...,Year of Launch,Date of Launch,Expected Lifetime (yrs.),Contractor,Country of Contractor,Launch Site,Launch Vehicle,COSPAR Number,NORAD Number,Comments
0,1HOPSAT-TD (1st-generation High Optical Perfor...,NR (3/20),USA,Hera Systems,Commercial,Earth Observation,Technology Development,LEO,Non-Polar Inclined,1,...,2019,2019-12-11,0.5,Hera Systems,USA,Satish Dhawan Space Centre,PSLV,2019-089H,44589,Pathfinder for planned earth observation const...
1,3Cat-1,NR,Spain,Universitat Politècnica de Catalunya,Civil,Technology Development,,LEO,,1,...,2018,2018-11-29,,Universitat Politècnica de Catalunya,Spain,Satish Dhawan Space Centre,PSLV,2018-096K,43728,Student built.
2,Aalto-1,Finland,Finland,University of Aalto,Civil,Technology Development,,LEO,,1,...,2017,2017-06-23,2.0,University of Aalto,Finland,Satish Dhawan Space Centre,PSLV,2017-036L,42775,Technology development and education.
3,AAUSat-4,Denmark,Denmark,University of Aalborg,Civil,Earth Observation,Automatic Identification System (AIS),LEO,Sun-Synchronous,1,...,2016,2016-04-25,,University of Aalborg,Denmark,Guiana Space Center,Soyuz 2.1a,2016-025E,41460,Carries AIS system.
4,"ABS-2 (Koreasat-8, ST-3)",NR,Multinational,Asia Broadcast Satellite Ltd.,Commercial,Communications,,GEO,,1,...,2014,2014-02-06,15.0,Space Systems/Loral,USA,Guiana Space Center,Ariane 5 ECA,2014-006A,39508,"32 C-band, 51 Ku-band, and 6 Ka-band transpond..."


In [8]:
data.shape

(2666, 29)

The following is a final assessment of the data after some cleansing and massaging.

In [9]:
data.isnull().sum()

Name of Satellite, Alternate Names       0
Country/Org of UN Registry               0
Country of Operator/Owner                0
Operator/Owner                           0
Users                                    0
Purpose                                  0
Detailed Purpose                      1779
Class of Orbit                           0
Type of Orbit                          619
Number of Satellite                      0
Longitude of GEO (degrees)               0
Perigee (km)                             0
Apogee (km)                              0
Eccentricity                             0
Inclination (degrees)                    0
Period (minutes)                         0
Launch Mass (kg.)                      189
Dry Mass (kg.)                        2216
Power (watts)                         2063
Year of Launch                           0
Date of Launch                           0
Expected Lifetime (yrs.)              1519
Contractor                               0
Country of 

# 1. Number of Satellite Launches each Year
This histogram shows the number of satellites launched each year.

Launches increase by about 50% starting in 2017, and 343 satellites have been launched so far in 2020.

In [111]:
data['Year of Launch'].iplot(kind='hist',xTitle='Year', yTitle='count', 
                             title='Number of Satellite Launches each Year', color='purple')
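
The counts behind the histogram can also be tabulated directly, which makes a year-over-year comparison easier to check. The sketch below uses a small invented series in place of `data['Year of Launch']`; the variable names `years`, `launches_per_year`, and `yoy_change_pct` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for data['Year of Launch']; the values are invented for illustration.
years = pd.Series([2015, 2016, 2016, 2017, 2017, 2017, 2018, 2018, 2018, 2018])

# Count launches per year and order the result chronologically.
launches_per_year = years.value_counts().sort_index()

# Year-over-year change in percent (the first year has no predecessor, so it is NaN).
yoy_change_pct = launches_per_year.pct_change() * 100

print(launches_per_year)
print(yoy_change_pct.round(1))
```

Replacing `years` with the real column should give the per-year totals that the histogram bins represent.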

# 2. Number of Satellites by Country
This view shows the number of satellites operated by each country.

The United States is the sole owner of 1308 satellites, about half of the total, followed by China with 356 and Russia with 167.

In [55]:
import plotly.express as px
data_by_country=data['Country of Operator/Owner'].value_counts()
fig = px.bar(data_by_country,title="Number of Satellite By Country", log_y=True)
fig.show()
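
Because the bar chart uses a logarithmic y-axis, each country's share of the total is hard to read off visually. A minimal sketch of how the shares could be computed, using an invented series in place of `data['Country of Operator/Owner']` (the names `owners`, `counts`, `shares`, and `summary` are hypothetical):

```python
import pandas as pd

# Toy stand-in for data['Country of Operator/Owner']; the values are invented.
owners = pd.Series(
    ["USA", "USA", "China", "USA", "Russia", "China", "USA", "Multinational"]
)

# Absolute counts per country, largest first.
counts = owners.value_counts()

# The same counts expressed as a fraction of all rows.
shares = owners.value_counts(normalize=True)

summary = pd.DataFrame({"satellites": counts, "share": shares.round(3)})
print(summary)
```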

# 3. Purpose of Satellites
This graph plots the purpose of the satellites as a pie chart.

About 45% of satellites serve communications purposes, followed by Earth observation, which includes weather monitoring.

In [113]:
fig = px.pie(data, values='Number of Satellite', names='Purpose', title='Purpose of Satellites (%)')
fig.show()
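
The pie chart sums the `Number of Satellite` column for each `Purpose` and shows each sum as a percentage of the total. The same percentages can be computed explicitly, as in this sketch with an invented frame (the names `toy`, `totals`, and `purpose_share_pct` are illustrative only):

```python
import pandas as pd

# Toy stand-in for the two columns used by the pie chart; the values are invented.
toy = pd.DataFrame({
    "Purpose": ["Communications", "Earth Observation", "Communications", "Navigation"],
    "Number of Satellite": [1, 1, 2, 1],
})

# Sum the satellite counts within each purpose.
totals = toy.groupby("Purpose")["Number of Satellite"].sum()

# Convert the sums to percentages of the overall total, largest first.
purpose_share_pct = (totals / totals.sum() * 100).sort_values(ascending=False)

print(purpose_share_pct.round(1))
```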

# 4. Expected Lifetime of Satellites
This graph plots the distribution of expected lifetimes across all satellites.

The most common expected lifetime appears to be 15 years.

In [114]:
# histogram of expected lifetime; rows with a missing lifetime are not plotted
fig_hist = px.histogram(data, x="Expected Lifetime (yrs.)",
                        color_discrete_sequence=['mediumturquoise'],
                        title='Expected Lifetime of Satellites')
fig_hist.show()
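
The null counts above report 1519 missing values in `Expected Lifetime (yrs.)`, so any statement about the typical lifetime applies only to the satellites that report one. The sketch below shows one way to check the most common value while making the missing rows explicit. It uses an invented series, and the names `lifetimes`, `reported`, and `most_common` are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy stand-in for data['Expected Lifetime (yrs.)']; the values are invented.
lifetimes = pd.Series([15.0, np.nan, 2.0, 15.0, np.nan, 5.0, 15.0, 0.5])

# Count the rows that do not report a lifetime.
missing = lifetimes.isnull().sum()

# Restrict the statistics to rows that do report one.
reported = lifetimes.dropna()
most_common = reported.mode().iloc[0]   # most frequent value
median = reported.median()              # less sensitive to the long-lived cluster

print(f"missing: {missing}, most common: {most_common}, median: {median}")
```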


# 5. Satellite Lifetime vs. Launch Year
This plot compares each satellite's expected lifetime with its launch year.

The trendline shows a slight increase in expected lifetime for more recent launch years.

In [118]:
fig = px.scatter(data, y="Expected Lifetime (yrs.)", x="Year of Launch",
                 title='Satellite Lifetime vs. Launch Year',
                 size="Number of Satellite", color="Year of Launch",
                 hover_name="Operator/Owner", size_max=10, trendline="ols")
fig.show()
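
The `trendline="ols"` option fits an ordinary least squares line (plotly relies on `statsmodels` for this). To put a number on the "slight increase", the slope can be estimated separately. The following is a minimal sketch using `numpy.polyfit` on an invented frame; it is a stand-in for the plotly fit, not a reproduction of it, and the names `toy`, `clean`, `slope`, and `intercept` are illustrative.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the two plotted columns; the values are invented.
toy = pd.DataFrame({
    "Year of Launch": [2000, 2005, 2010, 2015, 2020, 2012],
    "Expected Lifetime (yrs.)": [10.0, 11.0, 12.0, 13.0, 14.0, np.nan],
})

# A least squares fit cannot use rows with a missing lifetime.
clean = toy.dropna(subset=["Expected Lifetime (yrs.)"])

# Measure years from the earliest launch to keep the fit well conditioned.
x = clean["Year of Launch"] - clean["Year of Launch"].min()
y = clean["Expected Lifetime (yrs.)"]

# Degree-1 polynomial fit: y is approximately slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)

print(f"lifetime changes by about {slope:.2f} years per launch year")
```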

# 6. Satellite Positions
This plot shows how satellites are positioned in orbit around the Earth.

It plots perigee, the point of the orbit closest to the Earth, against apogee, the point of the orbit farthest from the Earth. The color indicates the class of orbit.

Most satellites are in orbits close to the surface of the Earth.

In [120]:
import plotly.express as px


fig = px.scatter(data, x="Perigee (km)", y="Apogee (km)", color="Class of Orbit",
                 hover_name="Class of Orbit", log_x=True, size_max=120, title= 'Satellite Positions')
fig.show()

To break this down further, the same data is plotted in 3D below.

Apogee and perigee are plotted together with eccentricity on the Z-axis; eccentricity describes how strongly the orbit deviates from a circle.

Using the turntable rotation of this 3D scatter plot, we can examine Low Earth Orbit (LEO), which refers to orbits at low altitudes, so the plot places these satellites close to zero, followed by Medium Earth Orbit (MEO) and Geosynchronous Orbit (GEO).

Elliptical orbits have apogees and perigees that differ significantly from each other, so these satellites spend time at many different altitudes above the Earth's surface. The 3D plot makes them more visible than the 2D scatter plot does because they sit higher on the eccentricity axis.

In [88]:
fig_3D = px.scatter_3d(data, x="Perigee (km)", y="Apogee (km)", color="Class of Orbit", z='Eccentricity')
fig_3D.show()
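
The three axes of this plot are related: eccentricity follows from the perigee and apogee distances measured from the Earth's center, e = (r_a - r_p) / (r_a + r_p). The sketch below assumes that the `Perigee (km)` and `Apogee (km)` columns hold altitudes above the surface and that a mean Earth radius of 6371 km is adequate; both are assumptions, and the helper `eccentricity` is hypothetical rather than part of the notebook.

```python
EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def eccentricity(perigee_km, apogee_km, radius_km=EARTH_RADIUS_KM):
    """Eccentricity of an orbit from perigee and apogee altitudes in km."""
    r_p = perigee_km + radius_km   # perigee distance from the Earth's center
    r_a = apogee_km + radius_km    # apogee distance from the Earth's center
    return (r_a - r_p) / (r_a + r_p)

# A circular orbit has equal perigee and apogee, so its eccentricity is zero.
print(eccentricity(550.0, 550.0))

# A highly elliptical orbit (invented altitudes) sits much higher on the Z-axis.
print(round(eccentricity(500.0, 39800.0), 3))
```

Comparing this value with the `Eccentricity` column would be one way to sanity-check the data, provided the altitude assumption holds.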

# SUMMARY
Based on the analyses above, the data set could be used to predict several characteristics.
Possible applications include using linear regression to estimate the number of satellite launches over the next 20 years, or merging the data with global climate data and applying predictive algorithms to the changing climate and weather data collected by these satellites.
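
As a rough illustration of the linear regression idea, the sketch below fits a straight line to yearly launch counts and extrapolates it. The counts are invented, a straight line is only a crude model for launch activity, and the names `years`, `launches`, and `forecast` are hypothetical.

```python
import numpy as np

# Hypothetical yearly launch counts, invented for illustration only.
years = np.array([2015, 2016, 2017, 2018, 2019])
launches = np.array([100, 120, 140, 160, 180])

# Fit launches as slope * (year - first_year) + intercept.
offset = years - years[0]
slope, intercept = np.polyfit(offset, launches, deg=1)

def forecast(year):
    """Extrapolate the fitted line; clip at zero since counts cannot be negative."""
    return max(0.0, slope * (year - years[0]) + intercept)

print(round(forecast(2025)))
```

With the real data, `years` and `launches` would come from the per-year counts of `data['Year of Launch']`. The partial year 2020 would need to be excluded or scaled before fitting.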