<a href="https://colab.research.google.com/github/cqwhite/MachineLearning/blob/master/Kratos_Visualizations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# The Data
For these visualizations we'll use our training dataset for satellite 42709. The data is publicly available on [GitHub Gist](https://gist.github.com/davidchalifoux/c1c780966a270ade20bf8b2a5520521e).

Let's take a quick look at what this dataset looks like:

In [3]:
import pandas as pd
train_df = pd.read_csv("https://gist.githubusercontent.com/davidchalifoux/c1c780966a270ade20bf8b2a5520521e/raw").dropna()
train_df

Unnamed: 0.1,Unnamed: 0,primary_rx_time,175_177_tdoa,175_177_fdoa,175_176_tdoa,175_176_fdoa,176_177_tdoa,176_177_fdoa,maneuver,175_177_tdoa_scaled,175_177_fdoa_scaled,175_176_tdoa_scaled,175_176_fdoa_scaled,176_177_tdoa_scaled,176_177_fdoa_scaled
0,3085,2021-11-10 04:31:25+00:00,0.006301,0.006301,0.002895,0.002895,0.003406,0.003406,0,0.658884,0.658884,0.605321,0.605321,-0.320791,-0.320791
1,4586,2021-12-13 04:01:55+00:00,0.006300,0.006300,0.002895,0.002895,0.003405,0.003405,0,0.259777,0.259777,0.467071,0.467071,-0.430264,-0.430264
2,2588,2021-10-30 18:31:55+00:00,0.006300,0.006300,0.002893,0.002893,0.003407,0.003407,0,-0.024989,-0.024989,-0.471996,-0.471996,0.609948,0.609948
3,3948,2021-11-29 19:31:55+00:00,0.006300,0.006300,0.002892,0.002892,0.003408,0.003408,0,-0.107826,-0.107826,-1.156217,-1.156217,1.460480,1.460480
4,3409,2021-11-17 12:01:55+00:00,0.006303,0.006303,0.002898,0.002898,0.003404,0.003404,0,2.214159,2.214159,2.158488,2.158488,-1.244289,-1.244289
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4135,4185,2021-12-04 18:01:55+00:00,0.006301,0.006301,0.002893,0.002893,0.003408,0.003408,0,0.697517,0.697517,-0.640749,-0.640749,1.368422,1.368422
4136,4980,2021-12-22 01:31:55+00:00,0.006300,0.006300,0.002892,0.002892,0.003407,0.003407,0,-0.531679,-0.531679,-1.075464,-1.075464,1.041587,1.041587
4137,3741,2021-11-24 10:01:55+00:00,0.006300,0.006300,0.002896,0.002896,0.003404,0.003404,0,-0.135576,-0.135576,1.145978,1.145978,-1.625332,-1.625332
4138,3040,2021-11-09 06:01:25+00:00,0.006303,0.006303,0.002899,0.002899,0.003405,0.003405,0,2.868415,2.868415,2.376998,2.376998,-1.053130,-1.053130


# Matrix Graph
Using our dataset, we'll use Plotly Express to create a scatter matrix (a grid of pairwise scatter plots) to better visualize any correlations between features.

In [5]:
import plotly.express as px
import pandas as pd

train_df = pd.read_csv("https://gist.githubusercontent.com/davidchalifoux/c1c780966a270ade20bf8b2a5520521e/raw").dropna()

fig = px.scatter_matrix(
    train_df,
    dimensions=[
        # We'll comment out the TDOA data to save space
        # "175_177_tdoa",
        "175_177_fdoa",
        # "175_176_tdoa",
        "175_176_fdoa",
        # "176_177_tdoa",
        "176_177_fdoa",
    ],
    color="maneuver",
    title="Satellite 42709",
)
fig.show()
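
The scatter matrix shows correlations visually. As a rough numeric companion, the same relationships can be summarized with pandas' `DataFrame.corr()`. The sketch below uses a small synthetic DataFrame (`demo_df`, a hypothetical stand-in with made-up values) so that it runs without downloading anything; in the notebook you would call `.corr()` on the relevant columns of `train_df` instead.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for train_df (assumption: values are invented for illustration).
rng = np.random.default_rng(0)
base = rng.normal(size=200)
demo_df = pd.DataFrame(
    {
        "175_177_fdoa": base + rng.normal(scale=0.1, size=200),
        "175_176_fdoa": base + rng.normal(scale=0.1, size=200),
        "176_177_fdoa": -base + rng.normal(scale=0.1, size=200),
    }
)

# Pairwise Pearson correlation between the columns shown in the scatter matrix.
corr = demo_df.corr()
print(corr.round(2))
```

Values near 1 or -1 correspond to the tight diagonal bands in the scatter matrix, while values near 0 correspond to shapeless clouds.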

# Plotting in 3D

We can also explore this data in three dimensions. However, our data has six features, which is too many to plot directly on three axes. We'll therefore use principal component analysis (PCA) to reduce the dimensionality of the data.

Using scikit-learn's PCA implementation, we'll reduce our data to 3 principal components.

In [9]:
import plotly.express as px
import pandas as pd
from sklearn.decomposition import PCA


train_df = pd.read_csv("https://gist.githubusercontent.com/davidchalifoux/c1c780966a270ade20bf8b2a5520521e/raw").dropna()
principal_df = train_df[
    [
        "175_177_tdoa",
        "175_177_fdoa",
        "175_176_tdoa",
        "175_176_fdoa",
        "176_177_tdoa",
        "176_177_fdoa",
    ]
]
pca = PCA(n_components=3)
principalComponents = pca.fit_transform(principal_df)
principal_df = pd.DataFrame(
    data=principalComponents,
    columns=["Principal_Component_1", "Principal_Component_2", "Principal_Component_3"],
)
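# Assign the raw values: after dropna() the index of train_df may have gaps,
# so index-aligned assignment could mismatch rows or introduce NaNs.
principal_df["maneuver"] = train_df["maneuver"].to_numpy()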


fig = px.scatter_3d(
    principal_df,
    x="Principal_Component_1",
    y="Principal_Component_2",
    z="Principal_Component_3",
    color="maneuver",
    title="Satellite 42709",
)
fig.show()
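
When reducing six features to three components, it can help to check how much of the original variance the components retain. A minimal sketch, assuming synthetic data (`features` is a made-up array, not the satellite data), uses the `explained_variance_ratio_` attribute of a fitted scikit-learn `PCA`:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in (assumption): 6 features driven by 2 hidden signals plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 6))
features = latent @ mixing + rng.normal(scale=0.05, size=(300, 6))

pca = PCA(n_components=3)
components = pca.fit_transform(features)

# Fraction of the total variance captured by each principal component.
ratios = pca.explained_variance_ratio_
print(ratios)
print("Total retained:", ratios.sum())
```

If the total is close to 1, the 3D plot loses little information. Note that PCA is sensitive to feature scale, so running it on standardized columns may give a different picture than running it on raw values.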


# Plotting in 3D over time

The satellite's data is actually time-series data, so a natural way to visualize it is over time. Using the same technique as before, we'll reduce our data to two principal components and use time as the third axis.

In [10]:
"""
Uses scikit-learn's PCA implementation to reduce the data to 2 principal components.
Plots the components in 3D, with time as the third axis.
"""
import plotly.express as px
import pandas as pd
from sklearn.decomposition import PCA


train_df = pd.read_csv("https://gist.githubusercontent.com/davidchalifoux/c1c780966a270ade20bf8b2a5520521e/raw").dropna()
principal_df = train_df[
    [
        "175_177_tdoa",
        "175_177_fdoa",
        "175_176_tdoa",
        "175_176_fdoa",
        "176_177_tdoa",
        "176_177_fdoa",
    ]
]
pca = PCA(n_components=2)
principal_df = pca.fit_transform(principal_df)
principal_df = pd.DataFrame(
    data=principal_df,
    columns=["Principal_Component_1", "Principal_Component_2"],
)
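# Assign the raw values: after dropna() the index of train_df may have gaps,
# so index-aligned assignment could mismatch rows or introduce NaNs.
principal_df["maneuver"] = train_df["maneuver"].to_numpy()
principal_df["primary_rx_time"] = train_df["primary_rx_time"].to_numpy()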

fig = px.scatter_3d(
    principal_df,
    x="primary_rx_time",
    y="Principal_Component_1",
    z="Principal_Component_2",
    color="maneuver",
    title="Satellite 42709",
)
fig.show()
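
Because `primary_rx_time` is read from the CSV as text and the rows are not in chronological order, it may be worth parsing the timestamps and sorting before plotting over time. The sketch below is one possible approach, using a tiny hypothetical DataFrame (`demo_df`) built from three timestamps shown in the table at the top of this notebook:

```python
import pandas as pd

# Small stand-in for the real data; timestamps copied from the preview table above.
demo_df = pd.DataFrame(
    {
        "primary_rx_time": [
            "2021-11-10 04:31:25+00:00",
            "2021-12-13 04:01:55+00:00",
            "2021-10-30 18:31:55+00:00",
        ],
        "maneuver": [0, 0, 0],
    }
)

# Parse the strings into timezone-aware timestamps, then order rows chronologically.
demo_df["primary_rx_time"] = pd.to_datetime(demo_df["primary_rx_time"], utc=True)
demo_df = demo_df.sort_values("primary_rx_time").reset_index(drop=True)
print(demo_df)
```

With a real datetime column, the time axis is treated as continuous, and any later step that depends on row order (such as drawing lines between consecutive points) behaves as expected.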
