## Predicting Satellite Congestion Risk: Neural Network Approach

As in the previous part, this notebook explores how to predict satellite congestion risk with machine learning. We use a dataset of satellite parameters to build a model that classifies satellites into congestion risk categories (Low, Medium, High). This time, however, we take a different approach: instead of a traditional statistical machine learning algorithm, we use a neural network to make the predictions.

In [1]:
# https://www.kaggle.com/datasets/karnikakapoor/satellite-orbital-catalog

### 1. Setting Up Our Environment and Loading Data

As before, we start by importing the necessary libraries. We'll be using `pandas` for data manipulation, `sklearn` (Scikit-learn) for machine learning tasks, and `tensorflow` for the neural network model.

In [2]:
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score, KFold
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.preprocessing import OrdinalEncoder
from sklearn.utils import class_weight
import tensorflow as tf

We load the same dataset in the same way as before and read it into a pandas DataFrame for data manipulation.

In [3]:
# download kaggle dataset from google drive, and import as a pandas dataframe
df = pd.read_csv('https://drive.google.com/uc?export=download&id=1i4FdBT71ale29-1ido9Q0HNeNzOZ6lFN')

### 2. Data Preprocessing

As before, the data must be processed before the machine learning algorithm can use it. Using the same code as before, we separate the data into the target variable `y` and the features `x`, exclude irrelevant descriptive columns, convert categorical data into numbers using ordinal encoding, and then split the data into training and testing sets for the neural network.


In [4]:
# separate features (x) and target (y)
x = df.drop('congestion_risk', axis=1)
y = df['congestion_risk']

# exclude descriptive columns not used for training data
exclude_columns = ['norad_id', 'name', 'epoch', 'data_source', 'snapshot_date', 'last_seen']
categorical_cols = ['object_type', 'satellite_constellation', 'altitude_category', 'orbital_band', 'orbit_lifetime_category', 'country']

# drop excluded columns
x_processed = x.drop(columns=exclude_columns, errors='ignore')

# use ordinal encoding to change categorical data to numerical for training
feature_encoder = OrdinalEncoder()
x_processed[categorical_cols] = feature_encoder.fit_transform(x_processed[categorical_cols])

# fit a separate encoder on the entire target variable 'y' to ensure all possible labels are learned
# (refitting the feature encoder on 'y' would overwrite its fitted feature categories)
target_encoder = OrdinalEncoder()
y_encoded_full = target_encoder.fit_transform(y.values.reshape(-1, 1))
num_classes = len(target_encoder.categories_[0]) # get the number of unique classes

# split data into training and testing sets with the processed data
x_train, x_test, y_train, y_test = train_test_split(x_processed, y, test_size=0.2, random_state=42)

# now transform y_train and y_test using the fitted target encoder
y_train_encoded = target_encoder.transform(y_train.values.reshape(-1, 1)).flatten()
y_test_encoded = target_encoder.transform(y_test.values.reshape(-1, 1)).flatten()
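
To see what the label encoding above produces, the following minimal sketch fits an `OrdinalEncoder` on a small hypothetical label array (not the real dataset; the exact label strings are an assumption). With the default settings the categories are sorted, so string labels are numbered alphabetically rather than by severity.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# hypothetical labels standing in for df['congestion_risk']
toy_labels = np.array(['Low', 'High', 'Medium', 'Low', 'High']).reshape(-1, 1)

toy_encoder = OrdinalEncoder()
toy_encoded = toy_encoder.fit_transform(toy_labels).flatten()

# categories are stored in sorted order; the position is the encoded value
label_order = toy_encoder.categories_[0].tolist()
print(label_order)            # ['High', 'Low', 'Medium']
print(toy_encoded.tolist())   # [1.0, 0.0, 2.0, 1.0, 0.0]

# inverse_transform maps the numeric codes back to the original strings
decoded = toy_encoder.inverse_transform(toy_encoded.reshape(-1, 1)).flatten()
```

The alphabetical numbering is harmless for a classification target, but it matters when reading predictions back: class `0` here is 'High', not 'Low'.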

### 3. Building Our Predictive Brain: The Neural Network Model

Now that our data is processed, we can build our machine learning model. For this, we're using a type of model called a "Fully Connected Neural Network" (also known as a Dense Neural Network), built with `tensorflow` and `keras`. Neural networks are loosely inspired by the human brain and are well suited to finding complex patterns in data.

Let's break down the model's construction:

1.  **`tf.keras.models.Sequential`**: This is like stacking layers on top of each other to build our network, with data flowing from one layer to the next.

2.  **`tf.keras.layers.Dense(128, activation='relu', input_shape=(x_train.shape[1],))`**: This is our input layer and the first 'hidden' layer.
    *   `Dense` means every neuron in this layer is connected to every neuron in the previous layer (or the input).
    *   `128` is the number of 'neurons' or units in this layer. More neurons can capture more complex patterns.
    *   `activation='relu'` stands for Rectified Linear Unit. It's a commonly used activation function that helps the network learn non-linear relationships.
    *   `input_shape=(x_train.shape[1],)` tells the model how many features it should expect in each input sample (which is the number of columns in our `x_train`).

3.  **`tf.keras.layers.Dropout(0.5)`**: This is a dropout layer. During training, it randomly 'turns off' 50% of the outputs of the previous layer. This prevents the model from becoming too reliant on any single neuron and makes it more robust, which reduces the risk of overfitting, where the model memorizes the training data instead of learning general patterns.

4.  **`tf.keras.layers.Dense(64, activation='relu')`**: Another hidden layer, similar to the first, but with fewer neurons (`64`).

5.  **`tf.keras.layers.Dropout(0.3)`**: Another dropout layer, this time turning off 30% of neurons.

6.  **`tf.keras.layers.Dense(num_classes, activation='softmax')`**: This is our **output layer**.
    *   `num_classes` is the number of unique congestion risk categories we are trying to predict: 'Low', 'Medium', and 'High'.
    *   `activation='softmax'` is used for multi-class classification problems. It outputs a probability distribution over the `num_classes`, meaning it tells us the likelihood that a satellite belongs to each risk category. The sum of these probabilities will be 1.
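
The layer stack described above can be traced by hand. The following NumPy sketch is not the Keras implementation; it pushes a small random batch through randomly initialised weights of the same layer widths (128, 64, then the number of classes), assuming 10 input features and 3 classes purely for illustration. Dropout is shown in its training-time form, where the surviving activations are scaled by 1 / (1 - rate).

```python
import numpy as np

rng = np.random.default_rng(0)

n_features, n_classes = 10, 3   # assumed sizes, for illustration only
batch = rng.normal(size=(4, n_features))

def dense(x, n_out):
    """Fully connected layer with small random weights and zero biases."""
    w = rng.normal(scale=0.1, size=(x.shape[1], n_out))
    b = np.zeros(n_out)
    return x @ w + b

def relu(x):
    """Rectified Linear Unit: negative values become zero."""
    return np.maximum(x, 0.0)

def dropout(x, rate):
    """Training-time dropout: zero a random fraction and rescale the rest."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def softmax(z):
    """Turn each row of scores into probabilities that sum to 1."""
    z = z - z.max(axis=1, keepdims=True)   # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

h1 = dropout(relu(dense(batch, 128)), rate=0.5)
h2 = dropout(relu(dense(h1, 64)), rate=0.3)
probs = softmax(dense(h2, n_classes))

print(probs.shape)         # (4, 3)
print(probs.sum(axis=1))   # each row sums to 1, up to floating-point error
```

Dropout is only active during training; when Keras evaluates or predicts, every neuron is used.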

### Compiling the Model: Setting Up for Learning

After defining the network's structure, we need to 'compile' it. This step configures the learning process:

*   **`optimizer='adam'`**: The optimizer is the algorithm that updates the network's weights during training. 'Adam' is a popular and effective choice that adjusts the internal weights of the neural network to minimize the loss.
*   **`loss='sparse_categorical_crossentropy'`**: The 'loss function' measures how far the model's predictions are from the true values. For multi-class classification with integer-encoded labels, `sparse_categorical_crossentropy` is the appropriate choice. The optimizer tries to minimize this loss.
*   **`metrics=['accuracy']`**: Metrics are what we use to monitor the training process and evaluate the model's performance. 'Accuracy' is straightforward: it is the proportion of correctly predicted instances.
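
As a worked example of the loss and metric described above, the sketch below computes sparse categorical cross-entropy and accuracy by hand on a few hypothetical softmax outputs. It is a simplified illustration (Keras additionally clips probabilities to avoid taking the log of zero), and the numbers are made up for the example.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs):
    """Mean negative log-probability assigned to the true class."""
    y_true = np.asarray(y_true, dtype=int)
    picked = probs[np.arange(len(y_true)), y_true]   # probability of each true class
    return float(-np.mean(np.log(picked)))

# hypothetical softmax outputs for three samples and three classes
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
])
y_true = [0, 1, 2]   # integer-encoded labels, one per sample

loss = sparse_categorical_crossentropy(y_true, probs)
accuracy = float(np.mean(probs.argmax(axis=1) == np.asarray(y_true)))
print(round(loss, 4), accuracy)   # 0.4987 1.0
```

All three samples are classified correctly, yet the loss is not zero: the third prediction is only 40% confident, and the loss penalises that uncertainty while accuracy does not.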

### Training and Evaluating the Model: The Learning Begins!

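Finally, we train our model using the `cnn_model.fit()` method (the variable is named `cnn_model`, although the network is a dense one rather than a convolutional one). This is where the model 'learns' from the training data: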

*   **`x_train`, `y_train_encoded`**: These are our training features and their corresponding encoded target labels.
*   **`epochs=10`**: An epoch is one complete pass through the entire training dataset. So here we are telling the model to iterate over the data 10 times.
*   **`batch_size=32`**: Instead of feeding all the data at once, which can be memory intensive, the model processes data in smaller 'batches' (here, 32 samples at a time) and updates its weights after each batch.
*   **`validation_split=0.2`**: During training, we reserve a portion (20%) of the training data as a 'validation set'. Keras takes these rows from the end of the training data, before any shuffling. The model doesn't learn from this data; it is only used to monitor performance after each epoch, which helps us catch overfitting early.
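
The batch and validation settings above determine how many weight updates happen per epoch. The sketch below works through that arithmetic, assuming the held-out rows are counted by flooring and that the last partial batch is still processed. The sample count of 10,900 is a hypothetical size for `x_train`, not a figure taken from the dataset.

```python
import math

def steps_per_epoch(n_samples, validation_split, batch_size):
    """Return (updates per epoch, training rows, validation rows)."""
    n_train = math.floor(n_samples * (1.0 - validation_split))
    steps = math.ceil(n_train / batch_size)   # the final batch may be smaller
    return steps, n_train, n_samples - n_train

# 10_900 is a hypothetical size for x_train, chosen only for illustration
steps, n_train, n_val = steps_per_epoch(10_900, validation_split=0.2, batch_size=32)
print(steps, n_train, n_val)   # 273 8720 2180
```

The `273/273` counter in the training log is this steps-per-epoch value, so with a batch size of 32 the portion actually trained on holds between 8,705 and 8,736 rows.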

After training, we `evaluate` the model on our completely unseen `x_test` and `y_test_encoded` data to get a final, unbiased measure of its performance. The `accuracyCNN` value tells us how well our neural network performed in predicting the congestion risk for satellites it had never seen before.

In [5]:
# define a fully connected (dense) neural network model
cnn_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(x_train.shape[1],)), # input layer with number of features
    tf.keras.layers.Dropout(0.5), # dropout layer to prevent overfitting
    tf.keras.layers.Dense(64, activation='relu'), # hidden layer
    tf.keras.layers.Dropout(0.3), # dropout layer to prevent overfitting
    tf.keras.layers.Dense(num_classes, activation='softmax') # output layer with number of classes
])

# compile the model
cnn_model.compile(optimizer='adam', # popular and efficient algorithm
              loss='sparse_categorical_crossentropy', # sparse categorical cross-entropy is appropriate for multi-class classification problems
              metrics=['accuracy'])

# fit the model
cnn_model.fit(x_train, y_train_encoded, epochs=10, batch_size=32, validation_split=0.2)

# evaluate the model
lossCNN, accuracyCNN = cnn_model.evaluate(x_test, y_test_encoded)
print('CNN Test accuracy:', accuracyCNN)


Epoch 1/10
273/273 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.6361 - loss: 206.3427 - val_accuracy: 0.8251 - val_loss: 24.7365
Epoch 2/10
273/273 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.6966 - loss: 34.1427 - val_accuracy: 0.7916 - val_loss: 2.4682
Epoch 3/10
273/273 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.7093 - loss: 12.7993 - val_accuracy: 0.8251 - val_loss: 1.5386
Epoch 4/10
273/273 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7879 - loss: 5.7955 - val_accuracy: 0.8251 - val_loss: 0.6485
Epoch 5/10
273/273 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8110 - loss: 2.0699 - val_accuracy: 0.8251 - val_loss: 0.5612
Epoch 6/10
273/273 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8141 - loss: 1.3331 - val_accuracy: 0.8251 - val_loss: 0.5450
Epoch 7/10
273/273

### 4. Conclusion: Neural Networks for Satellite Congestion

Our neural network achieved an accuracy of approximately 0.817 (81.7%) on unseen test data. This figure should be read with some caution: the validation accuracy in the training log sits at exactly 0.8251 in almost every epoch, which may indicate that the model is mostly predicting a single dominant class, so overall accuracy could overstate how well the individual risk categories are separated. The Random Forest Classifier from the previous part performed even better on this data. A likely reason is that tabular features like these are often well separated by threshold-style splits on individual features, which is exactly the kind of decision a Random Forest makes.
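
Accuracy alone can hide how a classifier treats each class, particularly when one class dominates. The sketch below uses hypothetical encoded labels and predictions (not results from this notebook) to compare a model's accuracy against the majority-class baseline and to print per-class precision and recall with `classification_report`, which is already imported at the top of the notebook.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report

# hypothetical encoded labels in which class 1 dominates
y_true = np.array([1] * 8 + [0] * 1 + [2] * 1)
# hypothetical predictions from a model that always picks the majority class
y_pred = np.full_like(y_true, 1)

# accuracy obtained by always guessing the most frequent class
baseline = np.bincount(y_true).max() / len(y_true)
model_accuracy = accuracy_score(y_true, y_pred)
print(baseline, model_accuracy)   # 0.8 0.8

# per-class metrics expose that classes 0 and 2 are never predicted
print(classification_report(y_true, y_pred, zero_division=0))
```

With the real data, the equivalent inputs would be `y_test_encoded` and `cnn_model.predict(x_test).argmax(axis=1)`. A model is only adding value if its accuracy clearly exceeds the baseline.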