## Q2: Neural Network Approach

## Aim
Using the same dataset and preprocessing steps as Q1, we train a simple feedforward neural network to predict space mission success. This gives us a fair comparison between a traditional baseline (logistic regression) and a neural network.

In [1]:
import sys
import os

project_root = os.path.abspath("..")
if project_root not in sys.path:
    sys.path.insert(0, project_root)

from py.functions import load_missions_csv, preprocess_missions, train_test

from sklearn.metrics import accuracy_score, classification_report, confusion_matrix, ConfusionMatrixDisplay, roc_auc_score
import numpy as np

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense



In [4]:
# Load the raw dataset
df = load_missions_csv("../data/mission_launches.csv")

# Preprocess into features (X) and labels (y)
X, y = preprocess_missions(df)

# Split into training and test sets
X_train, X_test, y_train, y_test, scaler = train_test(X, y)

print("Train shape:", X_train.shape)
print("Test shape:", X_test.shape)
print("Class balance:\n", y.value_counts())

Train shape: (766, 80)
Test shape: (192, 80)
Class balance:
 y
1    906
0     52
Name: count, dtype: int64
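The class counts above show that failures are rare, which matters for everything that follows. As a rough illustration, the sketch below quantifies the imbalance and derives "balanced" class weights using the common heuristic `n_samples / (n_classes * class_count)` (the same formula scikit-learn uses for `class_weight="balanced"`). The variable names are illustrative, and this notebook does not apply such weights; passing them to `model.fit(..., class_weight=...)` would be one possible follow-up, not part of the experiment reported here.

```python
# Class counts copied from the output above (1 = success, 0 = failure).
counts = {1: 906, 0: 52}

n_samples = sum(counts.values())
n_classes = len(counts)

# How many successful missions there are per failed mission.
imbalance_ratio = counts[1] / counts[0]

# "Balanced" weights: rarer classes receive proportionally larger weights.
class_weight = {
    label: n_samples / (n_classes * count) for label, count in counts.items()
}

print(f"Successes per failure: {imbalance_ratio:.1f}")
for label in sorted(class_weight):
    print(f"Weight for class {label}: {class_weight[label]:.3f}")
```

Under this heuristic a failed mission would count roughly 17 times as much as a successful one in the loss.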


## Neural Network Model
We use a simple multi-layer perceptron (MLP) with two hidden layers (32 and 16 units):
- Hidden layers use ReLU activation to learn non-linear patterns
- The output layer uses a sigmoid activation to output the probability of success
- Binary cross-entropy is the loss function, as this is a binary classification task
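To make the three ingredients listed above concrete, here is a minimal pure-Python sketch of ReLU, sigmoid, and binary cross-entropy for a single example. The function names and the example logit of 2.0 are illustrative assumptions; Keras computes the same quantities internally in vectorised form.

```python
import math


def relu(z: float) -> float:
    """Pass positive values through unchanged and clamp negatives to zero."""
    return max(0.0, z)


def sigmoid(z: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true: int, p: float, eps: float = 1e-7) -> float:
    """Loss for one example, where p is the predicted probability of class 1."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


p = sigmoid(2.0)  # illustrative logit, gives a probability of about 0.88
print(f"p = {p:.3f}")
print(f"loss if the mission succeeded (y=1): {binary_cross_entropy(1, p):.3f}")
print(f"loss if the mission failed    (y=0): {binary_cross_entropy(0, p):.3f}")
```

The loss is small when a confident prediction is right and large when it is wrong, which is what pushes the network towards calibrated probabilities during training.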

In [6]:
model = Sequential([
    # An explicit Input layer avoids the Keras warning about passing input_shape to a layer
    tf.keras.Input(shape=(X_train.shape[1],)),
    Dense(32, activation="relu"),
    Dense(16, activation="relu"),
    Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

model.summary()



In [8]:
history = model.fit(
    X_train,
    y_train,
    epochs=20,
    batch_size=32,
    validation_split=0.2,
    verbose=1,
)

Epoch 1/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 2s 10ms/step - accuracy: 0.8971 - loss: 0.4692 - val_accuracy: 0.9221 - val_loss: 0.3985
Epoch 2/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9461 - loss: 0.3035 - val_accuracy: 0.9286 - val_loss: 0.3139
Epoch 3/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9477 - loss: 0.2423 - val_accuracy: 0.9286 - val_loss: 0.2889
Epoch 4/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9477 - loss: 0.2142 - val_accuracy: 0.9286 - val_loss: 0.2794
Epoch 5/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9477 - loss: 0.1964 - val_accuracy: 0.9286 - val_loss: 0.2767
Epoch 6/20
20/20 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - accuracy: 0.9477 - loss: 0.1846 - val_accuracy: 0.9286 - val_loss: 0.2756
Epoch 7/20
20/20 ━━━━━━━━━

In [10]:
y_prob_nn = model.predict(X_test).ravel()
y_pred_nn = (y_prob_nn >= 0.5).astype(int)

print("Accuracy:", accuracy_score(y_test, y_pred_nn))
print("ROC-AUC:", roc_auc_score(y_test, y_prob_nn))
print("\nClassification report:\n", classification_report(y_test, y_pred_nn))

6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
Accuracy: 0.9479166666666666
ROC-AUC: 0.5546703296703297

Classification report:
               precision    recall  f1-score   support

           0       0.00      0.00      0.00        10
           1       0.95      1.00      0.97       182

    accuracy                           0.95       192
   macro avg       0.47      0.50      0.49       192
weighted avg       0.90      0.95      0.92       192
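
The report shows a recall of 0.00 for class 0, meaning none of the 10 failed missions in the test set was identified. A quick way to put the headline accuracy in context is to compare it with a trivial model that predicts "success" for every mission. The sketch below does this using only the numbers printed above; the variable names are illustrative.

```python
# Supports copied from the classification report above.
n_failure, n_success = 10, 182
n_test = n_failure + n_success

# Accuracy of a trivial model that predicts "success" for every mission.
baseline_accuracy = n_success / n_test

# Accuracy printed for the neural network above.
nn_accuracy = 0.9479166666666666

print(f"Majority-class baseline: {baseline_accuracy:.4f}")
print(f"Neural network:          {nn_accuracy:.4f}")
print("Same as baseline:", abs(baseline_accuracy - nn_accuracy) < 1e-12)
```

The two values coincide, so on this test set accuracy alone does not distinguish the network from the trivial baseline; the per-class recall and the ROC-AUC are the more informative figures.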





## Conclusion 
Because Q1 and Q2 use the same dataset and the same preprocessing steps, the comparison is fair.

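The neural network did not significantly outperform logistic regression. Its test accuracy of 0.948 is what a model that always predicts success would achieve (182 of 192 test missions succeeded), its recall for failed missions is 0.00, and its ROC-AUC of 0.55 is only slightly above chance. One reading is that the relationship between the features and mission success is mostly linear, but with only 52 failures among 958 missions, class imbalance is an equally plausible explanation.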

The neural network requires more hyperparameters and training time, while logistic regression is simpler and more interpretable. Therefore, the neural network adds complexity without improving the results.
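
To put the added complexity in rough numbers, the trainable-parameter count of the MLP can be worked out from its layer sizes and the 80 input features. The sketch below assumes that the Q1 logistic regression uses the same 80 features with one coefficient per feature plus an intercept; the helper `dense_params` is hypothetical and only illustrates the arithmetic.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus one bias per unit in a fully connected layer."""
    return n_inputs * n_units + n_units


n_features = 80            # from the training shape (766, 80)
layer_units = [32, 16, 1]  # the three Dense layers of the MLP

mlp_params = 0
n_inputs = n_features
for n_units in layer_units:
    mlp_params += dense_params(n_inputs, n_units)
    n_inputs = n_units  # the next layer takes this layer's outputs as inputs

# Assumption: binary logistic regression on the same 80 features,
# i.e. one coefficient per feature plus a single intercept.
logreg_params = dense_params(n_features, 1)

print("MLP parameters:", mlp_params)
print("Logistic regression parameters:", logreg_params)
```

Under these assumptions the MLP has 3,137 trainable parameters against 81 for logistic regression, with fewer than 800 training examples to fit them.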