## Training a TNO-detecting CNN model

**Authors:** Aram Lee, Hossen Teimoorinia
**Date:** 2025-02-02
**File Name:** Trainer.ipynb

### [Description]
Using sub-images that contain artificial trans-Neptunian objects (TNOs), a CNN model is trained to detect TNOs.

### [Required Libraries]
- numpy: 1.26.4
- astropy: 6.1.0
- scikit-learn: 1.1.1 (sklearn)
- tensorflow: 2.9.1

### [Workflow]  

Steps 1-3 are for training the model, and steps 4-6 are for using the model to detect TNOs.

|Step|File|Input|Output|Purpose|
|-|-|-|-|-|
|1|ImageCutter.ipynb|.fits (with artificial moving objects), .plantlist (artificial objects info)| .npy|Extract sub-images for training|
|2|Concatenator.ipynb|.npy (sub-images from ImageCutter)|.npy|Prepare dataset for training|
|3|Trainer.ipynb **(Here)**|.npy (dataset from Concatenator), .npy (target information)|.h5 (trained CNN models)|Train the model|
|-|-|-|-|-|
|4|ImageCutter.ipynb|.fits (without artificial moving objects)|.npy|Extract sub-images for detection|
|5|Predictor.ipynb|.npy (sub-images from ImageCutter), .npy (target info), .h5 (model)|.npy|Apply trained model to detect objects|
|6a|Link_sources_to_objects.py|.npy (classification and regression output from Predictor)|.npy|Detect moving objects (linear fitting method)|
|6b|CandidateFinder.ipynb|.npy (classification output from Predictor), .npy (sub-images, target info)|.csv|Detect moving objects (scoring method)|

In [13]:
# Import necessary packages to train CNNs
import os
import numpy as np
import logging
logging.getLogger().setLevel(logging.INFO)
from math import ceil

# Tensorflow is used to build CNNs, train them, and analyze the performance.
import tensorflow
from tensorflow.keras.models import load_model
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras.layers import Flatten
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
from tensorflow.keras.metrics import Recall
from tensorflow.keras.metrics import Precision
# tensorflow.compat.v1.keras.layers.enable_v2_dtype_behavior()

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

# By default, TensorFlow places the channels AFTER the rows and columns ("channels last"). We switch to "channels first".
tensorflow.keras.backend.set_image_data_format('channels_first')

# Use mixed precision training for better performance.
tensorflow.keras.mixed_precision.set_global_policy('mixed_float16')
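
With the "channels first" setting above, the networks in this notebook expect inputs shaped `(channels, rows, columns)`, for example `(4, 64, 64)` for the classifier. The sketch below shows how a channels-last batch could be rearranged with NumPy. It assumes a hypothetical array `batch_last`; the arrays loaded in this notebook may already be stored channels-first.

```python
import numpy as np

# Hypothetical batch of 2 sub-images stored channels-last: (N, rows, cols, channels)
batch_last = np.arange(2 * 64 * 64 * 4, dtype=np.float32).reshape(2, 64, 64, 4)

# Move the channel axis to position 1: (N, channels, rows, cols)
batch_first = np.moveaxis(batch_last, -1, 1)

print(batch_first.shape)
```

`np.moveaxis` returns a view, so no pixel data is copied until the array is passed on.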

In [2]:
# Print the percentage of RAM used, total RAM (in GB), and available RAM (in GB) 
import psutil
print('RAM memory % used:', psutil.virtual_memory()[2], psutil.virtual_memory()[0]>>30, psutil.virtual_memory()[1]>>30)

RAM memory % used: 35.3 241 156


In [3]:
# Import pre-cut images and their labels
M_img_train = np.load('trainingsets/inp_ch051020_P99NN.npy', allow_pickle=True)
M_tar_train = np.load('trainingsets/tar_ch051020_P99NN.npy', allow_pickle=True)

In [5]:
"""
Example of information for an image pair:

- Index 0-1: Labels for classification.
- Index 2-3: Lists containing positional values of the moving object:
  [x_sub, y_sub, x, y, center_of_the_cutout_x, center_of_the_cutout_y].
- Index 4-5: Original filenames of the sub-images.
- Index 6-7: Lists from the plantlist files for the moving object in each sub-image:
  [ID, x, y, rate, angle, rate_x, rate_y, magnitude, psf_flux_multiplier, ra, dec].
"""
M_tar_train[0]

array([1, 1,
       list([46.784159935019346, 31.566126532308388, 233.71, 83.58, array(218.92584006), array(84.01387347)]),
       list([42.555362777862655, 31.88228668748326, 228.22, 84.32, array(217.66463722), array(84.43771331)]),
       'calexp-2426932-05-P99NN', 'calexp-2426940-05-P99NN',
       array([ 6.48050000e+04,  2.33710000e+02,  8.35800000e+01,  1.43000000e+00,
               6.61000000e+00, -1.53000000e+00, -1.70000000e-01,  2.62500000e+01,
               2.18475310e-01,  2.89588904e+02, -2.14610940e+01])                ,
       array([ 6.48050000e+04,  2.28220000e+02,  8.43200000e+01,  1.45000000e+00,
               6.55000000e+00, -1.55000000e+00, -1.70000000e-01,  2.62500000e+01,
               2.64396544e-01,  2.89588669e+02, -2.14611200e+01])                ],
      dtype=object)
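
The later cells use `M_tar_s_train`, `M_tar_s_test`, and `M_img_test`, which this notebook does not define. The sketch below shows one way a row like the one above could be flattened into the eight columns used later (`p1, p2, x1, y1, x2, y2, m1, m2`). The helper `simplify_target` is hypothetical. It assumes that the positions are `x_sub, y_sub`, that the magnitude is index 7 of the plantlist entry, and that sub-images without an object are filled with `-1`. The authors' actual preprocessing and the train/test split may differ.

```python
import numpy as np

def simplify_target(row, missing=-1.0):
    """Flatten one target row into [p1, p2, x1, y1, x2, y2, m1, m2] (hypothetical helper)."""
    labels = [float(row[0]), float(row[1])]
    positions, magnitudes = [], []
    for label, pos, plant in zip(labels, row[2:4], row[6:8]):
        if label == 1:
            positions += [float(pos[0]), float(pos[1])]  # x_sub, y_sub (assumed)
            magnitudes.append(float(plant[7]))           # magnitude from the plantlist
        else:
            positions += [missing, missing]              # assumed placeholder for "no object"
            magnitudes.append(missing)
    return np.array(labels + positions + magnitudes, dtype=np.float32)

# Row modelled on the example output above (values rounded)
example_row = [
    1, 1,
    [46.78, 31.57, 233.71, 83.58, 218.93, 84.01],
    [42.56, 31.88, 228.22, 84.32, 217.66, 84.44],
    'calexp-2426932-05-P99NN', 'calexp-2426940-05-P99NN',
    np.array([64805, 233.71, 83.58, 1.43, 6.61, -1.53, -0.17, 26.25, 0.218, 289.589, -21.461]),
    np.array([64805, 228.22, 84.32, 1.45, 6.55, -1.55, -0.17, 26.25, 0.264, 289.589, -21.461]),
]
flat = simplify_target(example_row)
print(flat)
```

Applying such a helper to every row and stacking the results would give a float array that can be sliced as `[:, 0:2]`, `[:, 2:6]`, and `[:, 6:8]`, as the training cells do.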

In [12]:
# The following code builds the ready-made MobileNet architecture and compiles it (chooses the optimizer, loss, and metrics).

model_MobileNet_cls = tensorflow.keras.applications.MobileNet(
    include_top=True,
    weights=None,
    input_tensor=None,
    input_shape=(4,64,64),
    pooling=None,
    classes=2,
    classifier_activation="sigmoid"
)

opt = tensorflow.keras.optimizers.Adam()
model_MobileNet_cls.compile(
    optimizer=opt,
    loss='binary_crossentropy',
    metrics=[
        'acc',
        tensorflow.keras.metrics.BinaryAccuracy(),
        Recall(),
        Precision(),
        tensorflow.keras.metrics.TruePositives(),
        tensorflow.keras.metrics.TrueNegatives(),
        tensorflow.keras.metrics.FalsePositives(),
        tensorflow.keras.metrics.FalseNegatives(),
    ],
)
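
Because the network ends in a two-unit sigmoid and is trained with binary cross-entropy, each output is scored as an independent probability for one of the two labels (`p1`, `p2`). The NumPy sketch below illustrates that loss with made-up labels and predictions, not values from this dataset. The function `binary_crossentropy` is a stand-in written for this example, not the Keras implementation.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over all labels (illustrative stand-in)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    per_label = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return per_label.mean()

# Two hypothetical image pairs: [p1, p2] labels and the matching sigmoid outputs
y_true = np.array([[1.0, 1.0], [1.0, 0.0]])
y_prob = np.array([[0.9, 0.8], [0.7, 0.2]])

loss = binary_crossentropy(y_true, y_prob)
print(round(float(loss), 4))  # 0.2271
```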

In [10]:
# Execute the classification model training process

# batch size is the number of training examples utilized in one iteration.
# epochs is the number of passes through the entire training dataset
batch_size = 1024
epochs = 30

# Early Stopping stops training if val_loss does not improve for a set number of consecutive epochs.
# It restores the best model weights before stopping.
early_stopping = EarlyStopping(monitor='val_loss', patience=7, restore_best_weights=True)

# This is the actual fitting part.
# validation_data: The loss on this data set is evaluated at the end of each epoch. This validation data is not used for training, only for monitoring performance.
# shuffle: with shuffle=True, the training data is reshuffled before each epoch, so batches differ between epochs.
# callbacks: functions that execute during training, such as early stopping.
history_MobileNet_cls=model_MobileNet_cls.fit(M_img_train, M_tar_s_train[:,0:2], validation_data=(M_img_test, M_tar_s_test[:,0:2]),
          shuffle=True, epochs=epochs, batch_size=batch_size, callbacks=[early_stopping])
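
The patience rule used by the early-stopping callback can be illustrated without TensorFlow. The sketch below is a simplified model of that rule, not the Keras source. The function `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=7):
    """Return (last_epoch_run, best_epoch) for a sequence of validation losses."""
    best = float("inf")
    best_epoch = -1
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # stop; weights from best_epoch would be restored
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses: the minimum is at epoch 2 (0-indexed)
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.74, 0.73, 0.76, 0.9, 0.95]
stopped_at, best_at = early_stopping_epoch(losses, patience=7)
print(stopped_at, best_at)  # 9 2
```

With `patience=7`, training continues for seven epochs past the best one before stopping, so up to seven epochs of compute are spent confirming that the loss has stopped improving.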

In [None]:
# save the trained model
model_MobileNet_cls.save('models/M_cls')

In [1]:
# Convert the target arrays into pandas DataFrames with labeled columns for easier magnitude-based filtering of TNOs.
import pandas as pd
dftrain = pd.DataFrame(M_tar_s_train, columns = ['p1', 'p2', 'x1', 'y1', 'x2', 'y2', 'm1', 'm2'])
dftest = pd.DataFrame(M_tar_s_test, columns = ['p1', 'p2', 'x1', 'y1', 'x2', 'y2', 'm1', 'm2'])
# An example of filtering: keep pairs in which both magnitudes lie in (0, 23].
# Each mask must be built from the DataFrame that it indexes.
dftrain11 = dftrain.index[((dftrain.m1 <= 23) & (dftrain.m1 > 0) & (dftrain.m2 <= 23) & (dftrain.m2 > 0))].tolist()
dftest11 = dftest.index[((dftest.m1 <= 23) & (dftest.m1 > 0) & (dftest.m2 <= 23) & (dftest.m2 > 0))].tolist()

In [None]:
# The following code defines and compiles the regression model for predicting positions and magnitudes of TNOs.

# Dense, Dropout, and Model are already imported in the first cell.
from tensorflow.keras.layers import GlobalAveragePooling2D

base_model = tensorflow.keras.applications.MobileNet(
    input_shape=(2,64,64),
    include_top=False,
    weights=None,
    input_tensor=None,
    pooling=None,
    classifier_activation="linear"
)

x = base_model.output
# flat_cnn = Flatten(name='flat1')(x)
flat_cnn = GlobalAveragePooling2D()(x)

# two separate output branches for position and magnitude regression
# (x_1st_image, y_1st_image, x_2nd_image, y_2nd_image)
fc1 = Dropout(0)(flat_cnn)
output_pos = Dense(4, activation='linear')(fc1)
# (magnitude_1st_image, magnitude_2nd_image)
fc2 = Dropout(0)(flat_cnn)
output_mag = Dense(2, activation='linear')(fc2)

# Define the model with two outputs
model_MobileNet_rgs = Model(inputs=base_model.input, outputs=[output_pos, output_mag])

# Compile the model with a mean absolute error loss function
opt = tensorflow.keras.optimizers.Adam(learning_rate=0.001)
model_MobileNet_rgs.compile(optimizer=opt, loss='mae', metrics=['mae','mse'])
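
When a single loss is given for a model with several outputs, Keras applies it to each output and, with default loss weights, sums the results. The NumPy sketch below illustrates that sum for the position and magnitude branches. All numbers are made up for illustration.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error over all elements."""
    return np.abs(y_true - y_pred).mean()

# Hypothetical targets and predictions for two image pairs
pos_true = np.array([[46.0, 32.0, 42.0, 32.0], [10.0, 20.0, 12.0, 21.0]])  # x1, y1, x2, y2 (pixels)
pos_pred = np.array([[45.0, 33.0, 44.0, 32.0], [10.0, 22.0, 12.0, 21.0]])
mag_true = np.array([[22.0, 22.0], [21.0, 21.5]])                          # m1, m2 (magnitudes)
mag_pred = np.array([[22.5, 21.5], [21.0, 22.5]])

loss_pos = mae(pos_true, pos_pred)   # 0.75
loss_mag = mae(mag_true, mag_pred)   # 0.5
total_loss = loss_pos + loss_mag     # 1.25
print(loss_pos, loss_mag, total_loss)
```

Because pixel errors and magnitude errors are on different scales, the branch with the larger numerical error tends to dominate the summed loss. The `loss_weights` argument of `compile` could rebalance the two if needed.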

In [4]:
# execute the regression model training process

early_stopping = EarlyStopping(monitor='val_loss', patience=7, restore_best_weights=True)
    
batch_size = 1024
epochs = 30

history_MobileNet_rgs=model_MobileNet_rgs.fit(M_img_train[dftrain11], [M_tar_s_train[:,2:6][dftrain11], M_tar_s_train[:,6:8][dftrain11]], validation_data=(M_img_test[dftest11], [M_tar_s_test[:,2:6][dftest11], M_tar_s_test[:,6:8][dftest11]]),
          shuffle=True, epochs=epochs, batch_size=batch_size, callbacks=[early_stopping])

In [5]:
model_MobileNet_rgs.save('models/M_rgs')

**The following cells test the models**

In [None]:
# To test, load the selected models used in the paper
model_cls = tensorflow.keras.models.load_model('models/MobileNet_Ch051020M25_classification')
model_rgs = tensorflow.keras.models.load_model('models/MobileNet_Ch051020M25_regression')

In [None]:
# Examples of filtering the test set to build a balanced data set.
# A magnitude of -1 marks a sub-image without an object, so "m > 0" is required wherever an object is expected.
dft11 = dftest.index[((dftest.m1 <= 23) & (dftest.m1 > 0) & (dftest.m2 <= 23) & (dftest.m2 > 0))].tolist()
dft00 = dftest.index[((dftest.m1 == -1) & (dftest.m2 == -1))].tolist()[0:len(dft11)]
dft10 = dftest.index[((dftest.m1 <= 23) & (dftest.m1 > 0) & (dftest.m2 == -1))].tolist()[0:ceil(len(dft11)/4)]
dft01 = dftest.index[((dftest.m1 == -1) & (dftest.m2 <= 23) & (dftest.m2 > 0))].tolist()[0:ceil(len(dft11)/4)]
# Concatenate the necessary filtered entries
dfindex = np.concatenate((dft11,dft10,dft01,dft00))
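
The same balancing logic can be written with plain NumPy masks, which makes the group sizes easy to check. The sketch below uses made-up magnitudes and assumes, as the filters above do, that `-1` marks a sub-image without an object. The helper `bright` is hypothetical.

```python
import numpy as np
from math import ceil

# Hypothetical magnitudes for ten image pairs (-1 means "no object")
m1 = np.array([22.0, 21.5, -1, -1, -1, 22.5, -1, 24.0, 20.0, -1])
m2 = np.array([22.1, 21.4, -1, -1, 22.0, -1, -1, 24.1, 20.2, -1])

def bright(m, limit=23):
    """True where an object is present and no fainter than `limit`."""
    return (m > 0) & (m <= limit)

idx = np.arange(len(m1))
idx11 = idx[bright(m1) & bright(m2)]                          # object in both images
idx00 = idx[(m1 == -1) & (m2 == -1)][:len(idx11)]             # no object, same count as idx11
idx10 = idx[bright(m1) & (m2 == -1)][:ceil(len(idx11) / 4)]   # object in first image only
idx01 = idx[(m1 == -1) & bright(m2)][:ceil(len(idx11) / 4)]   # object in second image only

selected = np.concatenate((idx11, idx10, idx01, idx00))
print(selected)  # [0 1 8 5 4 2 3 6]
```

Pair 7 is excluded from every group because both of its magnitudes are fainter than the limit.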

In [None]:
# Use the trained classification model
pred_bin = model_MobileNet_cls.predict(M_img_test[dfindex])

In [None]:
# Confusion matrix using the output from the classification model
from sklearn.metrics import confusion_matrix
from sklearn.metrics import precision_recall_fscore_support as prfs
# Flatten the two labels per image pair so that every label is scored individually
y_true = M_tar_s_test[dfindex][:, 0:2].astype(int).ravel()
y_pred = np.round(pred_bin).astype(int).ravel()
print(confusion_matrix(y_true, y_pred))
print(prfs(y_true, y_pred))
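
To read the printed matrix: scikit-learn puts the true classes in rows and the predicted classes in columns, so for binary labels the layout is `[[TN, FP], [FN, TP]]`. The NumPy sketch below rebuilds that layout for a few made-up labels and sigmoid outputs and derives precision and recall for the positive class.

```python
import numpy as np

# Hypothetical labels and sigmoid outputs for three image pairs (two labels per pair)
y_true = np.array([[1, 1], [0, 0], [1, 0]])
y_prob = np.array([[0.92, 0.35], [0.10, 0.60], [0.80, 0.05]])

t = y_true.ravel()
p = np.round(y_prob).astype(int).ravel()   # threshold at 0.5

# Rows are true classes (0, 1); columns are predicted classes (0, 1)
cm = np.zeros((2, 2), dtype=int)
np.add.at(cm, (t, p), 1)

tn, fp, fn, tp = cm.ravel()
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(cm)
print(precision, recall)
```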