# OPS-SAT case server-side evaluation

ESA's [Kelvins](https://kelvins.esa.int) competition "[the OPS-SAT case](https://kelvins.esa.int/opssat/home/)" is a novel data-centric challenge that asks you to work with the raw data of a satellite and very few provided labels to find the best parameters for a given machine learning model. In previous competitions on Kelvins (like the [Pose Estimation](https://kelvins.esa.int/pose-estimation-2021/) or the [Proba-V Super-resolution challenge](https://kelvins.esa.int/proba-v-super-resolution/)), the test set was provided and the inferred results were submitted. For the OPS-SAT case, we instead run inference directly on the Kelvins server! To help you understand what happens with your submission, this notebook replicates all steps that are executed by the script on our server, including the computation of the scoring metric. We hope that it helps you find and avoid bugs and prepare the best submission you can!

# 1. Module imports

If you do not have a GPU, uncomment and run the next commands.

In [None]:
#import os
#os.environ["CUDA_VISIBLE_DEVICES"]="-1"

In [None]:
import tensorflow as tf
from tensorflow import keras

import numpy as np
from sklearn.metrics import cohen_kappa_score

from efficientnet_lite import EfficientNetLiteB0

# 2. Utility Functions

The next function is used to load evaluation data.

In [None]:
def get_images_from_path(dataset_path):
    """ Load all images and their labels from `dataset_path` as one batch.

    Images are resized to the global `input_shape`; no further normalization is applied here.
    """

    # loading all images in one large batch
    tf_eval_data = tf.keras.utils.image_dataset_from_directory(dataset_path, image_size=input_shape[:2], shuffle=False, 
                                                               batch_size=100000)

    # extract images and targets
    for tf_eval_images, tf_eval_targets in tf_eval_data:
        break

    return tf.convert_to_tensor(tf_eval_images), tf_eval_targets
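
A side note on label order: with its default `labels="inferred"` setting, `image_dataset_from_directory` reads each class from a subdirectory name and numbers the classes in sorted order of those names. The sketch below reproduces that mapping with the standard library only, which may help when checking that a local validation folder yields the label indices you expect. The helper name `class_index_from_directory` is hypothetical and not part of the starter-kit, and Keras may treat hidden or unusual folder names differently from this simplification.

```python
import os


def class_index_from_directory(dataset_path):
    """Map each class subdirectory name to the integer label it would receive.

    Assumption: classes are the immediate subdirectories of `dataset_path`,
    numbered in sorted order of their names.
    """
    class_names = sorted(
        entry for entry in os.listdir(dataset_path)
        if os.path.isdir(os.path.join(dataset_path, entry))
    )
    return {name: index for index, name in enumerate(class_names)}
```

Calling `class_index_from_directory(dataset_path)` on a local folder prints nothing by itself; inspect the returned dictionary to see which index each class name maps to.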

# 3. Producing a submission (competitor side)

The network architecture used for OPS-SAT is **EfficientNetLite0**. We would like to thank Sebastian for making a Keras implementation of EfficientNetLite publicly available under the Apache 2.0 License: https://github.com/sebastian-sz/efficientnet-lite-keras. Our version of this code has been modified to better fit our purposes. For example, we removed the ReLU "stem_activation" to better match a related EfficientNet PyTorch implementation. In any case, **you have to use the model architecture that we provide in our [starter-kit](https://gitlab.com/EuropeanSpaceAgency/the_opssat_case_starter_kit).**

In [None]:
input_shape = (200, 200, 3)   # input_shape is (height, width, number of channels) for images
num_classes = 8
#Loading model
model = EfficientNetLiteB0(classes=num_classes, weights=None, input_shape=input_shape, classifier_activation=None)
#Printing model summary.
model.summary()

With this model and the dataset provided, please do your best!

In [None]:
# load data, data augmentation, training, overfitting, transfer-learning etc.
#x_train, y_train = ...
#model.fit(x_train, y_train)

After your model has been trained, all parameters need to be exported in HDF5-format.

In [None]:
model.save_weights('test.h5')

The corresponding file should be around 13MB in size. You can now upload this on the corresponding [Kelvins submission page](https://kelvins.esa.int/opssat/submission/).
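Before uploading, a quick local sanity check of the exported file can catch obvious mistakes. The sketch below is only an illustration and not part of the server script: the helper `describe_weights_file` is hypothetical, and it assumes the HDF5 signature sits at the very start of the file (which is the common case for files without a user block).

```python
import os

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # first 8 bytes of a standard HDF5 file


def describe_weights_file(path):
    """Return (looks_like_hdf5, size_in_mb) for a local weights file."""
    size_in_mb = os.path.getsize(path) / 1e6
    with open(path, "rb") as fp:
        # Compare the leading bytes against the HDF5 format signature.
        looks_like_hdf5 = fp.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE
    return looks_like_hdf5, size_in_mb
```

For example, `describe_weights_file('test.h5')` should report an HDF5 file of roughly 13 MB; a very different size suggests that something other than the provided architecture was saved.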

# 4. Evaluating your submission (server side)

## 4.1 Submission validation

Our validation script needs to check whether the submitted HDF5-file (referred to by the `file` variable in the following) is compatible with the predefined model. This is done simply by loading in the parameters:

In [None]:
file = 'test.h5'

In [None]:
model = EfficientNetLiteB0(classes=num_classes, weights=None, input_shape=input_shape, classifier_activation=None)
model.load_weights(file)

If `model.load_weights(file)` raises an exception, your submission is invalid. Otherwise, it is passed on to the **scoring** script.
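To run the same kind of check locally without interrupting a notebook, the call can be wrapped as in the following sketch. The function `check_submission` is a hypothetical helper, not the actual server code; it only assumes that the loader it is given raises an exception for incompatible files.

```python
def check_submission(load_weights, path):
    """Return (is_valid, error_message) for a weights file.

    `load_weights` is any callable that raises on incompatible files,
    for example the bound method `model.load_weights`.
    """
    try:
        load_weights(path)
    except Exception as exc:  # any failure marks the submission as invalid
        return False, f"{type(exc).__name__}: {exc}"
    return True, None
```

With the model defined above, `is_valid, error = check_submission(model.load_weights, file)` returns `True` together with `None` when the weights fit the architecture, and `False` together with a short error description otherwise.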

In [None]:
# the path to the evaluation dataset is a secret ;)
dataset_path = './'

In [None]:
# loading in hidden OPS_SAT data
images, targets = get_images_from_path(dataset_path)

In [None]:
# Constructing base model
model = EfficientNetLiteB0(classes=num_classes, weights=None, input_shape=input_shape, classifier_activation=None)

In [None]:
# Loading in weights
model.load_weights(file)

## 4.2 Computation of the Keras (unquantized) score

In [None]:
#The model shall be compiled before the inference.
model.compile()

In [None]:
predictions = np.zeros(targets.shape, dtype=np.int8)

In [None]:
# inference loop
for e, (image, target) in enumerate(zip(images, targets)):
    image = np.expand_dims(np.array(image), axis=0)
    output = model.predict(image)
    predictions[e] = np.squeeze(output).argmax()

In [None]:
#Keras model score
score_keras = 1 - cohen_kappa_score(targets.numpy(), predictions)
print(score_keras)

## 4.3 Computation of the float16 quantized score

The computation of the quantized score involves several steps of serialization and model conversion so that we can run inference with the TensorFlow Lite interpreter. This closely resembles the actual capabilities of the OPS-SAT platform.
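To build some intuition for why the quantized score can differ from the Keras score, the toy sketch below rounds the weights of a tiny made-up layer to half precision and compares the predicted class. It assumes that float16 conversion mainly affects the stored weights; the numbers and helper names are invented for illustration and have nothing to do with the real model.

```python
import struct


def to_float16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def argmax(values):
    """Index of the largest value (first index on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def logits(weights, x):
    """One dense layer without bias: one logit per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Toy two-class layer (made-up numbers): the second row is slightly larger.
weights = [[1.0001], [1.0002]]
x = [1.0]

full_precision = argmax(logits(weights, x))
half_precision = argmax(
    logits([[to_float16(w) for w in row] for row in weights], x)
)
print(full_precision, half_precision)  # prints "1 0"
```

Both weights round to the same half-precision value, so the two logits become tied and the predicted class flips. In a well-trained network such near-ties should be rare, which is consistent with the two scores usually being close.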

In [None]:
save_path = '.'
tflite_model_path = './tflite_mock_model.tflite'

In [None]:
# serialization of model in preparation for the tf-lite conversion
tf.saved_model.save(model, save_path)

In [None]:
# model conversion to 16bit float
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT] 
converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()

In [None]:
# serialization of tflite model in preparation for inference
with open(tflite_model_path, 'wb') as fp:
    fp.write(tflite_model)

In [None]:
# Inference with tf-lite interpreter
interpreter = tf.lite.Interpreter(tflite_model_path)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]

In [None]:
predictions = np.zeros(targets.shape, dtype=np.int8)

In [None]:
# inference loop
for e, (image, target) in enumerate(zip(images, targets)):
    image = np.expand_dims(np.array(image, dtype = input_details["dtype"]), axis=0)
    interpreter.set_tensor(input_details["index"], image)
    interpreter.invoke()
    output = interpreter.get_tensor(output_details["index"])[0]
    predictions[e] = np.squeeze(output).argmax()

In [None]:
#Quantized tf lite model score
score_float16 = 1 - cohen_kappa_score(targets.numpy(), predictions)
print(score_float16)

Depending on the workload of our server, the computation of the **unquantized** and the **float16 quantized** score might take **several minutes**, so please be patient.

If no exception occurred, your submission will be scored in the [Leaderboard](https://kelvins.esa.int/opssat/leaderboard/leaderboard).

Your position in the Leaderboard is determined by the float16 score. We report the Keras score nevertheless, as it is interesting for us to study the quantization error.
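Both scores are one minus Cohen's kappa, so lower is better: 0 means perfect agreement with the labels and 1 means agreement no better than chance. The following standard-library sketch spells out the unweighted formula that `cohen_kappa_score` computes by default; the function name `kappa_score` is our own and the example labels are made up.

```python
from collections import Counter


def kappa_score(targets, predictions):
    """Return 1 - Cohen's kappa (unweighted); lower is better.

    Note: raises ZeroDivisionError in the degenerate case where chance
    agreement is 1 (both sequences contain a single identical label).
    """
    n = len(targets)
    # Fraction of samples where prediction and target agree.
    observed = sum(t == p for t, p in zip(targets, predictions)) / n
    # Agreement expected by chance from the two label distributions.
    target_counts = Counter(targets)
    prediction_counts = Counter(predictions)
    expected = sum(
        target_counts[label] * prediction_counts[label]
        for label in target_counts
    ) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return 1 - kappa
```

For instance, `kappa_score([0, 0, 1, 1], [0, 0, 0, 0])` evaluates to 1.0 even though half of the predictions are correct, because always predicting one class is no better than chance under this metric.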

The Leaderboard will show the **best float16 score** of each team and the times at which their best and their last submissions were evaluated. Thus, in order to check whether your last submission was evaluated, you should **check the "Last Submission" column** in the Leaderboard. Please understand that we do not provide scores for individual submissions to prevent excessive probing of the test set.

Lastly, we would like to state that we did our best to test this evaluation system, but it is the first time we are doing this setup on Kelvins and the risk that something goes wrong can never be fully eliminated :(

Thus, **if the "Last Submission" column has still not been updated after about half an hour**, there was probably an unaccounted-for exception during scoring. Should this be the case, please let us know by opening a thread in the [Discussion board](https://kelvins.esa.int/opssat/discussion/). To avoid unnecessary back and forth, we would appreciate it if you stated the following in this thread:

* name of your team
* time of your submission
* any error messages or observations that you believe might help

If this happens, your submission has already been received, so there is no need to provide it again. Please do not post it either, since the Discussion board is public! We will debug your submission on our side and get back to you.

# 5. Differences between this notebook, server-script and satellite

Some differences between this notebook, the server-side script and the actual satellite exist, mostly related to the underlying hardware and software dependencies. 

The main differences between our server-script and this notebook are additional checks and Kelvins-specific commands (housekeeping) that we omitted here for clarity. Moreover, the test-set used to compute your score is **reduced by 50% during the competition**. After the submission period of the competition ends, **your best scoring submission will be re-evaluated on 100% of the held-out test-set**. These scores will be published in a separate Results leaderboard that will be used to determine the final ranking.

During our tests we found only negligible numerical differences when evaluating models using TensorFlow on different hardware/software, and in our opinion you do not need to replicate the exact environment to achieve meaningful results. If you nevertheless want to be as close as possible to our server setup, we are using

* `python 3.9.10`
* `tensorflow 2.7.0`
* `numpy 1.21.1`
* `scikit-learn 1.0.2`

running on a **Debian-based Linux** distribution.
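
If you want to see how your local environment compares with the versions listed above, a small standard-library sketch like the following can print them side by side. The helper names are hypothetical, and a mismatch is not necessarily a problem given the negligible differences we observed.

```python
import platform
from importlib import metadata

# Versions listed above for the server environment.
SERVER_VERSIONS = {
    "tensorflow": "2.7.0",
    "numpy": "1.21.1",
    "scikit-learn": "1.0.2",
}


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_with_server(expected=SERVER_VERSIONS):
    """Return {package: (server_version, local_version)} for each package."""
    return {
        name: (version, installed_version(name))
        for name, version in expected.items()
    }


# Print server version and local version next to each other.
print("python", "3.9.10", platform.python_version())
for name, (server, local) in compare_with_server().items():
    print(name, server, local)
```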

OPS-SAT itself will have a slightly different execution environment, which has no relevance for the competition. Should you win the competition and get the chance to fly your model in space, we will work on that together. Good luck!