# OPS-SAT case starter-kit notebook

ESA's [Kelvins](https://kelvins.esa.int) competition "[the OPS-SAT case](https://kelvins.esa.int/opssat/home/)" is a novel data-centric challenge that asks you to work with the raw data of a satellite and very few provided labels to find the best parameters for a given machine learning model. In previous competitions on Kelvins (like the [Pose Estimation](https://kelvins.esa.int/pose-estimation-2021/) or the [Proba-V Super-resolution challenge](https://kelvins.esa.int/proba-v-super-resolution/)), the test set was provided and the inferred results were submitted; for the OPS-SAT case, we will run inference on the Kelvins server directly! This notebook contains examples of how you can load your data and train an **EfficientNetLite0** model using only the 80 labeled images provided. The directory `images`, which contains unlabeled patches and is included in the training dataset, is therefore not used in this notebook. However, competitors are encouraged to use these patches to improve the model accuracy.

# 1. Module imports

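If you do not have a GPU, run the next commands, which hide all CUDA devices from TensorFlow. The variable `file` holds the path to the starter-kit directory; adjust it to your own setup.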


In [25]:
import os
os.environ["CUDA_VISIBLE_DEVICES"]="-1"
file = '/home/phillip/PycharmProjects/the_opssat_case_starter_kit'

Other imports.

In [26]:
import tensorflow as tf
from tensorflow import keras

import numpy as np
from sklearn.metrics import cohen_kappa_score

from efficientnet_lite import EfficientNetLiteB0

# 2. Utility Functions

You can use this function to load your training data.

In [27]:
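def get_images_from_path(dataset_path, _seed: int = 42):
    """ Load images from path and split them into training and evaluation subsets (no normalization is applied). """

    # image_dataset_from_directory batches with its default batch_size of 32,
    # so only the first batch (at most 32 images) of each subset is extracted below;
    # pass a larger batch_size to both calls to load every image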
    tf_train_data = tf.keras.utils.image_dataset_from_directory(dataset_path, image_size=input_shape[:2], shuffle=False, validation_split=0.4, subset="training", seed=_seed)
    tf_eval_data = tf.keras.utils.image_dataset_from_directory(dataset_path, image_size=input_shape[:2], shuffle=False, validation_split=0.4, subset="validation", seed=_seed)
    
    # extract images and targets
    for tf_eval_images, tf_eval_targets in tf_eval_data:
        break
        
    for tf_train_images, tf_train_targets in tf_train_data:
        break

    return tf.convert_to_tensor(tf_train_images), tf_train_targets, tf.convert_to_tensor(tf_eval_images), tf_eval_targets
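
The loader above returns pixel values as produced by `image_dataset_from_directory`, without any rescaling. If channel-level normalization is wanted, one possible NumPy sketch follows. The helper name `normalize_per_channel` and its `eps` parameter are illustrative assumptions and not part of the starter kit. Because inference runs on the Kelvins server, any preprocessing applied outside the model would need to match what the server does, so check the competition rules before relying on it.

```python
import numpy as np

def normalize_per_channel(images, eps=1e-7):
    """Standardize a batch of images channel by channel (hypothetical helper).

    images: array-like of shape (N, H, W, C).
    Returns a float32 array of the same shape.
    """
    images = np.asarray(images, dtype=np.float32)
    # Statistics are taken over batch, height and width: one value per channel.
    mean = images.mean(axis=(0, 1, 2), keepdims=True)
    std = images.std(axis=(0, 1, 2), keepdims=True)
    # eps guards against division by zero for a constant channel.
    return (images - mean) / (std + eps)

# Random stand-in data shaped like a small batch of 200x200 RGB patches.
rng = np.random.default_rng(0)
batch = rng.uniform(0, 255, size=(4, 200, 200, 3)).astype(np.float32)
normalized = normalize_per_channel(batch)
print(normalized.shape)
```

In practice the mean and standard deviation would usually be computed on the training split and reused for the evaluation split.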

# 3. Loading the model

The network architecture used for OPS-SAT is **EfficientNetLite0**. We would like to thank Sebastian for making a Keras implementation of EfficientNetLite publicly available under the Apache 2.0 License: https://github.com/sebastian-sz/efficientnet-lite-keras. Our version of this code has been modified to better fit our purposes. For example, we removed the ReLU "stem_activation" to better match a related EfficientNet PyTorch implementation. In any case, **you have to use the model architecture that we provide in our [starter-kit](https://gitlab.com/EuropeanSpaceAgency/the_opssat_case_starter_kit).**

In [28]:
input_shape = (200, 200, 3)   # input_shape is (height, width, number of channels) for images
num_classes = 8
model = EfficientNetLiteB0(classes=num_classes, weights=None, input_shape=input_shape, classifier_activation=None)
# model.summary()

block1a_ same
block2a_ ((1, 1), (1, 1))
block2a_ valid
block2b_ same
block3a_ ((2, 2), (2, 2))
block3a_ valid
block3b_ same
block4a_ ((1, 1), (1, 1))
block4a_ valid
block4b_ same
block4c_ same
block5a_ same
block5b_ same
block5c_ same
block6a_ ((2, 2), (2, 2))
block6a_ valid
block6b_ same
block6c_ same
block6d_ same
block7a_ same


# 4. Loading data

Adjust the next line to the path of your dataset.

In [29]:
training_dataset_path=os.path.join(file, 'ops_sat_competiton_official_training')

In this notebook, classical supervised learning is used. Therefore, remember to remove the subdirectory `images` containing unlabeled patches before loading the dataset to perform training correctly.
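
Rather than deleting the unlabeled patches, the `images` subdirectory can be moved next to the dataset folder so it stays available for later experiments. The following is a minimal sketch using only the standard library; the helper name `set_aside_unlabeled`, the `_unlabeled` suffix, and the class names in the demonstration are assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path

def set_aside_unlabeled(dataset_dir, subdir="images", suffix="_unlabeled"):
    """Move dataset_dir/subdir next to dataset_dir and return its new path.

    Returns None when the subdirectory does not exist (e.g. already moved).
    """
    dataset_dir = Path(dataset_dir)
    source = dataset_dir / subdir
    if not source.is_dir():
        return None
    target = dataset_dir.parent / (dataset_dir.name + suffix)
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(source), str(target))
    return target

# Demonstration on a throwaway directory tree (class names are made up).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "training"
    for name in ("ClassA", "ClassB", "images"):
        (root / name).mkdir(parents=True)
    (root / "images" / "patch_0.png").touch()

    moved = set_aside_unlabeled(root)
    remaining = sorted(p.name for p in root.iterdir() if p.is_dir())
    moved_files = sorted(p.name for p in moved.iterdir())
    print(remaining, moved.name)
```

After the move, only the class subdirectories remain under the dataset path, which is what the loader expects.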

In [30]:
#Loading dataset
x_train, y_train, x_eval, y_eval=get_images_from_path(training_dataset_path)

Found 80 files belonging to 8 classes.
Using 48 files for training.
Found 80 files belonging to 8 classes.
Using 32 files for validation.


# 5. Model training

We now provide an example of how you can train your model using standard supervised learning. The training loss (`SparseCategoricalCrossentropy`) and accuracy (`SparseCategoricalAccuracy`) are shown for simplicity and for easier interpretation of the training outcome, although your submission will be evaluated using the [**1 - Cohen's kappa**](https://en.wikipedia.org/wiki/Cohen's_kappa) metric. For more information on scoring, please refer to [Scoring](https://kelvins.esa.int/opssat/scoring/).
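
To make the metric concrete, here is a small pure-Python sketch of Cohen's kappa, defined as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `cohen_kappa` and the toy labels are illustrative; the notebook itself relies on `sklearn.metrics.cohen_kappa_score`.

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equally long label sequences (illustrative sketch)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(y_true)
    # Observed agreement: fraction of samples where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[k] * pred_counts[k] for k in true_counts) / (n * n)
    if p_e == 1.0:
        # Both sequences consist of one identical label: kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)

# Toy labels: four classes, two samples each.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
score_perfect = 1 - cohen_kappa(y_true, y_true)      # every prediction correct
score_constant = 1 - cohen_kappa(y_true, [0] * 8)    # always predicting class 0
print(score_perfect, score_constant)
```

A perfect prediction yields a score of 0, while a model that always predicts the same class yields a score of 1, because its agreement with the labels is no better than chance.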

In [31]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True), metrics=[keras.metrics.SparseCategoricalAccuracy()])
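
Because the model is created with `classifier_activation=None`, its outputs are raw logits, which is why `from_logits=True` is passed to the loss. The NumPy sketch below uses made-up logits (not real model output) to illustrate that softmax turns logits into probabilities without changing the arg-max, so the predicted class can be read directly from the logits.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Made-up logits for 3 samples and 8 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 8))
probs = softmax(logits)

# The most likely class is the same whether read from logits or probabilities.
classes_from_logits = logits.argmax(axis=-1)
classes_from_probs = probs.argmax(axis=-1)
print(classes_from_logits, classes_from_probs)
```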

With this model and the dataset provided, please do your best!

In [32]:
# load data, data augmentation, training, overfitting, transfer-learning etc.
history=model.fit(x_train, y_train, epochs=15, verbose=2, batch_size=8)

Epoch 1/15
4/4 - 9s - loss: 4.8472 - sparse_categorical_accuracy: 0.1875 - 9s/epoch - 2s/step
Epoch 2/15
4/4 - 2s - loss: 2.6980 - sparse_categorical_accuracy: 0.4688 - 2s/epoch - 605ms/step
Epoch 3/15
4/4 - 2s - loss: 3.9445 - sparse_categorical_accuracy: 0.3750 - 2s/epoch - 612ms/step
Epoch 4/15
4/4 - 2s - loss: 3.8576 - sparse_categorical_accuracy: 0.4062 - 2s/epoch - 604ms/step
Epoch 5/15
4/4 - 2s - loss: 2.7203 - sparse_categorical_accuracy: 0.4375 - 2s/epoch - 605ms/step
Epoch 6/15
4/4 - 2s - loss: 1.5641 - sparse_categorical_accuracy: 0.5312 - 2s/epoch - 617ms/step
Epoch 7/15
4/4 - 3s - loss: 1.7935 - sparse_categorical_accuracy: 0.4062 - 3s/epoch - 663ms/step
Epoch 8/15
4/4 - 3s - loss: 2.7110 - sparse_categorical_accuracy: 0.5938 - 3s/epoch - 642ms/step
Epoch 9/15
4/4 - 3s - loss: 1.8729 - sparse_categorical_accuracy: 0.6250 - 3s/epoch - 688ms/step
Epoch 10/15
4/4 - 3s - loss: 1.9725 - sparse_categorical_accuracy: 0.4375 - 3s/epoch - 704ms/step
Epoch 11/15
4/4 - 3s - loss: 1.6

Calculating the **1 - Cohen's kappa** score of the trained model on the evaluation split (`x_eval`, `y_eval`).

In [33]:
predictions = np.zeros(len(y_eval), dtype=np.int8)

# inference loop
for e, (image, target) in enumerate(zip(x_eval, y_eval)):
    image = np.expand_dims(np.array(image), axis=0)
    output = model.predict(image)
    predictions[e] = np.squeeze(output).argmax()

#Keras model score
score_keras = 1 - cohen_kappa_score(y_eval.numpy(), predictions)
print("Score:",score_keras)

Score: 1.0


# 6. Saving and loading trained model

The trained model weights can now be saved in HDF5 format, which is the only format accepted for submission. The file name `test.h5` is used here.

In [34]:
#Saving model
model.save_weights('test.h5')

The trained model can also be loaded for further testing.

In [35]:
model = EfficientNetLiteB0(classes=num_classes, weights=None, input_shape=input_shape, classifier_activation=None)
model.load_weights('test.h5')

block1a_ same
block2a_ ((1, 1), (1, 1))
block2a_ valid
block2b_ same
block3a_ ((2, 2), (2, 2))
block3a_ valid
block3b_ same
block4a_ ((1, 1), (1, 1))
block4a_ valid
block4b_ same
block4c_ same
block5a_ same
block5b_ same
block5c_ same
block6a_ ((2, 2), (2, 2))
block6a_ valid
block6b_ same
block6c_ same
block6d_ same
block7a_ same


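The model will now be compiled and tested again. Note that this cell scores the model on the training split (`x_train`, `y_train`), whereas the earlier scoring cell used the evaluation split (`x_eval`, `y_eval`), so the two scores are not directly comparable. To confirm that saving and loading preserved the model, evaluate on the same split in both cells; you should then get the same score.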

In [36]:
#Model shall be compiled before testing.
model.compile()

#Creating empty predictions
predictions = np.zeros(len(y_train), dtype=np.int8)

# inference loop
for e, (image, target) in enumerate(zip(x_train, y_train)):
    image = np.expand_dims(np.array(image), axis=0)
    output = model.predict(image)
    predictions[e] = np.squeeze(output).argmax()

#Keras model score
score_keras = 1 - cohen_kappa_score(y_train.numpy(), predictions)
print("Score:",score_keras)

Score: 0.9090909090909091
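
A check of the save/load round trip that does not depend on which split is scored is to compare the weights directly, for example by capturing `model.get_weights()` before the model is re-created and again after `load_weights`. The sketch below shows such a comparison on stand-in arrays; the helper name `weights_match` and its `atol` parameter are assumptions made for illustration.

```python
import numpy as np

def weights_match(weights_a, weights_b, atol=0.0):
    """Return True if two lists of weight arrays are element-wise equal."""
    if len(weights_a) != len(weights_b):
        return False
    return all(
        # Compare shapes first so that broadcasting cannot hide a mismatch.
        a.shape == b.shape and np.allclose(a, b, rtol=0.0, atol=atol)
        for a, b in zip(weights_a, weights_b)
    )

# Stand-in arrays; with Keras these lists would come from model.get_weights().
before = [np.ones((3, 3), dtype=np.float32), np.zeros(3, dtype=np.float32)]
after_ok = [w.copy() for w in before]
after_bad = [before[0] + 0.5, before[1]]

same = weights_match(before, after_ok)
different = weights_match(before, after_bad)
print(same, different)
```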
