## 1. Data preprocessing

In [None]:
!pip install pywin32-ctypes==0.2.0

In [None]:
# Equivalent to 'import win32api' from pywin32.
import sys

from win32ctypes.pywin32 import win32api

win32api.LoadLibraryEx(sys.executable, 0, win32api.LOAD_LIBRARY_AS_DATAFILE)

In [None]:
import win32com.client as win32


In [None]:
!pip install pypiwin32

In [None]:
import os

SOURCE_URL = 'https://storage.googleapis.com/dm-turtle-recall/images.tar'
IMAGE_DIR = './turtle_recall/images'
TAR_PATH = os.path.join(IMAGE_DIR, os.path.basename(SOURCE_URL))
EXPECTED_IMAGE_COUNT = 13891

%sx mkdir --parents "{IMAGE_DIR}"
if len(os.listdir(IMAGE_DIR)) != EXPECTED_IMAGE_COUNT:
  %sx wget --no-check-certificate -O "{TAR_PATH}" "{SOURCE_URL}"
  %sx tar --extract --file="{TAR_PATH}" --directory="{IMAGE_DIR}"
  %sx rm "{TAR_PATH}"

print(f'The total number of images is: {len(os.listdir(IMAGE_DIR))}')

Read in the train, test, and sample submission CSV files as pandas dataframes:

In [None]:
import pandas as pd
import requests
import io
import urllib.parse

BASE_URL = 'https://storage.googleapis.com/dm-turtle-recall/'


def read_csv_from_web(file_name):
  url = urllib.parse.urljoin(BASE_URL, file_name)
  content = requests.get(url).content
  return pd.read_csv(io.StringIO(content.decode('utf-8')))


# Read in csv files.
train = read_csv_from_web('train.csv')
test = read_csv_from_web('test.csv')
sample_submission = read_csv_from_web('sample_submission.csv')

# Convert image_location strings to lowercase.
for df in [train, test]:
  df.image_location = df.image_location.apply(lambda x: x.lower())
  assert set(df.image_location.unique()) == set(['left', 'right', 'top'])

In [None]:
train.head()

In [None]:
test.image_location.nunique()

In [None]:
train_top = train[train.image_location == 'top']
train_left = train[train.image_location == 'left']
train_right = train[train.image_location == 'right']

print(train_top.shape, train_left.shape, train_right.shape)

In [None]:
test.head()

In [None]:
test_top = test[test.image_location == 'top']
test_left = test[test.image_location == 'left']
test_right = test[test.image_location == 'right']

print(test_top.shape, test_left.shape, test_right.shape)

In [None]:
train_left_ids = list(train_left.image_id)
test_left_ids = list(test_left.image_id)

In [None]:
sample_submission.head()

In [None]:
train.shape, test.shape, sample_submission.shape

How many unique turtles are in the training set?

In [None]:
print(f"There are {train.turtle_id.nunique()} unique turtles in the train set.")

How many images are there for each individual turtle in the training set?

In [None]:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

train_images_per_turtle = train['turtle_id'].value_counts()
print('The mean number of training images per turtle is '
      f'{round(np.mean(train_images_per_turtle), 2)}, '
      f'and the median is {int(np.median(train_images_per_turtle))}.')
sns.histplot(train_images_per_turtle)
plt.xlabel('Images per train turtle')
plt.show()

We can plot the number of images per `turtle_id`:


In [None]:
images_per_turtle = train['turtle_id'].value_counts()
plt.figure(figsize=(3, 21))
sns.barplot(x=images_per_turtle, y=images_per_turtle.index,
            palette='Blues_r', orient='horizontal')
plt.show()

<a name="Approach"></a>
## 4. Approaching the modelling problem

Since we want to match images to labels (classes), we are dealing with an **image classification** problem. Image classification is generally most successfully approached using deep **convolutional neural networks** (CNNs).

### 4.1 Convolutional neural networks

At the highest level, CNNs take images as inputs and return probabilities that the image belongs to each of the possible classes.

In slightly more detail, CNNs hierarchically extract features from images using convolutional layers, which are usually followed by pooling layers that summarise the information in the extracted feature maps:
- Lower CNN layers capture low-level image features (edges, blobs)
- Layers deeper in the CNN capture higher-level features and objects (scale patterns, turtle eyes)
- Fully connected layers can then consolidate these extracted patterns and objects
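
To make the convolution and pooling steps concrete, here is a minimal sketch in plain Python. It is not how Haiku implements these layers: the toy image, the hand-written edge-detection kernel, and the helper names `conv2d_valid` and `max_pool_2x2` are all assumptions made for illustration. As in most deep learning libraries, the "convolution" here is a cross-correlation (the kernel is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slides `kernel` over `image` (no padding) and sums elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool_2x2(feature_map):
    """Keeps the largest value in each non-overlapping 2x2 window."""
    return [[max(feature_map[r][c], feature_map[r][c + 1],
                 feature_map[r + 1][c], feature_map[r + 1][c + 1])
             for c in range(0, len(feature_map[0]) - 1, 2)]
            for r in range(0, len(feature_map) - 1, 2)]


# Toy 6x6 greyscale image: dark left half, bright right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-written vertical-edge detector; a CNN would learn such weights.
kernel = [[-1, 0, 1]] * 3

feature_map = conv2d_valid(image, kernel)  # 4x4 map, non-zero only near the edge
pooled = max_pool_2x2(feature_map)         # 2x2 summary of the feature map
print(feature_map)
print(pooled)
```

The feature map responds only where the dark-to-bright edge falls inside the kernel window, and pooling halves the spatial resolution while keeping the strongest response.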

<p align="center">
  <img src="https://storage.googleapis.com/dm-turtle-recall/tutorial_images/cnn.png" width="1100"/>
</p>

We won't go into a detailed explanation of CNNs here; if you'd like to learn more about their inner workings, we recommend [CS231n: Convolutional Neural Networks for Visual Recognition](https://cs231n.github.io/).

### 4.2 Transfer learning for image classification

To approach this turtle face classification problem, we could specify a CNN architecture, initialise it, and proceed to train its parameters from scratch.

An alternative approach, which is very popular in the computer vision (image processing) world, is called **transfer learning**:
- Instead of training your model on your task from scratch, you first identify a model that has been **pre-trained** on some other image task
- You can then adapt it (**"fine-tune"** it) to your specific task

This is a popular approach because it often gets you two advantages:
- **Improved performance**. The pre-training does a lot of the heavy lifting of learning to understand images. The most popular pre-trained models were trained on huge datasets for a long period of time, and you can simply piggyback on what they've learned.
  - The features that these models learn are often reusable for many image-related tasks.
  - This is because lower level visual features such as edges, colour blobs, simple composite shapes etc. are pervasive regardless of the specifics of the image task.
- **Lower data and computational requirements**. Using a pre-trained model often means you don't need as much training data, time, or compute to reach a certain level of performance.

There are plenty of pre-trained computer vision models available; popular families include VGG, ResNet, and EfficientNet.

Here, we will make use of a pre-trained ResNet50 model made available via the **Haiku** library (a tool for building neural networks in **JAX**).


<p align="center">
  <img src="https://storage.googleapis.com/dm-turtle-recall/tutorial_images/jax.png" width="900"/>
</p>

The rest of this tutorial implements an image classification neural network using JAX, a Python library for numerical computation and deep learning that is frequently used by researchers and engineers at DeepMind. Since JAX is still less common than other tools such as Keras, TensorFlow, and PyTorch, the next section contains a quick introduction to some of the core features of JAX and Haiku.
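
As a small taste of what JAX code looks like, the sketch below differentiates and compiles an ordinary Python function. The function `sum_of_squares` is made up for illustration; `jax.grad` and `jax.jit` are the two transformations that the training code later in this tutorial relies on.

```python
import jax
import jax.numpy as jnp


def sum_of_squares(x):
  """A toy scalar-valued function of an array."""
  return jnp.sum(x ** 2)


# jax.grad builds a function that returns d(sum_of_squares)/dx;
# jax.jit compiles that function with XLA on first call.
grad_fn = jax.jit(jax.grad(sum_of_squares))

x = jnp.array([1.0, 2.0, 3.0])
gradient = grad_fn(x)  # Analytically, the gradient is 2 * x.
print(gradient)
```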

**Note**: It is not at all necessary to use JAX for this challenge, but we thought this might be a nice opportunity to introduce more people to this powerful and elegant way of writing machine learning code.

If you are already familiar with JAX, feel free to skip the next section :)

<a name="Model"></a>
## 6. Fine-tuning a ResNet model using JAX and Haiku

Time to get started with training our turtle facial recognition model!

First, let's install and import some key libraries:


In [None]:
import jax
import jax.numpy as jnp  # JAX version of numpy with a very similar API.
try:
  import haiku as hk
except ModuleNotFoundError:
  !pip install dm-haiku
  import haiku as hk
try:
  import optax
except ModuleNotFoundError:
  !pip install optax
  import optax
try:
  import immutabledict
except ModuleNotFoundError:
  !pip install immutabledict
  import immutabledict
import functools
from PIL import Image  # Image utilities.
import tqdm

Create three mappings and get the paths to the training set image files.
1. `labels` : turtle ID --> unique integer labels
1. `label_lookup` : unique integer labels --> turtle ID
1. `image_to_turtle` :  image IDs to turtle IDs (training set only).
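
The three mappings can be illustrated on toy data. The image and turtle IDs below are invented for the example (the real ones come from `train.csv`), and the `toy_` names are hypothetical:

```python
# Hypothetical toy data standing in for the columns of train.csv.
toy_image_ids = ['ID_a', 'ID_b', 'ID_c']
toy_turtle_ids = ['t_id_2', 't_id_1', 't_id_2']

# Sorted unique turtle IDs, plus an extra class for unseen turtles.
classes = sorted(set(toy_turtle_ids)) + ['new_turtle']

# turtle ID --> integer label, and the inverse lookup.
toy_labels = {turtle_id: index for index, turtle_id in enumerate(classes)}
toy_label_lookup = {index: turtle_id for turtle_id, index in toy_labels.items()}

# image ID --> turtle ID.
toy_image_to_turtle = dict(zip(toy_image_ids, toy_turtle_ids))

print(toy_labels)
print(toy_label_lookup)
print(toy_image_to_turtle)
```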


In [None]:
turtle_ids = sorted(np.unique(train.turtle_id)) + ['new_turtle']
labels = dict(zip(turtle_ids, np.arange(len(turtle_ids))))
label_lookup = {v: k for k, v in labels.items()}
print(label_lookup)
num_classes = len(labels) 
image_to_turtle = dict(zip(train.image_id, train.turtle_id))

# Use sets for fast membership tests. Note that `a or b` on two numpy arrays
# raises a ValueError, so the right and top IDs are combined explicitly.
left_ids = set(train_left.image_id)
other_ids = set(train_right.image_id) | set(train_top.image_id)

image_files_left = [os.path.join(IMAGE_DIR, f) for f in os.listdir(IMAGE_DIR)
                    if f.split('.')[0] in left_ids]
image_files_other = [os.path.join(IMAGE_DIR, f) for f in os.listdir(IMAGE_DIR)
                     if f.split('.')[0] in other_ids]
image_files = image_files_left + image_files_other

image_ids = [os.path.basename(f).split('.')[0] for f in image_files]
image_turtle_ids = [image_to_turtle[image_id] for image_id in image_ids]

Load the training images into memory - takes a little while!

*   Crops each image around the centre and resizes to `(224, 224)`
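
The crop box arithmetic can be checked in isolation. The helper `center_crop_box` below is a hypothetical name used only for this sketch; it returns the `(left, upper, right, lower)` tuple that `PIL.Image.crop` expects, computed the same way as in the loading code.

```python
def center_crop_box(width, height):
  """Returns the (left, upper, right, lower) box of the largest centred square."""
  crop_size = min(width, height)
  left = (width - crop_size) // 2
  upper = (height - crop_size) // 2
  return (left, upper, left + crop_size, upper + crop_size)


# A landscape image loses equal strips on the left and right...
landscape_box = center_crop_box(640, 480)
# ...while a portrait image loses strips at the top and bottom.
portrait_box = center_crop_box(480, 640)
print(landscape_box, portrait_box)
```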


In [None]:
import PIL
def crop_and_resize(pil_img, rotate=False):
  """Crops square from center of image and resizes to (224, 224)."""
  if rotate:
    # Make all images face right.
    pil_img = pil_img.rotate(180)

  w, h = pil_img.size
  crop_size = min(w, h)
  crop = pil_img.crop(((w - crop_size) // 2, (h - crop_size) // 2,
                       (w + crop_size) // 2, (h + crop_size) // 2))
  return crop.resize((224, 224))



tqdm.tqdm._instances.clear()
loaded_images = [crop_and_resize(Image.open(f)) for f in tqdm.tqdm(image_files)]

Define a function to get a random batch of data from the training images:


*   Randomly select `batch_size` elements from the available images
*   Get the labels for the selected images
*   Optionally rebalance the dataset so that every label is sampled uniformly


Returns the batch of images of shape `(batch_size, 224, 224, 3)` and the integer labels of shape `(batch_size,)`.
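
The rebalancing weights can be understood on a toy example: an image of a turtle that has `n` images gets weight `1 / n / num_turtles`, so every turtle receives the same total probability. The turtle names below are invented for illustration.

```python
from collections import Counter

# Hypothetical toy dataset: the turtle shown in each of four images.
toy_turtle_per_image = ['t_a', 't_a', 't_a', 't_b']

counts = Counter(toy_turtle_per_image)
num_turtles = len(counts)

# Per-image sampling probability, mirroring `probability_per_label`.
image_probs = [1 / counts[t] / num_turtles for t in toy_turtle_per_image]

# Total probability mass that each turtle receives.
turtle_probs = {
    turtle: sum(p for p, t in zip(image_probs, toy_turtle_per_image)
                if t == turtle)
    for turtle in counts
}
print(image_probs)
print(turtle_probs)
```

Each of the three `t_a` images is sampled with probability 1/6 and the single `t_b` image with probability 1/2, so both turtles are equally likely to appear in a draw.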



In [None]:
probability_per_label = {
    label: 1 / label_count / len(train_images_per_turtle)
    for label, label_count in train_images_per_turtle.items()
}

probabilities = [
    probability_per_label[image_turtle_id]
    for image_turtle_id in image_turtle_ids
]
assert np.isclose(1., np.sum(probabilities))


def get_batch(batch_size, rebalance=False):
  if rebalance:
    probs = probabilities
  else:
    probs = None
  batch_image_idxs = np.random.choice(
      len(image_files), size=batch_size, replace=False, p=probs)
  input_images = [loaded_images[idx] for idx in batch_image_idxs]
  image_labels = [labels[image_turtle_ids[idx]] for idx in batch_image_idxs]
  return (jnp.stack([
      jnp.asarray(im, dtype=jnp.float32) / 255. for im in input_images
  ]), jnp.stack(image_labels).astype(jnp.int32))

In [None]:
batch_images, _ = get_batch(batch_size=32)

_, axes = plt.subplots(nrows=4, ncols=8, figsize=(12, 6))
axes = axes.flatten()
for img, ax in zip(list(batch_images), axes):
  ax.imshow(img)
  ax.xaxis.set_visible(False)
  ax.yaxis.set_visible(False)
plt.tight_layout()
plt.show()

### ResNet

Define our network: a ResNet50 model made available via the Haiku library. The output is of size `num_classes`:

In [None]:
@hk.without_apply_rng
@hk.transform_with_state
def resnet(x, is_training):
  return hk.nets.ResNet50(
      num_classes=num_classes, resnet_v2=True,
      bn_config={'decay_rate': 0.9})(x, is_training)

Next we define our loss and update functions.

The loss function computes the softmax cross entropy between the logits and the labels and averages it over the batch. We also add L2 regularisation on the model parameters (excluding batch-norm parameters) to help alleviate overfitting.

The update function computes the loss and its gradients with `jax.value_and_grad`, then updates the parameters using `optax.apply_updates`.
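
The same pattern can be sketched on a toy linear classifier, without Haiku or optax. Everything here is an assumption made for illustration: the data, the `toy_loss` function, and the plain gradient-descent step with learning rate 0.1, which stands in for the Adam update used in this tutorial. The point is how `jax.value_and_grad(..., has_aux=True)` returns `((loss, aux), grads)`.

```python
import jax
import jax.numpy as jnp


def toy_loss(params, inputs, labels, l2_weight=0.005):
  """Mean softmax cross entropy plus L2 penalty; returns (loss, aux)."""
  logits = inputs @ params['w'] + params['b']
  log_probs = jax.nn.log_softmax(logits, axis=-1)
  one_hot = jax.nn.one_hot(labels, num_classes=logits.shape[-1])
  cross_entropy = -(log_probs * one_hot).sum(axis=-1).mean()
  l2 = jnp.sum(jnp.square(params['w']))
  return cross_entropy + l2_weight * l2, cross_entropy


inputs = jnp.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
labels = jnp.array([0, 1, 2])
params = {'w': jnp.zeros((2, 3)), 'b': jnp.zeros((3,))}

# has_aux=True tells JAX that only the first output is differentiated.
grad_fn = jax.value_and_grad(toy_loss, has_aux=True)
(loss_before, _), grads = grad_fn(params, inputs, labels)

# One step of plain gradient descent applied to every leaf of `params`.
params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)
(loss_after, _), _ = grad_fn(params, inputs, labels)
print(float(loss_before), float(loss_after))
```

With all-zero parameters the model predicts a uniform distribution over three classes, so the initial loss is `log(3)`; a single small gradient step lowers it.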

In [None]:
@functools.partial(jax.value_and_grad, has_aux=True)
def loss_fn(params, state, inputs, labels):
  predicted, new_state = net.apply(params, state, inputs, is_training=True)
  predicted = jax.nn.log_softmax(predicted, axis=-1)
  labels_one_hot = jax.nn.one_hot(labels, num_classes=num_classes)
  loss = -(predicted * labels_one_hot).sum(axis=-1).mean()
  loss = loss + l2_regularisation(params) * 0.005
  return loss, new_state


@jax.jit
def update(params, state, opt_state, inputs, labels):
  (loss, new_state), grads = loss_fn(params, state, inputs, labels)
  updates, new_opt_state = opt.update(grads, opt_state)
  new_params = optax.apply_updates(params, updates)
  return new_params, new_state, new_opt_state, loss


def l2_regularisation(params):
  l2_norm = 0.
  for module_name, module_params in params.items():
    if 'batchnorm' not in module_name:
      l2_norm += sum(
          [jnp.sum(jnp.square(x))
           for x in jax.tree_util.tree_leaves(module_params)])
  return l2_norm

Now that our forward pass, loss function, and update function are defined, let's load in some pre-trained weights from a ResNet50 model trained on ImageNet:

In [None]:
import pickle

checkpoint_url = urllib.parse.urljoin(BASE_URL,
                                      'resnet50_imagenet_checkpoint.pystate')
checkpoint = pickle.loads(requests.get(checkpoint_url).content)

# Get model params and state from the checkpoint.
pretrained_params = checkpoint['experiment_module']['params']
pretrained_state = checkpoint['experiment_module']['state']
pretrained_opt_state = checkpoint['experiment_module']['opt_state']

Most of the pre-trained parameters can be reused unchanged; the exception is the final layer:
- The pretrained ResNet50 model was trained to predict 1000 output classes
- Our current classification task has 101 output classes

This means we can use all of the pretrained parameters except for those in the final layer, which will have to be learned from scratch.

We'll also need to update the optimiser state `opt_state` to reflect this change in final layer params.

In [None]:
import tree


def update_params(path, values, scale_factor=-1e-5):
  if path[-2:] == ('res_net50/~/logits', 'b'):
    return scale_factor * jax.random.normal(
        jax.random.PRNGKey(0), (num_classes,))
  elif path[-2:] == ('res_net50/~/logits', 'w'):
    return scale_factor * jax.random.normal(
        jax.random.PRNGKey(0), (values.shape[0], num_classes))
  else:
    return values


def update_opt_state(path, values):
  if path[-2:] == ('res_net50/~/logits', 'b'):
    return jnp.zeros((num_classes,))
  elif path[-2:] == ('res_net50/~/logits', 'w'):
    return jnp.zeros((values.shape[0], num_classes))
  else:
    return values


pretrained_params = tree.map_structure_with_path(update_params,
                                                 pretrained_params)
pretrained_opt_state = tree.map_structure_with_path(update_opt_state,
                                                    pretrained_opt_state)

Fine-tune the model using the pretrained parameters to warm start the model (here, we are using the Adam optimiser from `optax`):

In [None]:
batch_size = 32
finetuned_losses = []
n_steps = 400
opt = optax.adam(1e-3)
opt_state = opt.init(pretrained_params)

net = resnet

for step in range(n_steps):
  batch_images, batch_labels = get_batch(batch_size)
  pretrained_params, pretrained_state, opt_state, loss = update(
      pretrained_params, pretrained_state, opt_state, batch_images,
      batch_labels)
  finetuned_losses.append(loss)
  if step % 50 == 0:
    print(f"Loss at step {step}: {loss:.3f}.", flush=True)

Plot the learning curve from this run:

In [None]:
plt.plot(np.arange(len(finetuned_losses)), finetuned_losses)
plt.title("Fine-tuned model learning curve.")
plt.xlabel("Training steps")
plt.ylabel("Loss")
plt.show()

### CNN


We can also train a simple CNN from scratch:

In [None]:
@hk.without_apply_rng
@hk.transform_with_state
def simple_cnn(x, is_training):
  def conv_block(x, channels, kernel):
    x = hk.Conv2D(
        output_channels=channels,
        kernel_shape=kernel,
        stride=2,
        padding='SAME',
        with_bias=True)(
            x)
    x = jax.nn.relu(x)
    return x

  for channels, kernel in zip([8, 16], [3, 5]):
    x = conv_block(x, channels, kernel)
  x = hk.Flatten()(x)
  x = hk.Linear(256)(x)
  x = jax.nn.relu(x)
  x = hk.Linear(num_classes)(x)
  return x

In [None]:
# Initialise new network params, state, and optimiser from scratch.

net = simple_cnn

image_for_init, _ = get_batch(1)
params, state = net.init(jax.random.PRNGKey(1), image_for_init, True)
opt = optax.adam(3e-4)
opt_state = opt.init(params)

###################   MY BRILLIANT CHANGE   ############################
batch_size = 64
#########################################################################
from_scratch_losses = []
n_steps = 1000

for step in range(n_steps):
  batch_images, batch_labels = get_batch(batch_size, rebalance=True)
  params, state, opt_state, loss = update(params, state, opt_state,
                                          batch_images, batch_labels)
  from_scratch_losses.append(loss)
  if step % 50 == 0:
    print(f"Loss at step {step}: {loss:.3f}.", flush=True)

Compare the learning curves from the two models:

In [None]:
figure, axes = plt.subplots(ncols=2, sharey=True, figsize=(10, 4))

axes[0].plot(np.arange(len(finetuned_losses)), finetuned_losses)
axes[0].set_title("Fine-tuning a pretrained model")
axes[0].set(xlabel="Training steps", ylabel="Loss")
axes[1].plot(np.arange(len(from_scratch_losses)), from_scratch_losses)
axes[1].set_title("Training from scratch")
axes[1].set(xlabel="Training steps", ylabel="Loss")

plt.show()

## 7. Performing inference using trained models

We can make predictions on new examples using a trained model by passing the model parameters, state and (preprocessed) inputs to the `apply` function:

In [None]:
@jax.jit
def predict(params, state, inputs):
  """Forward pass of model with log softmaxed output."""
  predicted, _ = net.apply(params, state, inputs, is_training=False)
  return jax.nn.log_softmax(predicted, axis=-1)


def get_image_by_image_id(image_id):
  """Function to get a model-ready image given an image ID"""
  all_image_files = os.listdir(IMAGE_DIR)
  all_image_ids = [
      os.path.basename(file).split('.')[0] for file in all_image_files
  ]
  if image_id not in all_image_ids:
    raise ValueError(f'Could not find image with ID {image_id}')
  image_filepath = all_image_files[all_image_ids.index(image_id)]
  image = Image.open(os.path.join(IMAGE_DIR, image_filepath))
  image = crop_and_resize(image)
  return jnp.stack([jnp.asarray(image, dtype=jnp.float32) / 255.])

## 8. Generating test set predictions

We can generate predictions for the entire test set by simply calling `predict` on each example. Let's first load the test images and apply the same cropping and resizing as before:

In [None]:
tqdm.tqdm._instances.clear()
test_image_files = [os.path.join(IMAGE_DIR, f) for f in os.listdir(IMAGE_DIR)
                    if f.split('.')[0] in test.image_id.values]
test_image_ids = [os.path.basename(f).split('.')[0] for f in test_image_files]
loaded_test_images = [crop_and_resize(Image.open(f)) for f in tqdm.tqdm(test_image_files)]

The following utilities will perform batch inference and format the results. For submission we need a csv with an `image_id` column, and separate columns for each of our top 5 predictions.
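
The top-5 selection relies on a small `argsort` idiom that is easy to get backwards, so here is a sketch on invented scores. `np.argsort` sorts in ascending order, so we take the last five indices of each row and reverse them to get the best prediction first.

```python
import numpy as np

# Hypothetical scores for 2 images over 6 classes (higher is better).
toy_scores = np.array([[0.10, 0.50, 0.05, 0.20, 0.03, 0.12],
                       [0.30, 0.02, 0.40, 0.08, 0.15, 0.05]])

# Ascending sort, keep the last 5 columns, then flip to descending order.
top_5 = np.argsort(toy_scores, axis=-1)[:, -5:][:, ::-1]
print(top_5)
```

For the first image, class 1 has the highest score and therefore comes first, followed by classes 3, 5, 0, and 2.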

In [None]:
import collections

def batch_list(list_to_batch, batch_size):
  """Chunk up a list into batches, potentially with a smaller final batch."""
  return [list_to_batch[i:i + batch_size] for i in range(
      0, len(list_to_batch), batch_size)]

def predict_on_set(batched_image_ids,
                   batched_images,
                   params, state):
  """Returns top 5 predictions on batched images as a submission dataframe."""
  model_predictions = []

  for ids, images in zip(batched_image_ids, batched_images):
    # Stack images for batch inference.
    images = jnp.stack(
        [jnp.asarray(im, dtype=jnp.float32) / 255. for im in images])

    # Make predictions and sort logits to find top 5 predictions.
    logits = predict(params, state, images)
    logits = jax.device_get(logits)
    top_5_predictions = np.argsort(logits)[:, -5:][:, ::-1]
    print(top_5_predictions)

    # Format results.
    for image_id, predictions in zip(ids, top_5_predictions):
      row = {}
      predicted_turtle_ids = [label_lookup[label] for label in predictions]
      row['image_id'] = image_id
      for prediction_idx, prediction in enumerate(predicted_turtle_ids):
        row[f'prediction{prediction_idx + 1}'] = prediction
      model_predictions.append(row)

  return pd.DataFrame(model_predictions).set_index('image_id')

# Batch up test images and IDs.
############### ATTENTION! THIS HAD TO BE CHANGED HERE TOO! ####################
eval_batch_size = 64
################################################################################
test_batched_images = batch_list(loaded_test_images, eval_batch_size)
test_batched_image_ids = batch_list(test_image_ids, eval_batch_size)

In [None]:
net = simple_cnn
predictions_from_scratch = predict_on_set(test_batched_image_ids,
                                          test_batched_images,
                                          params, state)
net = resnet
predictions_from_pretrained = predict_on_set(test_batched_image_ids,
                                             test_batched_images,
                                             pretrained_params,
                                             pretrained_state)

In [None]:
predictions_from_scratch

<a name="Submit"></a>
## 9. Submitting our predictions

Now that our predictions are ready, we can save them to file and download them, ready to submit to Zindi:

In [None]:
from google.colab import files

predictions_from_scratch.to_csv('submission.csv')
files.download('submission.csv')

### The evaluation metric: Mean Average Precision (`MAP@5`)

We are using Mean Average Precision at 5, where 5 refers to the number of predictions submitted for each turtle.


#### Precision @k (`P@k`)

- Defined as `true_positives_in_top_k_predictions / k `, this captures how many relevant items are present in the top `k` recommendations of your system.

- For example, let's assume the prediction for one row is as follows:

  ```actual = "t_id_ROFhVsy2"```

  ``` predicted = ["t_id_ROFhVsy2", "t_id_UVQa4BMz", "t_id_a4VYrmyA", "new_turtle", "t_id_4ZfTUmwL"]```

  Then, for different values of `k`:

  `P@1  = 1/1 `

  `P@2 = 1/2 `

  `P@3 = 1/3`

  `P@4 = 1/4`

  `P@5 = 1/5`

  <br>
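
The `P@k` values above can be reproduced with a few lines of Python. The helper name `precision_at_k` is our own choice for this sketch, not part of the challenge code:

```python
def precision_at_k(actual, predicted, k):
  """Fraction of the top-k predictions that match the true turtle ID."""
  return sum(p == actual for p in predicted[:k]) / k


actual = "t_id_ROFhVsy2"
predicted = ["t_id_ROFhVsy2", "t_id_UVQa4BMz", "t_id_a4VYrmyA",
             "new_turtle", "t_id_4ZfTUmwL"]

precisions = [precision_at_k(actual, predicted, k) for k in range(1, 6)]
print(precisions)
```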

#### Average precision @ k (`AP@k`)
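- Defined as the sum of `P@i` over the positions `i = 1, ..., k` at which the prediction is correct, divided by the number of correct labels. Each image has exactly one true turtle ID, so `AP@k` is `1 / i` if the true ID appears at position `i` of the top `k` predictions, and `0` otherwise.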

<br>

#### Mean Average Precision @k (`MAP@K`)
- Defined as the mean of the `AP@K` for all the turtles.

- For our metric, `MAP@5`: sum `AP@5` for all the turtles and divide that value by the number of turtles

Here are functions that you can use to calculate the `AP@5` and `MAP@5` for a given label and list of predictions:

In [None]:
def apk(actual, predicted, k=5):
  """Computes the average precision at k.

  Args:
    actual: The turtle ID to be predicted.
    predicted : A list of predicted turtle IDs (order does matter).
    k : The maximum number of predicted elements.

  Returns:
    The average precision at k.
  """
  if len(predicted) > k:
    predicted = predicted[:k]

  score = 0.0
  num_hits = 0.0

  for i, p in enumerate(predicted):
    if p == actual and p not in predicted[:i]:
      num_hits += 1.0
      score += num_hits / (i + 1.0)

  return score


def mapk(actual, predicted, k=5):
  """ Computes the mean average precision at k.

    The turtle ID at actual[i] will be used to score predicted[i][:k] so order
    matters throughout!

    actual: A list of the true turtle IDs to score against.
    predicted: A list of lists of predicted turtle IDs.
    k: The size of the window to score within.

    Returns:
      The mean average precision at k.
  """
  return np.mean([apk(a, p, k) for a, p in zip(actual, predicted)])

In [None]:
predictions = predictions_from_scratch[[
    "prediction1", "prediction2", "prediction3", "prediction4", "prediction5"
]]
y_predict = predictions.values.tolist()

# We don't actually know the true labels for the test set, so for the purposes
# of demonstration we just assume that all of the images in the test set are of
# a single turtle:
assumed_y = ["t_id_d6aYXtor"] * len(y_predict)

mapk_result = mapk(assumed_y, y_predict, k=5)
print("With made up test set labels, our mapk with k=5 is", mapk_result)

### Generalisability prize
Please note that there is an additional prize for generalisability: The likelihood of the approach and algorithm being able to generalise beyond the challenge dataset without frequent re-training, taking into account the approach and algorithm used.

<a name="Suggestions"></a>
## 10. Suggestions for improving the model

This tutorial provides a jumping-off point for this task, but there are plenty of improvements you might consider making to the model presented in this colab. A few possible directions to explore:
- **Data-related**
  - Preprocessing the dataset in various ways - cropping, colour-correction, etc.
  - Augmenting the dataset - flipping, rotating, tinting, etc. the images to increase data size and potentially improve model generalisation
  - Making use of the extra images, listed in `extra_images.csv`, provided in the dataset
- **Model-related**
  - Outputting 'new_turtle' when the model is particularly uncertain
  - Trying different pre-trained models
  - Playing around with the model architecture
  - Trying model ensembling
  - Hyperparameter tuning
  - Other regularisation approaches
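
As one possible starting point for the 'new_turtle' suggestion, the sketch below moves `'new_turtle'` to the front of the prediction list when the model's top probability is low. The function name, the threshold of 0.5, and the example IDs are all assumptions for illustration; the threshold in particular would need tuning on held-out data. Since `predict` returns log-probabilities, the top probability would be the exponential of the largest value in each row.

```python
def insert_new_turtle(predicted_ids, top_probability, threshold=0.5, k=5):
  """Puts 'new_turtle' first when the top prediction is not confident."""
  if top_probability >= threshold:
    return list(predicted_ids[:k])
  # Drop any existing 'new_turtle' entry so that it is not duplicated.
  remaining = [p for p in predicted_ids if p != 'new_turtle']
  return (['new_turtle'] + remaining)[:k]


toy_predictions = ['t_id_1', 't_id_2', 't_id_3', 'new_turtle', 't_id_4']

confident = insert_new_turtle(toy_predictions, top_probability=0.9)
uncertain = insert_new_turtle(toy_predictions, top_probability=0.2)
print(confident)
print(uncertain)
```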

Thanks for reading, hope you enjoyed this tutorial and we're looking forward to seeing your entries!

<a name="Legal"></a>
## 11. License and Disclaimer

This is not an officially-supported Google product.

Copyright 2021 DeepMind Technologies Limited.

This notebook and code is licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at https://www.apache.org/licenses/LICENSE-2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.


**Model Parameters License**

The pre-trained model parameters are made available under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license. You can find details at: https://creativecommons.org/licenses/by/4.0/legalcode

**Data Set**

The data set of turtle images and associated labels have been provided by Zindi and Local Ocean Conservation.
