# Instructions

Build a deep learning model for head keypoint detection in PyTorch.

<img src="static/aligner_example.png" alt="Drawing" style="width: 300px;"/>


Your model takes a tensor X as input and outputs predictions for the blowhead and the bonnet-tip (4 coordinates in total).

Read more about PyTorch and model definition in the following resources:
https://towardsdatascience.com/pytorch-tutorial-distilled-95ce8781a89c?gi=ef974c787a5e
http://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html


Now the architecture that you need is the following:

<img src="static/aligner_architecture.png" alt="Drawing" style="height: 500px; width: 300px;"/>

Apart from the 4 coordinates, this model also predicts auxiliary targets: the callosity pattern and the whale id (the original task). Training on them helps optimization and acts as regularization.
You can read about it on our blog:
https://blog.deepsense.ai/deep-learning-right-whale-recognition-kaggle/

and investigate `metadata.csv` to look at those target columns.
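
One way to picture how the auxiliary targets enter training is a single loss that sums one term per output head. The sketch below is only an illustration under stated assumptions: it treats every coordinate as a classification over the 128 bins suggested by `CONFIG`, and `multitask_loss` and `aux_weight` are hypothetical names, not part of this exercise's framework.

```python
import torch
import torch.nn.functional as F


def multitask_loss(outputs, targets, aux_weight=1.0):
    """Sum cross-entropy over four coordinate heads and two auxiliary heads.

    outputs: list of six logit tensors, ordered
             [p1x, p1y, p2x, p2y, callosity, whale_id]
    targets: list of six LongTensors with class indices, in the same order
    aux_weight: hypothetical scaling factor for the auxiliary terms
    """
    losses = [F.cross_entropy(out, tgt) for out, tgt in zip(outputs, targets)]
    main_loss = sum(losses[:4])   # the four coordinates we actually care about
    aux_loss = sum(losses[4:])    # callosity pattern and whale id
    return main_loss + aux_weight * aux_loss


# Toy batch with random logits and labels, just to exercise the function.
torch.manual_seed(0)
batch_size = 2
class_counts = [128, 128, 128, 128, 3, 447]
outputs = [torch.randn(batch_size, n) for n in class_counts]
targets = [torch.randint(0, n, (batch_size,)) for n in class_counts]

loss = multitask_loss(outputs, targets)
main_only = multitask_loss(outputs, targets, aux_weight=0.0)
```

Setting `aux_weight=0.0` drops the auxiliary terms, which makes it easy to compare training with and without them.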

This is an alignment subtask, so add `-s alignment` to the execution command.

# Your Solution
Your solution function should be called `solution`.

`CONFIG` is a dictionary with all the parameters that you want to pass to your `solution` function.

In [1]:
import torch.nn as nn

In [2]:
CONFIG = {'input_size': (3, 256, 256),
          'classes': {'points': 128,
                      'callosity': 3,
                      'whale_id': 447}
          }

def solution(input_shape, classes):
    class PyTorchAligner(nn.Module):
        def __init__(self, input_shape, classes):
            """
            input_shape: tuple representing shape
            classes: dictionary of ints with keys ['points','callosity','whale_id']
            """
            super(PyTorchAligner, self).__init__()
            self.features = nn.Sequential(
                """
                Feature extraction part of the neural network.
                Stack those layers to get architecture defined in the notes above.
                """

            )
            self.flat_features_nr = self._get_flat_features_nr(input_shape, self.features)

            self.point1_x = nn.Sequential(
                """
                Put your classification layers for point1_x
                """
            )

            self.point1_y = nn.Sequential(
                """
                Put your classification layers for point1_y
                """
            )

            self.point2_x = nn.Sequential(
                """
                Put your classification layers for point2_x
                """
            )

            self.point2_y = nn.Sequential(
                """
                Put your classification layers for point2_y
                """
            )
            
            self.callosity = nn.Sequential(
                """
                Put your classification layers for callosity
                """
            )

            self.whale_id = nn.Sequential(
                """
                Put your classification layers for whale id
                """
            )

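        def _get_flat_features_nr(self, in_size, features):
            """
            Linear layers need to know the size of their input.
            Implement a function that returns the number of features that `features`
            produces for an input of shape `in_size`, once the output is flattened.
            """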
            return flattened_features_size

        def forward(self, x):
            """
            Implement forward pass through the network
            """
            return [pred_p1x, pred_p1y, pred_p2x, pred_p2y, pred_callosity, pred_whale_id]

        def forward_target(self, x):
            """
            We want to forget about the auxiliary outputs here and only output the target predictions
            """
            return [pred_p1x, pred_p1y, pred_p2x, pred_p2y]
    
    return PyTorchAligner(input_shape, classes)
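
If the template feels abstract, the following minimal sketch shows the two mechanical pieces on a toy network: finding the flattened feature size with a dummy forward pass, and branching one shared feature vector into six heads. It is not the architecture from the diagram above. The single convolution block, the one-layer heads, and the names `get_flat_features_nr`, `TinyAligner` and `heads` are assumptions made for illustration only.

```python
import torch
import torch.nn as nn


def get_flat_features_nr(in_size, features):
    """Push one zero-filled sample through `features` and count its outputs."""
    with torch.no_grad():
        dummy = torch.zeros(1, *in_size)   # batch of one, shape (1, C, H, W)
        out = features(dummy)
    return int(out.view(1, -1).size(1))


class TinyAligner(nn.Module):
    """Toy stand-in for the exercise model: shared features, six linear heads."""

    def __init__(self, input_shape, classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(input_shape[0], 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        flat = get_flat_features_nr(input_shape, self.features)
        # Four coordinate heads followed by the two auxiliary heads.
        head_sizes = [classes['points']] * 4 + [classes['callosity'],
                                                classes['whale_id']]
        self.heads = nn.ModuleList([nn.Linear(flat, n) for n in head_sizes])

    def forward(self, x):
        flat = self.features(x).view(x.size(0), -1)
        return [head(flat) for head in self.heads]

    def forward_target(self, x):
        # Keep only the four coordinate predictions.
        return self.forward(x)[:4]


classes = {'points': 128, 'callosity': 3, 'whale_id': 447}
model = TinyAligner((3, 32, 32), classes)   # small input keeps the demo fast
x = torch.zeros(2, 3, 32, 32)
preds = model(x)
target_preds = model.forward_target(x)
```

The dummy pass avoids deriving the output size by hand, so the linear layers stay correct when the convolutional stack changes. If the feature extractor contains batch normalization or dropout, it is probably safer to run the dummy pass in evaluation mode.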