# Smart Calibration -- Demo

## Introduction
Image classification using a deep neural network. Example CIFAR10 images (32x32 pixels, 10 classes: plane, car, bird, ...).
<img src="./tmp/cifar10.png" alt="CIFAR10 example images" width="400"/>

Deep neural networks are trained to perform classification using training data and are evaluated on separate test data.
<img src="./tmp/losses.png" alt="Loss curves" width="500"/>
Almost always, test accuracy is lower than training accuracy. Why?

The influence function ([Cook and Weisberg, 1982](https://books.google.nl/books?id=MVSqAAAAIAAJ&hl=nl)) can be used to study this. It has recently been rediscovered in machine learning ([Koh and Liang, 2017](http://proceedings.mlr.press/v70/koh17a.html)) and in radio astronomy ([Yatawatta, 2018](https://ieeexplore.ieee.org/abstract/document/8448481), [Yatawatta, 2019](https://academic.oup.com/mnras/article/486/4/5646/5484901)).
What is it? *Small changes* in the input lead to *small changes* in the output; the influence function measures this relation for the trained model.
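
The sensitivity idea can be sketched with finite differences. This is only an illustration of "small change in, small change out", not the influence function of the cited papers; the one-layer `model`, the weights `W` and the helper `sensitivity` are hypothetical stand-ins for a trained network.

```python
import numpy as np

def model(x, W):
    """Toy stand-in for a trained model: a single tanh layer (assumption)."""
    return np.tanh(W @ x)

def sensitivity(f, x, eps=1e-6):
    """Central finite-difference estimate of the Jacobian d f(x) / d x."""
    y0 = f(x)
    J = np.zeros((y0.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps                      # perturb one input at a time
        J[:, j] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))          # hypothetical "trained" weights
x = rng.standard_normal(5)               # one input sample

J = sensitivity(lambda v: model(v, W), x)

# Analytic Jacobian of tanh(Wx), for comparison: diag(1 - tanh^2(Wx)) W
J_exact = (1.0 - np.tanh(W @ x) ** 2)[:, None] * W
print("max abs difference:", np.max(np.abs(J - J_exact)))
```

Each column of `J` shows how all outputs respond to a small change in one input, which is the kind of quantity displayed as an image in the examples that follow.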

Example: untrained AlexNet. The influence function looks like noise. Accuracy 10% (chance level for 10 classes).
<img src="./figures/alexnet_untrained.png" alt="Alexnet untrained" width="400"/>

Trained AlexNet, 2.4 million parameters, test accuracy 65%. The influence function shows patterns, which indicate *bias*.
<img src="./figures/alexnet_trained.png" alt="Alexnet trained" width="400"/>

Trained ResNet18, 11 million parameters, test accuracy 85%. The influence function is almost noise-like.
<img src="./figures/resnet18_trained.png" alt="ResNet18 trained" width="400"/>


## Calibration
When training a deep neural network on CIFAR10, we have *ground truth* information; in radio astronomy we do not. The influence function *does not need* the ground truth, which makes it ideal for us.

A simple example: ${\bf x}={\bf A}{\bf \theta}$. We know ${\bf A}$ and a noisy ${\bf x}$, and we need to find ${\bf \theta}$. In elastic net regression, we add regularization ($L_1$ and $L_2$ penalties on ${\bf \theta}$) to find ${\bf \theta}$.
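
A minimal sketch of elastic net regression, assuming the cost $\frac{1}{2}\|{\bf x}-{\bf A}{\bf \theta}\|^2 + \frac{\lambda_2}{2}\|{\bf \theta}\|^2 + \lambda_1\|{\bf \theta}\|_1$ and a plain proximal gradient (ISTA) solver. The scaling of the cost, the solver and the names `elastic_net`, `lam1`, `lam2` are choices made for this illustration, not necessarily what `ENetEnv` uses.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def elastic_net(A, x, lam1, lam2, n_iter=500):
    """Minimize 0.5||x - A theta||^2 + 0.5*lam2||theta||^2 + lam1||theta||_1."""
    # Lipschitz constant of the gradient of the smooth part
    L = np.linalg.norm(A, 2) ** 2 + lam2
    theta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ theta - x) + lam2 * theta
        theta = soft_threshold(theta - grad / L, lam1 / L)
    return theta

def cost(A, x, theta, lam1, lam2):
    r = x - A @ theta
    return 0.5 * r @ r + 0.5 * lam2 * theta @ theta + lam1 * np.sum(np.abs(theta))

rng = np.random.default_rng(1)
N, M = 20, 20                             # data points, parameters
A = rng.standard_normal((N, M))
theta_true = np.zeros(M)
theta_true[:3] = [1.0, -2.0, 0.5]         # sparse parameters (simulation only)
x = A @ theta_true + 0.01 * rng.standard_normal(N)   # noisy data

lam1, lam2 = 0.1, 0.01                    # hand-tuned regularization
theta_hat = elastic_net(A, x, lam1, lam2)
print("cost at zero    :", cost(A, x, np.zeros(M), lam1, lam2))
print("cost at solution:", cost(A, x, theta_hat, lam1, lam2))
```

The two values `lam1` and `lam2` are the quantities that would otherwise have to be tuned by hand.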

Instead of *hand tuning* the regularization, we train an agent to learn by itself what these regularization parameters should be.
<img src="figures/enet_pipeline.png" alt="Elastic net regression agent and environment" width="500"/>

### Reinforcement learning
An AI agent learns by itself, by trial and error, how to perform a task. Given the *state*, it performs an *action* and receives a *reward*. RL has a long history, and with the addition of deep neural networks it has made rapid progress recently, for example beating humans at chess and Go. In the example above, we have 5 deep neural networks, and we train them by running many trials.
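
The state, action, reward loop can be illustrated with a much simpler setup than the one used in this demo. The sketch below is an assumption-laden toy: `ToyRegEnv` is a hypothetical environment (not `ENetEnv`), the action is a single ridge regularization value chosen from a fixed list, the reward is the negative error on held-out data (so no ground truth is needed), and the agent is an epsilon-greedy bandit instead of deep neural networks.

```python
import numpy as np

class ToyRegEnv:
    """Hypothetical environment: pick a regularization, get a reward."""
    def __init__(self, n_fit=20, n_val=20, M=10, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_fit, self.n_val, self.M = n_fit, n_val, M

    def reset(self):
        # New random problem x = A theta + noise for every episode
        n = self.n_fit + self.n_val
        A = self.rng.standard_normal((n, self.M))
        theta = self.rng.standard_normal(self.M)
        x = A @ theta + 0.5 * self.rng.standard_normal(n)
        self.A_fit, self.x_fit = A[:self.n_fit], x[:self.n_fit]
        self.A_val, self.x_val = A[self.n_fit:], x[self.n_fit:]
        return self.A_fit.ravel()          # state: the design matrix

    def step(self, lam):
        # Ridge solution with the chosen regularization
        G = self.A_fit.T @ self.A_fit + lam * np.eye(self.M)
        theta_hat = np.linalg.solve(G, self.A_fit.T @ self.x_fit)
        err = self.x_val - self.A_val @ theta_hat
        reward = -float(np.mean(err ** 2)) # held-out error, no ground truth
        return self.A_fit.ravel(), reward, True, {}

actions = [1e-3, 1e-1, 1.0, 10.0, 1e3]     # candidate regularization values
q = np.zeros(len(actions))                 # running mean reward per action
counts = np.zeros(len(actions))
rng = np.random.default_rng(1)
env = ToyRegEnv()

for episode in range(300):
    state = env.reset()                    # unused by this simple bandit agent
    if rng.random() < 0.2:                 # explore
        k = int(rng.integers(len(actions)))
    else:                                  # exploit the best estimate so far
        k = int(np.argmax(q))
    _, reward, done, _ = env.step(actions[k])
    counts[k] += 1
    q[k] += (reward - q[k]) / counts[k]    # incremental mean update

best = actions[int(np.argmax(q))]
print("mean reward per action:", np.round(q, 3))
print("preferred regularization:", best)
```

Unlike this bandit, the agent in the next cell uses the state as input and outputs continuous actions.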

In [None]:
import sys,os
import gym
import torch
import numpy as np
sys.path.insert(0, os.path.abspath('./elasticnet'))

from enetenv import ENetEnv
from enet_td3 import Agent

%matplotlib inline

import matplotlib
import matplotlib.pyplot as plt

In [None]:
if __name__ == '__main__':
    N=20 # rows = data points
    M=20 # columns = parameters; note, if N<M there is no unique solution (here N=M)
    env = ENetEnv(M,N)
    # actions: 2
    agent = Agent(gamma=0.99, batch_size=64, n_actions=2, tau=0.005,
                  max_mem_size=1000, input_dims=[N+N*M], lr_a=1e-3, lr_c=1e-3,
                 update_actor_interval=2, warmup=100, noise=0.1)
    # note: input dims: N eigenvalues+ N*M size of design matrix, 
    # lr_a: learning rate actor, lr_c:learning rate critic
    scores=[]
    n_games= 200

    for i in range(n_games):
        score = 0
        done = False
        observation = env.reset()
        loop=0
        while (not done) and loop<2:
            action = agent.choose_action(observation)
            observation_, reward, done, info = env.step(action)
            score += reward
            agent.store_transition(observation, action, reward,
                                    observation_, done)
            agent.learn()
            observation = observation_
            loop+=1
        score=score.cpu().data.item()/loop
        scores.append(score)

        avg_score = np.mean(scores[-100:])
        print('episode ', i, 'score %.2f' % score,
                'average score %.2f' % avg_score)

plt.plot(scores)

## Influence maps
Visualizing the influence function in radio astronomical data (using 1 min of data below):
<img src="figures/influence_maps.png" alt="Influence maps" width="500"/>

Influence maps show what is hidden in the data: can we train a classifier to find these hidden signals?

### Smart calibration
<img src="tmp/agent_pipeline.png" alt="Calibration pipeline" width="500"/>
We train an RL agent to automatically find the best calibration settings: hyperparameters (regularization), sky model, resource allocation (number of CPUs/GPUs). We use only *a small amount of data* for this tuning, and can automatically adjust the settings as the data are processed.

We can also train RL agents for other tasks: RFI mitigation, image synthesis, anomaly detection, etc.

More information: [Code](https://github.com/SarodYatawatta/smart-calibration) [Paper](https://arxiv.org/abs/2102.03200)