# Training fine flow prediction
Assuming the source image $I_s$ and the target image $I_t$ are already coarsely aligned, this notebook trains a model to predict a fine flow $F_{s\rightarrow t}$ between them.
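
To make the notion of a flow concrete, here is a minimal pure-Python sketch. It assumes the flow stores, for every target pixel $(x, y)$, the absolute source coordinate $(x^\prime, y^\prime)$ to sample from; the package may use a different convention. The helpers `bilinear_sample` and `warp` are hypothetical names for illustration and are not part of `ransacflow`.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a 2D image (list of rows) at a real-valued (x, y), clamping to the border."""
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * image[y0][x0] + ax * image[y0][x1]
    bottom = (1 - ax) * image[y1][x0] + ax * image[y1][x1]
    return (1 - ay) * top + ay * bottom


def warp(source, flow):
    """Build the warped image: output[y][x] is the source sampled at flow[y][x] = (x', y')."""
    return [[bilinear_sample(source, xs, ys) for (xs, ys) in row] for row in flow]


source = [[0.0, 1.0, 2.0],
          [3.0, 4.0, 5.0]]
# Identity flow: every target pixel looks up the same location in the source.
identity = [[(float(x), float(y)) for x in range(3)] for y in range(2)]
# Shifted flow: every target pixel looks half a pixel to the right in the source.
shifted = [[(x + 0.5, float(y)) for x in range(3)] for y in range(2)]

warped_identity = warp(source, identity)
warped_shifted = warp(source, shifted)
```

With the identity flow the warped image equals the source; with the shifted flow each value is the average of two horizontal neighbours, except at the right border where the lookup is clamped.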

TODO describe objective functions used in this project

In [1]:
%load_ext autoreload
%autoreload 2

We assume you already have a folder called `workspace` that contains the zipped dataset.

In [2]:
%cd ../notebooks/workspace

/mnt/d/courses/RANSAC-Flow/notebooks/workspace


Import packages that we will use throughout this notebook.

In [3]:
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import EarlyStopping
from pytorch_lightning.loggers import TensorBoardLogger

We enable logging here to make debugging easier.

In [4]:
import logging

logging.basicConfig(
    level=logging.INFO,
    format="[%(asctime)s] %(name)s :: %(levelname)s :: %(message)s",
    handlers=[logging.StreamHandler()],
)

logging.getLogger("ransacflow").setLevel(logging.DEBUG)
logging.getLogger('ransacflow.data').setLevel(logging.WARNING)
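
As a quick check of how these two levels interact, the sketch below uses only the standard `logging` module. It relies on loggers without an explicit level inheriting from their nearest configured ancestor, so it only holds if the submodules do not set their own levels.

```python
import logging

# Same two settings as in the cell above.
logging.getLogger("ransacflow").setLevel(logging.DEBUG)
logging.getLogger("ransacflow.data").setLevel(logging.WARNING)

train_logger = logging.getLogger("ransacflow.train")
transform_logger = logging.getLogger("ransacflow.data.transform")

# Loggers without an explicit level inherit from the nearest configured ancestor.
train_level = logging.getLevelName(train_logger.getEffectiveLevel())
transform_level = logging.getLevelName(transform_logger.getEffectiveLevel())
print(train_level, transform_level)
```

Under that assumption, `ransacflow.train` emits debug messages while everything below `ransacflow.data` is limited to warnings and above.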

## Prepare dataset
Some of the datasets used in the original paper are already packaged as `LightningDataModule` classes. We import one of them here.

In [5]:
from ransacflow.data import MegaDepthDataModule

image_size = 224
mega_depth = MegaDepthDataModule(
    "MegaDepth_cleansed.zip", image_size=image_size, train_batch_size=2
)

In [None]:
#TODO add some sanity check for the dataset here, previews

In [None]:
# TODO setup environments for the following training sessions, how?

In [None]:
# FIXME is it possible to share the Trainer object across all 3 stages

## Stage 1
Train with the **reconstruction loss** only.

It is based on the idea that the source image $I_s$, warped with the predicted flow $F_{s\rightarrow t}$, should align well with the target image $I_t$. The original work uses the structural similarity (SSIM) as the perceptual similarity measure. In the loss below, $(x^\prime, y^\prime)$ denotes the location in $I_s$ that the flow associates with $(x, y)$ in $I_t$.
$$ L_{\text{recon}}\left(I_s, I_t\right) = \sum_{(x,y)\in I_t} M_t^{\text{cycle}}(x,y) \left( 1 - \text{SSIM}\left\lbrace I_s(x^\prime, y^\prime), I_t(x,y) \right\rbrace \right) $$

FIXME wtf is M_t doing here?
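
One way to read the formula is as a per-location SSIM dissimilarity, weighted by the mask and summed. The pure-Python sketch below follows that reading under simplifying assumptions: each location is represented by one small patch, SSIM is computed over the whole patch rather than with a sliding window (the model above is configured with `ssim_window_size=11`), and the mask is an ordinary list of weights. `ssim_patch` and `reconstruction_loss` are hypothetical names, not the `ransacflow` implementation.

```python
def ssim_patch(p, q, data_range=1.0):
    """SSIM between two equally sized patches, given as flat lists of intensities."""
    c1 = (0.01 * data_range) ** 2  # stabilizing constants from the standard SSIM definition
    c2 = (0.03 * data_range) ** 2
    n = len(p)
    mu_p, mu_q = sum(p) / n, sum(q) / n
    var_p = sum((a - mu_p) ** 2 for a in p) / n
    var_q = sum((b - mu_q) ** 2 for b in q) / n
    cov = sum((a - mu_p) * (b - mu_q) for a, b in zip(p, q)) / n
    return ((2 * mu_p * mu_q + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_q ** 2 + c1) * (var_p + var_q + c2)
    )


def reconstruction_loss(warped_patches, target_patches, mask):
    """Sum of mask-weighted (1 - SSIM) over all target locations."""
    return sum(
        m * (1.0 - ssim_patch(p, q))
        for p, q, m in zip(warped_patches, target_patches, mask)
    )


target = [[0.1, 0.2, 0.3, 0.4], [0.9, 0.8, 0.7, 0.6]]
perfect = [list(patch) for patch in target]  # warped source matches the target exactly
misaligned = [[0.4, 0.3, 0.2, 0.1], [0.9, 0.8, 0.7, 0.6]]  # first patch is flipped

loss_perfect = reconstruction_loss(perfect, target, mask=[1.0, 1.0])
loss_misaligned = reconstruction_loss(misaligned, target, mask=[1.0, 1.0])
loss_masked = reconstruction_loss(misaligned, target, mask=[0.0, 1.0])
```

A perfect warp gives zero loss, a misaligned patch is penalized, and setting the mask to zero at the misaligned location removes its contribution. That last case suggests the role of the mask is to down-weight locations that cannot be matched.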

In [38]:
# DEBUG somehow the dataset needs to be reloaded every time for correct coords
from ransacflow.data import MegaDepthDataModule

mega_depth = MegaDepthDataModule(
    "MegaDepth_cleansed.zip",
    train_image_size=224,
    train_batch_size=2,
    val_image_size=480,
)

## parameter names
log_dir = "MegaDepth_logs"
##

from ransacflow.train import RANSACFlowModelStage1

ransac_flow = RANSACFlowModelStage1(
    alpha=0, beta=0, gamma=0, kernel_size=7, ssim_window_size=11, lr=2e-4,
)

# FIXME unify TB logging location and experiment name
trainer = Trainer(
    gpus=-1,
    fast_dev_run=6,
    max_epochs=200,
    logger=TensorBoardLogger(log_dir, name="stage1"),
    callbacks=[EarlyStopping(monitor="val_loss", min_delta=0.01, patience=3)],
)
trainer.fit(ransac_flow, mega_depth)


[2021-12-27 18:36:27,052] ransacflow.train :: DEBUG :: generate default pixel error ticks: [ 1.  2.  3.  5.  8. 13. 22. 36.]
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Running in fast_dev_run mode: will run a full train, val, test and prediction loop using 6 batch(es).
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name              | Type                  | Params
------------------------------------------------------------
0 | feature_extractor | FeatureExtractor      | 2.8 M 
1 | correlator        | NeighborCorrelator    | 0     
2 | flow              | FlowPredictor         | 1.8 M 
3 | matchability      | MatchabilityPredictor | 1.7 M 
4 | loss_rec          | ReconstructionLoss    | 0     
------------------------------------------------------------
6.2 M     Trainable params
0         Non-trainable params
6.2 M     Total params
24.948    Total estimated model params size (MB)


Training: 0it [00:00, ?it/s]

[2021-12-27 18:36:29,523] ransacflow.train :: DEBUG :: create new grid torch.Size([224, 224])


Validating: 0it [00:00, ?it/s]

[2021-12-27 18:36:31,927] ransacflow.data.transform :: DEBUG :: init_ratio=2.25000, actual_ratio=(w=2.26415, h=2.25000)
[2021-12-27 18:36:32,010] ransacflow.data.transform :: DEBUG :: init_ratio=2.25000, actual_ratio=(w=2.26415, h=2.25000)
[2021-12-27 18:36:32,023] ransacflow.train :: DEBUG :: create new grid torch.Size([480, 848])
[2021-12-27 18:36:32,301] ransacflow.data.transform :: DEBUG :: init_ratio=2.25000, actual_ratio=(w=2.26415, h=2.25000)
[2021-12-27 18:36:32,579] ransacflow.data.transform :: DEBUG :: init_ratio=2.25000, actual_ratio=(w=2.26415, h=2.25000)
[2021-12-27 18:36:32,855] ransacflow.data.transform :: DEBUG :: init_ratio=2.25000, actual_ratio=(w=2.26415, h=2.25000)
[2021-12-27 18:36:33,126] ransacflow.data.transform :: DEBUG :: init_ratio=2.25000, actual_ratio=(w=2.26415, h=2.25000)
[2021-12-27 18:36:33,413] ransacflow.data.transform :: DEBUG :: init_ratio=2.25000, actual_ratio=(w=2.26415, h=2.25000)


prec@1.0=0.00433
prec@2.0=0.01279
prec@3.0=0.04709
prec@5.0=0.12327
prec@8.0=0.25774
prec@13.0=0.46932
prec@22.0=0.73675
prec@36.0=1.00000
tensor([0.0043, 0.0128, 0.0471, 0.1233, 0.2577, 0.4693, 0.7368, 1.0000],
       device='cuda:0')
tensor(0.7423, device='cuda:0')


RuntimeError: DEBUG, validation_epoch_end

The following command-line arguments are copied from the original implementation, temporarily.

In [None]:
    --nEpochs 200
    --lr 2e-4
    --kernelSize 7
    --imgSize 224
    --batchSize 16
    --lambda-match 0.0  # alpha
    --mu-cycle 0.0      # beta
    --grad 0.0          # gamma
    --trainMode flow
    --margin 88

## Stage 2
Jointly train with the **reconstruction loss** and the **cycle consistency of the flow**.

Aside from the reconstruction loss introduced in the previous stage, we start to enforce cycle consistency of the flow with
$$ L_{\text{cycle}} = \sum_{(x,y) \in I_t} M_t^{\text{cycle}} (x,y) \left\lVert \left(x^\prime, y^\prime \right), F_{t\rightarrow s}(x,y) \right\rVert_2 $$

FIXME what happened with (x^\prime, y^\prime), F_{t->s}? Are they multiplied?
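
One possible reading, stated here as an assumption rather than a confirmed interpretation, is that $\lVert a, b \rVert_2$ denotes the Euclidean distance between the two points $a$ and $b$, not a product. Under that reading the loss is a mask-weighted sum of point distances. The function name `cycle_loss` and the toy coordinates below are hypothetical.

```python
import math


def cycle_loss(warped_coords, backward_flow, mask):
    """Mask-weighted sum of Euclidean distances between paired 2D points."""
    return sum(
        m * math.dist(p, q)
        for p, q, m in zip(warped_coords, backward_flow, mask)
    )


# Three locations; each entry is an (x, y) pair.
coords = [(10.0, 5.0), (3.0, 4.0), (7.0, 7.0)]
flow_back = [(10.0, 5.0), (0.0, 0.0), (7.0, 8.0)]

loss_all = cycle_loss(coords, flow_back, mask=[1.0, 1.0, 1.0])
loss_masked = cycle_loss(coords, flow_back, mask=[1.0, 0.0, 1.0])
```

The first pair agrees and contributes nothing, the second is 5 pixels apart, and the third is 1 pixel apart. Masking out the second location removes its contribution.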

In [None]:
from ransacflow.train import RANSACFlowModelStage2

ransac_flow = RANSACFlowModelStage2(alpha=0, beta=1, gamma=0, kernel_size=7, lr=2e-4)

# FIXME unify TB logging location and experiment name
trainer = Trainer(
    max_epochs=50,
    logger=TensorBoardLogger("tb_logs", name="RANSAC-Flow_stage2"),
    callbacks=[EarlyStopping(monitor="val_loss", min_delta=0.01, patience=3)],
)
# Pass the data module instance created above, not the class itself.
trainer.fit(ransac_flow, mega_depth)

In [None]:

    --nEpochs 50
    --lr 2e-4
    --kernelSize 7
    --imgSize 224
    --batchSize 16
    --lambda-match 0.0  # alpha
    --mu-cycle 1.0      # beta
    --grad 0.0          # gamma
    --trainMode flow
    --margin 88

## Stage 3
Train all three losses together: **reconstruction loss**, **cycle consistency of the flow**, and **matchability loss**.

The matchability mask can be seen as pixel-wise weights for the reconstruction and cycle consistency losses. On their own, these losses encourage the matchability to be zero, since a zero mask removes every penalty. To counteract this effect, the matchability loss encourages the matchability mask to be close to one.

FIXME equation for matchability
FIXME still unclear what matchability actually implies; how does it differ from the cycle loss?
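
The trade-off described above can be illustrated with a toy example. The sketch assumes the matchability penalty is the L1 distance of the mask from one; the actual equation is still open in this notebook, so this is only one plausible form. All function names and numbers are made up for illustration.

```python
def weighted_loss(per_pixel_loss, mask):
    """A mask-weighted data term: shrinking the mask always shrinks this term."""
    return sum(m * l for m, l in zip(mask, per_pixel_loss))


def matchability_penalty(mask):
    """Assumed form: L1 distance of the mask from one."""
    return sum(abs(m - 1.0) for m in mask)


def objective(per_pixel_loss, mask, alpha):
    """Data term plus the weighted matchability penalty."""
    return weighted_loss(per_pixel_loss, mask) + alpha * matchability_penalty(mask)


per_pixel_loss = [0.001, 0.002, 0.9]  # the third pixel cannot be matched well
candidates = {
    "all_zero": [0.0, 0.0, 0.0],   # degenerate mask
    "all_one": [1.0, 1.0, 1.0],
    "selective": [1.0, 1.0, 0.0],  # drop only the poorly matched pixel
}


def best_mask(alpha):
    """Name of the candidate mask with the lowest objective for a given alpha."""
    return min(candidates, key=lambda k: objective(per_pixel_loss, candidates[k], alpha))


best_without_penalty = best_mask(alpha=0.0)
best_with_penalty = best_mask(alpha=0.01)  # same alpha as the Stage 3 model
```

Without the penalty, the all-zero mask wins because it removes every data term. With a small `alpha`, the selective mask wins: it keeps well-matched pixels and drops only the one that cannot be explained.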

In [None]:
from ransacflow.train import RANSACFlowModelStage3

ransac_flow = RANSACFlowModelStage3(alpha=0.01, beta=1, gamma=0, kernel_size=7, lr=2e-4)

# FIXME unify TB logging location and experiment name
trainer = Trainer(
    max_epochs=50,
    logger=TensorBoardLogger("tb_logs", name="RANSAC-Flow_stage3"),
    callbacks=[EarlyStopping(monitor="val_loss", min_delta=0.01, patience=3)],
)
# Pass the data module instance created above, not the class itself.
trainer.fit(ransac_flow, mega_depth)


In [None]:
    --nEpochs 50
    --lr 2e-4
    --kernelSize 7
    --imgSize 224
    --batchSize 16
    --lambda-match 0.01  # alpha
    --mu-cycle 1.0       # beta
    --grad 0.0           # gamma
    --trainMode flow+match
    --margin 88
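
To compare the three stages at a glance, the sketch below collects the `alpha`, `beta`, and `gamma` values used in this notebook and combines them under the assumption that the total loss is the reconstruction term plus a weighted sum of the matchability, cycle, and gradient terms. The function `total_loss` and the loss values are hypothetical; the real combination lives in `ransacflow.train`.

```python
# Loss weights per stage, as passed to the model constructors in this notebook.
STAGE_WEIGHTS = {
    "stage1": {"alpha": 0.0, "beta": 0.0, "gamma": 0.0},
    "stage2": {"alpha": 0.0, "beta": 1.0, "gamma": 0.0},
    "stage3": {"alpha": 0.01, "beta": 1.0, "gamma": 0.0},
}


def total_loss(recon, match, cycle, grad, alpha, beta, gamma):
    """Assumed combination: reconstruction plus weighted auxiliary terms."""
    return recon + alpha * match + beta * cycle + gamma * grad


# Made-up loss values, only to show which terms are active in each stage.
terms = {"recon": 0.5, "match": 2.0, "cycle": 0.25, "grad": 0.1}
totals = {name: total_loss(**terms, **w) for name, w in STAGE_WEIGHTS.items()}
```

With these toy values, Stage 1 only sees the reconstruction term, Stage 2 adds the cycle term, and Stage 3 adds a small contribution from the matchability term.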


## Stage 4.1
This additional stage fine-tunes on SOMETHING MAGICAL, so that the output image has fewer distortions.

TODO need to update description from the original paper

## Stage 4.2
This additional stage uses a perceptual loss.

TODO add description about why and how to use perceptual loss