# OpenGlue Pipeline with no pre-extraction
In this pipeline, we train the model without performing feature extraction beforehand. Instead, features are extracted on the fly during training.

# Training with no pre-extraction

#### Configuration Options

See [`CONFIGURATIONS.md`](CONFIGURATIONS.md) for configuration details. Please ensure all config options are set properly prior to training. For no pre-extraction, `config/config.yaml` is used by default. If you specify a different config file in the arguments, it is merged with `config/config.yaml`; wherever the two conflict, the settings in the specified file take precedence.
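
The override behaviour described above can be sketched with plain dictionaries. This only illustrates the merge semantics and is not the code path used by `train.py`, which relies on `OmegaConf.merge`. The helper `merge_configs` and the values shown are hypothetical.

```python
from copy import deepcopy


def merge_configs(base, override):
    """Recursively merge `override` into a copy of `base`; `override` wins on conflict."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key
            merged[key] = merge_configs(merged[key], value)
        else:
            # Scalar, list, or new key: the override value replaces the base value
            merged[key] = deepcopy(value)
    return merged


# Hypothetical values, for illustration only
base = {'gpus': [0], 'train': {'epochs': 100, 'grad_clip': 10.0}}
custom = {'train': {'epochs': 5}}

merged = merge_configs(base, custom)
print(merged)
```

Only `train.epochs` changes here; every key that the custom file does not mention keeps its value from the base config.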

#### Selecting a Feature Extractor

To choose a specific feature extractor for pre-extraction or realtime extraction, specify the filepath to the config file for that extractor. The config file contains the necessary settings and, if applicable, a filepath to the model weights. Three of the extractors, the SuperPoint extractors, use pretrained models, while the others do not. It is recommended to use one of these three: superpoint_magicleap, superpoint_kitti, or superpoint_coco. They differ in the dataset that was used to train them. Two sets of default extractors are included, one in config/features/ and one in config/features_online/.<br />
In config/features/, the extractors are configured to run with a higher maximum number of keypoints. For the SuperPoint extractors, this is set to 2048 keypoints. In config/features_online/, the extractors are configured to run with a lower maximum number of keypoints, which helps reduce runtime and memory utilization. For the SuperPoint extractors, this is 1024 keypoints. <br/><br/>
For training without cached features, maximum keypoints are capped at 1024, so it is recommended to choose a feature extractor from config/features_online/.
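
As a minimal sketch of that choice, the helper below builds the config path from the extractor name and whether features are cached. `feature_config_path` is a hypothetical helper and not part of the repository. It also assumes that both directories contain a file named after each extractor, which is not confirmed here.

```python
def feature_config_path(extractor, cached_features, config_root='config'):
    """Return the config path for `extractor` (hypothetical helper).

    cached_features=True  -> config/features/        (higher keypoint limit)
    cached_features=False -> config/features_online/ (lower keypoint limit)
    """
    subdir = 'features' if cached_features else 'features_online'
    return f'{config_root}/{subdir}/{extractor}.yaml'


# Training without cached features: use the online variant
online_path = feature_config_path('superpoint_magicleap', cached_features=False)
print(online_path)
```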

In [3]:
#From train.py
'''
    Modified strategy to make compatible with interactive python runtime
    Modified input from command line via argparse to directly passed variables more suitable for jupyter
'''

import torch
import shutup

shutup.please()
import os
from datetime import datetime
from omegaconf import OmegaConf
import pytorch_lightning as pl

# Modified: DDPPlugin is deprecated, so DataParallelStrategy is imported instead
from pytorch_lightning.strategies import DataParallelStrategy

from data.megadepth_datamodule import MegaDepthPairsDataModule
from data.acrobat_affine_datamodule import AcrobatAffineDataModule
from models.matching_module import MatchingTrainingModule
from utils.train_utils import get_training_loggers, get_training_callbacks, prepare_logging_directory


def train(config_path, features_config_path):
    # Load config
    config = OmegaConf.load('config/config.yaml')  # base config
    feature_extractor_config = OmegaConf.load(features_config_path)
    if config_path != 'config/config.yaml':
        add_conf = OmegaConf.load(config_path)
        config = OmegaConf.merge(config, add_conf)

    pl.seed_everything(int(os.environ.get('LOCAL_RANK', 0)))

    # Prepare directory for logs and checkpoints
    # Added pathing option in config file to give ability to move experiment location out of log directory through 'experiments_root_path'
    # Environment variables are strings, so cast before comparing to 0
    if int(os.environ.get('LOCAL_RANK', 0)) == 0:
        experiment_name = '{}__attn_{}__laf_{}__{}'.format(
            #config['data']['experiments_root_path'],
            feature_extractor_config['name'],
            config['superglue']['attention_gnn']['attention'],
            config['superglue']['laf_to_sideinfo_method'],
            str(datetime.now().strftime("%Y-%m-%d-%H-%M-%S"))
        )
        log_path = prepare_logging_directory(config, experiment_name, features_config=feature_extractor_config)
    else:
        experiment_name, log_path = '', ''

    # Init Lightning Data Module
    data_config = config['data']
    #dm = MegaDepthPairsDataModule(
    dm = AcrobatAffineDataModule(
        root_path=data_config['root_path'],
        #train_list_path=data_config['train_list_path'],
        #val_list_path=data_config['val_list_path'],
        #test_list_path=data_config['test_list_path'],
        batch_size=data_config['batch_size_per_gpu'],
        num_workers=data_config['dataloader_workers_per_gpu'],
        #target_size=data_config['target_size'],
        resize_shape=data_config['target_size'],
        warp_offset=50,
        #val_max_pairs_per_scene=data_config['val_max_pairs_per_scene'],
        #train_pairs_overlap=data_config.get('train_pairs_overlap')
        
    )

    # Init model
    model = MatchingTrainingModule(
        train_config={**config['train'], **config['inference'], **config['evaluation']},
        features_config=feature_extractor_config,
        superglue_config=config['superglue'],
    )

    # Set callbacks and loggers
    callbacks = get_training_callbacks(config, log_path, experiment_name)
    loggers = get_training_loggers(config, log_path, experiment_name)

    # Init trainer
    # Replace accelerator='ddp' with gpu, and replace plugins=DDPPlugin() with strategy=DataParallelStrategy() to comply with
    # Pytorch lightning updates and make model compatible with interactive python runtime
    trainer = pl.Trainer(
        gpus=config['gpus'],
        max_epochs=config['train']['epochs'],
        
        # Vital: with DataParallelStrategy, using any accelerator other than 'gpu'
        # (or no accelerator) produces a model that is not usable for the inference test
        accelerator="gpu", 
        
        gradient_clip_val=config['train']['grad_clip'],
        log_every_n_steps=config['logging']['train_logs_steps'],
        limit_train_batches=config['train']['steps_per_epoch'],
        num_sanity_val_steps=0,
        callbacks=callbacks,
        logger=loggers,
        strategy=DataParallelStrategy(),
        #plugins=DDPPlugin(find_unused_parameters=False),
        precision=config['train'].get('precision', 32),
    )
    # If loaded from checkpoint - validate
    if config.get('checkpoint') is not None:
        trainer.validate(model, datamodule=dm, ckpt_path=config.get('checkpoint'))
    trainer.fit(model, datamodule=dm, ckpt_path=config.get('checkpoint'))



##### Ensure torch in interactive Python is working properly
Sometimes, torch running inside an interactive Python environment fails to recognize CUDA devices. If you have a CUDA device you expect to be detected, run this to make sure that torch finds it. If it does not display the expected number of devices, especially if it detects none, try restarting the container that the interactive environment is running in.

In [4]:
print('Number of devices found: ', torch.cuda.device_count())

Number of devices found:  1



Set the paths to the config files you would like to use for training with realtime extraction and begin training in the interactive environment below. <br />Alternatively,
<b>to launch training as a script with local feature extraction throughout training, run: </b>  
```
python train.py --config='config/config.yaml' --features_config='config/features_online/sift.yaml'
```
Note: SIFT is only an example; you may use any of the config files present or one you create yourself. The SuperPoint configs are recommended.
Running the script uses DDPStrategy (Distributed Data Parallel) rather than DataParallelStrategy. DDPStrategy is incompatible with the interactive environment, so DataParallelStrategy is used there instead.
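
The choice between the two strategies could be made automatically. The sketch below is an assumption rather than something the repository does: it detects an interactive session with a best-effort heuristic and returns only the name of the strategy to use. Both `in_interactive_session` and `choose_strategy_name` are hypothetical helpers.

```python
import sys


def in_interactive_session():
    """Best-effort check for a Jupyter/IPython kernel or a plain Python REPL."""
    return 'ipykernel' in sys.modules or hasattr(sys, 'ps1')


def choose_strategy_name(interactive):
    """Name the strategy described above: DP in notebooks, DDP for scripts."""
    return 'DataParallelStrategy' if interactive else 'DDPStrategy'


print(choose_strategy_name(in_interactive_session()))
```

The heuristic can misfire in unusual setups, so treat the result as a default that can be overridden rather than a guarantee.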

In [5]:

features_config_path = './config/features_online/superpoint_magicleap.yaml'
config_path = './config/config.yaml'
train(config_path, features_config_path)

Global seed set to 0
Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.


/host_Data/Data/MegaDepth/MegaDepth/extracted_features/phoenix3 Superpoint_960_720 SuperPointNet__attn_softmax__laf_none__2022-06-21-22-49-51
log path /host_Data/Data/MegaDepth/MegaDepth/extracted_features/phoenix3/Superpoint_960_720/SuperPointNet__attn_softmax__laf_none__2022-06-21-22-49-51
<All keys matched successfully>


[34m[1mwandb[0m: Currently logged in as: [33mjlunder[0m. Use [1m`wandb login --relogin`[0m to force relogin


GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name                     | Type                      | Params
-----------------------------------------------------------------------
0 | local_features_extractor | SuperPointNet             | 1.3 M 
1 | superglue                | SuperGlue                 | 12.0 M
2 | augmentations            | AugmentationSequential    | 0     
3 | epipolar_dist_metric     | AccuracyUsingEpipolarDist | 0     
4 | camera_pose_auc_metric   | CameraPoseAUC             | 0     
-----------------------------------------------------------------------
12.0 M    Trainable params
1.3 M     Non-trainable params
13.3 M    Total params
53.032    Total estimated model params size (MB)


Epoch 0:   0%|          | 0/1 [00:00<?, ?it/s] (0, 480.0) (360.0, 960) (720.0, 0.0)


ValueError: not enough values to unpack (expected 4, got 3)

In [1]:
%load_ext autoreload

In [2]:
%reload_ext autoreload