<a href="https://colab.research.google.com/github/laralex/Sk-DL2021-FinalProject/blob/main/notebooks/training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Signal recovery from nonlinear distortion in optical communications
Deep Learning 2021 course at Skoltech, Russia.

This quickstart notebook lets you:
* reproduce the experiments from scratch
* resume training from a checkpoint
* load only the model from a checkpoint

Google Colab is the recommended environment, but a local Jupyter installation should also work, even on CPU.

### ⚙️ Configure only here ⚙️

In [1]:
import torch

REPO_DOWNLOAD_DIR_NAME = 'DL2021-FinalProjectExperiments'
CONFIG_PATH = 'configs/GetDatasetFromGeneration.yaml' # relative to repository

ON_COLAB = True
MOUNT_GOOGLE_DRIVE = False # can't mount on Windows
GPUS_INDICES = 1 if torch.cuda.is_available() else None # None for CPU, list of indices for multiple GPUs
GOOGLE_DRIVE_DATASETS_PATH = 'Sk-DL2021-Datasets/' # relative to google drive

GIT_BRANCH = 'generation-rework'
GIT_PULL = True

In [2]:
# Checkpoint settings

# source: file id on Google Drive, or a path (absolute or relative to the current dir)
# type (optional): 'id' or 'path'; default 'path'
# action (optional): 'resume_training' or 'load_model'; default 'load_model'
# version (optional): the N of the version_N folder of the training run to resume
USE_CHECKPOINT = False
CHECKPOINT = {
    'source': '1_VZoPqrAeEyQksywBn2bQ59jKXFGO0d0', 
    'type': 'id', 
    'action': 'resume_training',
    'version': 23,
}
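
The comments in the cell above say which keys are optional and what their defaults are. As a minimal sketch of how those defaults could be applied before the dictionary is handed to training: `normalize_checkpoint` is a hypothetical helper, not a function from the repository, and how `train.main` actually treats missing keys is not shown in this notebook. The path in the example call is also made up.

```python
def normalize_checkpoint(checkpoint):
    """Hypothetical helper (not part of the repository): fill in the optional
    keys with the defaults documented above and reject unknown values."""
    if 'source' not in checkpoint:
        raise ValueError("'source' is required")
    normalized = {
        'source': checkpoint['source'],
        'type': checkpoint.get('type', 'path'),            # default: path
        'action': checkpoint.get('action', 'load_model'),  # default: load_model
        'version': checkpoint.get('version'),              # None if not given
    }
    if normalized['type'] not in ('id', 'path'):
        raise ValueError(f"unknown type: {normalized['type']!r}")
    if normalized['action'] not in ('resume_training', 'load_model'):
        raise ValueError(f"unknown action: {normalized['action']!r}")
    return normalized


# Only 'source' is given, so every optional key falls back to its default.
print(normalize_checkpoint({'source': 'checkpoints/last.ckpt'}))
```

Validating early like this would surface a typo such as `'action': 'resume'` in the configuration cell rather than deep inside the training code.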

### Destination of downloads (repository, logs, checkpoints)

In [3]:
import os
import sys
from pathlib import Path
%reload_ext autoreload
%autoreload 1

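# `!git ...` returns the command output as a list of lines. Outside a
# repository the output is not an existing path, so the check is False.
git_root = !git rev-parse --show-toplevel
already_in_repo = bool(git_root) and os.path.exists(git_root[0])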
current_dir = '/content' if ON_COLAB else '.'

if MOUNT_GOOGLE_DRIVE:
    from google.colab import drive
    drive.mount(f'{current_dir}/drive')
    repo_dir = f'{current_dir}/{REPO_DOWNLOAD_DIR_NAME}'
    root_dir = f'{current_dir}/drive/MyDrive/{REPO_DOWNLOAD_DIR_NAME}'
    datasets_dir = f'{current_dir}/drive/MyDrive/{GOOGLE_DRIVE_DATASETS_PATH}'
else:
    if already_in_repo:
        root_dir = git_root[0]
    else:
        root_dir = f'{current_dir}/{REPO_DOWNLOAD_DIR_NAME}'
    repo_dir = root_dir
    datasets_dir = None
print('Repo:', repo_dir)
print('Logs:', root_dir)
print('Pregenerated dataset:', datasets_dir)

Repo: D:/masters/courses/4-deep-learning/project/repo
Logs: D:/masters/courses/4-deep-learning/project/repo
Pregenerated dataset: None
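
The branching in the cell above decides three locations: where the repository lives, where logs and checkpoints go, and where a pregenerated dataset is read from. The same rules can be sketched as a pure function, which makes them easy to check without Colab or git. `resolve_dirs` and its parameter names are illustrative assumptions, not code from the repository.

```python
def resolve_dirs(current_dir, repo_name, mount_drive, git_root=None,
                 drive_datasets_path=None):
    """Illustrative sketch of the path rules used in the cell above.

    git_root is the repository root if the notebook already runs inside a
    clone, otherwise None.
    """
    if mount_drive:
        # Code stays on the local disk; logs and datasets go to Google Drive.
        drive_root = f'{current_dir}/drive/MyDrive'
        return {
            'repo_dir': f'{current_dir}/{repo_name}',
            'root_dir': f'{drive_root}/{repo_name}',
            'datasets_dir': f'{drive_root}/{drive_datasets_path}',
        }
    # Without Drive, logs are written next to the code and no pregenerated
    # dataset is available.
    root = git_root if git_root is not None else f'{current_dir}/{repo_name}'
    return {'repo_dir': root, 'root_dir': root, 'datasets_dir': None}


print(resolve_dirs('/content', 'DL2021-FinalProjectExperiments', True,
                   drive_datasets_path='Sk-DL2021-Datasets/'))
```

Note that only the Drive branch separates `repo_dir` from `root_dir`; in every other case the two are the same directory.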


### Downloads

In [4]:
if not already_in_repo:
    !git clone https://github.com/laralex/Sk-DL2021-FinalProject $repo_dir
%pushd $repo_dir 
!git checkout $GIT_BRANCH
if GIT_PULL:
    !git pull origin $GIT_BRANCH
%popd

# Install the dependencies only if they are missing
try:
    import pytorch_lightning
    import jsonargparse
except ImportError:
    !pip install pytorch_lightning
    !pip install jsonargparse

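if USE_CHECKPOINT and CHECKPOINT['type'] == 'id':
    # Download the checkpoint from Google Drive and switch 'source' to the local file
    checkpoint_id = CHECKPOINT['source']  # avoid shadowing the builtin `id`
    os.makedirs(f"{repo_dir}/downloads", exist_ok=True)
    download_path = f"{repo_dir}/downloads/downloaded_checkpoint.ckpt"
    !gdown --id $checkpoint_id -O $download_path
    CHECKPOINT['source'] = download_path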

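# The `train` module is imported from the repository, which differs from
# root_dir when Google Drive is mounted
sys.path.append(repo_dir)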

D:\masters\courses\4-deep-learning\project\repo
M	configs/GetDatasetFromGeneration.yaml
M	data/split_step_generator.py
M	models/dataset_debugging_model.py
M	notebooks/training.ipynb


Already on 'generation-rework'


Already up to date.
D:\masters\courses\4-deep-learning\project\repo\notebooks
popd -> D:\masters\courses\4-deep-learning\project\repo\notebooks


From https://github.com/laralex/Sk-DL2021-FinalProject
 * branch            generation-rework -> FETCH_HEAD
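
The download cell installs `pytorch_lightning` and `jsonargparse` only when importing them fails. An alternative sketch that checks for the packages without importing them uses the standard library's `importlib.util.find_spec`; the helper name `missing_packages` is an assumption made for this example.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in the
    current environment, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The result depends on the environment; an empty list means nothing
# needs to be installed.
print(missing_packages(['pytorch_lightning', 'jsonargparse']))
```

In a notebook, the returned list could then be passed to `pip install` one name at a time, so that a package that is already present is not reinstalled.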


### TensorBoard logs

In [5]:
%load_ext tensorboard
%tensorboard --logdir $root_dir/logs

Reusing TensorBoard on port 6006 (pid 7888), started 7 days, 3:38:11 ago. (Use '!kill 7888' to kill it.)
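
The `version` key of the checkpoint settings refers to a `version_N` folder of an earlier run. As a rough sketch of how to find out which versions exist, the helper below scans a log directory for such folders. It assumes that the `version_N` folders sit somewhere under the directory passed to `--logdir`; the exact layout is not shown in this notebook, and `list_versions` is a hypothetical name.

```python
import re
from pathlib import Path


def list_versions(log_dir):
    """Return the sorted numbers N of all `version_N` folders found
    anywhere under log_dir (an empty list if log_dir does not exist)."""
    root = Path(log_dir)
    if not root.is_dir():
        return []
    pattern = re.compile(r'version_(\d+)')
    versions = set()
    for path in root.rglob('version_*'):
        match = pattern.fullmatch(path.name)
        if match and path.is_dir():
            versions.add(int(match.group(1)))
    return sorted(versions)


# Example call; 'logs' stands in for the notebook's `$root_dir/logs`.
print(list_versions('logs'))
```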

### Launch training

In [6]:
%aimport train
model = train.main(
    root_dir=root_dir,
    config_path=f'{repo_dir}/{CONFIG_PATH}',
    gpu_indices=GPUS_INDICES,
    dataset_root_path=datasets_dir,
    checkpoint_kwargs=CHECKPOINT if USE_CHECKPOINT else None,
)

Global seed set to 42
GPU available: False, used: False
TPU available: False, using: 0 TPU cores


Pregenerating 4 batches



  | Name | Type | Params
------------------------------
------------------------------
1         Trainable params
0         Non-trainable params
1         Total params
0.000     Total estimated model params size (MB)


Pregenerating 0 batches


Validation sanity check: 0it [00:00, ?it/s]

val batch 319.324-0.000j
val batch -57.119+0.000j


Global seed set to 42


Training: 0it [00:00, ?it/s]

train batch 616.182-0.000j
train batch -45.599-0.000j


  x = x.to(self.memory)


train batch -661.584+0.000j


Validating: 0it [00:00, ?it/s]

val batch 319.324-0.000j
val batch -57.119+0.000j
val batch -570.359-0.000j
val batch -604.548-0.000j
train batch 730.178+0.000j
train batch 228.238-0.000j
train batch -125.495+0.000j


Validating: 0it [00:00, ?it/s]

val batch 319.324-0.000j
val batch -57.119+0.000j
val batch -570.359-0.000j
val batch -604.548-0.000j
train batch 68.519+0.000j
train batch -250.975+0.000j
train batch 1083.721+0.000j


Validating: 0it [00:00, ?it/s]

val batch 319.324-0.000j
val batch -57.119+0.000j
val batch -570.359-0.000j
val batch -604.548-0.000j
