Scopenet
========

Scopenet is a neural network model that enhances microscopy images of certain types of specimens, such as blood or tissue biopsies, so that they can be viewed without staining or inducing phototoxicity. The project is inspired by a call to action in the Journal of Cell Science, [Harnessing artificial intelligence to reduce phototoxicity in live imaging](https://journals.biologists.com/jcs/article/137/3/jcs261545/342983/Harnessing-artificial-intelligence-to-reduce).

To train this model, data was captured with an [OpenFlexure Microscope](https://openflexure.org/). Stage and illumination control are performed with the [Sangaboard](https://taulab.eu/openflexure/5-sangaboard-v5.html) attached to the microscope. We randomly select several different areas on a blood smear and, for each one, use the center 256x256 region of the frame, since this is the portion of the frame with the fewest optical aberrations. We then select the best focus and illumination, capture the frame, and label it as the "best" frame.

Next, we shift the X, Y and Z stages so that the same 256x256 region appears elsewhere in the frame, where it is out of focus and slightly less well lit. We then randomly change the illumination conditions (which may leave them unchanged) and capture a second snapshot. This gives us two 256x256 frames: one under ideal conditions and one that is blurry, noisy, and poorly lit. Scopenet is trained to predict the ideal frame from the degraded one.
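The crop and shift geometry described above can be sketched in a few lines of Python. This is only an illustration: the function names and the 832x624 frame size are hypothetical, and the actual capture scripts may compute the regions differently.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def center_crop_box(frame_width: int, frame_height: int, size: int = 256) -> Box:
    """Return the box of the centered size x size region of a frame."""
    if size > frame_width or size > frame_height:
        raise ValueError("crop size exceeds the frame dimensions")
    left = (frame_width - size) // 2
    top = (frame_height - size) // 2
    return (left, top, left + size, top + size)


def shift_box(box: Box, dx: int, dy: int, frame_width: int, frame_height: int) -> Box:
    """Return where the same region lands after the image moves by (dx, dy) pixels."""
    left, top, right, bottom = box
    shifted = (left + dx, top + dy, right + dx, bottom + dy)
    if shifted[0] < 0 or shifted[1] < 0 or shifted[2] > frame_width or shifted[3] > frame_height:
        raise ValueError("shifted region falls outside the frame")
    return shifted


# Hypothetical 832x624 frame, used only for illustration.
best_box = center_crop_box(832, 624)
degraded_box = shift_box(best_box, dx=-200, dy=100, frame_width=832, frame_height=624)
print(best_box)      # (288, 184, 544, 440)
print(degraded_box)  # (88, 284, 344, 540)
```

Cropping the first frame with `best_box` and the second frame with `degraded_box` would yield an aligned pair of 256x256 images, provided the stage shift maps to the pixel offset exactly.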

An existing dataset is hosted on Kaggle, where it can be downloaded for free. You will need a Kaggle account and an API token, which you can generate from your user settings. Place the token file at `~/.kaggle/kaggle.json` (or `%HOMEPATH%\.kaggle\kaggle.json` on Windows). For more information on how to do this, visit [kagglehub's GitHub page](https://github.com/Kaggle/kagglehub?tab=readme-ov-file#option-3-read-credentials-from-kagglejson).
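Before starting a download, it can help to confirm that the token file is in place and well formed. The sketch below is not part of Scopenet or kagglehub. `load_kaggle_credentials` is a hypothetical helper that assumes the token file is a JSON object with `username` and `key` fields, stored at `~/.kaggle/kaggle.json` by default.

```python
import json
from pathlib import Path
from typing import Dict, Optional


def load_kaggle_credentials(path: Optional[Path] = None) -> Dict[str, str]:
    """Read a Kaggle API token file and check that it has the expected fields."""
    if path is None:
        # Default location of the token file inside the user's home directory.
        path = Path.home() / ".kaggle" / "kaggle.json"
    if not path.is_file():
        raise FileNotFoundError(f"No Kaggle API token found at {path}")
    with path.open(encoding="utf-8") as token_file:
        credentials = json.load(token_file)
    missing = {"username", "key"} - set(credentials)
    if missing:
        raise ValueError(f"{path} is missing fields: {sorted(missing)}")
    return credentials
```

Calling `load_kaggle_credentials()` once at the top of a notebook reports a clear error early, instead of failing partway through the dataset download.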

Once that's done, you can train this model.

In [3]:
from scopenet.config import open_config, Config

# Open the user-defined configuration file.
# It will use defaults if the config file does not exist.
# This will also download the dataset if it hasn't already been done.
config = open_config()
print(config)

Config(device='cuda:0', dataset_root='/home/tholbert/.cache/kagglehub/datasets/tay10r/scopenet/versions/1', batch_size=16, num_epochs=100)


In [6]:
from scopenet.net import Net

# Instantiate an untrained model. It is based on U-Net, with added residual blocks in the bottleneck.
net = Net(in_channels=3)
print(net)

Net(
  (enc1): _Encoder(
    (conv): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (norm): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (enc2): _Encoder(
    (conv): Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (norm): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (enc3): _Encoder(
    (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (norm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (enc4): _Encoder(
    (conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (norm): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (enc5): _Encoder(
    (conv): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
    (norm): BatchNorm2d(256, eps=1e-05, momentum=0.1, affi
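
Each encoder stage in the printout uses a 3x3 convolution with stride 2 and padding 1, which halves the spatial size of its input. The sketch below applies the standard convolution output-size formula (with dilation 1) to a 256x256 input. The helper names are hypothetical, and only the five encoder stages visible in the printout are counted.

```python
from typing import List


def conv2d_output_size(size: int, kernel_size: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Output length along one axis of a convolution with dilation 1."""
    return (size + 2 * padding - kernel_size) // stride + 1


def encoder_sizes(input_size: int = 256, num_stages: int = 5) -> List[int]:
    """Spatial size of the input followed by the size after each encoder stage."""
    sizes = [input_size]
    for _ in range(num_stages):
        sizes.append(conv2d_output_size(sizes[-1]))
    return sizes


print(encoder_sizes())  # [256, 128, 64, 32, 16, 8]
```

Under these assumptions, a 256x256 frame is reduced to an 8x8 feature map with 256 channels after `enc5`.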

In [8]:
from pathlib import Path
from scopenet.dataset import Dataset

# Load the datasets. Each training sample is lazily loaded into memory.
# Once it is in memory, it remains there in order to speed up training.
# This means you may want to watch your system memory in case you don't have enough.
train_data = Dataset(Path(config.dataset_root) / 'train')
test_data = Dataset(Path(config.dataset_root) / 'test')
print(f'Training samples: {len(train_data)}, Test samples: {len(test_data)}')

Training samples: 640, Test samples: 320
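
The load-once-then-cache behavior described in the comments above can be illustrated with a small standalone class. This is a minimal sketch, not the actual `scopenet.dataset.Dataset` implementation: `LazyCachedDataset` and `fake_load` are hypothetical names, and a real loader would decode image files instead of returning strings.

```python
from typing import Callable, Dict, List, Sequence


class LazyCachedDataset:
    """Load each sample on first access, then serve it from memory."""

    def __init__(self, keys: Sequence[str], load: Callable[[str], object]) -> None:
        self._keys = list(keys)
        self._load = load
        self._cache: Dict[int, object] = {}

    def __len__(self) -> int:
        return len(self._keys)

    def __getitem__(self, index: int) -> object:
        if not 0 <= index < len(self._keys):
            raise IndexError(index)
        if index not in self._cache:
            # First access: run the (slow) loader and keep the result.
            self._cache[index] = self._load(self._keys[index])
        return self._cache[index]


load_calls: List[str] = []


def fake_load(key: str) -> str:
    """Stand-in for reading and decoding an image file."""
    load_calls.append(key)
    return f"pixels of {key}"


data = LazyCachedDataset(["a.png", "b.png"], fake_load)
data[0]
data[0]  # Served from the cache, so the loader is not called again.
data[1]
print(load_calls)  # ['a.png', 'b.png']
```

Because nothing is ever evicted from the cache, memory use grows until every sample has been touched once, which is why system memory is worth watching during the first epoch.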


In [10]:
import torch

# Open the compute device you'll be using to train.
device = torch.device(config.device)
print(f'Torch Device: {device}')
# Load the network onto that device.
net = net.to(device)

Torch Device: cuda:0


In [13]:
from scopenet.train import TrainingSession
from scopenet.ssim import SSIMLoss

# Instantiate a training session.
# This will optimize the model and test it every epoch.
# We use SSIM loss instead of MSE, since it captures differences in focus better.
session = TrainingSession(train_data,
                          test_data,
                          device=device,
                          model_name='scopenet',
                          batch_size=config.batch_size,
                          in_jupyter=True)
session.run_epochs(net, loss_fn=SSIMLoss(), num_epochs=5)
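
To give a feel for why SSIM responds to lost focus, the sketch below computes a simplified, global SSIM between two flattened images and turns it into a loss as `1 - SSIM`. It is not the `scopenet.ssim.SSIMLoss` implementation. SSIM is normally computed over local windows and averaged, whereas this version uses whole-image statistics. The function names are hypothetical, and pixel values are assumed to lie in [0, 1].

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float], data_range: float = 1.0,
                k1: float = 0.01, k2: float = 0.03) -> float:
    """SSIM computed from whole-image means, variances, and covariance."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    c1 = (k1 * data_range) ** 2  # Stabilizes the luminance term.
    c2 = (k2 * data_range) ** 2  # Stabilizes the contrast/structure term.
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(x: Sequence[float], y: Sequence[float]) -> float:
    """Loss that is 0 for identical images and grows as structure is lost."""
    return 1.0 - global_ssim(x, y)


sharp = [0.0, 1.0, 0.0, 1.0]    # High-contrast pattern.
blurred = [0.5, 0.5, 0.5, 0.5]  # Same mean brightness, all detail averaged away.
print(ssim_loss(sharp, sharp))
print(ssim_loss(sharp, blurred))
```

In this toy case the mean squared error between `sharp` and `blurred` is 0.25, while the SSIM loss is close to its maximum of 1, because the blurred image keeps the brightness but none of the structure.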
