<a href="https://colab.research.google.com/github/AllenInstitute/deepinterpolation/blob/master/examples/GoogleColab_example_Ophys__Finetuning_and_Inference_Workshop.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

We first install the DeepInterpolation package. This uses a branch that is optimized for the low memory available on Google Colab and relies on the pre-installed TensorFlow version.



In [None]:
!pip3 install git+https://github.com/AllenInstitute/deepinterpolation.git@fix/gpu_memory_threads

Collecting git+https://github.com/AllenInstitute/deepinterpolation.git@fix/gpu_memory_threads
  Cloning https://github.com/AllenInstitute/deepinterpolation.git (to revision fix/gpu_memory_threads) to /tmp/pip-req-build-em7oh0_5
  Running command git clone -q https://github.com/AllenInstitute/deepinterpolation.git /tmp/pip-req-build-em7oh0_5
  Running command git checkout -b fix/gpu_memory_threads --track origin/fix/gpu_memory_threads
  Switched to a new branch 'fix/gpu_memory_threads'
  Branch 'fix/gpu_memory_threads' set up to track remote branch 'fix/gpu_memory_threads' from 'origin'.
Collecting s3fs
  Downloading s3fs-2022.1.0-py3-none-any.whl (25 kB)
Collecting argschema==2.0.2
  Downloading argschema-2.0.2.tar.gz (24 kB)
Collecting mlflow==1.14.1
  Downloading mlflow-1.14.1-py3-none-any.whl (14.2 MB)
Collecting marshmallow==3.0.0rc6
  Downloading marshmallow-3.0.0rc6-py2.py3-none-any.whl (42 kB)

We import the FineTuning and Inference interfaces from the package, along with some libraries used in this notebook.

In [None]:
from deepinterpolation.cli.fine_tuning import FineTuning
from deepinterpolation.cli.inference import Inference
import os
import glob
import datetime
import h5py

We connect a local folder to a public S3 bucket that stores the Allen Brain Observatory raw movies as HDF5 files. This uses s3fs, a FUSE-based file system that exposes the bucket as a local directory.

In [None]:
!apt install s3fs
!mkdir /content/s3  
!s3fs allen-brain-observatory /content/s3 -o public_bucket=1

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages were automatically installed and are no longer required:
  cuda-command-line-tools-10-0 cuda-command-line-tools-10-1
  cuda-command-line-tools-11-0 cuda-compiler-10-0 cuda-compiler-10-1
  cuda-compiler-11-0 cuda-cuobjdump-10-0 cuda-cuobjdump-10-1
  cuda-cuobjdump-11-0 cuda-cupti-10-0 cuda-cupti-10-1 cuda-cupti-11-0
  cuda-cupti-dev-11-0 cuda-documentation-10-0 cuda-documentation-10-1
  cuda-documentation-11-0 cuda-documentation-11-1 cuda-gdb-10-0 cuda-gdb-10-1
  cuda-gdb-11-0 cuda-gpu-library-advisor-10-0 cuda-gpu-library-advisor-10-1
  cuda-libraries-10-0 cuda-libraries-10-1 cuda-libraries-11-0
  cuda-memcheck-10-0 cuda-memcheck-10-1 cuda-memcheck-11-0 cuda-nsight-10-0
  cuda-nsight-10-1 cuda-nsight-11-0 cuda-nsight-11-1 cuda-nsight-compute-10-0
  cuda-nsight-compute-10-1 cuda-nsight-compute-11-0 cuda-nsight-compute-11-1
  cuda-nsight-systems-10-1 cuda-nsight-systems-

This is the path to a single movie on S3, along with the local path where we will store a copy.

In [None]:
input_movie_path = '/content/s3/visual-coding-2p/ophys_movies/ophys_experiment_501254258.h5'
output_movie_path = '/content/ophys_experiment_501254258.h5'
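
Before copying, it can be worth confirming that the mount succeeded and that the movie is visible through it. The sketch below is a minimal check using only the standard library. `check_readable` is a hypothetical helper name and is not part of DeepInterpolation.

```python
import os


def check_readable(path):
    """Return the file size in bytes, or raise if the file is not visible.

    Hypothetical helper: it only confirms that the path exists and can be
    stat-ed, which is enough to catch a failed or empty mount.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{path} not found; check that the bucket is mounted"
        )
    return os.path.getsize(path)


# Path taken from the cell above; on Colab this reads through the s3fs mount.
input_movie_path = '/content/s3/visual-coding-2p/ophys_movies/ophys_experiment_501254258.h5'

try:
    print(f"Movie found, {check_readable(input_movie_path)} bytes")
except FileNotFoundError as error:
    print(error)
```

If the file is reported as missing, re-running the mount cell is usually the first thing to try.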

We copy a subset of the movie file so that it fits within the more limited file storage of the free Google Colab tier and provides faster local access.

In [None]:
with h5py.File(input_movie_path, 'r') as file_handle:
  data = file_handle['data'][0:5001,:,:]
  with h5py.File(output_movie_path,'w') as file_handle_out:
    file_handle_out.create_dataset('data',data=data)
  del data
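
The cell above reads the whole subset into memory before writing it out. If memory is tight, the same copy can be done in smaller pieces. The sketch below is one possible variant, not the notebook's method: `chunk_bounds` and `copy_subset` are hypothetical helpers, and the chunk size of 500 frames is an arbitrary choice.

```python
def chunk_bounds(n_frames, chunk_size):
    """Yield (start, stop) frame ranges that cover 0..n_frames in order."""
    for start in range(0, n_frames, chunk_size):
        yield start, min(start + chunk_size, n_frames)


def copy_subset(src_path, dst_path, n_frames=5001, chunk_size=500):
    """Copy the first n_frames of the 'data' dataset, one chunk at a time."""
    # Imported here so chunk_bounds can be used without h5py installed.
    import h5py

    with h5py.File(src_path, 'r') as src, h5py.File(dst_path, 'w') as dst:
        movie = src['data']
        n_frames = min(n_frames, movie.shape[0])
        out = dst.create_dataset(
            'data', shape=(n_frames,) + movie.shape[1:], dtype=movie.dtype
        )
        for start, stop in chunk_bounds(n_frames, chunk_size):
            # Only one chunk is held in memory at any time.
            out[start:stop] = movie[start:stop]


# Example of the ranges used for a 1200-frame copy in chunks of 500.
print(list(chunk_bounds(1200, 500)))  # [(0, 500), (500, 1000), (1000, 1200)]
```

Calling `copy_subset(input_movie_path, output_movie_path)` would then produce the same local file as the cell above, at the cost of more read requests to the mounted bucket.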

We download a pre-trained DeepInterpolation model that has been optimized for size. It is currently less validated than the much larger published model, but it is small enough to work well on Colab. So far our results with it are quite good.

In [None]:
!wget -O /content/2021_07_31_09_49_38_095550_unet_1024_search_mean_squared_error_pre_30_post_30_feat_32_power_1_depth_4_unet_True-0125-0.5732.h5 https://www.dropbox.com/s/ljunvnl6lvmrzy7/2021_07_31_09_49_38_095550_unet_1024_search_mean_squared_error_pre_30_post_30_feat_32_power_1_depth_4_unet_True-0125-0.5732.h5?dl=0

--2022-02-17 22:44:52--  https://www.dropbox.com/s/ljunvnl6lvmrzy7/2021_07_31_09_49_38_095550_unet_1024_search_mean_squared_error_pre_30_post_30_feat_32_power_1_depth_4_unet_True-0125-0.5732.h5?dl=0
Resolving www.dropbox.com (www.dropbox.com)... 162.125.3.18, 2620:100:6018:18::a27d:312
Connecting to www.dropbox.com (www.dropbox.com)|162.125.3.18|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /s/raw/ljunvnl6lvmrzy7/2021_07_31_09_49_38_095550_unet_1024_search_mean_squared_error_pre_30_post_30_feat_32_power_1_depth_4_unet_True-0125-0.5732.h5 [following]
--2022-02-17 22:44:53--  https://www.dropbox.com/s/raw/ljunvnl6lvmrzy7/2021_07_31_09_49_38_095550_unet_1024_search_mean_squared_error_pre_30_post_30_feat_32_power_1_depth_4_unet_True-0125-0.5732.h5
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc5f999eb7b909ff7e6538744465.dl.dropboxusercontent.com/cd/0/inline/Bf4aJQWrDm5lRO
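
A download from a shared link can occasionally save an HTML page instead of the model file. One quick sanity check is to look for the 8-byte HDF5 signature at the start of the file. The sketch below uses only the standard library. `looks_like_hdf5` is a hypothetical helper, and it only checks offset 0, which is where the signature sits for files without a user block.

```python
import os

# The HDF5 format signature, as defined by the HDF5 file format specification.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file starts with the 8-byte HDF5 signature."""
    with open(path, "rb") as handle:
        return handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


# Path used by the wget command above.
model_path = '/content/2021_07_31_09_49_38_095550_unet_1024_search_mean_squared_error_pre_30_post_30_feat_32_power_1_depth_4_unet_True-0125-0.5732.h5'

if os.path.exists(model_path):
    print("Valid HDF5 header:", looks_like_hdf5(model_path))
else:
    print("Model file not found; run the download cell first")
```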

We initialize training objects.

In [None]:
# Initialize meta-parameters objects
finetuning_params = {}
generator_param = {}
generator_test_param = {}

input_movie_path = '/content/ophys_experiment_501254258.h5'

# We recommend 10,000 frames when fine-tuning on new files. Here we use fewer to limit computation time for the workshop, but this notebook can handle more.
nb_frame_training = 500
input_model_path = '/content/2021_07_31_09_49_38_095550_unet_1024_search_mean_squared_error_pre_30_post_30_feat_32_power_1_depth_4_unet_True-0125-0.5732.h5'
output_dir = '/content/output_folder'

# These are the parameters of the training generator, followed by those of
# the validation (test) generator. Here validation is done on the beginning
# of the data, but it can use a separate file.
generator_param["name"] = "OphysGenerator"  # Name of object (use SingleTifGenerator for tiff files)
generator_param["pre_frame"] = 30
generator_param["post_frame"] = 30
generator_param["data_path"] = input_movie_path
generator_param["batch_size"] = 1 # This is small because Colab GPUs have limited memory. Increase it on better cards.
generator_param["start_frame"] = 0
generator_param["end_frame"] = -1
generator_param["total_samples"] = nb_frame_training
generator_param["pre_post_omission"] = 0  # Number of frames omitted before and after the predicted frame

generator_test_param["name"] = "OphysGenerator"  # Name of object (use SingleTifGenerator for single tiff files or MultiContinuousTifGenerator for an ordered series of tiffs)
generator_test_param["pre_frame"] = 30
generator_test_param["post_frame"] = 30
generator_test_param["data_path"] = input_movie_path
generator_test_param["batch_size"] = 1
generator_test_param["start_frame"] = 0
generator_test_param["end_frame"] = -1
generator_test_param["total_samples"] = 100  # This is used to measure validation loss
generator_test_param["pre_post_omission"] = 0  # Number of frames omitted before and after the predicted frame


# These are the parameters used for the training process
finetuning_params["name"] = "transfer_trainer"

# Change this path to any model you wish to improve
local_path = input_model_path
finetuning_params["model_source"] = {
  "local_path": local_path
}

# An epoch is defined as the number of batches pulled from the dataset
# before the validation loss is measured. It mostly serves to track performance.
# Because our datasets are VERY large, we often cannot go through all of
# the data, so we define an epoch slightly differently than usual.
steps_per_epoch = 200
finetuning_params["steps_per_epoch"] = steps_per_epoch
# The model is saved at a regular interval of epochs during training,
# which is useful for going back to intermediate models.
finetuning_params["period_save"] = 25

finetuning_params["learning_rate"] = 0.0001
finetuning_params["loss"] = "mean_squared_error"
finetuning_params["output_dir"] = output_dir

# These are not needed when working with local files, so we turn them off.
finetuning_params["use_multiprocessing"] = False
finetuning_params["caching_validation"] = False

args = {
  "finetuning_params": finetuning_params,
  "generator_params": generator_param,
  "test_generator_params": generator_test_param,
  "output_full_args": True
}

finetuning_obj = FineTuning(input_data=args, args=[])

print("Starting fine-tuning")

finetuning_obj.run()

print("Fine-tuning finished")




INFO:FineTuning:wrote /content/output_folder/2022_02_17_22_43_training_full_args.json
INFO:FineTuning:wrote /content/output_folder/2022_02_17_22_43_finetuning.json
INFO:FineTuning:wrote /content/output_folder/2022_02_17_22_43_generator.json
INFO:FineTuning:wrote /content/output_folder/2022_02_17_22_43_test_generator.json


Starting fine-tuning


  super(RMSprop, self).__init__(name, **kwargs)




INFO:FineTuning:created objects for training


Epoch 1/2
Epoch 2/2


INFO:FineTuning:fine tuning job finished - finalizing output model


Saved model to disk
Fine-tuning finished
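
The log above shows two epochs. That is consistent with the settings of the fine-tuning cell if the trainer derives the epoch count as `total_samples` divided by `batch_size` times `steps_per_epoch`, rounded down. The sketch below only reproduces that arithmetic. The formula is an assumption inferred from the log, not taken from the DeepInterpolation source, and `estimate_epochs` is a hypothetical helper.

```python
def estimate_epochs(total_samples, batch_size, steps_per_epoch):
    """Number of whole epochs that total_samples can fill (assumed formula)."""
    samples_per_epoch = batch_size * steps_per_epoch
    return total_samples // samples_per_epoch


# Values from the fine-tuning cell: 500 samples, batch size 1, 200 steps per epoch.
print(estimate_epochs(500, 1, 200))    # 2, matching "Epoch 2/2" in the log

# With the recommended 10,000 frames and the same settings.
print(estimate_epochs(10000, 1, 200))  # 50
```

Under this assumption, raising `nb_frame_training` is the simplest way to train for longer without changing how often the validation loss is measured.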


In [None]:
print("Preparing data for inference")
# Initialize meta-parameters objects
inference_param = {}

# We are reusing the data generator for training here.
generator_param["start_frame"] = 0
generator_param["end_frame"] = 200


# This is the name of the underlying inference class (keep this spelling, the library expects it)
inference_param["name"] = "core_inferrence"

# Where the output of the previous training is stored
local_path = glob.glob(os.path.join(output_dir, "*_transfer_model.h5"))[0]

inference_param["model_source"] = {
  "local_path": local_path
}

base_file = os.path.splitext(os.path.basename(input_movie_path))[0]

unique_time = str(datetime.datetime.now()).replace(".","-").replace(":","-").replace(" ","-")

# Replace this path with the location where you want to store your output file
inference_param["output_file"] = output_dir+'/'+base_file+'-denoised-on-'+unique_time+'.h5'

# This option adds blank frames at the start and end of the output movie,
# where output frames cannot be predicted because they lack enough
# preceding or following input frames to go through the model.
inference_param["output_padding"] = True

# This optional parameter converts the output data to a given
# precision. Read the CLI documentation for more details, available through
# 'python -m deepinterpolation.cli.inference --help'
inference_param["output_datatype"] = 'uint16'

args = {
  "generator_params": generator_param,
  "inference_params": inference_param,
  "output_full_args": True
}

inference_obj = Inference(input_data=args, args=[])

print("Starting inference")
inference_obj.run()

print("Inference finished")

INFO:root:randomize should be set to False for inference.                         Overriding the parameter
INFO:Inference:wrote /content/output_folder/inference_full_args.json
INFO:Inference:wrote /content/output_folder/2022_02_17_22_43_inference.json
INFO:Inference:wrote /content/output_folder/2022_02_17_22_43_generator.json


Preparing data for inference
Starting inference


INFO:Inference:created objects for inference


Inference finished
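
Each output frame is predicted from the `pre_frame` frames before it and the `post_frame` frames after it, so frames too close to either end of the selected range have no complete context. The sketch below counts how many frames do have full context for the settings used above. The exact index convention (for example, whether `end_frame` is inclusive) is defined by the library and is an assumption here, and `predictable_range` is a hypothetical helper.

```python
def predictable_range(start_frame, end_frame, pre_frame, post_frame,
                      pre_post_omission=0):
    """Return (first, last) frame indices that have full context on both sides.

    Assumes end_frame is the last usable input frame (inclusive).
    """
    margin_before = pre_frame + pre_post_omission
    margin_after = post_frame + pre_post_omission
    first = start_frame + margin_before
    last = end_frame - margin_after
    if last < first:
        raise ValueError("Range too short for the requested context window")
    return first, last


# Settings from the inference cell: frames 0 to 200, 30 frames on each side.
first, last = predictable_range(0, 200, 30, 30)
print(f"Frames {first} to {last} have full context ({last - first + 1} frames)")
```

With `output_padding` enabled, the remaining positions at the start and end of the output movie are filled with blank frames so that output and input frame indices stay aligned.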
