diff --git a/11_deep_learning/01-Image-Restoration.ipynb b/11_deep_learning/01-Image-Restoration.ipynb
new file mode 100644
index 0000000..0364a8a
--- /dev/null
+++ b/11_deep_learning/01-Image-Restoration.ipynb
@@ -0,0 +1,395 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### This notebook is adapted from **https://github.com/juglab/n2v**\n",
+    "In this notebook, we will denoise some scanning electron microscopy (SEM) images using an approach called Noise2Void. \n",
+    "Through this notebook, you will see the complete workflow of a DL approach: (i) data preparation, followed by (ii) training the model, and finally (iii) applying the trained model to the test microscopy images. \n",
+    "You will find some questions (indicated as `Q`s) for discussion in blue boxes in this notebook :) \n",
+    "
\n", + "Prior to running this notebook, you need to have the correct environment configured. \n", + "\n", + "#### If not running on google colab\n", + "Open a fresh terminal window and run the following commands:\n", + "\n", + ">conda create -n 'dl-biapol' python=3.7 \n", + "conda activate dl-biapol \n", + "pip install tensorflow-gpu=2.4.1 keras=2.3.1 n2v jupyter scikit-image gputools\n", + "\n", + "Finally open this notebook using `jupyter notebook`\n", + "\n", + "#### If running on google colab\n", + "Go to `File>Upload Notebook` and drag and drop this notebook. \n", + "Go to `Runtime > Change Runtime Type > Hardware Accelerator = GPU` \n", + "Create an empty cell following this one and run:\n", + ">!pip install tensorflow-gpu==2.4.1 keras==2.3.1 n2v scikit-image gputools\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Get imports" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">We import all our dependencies." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from n2v.models import N2VConfig, N2V\n", + "import numpy as np\n", + "from csbdeep.utils import plot_history\n", + "from n2v.utils.n2v_utils import manipulate_val_data\n", + "from n2v.internals.N2V_DataGenerator import N2V_DataGenerator\n", + "from matplotlib import pyplot as plt\n", + "from tifffile import imread\n", + "import urllib, os, zipfile, ssl\n", + "ssl._create_default_https_context = ssl._create_unverified_context" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Download example data" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">Data by Reza Shahidi and Gaspar Jekely, Living Systems Institute, Exeter. \n", + "You could try opening the train.tif and validation.tif in Fiji." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# create a folder for our data.\n", + "if not os.path.isdir('./data'):\n", + " os.mkdir('./data')\n", + "\n", + "# check if data has been downloaded already\n", + "zipPath=\"data/SEM.zip\"\n", + "if not os.path.exists(zipPath):\n", + " #download and unzip data\n", + " data = urllib.request.urlretrieve('https://download.fht.org/jug/n2v/SEM.zip', zipPath)\n", + " with zipfile.ZipFile(zipPath, 'r') as zip_ref:\n", + " zip_ref.extractall(\"data\")\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Training Data Preparation" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">For training we load one set of low-SNR images and use the `N2V_DataGenerator` to extract training `X` and validation `X_val` patches. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "datagen = N2V_DataGenerator()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">We load all the `.tif` files from the `data` directory. \n", + "The function will return a list of images (numpy arrays)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "imgs = datagen.load_imgs_from_directory(directory = \"data/\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">Let us look at the images. \n", + "We have to remove the added extra dimensions to display them as `2D` images." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "plt.imshow(imgs[0][0,...,0], cmap='magma')\n", + "plt.show()\n", + "\n", + "plt.imshow(imgs[1][0,...,0], cmap='magma')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">We will use the first image to extract training patches and store them in `X`. \n", + "We will use the second image to extract validation patches." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Training" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "patch_shape = (96,96)\n", + "X = datagen.generate_patches_from_list(imgs[:1], shape=patch_shape)\n", + "X_val = datagen.generate_patches_from_list(imgs[1:], shape=patch_shape)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
Q: Why do you think the input images are chopped into smaller patches?
\n", + " What could be different schemes for extracting patches? \n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">Using `N2VConfig` we specify some training parameters. \n", + "For example, `train_epochs` is set to $20$ to indicate that training runs for $20$ epochs. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "config = N2VConfig(X, unet_kern_size=3, \n", + " train_steps_per_epoch=int(X.shape[0]/128), train_epochs=20, train_loss='mse', batch_norm=True, \n", + " train_batch_size=128, n2v_perc_pix=0.198, n2v_patch_shape=(64, 64), \n", + " n2v_manipulator='uniform_withCP', n2v_neighborhood_radius=5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">`model_name` is used to identify the model. \n", + "`basedir` is used to specify where the trained weights are saved. \n", + "We shall now create our network model by using the `N2V` method." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
Q: Can you identify the default learning rate used while training the N2V model?
\n", + " HINT Pressing Shift + Tab shows the docstring for a given function\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
Q: Is it advantageous to set a high batch size? How about a low batch size?
\n", + "Also, discuss what would setting BatchNorm = True imply? \n", + "
\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "model_name = 'n2v_2D'\n", + "basedir = 'models'\n", + "model = N2V(config, model_name, basedir=basedir)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">Running `model.train` will begin the training for $20$ epochs. \n", + "Training the model will likely take some time. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "history = model.train(X, X_val)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
Q: What do you think is the difference between n2v_mse and n2v_abs?
\n", + "Also, the last line which is printed out is Loading network weights from 'weights_best.h5'. What defines the best state of the model?\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Inference" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">We load the data we want to process within `input_val`. \n", + "The parameter `n_tiles` can be used if images are to big for the GPU memory. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "input_val = imread('data/validation.tif')\n", + "pred_val = model.predict(input_val, axes='YX')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + ">Let's see results on the validation data. (First we look at the complete image and then we look at a zoomed-in view of the image)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "plt.figure(figsize=(16,8))\n", + "plt.subplot(1,2,1)\n", + "plt.imshow(input_val,cmap=\"magma\")\n", + "plt.title('Input');\n", + "plt.subplot(1,2,2)\n", + "plt.imshow(pred_val,cmap=\"magma\")\n", + "plt.title('Prediction');" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "plt.figure(figsize=(16,8))\n", + "plt.subplot(1,2,1)\n", + "plt.imshow(input_val[200:300, 200:300],cmap=\"magma\")\n", + "plt.title('Input - Zoomed in');\n", + "plt.subplot(1,2,2)\n", + "plt.imshow(pred_val[200:300, 200:300],cmap=\"magma\")\n", + "plt.title('Prediction - Zoomed in');" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "
Q: We demonstrated microscopy image denoising with this N2V notebook, where we first trained a model and then applied it to validation images.
\n", + " Would you call this a supervised learning approach, an unsupervised learning approach or something else? Discuss! :)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.13" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": true, + "sideBar": false, + "skip_h1_title": false, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": {}, + "toc_section_display": true, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/11_deep_learning/02-Image-Semantic-Segmentation.ipynb b/11_deep_learning/02-Image-Semantic-Segmentation.ipynb new file mode 100644 index 0000000..bf8a876 --- /dev/null +++ b/11_deep_learning/02-Image-Semantic-Segmentation.ipynb @@ -0,0 +1,433 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "df043420", + "metadata": {}, + "source": [ + "### This notebook is adapted from https://github.com/dl4mia/04_instance_segmentation/blob/main/1_semantic_segmentation_2D.ipynb" + ] + }, + { + "cell_type": "markdown", + "id": "f42b61ff", + "metadata": {}, + "source": [ + "In this notebook, we will perform pixel-wise segmentation or semantic segmentation on some microscopy images using a standard model architecture called the U-Net. \n", + "Semantic segmentation means, we aim to assign every pixel of the input image one of several different classes (background, cell interior, cell boundary) without distinguishing objects of the same class.\n", + "You will find some questions (indicated as `Q`s) for discussion in blue boxes in this notebook :) \n", + "
\n", + "Prior to running this notebook, you need to have the correct environment configured. \n", + "\n", + "#### If not running on google colab\n", + "Open a fresh terminal window and run the following commands:\n", + "\n", + ">conda create -n 'dl-biapol' python=3.7 \n", + "conda activate dl-biapol \n", + "pip install tensorflow-gpu=2.4.1 keras=2.3.1 n2v jupyter scikit-image gputools\n", + "\n", + "Finally open this notebook using `jupyter notebook`\n", + "\n", + "#### If running on google colab\n", + "Go to `File>Upload Notebook` and drag and drop this notebook. \n", + "Go to `Runtime > Change Runtime Type > Hardware Accelerator = GPU` \n", + "Create an empty cell following this one and run:\n", + ">!pip install tensorflow-gpu==2.4.1 keras==2.3.1 n2v scikit-image gputools\n" + ] + }, + { + "cell_type": "markdown", + "id": "3c078f0c", + "metadata": {}, + "source": [ + "### Get Dependencies" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "88cf29aa", + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import matplotlib\n", + "matplotlib.rcParams[\"image.interpolation\"] = None\n", + "import matplotlib.pyplot as plt\n", + "%matplotlib inline\n", + "%config InlineBackend.figure_format = 'retina'\n", + "\n", + "from glob import glob\n", + "from tqdm import tqdm\n", + "from datetime import datetime\n", + "from tifffile import imread\n", + "from pathlib import Path\n", + "import skimage\n", + "from skimage.segmentation import find_boundaries\n", + "import tensorflow as tf\n", + "from csbdeep.internals.nets import common_unet, custom_unet\n", + "from csbdeep.internals.blocks import unet_block, resnet_block" + ] + }, + { + "cell_type": "markdown", + "id": "22059655", + "metadata": {}, + "source": [ + "### Data" + ] + }, + { + "cell_type": "markdown", + "id": "62cdd825", + "metadata": {}, + "source": [ + "> First we download some sample images and corresponding masks" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0b96ab62", + "metadata": {}, + "outputs": [], + "source": [ + "from csbdeep.utils import download_and_extract_zip_file, normalize\n", + "\n", + "download_and_extract_zip_file(\n", + " url = 'https://github.com/mpicbg-csbd/stardist/releases/download/0.1.0/dsb2018.zip',\n", + " targetdir = 'data',\n", + " verbose = 1,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "57af5782", + "metadata": {}, + "source": [ + "> Next we load the data, generate from the annotation masks background/foreground/cell border masks, and crop out a central patch (this is just for simplicity, as it makes our life a bit easier when all images have the same shape)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "652b1f9c", + "metadata": {}, + "outputs": [], + "source": [ + "def crop(u,shape=(256,256)):\n", + " \"\"\"Crop central region of given shape\"\"\"\n", + " return u[tuple(slice((s-m)//2,(s-m)//2+m) for s,m in zip(u.shape,shape))]\n", + "\n", + "def to_3class_label(lbl, onehot=True):\n", + " \"\"\"Convert instance labeling to background/inner/outer mask\"\"\"\n", + " b = find_boundaries(lbl,mode='outer')\n", + " res = (lbl>0).astype(np.uint8)\n", + " res[b] = 2\n", + " if onehot:\n", + " res = tf.keras.utils.to_categorical(res,num_classes=3).reshape(lbl.shape+(3,))\n", + " return res\n", + "\n", + "# load and crop out central patch (for simplicity)\n", + "X = [normalize(crop(imread(x))) for x in sorted(glob('data/dsb2018/train/images/*.tif'))]\n", + "Y = [to_3class_label(crop(imread(y))) for y in 
sorted(glob('data/dsb2018/train/masks/*.tif'))]\n", + "\n", + "# convert to numpy arrays\n", + "X, Y = np.expand_dims(np.stack(X),-1), np.stack(Y)\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "d92f9390", + "metadata": {}, + "source": [ + "
Q: What does the onehot parameter in the to_3class_label function imply?
\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "8380363d", + "metadata": {}, + "source": [ + "> Plot an example image" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4be38ec6", + "metadata": {}, + "outputs": [], + "source": [ + "i = 3\n", + "fig, (a0,a1) = plt.subplots(1,2,figsize=(15,5))\n", + "a0.imshow(X[i,...,0],cmap='gray'); \n", + "a0.set_title('input image')\n", + "a1.imshow(Y[i]); \n", + "a1.set_title('segmentation mask')\n", + "fig.suptitle(\"Example\")\n", + "None;\n" + ] + }, + { + "cell_type": "markdown", + "id": "04790b83", + "metadata": {}, + "source": [ + "
Q: Plot some more images. What kind of data is shown? How variable is it? Do the segmentation masks look reasonable?
\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "d5218dc1", + "metadata": {}, + "source": [ + "> We now split the training data into ~ 80/20 training and validation data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "12418fa0", + "metadata": {}, + "outputs": [], + "source": [ + "from csbdeep.data import shuffle_inplace\n", + "\n", + "# shuffle data\n", + "shuffle_inplace(X, Y, seed=0)\n", + "\n", + "# split into 80% training and 20% validation images\n", + "n_val = len(X) // 5\n", + "def split_train_val(a):\n", + " return a[:-n_val], a[-n_val:]\n", + "X_train, X_val = split_train_val(X)\n", + "Y_train, Y_val = split_train_val(Y)\n", + "\n", + "print(f'training data: {len(X_train)} images and {len(Y_train)} masks')\n", + "print(f'validation data: {len(X_val)} images and {len(Y_val)} masks')\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "bee6ea5e", + "metadata": {}, + "source": [ + "> We now will construct a very simple 3-class segmentation model, for which we will use a UNet" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f7d22f6f", + "metadata": {}, + "outputs": [], + "source": [ + "from csbdeep.internals.nets import custom_unet\n", + "model = custom_unet(input_shape=(None,None,1), n_channel_out=3, kernel_size=(3,3), pool_size=(2,2), \n", + " n_filter_base=32, last_activation='softmax')\n", + "\n", + "model.summary()" + ] + }, + { + "cell_type": "markdown", + "id": "3b608d93", + "metadata": {}, + "source": [ + "
Q: 1) What is the intuition behind the skip connections in a U-Net?
\n", + "2) How many parameters exist in this 3-class U-Net? (HINT: Look at the model summary)
\n", + "3) Why is the last_activation softmax?\n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "9f4cd8c3", + "metadata": {}, + "source": [ + "### Training the model" + ] + }, + { + "cell_type": "markdown", + "id": "0efb7625", + "metadata": {}, + "source": [ + "> We now will compile the model, i.e. deciding on a loss function and an optimizer. \n", + "As we have a classification task with multiple output classes, we will use a simple categorical_crossentropy loss as loss function. \n", + "Furthermore, Adam with the a learning rate on the order of 1e-4 - 1e-3 is a safe default (General reading tip: http://karpathy.github.io/2019/04/25/recipe/ :)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e4a8f340", + "metadata": {}, + "outputs": [], + "source": [ + "model.compile(loss=tf.keras.losses.categorical_crossentropy, optimizer=tf.keras.optimizers.Adam(learning_rate=3e-4))" + ] + }, + { + "cell_type": "markdown", + "id": "df168f33", + "metadata": {}, + "source": [ + "> Before we train the model, we define some callbacks that will monitor the training loss etc" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a409f9b8", + "metadata": {}, + "outputs": [], + "source": [ + "from csbdeep.utils.tf import CARETensorBoardImage\n", + "\n", + "timestamp = datetime.now().strftime(\"%d-%H:%M:%S\")\n", + "logdir = Path(f'models/1_semantic_segmentation_2D/{timestamp}')\n", + "logdir.mkdir(parents=True, exist_ok=True)\n", + "callbacks = []\n", + "callbacks.append(tf.keras.callbacks.TensorBoard(log_dir=logdir))\n", + "callbacks.append(CARETensorBoardImage(model=model, data=(X_val,Y_val),\n", + " log_dir=logdir/'images',\n", + " n_images=3))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4975ba35", + "metadata": {}, + "outputs": [], + "source": [ + "model.fit(X_train, Y_train, validation_data=(X_val,Y_val),\n", + " epochs=100, callbacks=callbacks, verbose=1, batch_size=16)" + ] + }, + { + "cell_type": "markdown", + "id": "8aa07946", + "metadata": {}, + "source": [ + "### Prediction on unseen data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7ab81b7d", + "metadata": {}, + "outputs": [], + "source": [ + "i=1\n", + "\n", + "img = X_val[i,..., 0]\n", + "mask = Y_val[i]\n", + "plt.imshow(img)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5445d7b6", + "metadata": {}, + "outputs": [], + "source": [ + "mask_pred = model.predict(img[np.newaxis,...,np.newaxis])[0]\n", + "mask_pred.shape" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "51684a40", + "metadata": {}, + "outputs": [], + "source": [ + "from skimage.measure import label\n", + "\n", + "# threshold inner (green) and find connected components\n", + "lbl_pred = label(mask_pred[...,1] > 0.7)\n", + "\n", + "fig, ((a0,a1),(b0,b1)) = plt.subplots(2,2,figsize=(15,10))\n", + "a0.imshow(img,cmap='gray'); \n", + "a0.set_title('input image')\n", + "a1.imshow(mask); \n", + "a1.set_title('GT segmentation mask')\n", + "b0.axis('off')\n", + "b0.imshow(lbl_pred,cmap='tab20'); \n", + "b0.set_title('label image (prediction)')\n", + "b1.imshow(mask_pred); \n", + "b1.set_title('segmentation mask (prediction)')\n", + "fig.suptitle(\"Example\")\n", + "None;" + ] + }, + { + "cell_type": "markdown", + "id": "391387aa", + "metadata": {}, + "source": [ + "
Q: Can you spot the mistakes in the label image?
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a9d251f8", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f1753517", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.13" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": true, + "sideBar": false, + "skip_h1_title": false, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": {}, + "toc_section_display": true, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}