<a href="https://colab.research.google.com/github/matjesg/deepflash2/blob/master/paper/4_experts_vs_uncertainties.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# deepflash2 - Relationship between uncertainty and expert agreement

> This notebook reproduces the results of the deepflash2 [paper](https://arxiv.org/abs/2111.06693) for the relationship between pixel-wise uncertainty and expert agreement.

- **Data and models**: Data and trained models are available on [Google Drive](https://drive.google.com/drive/folders/1r9AqP9qW9JThbMIvT0jhoA5mPxWEeIjs?usp=sharing). To use the data in Google Colab, create a [shortcut](https://support.google.com/drive/answer/9700156?hl=en&co=GENIE.Platform%3DDesktop) of the data folder in your personal Google Drive.

*Source files created with this notebook*:

`experts_vs_uncertainties.csv`

*References*:

Griebel, M., Segebarth, D., Stein, N., Schukraft, N., Tovote, P., Blum, R., & Flath, C. M. (2021). Deep-learning in the bioimaging wild: Handling ambiguous data with deepflash2. arXiv preprint arXiv:2111.06693.


## Setup

- Install dependencies
- Connect to Google Drive

In [None]:
!pip install deepflash2

In [None]:
# Imports
import numpy as np
import pandas as pd
from pathlib import Path
import zarr
from deepflash2.all import *
from deepflash2.data import _read_msk

In [None]:
# Connect to drive
from google.colab import drive
drive.mount('/gdrive')

Drive already mounted at /gdrive; to attempt to forcibly remount, call drive.mount("/gdrive", force_remount=True).


## Settings

In [None]:
# Datasets to evaluate (one test set and one model ensemble each)
DATASETS = ['PV_in_HC', 'cFOS_in_HC', 'mScarlet_in_PAG', 'YFP_in_CTX', 'GFAP_in_HC']
# Local output and model directories on the Colab VM
OUTPUT_PATH = Path("/content")
DATA_PATH = Path('/gdrive/MyDrive/deepflash2-paper/data')
TRAINED_MODEL_PATH = Path("/content/trained_models")
URL_MODEL_LIBRARY = 'https://github.com/matjesg/deepflash2/releases/download/model_library'
MODEL_NO = '1'
# 26 equally spaced edges in [0, 1], i.e. 25 uncertainty bins
UNCERTAINTY_BINS = np.linspace(0, 1, 26)
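
The bin edges above define 25 bins of width 0.04. The following minimal sketch shows how `np.digitize` maps values to these bins; the names `edges`, `samples`, and `bin_idx` are illustrative and not part of the notebook. Note that a value exactly equal to 1.0 receives index 26, which lies outside the `range(1, len(UNCERTAINTY_BINS))` used in the analysis. Whether such values occur depends on the range of the uncertainty maps, which is not stated here.

```python
import numpy as np

# Same edges as UNCERTAINTY_BINS in the settings cell
edges = np.linspace(0, 1, 26)

n_bins = len(edges) - 1      # number of bins between the 26 edges
width = edges[1] - edges[0]  # width of each bin

# np.digitize returns i such that edges[i-1] <= x < edges[i]
samples = np.array([0.0, 0.03, 0.5, 0.999, 1.0])
bin_idx = np.digitize(samples, edges)

print(n_bins, round(width, 2), bin_idx.tolist())
```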

## Analysis

1. Predict segmentations and uncertainties on the test set
2. Calculate expert agreement from the expert segmentations
3. Postprocess results 

See `deepflash2_figures-and-tables.ipynb` for plots of the data.
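
The per-bin statistics computed in the analysis can be illustrated on toy data. This is a sketch under stated assumptions, not the notebook's implementation: `bin_statistics` is a hypothetical helper, the prediction is assumed to be a binary mask, and the expert average is assumed to lie in [0, 1]. Under these assumptions the "soft" error is exactly 0 or 1 only where all experts assign the same label, which is how expert agreement is counted.

```python
import numpy as np

def bin_statistics(pred, expert_mean, unc, edges):
    """Per-bin soft error rate and share of pixels with unanimous experts.

    Hypothetical helper for illustration; not part of deepflash2.
    """
    error = np.abs(pred - expert_mean).ravel()
    # Assumes a binary prediction: error is 0 or 1 only if experts are unanimous
    unanimous = (error == 0) | (error == 1)
    idx = np.digitize(unc.ravel(), edges)
    error_rate, agreement = [], []
    for i in range(1, len(edges)):
        in_bin = idx == i
        if in_bin.any():
            error_rate.append(error[in_bin].mean())
            agreement.append(unanimous[in_bin].mean())
        else:
            # Empty bin: report NaN explicitly
            error_rate.append(np.nan)
            agreement.append(np.nan)
    return np.array(error_rate), np.array(agreement)

# Toy data: four pixels, two uncertainty bins
pred = np.array([0, 1, 1, 0])
expert_mean = np.array([0.0, 1.0, 0.6, 0.4])
unc = np.array([0.1, 0.2, 0.7, 0.8])
edges = np.array([0.0, 0.5, 1.0])

err, agr = bin_statistics(pred, expert_mean, unc, edges)
print(err, agr)
```

In the toy data, the two low-uncertainty pixels have unanimous experts and no error, while the two high-uncertainty pixels have split expert votes and a soft error of 0.4.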

In [None]:
result_list = []
for dataset in DATASETS:

  # Set data path
  test_data_path = DATA_PATH/dataset/'test'

  # Download pretrained model ensemble
  ensemble_name = f'{dataset}_ensemble_{MODEL_NO}.pt'
  ensemble_trained_dir = TRAINED_MODEL_PATH/dataset 
  ensemble_trained_dir.mkdir(exist_ok=True, parents=True)
  ensemble_trained_path = ensemble_trained_dir/ensemble_name
  !wget -O {ensemble_trained_path.as_posix()} {URL_MODEL_LIBRARY}/{ensemble_name}
  
  # Create predictor instance
  ep = EnsemblePredictor('images', path=test_data_path, ensemble_path=ensemble_trained_path) 
  
  # Predict and save semantic segmentation masks
  ep.get_ensemble_results()
  
  # Load expert masks      
  gt_est = GTEstimator(exp_dir='masks_experts', path=test_data_path)
  exp_averages = {} 
  for m, exps in gt_est.masks.items():
    file_id = m.split('_')[0]
    exp_masks = [_read_msk(gt_est.mask_fn(exp,m), instance_labels=gt_est.instance_labels) for exp in exps]
    exp_averages[file_id] = np.mean(exp_masks, axis=0)

  for idx, r in ep.df_ens.iterrows():
    file_id = r.file.split('.')[0]

    # Get prediction from softmax
    pred = ep.g_pred[r.file][:]

    # Get uncertainty maps
    unc = ep.g_std[r.file][:]

    # Get expert average annotations
    exp_average = exp_averages[file_id]

    # Calculate "soft" error map
    error_map = np.abs(pred-exp_average)

    # Assign each pixel to an uncertainty bin (flatten once, reuse below)
    error_flat = error_map.flatten()
    digitized = np.digitize(unc.flatten(), UNCERTAINTY_BINS)

    # Calculate error means (error rate) per bin
    error_means = [error_flat[digitized == i].mean() for i in range(1, len(UNCERTAINTY_BINS))]

    # Calculate expert agreement per bin: share of pixels with an error of exactly 0 or 1
    expert_agreement = []
    for i in range(1, len(UNCERTAINTY_BINS)):
      bin_error = error_flat[digitized == i]
      expert_agreement.append((np.sum(bin_error==0) + np.sum(bin_error==1))/len(bin_error))

    df_tmp = pd.DataFrame({
      'dataset':dataset,
      'file':r.file,
      'uncertainty_bins': UNCERTAINTY_BINS[:-1],
      'error_rate': error_means,
      'expert_agreement': expert_agreement
      })
    result_list.append(df_tmp)

df = pd.concat(result_list).reset_index(drop=True)
df.to_csv(OUTPUT_PATH/'experts_vs_uncertainties.csv', index=False)

--2022-06-14 16:36:49--  https://github.com/matjesg/deepflash2/releases/download/model_library/PV_in_HC_ensemble_1.pt

Found 8 unique segmentation mask(s) from 5 expert(s)
--2022-06-14 16:42:47--  https://github.com/matjesg/deepflash2/releases/download/model_library/cFOS_in_HC_ensemble_1.pt

Found 8 unique segmentation mask(s) from 5 expert(s)
--2022-06-14 16:49:08--  https://github.com/matjesg/deepflash2/releases/download/model_library/mScarlet_in_PAG_ensemble_1.pt

Found 8 unique segmentation mask(s) from 5 expert(s)
--2022-06-14 17:09:01--  https://github.com/matjesg/deepflash2/releases/download/model_library/YFP_in_CTX_ensemble_1.pt

Found 8 unique segmentation mask(s) from 5 expert(s)
--2022-06-14 17:29:09--  https://github.com/matjesg/deepflash2/releases/download/model_library/GFAP_in_HC_ensemble_1.pt

Found 8 unique segmentation mask(s) from 3 expert(s)
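
The resulting `experts_vs_uncertainties.csv` holds one row per file and uncertainty bin. One possible way to summarize it is to average over files within each dataset and bin. The sketch below uses a small synthetic table with the same column names. The values are made up for illustration, and the aggregation in `deepflash2_figures-and-tables.ipynb` may differ.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in with the same columns as experts_vs_uncertainties.csv
df_demo = pd.DataFrame({
    'dataset': ['A', 'A', 'A', 'A'],
    'file': ['f1.png', 'f1.png', 'f2.png', 'f2.png'],
    'uncertainty_bins': [0.0, 0.04, 0.0, 0.04],
    'error_rate': [0.1, 0.5, 0.3, np.nan],       # NaN marks an empty bin
    'expert_agreement': [0.9, 0.4, 0.7, np.nan],
})

# Average over files per dataset and bin; mean() skips NaN by default
summary = (
    df_demo.groupby(['dataset', 'uncertainty_bins'], as_index=False)[
        ['error_rate', 'expert_agreement']
    ].mean()
)
print(summary)
```

To apply the same aggregation to the real results, replace `df_demo` with `pd.read_csv(OUTPUT_PATH/'experts_vs_uncertainties.csv')`.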
