# Sampling Data Preparation


This Python notebook selects suitable training instances around ROI fiber borders.

The procedure relies on [ImageJ](https://imagej.net/Welcome) functions (specifically **threshold** and **analyze particles**) to create the ROI border image masks, which are saved using the **Draw** command.

Two ROI border masks were used:
1. Big ROI cells - larger sampling window;
2. Small ROI cells - smaller sampling window.

Two [Jython scripts](http://imagej.net/Jython_Scripting_Examples) are also provided for this step.
They are located at `data_prep/sample/Find_Roi_Small(Big)_Cells.py`.

---

#### Once the ROI border image masks have been obtained, this module can be executed.


+ The output of this processing step is a set of sampling image masks, in which every **non-zero** pixel represents a selected training instance.
+ Instances are selected by applying a random sampling window along every ROI border.
+ Additionally, instances are randomly selected around debris pixels (pixels set as **BACKGROUND** in the annotation image).
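
The window sampling described above can be illustrated with a minimal sketch. It is not the code used by `get_good_patches`; the function name `sample_window`, the border points, and the reading of the two parameters (half-width of a square window, and the fraction of window pixels that get selected) are assumptions made for illustration.

```python
import numpy as np

def sample_window(mask, center, win_offs, sparsity, rng):
    """Mark random pixels inside a square window centred on `center`.

    Hypothetical helper: `win_offs` is treated as the window half-width and
    `sparsity` as the per-pixel selection probability.
    """
    row, col = center
    # Clip the window to the image bounds.
    r0, r1 = max(row - win_offs, 0), min(row + win_offs + 1, mask.shape[0])
    c0, c1 = max(col - win_offs, 0), min(col + win_offs + 1, mask.shape[1])
    # Select each pixel of the window independently with probability `sparsity`.
    picked = rng.random_sample((r1 - r0, c1 - c0)) < sparsity
    mask[r0:r1, c0:c1][picked] = 255
    return mask

rng = np.random.RandomState(8)
mask = np.zeros((64, 64), dtype=np.uint8)
border_points = [(10, 10), (10, 11), (32, 40)]  # hypothetical ROI border pixels
for point in border_points:
    sample_window(mask, point, win_offs=5, sparsity=0.1, rng=rng)

# (row, col) coordinates of every selected training instance.
selected = np.argwhere(mask > 0)
```

Because neighbouring border points have overlapping windows, pixels close to a border get several chances to be selected, which concentrates the training instances around the fiber outlines.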


------






## Importing Modules

Here we import the Python functions needed for sampling ROIs.

+ The main data processing is done using the [NumPy](http://www.numpy.org/) scientific computing library.

The help text of the whole package is also displayed, for reference.

In [1]:
import os
import glob
import numpy as np
import sys
print(sys.path, os.getcwd())

sys.path.append('../..')

# help() prints the documentation itself and returns None, so no print() is needed.
help("dnn_seg")

from dnn_seg.data_prep.utils.sample_set_funcs import get_good_patches,ConfigSample
from dnn_seg.net.utils.train_h import save_relevant

print('Every import is succesful !')

['C:\\Users\\kiks\\Documents\\dnn_seg\\notebooks', 'C:\\Users\\kiks\\miniconda3\\envs\\dnn_experimental\\python37.zip', 'C:\\Users\\kiks\\miniconda3\\envs\\dnn_experimental\\DLLs', 'C:\\Users\\kiks\\miniconda3\\envs\\dnn_experimental\\lib', 'C:\\Users\\kiks\\miniconda3\\envs\\dnn_experimental', '', 'C:\\Users\\kiks\\miniconda3\\envs\\dnn_experimental\\lib\\site-packages', 'C:\\Users\\kiks\\miniconda3\\envs\\dnn_experimental\\lib\\site-packages\\win32', 'C:\\Users\\kiks\\miniconda3\\envs\\dnn_experimental\\lib\\site-packages\\win32\\lib', 'C:\\Users\\kiks\\miniconda3\\envs\\dnn_experimental\\lib\\site-packages\\Pythonwin', 'C:\\Users\\kiks\\miniconda3\\envs\\dnn_experimental\\lib\\site-packages\\IPython\\extensions', 'C:\\Users\\kiks\\.ipython'] C:\Users\kiks\Documents\dnn_seg\notebooks
Help on package dnn_seg:

NAME
    dnn_seg - # FrontalLobe_DNN_Seg

DESCRIPTION
    Main package for DNN Segmentation on Neuron Fibers.
    
    Containing 3 main modules
    
    1. Data Preparation
   

Using TensorFlow backend.


Every import is succesful !



## Experiment Basic Setup

+ Set the random seed.
+ Save the current processing configuration (all configuration files in the current directory).


In [2]:
# Set the random seed
np.random.seed(8)
#np.random.seed(None)


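# Sampling experiment ID: 'sample_' followed by 5 random lowercase letters
# (97 is ord('a'); the upper bound 97+26 is exclusive).
experiment_id = 'sample_' + ''.join(map(chr, np.random.randint(97, 97 + 26, (5,))))

# Collect every regular file in the current directory except compiled .pyc files.
conf_test_tosave = save_relevant('saved_confs', experiment_id,
            files=[f for f in os.listdir('.')
                   if os.path.isfile(f) and not f.endswith('.pyc')],
            just_return=True)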
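
As a side note on the cell above: with a fixed seed, the generated experiment ID is the same on every run. The sketch below illustrates this with a local generator; the function name `make_experiment_id` is hypothetical and not part of `dnn_seg`.

```python
import numpy as np

def make_experiment_id(seed, prefix='sample_', n_letters=5):
    """Build an ID of the form 'sample_xxxxx' from a seeded random generator."""
    rng = np.random.RandomState(seed)
    # 97 is ord('a'); the upper bound is exclusive, so the codes span 'a'..'z'.
    codes = rng.randint(97, 97 + 26, size=n_letters)
    return prefix + ''.join(chr(c) for c in codes)

id_a = make_experiment_id(8)
id_b = make_experiment_id(8)
print(id_a == id_b)  # the same seed yields the same ID
```

Passing `None` as the seed (the commented-out alternative in the cell) seeds the generator from system entropy instead, so the ID would differ between runs.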






## EXPERIMENT SETUP PARAMETERS

This cell sets the sampling-specific parameters:

1. Description text;
2. Lookup paths;
3. ROI sampling parameters;
4. Noise (uniform) sampling parameters;
5. ROI noise sampling parameters (currently not used);
6. Training image shortcodes (sampling is usually applied to 20 images, the training subselection).

Sampling for the test set was uniformly distributed.


In [3]:

# Tagging/Saving current configuration
DESCRIPTION_TEXT='sample ONSET, onspecific settings'
save_conf=True


# lookup paths params-------
lookup_path= {}
lookup_path['groundtruth'] = r'C:\Users\kiks\Documents\dnn_seg\data\corrected_ml01'
lookup_path['interim'] = r'C:\Users\kiks\Documents\dnn_seg\data\interim_ml01'
lookup_path['bigc'] = r'C:\Users\kiks\Documents\dnn_seg\data\sample_bigcells'
lookup_path['smallc'] = r'C:\Users\kiks\Documents\dnn_seg\data\sample_smallcells'
lookup_path['debri'] = '###################'
conf=ConfigSample()

# general sample params---
conf.save_loc = r'C:\Users\kiks\Documents\dnn_seg\data\save_train_instances'
conf.img_size=2048
conf.win_offs_big = 35
conf.win_sparsity_big = 0.02
conf.win_offs_small = 5
conf.win_sparsity_small = 0.02

# noise unif params-------
conf.win_offs_noise = 5
conf.win_sparsity_noise = 0.1
conf.noise_big_sparsity = 0.8
conf.noise_val = 255

# noise roi not used for now-----------------
conf.dbscan_eps = 10
conf.dbscan_mins = 20
conf.win_offs = 22
conf.win_noiseroi_spar = 0.08

# pixel in interims-------------
leave_pix_noise = 0.1
pix_remove_thres = 50


# sampling images params----
train_images=['sp14484-img04', 'sp14485-img05', 'sp13909-img05', 'sp14240-img03', 'sp14069-img04',
 'sp14250-img03', 'sp13750-img09', 'sp13750-img03', 'sp13880-img07',
 'sp14069-img01', 'sp13909-img11', 'sp13909-img07', 'sp14370-img10',
 'sp14240-img01', 'sp14245-img04', 'sp13726-img08', 'sp13880-img11',
 'sp14485-img03', 'sp14485-img09', 'sp14370-img07']
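
To get a feel for the window parameters above, the following sketch estimates how many pixels one window would select on average. It assumes that each `win_offs_*` value is the half-width of a square window and that each `win_sparsity_*` value is the fraction of window pixels selected; neither reading is confirmed by the `ConfigSample` source, and `expected_samples` is a hypothetical helper.

```python
def expected_samples(win_offs, sparsity):
    """Expected number of selected pixels in one full, unclipped window."""
    side = 2 * win_offs + 1  # assumption: win_offs is the window half-width
    return sparsity * side * side

# (name, window offset, sparsity) taken from the configuration cell above.
settings = [('big', 35, 0.02), ('small', 5, 0.02), ('noise', 5, 0.1)]
for name, offs, spar in settings:
    print(name, 2 * offs + 1, round(expected_samples(offs, spar), 2))
```

Under these assumptions a big-cell window is 71 pixels wide and contributes roughly 100 samples, while a small-cell window is 11 pixels wide and contributes only two or three.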


## RUNNING THE SETUP CONFIGURATION

1. Iterating all above images
   * Creating an empty image mask;
   * Iterating every ROI border point; 
   * Randomly choosing/drawing sample pixels (using a sample window centered at the border point);
   * Saving the obtained mask file.
   
2. Saving the experiment configuration files (current code folder).
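
The next cell matches files to the training shortcodes by slicing each file name between the last `sp` and two characters past the last `img`. A minimal sketch of that slicing follows; the function name `shortcode` and the list `wanted` are hypothetical, and the example file name is taken from the log output further down.

```python
def shortcode(name):
    """Return the 'spNNNNN-imgNN' part of a file name."""
    start = name.rindex('sp')
    end = name.rindex('img') + 5  # 'img' plus its two digits
    return name[start:end]

example = 'mask_sp13726-img08-interim.tif'
wanted = ['sp13726-img08', 'sp14485-img03']  # hypothetical subset of train_images
print(shortcode(example), shortcode(example) in wanted)
```

Note that `rindex` raises `ValueError` when the substring is missing, so a file without `sp` or `img` in its name would stop the loop rather than be skipped.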

In [4]:


# Create the output directory if it does not exist yet.
os.makedirs(conf.save_loc, exist_ok=True)



# Collect the image paths of every set, sorted by their 'sp...' shortcode.
test_images = {}
for pic_type in 'groundtruth;interim;bigc;smallc'.split(';'):
    found = glob.glob(os.path.join(lookup_path[pic_type], '*.tif'))
    print('{} image set- has {} images'.format(pic_type, len(found)))

    # NOTE: the trailing `or True` disables the train_images filter,
    # so every .tif file found is kept.
    test_images[pic_type] = [
        filename for filename in found
        if filename[filename.rindex('sp'):filename.rindex('img') + 5] in train_images or True]
    test_images[pic_type].sort(key=lambda name: name[name.rindex('sp'):])

# get_good_patches() is evaluated before print() runs, so its progress
# messages appear ahead of the printed summary.
print(len(test_images['groundtruth']), test_images['groundtruth'],
      get_good_patches(test_images, conf.save_loc, conf.win_offs, conf))


if save_conf:
    save_relevant('saved_confs',experiment_id,
                  str_to_save=conf_test_tosave,descriptive_text=DESCRIPTION_TEXT)

groundtruth image set- has 31 images
interim image set- has 31 images
bigc image set- has 19 images
smallc image set- has 19 images
Iterating all images...
img is:  0 / 19 ,  38456  big ROIs border points.
C:\Users\kiks\Documents\dnn_seg\data\sample_smallcells\mask_sp13726-img01-interim.tif
img is:  1 / 19 ,  63422  big ROIs border points.
C:\Users\kiks\Documents\dnn_seg\data\sample_smallcells\mask_sp13726-img02-interim.tif
img is:  2 / 19 ,  49605  big ROIs border points.
C:\Users\kiks\Documents\dnn_seg\data\sample_smallcells\mask_sp13726-img03-interim.tif
img is:  3 / 19 ,  59186  big ROIs border points.
C:\Users\kiks\Documents\dnn_seg\data\sample_smallcells\mask_sp13726-img04-interim.tif
img is:  4 / 19 ,  51402  big ROIs border points.
C:\Users\kiks\Documents\dnn_seg\data\sample_smallcells\mask_sp13726-img08-interim.tif
img is:  5 / 19 ,  86677  big ROIs border points.
C:\Users\kiks\Documents\dnn_seg\data\sample_smallcells\mask_sp13909-img01-interim.tif
img is:  6 / 19 ,  79410  bi