In [1]:
from data import *

Using TensorFlow backend.


# data augmentation 

Deep learning tasks need a lot of data to train a DNN model. When the dataset is not big enough, data augmentation should be applied.

keras.preprocessing.image.ImageDataGenerator is a data generator that feeds the DNN with (data, label) pairs and can apply data augmentation at the same time.

keras.preprocessing.image.ImageDataGenerator makes data augmentation convenient, because image rotation, shift, rescaling and so on are built in. See the [keras documentation](https://keras.io/preprocessing/image/) for details.

For image segmentation tasks, the image and mask must be transformed **together!!**
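To illustrate why the image and mask have to share one transform, here is a minimal NumPy-only sketch. It is not the code behind trainGenerator; the function name random_flip_shift and the toy arrays are made up for this example. The point is only that the random parameters are drawn once and then applied to both arrays.

```python
import numpy as np

def random_flip_shift(image, mask, rng, max_shift=2):
    """Apply one randomly drawn flip and shift to both image and mask."""
    # Draw the random parameters once, so both arrays get the same transform.
    flip = rng.random() < 0.5
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)

    transformed = []
    for arr in (image, mask):
        if flip:
            arr = arr[:, ::-1]  # horizontal flip
        # np.roll wraps pixels around the border (similar in spirit to fill_mode='wrap').
        arr = np.roll(arr, shift=(int(dy), int(dx)), axis=(0, 1))
        transformed.append(arr)
    return transformed[0], transformed[1]

rng = np.random.default_rng(0)
image = rng.random((8, 8))
mask = (image > 0.5).astype(np.uint8)  # toy mask derived from the image

aug_image, aug_mask = random_flip_shift(image, mask, rng)

# Because both arrays moved together, the mask still lines up with the image.
print(np.array_equal(aug_mask, (aug_image > 0.5).astype(np.uint8)))  # True
```

If the flip and shift were drawn separately for the image and for the mask, the labels would no longer describe the pixels they belong to.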

## define your data generator

If you want to visualize your data augmentation result, set save_to_dir = your path

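The original dataset is from the ISBI Challenge: Segmentation of neuronal structures in EM stacks
 http://brainiac2.mit.edu/isbi_challenge/

The task is to separate out the cell boundaries, which can be treated as a binary classification problem (boundary / not boundary).

The biggest challenge is that there are too few images, so image augmentation is required.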
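Treating the task as per-pixel binary classification means the label masks need to contain only two values. A minimal sketch of that step, assuming 8-bit grayscale masks and a 0.5 threshold (both are assumptions, and whether data.py does exactly this is not shown here):

```python
import numpy as np

def binarize_mask(mask, threshold=0.5):
    """Scale an 8-bit mask to [0, 1], then threshold it to {0, 1}."""
    scaled = mask.astype(np.float32) / 255.0
    return (scaled > threshold).astype(np.float32)

# Toy 2x3 mask with intermediate gray values, e.g. from interpolation.
raw_mask = np.array([[0, 30, 200],
                     [255, 128, 127]], dtype=np.uint8)

binary_mask = binarize_mask(raw_mask)
print(binary_mask)
```

Thresholding after augmentation is useful because rotations and zooms interpolate pixel values, which can turn a clean two-valued mask into one with gray levels in between.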


In [5]:
#if you don't want to do data augmentation, set data_gen_args as an empty dict.
#data_gen_args = dict()

data_gen_args = dict(rotation_range=0.2,
                    width_shift_range=0.05,
                    height_shift_range=0.05,
                    shear_range=0.05,
                    zoom_range=0.05,
                    horizontal_flip=True,
                    fill_mode='nearest')
myGenerator = trainGenerator(20,'Membrane','image','label',data_gen_args,save_to_dir = "Membrane/aug")

Warping the images has little effect on the classification, so it can be used for augmentation.

[Reference paper](http://faculty.cs.tamu.edu/schaefer/research/mls.pdf)

![Before](https://img-blog.csdn.net/20170417205751603?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvdTAxMjkzMTU4Mg==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/SouthEast)

![After](https://img-blog.csdn.net/20170417205820931?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQvdTAxMjkzMTU4Mg==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/SouthEast)

Parameter descriptions, from the [keras documentation](https://keras.io/preprocessing/image/):

* rotation_range: Int. Degree range for random rotations.
* width_shift_range: Float, 1-D array-like or int
    * float: fraction of total width, if < 1, or pixels if >= 1.
    * 1-D array-like: random elements from the array.
    * int: integer number of pixels from interval (-width_shift_range, +width_shift_range)
    * With width_shift_range=2 possible values are integers [-1, 0, +1], same as with width_shift_range=[-1, 0, +1], while with width_shift_range=1.0 possible values are floats in the half-open interval [-1.0, +1.0[.

* height_shift_range: Float, 1-D array-like or int
    * float: fraction of total height, if < 1, or pixels if >= 1.
    * 1-D array-like: random elements from the array.
    * int: integer number of pixels from interval (-height_shift_range, +height_shift_range)
    * With height_shift_range=2 possible values are integers [-1, 0, +1], same as with height_shift_range=[-1, 0, +1], while with height_shift_range=1.0 possible values are floats in the half-open interval [-1.0, +1.0[.

* shear_range: Float. Shear Intensity (Shear angle in counter-clockwise direction in degrees)
* zoom_range: Float or [lower, upper]. Range for random zoom. If a float, [lower, upper] = [1-zoom_range, 1+zoom_range].
* fill_mode: One of {"constant", "nearest", "reflect" or "wrap"}. Default is 'nearest'. Points outside the boundaries of the input are filled according to the given mode:
    * 'constant': kkkkkkkk|abcd|kkkkkkkk (cval=k)
    * 'nearest': aaaaaaaa|abcd|dddddddd
    * 'reflect': abcddcba|abcd|dcbaabcd
    * 'wrap': abcdabcd|abcd|abcdabcd

* horizontal_flip: Boolean. Randomly flip inputs horizontally.
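The fill_mode patterns listed above can be reproduced on a 1-D array with np.pad. This is only an analogy to show what each mode does at the border, not how Keras fills pixels internally. The mapping from Keras names to np.pad modes is an assumption based on matching the patterns: 'nearest' corresponds to NumPy's 'edge', and 'reflect' (which repeats the edge value) corresponds to NumPy's 'symmetric'.

```python
import numpy as np

row = np.array([1, 2, 3, 4])  # stands in for "abcd"
pad = 2                       # number of border values to fill on each side

# Keras fill_mode name -> np.pad arguments that give the same pattern (assumed mapping).
numpy_equivalent = {
    "constant": dict(mode="constant", constant_values=0),  # like cval=0
    "nearest": dict(mode="edge"),
    "reflect": dict(mode="symmetric"),
    "wrap": dict(mode="wrap"),
}

padded = {
    name: np.pad(row, pad, **kwargs).tolist()
    for name, kwargs in numpy_equivalent.items()
}

for name, values in padded.items():
    print(f"{name:>8}: {values}")
# constant: [0, 0, 1, 2, 3, 4, 0, 0]
#  nearest: [1, 1, 1, 2, 3, 4, 4, 4]
#  reflect: [2, 1, 1, 2, 3, 4, 4, 3]
#     wrap: [3, 4, 1, 2, 3, 4, 1, 2]
```

For membrane images, 'nearest' extends the border pixels outward, so shifted or rotated images have no black corners.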

## visualize your data augmentation result

In [7]:
# you will see 60 transformed images and their masks in Membrane/aug
num_batch = 3
for i, batch in enumerate(myGenerator):
    if i >= num_batch:
        break

## enumerate yields (index, element) pairs; read "i, element in enumerate(seq)" as one unit:
## >>> seq = ['one', 'two', 'three']
## >>> for i, element in enumerate(seq):
## ...     print(i, element)
## ...
## 0 one
## 1 two
## 2 three
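An alternative way to take a fixed number of batches from a generator that never stops on its own is itertools.islice. The sketch below uses a hypothetical stand-in generator, endless_batches, in place of myGenerator, so it runs without any image files.

```python
from itertools import islice

def endless_batches(batch_size):
    """Hypothetical stand-in for a data generator that loops forever."""
    index = 0
    while True:
        yield list(range(index, index + batch_size))
        index += batch_size

num_batch = 3
consumed = 0
# islice stops the iteration after exactly num_batch batches.
for batch in islice(endless_batches(batch_size=20), num_batch):
    consumed += len(batch)

print(consumed)  # 60
```

One difference from the enumerate-and-break pattern: islice requests exactly num_batch items, while a loop that breaks on i >= num_batch has already pulled one more item from the generator by the time the break is reached.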


## create .npy data

If your computer has enough memory, you can create npy files containing all your images and masks, and feed your DNN with them.

In [8]:
image_arr,mask_arr = geneTrainNpy("Membrane/aug/","Membrane/aug/")
#np.save("data/image_arr.npy",image_arr)
#np.save("data/mask_arr.npy",mask_arr)
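The commented-out np.save calls above suggest a save-and-reload round trip. A minimal sketch of that round trip, using small random arrays and a temporary directory instead of the real image_arr, mask_arr and data/ folder (the array shape here is an assumption for illustration):

```python
import os
import tempfile

import numpy as np

# Toy arrays standing in for image_arr and mask_arr.
image_arr = np.random.default_rng(0).random((4, 16, 16, 1)).astype(np.float32)
mask_arr = (image_arr > 0.5).astype(np.float32)

with tempfile.TemporaryDirectory() as data_dir:
    image_path = os.path.join(data_dir, "image_arr.npy")
    mask_path = os.path.join(data_dir, "mask_arr.npy")

    # Write both arrays to disk, then read them back.
    np.save(image_path, image_arr)
    np.save(mask_path, mask_arr)
    loaded_images = np.load(image_path)
    loaded_masks = np.load(mask_path)

print(loaded_images.shape, loaded_masks.shape)
print(np.array_equal(loaded_images, image_arr))  # True
```

Loading from .npy files avoids re-reading and re-decoding every image file on each run, at the cost of holding the whole dataset in memory.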