# How to import and use existing project modules

`Author: Jason Lai
Modified: 2018-Mar-01`

To use existing code in the project's `src` directory, add that directory to `sys.path`. The path below is relative, so it is resolved against the current working directory; for a notebook, that is normally the notebook's own location. This must come before any import of the project's utilities and functions, so put it near the top of your Python script.

In [1]:
import sys, os
sys.path.append( os.path.join( '..', 'src' ) ) # relative to location of script/notebook
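A relative entry such as `../src` only works when the script or notebook is launched from its own directory. As a minimal sketch of a more defensive variant, the helper below (the name `add_src_to_path` is hypothetical, not part of this project) converts the location to an absolute path before appending it. It assumes `src` sits one level above the directory holding the script or notebook.

```python
import sys
from pathlib import Path


def add_src_to_path(start_dir, src_name="src"):
    """Append <start_dir>/../<src_name> to sys.path as an absolute path.

    Hypothetical helper: assumes the 'src' directory is a sibling of the
    directory that contains the script or notebook.
    """
    src_dir = Path(start_dir).resolve().parent / src_name
    src_str = str(src_dir)
    if src_str not in sys.path:  # avoid duplicate entries on re-runs
        sys.path.append(src_str)
    return src_str


# In a script, pass Path(__file__).parent instead.
# A notebook has no __file__, so the working directory is used here.
added = add_src_to_path(Path.cwd())
print(added)
```

Because the entry is absolute, later changes to the working directory (for example via `os.chdir`) do not affect the imports.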

After adding the `src` directory to the path, the Python modules written for this project can be imported. The general format under the scheme currently used in this repository is `from directory import module`.

In [2]:
from utils import kaggle_io
from utils import kaggle_reader
from preprocess import extract_roi
from preprocess import split_data
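To illustrate why `from directory import module` works once `src` is on the path, the sketch below builds a throwaway `src` tree in a temporary directory and imports from it. The names `demo_utils` and `demo_io` are made up for the illustration and do not exist in this repository; whether the real directories contain an `__init__.py` is not stated here, so the sketch includes one as an assumption.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build src/demo_utils/demo_io.py (hypothetical names).
    src_dir = Path(tmp) / "src"
    package_dir = src_dir / "demo_utils"
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text("")
    (package_dir / "demo_io.py").write_text(
        "def describe():\n"
        "    return 'loaded from demo_utils/demo_io.py'\n"
    )

    # Same idea as the path setup above: put 'src' on sys.path first.
    sys.path.append(str(src_dir))
    importlib.invalidate_caches()
    try:
        # 'directory' is demo_utils, 'module' is demo_io.
        from demo_utils import demo_io
        message = demo_io.describe()
    finally:
        sys.path.remove(str(src_dir))

print(message)
```

Each directory directly under `src` acts as a package, and each `.py` file inside it is a module that can be imported by name.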

### Example 1: Reading input with the `kaggle_io` module

To see what each script does, read the comments in the specific Python file. As an example, consider the `kaggle_io` module in the `utils` directory. This utility I/O file is meant to read data provided by Kaggle into more usable formats. First, define the location of an image and its mask (note: the paths below may need to be changed depending on the location of this notebook file).

In [3]:
image_path = '../data/raw/stage1_train/00ae65c1c6631ae6f2be1a449902976e6eb8483bf6b0740d00530220832c6d3e/images/00ae65c1c6631ae6f2be1a449902976e6eb8483bf6b0740d00530220832c6d3e.png'
mask_path = '../data/raw/stage1_train/00ae65c1c6631ae6f2be1a449902976e6eb8483bf6b0740d00530220832c6d3e/masks/0fe691c27c3dcf767bc22539e10c840f894270c82fc60c8f0d81ee9e7c5b9509.png'

At this point, the inputs are only file paths pointing to an image. The `kaggle_io` module can convert a path to a "pixel matrix." This is a matrix of dimensions WxH in which every element is a color value. By default, the values are stored as RGB triples of the form ( Red=0-255, Green=0-255, Blue=0-255 ). The mode (e.g. 'RGB', 'L') can be specified.

In [4]:
pixel_matrix = kaggle_io.png_to_pixel_matrix( image_path, mode='RGB' )
#print( pixel_matrix )
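To make the difference between the 'RGB' and 'L' modes concrete, here is a small pure-Python sketch on a hand-written 2x2 pixel matrix. It is not the `kaggle_io` implementation: the function name `rgb_to_gray` is hypothetical, and the weights follow the luma formula that Pillow documents for its 'L' conversion (L = R*299/1000 + G*587/1000 + B*114/1000), with floor division used here for simplicity.

```python
def rgb_to_gray(pixel_matrix):
    """Convert a matrix of (R, G, B) tuples to single 0-255 gray values.

    Illustrative only: uses the ITU-R 601-2 luma weights with integer
    floor division, so results may differ by one from a library's rounding.
    """
    return [
        [(r * 299 + g * 587 + b * 114) // 1000 for (r, g, b) in row]
        for row in pixel_matrix
    ]


# A tiny 'RGB' pixel matrix: black, white, pure red, pure blue.
rgb_matrix = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]

gray_matrix = rgb_to_gray(rgb_matrix)
print(gray_matrix)  # [[0, 255], [76, 29]]
```

In 'RGB' mode each element carries three channels, while in 'L' mode each element is a single brightness value.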

For mask images, the `kaggle_io` module can convert them into mask matrices, which are binary WxH matrices where 1 indicates a nucleus (i.e. non-black) pixel in the mask and 0 indicates a background (i.e. black) pixel.

In [5]:
mask_matrix = kaggle_io.mask_png_to_mask_matrix( mask_path )
#print( mask_matrix )
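The black/non-black rule described above can be sketched in a few lines. This is only an illustration of the rule on a grayscale matrix, assuming that "black" means a value of exactly 0; the helper name `to_mask_matrix` is hypothetical and the real `kaggle_io` code may apply a different test.

```python
def to_mask_matrix(gray_matrix):
    """Map every non-black pixel to 1 and every black pixel to 0.

    Assumption for this sketch: a pixel is 'black' only when its value is 0.
    """
    return [[1 if value > 0 else 0 for value in row] for row in gray_matrix]


# Grayscale mask image: 0 is background, anything else belongs to the nucleus.
gray_mask = [
    [0, 0, 255],
    [0, 255, 255],
    [0, 0, 12],
]

mask_matrix_demo = to_mask_matrix(gray_mask)
print(mask_matrix_demo)  # [[0, 0, 1], [0, 1, 1], [0, 0, 1]]
```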

In addition to mask matrices, the `kaggle_io` module can convert mask images to a `nuclei_list` (a one-dimensional list containing the indices of all nucleus/non-black pixels, ordered in the Kaggle format of top->bottom, then left->right) and to the Kaggle submission string format. To output the Kaggle submission string format, the `image_id` must also be defined.

In [6]:
nuclei_list = kaggle_io.mask_png_to_nuclei_list( mask_path )
#print( nuclei_list )

image_id = '00ae65c1c6631ae6f2be1a449902976e6eb8483bf6b0740d00530220832c6d3e'
submit_string = kaggle_io.mask_png_to_kaggle_format( mask_path, image_id )
#print( submit_string )
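The pixel ordering and the submission string are easier to follow with a worked example. The sketch below assumes the competition's run-length encoding: pixels are one-indexed, numbered down each column and then across columns, and written as `start length` pairs after the image id. The function names are hypothetical, and the exact string produced by `kaggle_io` may differ in detail.

```python
def mask_to_indices(mask):
    """Return one-indexed pixel numbers of all 1-pixels, column by column."""
    height, width = len(mask), len(mask[0])
    return [
        col * height + row + 1
        for col in range(width)      # left -> right
        for row in range(height)     # top -> bottom within each column
        if mask[row][col]
    ]


def indices_to_runs(indices):
    """Collapse sorted pixel numbers into [start, length] runs."""
    runs = []
    for index in indices:
        if runs and index == runs[-1][0] + runs[-1][1]:
            runs[-1][1] += 1          # extends the current run
        else:
            runs.append([index, 1])   # starts a new run
    return runs


def to_submission_row(image_id, indices):
    """Format 'image_id,start length start length ...' (assumed layout)."""
    encoded = " ".join(f"{start} {length}" for start, length in indices_to_runs(indices))
    return f"{image_id},{encoded}"


demo_mask = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
demo_indices = mask_to_indices(demo_mask)
print(demo_indices)                                  # [2, 4, 5, 9]
print(to_submission_row("demo_id", demo_indices))    # demo_id,2 1 4 2 9 1
```

Pixel 2 is the second row of the first column, pixels 4 and 5 form a run of length 2 at the top of the second column, and pixel 9 is the bottom-right corner.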

The `kaggle_io` module can also convert various input formats to various output formats (e.g. mask_matrix to nuclei_list, nuclei_list to submit_string, etc.). See the in-code documentation for more information.
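As an example of one such conversion, the sketch below turns a list of pixel indices back into a mask matrix. It assumes one-indexed, column-major pixel numbering (top to bottom, then left to right) and needs the image height and width, since the list alone does not carry them. The name `indices_to_mask` is hypothetical and is not the `kaggle_io` API.

```python
def indices_to_mask(indices, height, width):
    """Rebuild a binary mask matrix from one-indexed, column-major indices."""
    mask = [[0] * width for _ in range(height)]
    for index in indices:
        # Quotient is the column, remainder is the row within that column.
        col, row = divmod(index - 1, height)
        mask[row][col] = 1
    return mask


rebuilt = indices_to_mask([2, 4, 5, 9], height=3, width=3)
for row in rebuilt:
    print(row)
# [0, 1, 0]
# [1, 1, 0]
# [0, 0, 1]
```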

### Example 2: Extracting regions of interest with the `extract_roi` module

As a second example, this time with a module that uses classes, consider the `extract_roi` module. This module takes the image_path and mask_path as input and extracts the region of interest (ROI) with various options. Remember to import the module before using it (`from preprocess import extract_roi`). To use this module, first create an instance of its class.

In [7]:
extractor = extract_roi.ExtractROI()

Once the instance has been created, the image and mask can be passed to it. In the example below, the paths to these images are used; however, it is possible to pass other formats to the extractor, such as the nuclei_list or pixel_matrix.

In [8]:
extractor.set_image_with_png( image_path )
extractor.set_mask_with_png( mask_path )

The `extract_roi` module offers several ways to extract the ROI. For example, one could extract only the nucleus itself with additional padding. Alternatively, one could extract a fixed-size box around the nucleus center.

In [9]:
extractor.extract_roi( padding=10 ) # nucleus dimensions plus 10 pixels of padding
extractor.extract_roi( fixed_size=[ 20, 20 ] ) # 20x20 pixels around the nucleus center
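The two extraction styles can be sketched on plain nested lists. This is not the `ExtractROI` implementation: the function names are hypothetical, and the sketch assumes that the nucleus extent is the bounding box of the mask, that the center is the middle of that box, that `size` is given as (rows, columns), and that crops are kept inside the image.

```python
def bounding_box(mask):
    """Return (top, bottom, left, right) of the 1-pixels, inclusive."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask contains no nucleus pixels")
    return rows[0], rows[-1], cols[0], cols[-1]


def crop_with_padding(image, mask, padding=0):
    """Crop the nucleus bounding box plus padding, clipped to the image."""
    top, bottom, left, right = bounding_box(mask)
    height, width = len(image), len(image[0])
    r0, r1 = max(top - padding, 0), min(bottom + padding, height - 1)
    c0, c1 = max(left - padding, 0), min(right + padding, width - 1)
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]


def crop_fixed_size(image, mask, size):
    """Crop a (rows, cols) box centered on the bounding-box center."""
    top, bottom, left, right = bounding_box(mask)
    height, width = len(image), len(image[0])
    box_h, box_w = size
    r0 = (top + bottom) // 2 - box_h // 2
    c0 = (left + right) // 2 - box_w // 2
    # Shift the box back inside the image if it would stick out.
    r0 = min(max(r0, 0), max(height - box_h, 0))
    c0 = min(max(c0, 0), max(width - box_w, 0))
    return [row[c0:c0 + box_w] for row in image[r0:r0 + box_h]]


# 5x5 image whose values encode their position (row * 5 + col).
demo_image = [[r * 5 + c for c in range(5)] for r in range(5)]
demo_mask = [[0] * 5 for _ in range(5)]
demo_mask[2][2] = demo_mask[2][3] = 1  # a two-pixel "nucleus"

padded_roi = crop_with_padding(demo_image, demo_mask, padding=1)
fixed_roi = crop_fixed_size(demo_image, demo_mask, size=(3, 3))
print(padded_roi)  # [[6, 7, 8, 9], [11, 12, 13, 14], [16, 17, 18, 19]]
print(fixed_roi)   # [[6, 7, 8], [11, 12, 13], [16, 17, 18]]
```

The padded crop grows with the nucleus, while the fixed-size crop always has the same shape, which is convenient when every ROI must match a fixed input size.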

After extracting the ROI from the image, the class can either output the extracted ROI as a matrix or save the ROI as an image.

In [10]:
#print( extractor.get_roi() )
#extractor.save_roi_as_image( './example_roi.png' )

This gives a general overview of how existing Python scripts can be imported from the project directory and used in other scripts. Again, most of the documentation is in the Python code itself, where the author was hopefully verbose with their comments.