# Classify images based on a trained net 

Here we use the best iteration of the selected Net's parameters, chosen from a set of experiments (`training_protocol.pynb`) that compare the performance of different Net configurations (`Compared_network_configuration.pynb`). In this section, the selected Net predicts labels for a given number of random points on each image in a specified set of images stored in a folder called `data`. For convenience, the `data` folder, located within the region folder, holds all the images whose benthic composition you want to estimate with the Net. Within `data`, images are organised in subfolders (`img_dir` below) so that different surveys can be processed by different Nets, if desired. Each `img_dir` should hold its images in a subfolder called `images`.
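The folder layout described above can be checked before running the classifier. The following is a minimal sketch, not part of `reef_learning`: the helper name `list_images` and the list of accepted file extensions are assumptions made for illustration.

```python
import os
import os.path as osp


def list_images(basedir, region, img_dir, exts=(".jpg", ".jpeg", ".png")):
    """Return the sorted image file names found in <basedir>/<region>/data/<img_dir>/images.

    Hypothetical helper: the extension list is an assumption, adjust it to your survey.
    """
    images_path = osp.join(basedir, region, "data", img_dir, "images")
    if not osp.isdir(images_path):
        # Fail early with the full path so a misplaced folder is easy to spot
        raise IOError("Expected an images folder at: " + images_path)
    return sorted(f for f in os.listdir(images_path) if f.lower().endswith(exts))
```

For example, `len(list_images(basedir, region, img_dir))` gives the number of images that would be found under the layout assumed here.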

After running the classifier on the new images, the scripts produce a folder called `coverages` that contains one text file per image, listing the location of each point and its resulting label classification. Use these files to estimate benthic coverage as described in the manuscript associated with this repository (see `Readme.md`).

## Set up workspace 

In [1]:
import os
import os.path as osp
import reef_learning.deeplearning_wrappers.catlin_classify as cc
import warnings
warnings.filterwarnings('ignore')

# Define data folder to process 

basedir='/media/data_caffe' #Base directory
region='hawaii' #name of folder for the region on which Nets are trained and tested
modeldir=osp.join(basedir, region,'ScaleLr_sweeper_scale_1.0_0.001') #directory of the selected Net based on experiment comparisons
img_dir="to_classify" # subdirectory with the images to classify using the model defined above. Note that this folder sits inside a folder called data and should hold the images in a subfolder called images.
npoints=50 # Total number of points per image used to estimate benthic composition and abundance
gpuid=0 #GPU mask id. Change this if you use multiple GPU cards

## Deploy net on new images 

In [5]:
cc.classify_exp(basedir,region,img_dir,modeldir,npoints,force_rewrite=True)

DataLayer initialized with 417 images, 5 imgs per batch, and 224x224 pixel patches
Starting /media/data_caffe/hawaii/data/to_classify 660


  return(im[upper : upper + ps, left : left + ps, :])


Done /media/data_caffe/hawaii/data/to_classify 660


Once completed, a new folder (`coverages`) is created that contains the location and classification of the set number of points on each image. Below is the resulting folder structure and a sample of the text files containing the classification of a given image.

In [7]:
#Resulting file structure
for path, dirs, files in os.walk(osp.join(basedir, region,'data',img_dir)):
    print(path)  # single-argument print() runs under both Python 2 and 3
    for f in files[:5]:
        print(f)

/media/data_caffe/hawaii/data/to_classify
/media/data_caffe/hawaii/data/to_classify/coverages
38008151002.jpg.points.csv
44013028401.jpg.points.csv
44034010001.jpg.points.csv
44026218501.jpg.points.csv
44005156501.jpg.points.csv
/media/data_caffe/hawaii/data/to_classify/images
38044022501.jpg
44009056101.jpg
38045184301.jpg
38022039901.jpg
44013029501.jpg


In [22]:
# Sample from the resulting classification
import pandas as pd
from IPython.display import display
cov_example=os.listdir(osp.join(basedir, region,'data',img_dir,'coverages'))[0]

df=pd.read_csv(osp.join(basedir, region,'data',img_dir,'coverages',cov_example),
              skiprows=[0])
display(df[:5])

Unnamed: 0,row,col,labelcode
0,90,429,EAM_DHC
1,466,575,POR_Com_fi
2,208,878,EAM_DHC
3,362,537,EAM_DHC
4,352,913,EAM_DHC
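
As a rough illustration of how a points file could be turned into a cover estimate, the sketch below counts the share of points assigned to each `labelcode` in one image. It is not the method from the manuscript, which may weight or aggregate points differently. The function name `label_cover` is hypothetical. The sketch assumes the file layout implied by `skiprows=[0]` above: one leading line that is not data, followed by a header row containing `labelcode`. The content of that first line is a placeholder here.

```python
import csv
from collections import Counter


def label_cover(lines, skip_first_line=True):
    """Return {labelcode: percent of points} for the lines of one points file.

    Hypothetical helper: assumes a header row with a 'labelcode' column,
    optionally preceded by one non-data line (as suggested by skiprows=[0]).
    """
    lines = list(lines)
    if skip_first_line:
        lines = lines[1:]
    reader = csv.DictReader(lines)
    counts = Counter(row["labelcode"] for row in reader)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# The five sample rows shown above; the first line is a placeholder
sample = [
    "placeholder first line",
    ",row,col,labelcode",
    "0,90,429,EAM_DHC",
    "1,466,575,POR_Com_fi",
    "2,208,878,EAM_DHC",
    "3,362,537,EAM_DHC",
    "4,352,913,EAM_DHC",
]
cover = label_cover(sample)
print(cover)
```

With only the five sample rows, `EAM_DHC` accounts for 80% of the points and `POR_Com_fi` for 20%. A real file would hold `npoints` rows per image, and an open file handle can be passed in place of the list.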
