### This script takes scans containing multiple leaves and outputs cropped images of individual leaves. It is a Python 3 script written in a Jupyter Notebook. I run Jupyter through Anaconda.

#### Image Guidelines
1. Each input scan is separated into individual leaves, and each leaf is saved as its own image.
2. Images should be taken so that leaves do not overlap. Overlapping leaves will be treated as a single region.
3. Images should be taken so that the bounding boxes around leaves do not overlap. Any portion of the image, including other leaf regions, that falls within a given bounding box will be included in the final crop.
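
The third guideline can be checked numerically. A minimal sketch, assuming bounding boxes are in the `(min_row, min_col, max_row, max_col)` form that `regionprops` returns, with the max values exclusive. `boxes_overlap` is a hypothetical helper and the coordinates are made up for illustration:

```python
def boxes_overlap(a, b):
    """Return True if two (min_row, min_col, max_row, max_col) boxes share any pixel."""
    a_minr, a_minc, a_maxr, a_maxc = a
    b_minr, b_minc, b_maxr, b_maxc = b
    # Boxes overlap only if their row ranges AND their column ranges intersect
    rows_overlap = a_minr < b_maxr and b_minr < a_maxr
    cols_overlap = a_minc < b_maxc and b_minc < a_maxc
    return rows_overlap and cols_overlap

# Hypothetical bounding boxes for three leaves
leaf_a = (100, 100, 400, 300)
leaf_b = (350, 250, 600, 500)   # one corner pokes into leaf_a's box
leaf_c = (100, 300, 400, 500)   # starts exactly where leaf_a's box ends

print(boxes_overlap(leaf_a, leaf_b))
print(boxes_overlap(leaf_a, leaf_c))
```

If two boxes overlap, part of one leaf will show up in the other leaf's crop, even when the leaves themselves do not touch.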

In [None]:
## Importing packages
import numpy as np
import cv2
import pandas as pd
import matplotlib
from matplotlib import pyplot as plt
from skimage import data
from skimage import morphology
from skimage.filters import threshold_otsu
from skimage.filters import sobel
from skimage.segmentation import clear_border
from skimage.measure import label, regionprops
from skimage.morphology import closing, square, remove_small_objects
from skimage.segmentation import slic
from skimage.color import label2rgb
from scipy import ndimage as ndi
import glob
import timeit
import seaborn as sns
%matplotlib inline

In [None]:
## Filepath containing the folder with your image scans. 
## Change for your particular file structure.
## glob does not expand '~' on its own, so expand it to your home folder first.
import os
filepath = os.path.expanduser('~/scanner/2020_11_16/')

## This will go through your filepath and find all files with the .tif extension. 
## If your files have a different extension, you'll need to change that.
files = glob.glob(filepath + '*.tif')

## To check your files are accounted for
print(files)
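
`glob.glob` does not guarantee any particular file order, so the leaf numbers assigned later can differ between runs or machines. A minimal sketch of one way to make the order reproducible by sorting the results. The temporary folder and file names are made up for illustration:

```python
import glob
import os
import tempfile

# Build a throwaway folder with a few empty files standing in for scans
with tempfile.TemporaryDirectory() as folder:
    for fname in ['scan_03.tif', 'scan_01.tif', 'scan_02.tif', 'notes.txt']:
        open(os.path.join(folder, fname), 'w').close()

    # sorted() gives the same order every time, whatever order glob returns
    found = sorted(glob.glob(os.path.join(folder, '*.tif')))
    names = [os.path.basename(f) for f in found]

print(names)
```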

In [None]:
## Create empty lists for appending data
total_leaf_area = []
leaf_img = []
leafID = []

## You can change the name variable to what you want your file prefix to be. 
name = "2020_11_16"

## Each unique region will get a number, starting with zero.
k = 0
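
The main loop finds leaves by thresholding one channel, removing anything that touches the image edge, and labelling what remains. A small sketch of those same steps on a synthetic image, with two dark rectangles standing in for leaves on a white background. The image size and rectangle positions are made up, and `np.ones((3, 3))` is used here as the 3x3 footprint in place of `square(3)`:

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops
from skimage.morphology import closing
from skimage.segmentation import clear_border

# White background with two dark "leaves"
demo = np.full((100, 100), 255, dtype=np.uint8)
demo[20:40, 20:50] = 50    # fully inside the image
demo[0:10, 60:80] = 50     # touches the top edge

thresh = threshold_otsu(demo)
bw = closing(demo > thresh, np.ones((3, 3), dtype=np.uint8))
bw = ~bw                          # leaves become True, background False
cleared = clear_border(bw)        # drops the region touching the edge
regions = regionprops(label(cleared))

demo_boxes = [r.bbox for r in regions]
demo_areas = [int(r.area) for r in regions]
print(demo_boxes, demo_areas)
```

Only the rectangle that sits fully inside the image survives, which is why leaves that touch the edge of a scan need a white border added first.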

In [None]:
## This is the main for loop. It goes through all of the files found above.
## A line is printed for each file as it starts processing so you can keep track.
## It should only take a few minutes to run completely. 
for file in files:
    print('Processing: ' + file)
    I=cv2.imread(file)
    b,g,r=cv2.split(I)
    I2=cv2.merge((r,g,b))

## This section can be used to change sections of the scan to all white.   
## I use it to remove my labels at the top of my scan and add a white border around the image
## A white border is necessary if your leaf regions touch the edge of your image,
## because clear_border (used below) removes any region that touches the edge
    I2[:,-100:,:]=255
    I2[:,0:100,:]=255
    I2[0:2400,0:3800,:]=255
    I2[0:2000,0:-10]=255

## This section can be used to crop sections out of your scan.
## I use it to crop out the labels at the top so they are not added as regions.
    img1=I2[2000:,:,:]

## This selects the blue channel for thresholding. 
## The channel you pick will depend on the colors in your image. 
    img=img1[:,:,2] 

## Perform threshold based on the blue channel
    thresh = threshold_otsu(img)
    bw = closing(img > thresh, square(3))
    bw=~bw  
    cleared = clear_border(bw)
    label_img = label(cleared)

## Plot the threshold figure if you want. Can be helpful for troubleshooting.
## I recommend keeping this commented out for running the full loop.
#     plt.figure()
#     plt.imshow(label_img)    
    
## For every region that's identified, this loop will save the bounding box for each region,
## name each region,
## save the approximate leaf area (in pixels) for each region,
## and save the final cropped image containing (presumably) a single leaf
## This will discard regions smaller than 10,000 pixels. The cutoff might need to be changed to fit your data.
    for region in regionprops(label_img):
        if region.area >= 10000:
            minr, minc, maxr, maxc = region.bbox
            leaf1=img1[minr:maxr,minc:maxc,:]
            total_leaf_area.append(region.area)
            leaf_name = name + '_' + str(k)
            leafID.append(leaf_name)            
            k=k+1

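## You'll need to change the path to where you want your crops to save.
## Use the full path to a folder that already exists: cv2.imwrite does not expand '~'
## and returns False instead of raising an error when it cannot write the file.
## leaf1 is in RGB order, so swap back to BGR, which is what cv2.imwrite expects.
            r,g,b = cv2.split(leaf1)
            leaf1_edit = cv2.merge((b,g,r))
            saved = cv2.imwrite('~/scanner/individual_leaves/leaves_2020_11_16/' 
                       + leaf_name + '.tif', leaf1_edit)
            if not saved:
                print('Could not save ' + leaf_name + '. Check the output path.')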

## If you want to save each leaf crop within the notebook, you can do this here.
## May slow down the script, especially if you have a lot of leaves. Mostly useful for troubleshooting.
#             leaf_img.append(leaf1)
    
## You can also view each leaf crop within the notebook. 
## Again, this can really slow down the script. 
## Really only use for troubleshooting on a few leaves, as there's limited memory for opening large figures.
#             plt.figure()
#             plt.imshow(leaf1)

### I recommend checking your output files. Occasionally, there will be small regions that aren't full leaves, and these should be manually removed before additional processing.
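
One way to shortlist crops for manual checking is to flag any region whose area is much smaller than a typical leaf. A rough sketch, assuming lists shaped like `leafID` and `total_leaf_area`. The example values and the cutoff of 25% of the median area are arbitrary assumptions, not part of the original script:

```python
from statistics import median

# Made-up IDs and pixel areas standing in for leafID and total_leaf_area
example_ids = ['2020_11_16_0', '2020_11_16_1', '2020_11_16_2', '2020_11_16_3']
example_areas = [250000, 240000, 12000, 265000]

# Flag anything below a quarter of the median area (assumed cutoff)
cutoff = 0.25 * median(example_areas)
flagged = [leaf for leaf, area in zip(example_ids, example_areas) if area < cutoff]

print(flagged)
```

Flagged crops still need to be looked at by eye; a small area can also be a genuinely small leaf.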

In [None]:
## Create a dataframe containing area information for each leaf crop
df=pd.DataFrame({'leafID':leafID})
df['leaf_area']=total_leaf_area

## Save dataframe to csv.
df.to_csv('leaf_areas.csv')

## Display dataframe. A notebook only displays the last expression in a cell, so this goes last.
df
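
The areas saved above are in pixels. If the scanner resolution is known, they can be converted to physical units. A minimal sketch, assuming a resolution of 600 dpi (an assumption; use the setting your scans were actually made at). `pixels_to_cm2` is a hypothetical helper, not part of the original script:

```python
CM_PER_INCH = 2.54

def pixels_to_cm2(area_pixels, dpi):
    """Convert an area in pixels to square centimetres for a given scan resolution."""
    pixel_side_cm = CM_PER_INCH / dpi      # width of one pixel in cm
    return area_pixels * pixel_side_cm ** 2

# 600 x 600 pixels at 600 dpi is exactly one square inch
area_cm2 = pixels_to_cm2(360000, dpi=600)
print(round(area_cm2, 4))
```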