# Instructions
The following code counts the number of cells within an image or set of images.

### A few pertinent features of the code:
This code counts cells on up to two channels, as well as cells that overlap across channels.
In addition to cell count, the locations of counted cells can be saved.
Lastly, a region of interest based upon a user-made mask can be specified and cell counting can be restricted to this region of interest.  The size of the region of interest in pixels is returned. 
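
As a rough illustration of the ROI restriction described above, the sketch below applies a binary mask to a thresholded image using only numpy. The toy arrays and the names `thresholded`, `roi_mask`, and `restrict_to_roi` are made up for this example; the notebook performs the equivalent step inside its `Count` function.

```python
import numpy as np

def restrict_to_roi(thresholded, roi_mask):
    """Clear thresholded pixels outside the ROI and return the ROI size in pixels."""
    restricted = thresholded.copy()
    restricted[roi_mask == 0] = False          # anything outside the mask is ignored
    roi_size = int(np.count_nonzero(roi_mask)) # number of pixels inside the ROI
    return restricted, roi_size

# Hypothetical 3x3 thresholded image (True = above threshold)
thresholded = np.array([[True, True, False],
                        [False, True, True],
                        [True, False, True]])

# Hypothetical ROI mask: 255 inside the region of interest, 0 elsewhere
roi_mask = np.array([[255, 255, 0],
                     [255, 255, 0],
                     [0, 0, 0]], dtype=np.uint8)

restricted, roi_size = restrict_to_roi(thresholded, roi_mask)
print(int(restricted.sum()), roi_size)
```

When no ROI is used, the notebook reports the total number of pixels in the image as the ROI size instead.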

### Folder Structure

For this code to work, files must be organized in a specific manner.
Directory_Main (the user defines its path, and the directory itself can have any name) should contain the subfolders "Ch1", "Ch2", "ROI", and "SavedOutput" (these names matter).
If you are not using ROIs, the ROI folder can be omitted.
Similarly, if only one channel is being examined, the unused channel folder (Ch1 or Ch2) can be omitted.
The Ch1 and Ch2 folders should contain .tif images with matching file names, though a suffix identifying the channel (e.g. "_Ch1") is fine.  This is required so that the images for each channel of a multichannel image can be matched.
ROI should contain .tif images with ROI masks, described below.  These should be named similarly to files in Ch1 and Ch2.
Example filenames under Ch1, Ch2, and ROI: "Mouse1_Image1_Ch1.tif", "Mouse1_Image1_Ch2.tif", "Mouse1_Image1_ROI.tif"
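
Because the notebook pairs images by their position in the sorted file lists, it can help to confirm that the lists line up before counting. The following is a minimal sketch of such a check, assuming the `_Ch1`, `_Ch2`, and `_ROI` suffix convention from the example filenames; the helpers `base_name` and `files_line_up` are hypothetical and not part of the notebook.

```python
import os

def base_name(filename, suffixes=("_Ch1", "_Ch2", "_ROI")):
    """Strip the extension and a trailing channel/ROI suffix (assumed convention)."""
    stem, _ = os.path.splitext(filename)
    for suffix in suffixes:
        if stem.endswith(suffix):
            return stem[: -len(suffix)]
    return stem

def files_line_up(*file_lists):
    """True if every sorted list yields the same sequence of base names."""
    keys = [[base_name(f) for f in sorted(files)] for files in file_lists]
    return all(k == keys[0] for k in keys)

ch1 = ["Mouse1_Image1_Ch1.tif", "Mouse1_Image2_Ch1.tif"]
ch2 = ["Mouse1_Image2_Ch2.tif", "Mouse1_Image1_Ch2.tif"]
roi = ["Mouse1_Image1_ROI.tif"]  # one mask is missing in this toy example

print(files_line_up(ch1, ch2))  # same base names once sorted
print(files_line_up(ch1, roi))  # lists differ, so pairing by index would be wrong
```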

### Making ROIs
Regions of interest are best drawn using ImageJ.  Binary masks are created such that pixels inside the region of interest have a value of 255, whereas all other pixels have a value of 0.

1 - Open the image in ImageJ

2 - Use your preferred selection tool (I like the freehand selection tool) to outline the region of interest

3 - Go to Edit > Selection > Create Mask

4 - Save the mask as a .tif file
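
A saved mask can be sanity-checked before use. The sketch below, which assumes the mask has already been loaded into a numpy array (for instance with `mh.imread`, as the notebook does), tests that it is two-dimensional and contains only the values 0 and 255. The function name `is_binary_mask` is hypothetical.

```python
import numpy as np

def is_binary_mask(mask):
    """True if the array is 2-D and contains no values other than 0 and 255."""
    mask = np.asarray(mask)
    return mask.ndim == 2 and bool(np.isin(mask, (0, 255)).all())

# Toy mask with a 2x2 region of interest
good = np.zeros((4, 4), dtype=np.uint8)
good[1:3, 1:3] = 255

# Same mask with one stray intermediate value
bad = good.copy()
bad[0, 0] = 128

print(is_binary_mask(good), is_binary_mask(bad))
```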

### Requirements
The following will need to be installed in your Conda environment:

1 - python (3.6.5)

2 - jupyter

3 - imread

4 - mahotas(1.4.4)

5 - numpy(1.14.3)

6 - pandas(0.23.0)

7 - matplotlib(2.2.2). 

The following commands can be executed in your terminal to create the environment: 

conda config --add channels conda-forge

conda create -n EnvironmentName python=3.6.5 mahotas=1.4.4 pandas=0.23.0 matplotlib=2.2.2 jupyter imread

Different versions of these packages have not been tested.  Single-channel .tif images are required (some image acquisition software exports single-channel images with more than 2 dimensions; if this is the case, images must first be converted to 2 dimensions).  Images that are not already 8-bit are converted to 8-bit.  However, the converted images are not necessarily on a scale from 0-255.  It is therefore advised that all images be set to 8-bit before running this code.  If this was not done, be sure that the composite image used during optimization is of exactly the same form.  Additionally, cell diameter must be measured in pixel units.  To ensure files are in pixel scale, in ImageJ go to Analyze > Set Scale > Click to Remove Scale > OK.
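
To see why converting to 8-bit beforehand is advised, consider what a plain numpy cast does to a 16-bit integer image: only the low 8 bits of each value are kept. The sketch below contrasts this with a simple linear rescale. The function `rescale_to_uint8` is a hypothetical helper, not something the notebook provides, and min-max scaling is only one possible choice.

```python
import numpy as np

def rescale_to_uint8(image):
    """Linearly map the image's own min..max range onto 0..255."""
    image = np.asarray(image, dtype=np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # A flat image carries no contrast; return all zeros
        return np.zeros(image.shape, dtype=np.uint8)
    scaled = (image - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)

# Toy 12-bit data stored in a 16-bit array
img16 = np.array([[0, 256], [1000, 4095]], dtype=np.uint16)

wrapped = img16.astype(np.uint8)    # plain cast keeps only the low 8 bits
rescaled = rescale_to_uint8(img16)  # preserves relative intensities

print(wrapped)
print(rescaled)
```

Because this rescale stretches each image by its own range, a fixed threshold may not mean the same thing across images. Converting every image to 8-bit the same way in ImageJ or the acquisition software avoids that problem.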

### What type of cells can this count?
Any roughly circular cell that is more or less filled by the fluorescent label can be counted.  If the cells are only visible by their perimeter, then this code as is may not work.  In this case, a process should be applied to either a) fill the interior of the cell after thresholding or b) use smoothing or morphological opening to expand the labeled interior of the cell.  Although watershed can separate cells to some extent, a large degree of overlap cannot be overcome.
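
Option (a) can be illustrated with a small flood fill: any background pixel that cannot be reached from the image border is treated as the inside of a cell and filled. This is a pure-Python sketch written for clarity, not speed, and it is not part of the notebook; the name `fill_holes` is hypothetical, and an image-processing library routine would be the practical choice on real images.

```python
from collections import deque
import numpy as np

def fill_holes(binary):
    """Fill background regions not connected to the image border (4-connectivity)."""
    binary = np.asarray(binary, dtype=bool)
    rows, cols = binary.shape
    outside = np.zeros_like(binary)
    queue = deque()
    # Seed the flood fill with every background pixel on the border
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and not binary[r, c]:
                outside[r, c] = True
                queue.append((r, c))
    # Spread through background pixels reachable from the border
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                if not binary[nr, nc] and not outside[nr, nc]:
                    outside[nr, nc] = True
                    queue.append((nr, nc))
    # Everything that is not outside is either cell or enclosed hole
    return ~outside

ring = np.zeros((5, 5), dtype=bool)
ring[1:4, 1:4] = True
ring[2, 2] = False  # hollow centre, as with a perimeter-only label

filled = fill_holes(ring)
print(int(ring.sum()), int(filled.sum()))
```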

## Setting parameters:
All parameters that the user must set are in the first cell of code.  
1 - Cell Diameter:  The average cell diameter in pixel units.  Can be obtained easily with the ImageJ measurement tool.
2 - Threshold:  To be obtained using the optimization procedure or by setting by eye (not advised).
3 - Particle Minimum:  After thresholding, this serves to erase excessively small points.  It is the permitted proportion of average cell size.  Average cell size is assumed to be the square of the diameter as a rough approximation.
4 - Use ROI:  Are you using an ROI or analyzing the entire image?
5 - Use Watershed:  The watershed procedure attempts to separate adjoining cells.
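
To make the size cutoffs concrete, the short sketch below works through the arithmetic using the example values from this notebook's parameter cell (diameters of 10 and 11 pixels, a particle minimum of 0.2, and an overlap minimum of 0.5). The helper `min_area` is hypothetical and exists only for this illustration.

```python
def min_area(cell_diam, proportion):
    """Smallest area in pixels that is kept, taking cell area as diameter squared."""
    return cell_diam * cell_diam * proportion

ch1_diam, ch2_diam = 10, 11
particle_minimum = 0.2
overlap_minimum = 0.5

# Particles smaller than these areas are erased after thresholding
print(min_area(ch1_diam, particle_minimum))
print(min_area(ch2_diam, particle_minimum))

# Overlap regions are judged against the channel with the smaller diameter
print(min_area(min(ch1_diam, ch2_diam), overlap_minimum))
```

With these values, a Ch1 particle needs roughly 20 pixels to survive, a Ch2 particle roughly 24, and an overlap region roughly 50.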

### Viewing an example of how images are being processed
To view what is being done to a single image on a single channel, go to the cell entitled "Display Example Process for One Channel".  To run that cell, only the cells up to and including 'Load Functions' must be run, because it calls the `Count` function defined there.


# Specify Parameters and Options for Running
### This should be the only part of the code that needs to be changed

In [None]:
#Define directory of working environment (Directory_Main)
Directory_Main = "/Users/ZP/Images"

#Set Parameters
Ch1_CellDiam = 10 #Average Cell Diameter in pixel units.  Must be integer value.  
Ch2_CellDiam = 11 #Average Cell Diameter in pixel units.  Must be integer value.  
Ch1_Thresh = 70 #Set threshold to integer value based upon CellCounter_Optimization.ipynb
Ch2_Thresh = 19 #Set threshold to integer value based upon CellCounter_Optimization.ipynb
particle_minimum = 0.2 #After thresholding this serves to erase excessively small points.  Proportion of average cell size permitted.  Average cell size is assumed to be square of diameter for rough approximation.  0.2 has worked well for me
overlap_minimum = 0.5 #When measuring overlap of two channels, this is the minimum amount of overlap permitted, as a proportion of average cell size for channel with smaller diameter cells.  Average cell size is assumed to be square of diameter for rough approximation.

#Options (set to True/False)
UseROI = False #Only count cells within a region of interest?
Ch1_UseWatershed = True #Perform watershed for Ch1?
Ch2_UseWatershed = True #Perform watershed for Ch2?

# Load Necessary Packages

In [None]:
#Note that this code was written in python 3.6.5
import pylab 
import os
import fnmatch
import imread
import numpy as np
import mahotas as mh 
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

# Get Directory Information

In [None]:
#Define existing subdirectories
Directory_Ch1 = Directory_Main + "/" + "Ch1"
Directory_Ch2 = Directory_Main + "/" + "Ch2"
Directory_ROI = Directory_Main + "/" + "ROI"
Directory_Output = Directory_Main + "/" + "SavedOutput"


#Get filenames and create output subdirectories based upon usage
if os.path.isdir(Directory_Ch1):
    FileNames_Ch1 = sorted(os.listdir(Directory_Ch1))
    FileNames_Ch1 = fnmatch.filter(FileNames_Ch1, '*.tif') #restrict files to .tif images
    Directory_Output_Ch1 = Directory_Output + "/" + "Ch1"
    try:
        os.mkdir(Directory_Output_Ch1)
    except FileExistsError: 
        pass
    
if os.path.isdir(Directory_Ch2):
    FileNames_Ch2 = sorted(os.listdir(Directory_Ch2))
    FileNames_Ch2 = fnmatch.filter(FileNames_Ch2, '*.tif') #restrict files to .tif images
    Directory_Output_Ch2 = Directory_Output + "/" + "Ch2"
    try:
        os.mkdir(Directory_Output_Ch2)
    except FileExistsError: 
        pass
    
if os.path.isdir(Directory_ROI):
    FileNames_ROI = sorted(os.listdir(Directory_ROI))
    FileNames_ROI = fnmatch.filter(FileNames_ROI, '*.tif') #restrict files to .tif images
    
if os.path.isdir(Directory_Ch1) and os.path.isdir(Directory_Ch2):
    Directory_Output_Merge = Directory_Output + "/" + "Merge"
    try:
        os.mkdir(Directory_Output_Merge) 
    except FileExistsError:
        pass

# Load Functions

## Function to Count Cells for One Channel

In [None]:
#Function to Count Cells
def Count(CellDiam,Thresh,Directory_Current,FileNames_Current,UseWatershed,x):
    
    #Load file
    Image_Current_File = Directory_Current + "/" + FileNames_Current[x] #Set directory location
    Image_Current_Gray = mh.imread(Image_Current_File,as_grey=True) #Load File as greyscale image
    print("Processing: " + FileNames_Current[x])

    #Convert image to uint8 if it isn't already
    if Image_Current_Gray.dtype!="uint8": 
        Image_Current_Gray=Image_Current_Gray.astype('uint8')
        print("Warning: Converted "+FileNames_Current[x]+" to uint8.  See notes in instructions")

    #Subtract Background
    Image_Current_BG = mh.gaussian_filter(Image_Current_Gray,sigma=CellDiam*3) #Determine regional background
    Image_Current_BG = Image_Current_Gray - Image_Current_BG #Subtract background from original image
    Image_Current_BG[Image_Current_BG<0] = 0 #Set negative values = 0

    #Apply Gaussian Filter to Image
    Image_Current_Gaussian = mh.gaussian_filter(Image_Current_BG,sigma=CellDiam/12) #Currently, cell diameter is 12 standard deviations

    #Threshold image 
    Image_Current_T = Image_Current_Gaussian > Thresh #Threshold image

    #Erase any particles that are below the minimum particle size
    labeled,nr_objects = mh.label(Image_Current_T) #label particles in Image_Current_T
    sizes = mh.labeled.labeled_size(labeled) #get list of particle sizes in Image_Current_T
    too_small = np.where(sizes < (CellDiam*CellDiam*particle_minimum)) #get list of particle sizes that are too small
    labeled = mh.labeled.remove_regions(labeled, too_small) #remove particle sizes that are too small for labeling of Image_Current_T
    Image_Current_T = labeled != 0 #reconstitute Image_Current_T with particles removed

    #Get ROI and apply to thresholded image
    if UseROI:
        ROI_Current_File = Directory_ROI + "/" + FileNames_ROI[x] #Set directory location of ROI file
        ROI_Current = mh.imread(ROI_Current_File,as_grey=True) #Load File
        Image_Current_T[ROI_Current==0]=0 #Set values of thresholded image outside of ROI to 0
        roi_size = np.count_nonzero(ROI_Current)
    else:
        roi_size = Image_Current_Gray.size

    #Count cells and find locations
    if UseWatershed:

        #If there are pixels above the threshold proceed to watershed
        if Image_Current_T.max() == True:

            #Create distance transform from thresholded image to help identify cell seeds
            Image_Current_Tdist = mh.distance(Image_Current_T)

            #Define Sure Background for watershed
            #Background is dilated proportional to cell diam.  Allows final cell sizes to be a bit larger at end.  Will not affect cell number but can influence overlap
            #See https://docs.opencv.org/3.4/d3/db4/tutorial_py_watershed.html for tutorial that helps explain this
            Dilate_Iterations = int(CellDiam//2) 
            Dilate_bc = np.ones((3,3)) #Use square structuring element instead of cross
            Image_Current_SureBackground = Image_Current_T
            for j in range (Dilate_Iterations): 
                Image_Current_SureBackground = mh.dilate(Image_Current_SureBackground,Bc=Dilate_bc)

            #Create seeds/foreground for watershed
            #See https://docs.opencv.org/3.4/d3/db4/tutorial_py_watershed.html for tutorial that helps explain this
            Regmax_bc = np.ones((CellDiam,CellDiam)) #Define structuring element for regional maximum function.  Currently uses diameter
            Image_Current_Seeds = mh.regmax(Image_Current_Tdist,Bc=Regmax_bc) #Find regional maxima of distance transform
            seeds,nr_nuclei = mh.label(Image_Current_Seeds)

            #Define unknown region between sure foreground (the seeds) and sure background 
            Image_Current_Unknown = Image_Current_SureBackground.astype(int) - Image_Current_Seeds.astype(int)

            #Modify seeds to differentiate between background and unknown regions
            seeds+=1
            seeds[Image_Current_Unknown==1]=0

            #Perform watershed
            Image_Current_Watershed = mh.cwatershed(surface=Image_Current_SureBackground,markers=seeds)
            Image_Current_Watershed -= 1 #Done so that background value is equal to 0.
            Image_Current_Cells = Image_Current_Watershed

        #If there are no pixels above the threshold watershed procedure has issues.  Set cell count to 0.
        elif Image_Current_T.max() == False:
            Image_Current_Cells = Image_Current_T.astype(int)
            nr_nuclei = 0

    else:
        #Method for counting cells when watershed is not selected
        Image_Current_Cells, nr_nuclei = mh.label(Image_Current_T)

        
    return Image_Current_Cells, nr_nuclei, roi_size, Image_Current_Gray, Image_Current_Gaussian, Image_Current_T


#Function to count merged cells
def Merge(image,Smaller_CellDiam):  
   
    #Load Images
    #Index with the function argument "image" rather than relying on a global loop variable
    Image_Current_Gray_Ch1 = mh.imread((Directory_Output_Ch1 + "/" + Ch1_Thresh_Files[image]), as_grey=True) #Load File as greyscale image
    Image_Current_Gray_Ch2 = mh.imread((Directory_Output_Ch2 + "/" + Ch2_Thresh_Files[image]), as_grey=True) #Load File as greyscale image    
    
    print('Processing: ' + str(Directory_Output_Ch1 + "/" + Ch1_Thresh_Files[image]))
    
    #Set Values of Cell Locations for Ch1 to 170 and Ch2 to 84
    #When two images are summed together, overlapping regions will equal 254
    #These values are initially set to 126 until they are shown to meet size requirements, and are then set to 255 
    Image_Current_Gray_Ch1[Image_Current_Gray_Ch1>0]=170
    Image_Current_Gray_Ch2[Image_Current_Gray_Ch2>0]=84
    Image_Current_Merge=Image_Current_Gray_Ch1 + Image_Current_Gray_Ch2
    Image_Current_Merge[Image_Current_Merge==254]=126
    
    #Define regions of potential overlap
    Image_Current_Overlap=(Image_Current_Merge==126).astype('uint8')
    
    #Erase any particles that are below the minimum particle size
    labeled,nr_objects = mh.label(Image_Current_Overlap) #label particles in Image_Current_Overlap
    sizes = mh.labeled.labeled_size(labeled) #get list of particle sizes in Image_Current_Overlap
    too_small = np.where(sizes < (Smaller_CellDiam*Smaller_CellDiam*overlap_minimum)) #get list of particle sizes that are too small
    labeled = mh.labeled.remove_regions(labeled, too_small) #remove particle sizes that are too small for labeling of Image_Current_Overlap
    Image_Current_Overlap = labeled != 0 #reconstitute Image_Current_Overlap with particles removed
    
    #Count overlapping cells and add to count array
    Image_Current_Cells, nr_nuclei = mh.label(Image_Current_Overlap)
    
    #Finalize Image representing cell locations by defining location of overlapping cells
    Image_Current_Merge[Image_Current_Cells>0]=255
    
    #Save tif representing cell locations
    mh.imsave(
        filename = Directory_Output_Merge + "/" + Ch1_Thresh_Files[image] + "_merge.tif",
        array = Image_Current_Merge.astype(np.uint8)
    )

    return(nr_nuclei)
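
The value coding used in `Merge` can be tried on toy arrays. In the sketch below, which uses made-up 2x3 label images rather than real output files, cells on Ch1 are set to 170 and cells on Ch2 to 84, so that their sum is 254 only where both channels contain a cell. The size filtering and the later relabeling to 126 and 255 are left out for brevity.

```python
import numpy as np

# Hypothetical labeled cell images, 0 = background
ch1 = np.array([[1, 1, 0],
                [0, 2, 0]], dtype=np.uint8)
ch2 = np.array([[0, 3, 3],
                [0, 4, 0]], dtype=np.uint8)

# Recode every cell pixel to a fixed value per channel
ch1_coded = np.where(ch1 > 0, 170, 0).astype(np.uint8)
ch2_coded = np.where(ch2 > 0, 84, 0).astype(np.uint8)

# 170 + 84 = 254 where both channels have a cell; this fits in uint8
merged = ch1_coded + ch2_coded

overlap = merged == 254
print(merged)
print(int(overlap.sum()))
```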

# Display Example Process for One Channel
This cell displays the processing results for a single image: the original image, the background-subtracted and smoothed image, the thresholded image, and the final cell locations.  Set `Channel` and `x` in the cell below to choose which image is displayed.  It is primarily used for troubleshooting.  The ROI itself cannot be viewed this way.

In [None]:
#Specify Image to Look at Below
Channel = "Ch2" #Specify Ch1 or Ch2
x = 1 #Specify index in folder.  0 by default

#Set function parameters in accordance with channel to be counted
if Channel == "Ch1":
    CellDiam = Ch1_CellDiam
    Thresh = Ch1_Thresh
    Directory_Current = Directory_Ch1
    FileNames_Current = FileNames_Ch1
    UseWatershed = Ch1_UseWatershed 
elif Channel == "Ch2":
    CellDiam = Ch2_CellDiam
    Thresh = Ch2_Thresh
    Directory_Current = Directory_Ch2
    FileNames_Current = FileNames_Ch2
    UseWatershed = Ch2_UseWatershed

#call function to count cells
Image_Current_Cells, nr_nuclei, roi_size, Image_Current_Gray, Image_Current_Gaussian, Image_Current_T = Count(CellDiam,Thresh,Directory_Current,FileNames_Current,UseWatershed,x)

#Set colormap to grayscale for plotting
plt.gray()

#Display Image
plt.figure(figsize=(20,20))
plt.subplot(2,2,1)
plt.title('Original Image')
plt.imshow(Image_Current_Gray)
plt.subplot (2,2,2)
plt.title('Preprocessed Image')
plt.imshow(Image_Current_Gaussian)
plt.subplot(2,2,3)
plt.title('Image Threshold')
plt.imshow(Image_Current_T)
plt.subplot(2,2,4)
plt.title('Cell Locations')
plt.jet()
plt.imshow(Image_Current_Cells*(255/max(Image_Current_Cells.max(),1))) #max(...,1) avoids division by zero when no cells are found


# Count Channel 1

In [None]:
#Initialize arrays to store data in
COUNTS = []
ROI_SIZE = []

#Loop through images and count cells
for x in range (len(FileNames_Ch1)):

    #call function to count cells
    Image_Current_Cells, nr_nuclei, roi_size, Image_Current_Gray, Image_Current_Gaussian, Image_Current_T = Count(Ch1_CellDiam,Ch1_Thresh,Directory_Ch1,FileNames_Ch1,Ch1_UseWatershed,x)
    #store summary data
    COUNTS.append(nr_nuclei)
    ROI_SIZE.append(roi_size)
    #save image of cell locations
    mh.imsave(
        filename = Directory_Output_Ch1 + "/" + FileNames_Ch1[x] + "_counts.tif",
        array = Image_Current_Cells.astype(np.uint8)
    )

#Save data to disk
DataFrame = pd.DataFrame(
{'FileNames_Ch1': FileNames_Ch1,
 'Thresh' : np.ones(len(FileNames_Ch1))*Ch1_Thresh,
 'UseROI' : np.ones(len(FileNames_Ch1))*UseROI,
 'AvgCellDiam' : np.ones(len(FileNames_Ch1))*Ch1_CellDiam,
 'ParticleMin' : np.ones(len(FileNames_Ch1))*particle_minimum,
 'Ch1_Counts': COUNTS,
 'Ch1_ROI_Size': ROI_SIZE
})
DataFrame.to_csv(Directory_Output + "/" + "Ch1_Counts.csv")


# Count Channel 2

In [None]:
#Initialize arrays to store data in
COUNTS = []
ROI_SIZE = []

#Loop through images and count cells
for x in range (len(FileNames_Ch2)):

    #call function to count cells
    Image_Current_Cells, nr_nuclei, roi_size, Image_Current_Gray, Image_Current_Gaussian, Image_Current_T = Count(Ch2_CellDiam,Ch2_Thresh,Directory_Ch2,FileNames_Ch2,Ch2_UseWatershed,x)
    #store summary data
    COUNTS.append(nr_nuclei)
    ROI_SIZE.append(roi_size)
    #save image of cell locations
    mh.imsave(
        filename = Directory_Output_Ch2 + "/" + FileNames_Ch2[x] + "_counts.tif",
        array = Image_Current_Cells.astype(np.uint8)
    )

#Save data to disk
DataFrame = pd.DataFrame(
{'FileNames_Ch2': FileNames_Ch2,
 'Thresh' : np.ones(len(FileNames_Ch2))*Ch2_Thresh,
 'UseROI' : np.ones(len(FileNames_Ch2))*UseROI,
 'AvgCellDiam' : np.ones(len(FileNames_Ch2))*Ch2_CellDiam,
 'ParticleMin' : np.ones(len(FileNames_Ch2))*particle_minimum,
 'Ch2_Counts': COUNTS,
 'Ch2_ROI_Size': ROI_SIZE
})
DataFrame.to_csv(Directory_Output + "/" + "Ch2_Counts.csv")

# Count Overlapping Cells from Channels 1 and 2

In [None]:
if len(os.listdir(Directory_Output_Ch1))>0 and len(os.listdir(Directory_Output_Ch2))>0:
    
    #Get list of Files to operate on
    Ch1_Thresh_Files = sorted(os.listdir(Directory_Output_Ch1))
    Ch2_Thresh_Files = sorted(os.listdir(Directory_Output_Ch2))
    
    #Restrict files in filelists to .tif files
    Ch1_Thresh_Files = fnmatch.filter(Ch1_Thresh_Files, '*.tif')
    Ch2_Thresh_Files = fnmatch.filter(Ch2_Thresh_Files, '*.tif')
    
    #Define smaller of two cells
    if Ch1_CellDiam < Ch2_CellDiam:
        Smaller_CellDiam = Ch1_CellDiam
    else:
        Smaller_CellDiam = Ch2_CellDiam
    
    #Check to make sure number of files for Ch1 and Ch2 match
    if len(Ch1_Thresh_Files) != len(Ch2_Thresh_Files):
        print('Different number of images detected for Ch1 and Ch2.  Aborting Count.')
    
    else:
       
        #Initialize arrays to store data in
        COUNTS = []
        
        #Loop through files to identify overlapping cells
        for x in range (len(Ch1_Thresh_Files)):
            COUNTS.append(Merge(x,Smaller_CellDiam))
        
        #Save count summary to disk
        DataFrame = pd.DataFrame(
            {'FileNames_Ch1': Ch1_Thresh_Files,
             'Ch2_Merge_Counts': COUNTS
            })
        DataFrame.to_csv(Directory_Output + "/" + "Merge_Counts.csv")

else:
    if len(os.listdir(Directory_Output_Ch1))==0:
        print('Ch1 must be counted before attempting to examine cell overlap')
    if len(os.listdir(Directory_Output_Ch2))==0:
        print('Ch2 must be counted before attempting to examine cell overlap')
    
    