# OGD Severity Analysis Workflow

### Purpose: To provide a workflow for Nance Lab UGs to analyze images from the OGD Severity Paper

Created by: Hawley Helmbrecht

Creation Date: 11/16/2020

Last Update: 12/4/2020 by Hawley Helmbrecht

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/Nance-Lab/textile/master?filepath=modules%2FOGD_Severity_UG_Flow.ipynb)

*Step 1: Import Necessary Packages*

In [97]:
import os

import numpy as np
import pandas as pd
from scipy import ndimage

import skimage.filters
from skimage import morphology
from skimage.measure import label, regionprops, regionprops_table
from skimage.color import label2rgb
from skimage import io
from skimage import measure 

import wget
from urllib.error import HTTPError

*Step 2: Define User Inputs*

For this workflow you generate the CSVs of image features folder by folder. For each folder in the zipped file sent to you, copy that folder's location, as a string, into the folder_location variable. Remember that the provided formatting is for a Mac computer; adjust your folder location for your operating system, as we learned over the summer.

Then set the csv_name variable to the name you would like the CSV to be saved as. I have used the name of the treatment group.

You will not need to adjust the file_type_init and file_type_new variables
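
As a minimal sketch of the operating-system point above, pathlib from the Python standard library can build the folder location so that the same code works on Mac, Windows, and Linux. The 'ogdimages' folder inside Downloads is only an illustrative assumption; substitute your own folder.

```python
from pathlib import Path

# Hypothetical location: an 'ogdimages' folder inside the user's Downloads directory.
# Path.home() resolves to the current user's home folder on any operating system.
folder_location = Path.home() / 'Downloads' / 'ogdimages'

# str() gives the plain string form expected by functions such as os.listdir
folder_location = str(folder_location)
print(folder_location)
```

Building the path from pieces avoids typing forward or backward slashes by hand, which is where most cross-platform path mistakes come from.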

In [3]:
#replace the example path from my computer with the path to the image on your computer

zenodo_url = 'https://zenodo.org/record/4302580/files/'

Note: The cell images being read in are from the OGD Severity study by Rick Liao, Andrea Joseph, Mengying Zhang, Mike McKenna, Jeremy Filteau, and Hawley Helmbrecht within the Nance lab. It is an image taken from the thalamus with a costain DAPI/PI/Iba

In [4]:
folder_location = ''

csv_name = 'xyz.csv'

file_type_init = '.tif'
file_type_new = '.png'

*Step X: Defining treatment groups by slice ID*

In [5]:
nontreated_control = ['4-50-4', '4-50-7', '4-50-10', '4-50-15']
ogd05hrs = ['4-56-1', '4-56-2', '4-56-3', '4-56-4', '4-56-5']
ogd15hrs = ['4-56-6', '4-56-7', '4-56-8', '4-56-9', '4-56-10']
ogd3hrs = ['4-50-1', '4-50-5', '4-50-6', '4-50-12', '4-50-14']
ogd15hrsAZO = ['4-59-1', '4-59-2', '4-59-3', '4-59-4']
ogd3hrsSOD = ['4-50-2', '4-50-8', '4-50-9', '4-50-11', '4-50-13']


treatment_groups = [nontreated_control, ogd05hrs, ogd15hrs, ogd3hrs, ogd15hrsAZO, ogd3hrsSOD]

zoom = '40x'

regions_list = ['cortex', 'hippocampus', 'thalamus']

image_number = np.arange(1,6)

*Step X: Creating a File List from treatment groups and pup numbers*

In [6]:
ntc_file_list = []
ogd05hr_file_list = []
ogd15hr_file_list = []
ogd3hr_file_list = []
ogd15hrsAZO_file_list = []
ogd3hrsSOD_file_list = []


for slices in nontreated_control:
    for regions in regions_list:
        for count in image_number:
            ntc_file_name = str(slices + '_' + zoom + '_' + regions + '_' +  str(count) + '.tif')
            ntc_file_list.append(ntc_file_name)
            
for slices in ogd05hrs:
    for regions in regions_list:
        for count in image_number:
            ogd05hr_file_name = str(slices + '_' + zoom + '_' + regions + '_' +  str(count) + '.tif')
            ogd05hr_file_list.append(ogd05hr_file_name)
            
for slices in ogd15hrs:
    for regions in regions_list:
        for count in image_number:
            ogd15hr_file_name = str(slices + '_' + zoom + '_' + regions + '_' +  str(count) + '.tif')
            ogd15hr_file_list.append(ogd15hr_file_name)
            
for slices in ogd3hrs:
    for regions in regions_list:
        for count in image_number:
            ogd3hr_file_name = str(slices + '_' + zoom + '_' + regions + '_' +  str(count) + '.tif')
            ogd3hr_file_list.append(ogd3hr_file_name)

for slices in ogd15hrsAZO:
    for regions in regions_list:
        for count in image_number:
            ogd15hrsAZO_file_name = str(slices + '_' + zoom + '_' + regions + '_' +  str(count) + '.tif')
            ogd15hrsAZO_file_list.append(ogd15hrsAZO_file_name)

for slices in ogd3hrsSOD:
    for regions in regions_list:
        for count in image_number:
            ogd3hrsSOD_file_name = str(slices + '_' + zoom + '_' + regions + '_' +  str(count) + '.tif')
            ogd3hrsSOD_file_list.append(ogd3hrsSOD_file_name)
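
The six loops above all follow the same pattern, so one possible simplification is a single helper function. This is only a sketch: build_file_list is a hypothetical name, not part of the original workflow, and it assumes every file name follows the slice_zoom_region_number.tif pattern used above.

```python
from itertools import product

def build_file_list(slice_ids, regions, image_numbers, zoom='40x', extension='.tif'):
    """Return file names of the form '<slice>_<zoom>_<region>_<number><extension>'."""
    return [
        f'{slice_id}_{zoom}_{region}_{number}{extension}'
        for slice_id, region, number in product(slice_ids, regions, image_numbers)
    ]

# Example with two of the non-treated control slices
example_list = build_file_list(['4-50-4', '4-50-7'],
                               ['cortex', 'hippocampus', 'thalamus'],
                               range(1, 6))
print(len(example_list))   # 2 slices x 3 regions x 5 images = 30
print(example_list[0])     # 4-50-4_40x_cortex_1.tif
```

product iterates in the same order as the nested for loops (slices outermost, image numbers innermost), so the resulting lists match the ones built above.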

In [7]:
treatment_groups_file_lists = [ntc_file_list, ogd05hr_file_list, ogd15hr_file_list, ogd3hr_file_list, ogd15hrsAZO_file_list, ogd3hrsSOD_file_list]

# Only run the following cell if you have not already downloaded all the files and placed them in the 'ogdimages' folder in this directory

In [None]:
for file_lists in treatment_groups_file_lists:
    for names in file_lists:
        try:
            # Files are saved to the current directory
            wget.download(zenodo_url + names + '?download=1', './' + names)
        except HTTPError:
            # Skip any file that cannot be downloaded from Zenodo
            continue
    print('treatment group download finished')
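
Because the loop above skips failed downloads silently, it can help to record which files were actually retrieved. The sketch below is one way to do that under stated assumptions: download_group and fake_fetch are hypothetical names, and the fetch function is passed in so the example runs without a network connection.

```python
from urllib.error import HTTPError

def download_group(file_names, fetch):
    """Try to fetch every file; return (downloaded, missing) lists of names.

    `fetch` is any callable that takes a file name and raises HTTPError on failure,
    for example: lambda name: wget.download(zenodo_url + name + '?download=1', './' + name)
    """
    downloaded, missing = [], []
    for name in file_names:
        try:
            fetch(name)
            downloaded.append(name)
        except HTTPError:
            missing.append(name)
    return downloaded, missing

# Stand-in fetch function for illustration only: pretends one file is absent
def fake_fetch(name):
    if name == 'missing.tif':
        raise HTTPError(url=name, code=404, msg='Not Found', hdrs=None, fp=None)

downloaded, missing = download_group(['a.tif', 'missing.tif', 'b.tif'], fake_fetch)
print(downloaded)  # ['a.tif', 'b.tif']
print(missing)     # ['missing.tif']
```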

*Step 3: The Code*

# Defines a function that cleans the folder input of any files except the ones of the type we want

def folder_cleaner(folder, image_type):
    for files in folder:
        # Drop every entry whose name does not contain the wanted file type
        if image_type not in str(files):
            folder = np.delete(folder, np.argwhere(folder == str(files)))
    return folder

# Obtains the list of image names that the processing needs to be performed on

arr = os.listdir(folder_location)
file_list = np.asarray(arr)
file_list = folder_cleaner(file_list, file_type_init)
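
The same filtering can also be sketched with a plain list comprehension. list_images is a hypothetical helper, and it differs slightly from folder_cleaner: it keeps only names that end with the file type, rather than names that contain it anywhere. The temporary folder is there only so the example runs on its own.

```python
import os
import tempfile

def list_images(folder, image_type='.tif'):
    """Return the sorted names of files in `folder` that end with `image_type`."""
    return sorted(name for name in os.listdir(folder) if name.endswith(image_type))

# Demonstration on a temporary folder containing a mix of file types
with tempfile.TemporaryDirectory() as tmp:
    for name in ['a.tif', 'b.tif', 'notes.txt', '.DS_Store']:
        open(os.path.join(tmp, name), 'w').close()
    image_names = list_images(tmp)

print(image_names)  # ['a.tif', 'b.tif']
```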

*Step 4: Getting Shape Features for our Images*

Here you will need to add your specific shape features to the properties_list variable, which is passed as the properties argument when props is computed. Make sure each name is written exactly as shown in the documentation for regionprops, as strings separated by commas.

# If you have already downloaded the images, place them in a folder named ogdimages within your Downloads directory

In [15]:
properties_list = ('area', 'bbox', 'bbox_area', 'centroid', 'convex_area', 'eccentricity', 'equivalent_diameter', 'euler_number', 'extent', 'filled_area', 'label', 'major_axis_length', 'minor_axis_length', 'moments', 'moments_central', 'moments_hu', 'moments_normalized', 'orientation', 'perimeter', 'solidity')


In [39]:
j = 0
for file_lists in treatment_groups_file_lists:
    for names in file_lists:
        
        try:
            cell_im_location = str('./ogdimages/' + names)
            cell_im = io.imread(cell_im_location)
            blue_cell_im = cell_im[:,:, 1]
            green_cell_im = cell_im[:,:,0]
            thresh_otsu = skimage.filters.threshold_otsu(green_cell_im)
            binary_otsu = green_cell_im > thresh_otsu
            new_binary_otsu = morphology.remove_small_objects(binary_otsu, min_size=64)
            new_binary_otsu = ndimage.binary_fill_holes(new_binary_otsu)
            label_image = label(new_binary_otsu)
            image_label_overlay = label2rgb(label_image, image=new_binary_otsu, bg_label=0)

            #Insert your features here where the current red features are. You may want more features than what I told you to explore. 
            #Feel free to add them here as well. The computational time is pretty efficient
            props = measure.regionprops_table(label_image, properties=(properties_list))

            if j == 0:
                df = pd.DataFrame(props)
                df['filename'] = names
            else:
                df2 = pd.DataFrame(props)
                df2['filename'] = names
                # DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
                df = pd.concat([df, df2])
            
        except FileNotFoundError:
            continue

        j = 1
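
To make the per-image steps easier to follow, here is a self-contained sketch of the same sequence (Otsu threshold, small-object removal, hole filling, labeling, feature table) applied to a synthetic image. extract_features is a hypothetical helper, the synthetic array stands in for one channel of a real image, and only three properties are requested to keep the output short.

```python
import numpy as np
import pandas as pd
import skimage.filters
from scipy import ndimage
from skimage import morphology
from skimage.measure import label, regionprops_table

def extract_features(channel, filename, properties=('label', 'area', 'eccentricity')):
    """Threshold one image channel and return one row of shape features per object."""
    binary = channel > skimage.filters.threshold_otsu(channel)
    binary = morphology.remove_small_objects(binary, min_size=64)
    binary = ndimage.binary_fill_holes(binary)
    table = pd.DataFrame(regionprops_table(label(binary), properties=properties))
    table['filename'] = filename
    return table

# Synthetic stand-in for one channel: a 10x10 "cell" with a one-pixel hole,
# plus a 3x3 speck that is too small to keep
channel = np.zeros((64, 64), dtype=np.uint8)
channel[10:20, 10:20] = 200
channel[15, 15] = 0
channel[40:43, 40:43] = 200

# Collect one DataFrame per image, then combine them once at the end
frames = [extract_features(channel, 'synthetic_1.tif'),
          extract_features(channel, 'synthetic_2.tif')]
features = pd.concat(frames, ignore_index=True)
print(features)
```

Collecting the per-image tables in a list and calling pd.concat once avoids the j counter and the separate first-iteration branch used above.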


In [77]:
#Adding in Additional metadata
df['pup_sex'] = 'male'
df['pup_age'] = 'p14'
df['pup_id'] = df.filename.str[:7]

#Getting rid of the underscore after single digit pup ids
for rows in range(df.shape[0]):
    pup_id_value = df.iloc[rows]['pup_id']
    if pup_id_value.endswith('_') == True:
        df['pup_id'] = df['pup_id'].replace([pup_id_value],pup_id_value[:6])
    else: 
        pass
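
An alternative sketch for extracting the pup id is to split each file name at its first underscore, which handles ids of any length without a row-by-row loop. The small example table is made up for illustration, but its file names follow the naming pattern used in this workflow.

```python
import pandas as pd

# Small illustrative frame using file names that follow the workflow's naming pattern
example = pd.DataFrame({'filename': ['4-50-4_40x_cortex_1.tif',
                                     '4-50-13_40x_thalamus_5.tif']})

# Everything before the first underscore is the pup id, whatever its length
example['pup_id'] = example['filename'].str.split('_').str[0]
print(example['pup_id'].tolist())  # ['4-50-4', '4-50-13']
```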

In [93]:
#Getting Treatment Groups for pup_ids associated with cell features
df['treatment_group'] = df['pup_id']
for rows in range(df.shape[0]):
    pup_id_value = df.iloc[rows]['pup_id']
    if pup_id_value in ogd05hrs:
        df['treatment_group'] = df['treatment_group'].replace([pup_id_value], 'ogd 0.5 hours')
    elif pup_id_value in ogd15hrs:
        df['treatment_group'] = df['treatment_group'].replace([pup_id_value], 'ogd 1.5 hours')
    elif pup_id_value in ogd3hrs:
        df['treatment_group'] = df['treatment_group'].replace([pup_id_value], 'ogd 3 hours')
    elif pup_id_value in ogd15hrsAZO:
        df['treatment_group'] = df['treatment_group'].replace([pup_id_value], 'ogd 1.5 hours + AZO')
    elif pup_id_value in ogd3hrsSOD:
        df['treatment_group'] = df['treatment_group'].replace([pup_id_value], 'ogd 3 hours + SOD')
    else:
        df['treatment_group'] = df['treatment_group'].replace([pup_id_value], 'non-treated control')
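
The chain of if/elif checks above can also be sketched as a dictionary lookup. The groups dictionary below is an illustrative subset rather than the full treatment lists, and id_to_group is a hypothetical name.

```python
import pandas as pd

# Group labels and a few of their pup ids; only part of each list is shown here
groups = {
    'ogd 0.5 hours': ['4-56-1', '4-56-2'],
    'ogd 1.5 hours': ['4-56-6', '4-56-7'],
    'ogd 3 hours + SOD': ['4-50-2', '4-50-8'],
}
# Invert to a pup id -> group label lookup
id_to_group = {pup_id: group_label
               for group_label, ids in groups.items()
               for pup_id in ids}

example = pd.DataFrame({'pup_id': ['4-56-1', '4-50-8', '4-50-4']})
# Ids missing from the lookup fall through to the control label, like the else branch above
example['treatment_group'] = example['pup_id'].map(id_to_group).fillna('non-treated control')
print(example['treatment_group'].tolist())
# ['ogd 0.5 hours', 'ogd 3 hours + SOD', 'non-treated control']
```

map assigns every row in one pass, so the run time no longer grows with the number of rows times the number of replacements.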

In [None]:
df

| Unnamed: 0 | area | bbox-0 | bbox-1 | bbox-2 | bbox-3 | bbox_area | centroid-0 | centroid-1 | convex_area | eccentricity | ... | moments_normalized-3-2 | moments_normalized-3-3 | orientation | perimeter | solidity | filename | pup_sex | pup_age | pup_id | treatment_group |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 82 | 0 | 59 | 14 | 69 | 140 | 5.280488 | 63.548780 | 121 | 0.824525 | ... | 0.001036 | 0.001296 | 0.193280 | 47.834524 | 0.677686 | 4-50-4_40x_cortex_1.tif | male | p14 | 4-50-4 | non-treated control |
| 1 | 120 | 0 | 94 | 15 | 106 | 180 | 7.741667 | 99.516667 | 143 | 0.639168 | ... | -0.000616 | -0.000443 | -0.341689 | 48.384776 | 0.839161 | 4-50-4_40x_cortex_1.tif | male | p14 | 4-50-4 | non-treated control |
| 2 | 73 | 0 | 246 | 9 | 259 | 117 | 3.397260 | 250.849315 | 90 | 0.742570 | ... | -0.000437 | -0.002065 | -0.999323 | 37.727922 | 0.811111 | 4-50-4_40x_cortex_1.tif | male | p14 | 4-50-4 | non-treated control |
| 3 | 76 | 4 | 7 | 14 | 19 | 120 | 9.026316 | 12.184211 | 93 | 0.699168 | ... | -0.001007 | -0.001015 | -1.203577 | 43.591883 | 0.817204 | 4-50-4_40x_cortex_1.tif | male | p14 | 4-50-4 | non-treated control |
| 4 | 81 | 8 | 21 | 19 | 33 | 132 | 13.061728 | 26.111111 | 91 | 0.525646 | ... | 0.000027 | -0.000098 | -1.247603 | 34.556349 | 0.890110 | 4-50-4_40x_cortex_1.tif | male | p14 | 4-50-4 | non-treated control |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 128 | 100 | 491 | 308 | 503 | 319 | 132 | 496.520000 | 313.190000 | 106 | 0.557040 | ... | 0.000012 | 0.000155 | 0.315146 | 36.142136 | 0.943396 | 4-50-13_40x_thalamus_5.tif | male | p14 | 4-50-13 | ogd 3 hours + SOD |
| 129 | 80 | 494 | 397 | 505 | 406 | 99 | 498.987500 | 400.900000 | 87 | 0.680200 | ... | 0.000039 | -0.000168 | -0.114205 | 33.556349 | 0.919540 | 4-50-13_40x_thalamus_5.tif | male | p14 | 4-50-13 | ogd 3 hours + SOD |
| 130 | 89 | 495 | 463 | 507 | 473 | 120 | 500.438202 | 467.460674 | 94 | 0.666259 | ... | 0.000226 | -0.000307 | -0.325529 | 34.384776 | 0.946809 | 4-50-13_40x_thalamus_5.tif | male | p14 | 4-50-13 | ogd 3 hours + SOD |
| 131 | 95 | 498 | 14 | 508 | 27 | 130 | 502.873684 | 20.010526 | 100 | 0.779132 | ... | 0.000599 | -0.000747 | -1.180466 | 36.142136 | 0.950000 | 4-50-13_40x_thalamus_5.tif | male | p14 | 4-50-13 | ogd 3 hours + SOD |


*Step 17: Saving as a CSV file*

The following code saves the pandas dataframe as a CSV that you can open with other software. It will be saved in whatever directory you opened Jupyter Notebook or JupyterLab from.

In [95]:
df.to_csv(csv_name)
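
One detail worth knowing when saving: by default to_csv also writes the row index, and pandas names that column 'Unnamed: 0' when the file is read back, which is most likely where the first column in the table above comes from. The sketch below uses a tiny made-up table and an in-memory buffer instead of a file to show the difference that index=False makes.

```python
import io
import pandas as pd

example = pd.DataFrame({'area': [82, 120]})

# Default: the row index is written as an extra, unnamed first column
with_index = pd.read_csv(io.StringIO(example.to_csv()))
# index=False: only the feature columns are written
without_index = pd.read_csv(io.StringIO(example.to_csv(index=False)))

print(list(with_index.columns))     # ['Unnamed: 0', 'area']
print(list(without_index.columns))  # ['area']
```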

Now, I want you to repeat this process for each treatment group folder. Then it is up to you to begin visualizing the features with respect to the treatment groups

I recommend doing visualization in the following process:
1. By Generalized treatment group. (NT vs Injured vs Injured with Treatment)
2. By specific treatment groups (NT, OGD 0.5h, OGD 1.5h, OGD 3h ... etc)
3. Group by generalized treatment group and region (You will need to reorganize folders for this or add some lines to the code to provide you region based on file name)
4. Generally group regions without treatment group (hippocampus,  cortex, thalamus)
5. Specific treatment groups by region

And then in whatever way your data leads you.
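
As a starting point for the groupings listed above, the sketch below adds a region column parsed from the file name and a generalized treatment group column, then summarizes one feature. The small table and the generalize helper are illustrative assumptions; the rule that a '+' in the label means "injured with treatment" is based only on the group labels used in this workflow.

```python
import pandas as pd

# Tiny illustrative table; the real df has one row per segmented cell
example = pd.DataFrame({
    'filename': ['4-50-4_40x_cortex_1.tif', '4-56-6_40x_cortex_2.tif',
                 '4-59-1_40x_thalamus_1.tif', '4-50-4_40x_thalamus_3.tif'],
    'treatment_group': ['non-treated control', 'ogd 1.5 hours',
                        'ogd 1.5 hours + AZO', 'non-treated control'],
    'area': [82, 120, 100, 90],
})

# The region is the third underscore-separated piece of the file name
example['region'] = example['filename'].str.split('_').str[2]

def generalize(group):
    """Collapse specific groups into NT / Injured / Injured with Treatment."""
    if group == 'non-treated control':
        return 'NT'
    return 'Injured with Treatment' if '+' in group else 'Injured'

example['general_group'] = example['treatment_group'].apply(generalize)

# Mean area for each generalized group and region
summary = example.groupby(['general_group', 'region'])['area'].mean()
print(summary)
```

The same groupby call works with 'treatment_group' in place of 'general_group', or with 'region' alone, which covers the other groupings in the list.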

*Step 18: Print dependencies*

In [96]:
%load_ext watermark

%watermark -v -m -p numpy,pandas,scipy,skimage,matplotlib,wget

%watermark -u -n -t -z

Python implementation: CPython
Python version       : 3.7.8
IPython version      : 5.8.0

numpy     : 1.17.2
pandas    : 0.25.1
scipy     : 1.3.1
skimage   : 0.17.2
matplotlib: 3.3.1
wget      : 3.2

Compiler    : Clang 11.0.0 
OS          : Darwin
Release     : 19.4.0
Machine     : x86_64
Processor   : i386
CPU cores   : 8
Architecture: 64bit

Last updated: Fri Dec 04 2020 14:42:10MST

