# Computation of Haralick features (texture analysis)

In this Jupyter notebook, we analyze image texture using the following Haralick features, all of which are derived from the gray-level co-occurrence matrix (GLCM):

**Angular Second Moment (ASM) / Energy:** This measures the uniformity of an image. A higher value indicates that the image has more uniform textures or constant regions.

**Contrast:** Measures the amount of local intensity variation in an image. It is the sum of co-occurrence probabilities weighted by the squared gray-level difference between the two pixels of each pair, so it is zero for a constant image and grows with sharp local transitions.

**Correlation:** Measures the linear dependency between the gray levels of neighboring pixels. Values near 1 or -1 indicate that the gray level of one pixel in a pair predicts the other well.

**Sum of Squares / Variance:** Measures the spread of gray levels around the mean of the co-occurrence matrix, with each squared deviation weighted by its co-occurrence probability.

**Inverse Difference Moment (IDM) / Homogeneity:** This measures the local homogeneity of an image. The values are high when the local textures are consistent or homogeneous.

**Sum Average:** The mean of the distribution of gray-level sums, where each sum adds the gray levels of the two pixels in a pair.

**Sum Variance:** The variance of the distribution of gray-level sums of pixel pairs.

**Sum Entropy:** The entropy (randomness) of the distribution of gray-level sums of pixel pairs.

**Entropy:** Provides a measure of the randomness or complexity in the image. Higher values indicate more complex textures.

**Difference Variance:** Represents the variance in the differences between the intensity values of pairs of pixels.

**Difference Entropy:** Measures the randomness or complexity in the differences between the intensity values of pairs of pixels.

**Informational Correlation 1 (Info Corr 1):** An entropy-based measure of the statistical dependence between the gray levels of the two pixels in a pair. It compares the joint entropy of the co-occurrence matrix with the entropy expected if the two gray levels were independent.

**Informational Correlation 2 (Info Corr 2):** A second entropy-based measure of the same dependence, scaled to lie between 0 and 1, with values near 0 indicating nearly independent gray levels.
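
To make a few of these definitions concrete, the sketch below builds a co-occurrence matrix by hand and evaluates four of the features on two toy images. It is only an illustration under stated assumptions: the helper names `glcm_horizontal` and `basic_texture_features` are ours, only horizontally adjacent pixels are considered, pairs are counted in both orders, and entropy uses a base-2 logarithm. It is not the mahotas implementation used later in this notebook.

```python
import numpy as np

def glcm_horizontal(image, levels):
    """Normalized, symmetric co-occurrence matrix for horizontally adjacent pixels."""
    glcm = np.zeros((levels, levels), dtype=float)
    left = image[:, :-1].ravel()
    right = image[:, 1:].ravel()
    # Count each pair in both orders so the matrix is symmetric
    np.add.at(glcm, (left, right), 1)
    np.add.at(glcm, (right, left), 1)
    return glcm / glcm.sum()

def basic_texture_features(p):
    """Evaluate a few Haralick-style features from a normalized GLCM p."""
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]  # skip empty cells to avoid log(0)
    return {
        "angular_second_moment": float(np.sum(p ** 2)),
        "contrast": float(np.sum((i - j) ** 2 * p)),
        "inverse_difference_moment": float(np.sum(p / (1 + (i - j) ** 2))),
        "entropy": float(-np.sum(nonzero * np.log2(nonzero))),
    }

# Two toy images with gray levels 0 and 1
flat = np.zeros((4, 4), dtype=int)               # constant image
checker = np.indices((4, 4)).sum(axis=0) % 2     # checkerboard

for name, img in [("flat", flat), ("checker", checker)]:
    print(name, basic_texture_features(glcm_horizontal(img, levels=2)))
```

For the constant image, the angular second moment is 1 and the contrast is 0. For the checkerboard, every horizontal pair differs by one gray level, so the contrast is 1, the angular second moment drops to 0.5, and the entropy is 1 bit.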

## Load libraries

In [None]:
import os
from tifffile import imread
import re
import mahotas as mh
import numpy as np
import pandas as pd

## Extract metadata from image names

We create a function to extract the metadata from the image names and index the results accordingly.

In [None]:
def extract_metadata_from_filename(filename):
    """
    Extracts Day, Region, Protein, and AnimalID information from the given filename.
    
    Parameters:
        filename (str): The filename to extract metadata from.
    
    Returns:
        tuple: (animalid, day, region, protein)
    """
    # Extract AnimalID as the first string of the filename, for example, "Td012"
    animalid = filename.split("_")[0]
    
    # Define a regular expression pattern based on your file naming convention
    pattern = r"_(\d+D)_.*_(\w+)_(\w+)_"
    match = re.search(pattern, filename)
    
    if match:
        day, region, protein = match.groups()
        return animalid, day, region, protein
    else:
        raise ValueError(f"Pattern not matched for: {filename}")

# Example usage
data_directory = "E:/Research/Stroke_PDGFRb_Reactivity/Exp4-Pdgfra-Pdgfrb/Widefield_20x_ROIs-Stacks_Pdgfra-Pdgfrb/Images_Zplane"
files = [f for f in os.listdir(data_directory) if f.endswith('.tif')]

for file in files:
    # Extract metadata
    animalid, day, region, protein = extract_metadata_from_filename(file)
    
    # Load the image stack
    image_stack = imread(os.path.join(data_directory, file))
    
    # For now, let's just print the metadata and the shape of the image stack
    print(f"Filename: {file} | AnimalID: {animalid} | Day: {day} | Region: {region} | Protein: {protein} | Image Shape: {image_stack.shape}")
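
To see how the pattern splits a name, the short check below applies the same regular expression to a made-up filename. The name `Td012_14D_Scan1_Ctx_Pdgfrb_Z1.tif` is hypothetical. It only illustrates the layout the pattern expects: a day token such as `14D`, followed later by region and protein fields and at least one more underscore. The actual files may be named differently.

```python
import re

# Same pattern as in extract_metadata_from_filename
pattern = r"_(\d+D)_.*_(\w+)_(\w+)_"

# Hypothetical filename, not taken from the dataset
example_name = "Td012_14D_Scan1_Ctx_Pdgfrb_Z1.tif"

animalid = example_name.split("_")[0]
match = re.search(pattern, example_name)
day, region, protein = match.groups()

print(animalid, day, region, protein)  # Td012 14D Ctx Pdgfrb
```

Because `\w` also matches underscores and `.*` is greedy, names with extra or missing fields can shift which tokens end up in `region` and `protein`, so it is worth spot-checking a few real filenames before running the full loop.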


## Compute the Haralick features

Next, we compute the Haralick features and store the results in a designated CSV file.

In [None]:
def compute_haralick_features(image):
    """
    Compute Haralick texture features for a 2D image using mahotas.
    
    Parameters:
        image (numpy.ndarray): 2D image.
        
    Returns:
        dict: Dictionary containing texture metrics for the image.
    """
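    # For a 2D image, mahotas returns a (4, 13) array: one row of 13 features per direction.
    # Only the first direction (features[0]) is kept below.
    features = mh.features.haralick(image)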
    feature_dict = {
        'angular_second_moment': features[0][0],
        'contrast': features[0][1],
        'correlation': features[0][2],
        'sum_of_squares': features[0][3],
        'inverse_difference_moment': features[0][4],
        'sum_average': features[0][5],
        'sum_variance': features[0][6],
        'sum_entropy': features[0][7],
        'entropy': features[0][8],
        'difference_variance': features[0][9],
        'difference_entropy': features[0][10],
        'info_corr_1': features[0][11],
        'info_corr_2': features[0][12]
    }
    return feature_dict

# List to store results
results = []

for file in files:
    # Extract metadata
    animalid, day, region, protein = extract_metadata_from_filename(file)
    
    # Load the image
    image = imread(os.path.join(data_directory, file))
    
    # Compute Haralick features for the image
    features = compute_haralick_features(image)
    
    # Store the results, flattening the features for easier dataframe manipulation
    results.append({
        'AnimalID': animalid,
        'Day': day,
        'Region': region,
        'Protein': protein,
        **features,
        'Filename': file  # Store the filename for traceability
    })
    
    print(f"Processed: {file} | AnimalID: {animalid} | Day: {day} | Region: {region} | Protein: {protein}")

# Convert results to a DataFrame for easier manipulation
df = pd.DataFrame(results)

save_path_csv = r"D:/Research/Stroke_PDGFR-B_Reactivity/Pdgfrb_Reactivity_DataAnalysis/Stroke_Pdgfrb_Reactivity/Data_Raw/Raw_Widefield_20x_ROIs_Pdgfrb_Haralick.csv"

# Save the results as CSV
df.to_csv(save_path_csv, index=False)

print(f"Haralick features results saved to {save_path_csv}")
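
The loop above passes each loaded array straight to `compute_haralick_features`, which assumes a 2D image. To our knowledge, mahotas also expects an integer dtype when it builds co-occurrence matrices. The helper below is a minimal sketch of an input check that could run before the feature computation. The name `prepare_for_haralick` and the choice to rescale non-integer images to 8-bit are our assumptions, not part of the original analysis, and rescaling changes the gray-level quantization and therefore the feature values.

```python
import numpy as np

def prepare_for_haralick(image):
    """Hypothetical helper: check dimensionality and make sure the dtype is integer."""
    image = np.asarray(image)
    if image.ndim != 2:
        raise ValueError(f"Expected a 2D image, got shape {image.shape}")
    # Integer images are returned unchanged
    if np.issubdtype(image.dtype, np.integer):
        return image
    # Assumption: non-integer images are linearly rescaled to the 0-255 range
    lo, hi = float(image.min()), float(image.max())
    if hi == lo:
        return np.zeros(image.shape, dtype=np.uint8)
    scaled = (image - lo) / (hi - lo) * 255
    return np.round(scaled).astype(np.uint8)

# Usage with a small synthetic float image
demo = np.array([[0.0, 0.5], [1.0, 0.25]])
prepared = prepare_for_haralick(demo)
print(prepared.dtype, prepared.min(), prepared.max())
```

In the loop, this would be applied as `image = prepare_for_haralick(image)` before calling `compute_haralick_features(image)`. A file that actually contains a Z-stack would then raise an error instead of silently producing features for a 3D volume.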