### Authors: Prof. Dr. Soumi Ray, Ravi Teja Kothuru and Abhay Srivastav

### Acknowledgements:
I would like to thank my teammates Prof. Dr. Soumi Ray and Abhay Srivastav for their guidance and support throughout this project.

**Title of the Project:** Comparative Analysis of Image-Based and Feature-Based Approaches for Pneumonia Detection in Chest X-rays

**Description of the Project:** This project focuses on detecting pneumonia from chest X-ray images using machine learning and deep learning techniques (Rajpurkar et al., 2017; Wang et al., 2017). Using a dataset of annotated pneumonia and normal chest X-rays, we develop and compare image-based and feature-based approaches. Our goal is to identify the most effective method for accurate and interpretable pneumonia detection, contributing to improved patient outcomes through early diagnosis and treatment. The resulting models classify patients based on their chest X-ray images as either having pneumonia (1) or not having pneumonia (0).

**Objectives of the Project:** 

- **Image Analysis:** Develop and evaluate deep learning models, particularly Convolutional Neural Networks (CNNs), that classify chest X-rays end to end. The models process the raw chest X-ray images directly and label each one as normal or pneumonia.

- **Feature Analysis:** Extract meaningful features from the chest X-ray images and use them to train and evaluate traditional machine learning models. The process includes feature extraction, selection, and transformation, followed by machine learning techniques such as Support Vector Machines (SVMs) and Random Forests.

**Name of the Dataset:** The dataset used in this project is the Chest X-ray dataset from the publication **Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images for Classification**.

**Description of the Dataset:** The dataset contains labeled chest X-ray images organized into train and test folders, each split into NORMAL and PNEUMONIA subfolders. The target variable for classification is whether a patient has pneumonia (1) or does not have pneumonia (0).

**Dataset Source:** 

- https://data.mendeley.com/datasets/rscbjbr9sj/2

**Type of the Dataset:**

- X-ray Images

**Description of Dataset:** 
Key characteristics of the dataset:
- Separate folders for training and for validating/testing the model.
- A sufficient number of chest X-ray images to train a model to detect pneumonia.
- The target variable for classification is whether the patient has pneumonia or not.

**Goal of the Project using this Dataset:**
The goal of this project is to conduct a comprehensive comparative analysis of image-based and feature-based approaches for pneumonia detection using chest X-ray images. By evaluating the performance, robustness, and interpretability of deep learning and traditional machine learning models, we aim to identify the most effective method for accurately classifying chest X-rays as normal or pneumonia. This comparison will provide valuable insights into the strengths and limitations of each approach, ultimately contributing to improved detection and diagnosis of pneumonia, which can enhance patient outcomes and survival rates.

**Why did we choose this dataset?**
We selected this dataset for the following reasons:
- The dataset is extensive, providing a large number of images suitable for evaluating and training deep learning models.
- It aligns well with the project's objectives by offering a challenging and realistic scenario for developing an image classification model using deep learning, specifically for Chest X-ray images.
- The images are labeled with two classes (normal and pneumonia), enabling the development of a binary classification model.
- It is publicly available, facilitating easy access for research and development purposes.

**Size of dataset:**
- Total images size = 1.27 GB
- Dataset has 2 folders:
  -  **Train:**
    -  Normal (without Pneumonia) = 1349 images
    -  Pneumonia = 3883 images
  -  **Test:**
    -  Normal (without Pneumonia) = 234 images
    -  Pneumonia = 390 images
    
**Expected Behaviors and Problem Handling:**
- Classify Chest X-ray images with high accuracy.
- Handle variations in image quality, resolution, and orientation.
- Be robust to noise and artifacts in the images.
- Provide interpretable results.

**Issues to focus on:**
- Improving model interpretability and explainability.
- Optimizing model performance on a held-out test set.
- Following AI Ethics and Data Safety practices.

# Import all the required files and libraries

In [1]:
import os
import ssl

# Disable SSL certificate verification
ssl._create_default_https_context = ssl._create_unverified_context

# Automatically reload imported modules when their source code changes
%load_ext autoreload
%reload_ext autoreload
%autoreload 2

# Import the local feature-extraction class used throughout this notebook
from cxr_image_features_extraction import CxrImageFeatureExtraction

# Perform Chest X-ray Images Feature Extraction

## Create an object of the Image Feature Extraction class

In [2]:
image_feature_extraction = CxrImageFeatureExtraction()

## Fetch the absolute paths of the normalized image dataset

In [3]:
# Define the path to the dataset
dataset_path = image_feature_extraction.get_base_path_of_dataset() + "_nrm"
print(f"Normalized Dataset Path = {dataset_path}")

# Fetch train, test, NORMAL and PNEUMONIA folder names
train_folder_name = str(image_feature_extraction.train_test_image_dirs[0])
test_folder_name = str(image_feature_extraction.train_test_image_dirs[1])

normal_img_folder_name = str(image_feature_extraction.normal_pneumonia_image_dirs[0])
pneumonia_img_folder_name = str(image_feature_extraction.normal_pneumonia_image_dirs[1])

# Define the paths to the train and test datasets
# Train
train_normal = os.path.join(dataset_path, train_folder_name + "_nrm", normal_img_folder_name + "_nrm")
train_pneumonia = os.path.join(dataset_path, train_folder_name + "_nrm", pneumonia_img_folder_name + "_nrm")

# Test
test_normal = os.path.join(dataset_path, test_folder_name + "_nrm", normal_img_folder_name + "_nrm")
test_pneumonia = os.path.join(dataset_path, test_folder_name + "_nrm", pneumonia_img_folder_name + "_nrm")

# Print the paths to the train and test datasets
print("\nNormalized Train Images")
print("************************")
print(f"NORMAL = {train_normal}")
print(f"\nPNEUMONIA = {train_pneumonia}")

print("\n\nNormalized Test Images")
print("***************************")
print(f"NORMAL = {test_normal}")
print(f"\nPNEUMONIA = {test_pneumonia}")

Normalized Dataset Path = /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm

Normalized Train Images
************************
NORMAL = /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/train_nrm/NORMAL_nrm

PNEUMONIA = /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/train_nrm/PNEUMONIA_nrm


Normalized Test Images
***************************
NORMAL = /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/Univers

## Convert all Normalized image folder absolute paths to a list

In [4]:
image_normalized_folders = [
    train_normal, train_pneumonia,
    test_normal, test_pneumonia
]

image_normalized_folders

['/Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/train_nrm/NORMAL_nrm',
 '/Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/train_nrm/PNEUMONIA_nrm',
 '/Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/test_nrm/NORMAL_nrm',
 '/Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/

### Extract First Order Features of all the images into an Excel file

# First-Order Features Definitions and Formulas

## 1. Mean
**Definition:** The average value of all pixel intensities in the image.

**Formula:**
$$ \text{Mean} = \frac{1}{N} \sum_{i=1}^{N} x_i $$
where $x_i$ is the value of pixel $i$ and $N$ is the total number of pixels.

## 2. Median
**Definition:** The middle value of the sorted pixel intensities. If there is an even number of pixels, it is the average of the two middle values.

**Formula:**
With the pixel values sorted in ascending order, if $N$ is odd:
$$ \text{Median} = x_{\left(\frac{N+1}{2}\right)} $$
If $N$ is even:
$$ \text{Median} = \frac{x_{\left(\frac{N}{2}\right)} + x_{\left(\frac{N}{2}+1\right)}}{2} $$

## 3. Standard Deviation
**Definition:** A measure of the spread or dispersion of pixel intensities from the mean.

**Formula:**
$$ \text{Standard Deviation} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \text{Mean})^2} $$

## 4. Variance
**Definition:** The average of the squared differences from the mean, representing the spread of pixel intensities.

**Formula:**
$$ \text{Variance} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \text{Mean})^2 $$
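
The four statistics above map directly onto NumPy reductions. The following is a minimal sketch, not the `CxrImageFeatureExtraction` implementation (which is not shown in this notebook); the array `pixels` is a made-up 2×3 image. `np.var` and `np.std` default to `ddof=0`, which matches the $\frac{1}{N}$ formulas above.

```python
import numpy as np

# Hypothetical 2x3 grayscale image (values chosen so results are easy to check)
pixels = np.array([[10, 20, 30],
                   [40, 50, 60]], dtype=np.uint8)

# Flatten and cast to float so later arithmetic cannot overflow uint8
x = pixels.astype(np.float64).ravel()

mean = x.mean()
median = np.median(x)   # even N: average of the two middle values
variance = x.var()      # ddof=0, i.e. divides by N
std_dev = x.std()       # square root of the variance above
```

Passing `ddof=1` instead would give the sample (N-1) versions, so it is worth deciding once which convention the feature table should use.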

## 5. Skewness
**Definition:** A measure of the asymmetry of the pixel intensity distribution around the mean.

**Formula:**
$$ \text{Skewness} = \frac{N}{(N-1)(N-2)} \sum_{i=1}^{N} \left(\frac{x_i - \text{Mean}}{\text{Standard Deviation}}\right)^3 $$

## 6. Kurtosis
**Definition:** A measure of the "tailedness" of the pixel intensity distribution.

**Formula:**
$$ \text{Kurtosis} = \frac{N(N+1)}{(N-1)(N-2)(N-3)} \sum_{i=1}^{N} \left(\frac{x_i - \text{Mean}}{\text{Standard Deviation}}\right)^4 - \frac{3(N-1)^2}{(N-2)(N-3)} $$
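
The skewness and kurtosis formulas above are the bias-corrected sample forms. These are normally paired with the sample standard deviation (N-1 denominator); that pairing is an assumption in the sketch below, because the Standard Deviation formula earlier in this section divides by N. The function names are illustrative only.

```python
import numpy as np

def sample_skewness(x):
    """Bias-corrected skewness, written term by term from the formula above."""
    x = np.asarray(x, dtype=np.float64).ravel()
    n = x.size
    s = x.std(ddof=1)                       # assumption: sample standard deviation
    z = (x - x.mean()) / s
    return n / ((n - 1) * (n - 2)) * np.sum(z ** 3)

def sample_excess_kurtosis(x):
    """Bias-corrected excess kurtosis, written term by term from the formula above."""
    x = np.asarray(x, dtype=np.float64).ravel()
    n = x.size
    s = x.std(ddof=1)                       # assumption: sample standard deviation
    z = (x - x.mean()) / s
    lead = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return lead * np.sum(z ** 4) - correction

symmetric = [1, 2, 3, 4, 5]        # made-up values, symmetric about the mean
right_tailed = [1, 2, 3, 4, 10]    # made-up values with one large outlier

skew_sym = sample_skewness(symmetric)
skew_right = sample_skewness(right_tailed)
kurt_sym = sample_excess_kurtosis(symmetric)
```

A symmetric sample gives zero skewness, a long right tail gives positive skewness, and the constant subtracted in the kurtosis formula makes a normal distribution score roughly zero.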

## 7. Range
**Definition:** The difference between the maximum and minimum pixel intensities in the image.

**Formula:**
$$ \text{Range} = \text{Max} - \text{Min} $$

## 8. Entropy
**Definition:** A measure of the unpredictability or randomness of the pixel intensity values.

**Formula:**
$$ \text{Entropy} = -\sum_{i=1}^{k} p(x_i) \log_2 p(x_i) $$
where $p(x_i)$ is the probability of pixel intensity $x_i$ and $k$ is the number of unique pixel values.
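
As an illustration of the entropy formula, the sketch below estimates $p(x_i)$ from the relative frequency of each unique intensity and sums over the unique values only. The helper name `shannon_entropy` and both toy images are hypothetical.

```python
import numpy as np

def shannon_entropy(image):
    """Entropy in bits of the empirical pixel-intensity distribution."""
    _, counts = np.unique(np.asarray(image).ravel(), return_counts=True)
    p = counts / counts.sum()               # p(x_i) for each unique intensity
    return float(-np.sum(p * np.log2(p)))   # every p is > 0 by construction

# A constant image carries no uncertainty
flat = np.full((4, 4), 128, dtype=np.uint8)

# Four intensities, each covering a quarter of the pixels
four_levels = np.array([[0, 0, 85, 85],
                        [170, 170, 255, 255]], dtype=np.uint8)

entropy_flat = shannon_entropy(flat)
entropy_four = shannon_entropy(four_levels)
```

Because only intensities that actually occur are counted, there is no need to special-case $0 \cdot \log_2 0$.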

## 9. Energy
**Definition:** The sum of squared pixel values, indicating the overall intensity.

**Formula:**
$$ \text{Energy} = \sum_{i=1}^{N} x_i^2 $$

## 10. Uniformity
**Definition:** A measure of the sum of squared normalized pixel values, indicating how similar the pixel values are.

**Formula:**
$$ \text{Uniformity} = \sum_{i=1}^{N} \left(\frac{x_i}{255}\right)^2 $$

## 11. Root Mean Square (RMS) Value
**Definition:** The square root of the mean of the squared pixel values, providing a measure of pixel intensity magnitude.

**Formula:**
$$ \text{RMS} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} x_i^2} $$
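
Energy, Uniformity, and RMS all depend on squared pixel values, which is where 8-bit images need care: squaring a `uint8` array in NumPy wraps around modulo 256. The sketch below casts to float first; the 2×2 image is made up, and the division by 255 assumes 8-bit input as in the Uniformity formula above.

```python
import numpy as np

# Hypothetical 8-bit image
pixels = np.array([[0, 255],
                   [255, 0]], dtype=np.uint8)

# Cast before squaring: (pixels ** 2) on uint8 would wrap around modulo 256
x = pixels.astype(np.float64).ravel()

energy = np.sum(x ** 2)
uniformity = np.sum((x / 255.0) ** 2)   # assumes a maximum intensity of 255
rms = np.sqrt(np.mean(x ** 2))
```

Energy grows with the number of pixels, so it is only comparable across images of the same size; RMS divides by $N$ and does not have that limitation.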

## 12. Maximum Pixel Value
**Definition:** The highest pixel intensity value in the image.

**Formula:**
$$ \text{Max} = \max(x_i) $$

## 13. Minimum Pixel Value
**Definition:** The lowest pixel intensity value in the image.

**Formula:**
$$ \text{Min} = \min(x_i) $$

## 14. Median Absolute Deviation (MAD)
**Definition:** A measure of the variability of pixel intensities around the median, which is robust to outliers.

**Formula:**
$$ \text{MAD} = \text{Median}\left(\left|x_i - \text{Median}(x)\right|\right) $$

## 15. Interquartile Range (IQR)
**Definition:** The range within which the central 50% of pixel values lie, representing the middle spread.

**Formula:**
$$ \text{IQR} = Q_3 - Q_1 $$
where $Q_3$ is the third quartile and $Q_1$ is the first quartile of the pixel values.

## 16. Mean Absolute Deviation
**Definition:** The average of the absolute differences between each pixel value and the mean.

**Formula:**
$$ \text{Mean Absolute Deviation} = \frac{1}{N} \sum_{i=1}^{N} \left| x_i - \text{Mean} \right| $$
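
The range and the three robust or absolute spread measures above can be sketched with NumPy as follows. The values in `x` are made up, and the quartiles use `np.percentile` with its default linear interpolation, which is an assumption about how $Q_1$ and $Q_3$ are defined here.

```python
import numpy as np

x = np.arange(1, 10, dtype=np.float64)   # hypothetical pixel values 1..9

value_range = x.max() - x.min()

# Median absolute deviation: spread around the median
median_abs_dev = np.median(np.abs(x - np.median(x)))

# Interquartile range: width of the central 50% of values
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1

# Mean absolute deviation: spread around the mean
mean_abs_dev = np.mean(np.abs(x - x.mean()))
```

The two deviation measures differ both in the center they use (median versus mean) and in how they aggregate (median versus mean of the absolute differences), so their values are not interchangeable.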

In [7]:
image_feature_extraction.update_first_order_features_to_excel_file(folders=image_normalized_folders)

Extracted First order features will be saved - /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/image_information/chest_xray_images_first_order_features.xlsx


Extracting first-order features from: /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/train_nrm/NORMAL_nrm


Folder: train_nrm/NORMAL_nrm: 100%|██████████████████████████████████████████████████████| 1349/1349 [00:42<00:00, 31.90it/s]


Extracting first-order features from: /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/train_nrm/PNEUMONIA_nrm


Folder: train_nrm/PNEUMONIA_nrm: 100%|███████████████████████████████████████████████████| 3883/3883 [00:51<00:00, 75.10it/s]


Extracting first-order features from: /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/test_nrm/NORMAL_nrm


Folder: test_nrm/NORMAL_nrm: 100%|█████████████████████████████████████████████████████████| 234/234 [00:08<00:00, 27.20it/s]


Extracting first-order features from: /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/test_nrm/PNEUMONIA_nrm


Folder: test_nrm/PNEUMONIA_nrm: 100%|██████████████████████████████████████████████████████| 390/390 [00:04<00:00, 89.62it/s]




All first-order features are extracted to the Excel file: /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/image_information/chest_xray_images_first_order_features.xlsx
Please check the Excel file for further analysis and interpretation


### Extract Second Order GLCM Features of all the images and write into the existing Excel file


## 1. Contrast
$$
\text{Contrast} = \sum_{i,j} (i - j)^2 \cdot P(i,j)
$$
**Definition**: Measure of intensity contrast between a pixel and its neighbor over the whole image.
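
In the formulas of this section, $P(i,j)$ is the normalized gray-level co-occurrence matrix: the relative frequency with which a pixel of gray level $i$ has a neighbor of gray level $j$ at a fixed offset. The sketch below builds such a matrix from scratch and evaluates the contrast formula. The offset (one pixel to the right), the non-symmetric counting, and the number of gray levels are assumptions made for illustration; the settings used by `CxrImageFeatureExtraction` are not shown in this notebook.

```python
import numpy as np

def glcm(image, levels, dr=0, dc=1):
    """Normalized, non-symmetric GLCM for the pixel offset (dr, dc)."""
    image = np.asarray(image)
    rows, cols = image.shape
    counts = np.zeros((levels, levels), dtype=np.float64)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r, c], image[r2, c2]] += 1
    return counts / counts.sum()    # entries now sum to 1

# Toy 4x4 image with 4 gray levels
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])

P = glcm(img, levels=4)

# Index grids: i[a, b] == a and j[a, b] == b
i, j = np.indices(P.shape)
contrast = np.sum((i - j) ** 2 * P)
```

The explicit double loop is easy to verify but slow on full-size X-rays; a vectorized or library implementation would be preferable for the whole dataset.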

## 2. Correlation
$$
\text{Correlation} = \frac{\sum_{i,j} i \cdot j \cdot P(i,j) - \mu_x \cdot \mu_y}{\sigma_x \cdot \sigma_y}
$$
**Definition**: Measure of how correlated a pixel is to its neighbor over the whole image.
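
A small worked sketch of the correlation feature, using the common form in which $\mu_x \mu_y$ is subtracted once from $\sum_{i,j} i \cdot j \cdot P(i,j)$ before dividing by $\sigma_x \sigma_y$. Here $\mu_x, \sigma_x$ and $\mu_y, \sigma_y$ are the means and standard deviations of the row and column indices under $P$. The 2×2 matrix is hypothetical.

```python
import numpy as np

# Hypothetical normalized GLCM (entries sum to 1)
P = np.array([[0.4, 0.1],
              [0.1, 0.4]])

i, j = np.indices(P.shape)

mu_x = np.sum(i * P)                             # mean of the row index
mu_y = np.sum(j * P)                             # mean of the column index
sigma_x = np.sqrt(np.sum((i - mu_x) ** 2 * P))   # std of the row index
sigma_y = np.sqrt(np.sum((j - mu_y) ** 2 * P))   # std of the column index

correlation = (np.sum(i * j * P) - mu_x * mu_y) / (sigma_x * sigma_y)
```

For a constant image either standard deviation is zero and the ratio is undefined, so a real pipeline would need to guard that case.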

## 3. Energy
$$
\text{Energy} = \sum_{i,j} P(i,j)^2
$$
**Definition**: Measure of the sum of squared elements in the GLCM.

## 4. Homogeneity
$$
\text{Homogeneity} = \sum_{i,j} \frac{P(i,j)}{1 + |i - j|}
$$
**Definition**: Measure of how close the distribution of elements in the GLCM is to the GLCM diagonal.

## 5. Dissimilarity
$$
\text{Dissimilarity} = \sum_{i,j} |i - j| \cdot P(i,j)
$$
**Definition**: Measure of the dissimilarity between a pixel and its neighbor.
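
Energy, homogeneity, and dissimilarity all weight the GLCM entries by their distance $|i - j|$ from the diagonal (or, for energy, not at all). A minimal sketch on a hypothetical 3×3 normalized matrix:

```python
import numpy as np

# Hypothetical normalized GLCM with most mass on the diagonal
P = np.array([[0.2, 0.1, 0.0],
              [0.1, 0.2, 0.1],
              [0.0, 0.1, 0.2]])

i, j = np.indices(P.shape)
diff = np.abs(i - j)                     # distance from the diagonal

energy = np.sum(P ** 2)                  # sum of squared entries
homogeneity = np.sum(P / (1.0 + diff))   # rewards mass near the diagonal
dissimilarity = np.sum(diff * P)         # penalizes mass far from it
```

Naming differs between libraries: scikit-image's `graycoprops`, for example, reports this sum of squares as `'ASM'` and uses `'energy'` for its square root, so values should be compared with the exact definition in mind.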

## 6. Entropy
$$
\text{Entropy} = -\sum_{i,j} P(i,j) \log(P(i,j))
$$
**Definition**: Measure of randomness in the GLCM.

## 7. Auto-correlation
$$
\text{Auto-correlation} = \sum_{i,j} i \cdot j \cdot P(i,j)
$$
**Definition**: Measure of the correlation between a pixel and its neighbor.

## 8. Cluster Prominence
$$
\text{Cluster Prominence} = \sum_{i,j} (i + j - \mu_x - \mu_y)^4 \cdot P(i,j)
$$
**Definition**: Measure of the peakedness of the clusters in the GLCM.

## 9. Cluster Shade
$$
\text{Cluster Shade} = \sum_{i,j} (i + j - \mu_x - \mu_y)^3 \cdot P(i,j)
$$
**Definition**: Measure of the skewness of the clusters in the GLCM.
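
Cluster shade and cluster prominence are the third and fourth moments of $i + j$ about $\mu_x + \mu_y$. The sketch below evaluates both on a hypothetical, deliberately lopsided 2×2 GLCM so that the shade is non-zero.

```python
import numpy as np

# Hypothetical normalized GLCM with more mass at low gray levels
P = np.array([[0.5, 0.1],
              [0.1, 0.3]])

i, j = np.indices(P.shape)
mu_x = np.sum(i * P)
mu_y = np.sum(j * P)

centered = i + j - mu_x - mu_y           # deviation of (i + j) from its mean

cluster_shade = np.sum(centered ** 3 * P)
cluster_prominence = np.sum(centered ** 4 * P)
```

A shade near zero indicates that the distribution of $i + j$ is roughly symmetric about its mean; the sign shows which side the longer tail is on.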

## 10. Maximum Probability
$$
\text{Maximum Probability} = \max(P(i,j))
$$
**Definition**: Measure of the maximum probability in the GLCM.

## 11. Sum of Squares (Variance)
$$
\text{Variance} = \frac{\sum_{i,j} (i - \mu)^2 \cdot P(i,j)}{\sum_{i,j} P(i,j)}
$$
**Definition**: Measure of the dispersion of elements in the GLCM.

## 12. Sum Average
$$
\text{Sum Average} = \sum_{i} i \cdot P_{x+y}(i)
$$
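**Definition**: Measure of the average of the sum of elements in the GLCM, where $P_{x+y}(k) = \sum_{i+j=k} P(i,j)$. Note: with zero-based gray levels, the summation index runs from $0$ to $2N_g-2$, where $N_g$ is the number of gray levels; this range is often left implicit.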

## 13. Sum Entropy
$$
\text{Sum Entropy} = -\sum_{i} P_{x+y}(i) \log(P_{x+y}(i))
$$
**Definition**: Measure of the entropy of the sum of elements in the GLCM.

## 14. Sum Variance
$$
\text{Sum Variance} = \sum_{i} (i - \text{Sum Average})^2 \cdot P_{x+y}(i)
$$
**Definition**: Measure of the variance of the sum of elements in the GLCM.

## 15. Difference Entropy
$$
\text{Difference Entropy} = -\sum_{i} P_{x-y}(i) \log(P_{x-y}(i))
$$
**Definition**: Measure of the entropy of the differences in the GLCM, where $P_{x-y}(k) = \sum_{|i-j|=k} P(i,j)$ is the distribution of absolute gray-level differences.

## 16. Difference Variance
$$
\text{Difference Variance} = \sum_{i} (i - \mu_{x-y})^2 \cdot P_{x-y}(i)
$$
**Definition**: Measure of the variance of the differences in the GLCM, where $\mu_{x-y} = \sum_{i} i \cdot P_{x-y}(i)$ is the mean of the difference distribution.
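
The sum and difference features (items 12 to 16) all start from two one-dimensional distributions derived from the GLCM: $P_{x+y}(k)$ collects the entries with $i + j = k$, and $P_{x-y}(k)$ collects those with $|i - j| = k$. The sketch below assumes zero-based gray levels and the natural logarithm; the helper names and the 3×3 matrix are hypothetical.

```python
import numpy as np

def sum_and_difference_distributions(P):
    """Return P_{x+y}(k) for k = 0..2G-2 and P_{x-y}(k) for k = 0..G-1."""
    G = P.shape[0]
    i, j = np.indices(P.shape)
    p_sum = np.bincount((i + j).ravel(), weights=P.ravel(), minlength=2 * G - 1)
    p_diff = np.bincount(np.abs(i - j).ravel(), weights=P.ravel(), minlength=G)
    return p_sum, p_diff

def entropy(p):
    p = p[p > 0]                 # skip empty bins: 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log(p)))

# Hypothetical normalized GLCM
P = np.array([[0.2, 0.1, 0.0],
              [0.1, 0.2, 0.1],
              [0.0, 0.1, 0.2]])

p_sum, p_diff = sum_and_difference_distributions(P)
k_sum = np.arange(p_sum.size)
k_diff = np.arange(p_diff.size)

sum_average = np.sum(k_sum * p_sum)
sum_variance = np.sum((k_sum - sum_average) ** 2 * p_sum)
sum_entropy = entropy(p_sum)

mu_diff = np.sum(k_diff * p_diff)
difference_variance = np.sum((k_diff - mu_diff) ** 2 * p_diff)
difference_entropy = entropy(p_diff)
```

With one-based gray levels the sum index would instead run from 2 to $2G$, which shifts the sum average by 2 but leaves the variances and entropies unchanged.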


In [6]:
image_feature_extraction.update_second_order_glcm_features_to_excel_file(folders=image_normalized_folders)

Extracted GLCM features will be saved to - /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/image_information/chest_xray_images_second_order_features_glcm.xlsx


Extracting second-order features GLCM from: /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/train_nrm/NORMAL_nrm


Folder: train_nrm/NORMAL_nrm: 100%|██████████████████████████████████████████████████████| 1349/1349 [00:50<00:00, 26.97it/s]


Extracting second-order features GLCM from: /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/train_nrm/PNEUMONIA_nrm


Folder: train_nrm/PNEUMONIA_nrm: 100%|███████████████████████████████████████████████████| 3883/3883 [02:05<00:00, 30.97it/s]


Extracting second-order features GLCM from: /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/test_nrm/NORMAL_nrm


Folder: test_nrm/NORMAL_nrm: 100%|█████████████████████████████████████████████████████████| 234/234 [00:07<00:00, 32.24it/s]


Extracting second-order features GLCM from: /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/dataset/chest_xray_nrm/test_nrm/PNEUMONIA_nrm


Folder: test_nrm/PNEUMONIA_nrm: 100%|██████████████████████████████████████████████████████| 390/390 [00:11<00:00, 33.38it/s]




All first-order features are extracted to the Excel file: /Users/ravkothu/Documents/Personal_items_at_Oracle/Master_Degree/University_of_San_Diego/Online_Masters/MS_in_Applied_AI/Subjects_and_Resources/AAI-501_Introduction_to_AI/AAI-501_Final_Team_Project/pneumonia_detection/image_information/chest_xray_images_second_order_features_glcm.xlsx
Please check the Excel file for further analysis and interpretation


### Extract Second Order GLDM Features of all the images and write into the existing Excel file

The following GLDM (Gray Level Dependence Matrix) features are extracted from the given image:

#### 1. Small Dependence Emphasis (SDE)

*Definition:* Measures the distribution of small dependencies in the image.
*Formula:* $$\text{SDE} = \frac{\sum_{i=0}^{1} \sum_{j=0}^{G-1} p(i,j)}{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)}$$

#### 2. Large Dependence Emphasis (LDE)

*Definition:* Measures the distribution of large dependencies in the image.
*Formula:* $$\text{LDE} = \frac{\sum_{i=G-2}^{G-1} \sum_{j=0}^{G-1} p(i,j)}{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)}$$

#### 3. Gray Level Non-Uniformity (GLN)

*Definition:* Measures the non-uniformity of gray levels in the image.
*Formula:* $$\text{GLN} = \frac{\sum_{i=0}^{G-1} (\sum_{j=0}^{G-1} p(i,j))^2}{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)}$$

#### 4. Dependence Count Non-Uniformity (DCN)

*Definition:* Measures the non-uniformity of dependence counts in the image.
*Formula:* $$\text{DCN} = \frac{\sum_{j=0}^{G-1} (\sum_{i=0}^{G-1} p(i,j))^2}{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)}$$

#### 5. Dependence Count Entropy (DCE)

*Definition:* Measures the entropy of dependence counts in the image.
*Formula:* $$\text{DCE} = -\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j) \log_2 p(i,j)$$

#### 6. Gray Level Entropy (GLE)

*Definition:* Measures the entropy of gray levels in the image.
*Formula:* $$\text{GLE} = -\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j) \log_2 p(i,j)$$

#### 7. Dependence Count Mean (DCM)

*Definition:* Measures the mean of dependence counts in the image.
*Formula:* $$\text{DCM} = \frac{\sum_{j=0}^{G-1} \sum_{i=0}^{G-1} p(i,j) \cdot j}{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)}$$

#### 8. Gray Level Mean (GLM)

*Definition:* Measures the mean of gray levels in the image.
*Formula:* $$\text{GLM} = \frac{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j) \cdot i}{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)}$$

#### 9. Dependence Count Variance (DCV)

*Definition:* Measures the variance of dependence counts in the image.
*Formula:* $$\text{DCV} = \frac{\sum_{j=0}^{G-1} (\sum_{i=0}^{G-1} p(i,j) \cdot j - \text{DCM})^2}{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)}$$

#### 10. Gray Level Variance (GLV)

*Definition:* Measures the variance of gray levels in the image.
*Formula:* $$\text{GLV} = \frac{\sum_{i=0}^{G-1} (\sum_{j=0}^{G-1} p(i,j) \cdot i - \text{GLM})^2}{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)}$$

#### 11. Dependence Count Energy (DCEn)

*Definition:* Measures the energy of dependence counts in the image.
*Formula:* $$\text{DCEn} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)^2$$

#### 12. Gray Level Energy (GLEn)

*Definition:* Measures the energy of gray levels in the image.
*Formula:* $$\text{GLEn} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)^2$$

#### 13. Dependence Count Maximum (DCMax)

*Definition:* Measures the maximum dependence count in the image.
*Formula:* $$\text{DCMax} = \max_{j=0}^{G-1} \sum_{i=0}^{G-1} p(i,j)$$

#### 14. Gray Level Maximum (GLMax)

*Definition:* Measures the maximum gray level in the image.
*Formula:* $$\text{GLMax} = \max_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)$$

#### 15. Dependence Count Contrast (DCC)

*Definition:* Measures the contrast of dependence counts in the image.
*Formula:* $$\text{DCC} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} |i-j| \cdot p(i,j)$$

#### 16. Gray Level Contrast (GLC)

*Definition:* Measures the contrast of gray levels in the image.
*Formula:* $$\text{GLC} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} |i-j| \cdot p(i,j)$$

#### 17. Dependence Count Correlation (DCCorr)

*Definition:* Measures the correlation of dependence counts in the image.
*Formula:* $$\text{DCCorr} = \frac{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (i-\mu_i)(j-\mu_j) \cdot p(i,j)}{\sigma_i \sigma_j}$$

#### 18. Gray Level Correlation (GLCorr)

*Definition:* Measures the correlation of gray levels in the image.
*Formula:* $$\text{GLCorr} = \frac{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (i-\mu_i)(j-\mu_j) \cdot p(i,j)}{\sigma_i \sigma_j}$$

#### 19. Dependence Count Homogeneity (DCH)

*Definition:* Measures the homogeneity of dependence counts in the image.
*Formula:* $$\text{DCH} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} \frac{p(i,j)}{1+|i-j|}$$

#### 20. Gray Level Homogeneity (GLH)

*Definition:* Measures the homogeneity of gray levels in the image.
*Formula:* $$\text{GLH} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} \frac{p(i,j)}{1+|i-j|}$$

#### 21. Dependence Count Sum (DCS)

*Definition:* Measures the sum of dependence counts in the image.
*Formula:* $$\text{DCS} = \sum_{j=0}^{G-1} \sum_{i=0}^{G-1} p(i,j)$$

#### 22. Gray Level Sum (GLS)

*Definition:* Measures the sum of gray levels in the image.
*Formula:* $$\text{GLS} = \sum_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)$$

#### 23. Dependence Count Range (DCR)

*Definition:* Measures the range of dependence counts in the image.
*Formula:* $$\text{DCR} = \max_{j=0}^{G-1} \sum_{i=0}^{G-1} p(i,j) - \min_{j=0}^{G-1} \sum_{i=0}^{G-1} p(i,j)$$

#### 24. Gray Level Range (GLR)

*Definition:* Measures the range of gray levels in the image.
*Formula:* $$\text{GLR} = \max_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j) - \min_{i=0}^{G-1} \sum_{j=0}^{G-1} p(i,j)$$

Here $G$ is the number of gray levels in the image (indices run from $0$ to $G-1$), and $p(i,j)$ is the probability of occurrence of gray level $i$ together with neighboring gray level $j$ in the image.
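
Several of the features above reduce to operations on the row sums (over $j$) and column sums (over $i$) of $p(i,j)$. The sketch below evaluates four of them exactly as the formulas are written; the 3×3 matrix `p` is hypothetical, and the code is not taken from `update_second_order_gldm_features_to_excel_file`, whose implementation is not shown in this notebook.

```python
import numpy as np

# Hypothetical 3x3 matrix p(i, j): rows index i, columns index j
p = np.array([[4.0, 1.0, 0.0],
              [2.0, 2.0, 0.0],
              [0.0, 1.0, 2.0]])

total = p.sum()                      # denominator shared by the formulas
row_sums = p.sum(axis=1)             # sum over j for each i
col_sums = p.sum(axis=0)             # sum over i for each j
i, j = np.indices(p.shape)

gln = np.sum(row_sums ** 2) / total  # Gray Level Non-Uniformity
dcn = np.sum(col_sums ** 2) / total  # Dependence Count Non-Uniformity
glm = np.sum(i * p) / total          # Gray Level Mean
dcm = np.sum(j * p) / total          # Dependence Count Mean
```

Dividing by `total` means the same code works whether `p` holds raw counts or probabilities that already sum to 1.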

In [None]:
image_feature_extraction.update_second_order_gldm_features_to_excel_file(folders=image_normalized_folders)

Extracted GLDM features will be saved to - /Users/raviteja/Documents/Teja_Career/Master_Degree/USD/MS_AAI/AAI-501/Final_Project/pneumonia-detection-in-chest-X-rays/image_information/chest_xray_images_second_order_features_gldm.xlsx


Extracting second-order features GLDM from: /Users/raviteja/Documents/Teja_Career/Master_Degree/USD/MS_AAI/AAI-501/Final_Project/pneumonia-detection-in-chest-X-rays/dataset/chest_xray_nrm/train_nrm/NORMAL_nrm



Folder: train_nrm/NORMAL_nrm:   0%|                                                           | 0/1349 [00:00<?, ?it/s]
Folder: train_nrm/NORMAL_nrm:   1%|▎                                                  | 8/1349 [00:00<00:16, 79.08it/s]
Folder: train_nrm/NORMAL_nrm:   1%|▎                                                  | 8/1349 [00:16<00:16, 79.08it/s]
Folder: train_nrm/NORMAL_nrm:   1%|▌                                                 | 16/1349 [00:21<35:19,  1.59s/it]
Folder: train_nrm/NORMAL_nrm:   2%|▉                                                 | 24/1349 [00:40<43:27,  1.97s/it]
Folder: train_nrm/NORMAL_nrm:   2%|█▏                                                | 32/1349 [01:04<51:25,  2.34s/it]
Folder: train_nrm/NORMAL_nrm:   3%|█▍                                              | 40/1349 [01:32<1:00:34,  2.78s/it]
Folder: train_nrm/NORMAL_nrm:   4%|█▋                                              | 48/1349 [02:05<1:10:16,  3.24s/it]
Folder: train_n

Extracting second-order features GLDM from: /Users/raviteja/Documents/Teja_Career/Master_Degree/USD/MS_AAI/AAI-501/Final_Project/pneumonia-detection-in-chest-X-rays/dataset/chest_xray_nrm/train_nrm/PNEUMONIA_nrm



Folder: train_nrm/PNEUMONIA_nrm:   0%|                                                        | 0/3883 [00:00<?, ?it/s]
Folder: train_nrm/PNEUMONIA_nrm:   0%|▏                                              | 16/3883 [00:03<14:08,  4.56it/s]
Folder: train_nrm/PNEUMONIA_nrm:   1%|▎                                              | 24/3883 [00:16<50:24,  1.28it/s]
Folder: train_nrm/PNEUMONIA_nrm:   1%|▍                                              | 32/3883 [00:25<59:44,  1.07it/s]
Folder: train_nrm/PNEUMONIA_nrm:   1%|▍                                            | 40/3883 [00:41<1:24:09,  1.31s/it]
Folder: train_nrm/PNEUMONIA_nrm:   1%|▌                                            | 48/3883 [01:03<1:53:24,  1.77s/it]
Folder: train_nrm/PNEUMONIA_nrm:   1%|▋                                            | 56/3883 [01:16<1:50:11,  1.73s/it]
Folder: train_nrm/PNEUMONIA_nrm:   2%|▋                                            | 64/3883 [01:26<1:39:52,  1.57s/it]
Folder: train_n

Extracting second-order features GLDM from: /Users/raviteja/Documents/Teja_Career/Master_Degree/USD/MS_AAI/AAI-501/Final_Project/pneumonia-detection-in-chest-X-rays/dataset/chest_xray_nrm/test_nrm/NORMAL_nrm


Folder: test_nrm/NORMAL_nrm: 100%|███████████████████████████████████████████████████| 234/234 [13:19<00:00,  3.42s/it]


Extracting second-order features GLDM from: /Users/raviteja/Documents/Teja_Career/Master_Degree/USD/MS_AAI/AAI-501/Final_Project/pneumonia-detection-in-chest-X-rays/dataset/chest_xray_nrm/test_nrm/PNEUMONIA_nrm


Folder: test_nrm/PNEUMONIA_nrm: 100%|████████████████████████████████████████████████| 390/390 [07:40<00:00,  1.18s/it]




All first-order features are extracted to the Excel file: /Users/raviteja/Documents/Teja_Career/Master_Degree/USD/MS_AAI/AAI-501/Final_Project/pneumonia-detection-in-chest-X-rays/image_information/chest_xray_images_second_order_features_gldm.xlsx
Please check the Excel file for further analysis and interpretation
