In [12]:
import numpy as np

from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

from keras.datasets import cifar10
from skimage.color import rgb2gray

# [Object Classification Model Using Ensemble Learning with GrayLevel Co-Occurrence Matrix and Histogram Extraction](https://arxiv.org/ftp/arxiv/papers/2309/2309.13512.pdf)
### Summary by Kaloyan Dimitrov

## 1. Introduction

This introduction discusses the importance and challenges of object classification in various fields, including medicine, industry, and research. It highlights the difficulties posed by variations in object characteristics such as shape, size, color, texture, and context. The introduction details previous research findings, noting the effectiveness of certain methods like GLCM for feature extraction and different classification models (SVM, k-NN) in achieving high accuracy, sensitivity, specificity, and precision, particularly with Magnetic Resonance Imaging samples.

The primary goal of the research is to enhance the accuracy of object identification through various classification methods:
- K-Nearest Neighbors
- Random Forest
- Support Vector Machine
- Decision Tree
- Naive Bayes

and ensemble learning techniques:
- Voting Classifiers
- Combined Classifiers

The use of these combined methods is proposed to address the complexities and variations in object classification, as each model brings a unique focus on specific characteristics or features, leading to improved accuracy and reduced prediction errors.

Additionally, the research utilizes GLCM (Gray-Level Co-Occurrence Matrix) for feature extraction and histograms to identify relevant characteristics from images. These methods are effective in describing spatial relationships between pixel intensities and the frequency of pixel intensity levels. By integrating ensemble learning with GLCM feature extraction and histograms, the study aims to create a more accurate object classification model, enhancing the effectiveness of ensemble learning methods and increasing reliability.

## 2. Methods
### 1. Pre-Processing and Resizing of Object Datasets
The first step involves preparing the images. This includes resizing the images in the dataset to a uniform size. This step is crucial because it makes the next steps - feature extraction and classification - easier and more consistent.
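
The summary does not say which resizing method the paper uses, so the following is only a minimal sketch of the idea, assuming nearest-neighbor sampling. The helper name `resize_nearest` is hypothetical and is not part of the paper or of any library.

```python
import numpy as np

def resize_nearest(image, out_height, out_width):
    """Resize a (H, W) or (H, W, C) image by nearest-neighbor sampling."""
    in_height, in_width = image.shape[:2]
    # For every output row/column, pick the source row/column it maps to
    row_idx = (np.arange(out_height) * in_height) // out_height
    col_idx = (np.arange(out_width) * in_width) // out_width
    # Broadcasting the two index arrays selects the full output grid
    return image[row_idx[:, None], col_idx]

small = resize_nearest(np.arange(16).reshape(4, 4), 2, 2)
print(small)
```

Once every image has the same shape, the feature vectors extracted in the next step are directly comparable between images.
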
### 2. Feature Extraction Using GLCM and Histograms
- **GLCM *(Gray-Level Co-occurrence Matrix)***: This is a method that looks at how often pairs of pixels with specific values and in a specified spatial relationship occur in an image. This method helps to extract various features from the image, like texture and contrast.
- **Histograms**: These are used to analyze the distribution of pixel intensities in an image. Basically, it's like looking at a graph showing how many pixels of each intensity level are present in the image.

Below is the code for the histogram calculation, which also describes the process in detail:

In [2]:
def calculateHistogram(image):
	"""
	Calculates the histogram of the given image.
	
	Parameters
	----------
	image : numpy.ndarray
		The image to calculate the histogram of.
	
	Returns
	-------
	numpy.ndarray
		The calculated histogram.
	
	Notes
	-----
	* The image is assumed to be grayscale. For color images, the image should be converted to grayscale before passing it to this function.
	* The histogram is calculated for intensity levels 0 to 255.
	* Pixel values must be integers (e.g. dtype uint8), because they are used directly as array indices.
	"""
	# Step 1: Determine the number of possible intensity levels in the image.
	# Assuming the image is grayscale, the number of intensity levels is typically 256 (0 to 255).
	N = 256

	# Step 2: Create an array called histogram with size N, initialized with zeros.
	histogram = np.zeros(N, dtype=int)

	# Step 3: Get the width and height of the image.
	height, width = image.shape

	# Step 4: Iterate over each pixel in the image using two nested loops.
	for y in range(height):
		for x in range(width):
			# Step 5: Retrieve the intensity of the pixel at coordinate (x, y) in the image.
			intensity = image[y, x]

			# Step 6: Increment the corresponding element in the histogram array for the found intensity level.
			histogram[intensity] += 1

	# Step 7: Repeat steps 5 and 6 for each pixel in the image (done within the nested loops).

	# Step 8: Return the calculated histogram.
	return histogram
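
The explicit loops above follow the step-by-step description, but they are slow in Python. As a sketch of an equivalent vectorized version, assuming the same integer-valued grayscale input, `np.bincount` can do the counting in one call. The name `calculate_histogram_fast` is introduced here for illustration only.

```python
import numpy as np

def calculate_histogram_fast(image, levels=256):
    """Count how many pixels have each intensity 0..levels-1."""
    # Flatten the image; every pixel value is used as a bin index
    values = np.asarray(image).ravel()
    # minlength pads the result so unused intensity levels get a zero count
    return np.bincount(values, minlength=levels)

image = np.array([[0, 1, 1],
                  [255, 1, 0]], dtype=np.uint8)
hist = calculate_histogram_fast(image)
print(hist[0], hist[1], hist[255])
```
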

In addition to the histogram, the process involves calculating specific features from the GLCM (Gray-Level Co-occurrence Matrix). To do that, we first have to compute the GLCM itself.

The GLCM is a matrix where the number of rows and columns is equal to the number of grey levels, G, in the image. The matrix element P(i, j | d, θ) is the relative frequency with which two pixels, separated by a pixel distance d and direction θ, occur within a given neighborhood, one with intensity 'i' and the other with intensity 'j'. The elements of GLCM are, therefore, the probabilities of co-occurrence of pixel pairs with specific values and at a specific spatial relationship.

To calculate the co-occurrence probabilities, we would:

1. Define a spatial relationship (distance and direction) between pixel pairs.
2. For each pixel in the image, find the pixel that is 'd' distance away in the 'θ' direction.
3. If the intensity of the first pixel is 'i' and the intensity of the second pixel is 'j', increment the count in the (i, j) cell of the GLCM.
4. Normalize the GLCM by dividing each entry by the total number of pixel pairs considered. The result is the co-occurrence probabilities.

This matrix provides information about the texture of the image, which can be used for tasks like image segmentation, object detection, and more. The code below shows how to do that:

In [23]:
def simple_greycomatrix(image):
	"""
	Calculate a simple GLCM for an image with a distance of 1 and a 0-degree angle.

	Parameters:
	image (numpy.ndarray): 2-D grayscale image with integer intensities in 0..255

	Returns:
	numpy.ndarray: The 256x256 GLCM, normalized so that its entries sum to 1
	"""

	# Initialize the GLCM
	glcm = np.zeros((256, 256))

	# Shift the image one pixel to the left, so that image_shifted[i, j]
	# holds the right-hand neighbor image[i, j + 1]
	image_shifted = np.roll(image, shift=-1, axis=1)

	# Count the co-occurrences
	for i in range(image.shape[0]):
		for j in range(image.shape[1] - 1):  # Skip the last column: it has no right-hand neighbor
			intensity = int(image[i, j])
			neighbor_intensity = int(image_shifted[i, j])
			glcm[intensity, neighbor_intensity] += 1

	# Normalize the counts into co-occurrence probabilities (step 4 above)
	total = glcm.sum()
	if total > 0:
		glcm /= total

	return glcm
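
The four steps listed earlier also work for other distances and directions. Below is a rough vectorized sketch that takes the spatial relationship as a pixel offset `(dy, dx)` instead of an angle. The function name `glcm_offset` and its parameters are assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np

def glcm_offset(image, dy=0, dx=1, levels=256):
    """Normalized GLCM for the neighbor at offset (dy, dx)."""
    image = np.asarray(image)
    height, width = image.shape
    # Step 1-2: keep only reference pixels whose neighbor is inside the image
    r0, r1 = max(0, -dy), min(height, height - dy)
    c0, c1 = max(0, -dx), min(width, width - dx)
    ref = image[r0:r1, c0:c1].ravel().astype(np.intp)
    nbr = image[r0 + dy:r1 + dy, c0 + dx:c1 + dx].ravel().astype(np.intp)
    # Step 3: increment cell (i, j) once per pixel pair;
    # np.add.at handles repeated (i, j) pairs correctly
    counts = np.zeros((levels, levels), dtype=np.int64)
    np.add.at(counts, (ref, nbr), 1)
    # Step 4: normalize into co-occurrence probabilities
    total = counts.sum()
    return counts / total if total > 0 else counts.astype(float)

tiny = np.array([[0, 0, 1],
                 [1, 2, 2]])
P = glcm_offset(tiny, dy=0, dx=1, levels=3)
print(P)
```

In the tiny example there are four horizontal pairs, (0,0), (0,1), (1,2) and (2,2), so each of those cells gets probability 0.25.
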

Having calculated this matrix we can now extract specific features from it:
- Energy (Equation 1): This measures the uniformity of texture in the image. It's calculated by summing up the squares of the co-occurrence probabilities over all pairs of intensity levels.
$$ \text{Energy} = \sum_{i,j} P(i,j)^2 $$
- Contrast (Equation 2): This measures the variation between neighboring pixels. It’s calculated by finding the difference between the row and column indices, squaring this difference, and multiplying it by the co-occurrence probability.
$$ \text{Contrast} = \sum_{i,j} (i-j)^2 \, P(i,j) $$
- Homogeneity (Equation 3): This measures how close the distribution of elements in the co-occurrence matrix is to the diagonal, i.e. how often neighboring pixels have similar intensities.
$$ \text{Homogeneity} = \sum_{i,j} \frac{P(i,j)}{1+(i-j)^2} $$
- Entropy (Equation 4): This one is about the complexity of the information in the image. It’s calculated using the co-occurrence probabilities of pixel pairs.
$$ \text{Entropy} = -\sum_{i,j} P(i,j) \log_2 P(i,j) $$
- Correlation (Equation 5): This measures the linear dependency between the intensities of neighboring pixels. It involves the means ($\mu_x$, $\mu_y$) and standard deviations ($\sigma_x$, $\sigma_y$) of the row and column weights of the co-occurrence matrix.
$$ \text{Correlation} = \frac{\sum_{i,j} i \, j \, P(i,j) - \mu_x \mu_y}{\sigma_x \sigma_y} $$
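
To make the five equations concrete, here is a small worked example on a hand-sized 3x3 probability matrix. The matrix `P_small` is made up for illustration (it corresponds to an image with only three gray levels), and the code is a direct transcription of Equations 1 to 5, not the paper's implementation.

```python
import numpy as np

# Toy co-occurrence probabilities for 3 gray levels (entries sum to 1)
P_small = np.array([[0.25, 0.25, 0.00],
                    [0.00, 0.00, 0.25],
                    [0.00, 0.00, 0.25]])
i, j = np.indices(P_small.shape)  # row index i and column index j of every cell

energy = np.sum(P_small ** 2)                        # Equation 1
contrast = np.sum((i - j) ** 2 * P_small)            # Equation 2
homogeneity = np.sum(P_small / (1 + (i - j) ** 2))   # Equation 3
nonzero = P_small[P_small > 0]                       # skip zeros: 0 * log(0) is taken as 0
entropy = -np.sum(nonzero * np.log2(nonzero))        # Equation 4

mu_x, mu_y = np.sum(i * P_small), np.sum(j * P_small)
sigma_x = np.sqrt(np.sum((i - mu_x) ** 2 * P_small))
sigma_y = np.sqrt(np.sum((j - mu_y) ** 2 * P_small))
correlation = (np.sum(i * j * P_small) - mu_x * mu_y) / (sigma_x * sigma_y)  # Equation 5

print(energy, contrast, homogeneity, entropy, correlation)
```

Here the energy is 0.25, the contrast 0.5, the homogeneity 0.75 and the entropy 2 bits. The correlation is positive because the non-zero entries lie on or just above the diagonal.
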



In [4]:
def calculate_glcm_features(glcm):
	"""
	Calculate GLCM features for a given GLCM
	
	Parameters:
	glcm (numpy.ndarray): Input GLCM (raw counts or probabilities)
	
	Returns:
	dict: A dictionary with GLCM features
	"""
	
	# Equations 1-5 are defined on probabilities, so normalize first.
	# This changes nothing if the GLCM is already normalized.
	total = np.sum(glcm)
	if total > 0:
		glcm = glcm / total
	
	# Row index i and column index j of every GLCM cell
	rows, cols = np.indices(glcm.shape)
	
	def energy(glcm):
		"""Calculate energy feature (Equation 1)"""
		return np.sum(glcm**2)
	
	def contrast(glcm):
		"""Calculate contrast feature (Equation 2)"""
		return np.sum(glcm * (rows - cols)**2)
	
	def homogeneity(glcm):
		"""Calculate homogeneity feature (Equation 3)"""
		return np.sum(glcm / (1 + (rows - cols)**2))
	
	def entropy(glcm):
		"""Calculate entropy feature (Equation 4)"""
		glcm_nonzero = glcm[glcm > 0]  # Only consider non-zero entries, since log2(0) is undefined
		return -np.sum(glcm_nonzero * np.log2(glcm_nonzero))
	
	def correlation(glcm):
		"""Calculate correlation feature (Equation 5)"""
		mu_x = np.sum(rows * glcm)
		mu_y = np.sum(cols * glcm)
		sigma_x = np.sqrt(np.sum(glcm * (rows - mu_x)**2))
		sigma_y = np.sqrt(np.sum(glcm * (cols - mu_y)**2))
		if sigma_x == 0 or sigma_y == 0:
			# Constant image: the ratio is 0/0. Returning 1.0 is a convention
			# assumed here so that no NaN reaches the classifiers.
			return 1.0
		return np.sum((rows - mu_x) * (cols - mu_y) * glcm) / (sigma_x * sigma_y)
	
	# Calculate GLCM properties
	features = {
		'energy': energy(glcm),
		'contrast': contrast(glcm),
		'homogeneity': homogeneity(glcm),
		'entropy': entropy(glcm),
		'correlation': correlation(glcm)
	}
	
	return features

### 3. Classification Methods
The features extracted are then classified using several algorithms:
- **Random Forest (RF)**: This method uses many decision trees to make a decision.
- **Support Vector Machine (SVM)**: This method finds the best boundary that separates different classes of objects in the images.
- **k-Nearest Neighbors (kNN)**: This one looks at the nearest neighbors of a data point to decide its class.
- **Naive Bayes**: This method works on the principle that the presence of one feature doesn't affect the presence of another.
- **Decision Tree**: It's like a flowchart where each decision leads down a different path to a classification.

### 4. Ensemble Methods - Voting and Combined Classifier
Using the above described classification methods, the research then uses ensemble learning to combine the predictions of the different models. This is done in two separate ways:
- **Voting Ensemble**: This method combines predictions from different models and goes with the majority vote.
- **Combined Classifier**: This is more dynamic. If one model is unsure about a prediction, it will rely on the predictions from other models.

Below is an implementation of both:

In [6]:
from collections import Counter

def VotingEnsemble(RF_predict, SVM_predict, kNN_predict, NB_predict, DT_predict):
	"""
	Performs voting ensemble on the predictions from different models.
	
	Parameters:
	RF_predict (list): Predictions from the Random Forest model.
	SVM_predict (list): Predictions from the Support Vector Machine model.
	kNN_predict (list): Predictions from the k-Nearest Neighbors model.
	NB_predict (list): Predictions from the Naive Bayes model.
	DT_predict (list): Predictions from the Decision Tree model.
	
	Returns:
	list: Ensemble predictions.
	"""
	ensemble_predict = []
	
	# Iterate over the predictions
	for i in range(len(RF_predict)):
		predict_list = [RF_predict[i], SVM_predict[i], kNN_predict[i], NB_predict[i], DT_predict[i]]
		
		# Perform majority voting
		vote = Counter(predict_list).most_common(1)[0][0]
		
		# Append the ensemble prediction
		ensemble_predict.append(vote)
	
	return ensemble_predict


def CombinedEnsemble(RF_predict, SVM_predict, kNN_predict, NB_predict, DT_predict):
	"""
	Performs combined ensemble on the predictions from different models.
	
	Parameters:
	RF_predict (list): Predictions from the Random Forest model.
	SVM_predict (list): Predictions from the Support Vector Machine model.
	kNN_predict (list): Predictions from the k-Nearest Neighbors model.
	NB_predict (list): Predictions from the Naive Bayes model.
	DT_predict (list): Predictions from the Decision Tree model.
	
	Returns:
	list: Ensemble predictions.
	"""
	ensemble_predict = []
	
	# Iterate over the predictions
	for i in range(len(RF_predict)):
		# Check the predictions in order of priority
		if RF_predict[i] != "unknown":
			ensemble_predict.append(RF_predict[i])
		elif SVM_predict[i] != "unknown":
			ensemble_predict.append(SVM_predict[i])
		elif kNN_predict[i] != "unknown":
			ensemble_predict.append(kNN_predict[i])
		elif NB_predict[i] != "unknown":
			ensemble_predict.append(NB_predict[i])
		else:
			ensemble_predict.append(DT_predict[i])
	
	return ensemble_predict
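
One caveat: `CombinedEnsemble` only falls back to the next model when a prediction equals `"unknown"`, and a scikit-learn `predict` call never returns that value, so with plain `predict` outputs the function simply returns the Random Forest predictions. This summary does not say how the paper decides that a model is unsure. The sketch below assumes one possible rule: a prediction counts as unknown when the highest class probability (for example from `predict_proba`) is below a threshold. The helper names and the threshold of 0.6 are hypothetical.

```python
import numpy as np

def predict_or_unknown(probabilities, classes, threshold=0.6):
    """Return the most likely class per sample, or "unknown" if not confident."""
    probabilities = np.asarray(probabilities, dtype=float)
    best = probabilities.argmax(axis=1)                 # index of the top class
    confident = probabilities.max(axis=1) >= threshold  # is the top class likely enough?
    return [classes[k] if ok else "unknown" for k, ok in zip(best, confident)]

def first_known(*prediction_lists):
    """For each sample, take the first prediction that is not "unknown"."""
    combined = []
    for sample_predictions in zip(*prediction_lists):
        known = [p for p in sample_predictions if p != "unknown"]
        # If every model is unsure, keep "unknown" for that sample
        combined.append(known[0] if known else "unknown")
    return combined

classes = ["cat", "dog"]
rf_like = predict_or_unknown([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]], classes)
svm_like = predict_or_unknown([[0.7, 0.3], [0.3, 0.7], [0.5, 0.5]], classes)
combined = first_known(rf_like, svm_like)
print(rf_like, svm_like, combined)
```

In this toy run the first model is unsure about the second sample, so the combined result takes that sample from the second model. Note also that `VotingEnsemble` breaks ties in favor of the label that appears first in `predict_list`, because `Counter.most_common` keeps insertion order among equal counts.
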


The whole process is summarized in the following diagram:

![classification_model_flowchart.png](./classification_model_flowchart.png)


## 3. Results and Discussion
First let's implement the methods described above and see how they perform on a benchmark dataset. We will use the [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset, which contains 60,000 images of 10 different classes. The images are 32x32 pixels and are colored. The classes are: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.

In [18]:
# https://stackoverflow.com/questions/69687794/unable-to-manually-load-cifar10-dataset
import ssl
ssl._create_default_https_context = ssl._create_unverified_context

In [20]:
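(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

# rgb2gray returns floats in [0, 1]; rescale to integer intensities 0..255,
# because the GLCM code uses the pixel values as array indices
train_images_gray = (rgb2gray(train_images) * 255).astype(np.uint8)
test_images_gray = (rgb2gray(test_images) * 255).astype(np.uint8)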

In [24]:
# Initialize an empty list to hold the GLCM features for each image
train_features = []
test_features = []

# Calculate the GLCM features for each image in the training set
for image in train_images_gray:
	glcm = simple_greycomatrix(image)
	features = calculate_glcm_features(glcm)
	train_features.append(features)

# Calculate the GLCM features for each image in the test set
for image in test_images_gray:
	glcm = simple_greycomatrix(image)
	features = calculate_glcm_features(glcm)
	test_features.append(features)

# Now, train_features and test_features are lists of dictionaries containing the GLCM features for each image.
# We can use these features to train and evaluate the machine learning models.



In [None]:
# Convert train_features to a numerical array
train_features_array = np.array([list(features.values()) for features in train_features])

# cifar10.load_data() returns labels with shape (n, 1); flatten them to 1-D for scikit-learn
train_labels_flat = train_labels.ravel()

# Train each model using training data
rf = RandomForestClassifier().fit(train_features_array, train_labels_flat)
svm = SVC(probability=True).fit(train_features_array, train_labels_flat)
knn = KNeighborsClassifier().fit(train_features_array, train_labels_flat)
nb = GaussianNB().fit(train_features_array, train_labels_flat)
dt = DecisionTreeClassifier().fit(train_features_array, train_labels_flat)

# Convert test_features to a numerical array
test_features_array = np.array([list(features.values()) for features in test_features])

# Predict with each model using test data
RF_predict = rf.predict(test_features_array)
SVM_predict = svm.predict(test_features_array)
kNN_predict = knn.predict(test_features_array)
NB_predict = nb.predict(test_features_array)
DT_predict = dt.predict(test_features_array)

In [None]:
voting_results = VotingEnsemble(RF_predict, SVM_predict, kNN_predict, NB_predict, DT_predict)
combined_results = CombinedEnsemble(RF_predict, SVM_predict, kNN_predict, NB_predict, DT_predict)
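
To compare results like these with the numbers reported in the paper, we would need accuracy, precision, recall and F1. The following is a minimal sketch, assuming macro-averaging over classes (the summary does not state which averaging the paper uses). The function name `macro_scores` and the toy labels are made up for illustration.

```python
import numpy as np

def macro_scores(y_true, y_pred, n_classes):
    """Accuracy plus macro-averaged precision, recall and F1 from integer labels."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Confusion matrix: rows are true classes, columns are predicted classes
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # how often each class was predicted
    actual = cm.sum(axis=1)     # how often each class really occurs
    # Classes that were never predicted (or never occur) get a score of 0
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "precision": precision.mean(),
        "recall": recall.mean(),
        "f1": f1.mean(),
    }

scores = macro_scores([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], n_classes=3)
print(scores)
```

For the real data this could be called as `macro_scores(test_labels, voting_results, n_classes=10)`.
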

Unfortunately I didn't have time to actually test the code, but here are the summarized results from the original article:

1. **Performance of Individual Models:**
- **Random Forest (RF)**: Exhibited exceptional performance with 99.09% accuracy, 99.28% precision, and 98.96% recall, showing a strong balance in minimizing classification errors.
- **K-Nearest Neighbors (KNN) and Decision Tree:** Demonstrated average performance with 76.13% and 79.73% accuracy, respectively. Both models maintained a balance between precision and recall, indicating relatively balanced error rates for positive and negative classifications.
- **Support Vector Machine (SVM)**: Showed the lowest performance with only 43.47% accuracy, 43.52% precision, and 41.48% recall, indicating frequent misclassifications.
- **Naive Bayes (NB)**: Had an accuracy of 50.90%, with 56.55% precision but a lower recall of 46.09%, indicating challenges in correctly identifying positive classes.

2. **Ensemble Models:**
- **Combined Classifier**: Achieved an accuracy of 98.88%, 99.01% precision, 98.72% recall, and an F1-score of 98.86%, indicating superior performance in classification.
- **Voting Ensemble**: Recorded lower metrics with 87.39% accuracy, 88.42% precision, 86.24% recall, and an F1 score of 86.96%.

3. **Confusion Matrix Analysis**
- **Voting Ensemble**: Performed well in predicting Classes 0, 2, and 3, but had some difficulty with Class 2.
- **Combined Classifier**: Showed high accuracy in all classes, almost identical to the Voting Ensemble, with perfect predictions for classes 1 and 3.

4. **Bar Chart Analysis** (Fig. 4):

![fig4.png](./fig4.png)

- Models like RF, Voting Ensemble, and Combined Classifier nearly reached 100% accuracy in predicting all classes.
- SVM and NB were less accurate, particularly in predicting classes 1, 2, and 3.

5. **Overall Model Evaluation** (Table 3):

| Classifier | Accuracy | Precision | Recall | F1 Score |
|------------|----------|-----------|--------|----------|
| Random Forest (RF) | 0.993 | 0.976 | 1.000 | 0.988 |
| k-Nearest Neighbors (k-NN) | 0.871 | 0.730 | 0.852 | 0.786 |
| Decision Tree (Tree) | 0.868 | 0.778 | 0.790 | 0.784 |
| Support Vector Machine (SVM) | 0.599 | 0.302 | 0.481 | 0.371 |
| Naive Bayes (NB) | 0.665 | 0.198 | 0.658 | 0.305 |
| Voting Ensemble (VE) | 0.924 | 0.786 | 0.952 | 0.861 |
| Combined Classifier (CC) | 0.993 | 0.976 | 1.000 | 0.988 |

- RF and Combined Classifier had the highest accuracy (0.993), precision (0.976), recall (1.000), and F1 score (0.988).
- SVM had the lowest accuracy (0.599) and recall (0.481), while NB had the lowest precision (0.198).
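
As a quick sanity check on Table 3, the F1 score should be the harmonic mean of precision and recall, $F_1 = 2PR / (P + R)$. The sketch below recomputes it from the precision and recall columns. The values are copied from the table, and the tolerance of 0.002 is an assumption to allow for rounding to three decimals.

```python
# (precision, recall, reported F1) per classifier, copied from Table 3
table3 = {
    "RF": (0.976, 1.000, 0.988),
    "k-NN": (0.730, 0.852, 0.786),
    "Tree": (0.778, 0.790, 0.784),
    "SVM": (0.302, 0.481, 0.371),
    "NB": (0.198, 0.658, 0.305),
    "VE": (0.786, 0.952, 0.861),
    "CC": (0.976, 1.000, 0.988),
}

differences = {}
for name, (precision, recall, reported_f1) in table3.items():
    computed_f1 = 2 * precision * recall / (precision + recall)
    differences[name] = abs(computed_f1 - reported_f1)
    print(f"{name}: computed {computed_f1:.3f}, reported {reported_f1:.3f}")

consistent = all(diff < 0.002 for diff in differences.values())
```

All rows agree with the reported F1 within this rounding tolerance.
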

## 4. Conclusion
The conclusion of the study highlights the performance of various classification models and their use in ensemble methods:

1. **Performance of Individual Models**:

- **Random Forest (RF)**: Exhibited high accuracy, standing out as the most effective model among those tested.
- **Support Vector Machine (SVM) and Naive Bayes (NB):** Struggled with classification tasks, with SVM notably recording the lowest accuracy at 43.47%.
- **K-Nearest Neighbors (KNN) and Decision Tree**: Demonstrated moderate performance but maintained a balance between precision and recall.
2. **Ensemble Models**:

- **Voting Ensemble**: Achieved an accuracy of 87.39%, showing good potential.
- **Combined Classifier**: Outperformed the Voting Ensemble with an accuracy of 98.88%, precision of 99.01%, recall of 98.72%, and an F1 score of 98.86%.
3. **Implications for Future Development**:
- The success of the Combined Classifier, in particular, suggests that ensemble methods can significantly enhance the performance of models with lower accuracy. This opens avenues for further development and improvement in classification tasks.

In summary, the study demonstrates the effectiveness of ensemble methods, especially the Combined Classifier, in elevating the performance of individual classification models. This approach is especially beneficial for models that individually show lower accuracy, highlighting the potential of ensemble techniques in the field of classification.