# Computationally measuring image texture 

![Texture](texture_patches.jpeg)

In [None]:
!pip install mahotas
!pip install pycountry
!pip install countryinfo
!pip install pingouin
#pip install openpyxl

In [None]:

import pandas as pd
import mahotas as mt
import glob
import os
import PIL
from PIL import ImageOps
import numpy as np
import pycountry
from countryinfo import CountryInfo
import pingouin as pg
from IPython.display import display  # IPython's Image is left out: PIL's Image (below) would shadow it
import seaborn as sns
sns.set()
import plotly.express as px
from collections import Counter
from scipy.stats import entropy
from PIL import Image

# What are Haralick image texture features?

An image texture is a small-scale image feature that is determined by the spatial arrangement of pixel intensities. A texture does <b>not</b> depend on what an image depicts, but the two are often associated (images that depict large numbers of small particles will often have a similar texture, for instance).

Haralick features are a set of image texture descriptors derived from the Gray Level Co-occurrence Matrix (GLCM), which is a method of examining the texture of an image by considering the spatial relationship of pixels. The GLCM is a matrix that records how often each combination of pixel brightness values (gray levels) occurs in neighbouring pixels of an image. From the GLCM, various statistical measures (Haralick features) can be calculated to describe the texture of the image. There are 14 commonly used Haralick features, each capturing a different aspect of texture (item 12 below covers two of them):

1. Angular Second Moment (ASM): Measures image homogeneity. Higher values indicate more homogeneity or uniformity in the image texture.

2. Contrast: Measures the local variations in the gray-level co-occurrence matrix. Higher contrast values indicate greater disparities in pixel intensities.

3. Correlation: Measures the linear dependency between the gray levels of neighbouring pixels. High correlation indicates a predictable relationship between pixel values.

4. Sum of Squares: Variance: Reflects the variance of the image intensities. It's a measure of the spread or dispersion of pixel values.

5. Inverse Difference Moment (IDM): Also known as Homogeneity. It's high when the image has less contrast, indicating more homogeneity.

6. Sum Average: The average value of the sum of gray levels of pixel pairs. It's a measure of the overall brightness.

7. Sum Variance: Measures the variance of the sum of gray levels of pixel pairs, that is, how widely those sums spread around the Sum Average.

8. Sum Entropy: Measures the randomness or complexity in the sum of gray levels. Higher values indicate more complexity.

9. Entropy: Quantifies the disorder or complexity of the image. Higher entropy values imply more complex texture patterns.

10. Difference Variance: Measures the variance in the difference between the gray levels of the pixel pairs.

11. Difference Entropy: Measures the complexity or randomness of the differences between the gray levels of the pixel pairs.

12. Information Measures of Correlation I & II: These two features provide information about the complexity of the image texture as seen in the GLCM. They measure how correlated a pixel is to its neighbor over the whole image.

13. Maximal Correlation Coefficient (MCC): This measures the correlation between the probabilities of the pixel pairs. It requires eigenvalue calculations and is often more computationally intensive.

## Image entropy

### 1. Shannon entropy

Let's start with entropy. You'll remember that entropy measures how unpredictable a probability distribution is. The formula for entropy is:

$$
H(X) = - \sum_{i=1}^{n} p(x_i) \log_2 p(x_i)
$$

Here, $H(X)$ gives the entropy in bits, where $x_i \in X$ is an item in a discrete probability distribution. But how can we apply this to an image? The first step comes with recognising that images have what's known as a $bit$ encoding. This gives the levels of intensity a pixel of an image can take. The most common $bit$ encoding is 8-$bit$ encoding, which allows 256 ($2^8 = 256$) levels of intensity. This means that every image can be thought of as a histogram, where the bars of the histogram represent the counts of the pixels at each level of intensity. In this coding scheme, the $0^{th}$ intensity is black and the $255^{th}$ intensity is white.

![Virginia Woolf](woolf.jpeg)

Let's take an example image. This 8-$bit$ image consists of 50,400 pixels, and has $width \times height$ dimensions of $180 \times 280$ pixels. Extracting the intensity of each of these pixels gives the following counts:

![Woolf histogram](vwoolf_histogram.jpeg)

If we were to pick a pixel at random from this image, which intensity has the highest probability of being picked? By thinking of the image in this way, we can represent it as a probability distribution across pixel intensities, and this allows us to calculate the entropy of an image. If a specific intensity dominates, the image will have low entropy (it is very predictable); if a lot of intensities occur with equal frequency, it will have high entropy.

<b>The Shannon entropy of this image is 7.65 $bits$.</b>
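The histogram-based calculation described above can be sketched in a few lines of `numpy`. This is a minimal illustration rather than the code used to produce the figure quoted here: the helper name `shannon_entropy` and the two synthetic images are assumptions made for the example, and the input is assumed to be an 8-$bit$ grayscale array.

```python
import numpy as np

def shannon_entropy(gray, levels=256):
    """Entropy (in bits) of the intensity histogram of a grayscale image."""
    gray = np.asarray(gray)
    # Count how many pixels fall on each intensity level (the histogram).
    counts = np.bincount(gray.ravel(), minlength=levels)
    probs = counts / counts.sum()
    # Intensities that never occur contribute nothing to the sum.
    probs = probs[probs > 0]
    # Expected surprise, where the surprise of an intensity is log2(1 / p).
    return float((probs * np.log2(1.0 / probs)).sum())

# Synthetic test images (hypothetical, for illustration only).
flat = np.full((16, 16), 128, dtype=np.uint8)            # one intensity everywhere
spread = np.arange(256, dtype=np.uint8).reshape(16, 16)  # every intensity exactly once

print(shannon_entropy(flat))    # fully predictable image: lowest possible entropy
print(shannon_entropy(spread))  # all 256 levels equally likely: highest possible entropy
```

For an 8-$bit$ image the result always lies between 0 and 8 $bits$, so a value of 7.65 indicates that the intensities are spread fairly evenly across the available range.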


### 2. Haralick entropy

Though Shannon entropy is useful, it needs to be refined to capture image texture. This is because image texture is defined across the <b>differences</b> between pixels. The Haralick implementation of entropy captures this by way of what's known as the gray level co-occurrence matrix (GLCM). This looks at how often a transition between two pixel intensities occurs in four directions: horizontal, vertical, and two diagonals.

![GLCM](GCLM.jpeg)

The actual GLCM is defined as an $N_g \times N_g$ matrix, where $N_g$ is the number of intensities and each entry is the number of times that the $(i,j)$ pair occurs. There are four GLCMs for every image, but these are often averaged to give a single GLCM.
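To make the definition concrete, here is a small sketch that builds the four GLCMs for a toy 3-level image. The function name `glcm`, the `OFFSETS` list and the toy image are assumptions for illustration. The sketch counts ordered pairs in one direction only, whereas some implementations also count the reversed pair so that the matrix is symmetric.

```python
import numpy as np

def glcm(gray, offset, levels):
    """Count ordered intensity pairs (i, j) where j sits at `offset` from i."""
    gray = np.asarray(gray)
    d_row, d_col = offset
    rows, cols = gray.shape
    # Keep only pixels whose neighbour at the offset is still inside the image.
    r0, r1 = max(0, -d_row), min(rows, rows - d_row)
    c0, c1 = max(0, -d_col), min(cols, cols - d_col)
    first = gray[r0:r1, c0:c1]
    second = gray[r0 + d_row:r1 + d_row, c0 + d_col:c1 + d_col]
    counts = np.zeros((levels, levels), dtype=np.int64)
    # Add 1 to entry (i, j) for every observed pair, including repeated pairs.
    np.add.at(counts, (first.ravel(), second.ravel()), 1)
    return counts

# Horizontal, vertical and the two diagonal directions, as (row, column) offsets.
OFFSETS = [(0, 1), (1, 0), (1, 1), (1, -1)]

# Toy image with three intensity levels (0, 1, 2).
tiny = np.array([[0, 0, 1],
                 [1, 2, 2],
                 [0, 1, 2]])

per_direction = [glcm(tiny, offset, levels=3) for offset in OFFSETS]
averaged = np.mean(per_direction, axis=0)

print(per_direction[0])  # horizontal GLCM
print(averaged)          # the four GLCMs averaged into one
```

In the horizontal matrix, entry $(0,1)$ counts the places where a pixel of intensity 0 has a pixel of intensity 1 immediately to its right.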

The Haralick entropy formula works by getting the expected value of the surprise of every co-occurring pair of pixel intensities $(i,j)$ in all the GLCMs in the image. Here, $N_g$ is the number of possible pixel intensities, which in an 8-$bit$ image is 256.

$$
H = -\sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} P(i, j) \log_2 P(i, j)
$$

Note that if intensities $i$ and $j$ are never adjacent in any GLCM, then $P(i,j)$ will be zero, and that pair contributes nothing to the sum (by the convention $0 \log_2 0 = 0$). (Edge and corner pixels that do not have eight neighbours are ignored.)
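As a rough sketch of the formula above, the following function turns a matrix of co-occurrence counts into probabilities and sums the expected surprise. The function name `glcm_entropy` and the small count matrices are hypothetical, chosen only so that the results are easy to check by hand.

```python
import numpy as np

def glcm_entropy(counts):
    """Entropy (in bits) of a co-occurrence matrix given as raw pair counts."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()      # turn counts into probabilities P(i, j)
    nonzero = p[p > 0]   # pairs that never occur contribute nothing
    return float((nonzero * np.log2(1.0 / nonzero)).sum())

# Hypothetical count matrices for an image with two intensity levels.
one_pair = np.array([[16, 0],
                     [0, 0]])   # only the pair (0, 0) ever occurs
two_pairs = np.array([[0, 8],
                      [8, 0]])  # pairs (0, 1) and (1, 0), equally often
all_pairs = np.array([[4, 4],
                      [4, 4]])  # all four pairs equally often

for name, counts in [("one_pair", one_pair), ("two_pairs", two_pairs), ("all_pairs", all_pairs)]:
    print(name, glcm_entropy(counts))
```

Because the distribution is over pairs of intensities, the largest possible value is $2 \log_2 N_g$, which is 16 $bits$ for an 8-$bit$ image. This is why a Haralick entropy can exceed the 8-$bit$ ceiling of the histogram-based Shannon entropy.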

![Virginia Woolf](woolf.jpeg)

<b>The Haralick entropy of this image is 12.4 $bits$.</b>

## Image contrast

Contrast measures how strongly the intensities of neighbouring pixels differ across an image. Its formula is:

$$
\text{Contrast} = \sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} P(i, j) (i - j)^2
$$

Like Haralick entropy, all possible pairs of pixel intensities $(i,j)$ are assigned a probability $P(i,j)$, with this probability only being greater than zero if a pixel with intensity $i$ occurs next to a pixel with intensity $j$ in a GLCM. This is then multiplied by the square of the difference in intensity, so that big differences are 'rewarded' and small differences are 'punished'. In other words, we take the expected value of the squared differences in intensity between adjacent pixels.
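The weighting by squared difference can be seen in a short sketch. The function name `glcm_contrast` and the count matrices below are illustrative assumptions: each matrix puts all of its pairs at a single intensity difference, so the expected result is simply that difference squared.

```python
import numpy as np

def glcm_contrast(counts):
    """Expected squared intensity difference between co-occurring pixels."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()             # turn counts into probabilities P(i, j)
    i, j = np.indices(p.shape)  # row index = intensity i, column index = intensity j
    return float((p * (i - j) ** 2).sum())

# Hypothetical count matrices for an image with three intensity levels.
no_change = np.diag([5, 5, 5])       # neighbours always share the same intensity
small_steps = np.array([[0, 5, 0],
                        [5, 0, 5],
                        [0, 5, 0]])  # neighbours always differ by one level
big_steps = np.array([[0, 0, 5],
                      [0, 0, 0],
                      [5, 0, 0]])    # neighbours always differ by two levels

for name, counts in [("no_change", no_change), ("small_steps", small_steps), ("big_steps", big_steps)]:
    print(name, glcm_contrast(counts))
```

Doubling the intensity difference quadruples its contribution, which is the 'reward' for big differences described above. Note that the result is expressed in squared intensity levels rather than in $bits$.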

![Eliot](eliot.jpg)

<b>The Haralick contrast of this image is 667.19.</b>

## How to compute Haralick features using the `mahotas` library

The `mahotas` library can compute all 14 Haralick features (the 14th only when `compute_14th_feature=True` is passed) on any suitably processed image. This processing is usually done using `PIL` (the Python Imaging Library, installed as the `Pillow` package) and `numpy`. The steps are:

* Open the image using the `Image.open` command
* Convert the image to `RGBA` format using `.convert("RGBA")`
* Convert to 8-$bit$ grayscale using `.convert("L")`
* Apply `ImageOps.grayscale` (this also converts to mode `L`, so after the previous step it changes nothing)
* Convert to a `numpy` array using `np.asanyarray`

Once this has been done, the Haralick features can be extracted using `mahotas`:

* `features = mt.features.haralick(img, return_mean=True, compute_14th_feature=True)`

In summary, the steps are:

```
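import numpy as np
from PIL import Image, ImageOps

img = Image.open("path_to_image.png")
img = img.convert("RGBA")      # normalise the colour mode first
img = img.convert("L")         # 8-bit grayscale
img = ImageOps.grayscale(img)  # already mode "L", so the image is unchanged
img = np.asanyarray(img)       # 2-D uint8 array, ready for mahotas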
```
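With `return_mean=True`, `mahotas` returns one value per feature as a plain array, so it helps to attach names before picking values out. The sketch below assumes the feature order used for the dataframe columns later in this notebook. The helper names `label_haralick` and `entropy_and_contrast` are hypothetical, and `mahotas` is imported inside the second helper so that the first can be tried without it.

```python
import numpy as np

# Feature names, in the order used for the dataframe columns later in this notebook.
HARALICK_NAMES = ['angular_2nd_momentum', 'contrast', 'correlation', 'SS_variance',
                  'Inverse_diff_moment', 'sum_average', 'sum_variance', 'sum_entropy',
                  'entropy', 'difference_variation', 'difference_entropy', 'info_corr_1',
                  'info_corr_2', 'max_corr_coeff']

def label_haralick(vector):
    """Pair each value of a 14-element Haralick feature vector with its name."""
    vector = np.asarray(vector, dtype=float)
    if vector.shape != (len(HARALICK_NAMES),):
        raise ValueError(f"expected {len(HARALICK_NAMES)} values, got shape {vector.shape}")
    return dict(zip(HARALICK_NAMES, vector))

def entropy_and_contrast(gray):
    """Return (entropy, contrast) for a 2-D 8-bit grayscale array."""
    import mahotas as mt  # imported here so label_haralick works without mahotas
    vector = mt.features.haralick(gray, return_mean=True, compute_14th_feature=True)
    labelled = label_haralick(vector)
    return labelled['entropy'], labelled['contrast']

# Placeholder vector (0, 1, ..., 13) that only shows which position holds which feature.
positions = label_haralick(np.arange(14))
print(positions['contrast'], positions['entropy'])
```

Calling `entropy_and_contrast(img)` on the array produced by the steps above would then give the two values discussed in the previous sections.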

## Exercise: Get the Haralick entropy and contrast of each of these images

![Sam](beckett.jpeg)

![Jim](joyce.jpeg)

# Can we predict image textures from features that are not themselves visual?

Here, we are going to make a risky claim and test it. Specifically, we are going to test whether we can predict the visual features of national flags from the demographic and geographical characteristics of the countries they represent. This will come in the form of three hypotheses:

1. $H_1$: The greater the ethnic diversity of a country, the higher the entropy of its flag
2. $H_2$: The more borders a country has, the lower the contrast of its flag
3. $H_3$: Contrast will trump entropy as the optimised variable in most flags

How can we test this? 

1. Get a [dataset of national flags](https://flagpedia.net/)
2. Get the entropy and contrast of each flag
3. Measure the ethnic diversity and number of borders of each country
4. Statistically test whether ethnic diversity predicts entropy and number of borders predicts contrast
5. Evaluate the distribution of flags in the entropy-contrast space
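Step 4 amounts to fitting a straight line to one predictor and one outcome. The following is a minimal `numpy` sketch of that idea, not the `pingouin` routine used later in this notebook, which also reports standard errors and p-values. The function name `simple_ols` is an assumption, and the numbers are invented for illustration and are not flag data.

```python
import numpy as np

def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares, skipping missing values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Drop any observation where either value is missing.
    keep = ~(np.isnan(x) | np.isnan(y))
    x, y = x[keep], y[keep]
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (intercept + slope * x)
    # Share of the variance in y that the fitted line accounts for.
    r2 = 1.0 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()
    return slope, intercept, r2

# Invented numbers, for illustration only (not real flag data).
predictor = [0.1, 0.3, np.nan, 0.5, 0.7, 0.9]
outcome = [2.0, 2.9, 3.1, 4.2, 4.8, 6.1]

slope, intercept, r2 = simple_ols(predictor, outcome)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r2={r2:.2f}")
```

A positive slope would be in the direction predicted by $H_1$ and a negative slope in the direction predicted by $H_2$. Whether either is distinguishable from zero is what the p-value in the full regression output addresses.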

## Compute Haralick features of all our flags and create a dataframe

In [None]:
filenames = glob.glob('*.png')  # list of all .png files in the directory

names = []
images = []

for i in filenames:
    names.append(os.path.basename(i)[:-4])
    img = Image.open(i)
    img = img.convert("RGBA")
    img = img.convert("L")
    img = ImageOps.grayscale(img)
    img = np.asanyarray(img)
    images.append(img)

In [None]:
haralick = [mt.features.haralick(i, return_mean = True, compute_14th_feature=True) for i in images]


features = ['angular_2nd_momentum', 'contrast', 'correlation', 'SS_variance', \
            'Inverse_diff_moment', 'sum_average', 'sum_variance', 'sum_entropy', \
            'entropy','difference_variation', 'difference_entropy', 'info_corr_1', \
            'info_corr_2', 'max_corr_coeff']

h_df = pd.DataFrame(haralick, columns = features)
h_df['short_names'] = names

## Get full country names

In [None]:
full_name = []

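for i in h_df['short_names']:
    try:
        # File names are treated as ISO 3166-1 alpha-2 codes; anything else gets NaN
        full_name.append(pycountry.countries.get(alpha_2=i.upper()).name)
    except (AttributeError, LookupError):
        full_name.append(np.nan)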
        
h_df['full_name'] = full_name

In [None]:
h_df

## Get number of borders

In [None]:
borders = []


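for i in h_df['full_name']:
    try:
        # Count the neighbouring countries listed by countryinfo
        borders.append(len(CountryInfo(i).borders()))
    except Exception:
        # Missing names (NaN) and countries unknown to countryinfo get NaN
        borders.append(np.nan)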

h_df['borders'] = borders

In [None]:
h_df

## Get data on ethnic diversity

In [None]:
ethnic = pd.read_csv('ethnic_fractions.csv')

In [None]:
h_df = pd.merge(h_df, ethnic, on='full_name', how='left')


In [None]:
data = h_df[['ethnic fractionalization', 'borders', 'entropy', 'contrast', 'full_name']].copy()  # copy, so later column assignments do not touch h_df

In [None]:
px.scatter(h_df, x = 'ethnic fractionalization', y = 'entropy', hover_data = ['full_name'], trendline = 'ols')

In [None]:
lm = pg.linear_regression(h_df['ethnic fractionalization'], h_df['entropy'], remove_na = True)

In [None]:
lm

In [None]:
px.scatter(h_df, x = 'borders', y = 'contrast', hover_data = ['full_name', 'borders'], trendline = 'ols')

In [None]:
lm = pg.linear_regression(h_df['borders'], h_df['contrast'], remove_na = True)

In [None]:
lm

In [None]:
mid_e = (data['entropy'].min() + data['entropy'].max()) / 2
mid_c = (data['contrast'].min() + data['contrast'].max()) / 2

data['entropy'] = data['entropy'] - mid_e
data['contrast'] = data['contrast'] - mid_c

In [None]:
fig = px.scatter(data, x="entropy", y="contrast", hover_data = ['full_name', 'borders'])
fig.show()

## Write-up

A more detailed exploration of these results [can be found in this preprint](https://www.preprints.org/frontend/manuscript/d4f0a405c00c26799d2872250c47fd12/download_pub).