In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# Feature Extraction from X-Ray Images Using Concentric Rings

In this project, we extract numerical features from bone X-ray images in order to classify each image as normal or osteoporotic.

The basic idea is to convert each image into a set of numerical features that describe how pixel values are distributed in concentric rings (annular regions) around the center of the image. The extracted features are then used to train a classification model.


We divide each image into rings around the center and calculate 4 features from each ring:

* Mean – Average pixel value in the ring

* Standard Deviation – How much the pixel values vary

* Shannon Entropy – Measures the randomness or information content

* Information Gain (IG) – Measures how much entropy reduces if we split the ring's pixels into two parts (above and below mean)
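
Written out, the four features look roughly as follows. The symbols are introduced here for illustration only: $R$ is the set of $N$ pixel values $x_1, \dots, x_N$ in one ring, $p_j$ is the fraction of those pixels falling in histogram bin $j$, and $R_{\le}$ and $R_{>}$ are the pixels at or below and above the ring mean $\mu$. The natural logarithm is assumed, which matches the default of `scipy.stats.entropy`. The notebook's code also adds a small constant to the histogram counts before taking the entropy, which these idealized formulas omit.

$$
\begin{aligned}
\mu &= \frac{1}{N}\sum_{i=1}^{N} x_i \\
\sigma &= \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2} \\
H(R) &= -\sum_{j} p_j \ln p_j \\
IG(R) &= H(R) - \left( \frac{|R_{\le}|}{N}\, H(R_{\le}) + \frac{|R_{>}|}{N}\, H(R_{>}) \right)
\end{aligned}
$$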

If we divide the image into 10 rings, we get 10 x 4 = 40 features.

We repeat this for different ring settings:

* 5 rings → 20 features

* 8 rings → 32 features

* 10 rings → 40 features

Each image is converted into one row of numbers, ending with a class label (0 for `normal`, 1 for `osteoporosis`, following the order of the input folders).
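
As a quick sanity check on the row layout, the sketch below builds the column names for a given number of rings and counts them. The helper name `feature_columns` is hypothetical and used only for illustration; the column names follow the `mean_r`, `std_r`, `entropy_r`, `ig_r` pattern used when the CSV files are written.

```python
def feature_columns(k):
    """Column names for k rings: four features per ring, then the label."""
    cols = []
    for r in range(k):
        cols.extend([f"mean_{r}", f"std_{r}", f"entropy_{r}", f"ig_{r}"])
    cols.append("label")
    return cols


# Report the row width for each ring setting used in this notebook
for n_rings in (5, 8, 10):
    n_features = len(feature_columns(n_rings)) - 1  # exclude the label
    print(n_rings, "rings:", n_features, "features + 1 label")
```

Features are grouped ring by ring, from the innermost ring (index 0) outward.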

In [None]:
import os
import cv2
import numpy as np
import pandas as pd
from scipy.stats import entropy

In [None]:
# Set ring division values
k = [5, 8, 10]

In [None]:
folders = [
    "/content/drive/MyDrive/python/normal",
    "/content/drive/MyDrive/python/osteoporosis"
]

## Information Gain Calculation

This function calculates the information gain obtained by splitting a set of pixel values at a threshold, taken to be the mean of those values. In this notebook it is applied to the pixels of a single ring rather than to the whole image. Pixels at or below the threshold form one group (`group1`); pixels above it form the other (`group2`).

The function calculates the entropy before and after the split:

* Entropy Before the Split: The entropy of all the pixel values passed in, computed from a 256-bin histogram.

* Entropy After the Split: The weighted sum of the entropies of the two groups (`group1` and `group2`), where each weight is the fraction of pixels in that group.

The difference between the entropy before and after the split gives the information gain, which indicates how much uncertainty is reduced by this thresholding.
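
To make the arithmetic concrete, here is a minimal worked sketch in plain Python. It is not the notebook's function: the names `shannon_entropy` and `information_gain` are hypothetical, and it assumes entropy is computed from exact value counts with the natural logarithm, without the 256-bin histogram or the smoothing constant.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (natural log) of the empirical distribution of values."""
    n = len(values)
    if n == 0:
        return 0.0
    # Sum of p * ln(1/p) over the distinct values
    return sum((c / n) * math.log(n / c) for c in Counter(values).values())


def information_gain(values):
    """Entropy reduction from splitting values at their mean."""
    if not values:
        return 0.0
    threshold = sum(values) / len(values)
    low = [v for v in values if v <= threshold]
    high = [v for v in values if v > threshold]
    after = (len(low) / len(values)) * shannon_entropy(low) \
        + (len(high) / len(values)) * shannon_entropy(high)
    return shannon_entropy(values) - after


# Two equally sized, well separated gray levels: the mean (105) splits them
# into two pure groups, so the gain equals the full entropy, ln 2.
pixels = [10, 10, 10, 10, 200, 200, 200, 200]
print(round(information_gain(pixels), 4))  # 0.6931
```

A ring whose pixels all share one value has zero entropy before the split, so its gain in this sketch is 0.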

In [None]:
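def find_ig(img):
    """Information gain from splitting pixel values at their mean."""
    if len(img) == 0:
        return 0.0
    threshold = np.mean(img)
    group1 = img[img <= threshold]
    group2 = img[img > threshold]

    # Use the same 256 bins (one per 8-bit gray level) for all three
    # histograms so the entropies are comparable; +1 is add-one smoothing.
    bins, value_range = 256, (0, 256)

    hist = np.histogram(img, bins=bins, range=value_range)[0] + 1
    before = entropy(hist)

    hist1 = np.histogram(group1, bins=bins, range=value_range)[0] + 1
    e1 = entropy(hist1)

    hist2 = np.histogram(group2, bins=bins, range=value_range)[0] + 1
    e2 = entropy(hist2)

    # Weight each group's entropy by its share of the pixels
    weight1 = len(group1) / len(img)
    weight2 = len(group2) / len(img)
    after = weight1 * e1 + weight2 * e2

    return before - after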

The following function computes features from concentric rings within an image.

* It first takes the largest distance `d` from the image center to any corner and divides it into `k` rings of equal width, with outer radii `d/k, 2d/k, ..., d`.

* For each ring, the function extracts pixel values and computes: Mean, Standard Deviation, Entropy, and Information Gain (IG).

These features are then collected and returned as a list.
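
To see how pixels are assigned to rings, the sketch below labels every pixel of a small image with its ring index. The helper `ring_index_map` is hypothetical and is not part of the notebook; it assumes the same center, corner distance, and equally spaced radii described above, and uses `np.searchsorted` in place of an explicit loop over radii.

```python
import numpy as np


def ring_index_map(shape, k):
    """Return an integer array giving the ring index (0..k-1) of every pixel."""
    h, w = shape
    center_y, center_x = h // 2, w // 2

    # Largest distance from the center to any corner
    corners = [(0, 0), (0, w), (h, 0), (h, w)]
    d = max(np.hypot(cy - center_y, cx - center_x) for cy, cx in corners)
    radii = np.linspace(d / k, d, k)  # outer radius of each ring

    rows, cols = np.indices(shape)
    dist = np.hypot(rows - center_y, cols - center_x)

    # side="left" maps a distance v to the first ring whose outer radius
    # is >= v, so the center pixel (v == 0) lands in ring 0.
    return np.searchsorted(radii, dist, side="left")


rings = ring_index_map((6, 6), k=3)
print(rings)
print(np.bincount(rings.ravel(), minlength=3))  # number of pixels per ring
```

Because the rings have equal width rather than equal area, outer rings cover more pixels than inner ones, except where the image border cuts them off.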

In [None]:
def extract_features(img, k):
    h, w = img.shape
    center_y, center_x = h // 2, w // 2

    corners = [(0, 0), (0, w), (h, 0), (h, w)]
    d = max([np.hypot(cy - center_y, cx - center_x) for cy, cx in corners])

    radii = np.linspace(d / k, d, k)

    rows, cols = np.indices(img.shape)
    dist = np.hypot(rows - center_y, cols - center_x)
    features = []
    prev = -1.0  # start below 0 so the center pixel (dist == 0) falls in the first ring

    for radius in radii:
        pixels = img[(dist <= radius) & (dist > prev)].flatten()

        mean = np.mean(pixels)
        std = np.std(pixels)
        # cv2.imread returns 8-bit gray levels (0..255), so use one bin per level
        hist, _ = np.histogram(pixels, bins=256, range=(0, 256), density=True)
        e = entropy(hist + 1e-10)
        ig = find_ig(pixels)

        features.extend([mean, std, e, ig])
        prev = radius

    return features

In [None]:
for i in k:
    data = []

    for label, folder in enumerate(folders):
        for filename in os.listdir(folder):
            if filename.lower().endswith(('.png', '.jpg', '.jpeg')):
                path = os.path.join(folder, filename)
                img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
                if img is None:
                    continue
                features = extract_features(img, i)
                features.append(label)  # Add class ID
                data.append(features)
    col_names = []
    for r in range(i):
        col_names.extend([f"mean_{r}", f"std_{r}", f"entropy_{r}", f"ig_{r}"])
    col_names.append("label")
    # Build the table for this ring setting and save it
    df = pd.DataFrame(data, columns=col_names)
    df.to_csv(f"features_{i}.csv", index=False)