<h1 style = "text-align:center"> <strong>Artificial Neural Network </strong></h1>

<h2 style = "text-align:center"> People Detection </h2>

<h5 style = "text-align:center"> Authors: <a href="https://github.com/AlvaroVasquezAI">Álvaro García Vásquez</a>, <a href="#">Luis Alfredo Cuamatzi Flores</a> and <a href="#">Fernando Daniel Portilla Posadas</a> </h5>

Description: This notebook shows how to implement an Artificial Neural Network to detect people in images.

<h3 style = 'text-align:center'> <strong>Dataset </strong></h3>

We created a dataset with 100 images of the city with people and animals, and 100 images of the city with animals but without people. After that, we split the dataset into four categories: Absent, Animal, Noise, and People (A, N, Noise, P). This was done with a program that divides each image into 128x128 grids, manually labeling each grid as A, N, Noise, or P. The program saves the grids in the corresponding folder. We also created a file .csv for both folders V1 and V2. Each file .csv contains NumberOfImage, NumberOfGrid, Class and TypeOfFile of each grid.

<strong>Features:</strong>
- Size: 1024x1024 pixels
- Format: PNG
- Channels: RGB
- Images generated with Artificial Intelligence

<strong>Folder structure before splitting the dataset:</strong>
- dataset
    - V1 (with people and animals)
        - 1.png
        - 2.png
        - ...
        - 100.png
    - V2 (without people but with animals)
        - 1.png
        - 2.png
        - ...
        - 100.png

<strong> Folder structure after splitting the dataset: </strong>
- dataset
    - V1 (with people and animals)
        - output
            - A (Absent)
                - grid_V1_numberOfImage_numberOfGrid_A.png
                - ...
            - N (Animal)
                - grid_V1_numberOfImage_numberOfGrid_N.png
                - ...
            - Noise (Noise)
                - grid_V1_numberOfImage_numberOfGrid_Noise.png
                - ...
            - P (People)
                - grid_V1_numberOfImage_numberOfGrid_P.png
        - 1.png
        - 2.png
        - ...
        - 100.png
    - V2 (without people but with animals)
        - output
            - A (Absent)
                - grid_V2_numberOfImage_numberOfGrid_A.png
                - ...
            - N (Animal)
                - grid_V2_numberOfImage_numberOfGrid_N.png
                - ...
            - Noise (Noise)
                - grid_V2_numberOfImage_numberOfGrid_Noise.png
                - ...
            - P (People)
                - grid_V2_numberOfImage_numberOfGrid_P.png
        - 1.png
        - 2.png
        - ...
        - 100.png


<h3 style='text-align:center'> <strong>Tool for labeling images:</strong> </h3>
<img src="resources/images/UI.png" alt="UI" style = "display:block; margin-left:auto; margin-right:auto; width:50%;">

<h3 style='text-align:center'> <strong>Image Features:</strong> </h3>

<table style='border: 1px solid black; width: 100%'>
<thead>
    <tr '>
        <th style='border: 1px solid black; padding: 8px; text-align: left;'>Feature</th>
        <th style='border: 1px solid black; padding: 8px; text-align: left;'>Description</th>
        <th style='border: 1px solid black; padding: 8px; text-align: left;'>Justification for Neural Network Training</th>
    </tr>
</thead>
<tbody>
    <tr>
        <td style='border: 1px solid black; padding: 8px;'>Color Channels (R, G, B)</td>
        <td style='border: 1px solid black; padding: 8px;'>Intensity values for red, green, and blue channels in the image.</td>
        <td style='border: 1px solid black; padding: 8px;'>Fundamental for capturing visual features, crucial for identifying elements like clothing or skin tones.</td>
    </tr>
    <tr>
        <td style='border: 1px solid black; padding: 8px;'>RGB Mean</td>
        <td style='border: 1px solid black; padding: 8px;'>Average of the RGB values across the image.</td>
        <td style='border: 1px solid black; padding: 8px;'>Provides a baseline color metric, useful for color normalization and background differentiation.</td>
    </tr>
    <tr>
        <td style='border: 1px solid black; padding: 8px;'>RGB Mode</td>
        <td style='border: 1px solid black; padding: 8px;'>Most frequent RGB values in the image.</td>
        <td style='border: 1px solid black; padding: 8px;'>Identifies dominant colors which can signify important features within the scene.</td>
    </tr>
    <tr>
        <td style='border: 1px solid black; padding: 8px;'>RGB Variance</td>
        <td style='border: 1px solid black; padding: 8px;'>Measure of the spread of RGB values.</td>
        <td style='border: 1px solid black; padding: 8px;'>Useful for understanding color diversity, which might indicate areas of interest or changes in scene content.</td>
    </tr>
    <tr>
        <td style='border: 1px solid black; padding: 8px;'>RGB Standard Deviation</td>
        <td style='border: 1px solid black; padding: 8px;'>Standard deviation of RGB values.</td>
        <td style='border: 1px solid black; padding: 8px;'>Highlights areas with high color variability, important for detecting edges and contours.</td>
    </tr>
    <tr>
        <td style='border: 1px solid black; padding: 8px;'>Color Histogram</td>
        <td style='border: 1px solid black; padding: 8px;'>Distribution of pixel intensities in color channels.</td>
        <td style='border: 1px solid black; padding: 8px;'>Essential for analyzing the color distribution and for segmenting images based on color intensity.</td>
    </tr>
    <tr>
        <td style='border: 1px solid black; padding: 8px;'>Gray Level Co-occurrence Matrix Properties</td>
        <td style='border: 1px solid black; padding: 8px;'>Statistical features extracted from how often different combinations of pixel brightness values (gray levels) occur in an image.</td>
        <td style='border: 1px solid black; padding: 8px;'>Provides textural features which are critical for recognizing patterns and structures within images that might not be visible through color alone.</td>
    </tr>
    <tr>
        <td style='border: 1px solid black; padding: 8px;'>Local Binary Patterns</td>
        <td style='border: 1px solid black; padding: 8px;'>Method for texture description where each pixel is compared with its surrounding pixels.</td>
        <td style='border: 1px solid black; padding: 8px;'>Useful for texture classification, a fundamental aspect when differentiating between different objects and their surroundings.</td>
    </tr>
    <tr>
        <td style='border: 1px solid black; padding: 8px;'>Histogram of Oriented Gradients (HOG)</td>
        <td style='border: 1px solid black; padding: 8px;'>Counts occurrences of gradient orientation in localized portions of an image.</td>
        <td style='border: 1px solid black; padding: 8px;'>Effective for object detection in vision tasks, particularly useful for detecting human forms in various poses and lighting conditions.</td>
    </tr>
    <tr>
        <td style='border: 1px solid black; padding: 8px;'>Peak Local Max</td>
        <td style='border: 1px solid black; padding: 8px;'>Identifies local maxima in an image, points where the region around a pixel has lower intensity values.</td>
        <td style='border: 1px solid black; padding: 8px;'>Helps to detect key points, which are essential for tasks like feature matching and scene understanding.</td>
    </tr>
</tbody>
</table>

<h3 style='text-align:center'> <strong>Extracted Features:</strong> </h3>

In [11]:
import numpy as np
import skimage.feature
import skimage.measure
import skimage.filters
import matplotlib.pyplot as plt

In [12]:
class Image:
    def __init__(self, image, name): 
        self.image = self.setImage(image)
        self.name = self.setName(name)
        self.numberOfGrid = self.setNumberOfGrid(name)
        self.numberOfImageBelonging = self.setNumberOfImageBelonging(name)
        self.datasetBelonging = self.setDatasetBelonging(name)
        self.classBelonging = self.setClassBelonging(name)
        self.size = self.setSize(image)
        
        self.colorChannelsRGB = self.extractColorChannelsRGB()
        self.RGBMean = self.calculateRGBMean()
        self.RGBMode = self.calculateRGBMode()
        self.RGBVariance = self.calculateRGBVariance()
        self.RGBStandardDeviation = self.calculateRGBStandardDeviation()
        self.colorHistogram = self.calculateColorHistogram()

        self.grayLevelCooccurrenceMatrixProperties = self.calculateGrayLevelCooccurrenceMatrixProperties()
        self.localBinaryPatterns = self.calculateLocalBinaryPatterns()
        self.histogramOfOrientedGradients = self.calculateHistogramOfOrientedGradients()
        self.peakLocalMax = self.calculatePeakLocalMax()

        self.allFeatures = [self.colorChannelsRGB, self.RGBMean, self.RGBMode, self.RGBVariance, self.RGBStandardDeviation, self.colorHistogram, self.grayLevelCooccurrenceMatrixProperties, self.localBinaryPatterns, self.histogramOfOrientedGradients, self.peakLocalMax]


    def setImage(self, image):
        return image
    
    def setName(self, name):
        return name
    
    def setNumberOfGrid(self, name):
        return name.split("_")[3]
    
    def setNumberOfImageBelonging(self, name):
        return name.split("_")[2]
    
    def setDatasetBelonging(self, name):
        return name.split("_")[1]
    
    def setClassBelonging(self, name):
        return name.split("_")[4]
    
    def setSize(self, image):
        return image.shape
     
    def extractColorChannelsRGB(self):
        redChannel = self.image[:,:,0]
        greenChannel = self.image[:,:,1]
        blueChannel = self.image[:,:,2]

        return [redChannel, greenChannel, blueChannel]
    
    def calculateRGBMean(self):
        redChannel = self.image[:,:,0]
        greenChannel = self.image[:,:,1]
        blueChannel = self.image[:,:,2]
        redMean = np.mean(redChannel)
        greenMean = np.mean(greenChannel)
        blueMean = np.mean(blueChannel) 

        return [redMean, greenMean, blueMean]
    
    def calculateRGBMode(self):
        redChannel = self.image[:,:,0]
        greenChannel = self.image[:,:,1]
        blueChannel = self.image[:,:,2]
        redMode = skimage.exposure.histogram(redChannel)[1].argmax()
        greenMode = skimage.exposure.histogram(greenChannel)[1].argmax()
        blueMode = skimage.exposure.histogram(blueChannel)[1].argmax()

        return [redMode, greenMode, blueMode]
    
    def calculateRGBVariance(self):
        redChannel = self.image[:,:,0]
        greenChannel = self.image[:,:,1]
        blueChannel = self.image[:,:,2]
        redVariance = np.var(redChannel.flatten())
        greenVariance = np.var(greenChannel.flatten())
        blueVariance = np.var(blueChannel.flatten())

        return [redVariance, greenVariance, blueVariance]
    
    def calculateRGBStandardDeviation(self):
        redChannel = self.image[:,:,0]
        greenChannel = self.image[:,:,1]
        blueChannel = self.image[:,:,2]
        redStandardDeviation = np.std(redChannel)
        greenStandardDeviation = np.std(greenChannel)
        blueStandardDeviation = np.std(blueChannel)

        return [redStandardDeviation, greenStandardDeviation, blueStandardDeviation]
    
    def calculateColorHistogram(self):
        redChannel = self.image[:,:,0]
        greenChannel = self.image[:,:,1]
        blueChannel = self.image[:,:,2]
        redHistogram = skimage.exposure.histogram(redChannel)[0]
        greenHistogram = skimage.exposure.histogram(greenChannel)[0]
        blueHistogram = skimage.exposure.histogram(blueChannel)[0]

        return [redHistogram, greenHistogram, blueHistogram]

    def calculateGrayLevelCooccurrenceMatrixProperties(self):
        image_gray = skimage.color.rgb2gray(self.image)
        image_gray_u8 = (image_gray * 255).astype(np.uint8)
        glcm = skimage.feature.graycomatrix(image_gray_u8, distances=[1], angles=[0], levels=256, symmetric=True, normed=True)
        contrast = skimage.feature.graycoprops(glcm, 'contrast')[0, 0]
        dissimilarity = skimage.feature.graycoprops(glcm, 'dissimilarity')[0, 0]
        homogeneity = skimage.feature.graycoprops(glcm, 'homogeneity')[0, 0]
        energy = skimage.feature.graycoprops(glcm, 'energy')[0, 0]
        correlation = skimage.feature.graycoprops(glcm, 'correlation')[0, 0]

        return [contrast, dissimilarity, homogeneity, energy, correlation]

    def calculateLocalBinaryPatterns(self):
        image_gray = skimage.color.rgb2gray(self.image)
        image_gray_u8 = (image_gray * 255).astype(np.uint8)
        
        return skimage.feature.local_binary_pattern(image_gray_u8, P=8, R=1, method='uniform')

    def calculateHistogramOfOrientedGradients(self):
        image_gray = skimage.color.rgb2gray(self.image)

        return skimage.feature.hog(image_gray, pixels_per_cell=(16, 16), cells_per_block=(1, 1), orientations=9, visualize=False)

    def calculatePeakLocalMax(self):
        image_gray = skimage.color.rgb2gray(self.image)

        return skimage.feature.peak_local_max(image_gray, min_distance=1, threshold_abs=0.1, num_peaks=10)
    
    def generateFeatureVector(self):
        featureVector = np.array([])

        featureVector = np.append(featureVector, self.RGBMean)
        featureVector = np.append(featureVector, self.RGBMode)
        featureVector = np.append(featureVector, self.RGBVariance)
        featureVector = np.append(featureVector, self.RGBStandardDeviation)
        featureVector = np.append(featureVector, np.concatenate([ histogram.flatten() for histogram in self.colorHistogram ]))
        featureVector = np.append(featureVector, self.grayLevelCooccurrenceMatrixProperties)
        featureVector = np.append(featureVector, self.histogramOfOrientedGradients)
        featureVector = np.append(featureVector, self.peakLocalMax)

        return featureVector
    

Try to extract the following features from just one image:

In [13]:
testImgPath = "dataset/V1/output/P/grid_V1_4_40_P.png"

image = skimage.io.imread(testImgPath)
ImageTest = Image(image, "grid_V1_4_40_P.png")



In [14]:
# All features
print("All features: ")
for feature in ImageTest.allFeatures:
    print(feature)

All features: 
[array([[ 37,  37,  37, ...,  54,  55,  55],
       [ 35,  36,  35, ...,  55,  55,  57],
       [ 34,  36,  36, ...,  52,  51,  54],
       ...,
       [136, 136, 130, ...,  25,  25,  25],
       [121, 128, 120, ...,  24,  24,  25],
       [112, 117, 117, ...,  25,  25,  25]], dtype=uint8), array([[ 31,  31,  31, ...,  43,  45,  45],
       [ 30,  31,  29, ...,  45,  46,  47],
       [ 31,  32,  31, ...,  41,  40,  44],
       ...,
       [111, 111, 106, ...,  22,  22,  21],
       [ 95, 101,  96, ...,  22,  21,  21],
       [ 85,  91,  93, ...,  22,  22,  21]], dtype=uint8), array([[33, 32, 34, ..., 35, 36, 37],
       [30, 31, 31, ..., 35, 35, 38],
       [31, 31, 31, ..., 34, 31, 36],
       ...,
       [91, 92, 87, ..., 26, 26, 27],
       [77, 83, 78, ..., 27, 26, 26],
       [68, 73, 74, ..., 26, 25, 25]], dtype=uint8)]
[48.37353515625, 39.5794677734375, 39.05059814453125]
[189, 175, 180]
[720.5682351589203, 527.0259114354849, 420.05267664417624]
[26.84340207870307

In [15]:
# Get each feature
print("Color channels RGB: ")
print(ImageTest.colorChannelsRGB)

print("RGB mean: ")
print(ImageTest.RGBMean)

print("RGB mode: ")
print(ImageTest.RGBMode)

print("RGB variance: ")
print(ImageTest.RGBVariance)

print("RGB standard deviation: ")
print(ImageTest.RGBStandardDeviation)

print("Color histogram: ")
print(ImageTest.colorHistogram)

print("Gray level cooccurrence matrix properties: ")
print(ImageTest.grayLevelCooccurrenceMatrixProperties)

print("Local binary patterns: ")
print(ImageTest.localBinaryPatterns)

print("Histogram of oriented gradients: ")
print(ImageTest.histogramOfOrientedGradients)

print("Peak local max: ")
print(ImageTest.peakLocalMax)

Color channels RGB: 
[array([[ 37,  37,  37, ...,  54,  55,  55],
       [ 35,  36,  35, ...,  55,  55,  57],
       [ 34,  36,  36, ...,  52,  51,  54],
       ...,
       [136, 136, 130, ...,  25,  25,  25],
       [121, 128, 120, ...,  24,  24,  25],
       [112, 117, 117, ...,  25,  25,  25]], dtype=uint8), array([[ 31,  31,  31, ...,  43,  45,  45],
       [ 30,  31,  29, ...,  45,  46,  47],
       [ 31,  32,  31, ...,  41,  40,  44],
       ...,
       [111, 111, 106, ...,  22,  22,  21],
       [ 95, 101,  96, ...,  22,  21,  21],
       [ 85,  91,  93, ...,  22,  22,  21]], dtype=uint8), array([[33, 32, 34, ..., 35, 36, 37],
       [30, 31, 31, ..., 35, 35, 38],
       [31, 31, 31, ..., 34, 31, 36],
       ...,
       [91, 92, 87, ..., 26, 26, 27],
       [77, 83, 78, ..., 27, 26, 26],
       [68, 73, 74, ..., 26, 25, 25]], dtype=uint8)]
RGB mean: 
[48.37353515625, 39.5794677734375, 39.05059814453125]
RGB mode: 
[189, 175, 180]
RGB variance: 
[720.5682351589203, 527.0259114354

In [18]:
# Generate feature vector
print("Feature vector: ")
vector = ImageTest.generateFeatureVector()
print(vector)

print("Size of feature vector: ")
print(vector.shape)

Feature vector: 
[ 48.37353516  39.57946777  39.05059814 ...  86.         111.
  16.        ]
Size of feature vector: 
(1160,)
