raj-1411/Deep-Convolutional-Neural-Networks-improvisation-with-applied-Genetic-Algorithm

Deep-Convolutional-Neural-Networks-improvisation-with-applied-Genetic-Algorithm

Project Summary

Feature extraction with pre-trained, fine-tuned Convolutional Neural Networks on the BreakHis dataset, followed by a novel Genetic Algorithm that filters the extracted features to improve classification accuracy.

Project Description

This Python project aims to improve the accuracy of predictions made by complex, state-of-the-art CNNs by applying a Genetic Algorithm (GA). The popular BreakHis dataset is used to verify the feature extraction, classifying tumors in histopathological images of affected tissue as benign or malignant. Three Convolutional Neural Networks serve as feature extractors, each with its pre-final layer modified to suit this objective. As each model is trained and evaluated on its data splits, the features that give the best accuracy on the validation set are retained for the later stages. These final features then pass through the stages of the Genetic Algorithm, which eliminates redundant and uninformative features from the feature space. The filtered features yield a notable increase in accuracy on the validation dataset.

Dataset description

The Breast Cancer Histopathological Image Classification (BreakHis) dataset is composed of 7,909 microscopic images of breast tumor tissue collected from 82 patients at four magnification factors (40X, 100X, 200X, and 400X). To date, it contains 2,480 benign and 5,429 malignant samples (700×460 pixels, 3-channel RGB, 8-bit depth in each channel, PNG format).
The dataset is available at:
https://web.inf.ufpr.br/vri/databases/breast-cancer-histopathological-database-breakhis/

Classes of Division

In this project, the histopathological image samples of human breast tissue are classified into two categories:

  • Benign
  • Malignant

Convolutional Neural Network models used

Any one of three CNN models can be applied to the dataset for feature extraction, one at a time:

  • Visual Geometry Group (VGG-19)
  • ResNet-18
  • GoogLeNet

Classifiers used

Three types of classifiers are employed for fitness evaluation:

  • Support Vector Machine (RBF kernel)
  • K Nearest Neighbours (2 neighbours)
  • Multi-layer Perceptron
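One plausible way to use these classifiers for fitness evaluation is to train on the feature columns selected by a binary mask and score on the validation set. The sketch below assumes that fitness equals validation accuracy; the function names, the MLP hyperparameters and the synthetic data are illustrative assumptions, not the repository's code.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC


def make_classifier(name: str):
    """Map the tokens 'SVM', 'KNN', 'MLP' to scikit-learn estimators."""
    if name == "SVM":
        return SVC(kernel="rbf")
    if name == "KNN":
        return KNeighborsClassifier(n_neighbors=2)
    if name == "MLP":
        # max_iter and random_state are assumed values for this sketch.
        return MLPClassifier(max_iter=500, random_state=0)
    raise ValueError(f"unknown classifier token: {name!r}")


def fitness(mask, X_train, y_train, X_val, y_val, classifier="KNN") -> float:
    """Validation accuracy using only the feature columns where mask is True."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 0.0  # an empty feature subset cannot be scored
    clf = make_classifier(classifier)
    clf.fit(X_train[:, mask], y_train)
    return float(clf.score(X_val[:, mask], y_val))


# Synthetic example: column 0 separates the classes, the rest is noise.
rng = np.random.default_rng(0)
y = np.array([0] * 20 + [1] * 20)
X = rng.normal(scale=0.1, size=(40, 5))
X[:, 0] += 10.0 * y
train_idx, val_idx = np.arange(0, 40, 2), np.arange(1, 40, 2)
score = fitness([True, False, False, False, False],
                X[train_idx], y[train_idx], X[val_idx], y[val_idx])
print(score)
```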

Genetic Algorithm Visualization:

  • GA roadmap (figure)
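Since the roadmap figure is not reproduced here, the following is a generic, textbook-style sketch of GA-based feature selection over binary masks. The operators (tournament selection, one-point crossover, bit-flip mutation, elitism) and all parameter values are assumptions chosen for illustration; the stages used in this project may differ.

```python
import numpy as np


def run_ga(fitness_fn, n_features, pop_size=20, generations=10,
           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Search for a boolean feature mask that maximizes fitness_fn(mask).

    Requires n_features >= 2. Returns (best_mask, best_score, history),
    where history holds the best score seen so far after each generation.
    """
    rng = np.random.default_rng(seed)
    population = rng.integers(0, 2, size=(pop_size, n_features)).astype(bool)
    best_mask, best_score, history = None, -np.inf, []

    for _ in range(generations):
        scores = np.array([fitness_fn(ind) for ind in population])
        gen_best = int(np.argmax(scores))
        if scores[gen_best] > best_score:
            best_score = float(scores[gen_best])
            best_mask = population[gen_best].copy()
        history.append(best_score)

        # Tournament selection: each parent is the fitter of two random picks.
        a = rng.integers(0, pop_size, size=pop_size)
        b = rng.integers(0, pop_size, size=pop_size)
        parents = population[np.where(scores[a] >= scores[b], a, b)]

        # One-point crossover on consecutive pairs of parents.
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            if rng.random() < crossover_rate:
                point = int(rng.integers(1, n_features))
                children[i, point:] = parents[i + 1, point:]
                children[i + 1, point:] = parents[i, point:]

        # Bit-flip mutation, then elitism: keep the best mask found so far.
        children ^= rng.random(children.shape) < mutation_rate
        children[0] = best_mask
        population = children

    return best_mask, best_score, history


# Toy fitness: fraction of positions that agree with a hidden target mask.
target = np.array([True, False] * 8)
toy_fitness = lambda mask: float(np.mean(mask == target))
mask, score, history = run_ga(toy_fitness, n_features=target.size)
print(score, int(mask.sum()))
```

In the real pipeline, `fitness_fn` would be the validation accuracy of the chosen classifier on the masked CNN features rather than this toy function.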

Accuracy Plots Over Generations

Pairing each extractor with the MLP classifier for GA fitness evaluation gives three plots of accuracy versus generation (10 epochs, 10 generations):

  • GoogLeNet with MLP (figure)
  • VGG-19 with MLP (figure)
  • ResNet-18 with MLP (figure)

Dependencies

The project is written entirely in Python, so Python must be installed on the system; version 3.9 or later is recommended. The project uses the matplotlib, numpy, pandas, scikit-learn, torch and torchvision packages. All of these dependencies can be installed with the following command:

  • pip install -r requirements.txt

Code implementation

Data paths :

  Current directory ---->   data
                              |
                              |
                              |               
                              ------------------>  train
                              |                      |
                              |             -------------------------
                              |             |        |              |
                              |             V        V              V
                              |          class 1   class 2 ..... class n
                              |
                              |
                              |              
                              ------------------>   val
                                                     |
                                            -------------------------
                                            |        |              |
                                            V        V              V
                                         class 1   class 2 ..... class n
  • Here, the train and val folders each contain one sub-folder per class, holding the histopathological images of the corresponding breast tumor type in .jpg/.png format.
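Before training, it can help to confirm that a data directory follows the layout above. The sketch below is a hypothetical helper (not part of the repository) that counts the .jpg/.png files per class folder in train and val and checks that both splits contain the same classes.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".png"}  # formats named in the layout description


def summarize_split(root, split):
    """Return {class_name: image_count} for one split folder under root."""
    split_dir = Path(root) / split
    if not split_dir.is_dir():
        raise FileNotFoundError(f"missing folder: {split_dir}")
    counts = {}
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1 for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def check_layout(root="data"):
    """Summarize train and val, requiring identical class folders in both."""
    summary = {split: summarize_split(root, split) for split in ("train", "val")}
    if set(summary["train"]) != set(summary["val"]):
        raise ValueError("train and val must contain the same class folders")
    return summary
```

Calling `check_layout("data")` from the project directory returns a nested dictionary of image counts, or raises if a split folder is missing or the class folders differ.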

Training and Evaluation :

  usage: main.py [-h] [-data DATA_PATH] [-classes NUM_CLASSES] [-ext EXT_TYPE] [-classif CLASSIF_TYPE]

  Application of Genetic Algorithm

  optional arguments:
    -h, --help            show this help message and exit
    -data DATA_PATH, --data_path DATA_PATH
                          Path to data
    -classes NUM_CLASSES, --num_classes NUM_CLASSES
                          Number of data classes
    -ext EXT_TYPE, --ext_type EXT_TYPE
                          Choice of extractor
    -classif CLASSIF_TYPE, --classif_type CLASSIF_TYPE
                          Choice of classifier for GA

Run the following for training and validation :

  python main.py -data data -classes n -ext resnet -classif MLP

Specific tokens :

  GoogLeNet: 'googlenet'
  VGG-19: 'vgg'
  ResNet-18: 'resnet'
  SVM: 'SVM'
  MLP: 'MLP'
  KNN: 'KNN'          
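For readers who want to see how such a command line could be wired up, here is a minimal argparse sketch that mirrors the usage text and tokens above. It is a reconstruction from the help output, not the actual main.py: the `int` type for the class count and the `choices` validation are assumptions, and defaults are left unset because the help text does not document them.

```python
import argparse

EXTRACTORS = ("googlenet", "vgg", "resnet")  # tokens listed above
CLASSIFIERS = ("SVM", "MLP", "KNN")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the four options shown in the usage text."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="Application of Genetic Algorithm")
    parser.add_argument("-data", "--data_path", type=str,
                        help="Path to data")
    parser.add_argument("-classes", "--num_classes", type=int,
                        help="Number of data classes")
    # Restricting values with `choices` is an assumption of this sketch.
    parser.add_argument("-ext", "--ext_type", choices=EXTRACTORS,
                        help="Choice of extractor")
    parser.add_argument("-classif", "--classif_type", choices=CLASSIFIERS,
                        help="Choice of classifier for GA")
    return parser


# Parse the example command shown above, with n = 2 for benign vs malignant.
args = build_parser().parse_args(
    ["-data", "data", "-classes", "2", "-ext", "resnet", "-classif", "MLP"])
print(args.data_path, args.num_classes, args.ext_type, args.classif_type)
```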
