This pipeline classifies images into ‘N’ classes. It accepts input as ‘.csv’ or ‘.json’ files, or as an unsplit dataset. It has training, testing, prediction and validation classes for their respective purposes, along with a preprocessing template. It has functionality to choose model, to store partially or fully t…

alexHxun/image-classification-pipeline

 
 

Image Classification Pipeline

Introduction to Modules :

  • projects : contains project folders
  • main.py : triggers training, testing and prediction
  • prepare_dataset : contains code to read .csv, .json and .txt files and convert them into train & test datasets
  • preprocessing.py : contains code to preprocess the dataset
  • config.py : contains all the project configurations and stays inside every independent project directory
  • model.py : contains various models for training; saves the compiled model and returns it to train.py
  • train.py : fits the selected model
  • test.py : tests the trained model
  • predict.py : makes predictions on explicitly provided inputs
  • utils.py : contains common utilities & functions shared across modules
  • validation.py : validates requirements before training the model

Projects :

  • This is the home directory for all the image classification projects.

projects/{project_name}/

  • This is the dedicated directory for a project
  • The project name may contain only alphanumeric characters and the symbols '_' and '-'.
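
The naming rule above can be checked with a short regular expression. This is only a sketch: the function name `is_valid_project_name` is hypothetical and not part of the repository, and it assumes "alpha-numeric" means ASCII letters and digits.

```python
import re

# Assumption: allowed characters are ASCII letters, digits, '_' and '-'.
PROJECT_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_project_name(name: str) -> bool:
    """Return True if the whole name consists only of allowed characters."""
    return PROJECT_NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_project_name("cats-vs_dogs1")` is accepted, while a name containing a space is rejected.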

Sub-directories & files needed for a project :

config.py : projects/{project_name}/config.py

  • stores the model configurations of the particular project

dataset directory : projects/{project_name}/dataset

  • contains the dataset in the following sub-directories :

    • train : contains training dataset ----> projects/{project_name}/dataset/train

    • test : contains testing dataset ----> projects/{project_name}/dataset/test

  • there must be a sub-directory for each class, containing that class's images; each sub-directory must be named after its class

  • we use ImageDataGenerator, available in keras, to train the model on the available data, which keeps the code much simpler
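
Keras's `flow_from_directory` infers class labels from sub-directory names, so it can help to inspect the layout before training. The sketch below uses only the standard library. The helper name `count_images_per_class` and the extension list are assumptions for illustration, not part of the repository.

```python
from pathlib import Path

# Assumption: these extensions count as images; adjust to match your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(split_dir):
    """Map each class sub-directory name to the number of image files in it."""
    split_dir = Path(split_dir)
    counts = {}
    # Each immediate sub-directory is treated as one class.
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling it on the train directory of a project returns a dictionary such as class name to image count, which makes empty or misnamed class folders easy to spot.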

model directory : projects/{project_name}/model

  • contains saved models, weights and tensorboard logs in the following sub-directories :

    • saved models : contains saved models ----> projects/{project_name}/model/saved models

    • weights : contains weight files ----> projects/{project_name}/model/weights

    • tblogs : contains tensorboard logs ----> projects/{project_name}/model/tblogs

  • these directories must be created before starting the training process

  • all the data, config, logs, etc. for this particular project should be inside this folder only

  • this whole directory should be handled with care; all learning is stored inside this folder

predict directory : projects/{project_name}/predict

  • contains images for prediction in the following sub-directories :

    • input : contains input images ----> projects/{project_name}/predict/input

    • output : contains output images generated by the prediction phase code ----> projects/{project_name}/predict/output
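
Since the directories must exist before training starts, a small script can create the layout described above. This is a sketch, not the repository's own tooling: `scaffold_project` is a hypothetical helper, and the sub-directory names (including the space in "saved models") are copied as written in this README.

```python
from pathlib import Path

# Sub-directories as listed in this README; names are copied verbatim.
PROJECT_SUBDIRS = [
    "dataset/train",
    "dataset/test",
    "model/saved models",
    "model/weights",
    "model/tblogs",
    "predict/input",
    "predict/output",
]


def scaffold_project(project_name, root="projects"):
    """Create the directory tree for one project and return its path."""
    project_dir = Path(root) / project_name
    for sub in PROJECT_SUBDIRS:
        # exist_ok=True makes the call safe to repeat on an existing project.
        (project_dir / sub).mkdir(parents=True, exist_ok=True)
    return project_dir
```

Running `scaffold_project("my_project")` from the repository root would create `projects/my_project/` with all seven sub-directories. The per-project config.py still has to be added by hand.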

Project Execution Command :

For Training :

  • python main.py train {project_name} (Initial model training)
  • python main.py train {project_name} resume (Resume training of the last trained model)
  • python main.py train {project_name} example_model.h5 (Resume training of explicitly called model)

~ All these commands perform model training
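
The three training commands differ only in the optional third argument. The sketch below shows one way that argument could be interpreted. The function `resolve_training_mode` and its return values are hypothetical and do not describe how main.py is actually implemented.

```python
def resolve_training_mode(extra_arg=None):
    """Classify the optional argument that follows the project name.

    Returns a (mode, model_file) tuple. The mode labels are illustrative only.
    """
    if extra_arg is None:
        # python main.py train {project_name}
        return ("initial", None)
    if extra_arg == "resume":
        # python main.py train {project_name} resume
        return ("resume_last", None)
    # python main.py train {project_name} example_model.h5
    return ("resume_named", extra_arg)
```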

For Testing :

  • python main.py test {project_name} ~ This command performs testing on the provided testing dataset

For Prediction :

  • python main.py predict {project_name} ~ This command makes predictions on the inputs provided in the predict/input directory

To View Tensorboard :

  • tensorboard --logdir projects/{project_name}/model/tblogs

Requirements :

  • python3
  • keras
  • fire
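  • cv2 (provided by the opencv-python package)
  • PIL (provided by the Pillow package)
  • numpy
  • pandas
  • matplotlib
  • importlib (part of the Python standard library)
  • tqdm
  • uuid (part of the Python standard library)
  • shutil (part of the Python standard library)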
  • split_folders
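
A quick way to see which of these modules are missing from the current environment is to look each one up without importing it. This sketch uses the import names exactly as listed above and leaves out importlib, uuid and shutil, which ship with Python. The helper `missing_modules` is hypothetical, and the import name for split_folders may differ between versions of that package.

```python
from importlib.util import find_spec

# Third-party import names as listed in this README.
REQUIRED_MODULES = [
    "keras", "fire", "cv2", "PIL", "numpy",
    "pandas", "matplotlib", "tqdm", "split_folders",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_modules()` returns an empty list when everything is installed, otherwise the names still to be installed.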
