
IC SHM 2021 Project 2 submission readme

This readme describes the operation and data processing of our solution to Project 2 of The 2nd International Competition for Structural Health Monitoring. The work was done by me (Mateusz Żarski, MSc) and my associate (Bartosz Wójcik, MSc) at the Institute of Theoretical and Applied Informatics of the Polish Academy of Sciences.

Table of contents

  • General Info
  • Dependencies
  • Directory structure
  • Usage
  • Use examples

General Info

For the purposes of the competition, we created a robust pipeline of deep learning models in Python 3, in which we utilized the Detectron2 framework for semantic image segmentation and a fork of our own framework, KrakN, for training multiple image recognition models. The pipeline of operations needed to perform all of the competition tasks, with the addition of Task 0 for background masking, is as follows:

[Figure: our pipeline]

In total, we use four deep learning models:

  • for masking the background - semantic segmentation,
  • for segmentation of construction elements,
  • for defect segmentation,
  • for damage state detection - image recognition.

A detailed description of the individual operations performed within the pipeline is given in the associated paper (to be uploaded after the competition ends). It also describes the tests of other machine learning methods against which our solution was compared.

Dependencies

Our solution requires the following dependencies (packages in the latest version as of January 18, 2022, unless specified otherwise):

  • TensorFlow == 1.12.0
  • Detectron2
  • Scikit-learn
  • Numpy == 1.16.2
  • OpenCV == 4.4.0
  • Matplotlib
  • H5Py
  • Progressbar
  • Imutils
  • Pillow

Python version 3.8.10 was used; other versions will probably also work fine, but we have not tested them.

Also please note that the strings containing folder paths in our Python scripts may need to be changed in order to run properly on your system (we did all the work on a Linux machine, so check your backslashes).

Directory structure

To use the solution we propose, a certain directory structure should be maintained that is also consistent with our repository structure. The structure of the project, with the names of the scripts, is presented in the figure below:

[Figure: directory structure]

In the diagram, folders are marked in blue and Python scripts are marked in yellow. How to use each of them will be described in the next section.

Usage

In order to use our solution with the dataset provided in the Project, or to reproduce our results, certain steps have to be followed in order.

  1. Split the dataset into training and testing subsets.

First, the dataset has to be split using the split_dataset.py script. It produces a 4:1 split with a known pseudo-random seed and places the images in the dataset_reworked directory. The script uses the .csv files with image names provided in the Project.
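
For illustration, a minimal sketch of such a reproducible 4:1 split follows; the file and folder names (labels.csv, dataset/, the train/test subfolders) are assumptions, not the exact ones used by split_dataset.py:

```python
import csv
import shutil
from pathlib import Path
from sklearn.model_selection import train_test_split

# Read image names from a .csv file (one name per row is assumed)
with open('labels.csv') as f:
    image_names = [row[0] for row in csv.reader(f)]

# 4:1 train/test ratio, reproducible thanks to the fixed seed
train_names, test_names = train_test_split(
    image_names, test_size=0.2, random_state=42)

# Copy each subset into its own folder under dataset_reworked
for subset, names in (('train', train_names), ('test', test_names)):
    out_dir = Path('dataset_reworked') / subset
    out_dir.mkdir(parents=True, exist_ok=True)
    for name in names:
        shutil.copy(Path('dataset') / name, out_dir / name)
```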

  2. Rework the labels.

In the second step, the labels have to be reworked with rework_labels.py so that the cv2 library no longer reads them as RGB images but as 8-bit, single-channel images instead. This step makes image management a bit easier, as the labels will be read as 2D arrays from now on.
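
A minimal sketch of such a conversion is shown below; the color-to-class mapping and paths are purely illustrative, not the exact values used by rework_labels.py:

```python
import cv2
import numpy as np
from pathlib import Path

# Hypothetical mapping from BGR label colors to class ids
COLOR_TO_CLASS = {(0, 0, 0): 0, (0, 0, 255): 1, (0, 255, 0): 2}

for label_path in Path('dataset_reworked/labels').glob('*.png'):
    bgr = cv2.imread(str(label_path), cv2.IMREAD_COLOR)
    out = np.zeros(bgr.shape[:2], dtype=np.uint8)
    for color, class_id in COLOR_TO_CLASS.items():
        out[np.all(bgr == color, axis=-1)] = class_id
    # Overwrite the label; cv2 now reads it as an 8-bit 2D array
    cv2.imwrite(str(label_path), out)
```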

  3. Make the dataset for Task 0.

Now, bcg_remover.py has to be run in order to prepare the dataset for training the background removal model. The images will be placed in the dataset_reworked_no_bcg directory. The same images will later be used for Task 2.
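
A minimal sketch of the idea, assuming the reworked labels mark the background with class 0 (the paths and the class id are assumptions, not taken from bcg_remover.py):

```python
import cv2
from pathlib import Path

src = Path('dataset_reworked')
dst = Path('dataset_reworked_no_bcg')
dst.mkdir(exist_ok=True)

for img_path in (src / 'images').glob('*.png'):
    image = cv2.imread(str(img_path))
    label = cv2.imread(str(src / 'labels' / img_path.name),
                       cv2.IMREAD_GRAYSCALE)
    image[label == 0] = 0  # zero out pixels assumed to be background
    cv2.imwrite(str(dst / img_path.name), image)
```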

  4. Make datasets for Task 1 and Task 3.

In the last step of dataset preparation, separate datasets for defect and damage state detection have to be prepared. In order to do so, run comp_split.py and ds_dataset.py. In the result, yet another set of training/testing images will be created in ./semantic_segmentation/dataset_comp and ./damage_state_detection/dataset directories. After this step, the creation of datasets is finally finished. Details on transforming images to datasets are described in detail in the article.
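
As a purely illustrative sketch of how a damage-state patch dataset could be derived from the segmentation labels (the real comp_split.py / ds_dataset.py logic may differ; patch size, paths, and the dominant-class rule are all assumptions):

```python
import cv2
import numpy as np
from pathlib import Path

PATCH = 224
out_root = Path('damage_state_detection/dataset')

for img_path in Path('dataset_reworked_no_bcg').glob('*.png'):
    image = cv2.imread(str(img_path))
    label = cv2.imread(str(Path('dataset_reworked/labels') / img_path.name),
                       cv2.IMREAD_GRAYSCALE)
    # Tile the image and sort each patch by its dominant label class
    for y in range(0, image.shape[0] - PATCH, PATCH):
        for x in range(0, image.shape[1] - PATCH, PATCH):
            patch_label = label[y:y + PATCH, x:x + PATCH]
            state = int(np.bincount(patch_label.ravel()).argmax())
            out_dir = out_root / str(state)
            out_dir.mkdir(parents=True, exist_ok=True)
            cv2.imwrite(str(out_dir / f'{img_path.stem}_{y}_{x}.png'),
                        image[y:y + PATCH, x:x + PATCH])
```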

  5. Train models for Tasks 0, 1 and 2.

In order to train models for Tasks 0, 1 and 2 with the gathered data, use the detectron_train.py script. Note that you will have to set your task and dataset paths manually. After the training is complete, the model will be saved in the ./output_{task}_seg directory. This script also provides methods for prediction visualization; if you want to perform evaluation, run detectron_eval.py and the resulting metrics will be saved in the results directory.
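
For reference, a minimal Detectron2 training setup could look like the sketch below; the dataset name, base config, and hyperparameters are illustrative assumptions, not the exact values from detectron_train.py (the dataset would have to be registered beforehand via DatasetCatalog):

```python
import os
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    'COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml'))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    'COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml')
cfg.DATASETS.TRAIN = ('shm_task2_train',)  # hypothetical registered dataset
cfg.DATASETS.TEST = ()
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 10000
cfg.OUTPUT_DIR = './output_task2_seg'

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()  # final weights land in cfg.OUTPUT_DIR
```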

Note that the scripts use #%% cell markers, so they can be used in two ways: in a single continuous run, or in Jupyter-style cell-by-cell runs.
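
For illustration, a script laid out with #%% markers looks like this (the cell contents are placeholders, not code from our scripts):

```python
#%% configuration cell -- run on its own in an IDE, or as part of a full run
TASK = 0
DATASET_PATH = './dataset_reworked'

#%% training cell -- executes independently when run cell by cell
print('training task', TASK)
```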

  6. Train the model for damage state detection.

In order to train the model for the last task, features first have to be extracted from the dataset and saved to an .hdf5 file with the extract_features.py script. By default, ResNet50 is used as the feature extractor, but it can be changed to any pretrained CNN from tensorflow.keras.applications (we simply found that ResNet works best).
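
A minimal sketch of ResNet50 feature extraction into an .hdf5 file follows; the paths, the class-per-folder layout, and the one-image-at-a-time batching are simplifying assumptions, not the exact behavior of extract_features.py:

```python
import h5py
import numpy as np
from pathlib import Path
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow.keras.preprocessing.image import img_to_array, load_img

# Headless ResNet50 used purely as a feature extractor
model = ResNet50(weights='imagenet', include_top=False)

paths = sorted(Path('damage_state_detection/dataset').rglob('*.png'))
with h5py.File('features.hdf5', 'w') as db:
    feats = db.create_dataset('features', (len(paths), 7 * 7 * 2048),
                              dtype='float32')
    labels = db.create_dataset('labels', (len(paths),), dtype='int64')
    for i, path in enumerate(paths):
        image = img_to_array(load_img(str(path), target_size=(224, 224)))
        batch = preprocess_input(np.expand_dims(image, axis=0))
        feats[i] = model.predict(batch).flatten()
        labels[i] = int(path.parent.name)  # class id assumed to be the folder name
```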

After the features are extracted, train a new classifier with train_model.py. The process can take up to 4 hours, as it uses only the CPU and processes ~60k images. After training, the new classifier will be saved as ResNet_clf.cpickle. It can later be used for inference as the classifier on top of the ResNet CNN and properly reworked images.
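
A minimal sketch of this step, assuming features and labels were stored as above; the choice of LogisticRegression is an assumption, and train_model.py may use a different model and settings:

```python
import pickle
import h5py
from sklearn.linear_model import LogisticRegression

# Load the previously extracted features and labels
with h5py.File('features.hdf5', 'r') as db:
    features, labels = db['features'][:], db['labels'][:]

# Fit a simple classifier on top of the CNN features
clf = LogisticRegression(max_iter=1000)
clf.fit(features, labels)

# Serialize the trained classifier for later inference
with open('ResNet_clf.cpickle', 'wb') as f:
    pickle.dump(clf, f)
```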

And that's it.

Use examples

Below is a video showing our solution's pipeline in motion (it will redirect to an external page).

Click me ;-)

Also, here are some images showing various tasks performed by our solution:

[Figure] Task 0: background masking

[Figure] Task 1: defect detection

[Figure] Task 2: element segmentation

[Figure] Task 3: damage state detection
