<a href="https://colab.research.google.com/github/MajPaji/AI-corr-eye/blob/main/corr_eye_pro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project G19-0052: Computer vision for recognizing different corrosion modes


Even trained engineers find it difficult to distinguish between different types of corrosion failure by eye, yet identifying the failure mode is crucial for developing new products or modifying production steps so that current products avoid these types of corrosion. At present, telling the failure modes apart requires many additional laboratory experiments, such as metallography, chemical analysis, and scanning electron microscopy (SEM).

The R&D department has a long history of these measurements and failure analyses. This project aims to use deep learning and computer vision with TensorFlow to distinguish between these failures. The result could serve as a reference for classifying corrosion modes and would save much of the time and resources currently spent on recognizing and understanding failures. The data were collected from 2016 to 2020 from case studies in which these failure modes had been identified.

## Project objectives

Two failure modes are common in aluminum alloys: pitting and intergranular corrosion (IGC). The first aim is to recognize these two modes; more complicated types, such as Ti-effect, crevice corrosion, and de-alloying, will be added later.

In [None]:
# mount Google Drive to import the IGC and pitting data set (igc-pit.zip)
# the data set file is expected at content > gdrive > MyDrive > igc-pit.zip
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


In [14]:
# unzip the file to the Colab /tmp folder

import os, shutil
import zipfile

zip_file_dir = '/content/gdrive/MyDrive/igc-pit.zip'
# the context manager closes the archive even if extraction fails
with zipfile.ZipFile(zip_file_dir, 'r') as zip_file:
  zip_file.extractall('/tmp/igc-pit')

## Data set for IGC and Pit failure cases

The data set contains 368 images of IGC failures and 737 images of pitting failures, 1105 images in total.

In [5]:
igc_dir = os.path.join('/tmp/igc-pit/igc-pit/igc')
pit_dir = os.path.join('/tmp/igc-pit/igc-pit/pit')

print(f'number of IGC failure images: {len(os.listdir(igc_dir))}')
print(f'number of Pit failure images: {len(os.listdir(pit_dir))}')

number of IGC failure images: 368
number of Pit failure images: 737
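
The counts above come from `os.listdir`, which counts every entry in a folder, including any non-image files. As a minimal sketch, the per-class counts could also be taken over image files only. The helper name `count_images` and the extension set are assumptions for illustration and are not part of the original notebook.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to the formats actually in the archive.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def count_images(root):
    """Return {class_name: image_count} for each sub-folder of root."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            # count only regular files whose extension looks like an image
            counts[class_dir.name] = sum(
                1
                for p in class_dir.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts
```

Calling `count_images('/tmp/igc-pit/igc-pit')` should reproduce the two numbers printed above, provided the class folders contain only images with the assumed extensions.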


In [13]:
# split the data set into train, validation, and test folders
# ratio=(.8, .1, .1) gives 80% train, 10% validation, and 10% test per class

import splitfolders

splitfolders.ratio("/tmp/igc-pit/igc-pit", output="/tmp/igc-pit/output", seed=1337, ratio=(.8, .1, .1))

Copying files: 1105 files [00:00, 2032.29 files/s]


In [17]:
# move the test files into the training folders,
# so that about 90% of the images are used for training and 10% for validation

igc_source_dir = os.path.join('/tmp/igc-pit/output/test/igc/')
igc_target_dir = os.path.join('/tmp/igc-pit/output/train/igc/')

file_names = os.listdir(igc_source_dir)

for f in file_names:
  shutil.move(os.path.join(igc_source_dir, f), igc_target_dir)


pit_source_dir = os.path.join('/tmp/igc-pit/output/test/pit/')
pit_target_dir = os.path.join('/tmp/igc-pit/output/train/pit/')

file_names = os.listdir(pit_source_dir)

for f in file_names:
  shutil.move(os.path.join(pit_source_dir, f), pit_target_dir)

## Training and validation data sets

The data set is organized into one set for training and one for validation: about 90% of the images are used for training and about 10% for validation.

The resulting counts are:

* IGC: 332 training images and 36 validation images
* Pitting: 664 training images and 73 validation images
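
The counts above can be checked by hand from the class totals (368 and 737). The sketch below assumes that the 80/10/10 split rounds the train and validation sizes down and puts the remainder in the test part, which is then merged back into train. This rounding rule is an assumption that reproduces the reported numbers, and `expected_counts` is a hypothetical helper, not a function of the `splitfolders` package.

```python
def expected_counts(n_images):
    """Expected (train, val) sizes for one class after an 80/10/10 split
    whose test part is merged back into train.

    Assumption: train and val sizes are rounded down and the remainder
    goes to test.
    """
    train = n_images * 8 // 10         # 80% of the class, rounded down
    val = n_images // 10               # 10% of the class, rounded down
    test = n_images - train - val      # whatever is left over
    return train + test, val           # test images are moved into train


for name, n in {"igc": 368, "pit": 737}.items():
    train, val = expected_counts(n)
    print(f"{name}: {train} train, {val} val")
# igc: 332 train, 36 val
# pit: 664 train, 73 val
```

For IGC, for example, the split gives 294 train, 36 validation, and 38 test images, and 294 + 38 = 332 training images after the merge.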

In [23]:
igc_train_dir = os.path.join('/tmp/igc-pit/output/train/igc')
pit_train_dir = os.path.join('/tmp/igc-pit/output/train/pit')

igc_val_dir = os.path.join('/tmp/igc-pit/output/val/igc')
pit_val_dir = os.path.join('/tmp/igc-pit/output/val/pit')

print(f'number of IGC cases for train is {len(os.listdir(igc_train_dir))} and validation set number is {len(os.listdir(igc_val_dir))}')
print(f'number of Pit cases for train is {len(os.listdir(pit_train_dir))} and validation set number is {len(os.listdir(pit_val_dir))}')

number of IGC cases for train is 332 and validation set number is 36
number of Pit cases for train is 664 and validation set number is 73
