# Data Normalization and Featurization using ManufacturingNet
###### To learn more about ManufacturingNet, please visit: http://manufacturingnet.io/

## Let us import the package and other required libraries

In [2]:
import ManufacturingNet
import numpy as np

##### Let us download the dataset for the task

In [3]:
from ManufacturingNet import datasets

In [4]:
datasets.CWRUBearingData()

Downloading...
From: https://drive.google.com/uc?id=1nUjYdpJkEmjJTzG0j8EBZ9vQul0sedqk
To: /home/cmu/ManufacturingNet/tutorials/CWRUBearingData.zip
100%|██████████| 13.4M/13.4M [00:01<00:00, 11.9MB/s]


###### The data is downloaded to the root directory. Ready to use!

The data consists of raw signals: 2800 signals in total, each with 1600 values. We will first normalize the data and then use it for feature extraction.

##### Let us load the data

In [5]:
data = np.load('CWRU/signal_data.npy', allow_pickle = True)

In [6]:
print(data.shape)

(2800, 1600)


##### Let us first normalize the data. There are three types of normalization. We will go through them one by one

Mean Normalization

In [6]:
from ManufacturingNet.preprocessing import MeanNormalizer

In [7]:
normalizer1 = MeanNormalizer(data, axis = 0)

In [8]:
normalized_data = normalizer1.get_normalized_data()

In [9]:
normalized_data

array([[ 0.13153407,  0.01315068, -0.0448077 , ..., -0.01209627,
         0.05209748,  0.04970108],
       [-0.08246213,  0.02606459,  0.13565075, ..., -0.24215809,
        -0.24862696, -0.1915978 ],
       [ 0.1459167 ,  0.15386776,  0.15871687, ..., -0.01165214,
         0.06925722,  0.04547518],
       ...,
       [-0.10122114, -0.05053481, -0.00952744, ...,  0.21535663,
         0.12591937,  0.03746052],
       [ 0.06095025,  0.06883685,  0.0719024 , ...,  0.25267312,
         0.14823252, -0.0234074 ],
       [ 0.11675116,  0.21315185,  0.25466715, ...,  0.041213  ,
         0.16024729,  0.28262297]])

Min Max Normalization

In [10]:
from ManufacturingNet.preprocessing import MinMaxNormalizer

In [11]:
normalizer2 = MinMaxNormalizer(data, axis = 0)

In [12]:
normalized_data = normalizer2.get_normalized_data()

In [13]:
normalized_data

array([[0.37149085, 0.40715883, 0.4523135 , ..., 0.48822004, 0.47005208,
        0.45995385],
       [0.36248818, 0.40767697, 0.46065664, ..., 0.47422802, 0.45361693,
        0.44652249],
       [0.37209592, 0.41280474, 0.46172305, ..., 0.48824705, 0.47098989,
        0.45971862],
       ...,
       [0.36169901, 0.40460362, 0.45394462, ..., 0.50205339, 0.47408658,
        0.4592725 ],
       [0.36852144, 0.40939309, 0.45770936, ..., 0.50432292, 0.47530604,
        0.45588443],
       [0.37086894, 0.41518336, 0.46615912, ..., 0.49146223, 0.47596267,
        0.47291892]])
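
To make the output above easier to interpret, here is a minimal NumPy sketch of the textbook min-max formula, `(x - min) / (max - min)`, applied along a chosen axis. The function name `min_max_normalize` and the toy array are hypothetical, and this is an assumption about what min-max scaling usually means, not a copy of how `MinMaxNormalizer` is implemented.

```python
import numpy as np

def min_max_normalize(x, axis=0):
    """Textbook min-max scaling along `axis` (hypothetical helper, not the library code)."""
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=axis, keepdims=True)
    hi = x.max(axis=axis, keepdims=True)
    # Avoid division by zero for slices where every value is identical.
    span = np.where(hi - lo == 0, 1.0, hi - lo)
    return (x - lo) / span

rng = np.random.default_rng(0)
toy = rng.normal(size=(5, 8))            # small stand-in for the signal array
toy_scaled = min_max_normalize(toy, axis=0)
print(toy_scaled.min(axis=0))            # smallest value in each column
print(toy_scaled.max(axis=0))            # largest value in each column
```

With this definition every value lands in the range 0 to 1, which matches the range of the values printed above.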

Quantile Normalization

In [14]:
from ManufacturingNet.preprocessing import QuantileNormalizer

In [15]:
normalizer3 = QuantileNormalizer(data, axis = 0)

In [16]:
normalized_data = normalizer3.get_normalized_data()

In [17]:
normalized_data

array([[ 0.1108245 ,  0.00599514, -0.02958248, ..., -0.01660993,
         0.02888283,  0.02311589],
       [-0.07300938,  0.01887359,  0.06638019, ..., -0.14313853,
        -0.19000219, -0.10759061],
       [ 0.12317993,  0.14632583,  0.0786461 , ..., -0.01636567,
         0.0413727 ,  0.02082681],
       ...,
       [-0.08912435, -0.05751549, -0.01082144, ...,  0.10848383,
         0.08261477,  0.01648543],
       [ 0.05018933,  0.0615284 ,  0.03248063, ...,  0.12900702,
         0.09885559, -0.01648543],
       [ 0.09812521,  0.20544714,  0.12966973, ...,  0.01270892,
         0.10760065,  0.14928474]])

This is how we normalize the data in a couple of lines, and it's done! You can pass a different axis value to normalize the data along a different axis.
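
The effect of the axis argument can be illustrated with plain NumPy. The sketch below only centers the data (subtracts the mean) and assumes the normalizers follow NumPy's axis convention. The toy array `signals` is hypothetical.

```python
import numpy as np

# Two toy "signals" (rows) with three samples each (columns).
signals = np.array([[1.0, 2.0, 3.0],
                    [10.0, 20.0, 30.0]])

# axis=0: statistics are computed down each column, i.e. across signals
# at the same sample position.
col_centered = signals - signals.mean(axis=0, keepdims=True)

# axis=1: statistics are computed along each row, i.e. within each signal.
row_centered = signals - signals.mean(axis=1, keepdims=True)

print(col_centered)
print(row_centered)
```

For a `(2800, 1600)` array, `axis = 0` under this convention treats each of the 1600 sample positions as a separate variable, while `axis = 1` would normalize each signal on its own.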

## Let us now perform Feature Extraction

##### Let's import the featurizer first

In [18]:
from ManufacturingNet.featurization import Featurizer

In [19]:
f = Featurizer()

There are currently 20 features. Let us see what they are!

In [None]:
help(f)

##### Let us extract the absolute mean feature from our data. Before that, let's see what it is

In [21]:
help(f.abs_mean)

Help on method abs_mean in module ManufacturingNet.featurization.featurization:

abs_mean(a, axis=0) method of ManufacturingNet.featurization.featurization.Featurizer instance
    The absolute mean value of a set of values is the arithmetic
    mean of all the absolute values in a given set of numbers.



In [22]:
feature_1 = f.abs_mean(data, axis = 1)
feature_1

array([0.04924405, 0.05428433, 0.05224068, ..., 0.29690971, 0.4750641 ,
       0.21335175])
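
Going by the docstring above, the absolute mean can be reproduced with two NumPy calls. This is a minimal sketch of that definition on a hypothetical toy array, not the library's source code.

```python
import numpy as np

def abs_mean(a, axis=0):
    """Arithmetic mean of the absolute values along `axis`."""
    return np.mean(np.abs(a), axis=axis)

toy = np.array([[1.0, -1.0, 2.0, -2.0],
                [0.5, 0.5, -0.5, -0.5]])

# axis=1 gives one value per row, i.e. one feature per signal.
print(abs_mean(toy, axis=1))  # [1.5 0.5]
```

Passing `axis = 1` on the `(2800, 1600)` array therefore yields one feature value per signal.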

##### Similarly, we can extract other features

In [23]:
feature_2 = f.mean(data, axis = 1)
feature_2

array([0.01370173, 0.01369508, 0.01348829, ..., 0.01581798, 0.01657961,
       0.01542829])

In [24]:
feature_3 = f.skew(data, axis = 1)
feature_3

array([-0.22483287, -0.24035794, -0.32800499, ..., -0.57577952,
        0.09202673, -0.46186613])

In [25]:
feature_4 = f.rms(data, axis = 1)
feature_4

array([0.06112483, 0.06719753, 0.06469002, ..., 0.61073543, 0.89905632,
       0.36928773])

In [26]:
feature_5 = f.peak_to_peak(data, axis = 1)
feature_5

array([ 0.35548062,  0.41431015,  0.40283631, ...,  8.37504533,
       12.001672  ,  4.10739467])

In [27]:
feature_6 = f.crestfactor(data, axis = 1)
feature_6

array([2.9419542 , 3.20074969, 3.32481677, ..., 7.11755664, 6.97214016,
       5.57819061])
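
For reference, the last three features have common signal-processing definitions that are easy to write in NumPy. The sketch below assumes RMS is the square root of the mean square, peak-to-peak is the maximum minus the minimum, and crest factor is the peak absolute value divided by the RMS. The library may define them slightly differently, and the helper names here are hypothetical.

```python
import numpy as np

def rms(a, axis=0):
    """Root mean square along `axis`."""
    return np.sqrt(np.mean(np.square(a), axis=axis))

def peak_to_peak(a, axis=0):
    """Difference between the largest and smallest value along `axis`."""
    return np.max(a, axis=axis) - np.min(a, axis=axis)

def crest_factor(a, axis=0):
    """Peak absolute value divided by RMS (assumed definition)."""
    return np.max(np.abs(a), axis=axis) / rms(a, axis=axis)

# One synthetic signal: 10 full cycles of a unit-amplitude sine, 1600 samples.
t = np.linspace(0.0, 1.0, 1600, endpoint=False)
sine = np.sin(2 * np.pi * 10 * t)[np.newaxis, :]   # shape (1, 1600)

print(rms(sine, axis=1))           # about 0.707 for a unit sine
print(peak_to_peak(sine, axis=1))  # about 2.0
print(crest_factor(sine, axis=1))  # about 1.414
```

A pure sine has a crest factor of about 1.414, so the much larger values in the output above point to impulsive peaks in those signals.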

This is how we can normalize and extract features from the given data. 
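
A typical next step is to stack the per-signal features into a single matrix with one row per signal and one column per feature. The sketch below uses synthetic data and plain NumPy in place of the `Featurizer` calls, so the array `signals` and the choice of columns are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
signals = rng.normal(size=(6, 100))   # stand-in for the (2800, 1600) array

# Each entry is a 1-D array with one value per signal (computed along axis=1).
features = np.column_stack([
    np.mean(np.abs(signals), axis=1),         # absolute mean
    np.mean(signals, axis=1),                 # mean
    np.sqrt(np.mean(signals ** 2, axis=1)),   # RMS
    np.ptp(signals, axis=1),                  # peak-to-peak
])

print(features.shape)  # (6, 4): 6 signals, 4 features each
```

With the arrays from this tutorial, the same call would be `np.column_stack([feature_1, feature_2, feature_3, feature_4, feature_5, feature_6])`, which should give a `(2800, 6)` matrix.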