# Audio playback of data
This notebook attempts to play back recorded vibration data as audio and record it again on a CPU target, in order to compensate for sensor non-linearity.

The experiment is structured as follows:
1. Collect data, arranged based on condition
    * OK
    * Inner ring damage
    * Outer ring damage
2. Execute a script for
    * playing vibration data as audio
    * collecting data from the CPU via the serial port
    * storing data as WAV files
    * uploading batch data to Edge Impulse

## Data Sets and Download

The main characteristics of the data available from Paderborn University are:

Synchronously measured motor currents and vibration signals with high resolution and sampling rate of __26 damaged bearing states__ and __6 undamaged (healthy) states__ for reference.

Supportive measurement of speed, torque, radial load, and temperature.

Four different operating conditions (see operating conditions).

20 measurements of 4 seconds each for each setting, saved as a MATLAB file with a name consisting of the code of the operating condition and the four-digit bearing code (e.g. N15_M07_F10_KA01_1.mat).
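
The naming scheme can be decoded programmatically. The sketch below is an illustration only: the helper name `parse_measurement_name` is hypothetical, and the decoding of the `N`, `M` and `F` codes (each multiplied by 100 to give rpm, mNm and N) is inferred from the lookup dictionaries used in the `PDB` class further down, not from the official dataset documentation.

```python
import re

# Pattern for names such as "N15_M07_F10_KA01_1.mat":
# speed code, torque code, load code, bearing code, measurement index.
_NAME_RE = re.compile(
    r"^N(?P<speed>\d{2})_M(?P<torque>\d{2})_F(?P<load>\d{2})_"
    r"(?P<bearing>K[0AIB]\d{2})_(?P<run>\d+)\.mat$"
)


def parse_measurement_name(filename):
    """Split a measurement file name into its components (hypothetical helper)."""
    m = _NAME_RE.match(filename)
    if m is None:
        raise ValueError("unexpected file name: {}".format(filename))
    return {
        # Assumption: each two-digit code is the physical value divided by 100.
        "speed_rpm": int(m.group("speed")) * 100,
        "torque_mNm": int(m.group("torque")) * 100,
        "radial_force_N": int(m.group("load")) * 100,
        "bearing_code": m.group("bearing"),
        "measurement": int(m.group("run")),
    }


info = parse_measurement_name("N15_M07_F10_KA01_1.mat")
print(info)
```

Raising on an unexpected name, rather than returning partial data, makes stray files in the extracted directory easy to spot.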

Systematic description of the bearing damage by uniform fact sheets and a measuring log, which can be downloaded with the data.

In total, experiments with 32 different bearing damages in ball bearings of type 6203 were performed:
* Undamaged (healthy) bearings (6x), see Table 6 in (pdf).
* Artificially damaged bearings (12x), see Table 4 in (pdf).
* Bearings with real damages caused by accelerated lifetime tests, (14x) see Table 5 in (pdf)

## Dataset description & organisation
Each experiment's data is stored on the university's homepage as an archive named __Bearing Code__.__rar__, where the bearing code is taken from the tables below. Further below, a number of utility functions for retrieving the data are implemented.
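
As a small illustration of the archive naming, the sketch below builds the download link for a bearing code and splits one line of `metadata.txt`. The base URL is taken from the printed output later in this notebook. The four-field line layout (code, rpm, radial load, link) is an assumption inferred from that same output, and both helper names are hypothetical.

```python
# Base URL as it appears in the notebook output further down.
BASE_URL = "http://groups.uni-paderborn.de/kat/BearingDataCenter/"


def archive_url(bearing_code):
    """Return the download link for one bearing archive (hypothetical helper)."""
    return BASE_URL + bearing_code + ".rar"


def parse_metadata_line(line):
    """Split one metadata.txt line; the field meaning is an assumption."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("expected 4 fields, got {}".format(len(fields)))
    code, rpm, load, url = fields
    return {"code": code, "rpm": rpm, "load": load, "url": url}


print(archive_url("K001"))
entry = parse_metadata_line(
    "K001 1500 400 http://groups.uni-paderborn.de/kat/BearingDataCenter/K001.rar"
)
print(entry["url"])
```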

### Artificial damages
The dataset is composed of artificially created and wear based bearing errors.

<img src="doc/ad.png" width="500" height="250">

The artificial damages are manufactured using three different methods:
1. electric discharge machining, **EDM** (trench of 0.25 mm length in rolling direction and depth of 1-2 mm),
2. drilling (diameter: 0.9 mm, 2 mm, 3 mm), and
3. manual electric engraving (damage length from 1-4 mm)

ISO 15243 gives a methodology for the classification of
bearing damage and failures. The damages are categorized
into six main damage modes and their sub-modes. The six
main damage modes are: fatigue, wear, corrosion, electrical
erosion, plastic deformation, and fracture and cracking.

**Damage extent for a 6203 bearing**

| Damage level | Assigned % value | Limits for 6203 bearing |
| --- | --- | --- |
| 1 | 0-2% | < 2 mm |
| 2 | 2-5% | > 2 mm |
| 3 | 5-15% | > 4.5 mm |
| 4 | 15-35% | > 13.5 mm |
| 5 | > 35% | > 31.5 mm |
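
The table can be expressed as a simple lookup. The following is a minimal sketch, not part of the original notebook: the function name `damage_level` is hypothetical, and treating a length that falls exactly on a limit as belonging to the lower level is an assumption, since the table does not say which side the boundary belongs to.

```python
from bisect import bisect_left

# Lengths (mm) above which levels 2, 3, 4 and 5 begin for a 6203 bearing,
# copied from the table above.
_LEVEL_LIMITS_MM = (2.0, 4.5, 13.5, 31.5)


def damage_level(length_mm):
    """Map a damage length in mm to a damage level 1..5 (hypothetical helper)."""
    if length_mm < 0:
        raise ValueError("damage length must be non-negative")
    # bisect_left counts the limits strictly below length_mm, so a value
    # exactly on a limit stays in the lower level (assumption).
    return 1 + bisect_left(_LEVEL_LIMITS_MM, length_mm)


for length in (1.0, 3.0, 5.0, 14.0, 32.0):
    print(length, "mm -> level", damage_level(length))
```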

## Dataset summary 
The tables below summarise the available dataset.

The dataset contains the following subsets, each available for download under the name given in the first column.

### Healthy bearings
| Bearing Code | Run-in Period [h] | Radial Load [N] | Speed [min^-1] |
| --- | --- | --- | --- |
| K001 | > 50 | 1000-3000 | 1500-2000 |
| K002 | 19 | 3000 | 2900 |
| K003 | 1 | 3000 | 3000 |
| K004 | 5 | 3000 | 3000 |
| K005 | 10 | 3000 | 3000 |
| K006 | 16 | 3000 | 2900 |

### Bearings with outer ring damage
| Bearing Code | Radial Load [N] | Speed [min^-1] | Damage method | Damage extent (level) | Damage type |
| --- | --- | --- | --- | --- | --- |
| KA01 | - | - | EDM | 1 | |
| KA03 | - | - | Elec. Engraved | 2 | |
| KA04 | - | - | HALT | 1 | Fatigue: pitting |
| KA05 | - | - | Elec. Engraved | 1 | |
| KA06 | - | - | Elec. Engraved | 2 | |
| KA07 | - | - | Drilling | 1 | |
| KA08 | - | - | Drilling | 2 | |
| KA09 | - | - | Drilling | 2 | |
| KA15 | - | - | HALT | 1 | Plastic deform.: indentations |
| KA16 | - | - | HALT | 2 | Fatigue: pitting |
| KA22 | - | - | HALT | 1 | Fatigue: pitting |
| KA30 | - | - | HALT | 1 | Plastic deform.: indentations |

### Bearings with inner ring damage
| Bearing Code | Radial Load [N] | Speed [min^-1] | Damage method | Damage extent (level) | Damage type |
| --- | --- | --- | --- | --- | --- |
| KI01 | - | - | EDM | 1 | |
| KI03 | - | - | Elec. Engraved | 1 | |
| KI04 | - | - | HALT | 1 | Fatigue: pitting |
| KI05 | - | - | Elec. Engraved | 1 | |
| KI07 | - | - | Elec. Engraved | 2 | |
| KI08 | - | - | Elec. Engraved | 2 | |
| KI14 | - | - | HALT | 1 | Fatigue: pitting |
| KI16 | - | - | HALT | 3 | Fatigue: pitting |
| KI17 | - | - | HALT | 1 | Fatigue: pitting |
| KI18 | - | - | HALT | 2 | Fatigue: pitting |
| KI21 | - | - | HALT | 1 | Fatigue: pitting |

### Bearings with combined inner and outer ring damage
| Bearing Code | Radial Load [N] | Speed [min^-1] | Damage method | Damage extent (level) | Damage type |
| --- | --- | --- | --- | --- | --- |
| KB23 | - | - | HALT | 2 | Fatigue: pitting |
| KB24 | - | - | HALT | 3 | Fatigue: pitting |
| KB27 | - | - | HALT | 1 | Plastic deform.: indentations |


In [2]:
import os
import glob
import errno
import random
#import urllib
import urllib.request as ug
import numpy as np
from scipy.io import loadmat
import patoolib as pa

rarpath = "/usr/local/bin/unrar"

class PDB:
    def __init__(self, exp, rpm, rad_force, torque_mNm, length):
        # validate the arguments; raise instead of calling exit() so that a
        # wrong argument does not terminate the notebook kernel
        if exp not in ('K001', 'K002', 'K003', 'K004', 'K005', 'K006','KA01','KA02','KA03','KA04','KA05','KA06'
                       ,'KA07','KA08','KA09','KA15','KA16','KA22','KA30','KB23','KB24','KB27','KI01','KI03','KI04'
                       ,'KI05','KI07','KI08','KI14','KI16','KI17','KI18','KI21'):
            raise ValueError("wrong experiment name: {}".format(exp))
        if rpm not in ('1500', '900'):
            raise ValueError("wrong rpm value: {}".format(rpm))
        if rad_force not in ('1000', '400'):
            raise ValueError("wrong load value: {}".format(rad_force))
        if torque_mNm not in ('100', '700'):
            raise ValueError("wrong torque value: {}".format(torque_mNm))

        dict_rpm = {'1500': 'N15_', '900': 'N09_', '2900': 'N29_'}
        dict_torq = {'100': 'M01_', '700': 'M07_'}
        dict_load = {'400': 'F04_', '1000': 'F10_'}
        #Labels 1 = healthy bearing, 2 = outer ring damage , 3 = inner ring damage, 4 = combined damage
        dict_labels = {'K0': 1, 'KA': 2, 'KI': 3, 'KB': 4}

        #print(exp[0:2])
        self.y_label = dict_labels[exp[0:2]]
        print("Y is:")
        print(self.y_label)

        filestring = dict_rpm[rpm] + dict_torq[torque_mNm] + dict_load[rad_force]
        print('filestring')
        print(filestring)

        # create receiving directory names from arguments
        rdir = os.path.join('.', 'data/PDB', exp)  # ,rpm,load)
        fmeta = os.path.join(os.path.dirname('.'), 'metadata.txt')
        # read the metadata file and close it again afterwards
        with open(fmeta) as fh:
            all_lines = fh.readlines()

        lines = []
        for line in all_lines:
            l = line.split()
            # print(l)
            if (l[0] == exp): # and l[1] == rpm and l[2] == rad_force:  # and l[3] == torque_mNm:
                lines.append(l)

        print("l: ")
        print(lines)
        # prepare download
        self.length = length  # sequence length
        #        self._load_and_slice_data2(rdir, lines)
        self._load_and_slice_data(rdir, lines, filestring)

        # shuffle training and test arrays
        shuffle = 0
        if(shuffle):
            self._shuffle()

    def _mkdir(self, path):
        try:
            os.makedirs(path)
        except OSError as exc:
            if exc.errno == errno.EEXIST and os.path.isdir(path):
                pass
            else:
                print("can't create directory '{}''".format(path))
                exit(1)

    def _download(self, fpath, link):
        print("Downloading to: '{}'".format(fpath))
        #urllib.request.urlretrieve(link, fpath)
        ug.urlretrieve(link, fpath)

    def _load_and_slice_data(self, rdir, infos, filestring):

        self.X_train = np.zeros((0, self.length))
        self.X_test = np.zeros((0, self.length))
        self.y_train = []
        self.y_test = []

        ## w
        for idx, info in enumerate(infos):

            # directory to put the raw rar file
            rawdir = os.path.join(rdir, 'raw')
            self._mkdir(rawdir)

            # path to find the file
            fpath = os.path.join(rawdir, info[0] + '.rar')

            # if file already exists, avoid duplicate downloads
            if not os.path.exists(fpath):
                print("no dir/file")
                self._download(fpath, info[3].rstrip('\n'))

            # compressed file to uncompress
            cmpfile = rawdir + '/' + info[0] + '.rar'

            print("file to exrtract is is::")
            print(cmpfile)

            # unpack file
            if not os.path.exists(rdir + '/' + info[0]):
                pa.extract_archive(cmpfile, outdir=rdir, program=rarpath)
            else:
                print("file already extracted, skipping unrar")

            # a list of all files in the extracted dir
            ddir = rdir + '/' + info[0]
            flist_all = os.listdir(ddir)

            # print("filelist:")
            # print(flist_all)

            # use the searchstring, build from the program arguments to find files of interest
            flistsorted = [i for i in flist_all if filestring in i]

            print("sorted filelist:")
            print(flistsorted)

            # now build the dataset from all files of interest
            # iterate through the filelist
            for f in flistsorted:
                # load matlab file
                mat_dict = loadmat(ddir + '/' + f)#,struct_as_record=False)

                # get the key of the actual dataset; its name equals the filename
                #key = list(filter(lambda x: 'N15_M07_F04_' in x, mat_dict.keys()))
                key = list(filter(lambda x: filestring in x, mat_dict.keys()))
                # load data
                #time_series = mat_dict[key[0]][:, 0] #['Y']
                time_series = mat_dict[key[0]]['Y'][0, 0][0, 6][2][:][0]

                # drop trailing samples that do not fill a whole clip; slicing with
                # a negative remainder would return an empty array whenever the
                # signal length is an exact multiple of self.length
                n_clips = time_series.shape[0] // self.length
                clips = time_series[:n_clips * self.length].reshape(-1, self.length)

                n = clips.shape[0]

                split = 0
                if(split):
                    # 75% train 25%test
                    n_split = int(3 * n / 4)

                    self.X_train = np.vstack((self.X_train, clips[:n_split]))
                    self.X_test = np.vstack((self.X_test, clips[n_split:]))
                    # todo add meaningful label

                    self.y_train += [self.y_label] * n_split
                    self.y_test  += [self.y_label] * (clips.shape[0] - n_split)

                else:
                    self.X_train = np.vstack((self.X_train, clips[:n]))
                    #self.X_test = np.vstack((self.X_test, clips[n_split:]))
                    # todo add meaningful label

                    self.y_train += [self.y_label] * n#_split
                    #self.y_test  += [self.y_label] * (clips.shape[0] - n_split)
                    
    def _shuffle(self):
        # shuffle training samples
        index = list(range(self.X_train.shape[0]))
        random.Random(0).shuffle(index)
        self.X_train = self.X_train[index]
        self.y_train = tuple(self.y_train[i] for i in index)

        # shuffle test samples
        index = list(range(self.X_test.shape[0]))
        random.Random(0).shuffle(index)
        self.X_test = self.X_test[index]
        self.y_test = tuple(self.y_test[i] for i in index)


# Build list of datasets
Each snippet is approx. 3 seconds long, to allow the CPU to start sampling.
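
To make the snippet length concrete: at the 64 kHz sampling rate used in the next cell, 3 seconds correspond to 192000 samples, so one 4-second measurement yields a single full clip. The sketch below shows the slicing on a synthetic signal. The function name `slice_into_clips` is hypothetical and the zero-filled array only stands in for a real measurement.

```python
import numpy as np


def slice_into_clips(signal, clip_len):
    """Cut a 1-D signal into consecutive clips, dropping the incomplete tail."""
    signal = np.asarray(signal)
    n_clips = signal.shape[0] // clip_len
    return signal[: n_clips * clip_len].reshape(n_clips, clip_len)


fs_acc = 64000                      # samples per second
clip_len = 3 * fs_acc               # 3 seconds per clip
measurement = np.zeros(4 * fs_acc)  # stand-in for one 4-second measurement

clips = slice_into_clips(measurement, clip_len)
print(clips.shape)  # (1, 192000)
```

Using integer division for the clip count also handles the case where the signal length is an exact multiple of the clip length.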

'''
## Analysis and dataset planning
After implementation of analysis and feature extraction functions, the artificial and real datasets are organised and tested as follows

| Class | Condition | Training | Testing |
| --- | --- | --- | --- |
| 1 | Healthy | K002 | K001 |
| 2 | OR damage | | KA22 |
| 2 | OR damage | KA01 | KA04 |
| 2 | OR damage | KA05 | KA15 |
| 2 | OR damage | KA07 | KA30 |
| 2 | OR damage | | KA16 |
| 3 | IR damage | | KI14 |
| 3 | IR damage | KI01 | KI21 |
| 3 | IR damage | KI05 | KI17 |
| 3 | IR damage | KI07 | KI18 |
| 3 | IR damage | | KI16 |
'''


In [58]:
fsAcc = 64000      # samples /sec
sampLen = 3*fsAcc  # 3 sec.

be_data = []

#healthy sets
be_data.append(PDB('K001','1500','1000','700',sampLen))
be_data.append(PDB('K002','1500','1000','700',sampLen))
be_data.append(PDB('K003','1500','1000','700',sampLen))
be_data.append(PDB('K004','1500','1000','700',sampLen))
be_data.append(PDB('K005','1500','1000','700',sampLen))
be_data.append(PDB('K006','1500','1000','700',sampLen))

#outerring
be_data.append(PDB('KA01','1500','1000','700',sampLen)) #EDM
be_data.append(PDB('KA03','1500','1000','700',sampLen)) #
be_data.append(PDB('KA04','1500','1000','700',sampLen)) #
be_data.append(PDB('KA05','1500','1000','700',sampLen)) #
be_data.append(PDB('KA06','1500','1000','700',sampLen)) #
be_data.append(PDB('KA07','1500','1000','700',sampLen)) #
be_data.append(PDB('KA08','1500','1000','700',sampLen)) #
be_data.append(PDB('KA09','1500','1000','700',sampLen)) #
be_data.append(PDB('KA15','1500','1000','700',sampLen)) #
be_data.append(PDB('KA16','1500','1000','700',sampLen)) #
be_data.append(PDB('KA22','1500','1000','700',sampLen)) #
be_data.append(PDB('KA30','1500','1000','700',sampLen)) #

#innerring
be_data.append(PDB('KI01','1500','1000','700',sampLen)) #EDM   severity: 1
be_data.append(PDB('KI03','1500','1000','700',sampLen)) #Elec  severity: 1
be_data.append(PDB('KI04','1500','1000','700',sampLen)) #HALT  severity: 1
be_data.append(PDB('KI05','1500','1000','700',sampLen)) #Elec  severity: 1
be_data.append(PDB('KI07','1500','1000','700',sampLen)) #Elec  severity: 2
be_data.append(PDB('KI08','1500','1000','700',sampLen)) #Elec  severity: 2
be_data.append(PDB('KI14','1500','1000','700',sampLen)) #HALT  severity: 1
be_data.append(PDB('KI16','1500','1000','700',sampLen)) #HALT  severity: 3
be_data.append(PDB('KI17','1500','1000','700',sampLen)) #HALT  severity: 1
be_data.append(PDB('KI18','1500','1000','700',sampLen)) #HALT  severity: 2
be_data.append(PDB('KI21','1500','1000','700',sampLen)) #HALT  severity: 1


Y is:
1
filestring
N15_M07_F10_
l: 
[['K001', '1500', '400', 'http://groups.uni-paderborn.de/kat/BearingDataCenter/K001.rar']]
file to exrtract is is::
./data/PDB/K001/raw/K001.rar
file already extracted, skipping unrar
sorted filelist:
['N15_M07_F10_K001_20.mat', 'N15_M07_F10_K001_19.mat', 'N15_M07_F10_K001_9.mat', 'N15_M07_F10_K001_8.mat', 'N15_M07_F10_K001_18.mat', 'N15_M07_F10_K001_15.mat', 'N15_M07_F10_K001_5.mat', 'N15_M07_F10_K001_4.mat', 'N15_M07_F10_K001_14.mat', 'N15_M07_F10_K001_16.mat', 'N15_M07_F10_K001_6.mat', 'N15_M07_F10_K001_7.mat', 'N15_M07_F10_K001_17.mat', 'N15_M07_F10_K001_3.mat', 'N15_M07_F10_K001_13.mat', 'N15_M07_F10_K001_12.mat', 'N15_M07_F10_K001_2.mat', 'N15_M07_F10_K001_10.mat', 'N15_M07_F10_K001_11.mat', 'N15_M07_F10_K001_1.mat']
Y is:
1
filestring
N15_M07_F10_
l: 
[['K002', '1500', '400', 'http://groups.uni-paderborn.de/kat/BearingDataCenter/K002.rar']]
file to exrtract is is::
./data/PDB/K002/raw/K002.rar
file already extracted, skipping unrar
sorted file

## Play sounds
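
Before a vibration clip is sent to the sound card it usually has to be scaled into the floating-point range the audio backend expects. The sketch below is one possible way to do this and is not taken from the original notebook: the helper names, the removal of the DC offset and the `headroom` factor are all assumptions. `sounddevice` is imported inside `play_clip` so that the scaling helper can be used without an audio device.

```python
import numpy as np


def to_playback_float(clip, headroom=0.9):
    """Scale a 1-D clip to float32 within [-headroom, headroom] (hypothetical helper)."""
    clip = np.asarray(clip, dtype=np.float64)
    clip = clip - clip.mean()            # remove the DC offset (assumption)
    peak = np.max(np.abs(clip))
    if peak == 0:
        return np.zeros(clip.shape, dtype=np.float32)
    return (headroom * clip / peak).astype(np.float32)


def play_clip(clip, samplerate=64000):
    """Play one clip and block until playback has finished."""
    import sounddevice as sd             # needs an audio output device
    sd.play(to_playback_float(clip), samplerate=samplerate)
    sd.wait()


scaled = to_playback_float([0.0, 1.0, 2.0])
print(scaled)
```

`sd.play` returns immediately by default, so when the recording on the CPU has to run during playback, the call to `sd.wait()` can be moved to after the serial capture.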

In [70]:
np.shape(be_data)#[0].X_train[0])
#be_data[0].y_label
#dict_labels[1]

np.shape(be_data[5].X_train)

len(be_data[4].X_train)

40

In [75]:
!pwd


/Users/opprud/workspace/ceramicspeed/bearingbrain/tools/dataset_for_test/external_data/4_paderborn


## Read data from serial port and store data

In [92]:
import serial
import time
import csv
import struct
import os
import numpy as np
from numpy import savetxt
from numpy import asarray
from scipy.io.wavfile import write

dataFolder = './collected_data/raw/'
normalisedDataFolder = './collected_data/normalised/'

def serialPdmToWaw(ser, rxLen, samplerate, filename, scale=True):
    """
    Read data from connected Apollo3 device and save as WAV file
    """
    # open serial port
    ser = serial.Serial(ser,baudrate=115200,timeout=5)
    #ser = serial.Serial(ser,baudrate=500000,timeout=5)
    ser.flushInput()

    # required delay since DTR is pulled when port opens, this resets the CPU through the bootload circuit
    time.sleep(1.0)
    # write any char to CPU to start sampling
    ser.write(b'\n')
    
    # capture incoming data
    # Sampling starts by simply resetting the Artemis Nano board
    ser_bytes = ser.read(size=rxLen)
    print("got %d bytes" % len(ser_bytes))

    L = int(len(ser_bytes))
    l2 = int(L/2)
    
    # unpack data into an array of 16-bit samples
    # (a trailing odd byte, e.g. after a timeout, is ignored)
    DATA = struct.unpack('%dh'%l2,ser_bytes[:2*l2])
    d = asarray(DATA, dtype=np.int16)

    # scale towards the 16-bit range; the extra factor 1/2 leaves headroom
    if(scale == True):
        B15_MAX = (1 << 15) - 1
        mul = B15_MAX / d.max()
        # cast back to int16 so that a 16-bit PCM WAV file is written
        d_norm = (d*(mul/2)).astype(np.int16)
        # create the output directory if it does not exist yet
        if not os.path.exists(normalisedDataFolder):
            os.makedirs(normalisedDataFolder)
            print("creating normalised data dir")
        # write file
        write(normalisedDataFolder+filename+"_norm.wav", samplerate, d_norm)

    if not os.path.exists(dataFolder):
        os.makedirs(dataFolder)
        print("creating data dir")

    # write file
    write(dataFolder+filename+"_raw.wav", samplerate, d)   

    #close port
    ser.close()

    return ser
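
The unpacking and storage steps can also be tried without any hardware attached. The following standard-library sketch converts a raw byte buffer into signed 16-bit samples and writes them as a mono WAV file. The helper names are hypothetical, and the little-endian byte order (`<`) is an assumption about the data sent by the board.

```python
import struct
import wave


def unpack_int16_le(raw):
    """Interpret a byte buffer as little-endian signed 16-bit samples."""
    n = len(raw) // 2                    # ignore a trailing odd byte, if any
    return list(struct.unpack("<%dh" % n, raw[: 2 * n]))


def write_wav_int16(path, samples, samplerate):
    """Write signed 16-bit samples as a mono PCM WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)               # mono
        wf.setsampwidth(2)               # 2 bytes = 16 bit
        wf.setframerate(samplerate)
        wf.writeframes(struct.pack("<%dh" % len(samples), *samples))


samples = unpack_int16_le(b"\x01\x00\xff\xff\x7f")
print(samples)  # [1, -1]
```

Writing 16-bit PCM keeps the files readable by common audio tools.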
    

In [94]:
import sounddevice as sd
import time
fs_playback=64000

port = '/dev/cu.wchusbserial1430' 
#141320'
fs   = 15625 #using arduino WAV example settings
        #46875   #hardcoded in CPU 
l    = 40000#24000
#fn   = "s600_2"

#ser = serialPdmToWaw(port, l, fs, fn, True)

dict_labels = {1 :'OK', 2: 'OUTER', 3 : 'INNER', 4 : 'BOTH'}

for i in range(2):  # or: range(np.shape(be_data)[0])
    # any data ?
    if(len(be_data[i].X_train) > 0):
        s = be_data[i].X_train[0]
        #sd.play(s,samplerate=fs_playback)
        print(dict_labels[be_data[i].y_label])
        #capture data and store
        filename = ( str(dict_labels[be_data[i].y_label])+'_test3_'+str(i))
        ser = serialPdmToWaw(port, l, fs, filename, True)
        time.sleep(3)
    else:
        print("skipping :"+str(i))

OK
got %d 40000
OK
got %d 40000


In [24]:
np.shape(be_data)[0]

8

In [None]:
#outerring
d_OR_test_1  = PDB('KA22','1500','1000','700',SliceLen) 
d_OR_train_2 = PDB('KA01','1500','1000','700',SliceLen) #EDM
d_OR_test_2  = PDB('KA04','1500','1000','700',SliceLen)
d_OR_train_3 = PDB('KA05','1500','1000','700',SliceLen) #Elec engraved
d_OR_test_3  = PDB('KA15','1500','1000','700',SliceLen)
d_OR_train_4 = PDB('KA07','1500','1000','700',SliceLen) #Drilling
d_OR_test_4  = PDB('KA30','1500','1000','700',SliceLen)
d_OR_test_5  = PDB('KA16','1500','1000','700',SliceLen) 

#innerring
d_IR_test_1  = PDB('KI14','1500','1000','700',SliceLen)
d_IR_train_2 = PDB('KI01','1500','1000','700',SliceLen) #EDM
d_IR_test_2  = PDB('KI21','1500','1000','700',SliceLen)
d_IR_train_3 = PDB('KI05','1500','1000','700',SliceLen) #Elec engraved
d_IR_test_3  = PDB('KI17','1500','1000','700',SliceLen)
d_IR_train_4 = PDB('KI07','1500','1000','700',SliceLen) #Elec engraved
d_IR_test_4  = PDB('KI18','1500','1000','700',SliceLen)
d_IR_test_5  = PDB('KI16','1500','1000','700',SliceLen) 

In [8]:
np.shape(ok1.X_test)

(0, 192000)

In [22]:
np.shape(be_data[0].y_label)
be_data[5].y_label

3

In [12]:
np.shape(ok_data)

(2,)