# Fault detection in the Tennessee Eastman Process

## Introduction

This demo shows a basic machine learning workflow for training a classification algorithm to distinguish various fault conditions. The data set for this demo originates from https://www.kaggle.com/averkij/tennessee-eastman-process-simulation-dataset

In [1]:
from bcolz import ctable
from sklearn.decomposition import IncrementalPCA
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.metrics import accuracy_score, confusion_matrix
import numpy as np
from tqdm import tqdm
from dask import dataframe
from dask.cache import Cache

## Loading the training data

The original data set comes in the form of four R dataframes. I've converted them to bcolz ctables. The conversion does involve loading each dataframe into RAM, with the largest requiring about 4.5 GiB.

In [2]:
training_data = ctable(rootdir='training_data.bcolz')
X = training_data[training_data.cols.names[3:]]
y = training_data['faultNumber']

In [3]:
X

ctable((5250000,), [('xmeas_1', '<f8'), ('xmeas_2', '<f8'), ('xmeas_3', '<f8'), ('xmeas_4', '<f8'), ('xmeas_5', '<f8'), ('xmeas_6', '<f8'), ('xmeas_7', '<f8'), ('xmeas_8', '<f8'), ('xmeas_9', '<f8'), ('xmeas_10', '<f8'), ('xmeas_11', '<f8'), ('xmeas_12', '<f8'), ('xmeas_13', '<f8'), ('xmeas_14', '<f8'), ('xmeas_15', '<f8'), ('xmeas_16', '<f8'), ('xmeas_17', '<f8'), ('xmeas_18', '<f8'), ('xmeas_19', '<f8'), ('xmeas_20', '<f8'), ('xmeas_21', '<f8'), ('xmeas_22', '<f8'), ('xmeas_23', '<f8'), ('xmeas_24', '<f8'), ('xmeas_25', '<f8'), ('xmeas_26', '<f8'), ('xmeas_27', '<f8'), ('xmeas_28', '<f8'), ('xmeas_29', '<f8'), ('xmeas_30', '<f8'), ('xmeas_31', '<f8'), ('xmeas_32', '<f8'), ('xmeas_33', '<f8'), ('xmeas_34', '<f8'), ('xmeas_35', '<f8'), ('xmeas_36', '<f8'), ('xmeas_37', '<f8'), ('xmeas_38', '<f8'), ('xmeas_39', '<f8'), ('xmeas_40', '<f8'), ('xmeas_41', '<f8'), ('xmv_1', '<f8'), ('xmv_2', '<f8'), ('xmv_3', '<f8'), ('xmv_4', '<f8'), ('xmv_5', '<f8'), ('xmv_6', '<f8'), ('xmv_7', '<f8'), ('

In [4]:
y

carray((5250000,), float64)
  nbytes := 40.05 MB; cbytes := 467.04 KB; ratio: 87.82
  cparams := cparams(clevel=5, shuffle=1, cname='lz4', quantize=0)
  chunklen := 32768; chunksize: 262144; blocksize: 262144
  rootdir := 'training_data.bcolz\faultNumber'
  mode    := 'a'
[  0.   0.   0. ...,  20.  20.  20.]

In [5]:
X_df = dataframe.from_bcolz(X)
X_df

Dask DataFrame Structure:
  npartitions=161
  columns: xmeas_1 ... xmeas_41, xmv_1 ... xmv_11 (52 columns, all float64)
  divisions: 0, 32768, ..., 5242880, 5249999


## Out-of-core machine learning

Several machine learning methods can train on one batch of the training data at a time. We therefore don't need to keep the entire data set in memory; we only need to load one batch at a time. In scikit-learn, estimators that support this implement a `partial_fit(X, y)` method. Here is a non-exhaustive list:

- Decomposition:
  - `sklearn.decomposition.MiniBatchDictionaryLearning`
  - `sklearn.decomposition.IncrementalPCA`
- Classification:
  - `sklearn.naive_bayes.GaussianNB`
  - `sklearn.linear_model.SGDClassifier`
- Clustering:
  - `sklearn.cluster.Birch`
  - `sklearn.cluster.MiniBatchKMeans`
  
XGBoost and LightGBM can both use Dask, which in turn can wrap bcolz tables. This allows training gradient-boosted decision tree ensembles.
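
As a small illustration of the `partial_fit` pattern used by the estimators listed above, the sketch below fits a `StandardScaler` one batch at a time and compares the result with a single fit on the full array. The data is synthetic and the batch size of 1,000 rows is an arbitrary choice for the example, not something taken from this demo.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real feature matrix (an assumption for this sketch).
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(10_000, 4))

# Incremental fit: the scaler only ever sees 1,000 rows at a time.
incremental = StandardScaler()
for start in range(0, data.shape[0], 1_000):
    incremental.partial_fit(data[start:start + 1_000])

# Reference fit on the whole array at once.
full = StandardScaler().fit(data)

means_match = np.allclose(incremental.mean_, full.mean_)
scales_match = np.allclose(incremental.scale_, full.scale_)
print(means_match, scales_match)
```

For a scaler the batch-wise statistics should agree with the full fit up to floating-point error. For iterative learners such as `SGDClassifier` the result generally depends on the order and number of batches.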

## Accessing batches of data efficiently

Bcolz is most efficient when accessing a contiguous block of data. However, for machine learning, we want batches of data that are randomly sampled from the entire data set. We can work around this by loading a large block, then randomly sampling within that block. The order of the blocks needs to be random too. I'll use a generator to do this.

In [5]:
def random_batch_generator(X: ctable, buffer_size: int, batch_size: int, y: ctable=None):
    buffer_starts = list(range(0, X.shape[0], buffer_size))
    PRNG = np.random.default_rng()
    PRNG.shuffle(buffer_starts) # Sample blocks of the training data in random order.
    for buffer_start in buffer_starts:
        buffer_end = min(X.shape[0], buffer_start + buffer_size)
        buffer = X[buffer_start:buffer_end]  # X already holds only the feature columns.
        buffer = buffer.view(np.float64).reshape(buffer.shape + (-1,))
        indices = PRNG.choice(buffer.shape[0], size=buffer.shape[0], replace=False)
        
        if y is not None:
            y_buffer = y[buffer_start:buffer_end]
            y_buffer = y_buffer.view(np.float64)
        for start in range(0, buffer.shape[0], batch_size):
            end = min(start + batch_size, buffer.shape[0])
            batch = buffer[indices[start:end], :]
            if y is not None:
                y_batch = y_buffer[indices[start:end]]
                yield batch, y_batch
            else:
                yield batch
        del buffer
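
To check that this kind of block-wise shuffling still visits every row exactly once per pass, here is a minimal sketch of the same idea on a plain NumPy array. The function name `block_shuffled_batches` and the toy sizes are made up for the illustration; it is a stand-in for the ctable-based generator, not a replacement.

```python
import numpy as np

def block_shuffled_batches(data, buffer_size, batch_size, seed=None):
    """Yield batches by visiting contiguous blocks in random order
    and shuffling the rows inside each block."""
    rng = np.random.default_rng(seed)
    buffer_starts = np.arange(0, data.shape[0], buffer_size)
    rng.shuffle(buffer_starts)  # Random block order.
    for buffer_start in buffer_starts:
        # One contiguous read per block; slicing clips at the end of the array.
        buffer = data[buffer_start:buffer_start + buffer_size]
        order = rng.permutation(buffer.shape[0])  # Random row order within the block.
        for start in range(0, buffer.shape[0], batch_size):
            yield buffer[order[start:start + batch_size]]

# Toy data: 1,000 rows whose single value equals the row index.
rows = np.arange(1_000).reshape(-1, 1)
batches = list(block_shuffled_batches(rows, buffer_size=256, batch_size=64, seed=0))

seen = np.concatenate(batches).ravel()
covers_every_row_once = np.array_equal(np.sort(seen), rows.ravel())
print(len(batches), covers_every_row_once)
```

The trade-off is that rows are only mixed within a block, so a batch never combines rows from distant parts of the data set. A larger buffer gives better mixing at the cost of more memory.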

## The pipeline

There are generally three steps in a machine learning pipeline: preprocessing, dimensionality reduction, and finally the ML algorithm. Each step needs to work on batches of data or support out-of-core processing.

We're going to take the following steps:
1. Preprocessing: feature standardization (zero mean and unit variance per feature).
2. Dimensionality reduction: Principal Component Analysis (PCA)
3. ML algorithm: Stochastic gradient descent (SGD)

In [6]:
batch_size = 64 * 1024
buffer_size = 4 * batch_size
total_length = int(np.ceil(X.shape[0] / batch_size))
print(f'Buffer: {buffer_size} rows, batch: {batch_size} rows, total steps: {total_length}')

Buffer: 262144 rows, batch: 65536 rows, total steps: 81


In [7]:
scaler = StandardScaler()
for batch in tqdm(random_batch_generator(X, buffer_size, batch_size), total=total_length):
    scaler.partial_fit(batch)

100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [00:35<00:00,  2.31it/s]


In [8]:
scaler.mean_

array([2.60840858e-01, 3.66377732e+03, 4.50617786e+03, 9.36923827e+00,
       2.69015887e+01, 4.23629281e+01, 2.72214950e+03, 7.48879855e+01,
       1.20400165e+02, 3.45964586e-01, 7.97595706e+01, 4.99911334e+01,
       2.65009318e+03, 2.51278907e+01, 4.99602615e+01, 3.12043499e+03,
       2.29341076e+01, 6.60021453e+01, 2.45631346e+02, 3.40393308e+02,
       9.44453643e+01, 7.70417501e+01, 3.19695759e+01, 8.87928867e+00,
       2.67706384e+01, 6.87406150e+00, 1.87129541e+01, 1.62878283e+00,
       3.26362489e+01, 1.38005826e+01, 2.45676048e+01, 1.25263230e+00,
       1.84738666e+01, 2.22216904e+00, 4.78166224e+00, 2.26500141e+00,
       1.79840572e-02, 8.39498723e-01, 9.74103505e-02, 5.37502095e+01,
       4.37929296e+01, 6.34920553e+01, 5.43011737e+01, 3.01559485e+01,
       6.31554539e+01, 2.29196979e+01, 3.99292803e+01, 3.80739037e+01,
       4.64420345e+01, 5.04799097e+01, 4.19082559e+01, 1.88092347e+01])

In [9]:
scaler.scale_

array([1.46108308e-01, 4.27775953e+01, 1.08699830e+02, 3.56353604e-01,
       2.31067704e-01, 3.13270037e-01, 7.42791827e+01, 1.31549474e+00,
       7.12738784e-02, 8.39785301e-02, 1.75978428e+00, 1.00239742e+00,
       7.48534772e+01, 1.10013240e+00, 1.01898245e+00, 7.70199728e+01,
       6.47620592e-01, 1.81705492e+00, 6.79386819e+01, 1.10078206e+01,
       1.26817403e+00, 1.38709653e+00, 1.73842397e+00, 2.20576767e-01,
       1.92120107e+00, 1.32540412e-01, 9.37602747e-01, 1.24616020e-01,
       2.61142722e+00, 2.85954005e-01, 2.95828176e+00, 1.45611509e-01,
       1.31335180e+00, 1.70181157e-01, 3.40819547e-01, 1.81953749e-01,
       1.01740862e-02, 9.02880515e-02, 1.32383917e-02, 5.81193757e-01,
       6.07382966e-01, 3.26951407e+00, 5.13291779e+00, 2.00389822e+01,
       7.23875563e+00, 1.08172485e+01, 1.26262125e+01, 2.94991252e+00,
       2.35821729e+00, 1.71937547e+01, 9.77333855e+00, 5.06438761e+00])

In [11]:
ipca = IncrementalPCA(n_components=3)
for epoch in range(10):
    print(f'Epoch {epoch}', flush=True)
    for batch in tqdm(random_batch_generator(X, buffer_size, batch_size), total=total_length):
        batch = scaler.transform(batch)
        ipca.partial_fit(batch)

Epoch 0


100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [01:26<00:00,  1.07s/it]

Epoch 1



100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [01:35<00:00,  1.17s/it]

Epoch 2



100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [01:17<00:00,  1.04it/s]

Epoch 3



100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [01:18<00:00,  1.03it/s]

Epoch 4



100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [01:17<00:00,  1.04it/s]

Epoch 5



100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [01:16<00:00,  1.06it/s]

Epoch 6



100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [01:17<00:00,  1.04it/s]

Epoch 7



100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [01:15<00:00,  1.08it/s]

Epoch 8



100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [01:14<00:00,  1.08it/s]

Epoch 9



100%|██████████████████████████████████████████████████████████████████████████████████| 81/81 [01:15<00:00,  1.07it/s]


In [12]:
ipca.explained_variance_ratio_

array([0.24695066, 0.19175973, 0.07551185])
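
Summing these ratios shows how much of the total variance the three retained components capture together. The short calculation below uses only the values printed above.

```python
import numpy as np

# Values copied from the ipca.explained_variance_ratio_ output above.
explained_variance_ratio = np.array([0.24695066, 0.19175973, 0.07551185])

# Running total: variance explained by the first 1, 2 and 3 components.
cumulative = np.cumsum(explained_variance_ratio)
print(cumulative)
```

The last entry is about 0.51, so roughly half of the variance in the standardized features is lost when projecting onto three components.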

## Stochastic Gradient Descent

The target variable, `y`, is an integer fault number in the range `[0, 20]`. Multiclass classifiers are often trained against a one-hot encoding of the class label: each integer label becomes a vector of 21 values in which one entry is 1 and the others are 0. Why? Many classifiers learn a binary classification, i.e. they can separate two classes. To separate more than two classes, you have to train multiple classifiers. In this case, that means either 21 one-vs-rest classifiers, or 21 × 20 / 2 = 210 one-vs-one classifiers (one per pair of classes).
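
The arithmetic behind the two multiclass strategies, and a NumPy-only version of one-hot encoding, can be sketched as follows. The three example labels are arbitrary.

```python
import numpy as np

n_classes = 21  # Fault numbers 0 through 20.

# One-vs-rest trains one binary classifier per class.
one_vs_rest = n_classes
# One-vs-one trains one binary classifier per unordered pair of classes.
one_vs_one = n_classes * (n_classes - 1) // 2
print(one_vs_rest, one_vs_one)

# One-hot encoding: row i of the identity matrix is the encoding of label i.
labels = np.array([0, 5, 20])
one_hot = np.eye(n_classes)[labels]
print(one_hot.shape)
```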

In [214]:
encoder = OneHotEncoder(categories=[list(range(21))], sparse=False)
encoder.fit([[1]])
print(encoder.transform([[5]]))

[[0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]


In [32]:
from collections import deque
from sklearn.multiclass import OneVsOneClassifier
from sklearn.naive_bayes import GaussianNB

In [73]:
# clf = OneVsOneClassifier(SGDClassifier(), n_jobs=-1)
clf = SGDClassifier(n_jobs=-1, fit_intercept=True, class_weight={0: 20, 1: 1.0})
# clf = GaussianNB()
for epoch in range(10):
    print(f'Epoch {epoch}', flush=True)
    progress_bar = tqdm(random_batch_generator(X, buffer_size, batch_size, y), total=total_length)
    accuracies = []#deque(maxlen=100)
    for x_batch, y_batch in progress_bar:
        y_batch[y_batch > 0] = 1 # Only trying to identify that a fault occurred, not which type of fault.
        
        # Subtract the mean and divide by the standard deviation for each feature (standardizing the data).
        x_batch_transformed = scaler.transform(x_batch)
        
        # Transform the data using PCA.
#         x_batch_transformed = ipca.transform(x_batch_transformed)
        
        # Finally we fit the classifier.
        clf.partial_fit(x_batch_transformed, y_batch, classes=np.arange(2))
        
        # Report accuracy on the training data.
        accuracies.append(accuracy_score(y_batch, clf.predict(x_batch_transformed)))
        accuracy = np.mean(accuracies)  # Running mean accuracy over all batches seen so far in this epoch.
        progress_bar.set_postfix({'accuracy': f'{accuracy:.4f}'})

Epoch 0


 53%|█████████████████████████████████▋                              | 81/154 [00:31<00:27,  2.61it/s, accuracy=0.9976]

Epoch 1



 53%|█████████████████████████████████▋                              | 81/154 [00:22<00:20,  3.62it/s, accuracy=0.9975]

Epoch 2



 53%|█████████████████████████████████▋                              | 81/154 [00:18<00:16,  4.42it/s, accuracy=0.9974]

Epoch 3



 53%|█████████████████████████████████▋                              | 81/154 [00:22<00:20,  3.53it/s, accuracy=0.9976]

Epoch 4



 53%|█████████████████████████████████▋                              | 81/154 [00:21<00:19,  3.72it/s, accuracy=0.9980]

Epoch 5



 53%|█████████████████████████████████▋                              | 81/154 [00:20<00:18,  4.00it/s, accuracy=0.9982]

Epoch 6



 53%|█████████████████████████████████▋                              | 81/154 [00:25<00:22,  3.21it/s, accuracy=0.9983]

Epoch 7



 53%|█████████████████████████████████▋                              | 81/154 [00:19<00:17,  4.12it/s, accuracy=0.9984]

Epoch 8



 53%|█████████████████████████████████▋                              | 81/154 [00:27<00:24,  2.97it/s, accuracy=0.9984]

Epoch 9



 53%|█████████████████████████████████▋                              | 81/154 [00:31<00:28,  2.55it/s, accuracy=0.9984]


## Test set accuracy

In [74]:
testing_data = ctable(rootdir='testing_data.bcolz')
X_test = testing_data[training_data.cols.names[3:]]
y_test = testing_data['faultNumber']
total_length = int(np.ceil(X_test.shape[0] / batch_size))

In [75]:
y_test[1:2].view(y_test.dtype)

array([0])

In [104]:
def sequential_batch_generator(X: ctable, buffer_size: int, batch_size: int, y: ctable=None):
    for buffer_start in range(0, X.shape[0], buffer_size):
        buffer_end = min(X.shape[0], buffer_start + buffer_size)
        buffer = X[buffer_start:buffer_end]  # X already holds only the feature columns.
        buffer = buffer.view(np.float64).reshape(buffer.shape + (-1,))
        
        if y is not None:
            y_buffer = y[buffer_start:buffer_end]
            y_buffer = y_buffer.view(y_buffer.dtype)
        for start in range(0, buffer.shape[0], batch_size):
            end = min(start + batch_size, buffer.shape[0])
            batch = buffer[start:end, :]
            if y is not None:
                y_batch = y_buffer[start:end]
                yield batch, y_batch
            else:
                yield batch

In [77]:
accuracy_count = 0
confusion_matrix_counts = np.zeros((2, 2))
for x_batch, y_batch in tqdm(sequential_batch_generator(X_test, buffer_size, batch_size, y_test), total=total_length):
    y_batch[y_batch > 0] = 1
    
    # Subtract the mean and divide by the standard deviation for each feature (standardizing the data).
    x_batch_transformed = scaler.transform(x_batch)

    # Transform the data using PCA.
#     x_batch_transformed = ipca.transform(x_batch_transformed)

    y_pred = clf.predict(x_batch_transformed)
    accuracy_count += accuracy_score(y_batch, y_pred, normalize=False)
    confusion_matrix_counts += confusion_matrix(y_batch, y_pred, labels=[0, 1], normalize=None)
    
accuracy = accuracy_count / X_test.size

100%|████████████████████████████████████████████████████████████████████████████████| 154/154 [01:01<00:00,  2.52it/s]


In [78]:
accuracy

0.9521167658730159

In [79]:
confusion_matrix_counts

array([[0.000000e+00, 4.800000e+05],
       [2.663000e+03, 9.597337e+06]])
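
The headline accuracy hides how the errors are distributed, so it helps to break the confusion matrix down. The sketch below assumes scikit-learn's convention that rows are true labels and columns are predicted labels, in the order given by `labels=[0, 1]`, and uses only the counts printed above.

```python
import numpy as np

# Counts copied from the confusion matrix above (rows: true class, columns: predicted class).
cm = np.array([[0.0, 480_000.0],
               [2_663.0, 9_597_337.0]])
tn, fp = cm[0]  # True class 0 (no fault).
fn, tp = cm[1]  # True class 1 (fault).

accuracy = (tp + tn) / cm.sum()
recall = tp / (tp + fn)        # Fraction of fault samples flagged as faults.
specificity = tn / (tn + fp)   # Fraction of fault-free samples recognized as fault-free.
balanced_accuracy = (recall + specificity) / 2
print(accuracy, recall, specificity, balanced_accuracy)
```

With these counts the specificity is zero: every fault-free test sample is predicted as a fault. The 95% accuracy therefore mostly reflects the class imbalance in the test set, and the balanced accuracy comes out just under 0.5.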

In [10]:
import tensorflow as tf

In [16]:
# model = tf.keras.Sequential(
#     [
#         tf.keras.layers.Dense(64, activation='tanh', input_shape=(52,)),
#         tf.keras.layers.Dense(64, activation='tanh'),
#         tf.keras.layers.Dense(64, activation='tanh'),
#         tf.keras.layers.Dense(21, activation='tanh'),
#         tf.keras.layers.Softmax()
#     ]
# )

model = tf.keras.Sequential(
    [
        tf.keras.layers.LSTM(256, return_sequences=True),
#         tf.keras.layers.LSTM(128, return_sequences=False),
        tf.keras.layers.Dense(300),
#         tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(21, activation='softmax')
    ]
)

In [17]:
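model.compile(loss=tf.losses.CategoricalCrossentropy(),
    optimizer=tf.optimizers.Adam(),
    # CategoricalAccuracy compares the argmax of the one-hot labels and the predictions;
    # plain Accuracy would test the raw values for exact equality.
    metrics=[tf.metrics.CategoricalAccuracy(), tf.metrics.MeanSquaredError()])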

In [18]:
model.summary()

ValueError: This model has not yet been built. Build the model first by calling `build()` or calling `fit()` with some data, or specify an `input_shape` argument in the first layer(s) for automatic build.

In [19]:
def keras_random_batch_generator(X: ctable, buffer_size: int, batch_size: int, y: ctable=None):
    for x_batch, y_batch in random_batch_generator(X, buffer_size, batch_size, y):
#         print(x_batch.shape)
#         print(y_batch.shape)
#         y_batch[y_batch > 0] = 1
#         print(y_batch.min(), y_batch.max())
#         x_batch_transformed = scaler.transform(x_batch)
#         y_batch_transformed = encoder.transform(y_batch.astype(np.int32).reshape(-1,1))
        y_batch_transformed = tf.keras.utils.to_categorical(y_batch, num_classes=21)
        yield x_batch.astype(np.float32).reshape(-1, 52, 1), y_batch_transformed
        
def keras_sequential_batch_generator(X: ctable, buffer_size: int, batch_size: int, y: ctable=None):
    for x_batch, y_batch in sequential_batch_generator(X, buffer_size, batch_size, y):
#         y_batch[y_batch > 0] = 1
        x_batch_transformed = scaler.transform(x_batch)
        yield x_batch_transformed, y_batch.astype(np.int32)

In [20]:
batch_size = 64
buffer_size = 256 * 1024
for epoch in range(10):
    model.fit(keras_random_batch_generator(X, buffer_size, batch_size, y), epochs=1)

NotImplementedError: in user code:

    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\engine\training.py:806 train_function  *
        return step_function(self, iterator)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\engine\training.py:796 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1211 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2585 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2945 _call_for_each_replica
        return fn(*args, **kwargs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\engine\training.py:789 run_step  **
        outputs = model.train_step(data)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\engine\training.py:747 train_step
        y_pred = self(x, training=True)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:985 __call__
        outputs = call_fn(inputs, *args, **kwargs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\engine\sequential.py:386 call
        outputs = layer(inputs, **kwargs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\layers\recurrent.py:663 __call__
        return super(RNN, self).__call__(inputs, **kwargs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:985 __call__
        outputs = call_fn(inputs, *args, **kwargs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\layers\recurrent_v2.py:1108 call
        inputs, initial_state, _ = self._process_inputs(inputs, initial_state, None)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\layers\recurrent.py:862 _process_inputs
        initial_state = self.get_initial_state(inputs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\layers\recurrent.py:645 get_initial_state
        init_state = get_initial_state_fn(
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\layers\recurrent.py:2523 get_initial_state
        return list(_generate_zero_filled_state_for_cell(
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\layers\recurrent.py:2968 _generate_zero_filled_state_for_cell
        return _generate_zero_filled_state(batch_size, cell.state_size, dtype)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\layers\recurrent.py:2984 _generate_zero_filled_state
        return nest.map_structure(create_zeros, state_size)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\util\nest.py:635 map_structure
        structure[0], [func(*x) for x in entries],
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\util\nest.py:635 <listcomp>
        structure[0], [func(*x) for x in entries],
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\keras\layers\recurrent.py:2981 create_zeros
        return array_ops.zeros(init_state_size, dtype=dtype)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\util\dispatch.py:201 wrapper
        return target(*args, **kwargs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\ops\array_ops.py:2747 wrapped
        tensor = fun(*args, **kwargs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\ops\array_ops.py:2794 zeros
        output = _constant_if_small(zero, shape, dtype, name)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\ops\array_ops.py:2732 _constant_if_small
        if np.prod(shape) < 1000:
    <__array_function__ internals>:5 prod
        
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\numpy\core\fromnumeric.py:3030 prod
        return _wrapreduction(a, np.multiply, 'prod', axis, dtype, out,
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\numpy\core\fromnumeric.py:87 _wrapreduction
        return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
    C:\Users\williams\Anaconda3\envs\ml\lib\site-packages\tensorflow\python\framework\ops.py:845 __array__
        raise NotImplementedError(

    NotImplementedError: Cannot convert a symbolic Tensor (sequential_5/lstm_8/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported


In [251]:
for x_batch, y_batch in keras_random_batch_generator(X, buffer_size, batch_size, y):
    y_pred = model.predict(x_batch)
    print(y_batch)
    print(y_pred)

[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 1. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
[[0.04761905 0.04761905 0.04761905 ... 0.04761905 0.04761905 0.04761905]
 [0.04761905 0.04761905 0.04761905 ... 0.04761905 0.04761905 0.04761905]
 [0.04761905 0.04761905 0.04761905 ... 0.04761905 0.04761905 0.04761905]
 ...
 [0.00963884 0.0449955  0.04445286 ... 0.06350037 0.06350037 0.04530375]
 [0.01107437 0.05169675 0.05107329 ... 0.05146843 0.05213922 0.00987374]
 [0.0104119  0.04860424 0.04801808 ... 0.04838959 0.04902025 0.04893722]]
[[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 1.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
[[0.04761905 0.04761905 0.04761905 ... 0.04761905 0.04761905 0.04761905]
 [0.04761905 0.04761905 0.04761905 ... 0.04761905 0.04761905 0.04761905]
 [0.04761905 0.04761905 0.04761905 ... 0.04761905 0.04761905 0.04761905]
 ...
 [0.00727222 0.04966

KeyboardInterrupt: 

In [145]:
accuracy_count = 0
confusion_matrix_counts = np.zeros((2, 2))
for x_batch, y_batch in tqdm(sequential_batch_generator(X_test, buffer_size, batch_size, y_test), total=total_length):
    y_batch[y_batch > 0] = 1
    
    # Subtract the mean and divide by the standard deviation for each feature (standardizing the data).
#     x_batch_transformed = scaler.transform(x_batch)

    # Transform the data using PCA.
#     x_batch_transformed = ipca.transform(x_batch_transformed)

    y_pred = model.predict(x_batch)
    print(y_pred.shape)
    accuracy_count += accuracy_score(y_batch, y_pred, normalize=False)
    confusion_matrix_counts += confusion_matrix(y_batch, y_pred, labels=[0, 1], normalize=None)
    
accuracy = accuracy_count / X_test.size

  0%|                                                                                          | 0/154 [00:01<?, ?it/s]

(65536, 1)





ValueError: Input contains NaN, infinity or a value too large for dtype('float32').