Earlier, in a separate notebook, which you can find [here](https://www.kaggle.com/ariannsz/dirty-but-easy-pca-with-dask), we ran PCA and saved the results in a CSV file.
We are now going to use that file. However, we forgot to include the weight and resp columns in the CSV file, so we need to do a little bit of dirty work first.  
  
After that, we are going to create a NN using TensorFlow and Keras, feed it the 40 principal components along with the weights, and try to do a good regression or classification on resp!

So, hang on and let's see what we can do! 


In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import dask.dataframe as dd
import dask.array as da
import gc

import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers

In [None]:
df = dd.read_csv('../input/janestreet-pca/pca.csv')
df = df.drop(columns=['Unnamed: 0'])
df = df.to_dask_array()
df = df.compute()

In [None]:
from dask_ml.preprocessing import StandardScaler as ss

temp = dd.read_csv('../input/jane-street-market-prediction/train.csv')
weights = temp['weight']
weights = weights.to_dask_array()

weights = weights.compute()
weights = weights.reshape(-1,1)

scaler = ss()
weights= scaler.fit_transform(weights)



In [None]:
temp = dd.read_csv('../input/jane-street-market-prediction/train.csv')
target = temp['resp']
target = target.to_dask_array()
target = target.compute()
target = target.reshape(-1,1)

In [None]:
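# Keep only the sign of resp (+1 / -1). np.sign avoids the division by zero
# that target / |target| would hit on rows where resp is exactly 0.
target = np.sign(target).astype(int)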

In [None]:
target += 1
target = np.divide(target, 2).astype(int)
print(target[:5,0])

In [None]:
del(temp)
data = [weights, df]
data = da.concatenate(data, axis=1)
data = data.compute()
del(weights)
gc.collect()

In [None]:
print(data.shape)
print(target.shape)

# Classification First
Let's make it easier and treat this problem as a binary classification in the first step.
To do so, we map the resp values to {1, -1} based on their sign.
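
As a minimal sketch of that mapping, the toy array below (made-up values standing in for `resp`) is reduced to its sign and then shifted to 0/1 labels. Sending an exact zero to class 0 is an assumption of this sketch, not something the data prescribes.

```python
import numpy as np

# Toy stand-in for the `resp` column (values are made up for illustration).
resp = np.array([0.8, -0.3, 0.0, 1.2, -2.5])

# Step 1: keep only the sign, giving values in {-1, 0, 1}.
signs = np.sign(resp).astype(int)

# Step 2: shift and halve so that -1 -> 0 and +1 -> 1.
# Integer division also sends an exact zero to class 0 (arbitrary choice).
labels = (signs + 1) // 2

print(signs)
print(labels)
```

The two steps mirror the cells above: first the sign, then the shift from {-1, 1} to {0, 1}.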

# A couple of notes
* First, I had target values as +1, -1, and I was getting val_AUC = 0!  
After digging a little, I found the following statement in the Keras documentation ([link](https://keras.io/api/losses/probabilistic_losses/#binary_crossentropy-function)):

> *Use this cross-entropy loss when there are only two label classes (assumed to be 0 and 1). For each example, there should be a single floating-point value per prediction.*


So, I changed the target values to 0 and 1 and the problem was solved!
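
To see why the 0/1 assumption matters, here is a small sketch that evaluates the textbook binary cross-entropy formula by hand. It does not reproduce Keras internals; the helper `binary_crossentropy` and the probability `p` are hypothetical and only illustrate that the formula stops behaving like a loss when a label is -1.

```python
import math

def binary_crossentropy(y_true, p):
    """Per-sample binary cross-entropy for a predicted probability p."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

p = 0.1  # hypothetical predicted probability of the positive class

loss_01 = binary_crossentropy(0, p)    # negative class encoded as 0
loss_pm = binary_crossentropy(-1, p)   # same sample encoded as -1

print(f"label 0:  {loss_01:.4f}")
print(f"label -1: {loss_pm:.4f}")
```

With the label encoded as 0 the loss is a small positive number, as expected for a confident correct prediction. With the label encoded as -1 the same formula returns a negative value, which a cross-entropy should never do.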

* The other noteworthy feature is Dropout; it really helps with overfitting.  
I have not tried many different architectures, but among the ones I tried, Dropout proved helpful.
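
For readers who have not met Dropout before, the following is a rough NumPy sketch of what a dropout layer does during training: it zeroes a random fraction `rate` of its inputs and scales the survivors by `1 / (1 - rate)`. The function `dropout_train` is a hypothetical helper written for this note, not the Keras implementation.

```python
import numpy as np

def dropout_train(x, rate, rng):
    """Inverted dropout: zero a random fraction `rate` of x and rescale the rest."""
    keep_mask = rng.random(x.shape) >= rate  # True where a unit survives
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones((4, 5))  # toy activations, all equal to 1
dropped = dropout_train(activations, rate=0.25, rng=rng)

print(dropped)
```

Every entry of the output is either 0 (dropped) or 1 / 0.75 (kept and rescaled). At inference time the layer simply passes its input through unchanged.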

In [None]:
X_valid, X_train = data[:5000,:] , data[5000:, :]
y_valid, y_train = target[:5000], target[5000:]

In [None]:
model = keras.models.Sequential([
    layers.Input(shape=(data.shape[1],)),  # shape must be a tuple: one entry per feature axis
    layers.Dropout(0.35),
    layers.Dense(84, activation='relu'),
    layers.Dropout(0.25),
    layers.Dense(42, activation='relu'),
    layers.Dropout(0.15),
    layers.Dense(21, activation='relu'),
    layers.Dropout(0.15),
    layers.Dense(8, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])

In [None]:
model.compile(
    optimizer = keras.optimizers.Adam(learning_rate=1e-3),
    loss = tf.keras.losses.BinaryCrossentropy(),
    metrics = [tf.keras.metrics.AUC(name = 'AUC')],  # metrics are passed as a list
    )

es = keras.callbacks.EarlyStopping(monitor = 'val_AUC', min_delta = 1e-4, patience = 5, mode = 'max', 
                       baseline = None, restore_best_weights = True, verbose = 0)



In [None]:
history = model.fit(X_train, y_train, validation_data = (X_valid, y_valid), epochs = 2000, batch_size = 8197, callbacks = [es], verbose = 2)

In [None]:
import matplotlib.pyplot as plt
pd.DataFrame(history.history).plot(figsize=(8,5))
plt.grid(True)
plt.gca().set_ylim(0.4, 0.8) # zoom the vertical axis to [0.4, 0.8]
plt.show()