WIP Neuraxle refactor #16
Conversation
Here is what I think we could do with TF2:
This is some dirty pseudocode I've just written. For instance, the …
First code review pass
```python
    name='data_inputs'
)

# shape: (batch_size, seq_length, input_dim)
```
Suggested change:
```diff
-# shape: (batch_size, seq_length, input_dim)
+# shape: (batch_size, seq_length, output_dim)
```
```python
# shape: (batch_size)
target_sequence_length = tf.placeholder(dtype=tf.int32, shape=[None], name='expected_outputs_length')
```
Wrong dimension comment on an unused variable. Delete those 2 lines?
```python
    return decoder_outputs_training, decoder_outputs_inference


def create_encoder(step, data_inputs):
```
Dimension comments for the inputs and outputs will be important in these small functions.
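To illustrate the dimension-comment convention being asked for, here is a minimal numpy stand-in (not the real TF encoder; the matrix projection is a hypothetical placeholder for the stacked RNN cell):

```python
import numpy as np

def create_encoder_sketch(data_inputs, hidden_dim):
    # data_inputs shape: (batch_size, seq_length, input_dim)
    batch_size, seq_length, input_dim = data_inputs.shape
    projection = np.ones((input_dim, hidden_dim))  # placeholder for the stacked RNN
    encoder_outputs = data_inputs @ projection
    # encoder_outputs shape: (batch_size, seq_length, hidden_dim)
    encoder_state = encoder_outputs[:, -1, :]
    # encoder_state shape: (batch_size, hidden_dim)
    return encoder_outputs, encoder_state
```

Every intermediate tensor gets a one-line shape comment, which is what makes these small functions reviewable at a glance.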
```python
def create_inference_decoder(step: TensorflowV1ModelStep, encoder_state, decoder_cell):
    start_inputs = tf.constant(GO_TOKEN, shape=[step.hyperparams['batch_size'], step.hyperparams['output_dim']])
```
One strange thing I noticed here: this "start_inputs" is 2D, whereas it's 3D in create_training_decoder. Might be something to look at.
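In numpy terms, the shape mismatch being pointed out looks like this (a sketch for illustration only; the GO_TOKEN value is assumed, and these are not the actual TF tensors):

```python
import numpy as np

batch_size, output_dim = 4, 2
GO_TOKEN = 0.0  # assumed value, for illustration only

# inference decoder's start_inputs: 2D, shape (batch_size, output_dim)
start_inputs_2d = np.full((batch_size, output_dim), GO_TOKEN)

# training decoder's go_tokens: 3D, shape (batch_size, 1, output_dim),
# so they can be concatenated with expected_outputs along the time axis
go_tokens_3d = np.full((batch_size, 1, output_dim), GO_TOKEN)

# the two differ by exactly one time axis:
consistent = np.expand_dims(start_inputs_2d, axis=1)
```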
But I'd delete the create_inference_decoder function anyway, and instead use the other one for both training and testing.
```python
def create_training_decoder(step: TensorflowV1ModelStep, encoder_state, decoder_cell):
    go_tokens = tf.constant(GO_TOKEN, shape=[step.hyperparams['batch_size'], 1, step.hyperparams['output_dim']])
    inputs = tf.concat([go_tokens, step['expected_outputs']], axis=1)
```
The error might be here: I would use GO tokens everywhere. Or, even better: I would use the encoder_outputs at the last time step, which is probably encoder_outputs[:, -1, :], as a GO token everywhere. This way, we would have a decoder that is the same at inference (prediction) time and at training time, and we would only need one function here. That might be the problem you were facing, which would explain why things didn't work.
Another thing that could explain why things don't work: you may not have shared the parameters between the two decoders, even though it looks like you've properly shared them with the create_stacked_rnn method. So perhaps you have two decoders and one encoder by mistake (although I think you're okay).
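The "one decoder for both training and inference" idea can be sketched framework-agnostically with numpy. This is a hedged sketch of the control flow only (the helper names unified_decoder_sketch and decoder_step_fn are hypothetical, not the PR's actual functions):

```python
import numpy as np

def unified_decoder_sketch(encoder_outputs, decoder_step_fn, output_seq_length):
    """One decoder for both training and inference: instead of a constant
    GO token, the first decoder input is the encoder output at the last
    time step, encoder_outputs[:, -1, :]."""
    # encoder_outputs shape: (batch_size, seq_length, dim)
    current_input = encoder_outputs[:, -1, :]  # shape: (batch_size, dim)
    outputs = []
    for _ in range(output_seq_length):
        # autoregressive step: each output is fed back as the next input
        current_input = decoder_step_fn(current_input)
        outputs.append(current_input)
    # stacked shape: (batch_size, output_seq_length, dim)
    return np.stack(outputs, axis=1)
```

Because the same function runs at training and inference time, the parameter-sharing question disappears by construction.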
```python
decoder_cell = create_stacked_rnn(step)
decoder_outputs_training = create_training_decoder(step, encoder_state, decoder_cell)
decoder_outputs_inference = create_inference_decoder(step, encoder_state, decoder_cell)
```
Let's keep only one decoder here. See other comments.
To be more precise, I would edit something in the training decoder and delete the inference one that has the InferenceHelper, for instance (that's too complicated to explain in a short training for people to use it, and what I suggest will work fine).
```python
    return output


def create_decoder_outputs(step, helper, encoder_state, decoder_cell):
```
I didn't review this function, but it looks overly complicated.
Could we stay as close as possible to my original seq2seq example here, without many changes (if that's possible at all)?
Or see the TF2 example I've commented above this PR's review.
```python
def create_encoder(step, data_inputs):
    encoder_cell = create_stacked_rnn(step)
```
The create_stacked_rnn was called outside the function in the decoder. When we fix the decoder so that it's used only once, this call will be pushed into it.
2nd code review pass
```python
    create_graph=create_graph,
    create_loss=create_loss,
    create_optimizer=create_optimizer,
    create_feed_dict=create_feed_dict
```
What does this do? I need to understand everything.
```python
if exercise == 3:
    generate_x_y_data = generate_x_y_data_v3
if exercise == 4:
    generate_x_y_data = generate_x_y_data_v4
```
Let's keep the 4 examples. Good.
```python
pipeline = DeepLearningPipeline(
    SignalPredictionPipeline(),
    validation_size=0.15,
```
There was no validation in the original example. This seems like a problem here (and in the LSTM example, too).
```python
    scoring_function=to_numpy_metric_wrapper(mean_squared_error)
)

data_inputs, expected_outputs = generate_x_y_data(isTrain=True, batch_size=SignalPredictionPipeline.BATCH_SIZE)
```
I'm not sure what happens here, so the following may be wrong, but I think I spotted a problem: I think your issue is that I was calling the generate_x_y_data function at each epoch. You'd probably need a train and a test data generator class here that lazy-loads each batch for the DL minibatching pipeline. You'd also need to split the data on the time axis like this here if using cross validation:
That way, users could optimize on the validation data with a manual search and manual parameter tuning (no random search), and then call a function to test on the test set, or something. So we probably need to change the generate_x_y_data method.
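A train/test generator that lazy-loads batches could look roughly like this. This is a hypothetical sketch only: the class name LazyDataGenerator and its constructor are assumptions, not the actual Neuraxle API, and the data-generating function is passed in rather than hard-coded:

```python
import numpy as np

class LazyDataGenerator:
    """Hypothetical sketch: yields one (data_inputs, expected_outputs)
    batch at a time, instead of regenerating the whole dataset each epoch."""

    def __init__(self, generate_fn, batch_size, n_batches):
        self.generate_fn = generate_fn
        self.batch_size = batch_size
        self.n_batches = n_batches

    def __iter__(self):
        for _ in range(self.n_batches):
            # each batch is produced only when the pipeline asks for it (lazy)
            yield self.generate_fn(batch_size=self.batch_size)
```

One generator instance would wrap the train split and another the test split, keeping the minibatching concern out of the model step itself.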
The cross validation is done in the deep learning pipeline from Neuraxle.
Oh wait, you don't have all my changes; I actually have a loop that calls generate_x_y_data a few times :/
Let's change generate_x_y_data_v4; it was confusing.
Also, the normalization and the lazy data loading should be done in Neuraxle.
```python
mse_train = pipeline.get_epoch_metric_train('mse')
mse_validation = pipeline.get_epoch_metric_validation('mse')

plot_metric(mse_train, mse_validation, xlabel='epoch', ylabel='mse', title='Model Mean Squared Error')
```
Could you please upload a preview of that plot / chart in your PR for me to inspect this without running the code locally yet?
```diff
@@ -0,0 +1,3 @@
+tensorflow==1.14
+-e git://github.com/alexbrillant/Neuraxle.git@a270fe2b2f73c9350d76fcf4b6f058b764a8c8f7#egg=neuraxle
+requests
```
@alexbrillant Let's change requests to urllib, and not import requests anymore. See how I use urllib here:
https://www.neuraxle.org/stable/rest_api_serving.html#API-Call-Example
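For reference, a standard-library-only JSON POST can be built like this (a minimal sketch; the helper name make_json_request and the localhost URL in the usage note are made up for illustration):

```python
import json
import urllib.request

def make_json_request(url, payload):
    # build a POST request with a JSON body using only the standard library,
    # replacing the dependency on the requests package
    data = json.dumps(payload).encode('utf-8')
    return urllib.request.Request(
        url, data=data, headers={'Content-Type': 'application/json'}
    )
```

Sending it would then be `json.loads(urllib.request.urlopen(req).read())`, so requests can be dropped from the requirements entirely.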
TODO: comments & docstrings...
Note: it wouldn't take long to migrate this version to TensorFlow 2!
How to use TrainingHelper and InferenceHelper together:
https://stackoverflow.com/questions/49134432/how-to-use-tensorflow-seq2seq-without-embeddings