
WIP Neuraxle refactor #16

Closed


@alexbrillant (Contributor) commented Jan 8, 2020

  • Use Neuraxle's TensorflowV1ModelStep
  • Use a dynamic RNN to predict sequences of dynamic length
  • Clean TensorFlow code & placeholders

TODO: comments & docstrings...
Note: it wouldn't take long to migrate this version to TensorFlow 2!

How to use TrainingHelper and InferenceHelper together:

https://stackoverflow.com/questions/49134432/how-to-use-tensorflow-seq2seq-without-embeddings
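
For reference, here is a minimal sketch of the pattern from that StackOverflow answer (TF 1.x, continuous outputs without embeddings; decoder_inputs, go_tokens, decoder_cell, encoder_state and the dimension variables are illustrative names, not code from this PR):

import tensorflow as tf

# Training: feed the ground-truth sequence, shifted right behind a GO token.
training_helper = tf.contrib.seq2seq.TrainingHelper(
    inputs=decoder_inputs,               # (batch_size, seq_length, output_dim)
    sequence_length=output_seq_lengths)  # (batch_size,)

# Inference: feed each predicted output back in as the next input.
inference_helper = tf.contrib.seq2seq.InferenceHelper(
    sample_fn=lambda outputs: outputs,   # continuous values, not token ids
    sample_shape=[output_dim],
    sample_dtype=tf.float32,
    start_inputs=go_tokens,              # (batch_size, output_dim)
    end_fn=lambda sample_ids: tf.fill([batch_size], False))  # fixed-length decoding

# The same cell and initial state are reused with either helper:
helper = training_helper  # or inference_helper at prediction time
decoder = tf.contrib.seq2seq.BasicDecoder(decoder_cell, helper, encoder_state)
outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
    decoder, maximum_iterations=output_seq_len)
predictions = outputs.rnn_output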

alexbrillant changed the title from "Neuraxle refactor (model.py only other changes were not wanted...)" to "WIP Neuraxle refactor (model.py only other changes were not wanted...)" on Jan 8, 2020
alexbrillant changed the title from "WIP Neuraxle refactor (model.py only other changes were not wanted...)" to "WIP Neuraxle refactor" on Jan 8, 2020
@guillaume-chevalier (Owner) commented:

Here is what I think we could do with TF2:

last_encoder_state, encoder_outputs = apply_encoder(encoder, tf_data_inputs)
last_encoder_output = encoder_outputs[:, -1, :]  # shape: (batch_size, hidden_dim)

# Stack GRU cells and step them manually, one output time step at a time.
# hidden_dim and num_stacked_layers would be hyperparameters.
stacked_cells = tf.keras.layers.StackedRNNCells(
    [tf.keras.layers.GRUCell(hidden_dim) for _ in range(num_stacked_layers)])

decoder_outputs = []
state = last_encoder_state
for time_step in range(self.output_seq_len):
    # Repeat the last encoder output as the input at every decoder time step
    # (cheap, but very okay to do this here).
    out, state = stacked_cells(last_encoder_output, state)
    decoder_outputs.append(out)

return decoder_outputs

This is some dirty pseudocode I've just written. For instance, the apply_encoder function is factored out in the code above, as it doesn't seem to be the problem we're focusing on. I think the decoder could be extracted into a function the same way.

@guillaume-chevalier (Owner) left a comment:

First code review pass

name='data_inputs'
)

# shape: (batch_size, seq_length, input_dim)


Suggested change:
- # shape: (batch_size, seq_length, input_dim)
+ # shape: (batch_size, seq_length, output_dim)

Comment on lines +30 to +31
# shape: (batch_size)
target_sequence_length = tf.placeholder(dtype=tf.int32, shape=[None], name='expected_outputs_length')


Wrong dimension comment on an unused variable. Delete these 2 lines?

    return decoder_outputs_training, decoder_outputs_inference


def create_encoder(step, data_inputs):


Dimension comments for the inputs and outputs will be important in these small functions.



def create_inference_decoder(step: TensorflowV1ModelStep, encoder_state, decoder_cell):
    start_inputs = tf.constant(GO_TOKEN, shape=[step.hyperparams['batch_size'], step.hyperparams['output_dim']])


One strange thing I noticed: this start_inputs is 2D here, whereas it's 3D in create_training_decoder. Might be something to look at.


But I'd delete the create_inference_decoder function anyway, and instead use the other one for both training and testing.


def create_training_decoder(step: TensorflowV1ModelStep, encoder_state, decoder_cell):
    go_tokens = tf.constant(GO_TOKEN, shape=[step.hyperparams['batch_size'], 1, step.hyperparams['output_dim']])
    inputs = tf.concat([go_tokens, step['expected_outputs']], axis=1)


The error might be here: I would use GO tokens everywhere. Or even better: I would use the encoder_outputs at the last time step, which is probably encoder_outputs[:, -1, :], as the GO token everywhere.

This way, we would have a decoder that is the same at inference (prediction) time AND at training time. We'd only need one function here. That might be the problem you were facing, explaining why things didn't work; see the sketch below.

Another thing that could explain why things don't work is that you may not have shared the parameters between the two decoders, although it looks like you've properly shared them with the create_stacked_rnn method. So perhaps you have two decoders and one encoder by mistake (although I think you're okay).
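
A minimal sketch of that single-decoder idea (hypothetical; it assumes the TF 1.x seq2seq helpers already used in this PR, that the encoder's hidden dimension matches the decoder's input dimension, and that an 'output_seq_len' hyperparameter exists):

def create_decoder(step, encoder_state, encoder_outputs, decoder_cell):
    # Reuse the last encoder output as the GO token, so the very same
    # decoder graph serves both training and inference.
    go_token = encoder_outputs[:, -1, :]  # (batch_size, output_dim)
    helper = tf.contrib.seq2seq.InferenceHelper(
        sample_fn=lambda outputs: outputs,
        sample_shape=[step.hyperparams['output_dim']],
        sample_dtype=tf.float32,
        start_inputs=go_token,
        end_fn=lambda sample_ids: tf.fill([step.hyperparams['batch_size']], False))
    decoder = tf.contrib.seq2seq.BasicDecoder(decoder_cell, helper, encoder_state)
    outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(
        decoder, maximum_iterations=step.hyperparams['output_seq_len'])
    return outputs.rnn_output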


decoder_cell = create_stacked_rnn(step)
decoder_outputs_training = create_training_decoder(step, encoder_state, decoder_cell)
decoder_outputs_inference = create_inference_decoder(step, encoder_state, decoder_cell)


Let's keep only one decoder here. See other comments.


To be more precise, I would edit something in the training decoder and delete the inference one that has the InferenceHelper, for instance (that's too complicated to explain in a short training session for people to use it, and what I suggest will work fine).

    return output


def create_decoder_outputs(step, helper, encoder_state, decoder_cell):


I didn't review this function, but it looks overly complicated.


Could we base ourselves as much as possible on my original seq2seq example here, without many changes (if that is at all possible)?


Or see the TF2 example I've posted above in this PR's conversation.



def create_encoder(step, data_inputs):
    encoder_cell = create_stacked_rnn(step)


The create_stacked_rnn call was made outside the function in the decoder. When we fix the decoder to be used only once, it'll be pushed into it.

@guillaume-chevalier (Owner) left a comment:

2nd code review pass

    create_graph=create_graph,
    create_loss=create_loss,
    create_optimizer=create_optimizer,
    create_feed_dict=create_feed_dict


What does this do? I need to understand everything.
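
A hedged reading of how these callbacks are presumably wired (assuming the neuraxle-tensorflow TensorflowV1ModelStep API of the time; the per-argument comments are guesses, not documentation):

model_step = TensorflowV1ModelStep(
    create_graph=create_graph,          # builds the placeholders and the seq2seq network
    create_loss=create_loss,            # builds the loss tensor from the graph's outputs
    create_optimizer=create_optimizer,  # returns the tf.train optimizer minimizing that loss
    create_feed_dict=create_feed_dict,  # maps (data_inputs, expected_outputs) onto placeholders
)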

if exercise == 3:
    generate_x_y_data = generate_x_y_data_v3
if exercise == 4:
    generate_x_y_data = generate_x_y_data_v4


Let's keep the 4 examples. Good.


pipeline = DeepLearningPipeline(
    SignalPredictionPipeline(),
    validation_size=0.15,


There was no validation in the original example. This seems like a problem here (and in the LSTM example, too).

    scoring_function=to_numpy_metric_wrapper(mean_squared_error)
)

data_inputs, expected_outputs = generate_x_y_data(isTrain=True, batch_size=SignalPredictionPipeline.BATCH_SIZE)


I'm not sure what happens here, so the following may be wrong, but I think I spotted a problem:

I think your problem is that I was calling the generate_x_y_data function at each epoch. You'd probably need a train and a test data generator class here that lazily loads each batch for the deep learning minibatching pipeline; see the sketch after this comment.

You'd also need to split the data on the time axis if using cross-validation.

That way, users could optimize on the validation data with a manual search and manual parameter tuning (no random search), and then call a function to test on the test set or something. So we probably need to change the generate_x_y_data method.
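
A minimal sketch of such a lazy generator, reusing the existing generate_x_y_data function (the generator's name and its n_batches parameter are hypothetical):

def lazy_minibatch_generator(batch_size, n_batches):
    # Yield freshly generated (x, y) batches one at a time,
    # instead of materializing the whole training set up front.
    for _ in range(n_batches):
        yield generate_x_y_data(isTrain=True, batch_size=batch_size)

# Usage sketch:
for data_inputs, expected_outputs in lazy_minibatch_generator(batch_size=64, n_batches=100):
    ...  # fit on one minibatch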

@alexbrillant (Contributor, Author) replied:

The cross-validation is done in the DeepLearningPipeline from Neuraxle.

@alexbrillant (Contributor, Author) replied:

Oh wait, you don't have all my changes. I actually have a loop that calls generate_x_y_data a few times :/

@alexbrillant (Contributor, Author) replied:

Let's change generate_x_y_data_v4, it was confusing.

@alexbrillant (Contributor, Author) replied:

Also, the normalization and the lazy data loading should be done in Neuraxle.

mse_train = pipeline.get_epoch_metric_train('mse')
mse_validation = pipeline.get_epoch_metric_validation('mse')

plot_metric(mse_train, mse_validation, xlabel='epoch', ylabel='mse', title='Model Mean Squared Error')


Could you please upload a preview of that plot/chart in your PR, so I can inspect this without running the code locally yet?
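
For context, a hedged sketch of what plot_metric presumably does, assuming matplotlib (a guess at the helper, not code from this PR):

import matplotlib.pyplot as plt

def plot_metric(train_values, validation_values, xlabel, ylabel, title):
    # Plot per-epoch train vs. validation curves on one chart.
    plt.plot(train_values, label='train')
    plt.plot(validation_values, label='validation')
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.title(title)
    plt.legend()
    plt.show()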

@@ -0,0 +1,3 @@
tensorflow==1.14
-e git://github.com/alexbrillant/Neuraxle.git@a270fe2b2f73c9350d76fcf4b6f058b764a8c8f7#egg=neuraxle
requests

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexbrillant Let's change requests to urllib and stop importing requests. See how I use urllib here:
https://www.neuraxle.org/stable/rest_api_serving.html#API-Call-Example
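
For illustration, a minimal sketch of the same kind of call with the standard library's urllib instead of requests (the endpoint URL and payload are placeholders):

import json
import urllib.request

req = urllib.request.Request(
    'http://127.0.0.1:5000/predict',  # hypothetical endpoint
    data=json.dumps({'data_inputs': [[0.0, 1.0]]}).encode('utf-8'),
    headers={'Content-Type': 'application/json'})
with urllib.request.urlopen(req) as response:
    predictions = json.loads(response.read())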
