
Working with TF2.0 #16

Closed
andySigler opened this issue Jul 3, 2019 · 8 comments
Labels
enhancement New feature or request

Comments

@andySigler

Hi, and thanks for this MDN abstraction and the NIME paper. Below is more of a feature request than an issue.

Have you thought of making your MDN layer work with TF2.0?

I am relatively new to TF and am attempting to port your work to 2.0; you can see the changes I've made here. It seems to be working, except that I am unable to save the model using model.save(): I get the error Unable to create link (name already exists).

If you have any thoughts or suggestions, that would be greatly appreciated, as I'm pretty lost with that error ¯\_(ツ)_/¯

@andySigler
Author

I've created a separate branch that generates MDN layers functionally, and it is working with tf2.0 saving/loading.

@cpmpercussion
Owner

cpmpercussion commented Aug 11, 2019

Hey @andySigler - this looks great! I'm not quite ready to move to tf2.0 yet, so I haven't played around with it, but is everything working?

re model.save(): I don't have that working either (it was never properly implemented or tested).

Not sure that I want to move to tf2.0 until it's the stable release, but at that point I'll probably pull all your changes in and move forward from there.

@cpmpercussion cpmpercussion added the enhancement New feature or request label Aug 11, 2019
@cpmpercussion cpmpercussion pinned this issue Aug 11, 2019
@DeviLeo

DeviLeo commented Aug 22, 2019

I found that mdn works well on both tf 1.14 and tf 2.0 beta1 using tf.keras if the lines below are removed from the MDN.build() method.

# self.trainable_weights = self.mdn_mus.trainable_weights + self.mdn_sigmas.trainable_weights + self.mdn_pi.trainable_weights
# self.non_trainable_weights = self.mdn_mus.non_trainable_weights + self.mdn_sigmas.non_trainable_weights + self.mdn_pi.non_trainable_weights

In Keras, Layer.trainable_weights and Layer.non_trainable_weights are not updated and remain empty after mdn_mus/sigmas/pi.build().
But in tf.keras, they are updated automatically after mdn_mus/sigmas/pi.build().

In tf 2.0 beta1, RuntimeError: Unable to create link (name already exists) is raised when calling model.save().
I use get_weights() and set_weights() as a workaround.

import pickle

# save the weights only
with open('mdn_weights.pkl', 'wb') as f:
    weights = model.get_weights()
    pickle.dump(weights, f, pickle.HIGHEST_PROTOCOL)

# restore the weights into a freshly built model
with open('mdn_weights.pkl', 'rb') as f:
    weights = pickle.load(f)
    model.set_weights(weights)

@cpmpercussion
Owner

cpmpercussion commented Aug 23, 2019

Hi @DeviLeo, this tracks; so far model.save() is not properly implemented.

@DeviLeo

DeviLeo commented Sep 4, 2019

Hi @cpmpercussion, @andySigler,
To support both TensorFlow 2.0 and TensorFlow Serving, I replaced class MDN(Layer) ... with the functional API and saved the keras model successfully.
The keras model can also be exported in the saved_model format, which can be used with TensorFlow Serving.
See the functional API code below.

from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model
from mdn import elu_plus_one_plus_epsilon  # positive-valued activation for the sigmas

def mdn_network(x, output_dimension, num_mixtures):
    ''' Define the network of the mdn '''
    mdn_mus = Dense(num_mixtures * output_dimension, name='mdn_mus')  # mix*output vals, no activation
    mdn_sigmas = Dense(num_mixtures * output_dimension, activation=elu_plus_one_plus_epsilon, name='mdn_sigmas')  # mix*output vals, exp activation
    mdn_pi = Dense(num_mixtures, name='mdn_pi')  # mix vals, logits
    mdn_out = concatenate([mdn_mus(x), mdn_sigmas(x), mdn_pi(x)], name='mdn_outputs')
    return mdn_out

def MDN(input_shape, output_dimension, num_mixtures):
    """ `input_shape` should be the shape without the batch dimension. """
    mdn_input = Input(input_shape)
    mdn_output = mdn_network(mdn_input, output_dimension, num_mixtures)
    mdn_model = Model(mdn_input, mdn_output)
    return mdn_model

Test code below.

from tensorflow import keras
import mdn

def test_save_mdn():
    N_HIDDEN = 5
    N_MIXES = 5
    model = keras.Sequential()
    model.add(keras.layers.Dense(N_HIDDEN, batch_input_shape=(None, 1), activation='relu'))
    model.add(keras.layers.Dense(N_HIDDEN, activation='relu'))
    model.add(mdn.MDN(N_HIDDEN, 1, N_MIXES))
    model.compile(loss=mdn.get_mixture_loss_func(1, N_MIXES), optimizer=keras.optimizers.Adam())
    model.save('test_save.h5')
    model.save('test_saved_model', save_format='tf')

    custom_objects = {
        'MDN': mdn.MDN, 
        'elu_plus_one_plus_epsilon': mdn.elu_plus_one_plus_epsilon,
        'mdn_loss_func': mdn.get_mixture_loss_func(1, N_MIXES)
    }
    m_2 = keras.models.load_model('test_save.h5', custom_objects=custom_objects)
    assert isinstance(m_2, keras.Sequential)

if __name__ == "__main__":
    test_save_mdn()

Notice
The first parameter input_shape of mdn.MDN should be the shape without the batch dimension.
For example

import mdn
from tensorflow import keras

output_dimension, num_mixes = 1, 5
input = keras.layers.Input((4, 100))  # input.shape is (None, 4, 100)
output = mdn.MDN(input.shape[1:], output_dimension, num_mixes)(input)  # input_shape should be (4, 100)

model = keras.Model(input, output)
model.compile(loss=mdn.get_mixture_loss_func(output_dimension, num_mixes), optimizer=keras.optimizers.Adam())
model.save('test_mdn.h5')
model.save('test_mdn_saved_model', save_format='tf')

@cpmpercussion
Owner

@DeviLeo, you can certainly do it that way, and it's great if that approach works for you! I guess I'm focused on using the Layer API (hence the name of this repo), so I will try to keep things within a Layer subclass.

@cpmpercussion
Owner

cpmpercussion commented Sep 5, 2019

Just for those in this thread: I have a branch called tfkeras (456fd53) where I'm exploring the differences with the tensorflow keras API. My goal is to keep as much compatibility as possible between the main code and the tf.keras version. In that spirit, I've just moved the trainable_weights calculation to a property, which works for both tf.keras (where that property is read-only, I guess) and regular keras (where it isn't). Note, I'm still just testing with TF1.14.0.
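The property approach described above can be sketched without TensorFlow at all. The sketch below is a framework-agnostic mock (the names MockDense and MDNLayer are illustrative, not the library's actual classes): instead of assigning to self.trainable_weights in build(), which fails where the base class exposes it as a read-only property, the layer computes it on access by aggregating its sub-layers' weights.

```python
# Framework-agnostic sketch of the trainable_weights-as-property pattern.
# MockDense and MDNLayer are hypothetical stand-ins for the real Keras
# classes; only the aggregation pattern itself is being illustrated.

class MockDense:
    """Stand-in for keras.layers.Dense: holds one list of weight names."""
    def __init__(self, name):
        self.trainable_weights = [f"{name}/kernel", f"{name}/bias"]

class MDNLayer:
    """Stand-in for an MDN layer built from three Dense sub-layers."""
    def __init__(self):
        self.mdn_mus = MockDense("mdn_mus")
        self.mdn_sigmas = MockDense("mdn_sigmas")
        self.mdn_pi = MockDense("mdn_pi")

    @property
    def trainable_weights(self):
        # Computed on every access, so nothing is ever assigned to the
        # attribute; this works whether or not the base class makes
        # trainable_weights read-only.
        return (self.mdn_mus.trainable_weights
                + self.mdn_sigmas.trainable_weights
                + self.mdn_pi.trainable_weights)

layer = MDNLayer()
print(len(layer.trainable_weights))  # 6: kernel and bias from each sub-layer
```

In the real layer the same property would return the sub-layers' actual weight tensors, so both keras and tf.keras see the full weight list without build() mutating anything.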

@cpmpercussion
Owner

I've now merged the tfkeras branch (c2c3839) so this library is now compatible with tf2.0.
