MNIST Example Keras Model Error #3755

Closed
blepfo opened this issue Mar 26, 2018 · 3 comments

blepfo commented Mar 26, 2018

This issue documents what I believe to be an error in the MNIST example code.

I was using the code in models/official/mnist/mnist.py as a template for developing a different model, and I ran into an error that I believe exists in the MNIST model code itself. The description below covers what happens when I execute code copy-pasted directly from mnist.py.


System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04.4 LTS in WSL
  • TensorFlow installed from (source or binary): Installed from Anaconda
  • TensorFlow version (use command below): 1.6.0
  • Exact command to reproduce: Copied directly from lines 32-88 of models/official/mnist/mnist.py, with the last line added to test instantiating the Model object.
import tensorflow as tf

class Model(tf.keras.Model):
  """Model to recognize digits in the MNIST dataset.
  Network structure is equivalent to:
  https://github.com/tensorflow/tensorflow/blob/r1.5/tensorflow/examples/tutorials/mnist/mnist_deep.py
  and
  https://github.com/tensorflow/models/blob/master/tutorials/image/mnist/convolutional.py
  But written as a tf.keras.Model using the tf.layers API.
  """

  def __init__(self, data_format):
    """Creates a model for classifying a hand-written digit.
    Args:
      data_format: Either 'channels_first' or 'channels_last'.
        'channels_first' is typically faster on GPUs while 'channels_last' is
        typically faster on CPUs. See
        https://www.tensorflow.org/performance/performance_guide#data_formats
    """
    super(Model, self).__init__()
    if data_format == 'channels_first':
      self._input_shape = [-1, 1, 28, 28]
    else:
      assert data_format == 'channels_last'
      self._input_shape = [-1, 28, 28, 1]

    self.conv1 = tf.layers.Conv2D(
        32, 5, padding='same', data_format=data_format, activation=tf.nn.relu)
    self.conv2 = tf.layers.Conv2D(
        64, 5, padding='same', data_format=data_format, activation=tf.nn.relu)
    self.fc1 = tf.layers.Dense(1024, activation=tf.nn.relu)
    self.fc2 = tf.layers.Dense(10)
    self.dropout = tf.layers.Dropout(0.4)
    self.max_pool2d = tf.layers.MaxPooling2D(
        (2, 2), (2, 2), padding='same', data_format=data_format)

  def __call__(self, inputs, training):
    """Add operations to classify a batch of input images.
    Args:
      inputs: A Tensor representing a batch of input images.
      training: A boolean. Set to True to add operations required only when
        training the classifier.
    Returns:
      A logits Tensor with shape [<batch_size>, 10].
    """
    y = tf.reshape(inputs, self._input_shape)
    y = self.conv1(y)
    y = self.max_pool2d(y)
    y = self.conv2(y)
    y = self.max_pool2d(y)
    y = tf.layers.flatten(y)
    y = self.fc1(y)
    y = self.dropout(y, training=training)
    return self.fc2(y)

test = Model('channels_first')

Describe the problem

The model architecture specification in mnist.py is encapsulated using a tf.keras.Model object. The model layers are specified in the __init__() function, and they are chained together to compute the model's output in the __call__() function. From what I can tell, the subclassing of tf.keras.Model is a misuse of the Keras Model class API, and this is not a valid way to construct a model in TensorFlow r1.6.

On line 52 of mnist.py, a call is made to super(Model, self).__init__(), which invokes the tf.keras.Model constructor. However, the Keras Model constructor requires two arguments -- inputs and outputs. Thus, when I try to instantiate the Model object, I get the following error:

TypeError: __init__() takes at least 3 arguments (1 given)
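The mechanism behind this error can be reproduced without TensorFlow. The sketch below uses a hypothetical stand-in class, FunctionalOnlyModel, whose constructor requires inputs and outputs; it is not the tf.keras.Model implementation, only an illustration of what happens when a subclass calls the base constructor with no arguments. The exact wording of the message differs between Python versions.

```python
class FunctionalOnlyModel(object):
    """Hypothetical stand-in for a base class that requires inputs and outputs."""

    def __init__(self, inputs, outputs):
        self.inputs = inputs
        self.outputs = outputs


class SubclassedModel(FunctionalOnlyModel):
    def __init__(self, data_format):
        # Mirrors the call in mnist.py: nothing is forwarded to the base
        # constructor, so its required arguments are missing.
        super(SubclassedModel, self).__init__()
        self.data_format = data_format


def try_instantiate():
    """Return the TypeError message raised on instantiation, or None."""
    try:
        SubclassedModel('channels_first')
    except TypeError as err:
        return str(err)
    return None


message = try_instantiate()
print(message)
```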

The workaround is not difficult -- one just has to add a tf.keras.layers.Input in the __init__() function, and move the contents of __call__() into __init__(). The call to super() should then be moved to the end of __init__(), to become

super(Model, self).__init__(inputs=inputs, outputs=outputs)

where inputs is the tf.keras.layers.Input, and outputs is the output of the layer representing the model output.
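A minimal sketch of the workaround described above follows. It makes several assumptions that are not in the original post: the class name FunctionalMnistModel is hypothetical, the input is taken to be a flat vector of 784 pixels, tf.keras.layers is used in place of tf.layers, and each pooling step gets its own layer object. It is written against tf.keras as found in later TensorFlow releases and has not been checked on 1.6.

```python
import tensorflow as tf


class FunctionalMnistModel(tf.keras.Model):
    """Sketch of the workaround: build the graph first, then call super()."""

    def __init__(self, data_format='channels_last'):
        if data_format == 'channels_first':
            image_shape = (1, 28, 28)
        else:
            assert data_format == 'channels_last'
            image_shape = (28, 28, 1)

        # Assumption: each image arrives as a flat vector of 784 pixels.
        inputs = tf.keras.layers.Input(shape=(784,))
        y = tf.keras.layers.Reshape(image_shape)(inputs)
        y = tf.keras.layers.Conv2D(
            32, 5, padding='same', data_format=data_format,
            activation='relu')(y)
        y = tf.keras.layers.MaxPooling2D(
            (2, 2), (2, 2), padding='same', data_format=data_format)(y)
        y = tf.keras.layers.Conv2D(
            64, 5, padding='same', data_format=data_format,
            activation='relu')(y)
        y = tf.keras.layers.MaxPooling2D(
            (2, 2), (2, 2), padding='same', data_format=data_format)(y)
        y = tf.keras.layers.Flatten()(y)
        y = tf.keras.layers.Dense(1024, activation='relu')(y)
        y = tf.keras.layers.Dropout(0.4)(y)
        outputs = tf.keras.layers.Dense(10)(y)

        # The base constructor is called last, once inputs and outputs exist.
        super(FunctionalMnistModel, self).__init__(
            inputs=inputs, outputs=outputs)


model = FunctionalMnistModel('channels_last')
```

The layers are kept in local variables rather than attributes because the base constructor has not run yet at the point where they are created.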

That said, as I mentioned, the Keras documentation suggests that Model objects should be instantiated using Model(inputs, outputs) rather than trying to subclass tf.keras.Model as is done in mnist.py (Model class API, Getting started with the Keras functional API).

Is my assessment here correct, or am I missing something in the MNIST example code? I'm using TensorFlow version 1.6.0 with Python 2.7.13 (using Anaconda).

@fchollet
Member

> However, the Keras Model constructor requires two arguments -- inputs and outputs.

The API you are trying to use is only available as of TF 1.7. As of TF 1.6, the Keras Model constructor does not allow model subclassing.

Please upgrade to the latest TF 1.7 release candidate to fix your issue.

> That said, as I mentioned, the Keras documentation suggests that Model objects should be instantiated using Model(inputs, outputs) rather than trying to subclass tf.keras.Model

That has been the convention so far, but with the introduction of eager execution, we are adding a new way to use the Keras Model API: model subclassing (as seen in all eager execution code examples). This new API will become part of the Keras API spec in the future, and you will be able to use it with multi-backend Keras too (not just tf.keras).

Note that the model subclassing API is available whether or not eager execution is enabled. You can build Models via subclassing even when manipulating symbolic tensors. In general all Model APIs work the same way with or without eager execution.

Also note that this new API does not replace any existing API (e.g. Model(inputs, outputs) or Sequential), rather it is an addition. We only recommend this API to fairly advanced users who are in need of maximum flexibility (which comes at the cost of slightly more verbosity and less hand-holding).

@asimshankar
Contributor

To add to what fchollet said: The branches in this models repository are compatible with the corresponding branch of the github.com/tensorflow/tensorflow repository.

So, if you're looking at the "master" branch here, then you'll need to build TensorFlow from source of the "master" branch. And if you're looking for official samples compatible with a specific version of TensorFlow, look at the corresponding branch (e.g., r1.6, r1.7, etc.) of this repository (see also #3491).
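The branch-matching rule can be sketched as a small helper. The function name matching_models_branch is hypothetical, and the sketch assumes that release branches follow the rMAJOR.MINOR pattern shown in the examples above and that development builds carry "dev" in their version string; whether a branch exists for any given release is not checked.

```python
def matching_models_branch(tf_version):
    """Guess the models-repository branch for a TensorFlow version string.

    Hypothetical helper: maps e.g. '1.6.0' to 'r1.6' and falls back to
    'master' for development builds or unparseable versions.
    """
    if 'dev' in tf_version:
        # Assumption: development builds track the master branch.
        return 'master'
    parts = tf_version.split('.')
    if len(parts) < 2 or not (parts[0].isdigit() and parts[1].isdigit()):
        return 'master'
    return 'r{}.{}'.format(parts[0], parts[1])


branch = matching_models_branch('1.6.0')
print(branch)
```

With TensorFlow installed, the installed version string is available as tf.__version__ and could be passed to the helper.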

Hope that helps.

And also, for MNIST in particular, it probably makes sense to switch to tf.keras.Sequential, which I will do.
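For illustration, the same network could be expressed with tf.keras.Sequential roughly as follows. This is a sketch under assumptions, not the change that was actually made to the repository: the function name build_sequential_mnist is hypothetical, the input is assumed to be a flat vector of 784 pixels, and the code targets tf.keras as found in later TensorFlow releases.

```python
import tensorflow as tf


def build_sequential_mnist(data_format='channels_last'):
    """Hypothetical Sequential version of the MNIST network in mnist.py."""
    if data_format == 'channels_first':
        image_shape = (1, 28, 28)
    else:
        assert data_format == 'channels_last'
        image_shape = (28, 28, 1)

    layers = tf.keras.layers
    return tf.keras.Sequential([
        # Assumption: each image arrives as a flat vector of 784 pixels.
        tf.keras.Input(shape=(784,)),
        layers.Reshape(image_shape),
        layers.Conv2D(32, 5, padding='same', data_format=data_format,
                      activation='relu'),
        layers.MaxPooling2D((2, 2), (2, 2), padding='same',
                            data_format=data_format),
        layers.Conv2D(64, 5, padding='same', data_format=data_format,
                      activation='relu'),
        layers.MaxPooling2D((2, 2), (2, 2), padding='same',
                            data_format=data_format),
        layers.Flatten(),
        layers.Dense(1024, activation='relu'),
        layers.Dropout(0.4),
        layers.Dense(10),  # logits, one per digit class
    ])


sequential_model = build_sequential_mnist()
```

Because the network is a plain stack of layers with no branching, Sequential expresses it without a custom class or an explicit call method.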

@navneet-nmk

I have built a model by subclassing Model but I am facing the following bug:

`NotImplementedError: fit_generator is not yet enabled for unbuilt Model subclasses`
