
RFC: Keras Pre-made Models #95

Merged Jun 10, 2019 (5 commits)
Conversation


@tanzhenyu tanzhenyu commented Apr 25, 2019

Feedback period will be open until 2019-05-10

Keras pre-made models

Status Proposed
Author(s) Zhenyu Tan (tanzheny@google.com)
Sponsor Francois Chollet (fchollet@google.com), Alexandre Passos (apassos@google.com)
Updated 2019-04-29

Objective

This document proposes several pre-made Keras models that would allow users to:

  • build basic machine learning models easily
  • compose them with other Keras layers
  • replace canned Estimators in TF 2.0

@tanzhenyu tanzhenyu changed the title Create 20190425-keras-premade-models.md RFC Keras Premade Models Apr 25, 2019

### Proposal 1: Customized training function & composability
We propose to let each subclassed pre-made model override the training function. Optionally, a special subclass `CannedModel` could be provided if other methods such as `compile` and `fit` need to be overridden as well. In traditional Keras models, the training function is dominated by autodiff: given the forward pass of the model, the gradients for each operation are generated and the backprop computation is automatically laid out for the entire model. However, this assumption is only valid for neural-network-based supervised learning architectures. For many other scenarios, we need to break this assumption:
1. gradients may not be used, e.g., in any unsupervised learning task
Member

@seanpmorgan seanpmorgan Apr 26, 2019

The subclass CannedModel seems like a half measure. If the training function is heavily modified it seems strange not to have an object type that describes that. Perhaps IrregularModel which would specify models that do not follow the typical Keras paradigm.

The canned DNN model is a CannedModel imo, but not an irregular model.

Member

@seanpmorgan seanpmorgan Apr 26, 2019

The type CannedModel also makes it sound as if users are not supposed to be building their own unique model types. To me canned means pre-made (which is what this RFC is about... but it should be extensible for users imo).

Contributor Author

It's probably better not to have CannedModel for now, and let's just override train_function (or train_on_batch) -- WDYT?
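As a rough illustration of what overriding the training function buys, here is a minimal framework-agnostic sketch (plain Python standing in for the Keras API; the `KMeansModel` class and its centroid update are illustrative, not part of this proposal). A base `fit` loop delegates to `train_on_batch`, and a pre-made model replaces the autodiff-based step with a gradient-free update:

```python
class Model:
    """Toy stand-in for keras.Model: fit() delegates to train_on_batch()."""

    def train_on_batch(self, batch):
        raise NotImplementedError  # the default Keras step would run autodiff here

    def fit(self, batches, epochs=1):
        for _ in range(epochs):
            for batch in batches:
                self.train_on_batch(batch)


class KMeansModel(Model):
    """Pre-made model whose training step uses no gradients at all."""

    def __init__(self, centroids):
        self.centroids = list(centroids)

    def train_on_batch(self, batch):
        # Assign each 1-D point to its nearest centroid, then move each
        # centroid to the mean of its assigned points (a Lloyd's step).
        assigned = [[] for _ in self.centroids]
        for x in batch:
            j = min(range(len(self.centroids)),
                    key=lambda j: (x - self.centroids[j]) ** 2)
            assigned[j].append(x)
        for j, pts in enumerate(assigned):
            if pts:
                self.centroids[j] = sum(pts) / len(pts)


model = KMeansModel(centroids=[0.0, 10.0])
data = [[0.9, 1.1, 8.9], [9.1, 1.0, 9.0]]  # two batches of 1-D points
model.fit(data, epochs=5)  # centroids converge near the two cluster means
```

The point is only that `fit` never assumes gradients exist; each pre-made model decides what a training step means.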

Member

Agree that's better than an optional subclass. I would probably still prefer a new Model type so that it's understood how different these are from typical Keras Models, but something like IrregularModel may just be more bloat to the API than it's worth.

Contributor Author

I'm keeping this as a design question to be addressed through design reviews.

@ewilderj ewilderj added this to Needs attention in RFC management via automation Apr 29, 2019
@ewilderj ewilderj added the RFC: Proposed RFC Design Document label Apr 29, 2019
@ewilderj ewilderj moved this from Needs attention to Open reviews in RFC management Apr 29, 2019
@ewilderj ewilderj changed the title RFC Keras Premade Models RFC: Keras Pre-made Models Apr 29, 2019
@AakashKumarNain
Member

Because the RFC is about Pre-made models, I think this should also be considered in the RFC:
https://groups.google.com/a/tensorflow.org/forum/#!topic/developers/QMCbkS6uZSM

@unrahul

unrahul commented Apr 29, 2019

Could you give a clear example of what you mean by pre-made models? I am not sure I clearly understand the difference between pre-made and extensible. Does this proposal mean we can have Estimator-like models that are extensible for users to modify if needed, and otherwise used as is?

* relies on continuous graph rebuilding and checkpoint reloading, which slows down training
* relies on global collections and is not TF 2.0 friendly
* makes many advanced features such as meta/transfer learning difficult
* forces users to create input functions when not necessary
Member

Why not make premade estimators better from these perspectives?

Contributor Author

We have two sets of high-level APIs, and Keras is the one we chose to proceed with because of its simplicity and pythonic implementation.

### Challenges
Many canned models, including BoostedTrees, KMeans (as well as WideDeep and many others in the future), are highly complicated and do not follow the simple forward & backprop workflow, an assumption heavily relied on by Keras. Building such models, while not compromising the basic composability of Keras (i.e., composing layers on top of layers), is the main challenge we need to resolve in this document. Another challenge is that these models can have multiple training phases (such as collaborative filtering models). We propose the following two approaches to address them:

### Proposal 1: Customized training function & composability
Member

What about other phases like evaluation and prediction? What's the difference between this and tf.estimator's model_fn?

Contributor Author

Evaluation and prediction are specified via Mode in the Estimator world. In Keras, the model will by default create train_function, evaluate_function and predict_function, and use the corresponding one during fit/eval/predict.
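To illustrate that dispatch, here is a toy sketch (plain Python, not the actual Keras internals; the method names merely mirror the ones mentioned in the comment). `fit`, `evaluate`, and `predict` each call their per-mode function, so a pre-made model can override just the training step while inheriting the default evaluation and prediction paths:

```python
class Model:
    """Toy stand-in for keras.Model's per-mode functions."""

    def train_function(self, batch):
        return "default-train"       # autodiff-based step in real Keras
    def evaluate_function(self, batch):
        return "default-evaluate"
    def predict_function(self, batch):
        return "default-predict"

    # Each public entry point dispatches to its corresponding function.
    def fit(self, batch):
        return self.train_function(batch)
    def evaluate(self, batch):
        return self.evaluate_function(batch)
    def predict(self, batch):
        return self.predict_function(batch)


class PremadeModel(Model):
    """Only the training step is customized; eval/predict reuse defaults."""

    def train_function(self, batch):
        return "custom-train"


m = PremadeModel()
```

Here `m.fit(...)` runs the customized step while `m.evaluate(...)` and `m.predict(...)` fall through to the inherited defaults.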

@tanzhenyu
Contributor Author

Could you give a clear example of what you mean by pre-made models? I am not sure I clearly understand the difference between pre-made and extensible. Does this proposal mean we can have Estimator-like models that are extensible for users to modify if needed, and otherwise used as is?

@unrahul I'm not sure what you mean by Estimator-like models. But pre-made in this case is similar to Keras Applications models: you can compose them with any other layers and models. You can modify them via inheritance, yes.

@tanzhenyu
Contributor Author

Because the RFC is about Pre-made models, I think this should also be considered in the RFC:
https://groups.google.com/a/tensorflow.org/forum/#!topic/developers/QMCbkS6uZSM

@AakashKumarNain Not sure I understand the correlation -- from the discussion it looks like that is related to training/testing behavior for variables. The models in this proposal do not have any variables with that property.

@AakashKumarNain
Member

@tanzhenyu Yeah, I got it, but then I think the name for the RFC should have been something else. Anyway, thank you for the clarification again.

@KesterTong

Can we add some discussion on integration with ProcessingLayers, which should also help clarify what the inputs to a CannedModel are?

@fchollet and I have brainstormed the following two options:

Option 1: Models accept a list of tensors as inputs, or a pair of lists for DNNLinearModel, where the pair is the lists of DNN and linear inputs respectively.

age = tf.keras.Input(shape=(1,), dtype=tf.float32, name='age')
bucketized_age = tf.keras.Discretize(bins=4)(age)
occupation = tf.keras.Input(shape=(None,), dtype=tf.string, name='occupation')
occupation_id = tf.keras.VocabLookup(vocabulary_list)(occupation)
occupation_embed = tf.keras.layers.Embedding(
    input_dim=len(vocabulary_list), output_dim=10)(occupation_id)
processing_stage = tf.keras.ProcessingStage(
    inputs=[age, occupation], outputs=[bucketized_age, occupation_embed])
# for DNNLinearModel:
# processing_stage = tf.keras.ProcessingStage(
#     inputs=[age, occupation],
#     outputs=([bucketized_age, occupation_embed], [occupation_id]))

canned_model = tf.keras.canned.LinearClassifier()
# for DNNLinearModel:
# canned_model = tf.keras.canned.DNNLinearClassifier()
output = canned_model(processing_stage.outputs)
model = tf.keras.Model(inputs=processing_stage.inputs, outputs=[output])

dftrain = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
y_train = dftrain.pop('survived')
ds = tf.data.Dataset.from_tensor_slices((dict(dftrain), y_train))
processing_stage.update(ds)
model.fit(ds, epochs=10)

Option 2: Instead of accepting a list of tensors (or a pair of lists for DNNLinearModel), models accept a single tensor (or a pair of tensors for DNNLinearModel). The input represents the concatenated features.

age = tf.keras.Input(shape=(1,), dtype=tf.float32, name='age')
bucketized_age = tf.keras.Discretize(bins=4)(age)
occupation = tf.keras.Input(shape=(None,), dtype=tf.string, name='occupation')
occupation_id = tf.keras.VocabLookup(vocabulary_list)(occupation)
occupation_embed = tf.keras.layers.Embedding(
    input_dim=len(vocabulary_list), output_dim=10)(occupation_id)
processing_stage = tf.keras.ProcessingStage(
    inputs=[age, occupation],
    outputs=tf.keras.layers.concatenate([bucketized_age, occupation_embed]))
# for DNNLinearModel:
# processing_stage = tf.keras.ProcessingStage(
#     inputs=[age, occupation],
#     outputs=(tf.keras.layers.concatenate([bucketized_age, occupation_embed]), occupation_id))

canned_model = tf.keras.canned.LinearClassifier()
# for DNNLinearModel:
# canned_model = tf.keras.canned.DNNLinearClassifier()
output = canned_model(processing_stage.outputs)
model = tf.keras.Model(inputs=processing_stage.inputs, outputs=[output])

dftrain = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
y_train = dftrain.pop('survived')
ds = tf.data.Dataset.from_tensor_slices((dict(dftrain), y_train))
processing_stage.update(ds)
model.fit(ds, epochs=10)

@ewilderj ewilderj moved this from Open reviews to Awaiting Committee Notes in RFC management May 31, 2019
@ewilderj ewilderj moved this from Awaiting Committee Notes to In Revision in RFC management May 31, 2019
@ewilderj ewilderj merged commit d9b5cfb into tensorflow:master Jun 10, 2019
RFC management automation moved this from In Revision to Accepted RFCs Jun 10, 2019
@ewilderj ewilderj added RFC: Accepted RFC Design Document: Accepted by Review and removed RFC: Proposed RFC Design Document labels Oct 11, 2019