
Serving a model #153

Closed
dimidd opened this issue Oct 9, 2018 · 6 comments
Assignees: madisonmay
Labels: enhancement (New feature or request)

dimidd (Contributor) commented Oct 9, 2018

Is your feature request related to a problem? Please describe.
Serving a trained model in production.

Describe the solution you'd like
I'd like to understand how to interface with tensorflow.

Describe alternatives you've considered
I'm able to save and load a model, but I'm not sure how to restore and serve it using TF.

madisonmay added the enhancement (New feature or request) label on Oct 9, 2018
madisonmay (Contributor) commented

Hi @dimidd, thanks for the feature request.

At the moment the development branch does not support exposing the model through tensorflow serving. However, we're in the middle of a large refactor (#148) that should migrate finetune onto the tensorflow estimator API. I can't make any promises, but since tensorflow serving has explicit support for the estimator framework, it seems likely that exposing some functionality to make finetune work with tensorflow serving will be straightforward. We'll keep you posted via this ticket.

--Madison

madisonmay self-assigned this on Oct 12, 2018
dimidd (Contributor, Author) commented Oct 31, 2018

Hi Madison,

Now that the TF-estimator refactor is done, I'd like to follow up on this. Are there tasks I can help with? Any low-hanging fruit?

Thanks again

madisonmay (Contributor) commented Oct 31, 2018

Hi @dimidd, thanks for the follow-up! I haven't worked with tensorflow serving before, so you'll have to bear with me, but I'll try to use this ticket to lay out a rough plan of attack:

  1. Add an export() function to the BaseModel class in finetune/base.py.
  2. Build off of finetune/input_pipeline.py:BasePipeline.feed_shape_type_def in order to implement a serving input receiver function for the possible input types (single field, multifield, etc.).
  3. Finish the export() method with a call to model.export_savedmodel(save_location, serving_input_receiver_fn=serving_input_receiver_fn). The exported SavedModel would then be compatible with the tensorflow serving API and should be able to be plugged directly into any system already using tensorflow serving (see the sketch below).

I think this is what we could add for MVP support of this feature. Hosting the saved model would probably be deferred to the end user for now. Is this roughly what you were thinking of?
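
A rough, untested sketch of what steps 2 and 3 might look like under the estimator API. The feature name, dtype, and shape here are illustrative assumptions, not the actual output of BasePipeline.feed_shape_type_def:

import tensorflow as tf

def serving_input_receiver_fn():
    # Assumed single text-field input: a batch of token-id sequences.
    # In finetune the dtypes/shapes would come from BasePipeline.feed_shape_type_def.
    tokens = tf.placeholder(dtype=tf.int32, shape=[None, None], name="tokens")
    features = {"tokens": tokens}
    return tf.estimator.export.ServingInputReceiver(features, features)

def export(estimator, save_location):
    # `estimator` stands in for the tf.estimator.Estimator that finetune
    # would build after the #148 refactor; the result is a SavedModel
    # directory that tensorflow serving can load directly.
    return estimator.export_savedmodel(
        save_location, serving_input_receiver_fn=serving_input_receiver_fn
    )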

dimidd (Contributor, Author) commented Oct 31, 2018

Thanks! I'll dive into the code.

benleetownsend (Contributor) commented

@dimidd I had a discussion with @madisonmay about a TF-Serving solution. I have worked with TF Serving in the past and have enough experience to know this is not something we will be able to officially support in finetune.

The problems with using the Serving API with finetune are that:

  • We have significant code written in Python to do pre- and post-processing at inference time. This is quite a heavy process, but under the TF-Serving model only the tensorflow section of the model is served. This makes the pre-processing the client's responsibility and means they would need the data related to it (no simple string requests here).
  • You cannot currently serve models that use py_funcs in them.

However, serving a model from finetune with something like Flask would be pretty simple, and with the inference optimizations discussed in #188 it should be performant enough for most use cases.

dimidd (Contributor, Author) commented Nov 7, 2018

Hi Ben,

Agreed. I've actually used Madison's suggestions (e.g. something like dimidd@ecedc5c) with Flask, and it works quite well.
Here's the relevant code:

import finetune as ft
from finetune.model import PredictMode
from flask import Flask

application = Flask(__name__)
# [...]

# Load the trained model once at startup and run a warm-up prediction.
model = ft.Classifier.load('my_model.bin')
mode = PredictMode.NORMAL
test_pred = model._inference2(["lorem"], mode)

def predict(input_data):
    # [...]
    return model._inference2([input_data], mode)[0]

@application.route('/api', methods=['POST'])
def api():
    # [...] parse input_data from the request
    output_data = predict(input_data)
    # [...] build and return the response
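
For completeness, a hypothetical client call might look like the following. The JSON field name, port, and response format are assumptions, since the request parsing and response construction in api() are elided above:

import requests

# Hypothetical request; adjust the payload to whatever api() actually parses.
resp = requests.post("http://localhost:5000/api", json={"text": "lorem ipsum"})
print(resp.status_code, resp.text)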
