
[FR] Saving torchscript models #2263

Closed
hhsecond opened this issue Jan 4, 2020 · 13 comments
Labels
  • Acknowledged: This issue has been read and acknowledged by the MLflow admins.
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • enhancement: New feature or request
  • priority/important-soon: The issue is worked on by the community currently or will be very soon, ideally in time for the

Comments

@hhsecond
Contributor

hhsecond commented Jan 4, 2020

Describe the proposal

Option to save a TorchScript model using torch.jit.save instead of torch.save, which lets deployment toolkits pick up the optimized TorchScript model for production.
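For context, a minimal sketch of the difference between the two save paths, using a toy module (names are illustrative only):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def forward(self, x):
        return x * 2

model = Net()

# What mlflow.pytorch.save_model effectively does today: pickle the whole module.
torch.save(model, "model.pth")

# TorchScript path: compile the module first, then save the optimized,
# Python-independent archive that C++/mobile runtimes can load.
scripted = torch.jit.script(model)
torch.jit.save(scripted, "model_scripted.pt")
```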

Motivation

MLflow currently doesn't distinguish between a native PyTorch model (a subclass of torch.nn.Module) and a TorchScript model (a subclass of torch.jit.ScriptModule). This matters because production deployment systems work most efficiently with models saved as TorchScript (via torch.jit.save).
This is not possible to achieve with the current pytorch.save_model, since we use torch.save internally.

Proposed Changes

I have two proposals in mind after talking to @mateiz on Slack:

  1. Use the existing pytorch.save_model itself, with an argument the user can set to tell MLflow whether it's a TorchScript model or a torch.nn.Module model (or we could check the type of the incoming model and pick the flavor automatically, but explicit is better than implicit, I guess). The only constraint here is that the same pytorch module would then have to produce two flavors -> pytorch or torchscript, which doesn't quite follow the MLflow API design where every flavor gets one save, load and log function.
  2. Add another flavor, call it torchscript, and make a module for it. The API here becomes mlflow.torchscript.

I have coded up a draft of the second approach here, but I'm looking for suggestions before making a PR to MLflow.
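Roughly, the second option would let users do something like this (a sketch only; mlflow.torchscript does not exist yet, and the run ID is a placeholder):

```python
import torch
import mlflow.torchscript  # proposed module, does not exist yet

scripted = torch.jit.script(model)  # model: the user's native nn.Module

# Mirrors the save/log/load trio that every other flavor exposes.
mlflow.torchscript.log_model(scripted, artifact_path="model")
loaded = mlflow.torchscript.load_model("runs:/<run_id>/model")
```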

@hhsecond hhsecond added the enhancement New feature or request label Jan 4, 2020
@smurching
Collaborator

@hhsecond do you know what the relationship is between TorchScript & native PyTorch models, and how TorchScript is commonly used? I just skimmed through the TorchScript tutorial, which suggests that TorchScript is an IR for native PyTorch models that's most often used to score PyTorch models in non-Python contexts, rather than to load them back for scoring in Python.

If that's the case, then I think a variant of option 1) may make more sense, where we call both torch.save and torch.jit.save in pytorch.save_model to generate both torchscript & pytorch model flavors.

Will give this a more detailed look tomorrow, but if you have any pointers to docs/explanations on this topic that'd be welcome - thanks :)

@hhsecond
Contributor Author

hhsecond commented Jan 9, 2020

@smurching Your understanding is right. There are two decisions I am not sure about if we let pytorch.save_model save both flavors.

  1. How will a user tell MLflow which flavor (pytorch or torchscript) they'd like to use? We can infer it with isinstance or let the user specify it as an argument; we just need to make that call (a small isinstance sketch follows this list).
  2. Usually other parts of the program (MLflow core itself, or code the user writes) access the flavor name of the pytorch module via mlflow.pytorch.FLAVOR_NAME. This is going to break if we have two flavors in the same module.
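For the implicit option, the check could be as simple as this sketch (helper name is made up):

```python
import torch

def _is_torchscript(model):
    # ScriptModule also subclasses nn.Module, so check for it explicitly.
    return isinstance(model, torch.jit.ScriptModule)
```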

Am I making sense or did I miss anything obvious here?

@smurching
Collaborator

Good questions. Given that there's no way to actually execute a torchscript model in Python (Is this correct? I couldn't find anything about a Python runtime for torchscript models, only C++), I'm not sure it makes sense to introduce a mlflow.torchscript Python module.

Instead, for 1), could we simply persist the model in both pytorch & torchscript flavors when pytorch.save_model is called? Or is it not always possible / too expensive to do so? We do something similar currently in mlflow.spark.log_model, which accepts a sample_input argument that, when provided, causes mlflow.spark.log_model to generate an mleap flavor in addition to the spark flavor - in the pytorch case, we would unconditionally generate the torchscript flavor.
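For reference, the Spark precedent looks roughly like this, assuming spark_model is a fitted Spark ML pipeline and sample_df a matching Spark DataFrame:

```python
import mlflow.spark

# Passing sample_input makes log_model emit an mleap flavor
# in addition to the regular spark flavor.
mlflow.spark.log_model(spark_model, artifact_path="model", sample_input=sample_df)
```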

For 2), the only code in MLflow that references the Pytorch flavor by name is within the mlflow.pytorch module itself, in order to provide a name to the flavor when saving pytorch models. Currently when we save pytorch models, we generate both:

  • A pytorch flavor (which is used by mlflow.pytorch.load_model to load the model back into a Python Pytorch model object) and
  • A python-function flavor (which is used by APIs like mlflow.pyfunc.load_model to load back an object for predicting on pandas DataFrames).

Given that there's no "python-function" representation of a torchscript model, the Python-function flavor implemented within mlflow.pytorch could simply preserve its existing implementation without any ambiguity.
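Concretely, the two existing flavors are consumed along these lines (the run ID is a placeholder):

```python
import mlflow.pytorch
import mlflow.pyfunc

model_uri = "runs:/<run_id>/model"

# pytorch flavor: gives back the torch.nn.Module itself.
torch_model = mlflow.pytorch.load_model(model_uri)

# python_function flavor: gives back a generic wrapper that
# predicts on pandas DataFrames.
pyfunc_model = mlflow.pyfunc.load_model(model_uri)
```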

@hhsecond
Contributor Author

@smurching
So the reason TorchScript exists is that the Python interpreter is probably not a good production runtime, especially under high load. Also, your assumption that a TorchScript model isn't loadable in a Python runtime is slightly off: you can load it. But that defeats the purpose of TorchScript, and you'd be better off loading the native PyTorch model instead if your runtime is Python.

  1. Saving as TorchScript requires some effort from users (it is not automagically convertible), so the user needs to make a TorchScript model out of the native PyTorch model and then pass that to our save or log function (see the conversion sketch after this list).
  2. Take the case of deployment plugins: any runtime that needs to fetch a TorchScript model won't be able to do so if the pytorch flavor represents both native PyTorch and TorchScript.
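A minimal sketch of that user-side conversion step, with nn.Linear standing in for a real model:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for the user's native PyTorch model

# Tracing needs a representative example input and records the ops it executes.
traced = torch.jit.trace(model, torch.randn(1, 4))

# Scripting compiles the module's code directly and can fail on unsupported
# Python constructs, so either way the conversion is an explicit user step.
scripted = torch.jit.script(model)
```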

@tomasatdatabricks
Contributor

Hi @hhsecond, thanks for looking into this!

From looking at the history of this issue, my understanding is

  1. We want to enable users to also store a torchscript version of their model, in order to enable high-performance scoring.
  2. Torchscript model generation requires user input.

If that is correct, I would propose the following API change:
Add an optional torchscript model argument to the mlflow.pytorch.(log/save)_model functions. If the argument is provided, the mlflow.pytorch module also produces the torchscript flavor (a rough sketch of this appears below). We do not have to worry about loading it into Python for now, since you mentioned that people are mostly interested in this for C++ scoring. Alternatively, we can opt to load torchscript by default in the pyfunc model load. If the torchscript flavor is not available or fails to load, we can fall back to the native pytorch flavor.

Does that sound reasonable to you?
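Something along these lines, purely as a sketch since neither the argument name nor the behaviour is settled:

```python
import torch
import mlflow.pytorch

scripted = torch.jit.script(pytorch_model)  # pytorch_model: the user's nn.Module

# Hypothetical extra argument; when provided, a torchscript flavor would be
# written alongside the regular pytorch flavor.
mlflow.pytorch.log_model(
    pytorch_model,
    artifact_path="model",
    torchscript_model=scripted,
)
```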

@hhsecond
Contributor Author

Yes, this was my initial thought too. The only concern is for folks who rely on getting the flavor name from the module instead of hardcoding it, i.e. someone who calls mlflow.pytorch.FLAVOR_NAME expects to always get exactly one of pytorch or torchscript. This is not a deal-breaker; we can either

  • Return a list of both flavors, which breaks the convention of the other modules, or
  • Expect the developer to hardcode the value instead of taking it from us, which defeats the purpose of mlflow.xxx.FLAVOR_NAME.

@tomasatdatabricks
Contributor

Hmm, I am not sure I understand why you want mlflow.pytorch.FLAVOR_NAME to also point to torchscript. I think if we are creating a separate torchscript flavor, the flavor name would be a separate constant, e.g. mlflow.torchscript.FLAVOR_NAME if we decide we want a separate module for torchscript, or mlflow.pytorch.TORCHSCRIPT_FLAVOR_NAME if we keep it inside of pytorch.

@hhsecond
Contributor Author

hhsecond commented Mar 18, 2020

I don't have any use case that needs both flavor names. But I assumed you'd want this as part of the API's design when you said:

the mlflow.pytorch module also produces the torchscript flavor.

My bad. I apologize.

One last thing I need to flag is this statement from your previous comment:

Add an optional torchscript model argument to the mlflow.pytorch.(log/save)_model functions. If the argument is provided, the mlflow.pytorch module also produces the torchscript flavor.

Saving both pytorch and torchscript is not possible: users can pass either a native PyTorch model or a TorchScript model. We have two options to figure out which one it is:

  • The user passes an argument (something like torchscript=True)
  • We figure it out from the type of the model object

Once we know what the model object is, we'll use the corresponding mechanism to save it (a rough sketch is below). The pyfunc model can load both PyTorch and TorchScript models into Python. A C++ runtime can load the TorchScript model however it wants, and it will raise an error if it's a native PyTorch model.
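A rough sketch of how that save/load branching could look (helper names and the flag are illustrative only):

```python
import torch

def _save_model(model, path):
    # Branch on the incoming object type (the second option above).
    if isinstance(model, torch.jit.ScriptModule):
        torch.jit.save(model, path)
    else:
        torch.save(model, path)

def _load_model(path, is_torchscript):
    # The pyfunc wrapper can load either kind back into Python.
    return torch.jit.load(path) if is_torchscript else torch.load(path)
```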

@tomasatdatabricks
Contributor

I see. What would happen if you call mlflow.pytorch.load_model(path/to/torchscript/model)?

I think we should also consider adding torchscript as an independent flavor, so that the user would do:

mlflow.torchscript.load/log/save_model(...)

@hhsecond
Contributor Author

What would happen if you call mlflow.pytorch.load_model(path/to/torchscript/model)?

From the path, I am assuming you are passing the path to a TorchScript model; it would return the TorchScript model loaded into Python. The catch is that if we use mlflow.pytorch.load_model for loading both PyTorch and TorchScript models, downstream code can't make assumptions about the signature (attributes, methods, etc.) of the returned object, since a PyTorch object and a TorchScript object are different.

Adding torchscript as an independent flavor is the most straightforward and easy/simple solution, IMO.

@tomasatdatabricks
Contributor

I agree! Let me know if you would like to contribute this, I'd be happy to review your PR.

@tomasatdatabricks tomasatdatabricks added the Acknowledged This issue has been read and acknowledged by the MLflow admins. label Mar 19, 2020
@hhsecond
Contributor Author

I'd love to. I'll send you a review request soon

@pogil pogil added the area/models MLmodel format, model serialization/deserialization, flavors label May 1, 2020
@juntai-zheng juntai-zheng added the priority/important-soon The issue is worked on by the community currently or will be very soon, ideally in time for the label Jul 20, 2020
@dbczumar
Collaborator

dbczumar commented Jun 2, 2022

mlflow.pytorch.log_model() now supports persistence of torchscript models (#3557).
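A rough usage sketch of the current behaviour, with nn.Linear standing in for a real model:

```python
import torch
import torch.nn as nn
import mlflow
import mlflow.pytorch

scripted = torch.jit.script(nn.Linear(4, 2))

# The pytorch flavor accepts ScriptModule objects directly and persists them
# as TorchScript, so a separate torchscript flavor isn't needed.
with mlflow.start_run():
    mlflow.pytorch.log_model(scripted, artifact_path="model")
```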

@dbczumar dbczumar closed this as completed Jun 2, 2022