[torchscript] stop serializing default arguments #54613

suo · 2021-03-24T18:37:24Z

Today, when serializing TorchScript code we make no distinction between arguments that were explicitly specified by the user, and arguments that can from defaults. This leads us to "burn in" default arguments.

This behavior causes forward compatibility problems when a new argument with a default value is added to an operator. Consider:

# version 1
aten.foo(Tensor arg1)

becoming

# version 2
aten.foo(Tensor arg1, String new_arg="my_default_value")

The serialized code at version 2 looks like:

aten.foo(my_tensor_input, "my_default_value")

so when you try to run it in a binary at version 1, you get a "too many arguments for schema" error.
One option is to not serialize the default value, so then the serialized code at version 2 looks like:

aten.foo(my_tensor_input)  # the schema matcher will insert new_arg="my_default_value"

and thus we retain compatibility.

This is a practically important issue, currently adding a new default value to a custom op will break PyPER due to forward compat issues: https://fburl.com/3fr2kjvn

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @anjali411 @gmagogsfm

The text was updated successfully, but these errors were encountered:

suo · 2021-03-24T18:38:32Z

related SEV4: S226373

albanD · 2021-03-25T14:00:35Z

Is there any concern that the behavior of the underlying function will silently change? Because if the old runner does not support that argument, then running that code with an older version won't give the same result as running with the new version (which is the one expected by the user when serializing).

dzhulgakov · 2021-03-26T00:16:34Z

@albanD - if that's the case then whoever added the new argument in the first place has broken the backward compatibility already.

In the above example aten.foo(x, "my_default_value") should return exactly the same value as aten.foo(x) did in the previous version of the runtime. Otherwise just regular python eager code doing aten.foo(x) won't be backward compatible. If that's the case, we'd need to go through BC process with adapters and such. But that's not the common case.

The only case where there would be a different if we implement this issue if we at any point decide to change the default value of an argument. I'd argue it's a pretty rare situation and kind of a sketchy thing to do in any case because it might break some python code inadvertently.

cc @raziel fyi

gmagogsfm · 2021-03-26T00:38:43Z

I think the right fix is to not include default values of arguments when user code doesn't specify it.

On top of the reasons @dzhulgakov mentioned, I think not saving default values is more aligned with PyTorch eager and future torch.package behavior, which both faithfully record exact Python code user writes. In this case, the user code is arguably forward-compatible, it is TorchScript's implementation that breaks originally forward-compatible user code.

Additionally, TorchScript is currently in a slightly strange state in terms of "providing hermetic packaging for PyTorch program". As we have seen in this case, TorchScript pull information (default value of argument) from aten op schema of version A , bakes it in IR, but attempts to execute the program with an aten library of version B. This is effectively only baking in part of aten library implementation while in theory we should either not baking in the default value or bake in default value + schema + aten kernels.

ezyang · 2021-05-26T03:57:25Z

sooo anyone gonna work on this?

gmagogsfm · 2021-06-07T16:21:37Z

Update:
On normal TS runtime side, this is already implemented.

The approach is to find values that are

Generated by constant node
Used as an argument to an operator invocation
Its constant value matches the default value of that argument

During serialization, do not print this argument, instead relying on future compilation to fill up needed arguments according to schemas available.

On mobile side, this isn't enabled yet due to 1) needing backport feature because this changes breaks backward compatibility (WIP) 2) long deployment timeline on mobile devices

gmagogsfm · 2021-06-23T22:02:43Z

Update:
This has been addressed for both server and mobile deployment. Closing.

ezyang · 2021-06-25T18:03:53Z

@gmagogsfm could you please tag which PR fixed this for server and mobile?

gmagogsfm · 2021-06-25T18:56:34Z

Somehow Github won't let me add PRs to "linked PR" column, so I will just list them here:
#57498
#57778
#56845
#58852
#59714

facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Mar 24, 2021

suo added the high priority label Mar 24, 2021

pytorch-probot bot added the triage review label Mar 24, 2021

github-actions bot added this to Need triage in JIT Triage Mar 24, 2021

gmagogsfm self-assigned this Mar 24, 2021

gmagogsfm moved this from Need triage to In progress in JIT Triage Mar 24, 2021

gmagogsfm closed this as completed Jun 23, 2021

JIT Triage automation moved this from In progress to Done Jun 23, 2021

jbschlosser mentioned this issue Oct 29, 2021

Adding a new kwarg to a torch.nn.functional function breaks FC for JIT #67565

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[torchscript] stop serializing default arguments #54613

[torchscript] stop serializing default arguments #54613

suo commented Mar 24, 2021 •

edited by pytorch-probot bot

suo commented Mar 24, 2021

albanD commented Mar 25, 2021

dzhulgakov commented Mar 26, 2021

gmagogsfm commented Mar 26, 2021

ezyang commented May 26, 2021

gmagogsfm commented Jun 7, 2021

gmagogsfm commented Jun 23, 2021

ezyang commented Jun 25, 2021

gmagogsfm commented Jun 25, 2021

[torchscript] stop serializing default arguments #54613

[torchscript] stop serializing default arguments #54613

Comments

suo commented Mar 24, 2021 • edited by pytorch-probot bot

suo commented Mar 24, 2021

albanD commented Mar 25, 2021

dzhulgakov commented Mar 26, 2021

gmagogsfm commented Mar 26, 2021

ezyang commented May 26, 2021

gmagogsfm commented Jun 7, 2021

gmagogsfm commented Jun 23, 2021

ezyang commented Jun 25, 2021

gmagogsfm commented Jun 25, 2021

suo commented Mar 24, 2021 •

edited by pytorch-probot bot