[export] Initial serialization #102125

angelayi · 2023-05-23T23:09:35Z

Serialization TODOs:

- [ ] pytree spec: Serialize pytree to string #102577
- [ ] higher order ops
- [ ] node metadata (specifically nn_module_stack/source_fn)
- [ ] shape env
- [ ] graph module metadata?

Stack from ghstack (oldest at bottom):

-> [export] Initial serialization #102125

[ghstack-poisoned]

pytorch-bot · 2023-05-23T23:09:38Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/102125

✅ No Failures

As of commit 3c0676b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

zhxchen17

looks great, lots of questions tho.

ghstack-source-id: 170e292382340441224db66295232fddf7899e03 Pull Request resolved: #102125

avikchaudhuri

Accepting to unblock, lgtm. See comments though.

avikchaudhuri · 2023-05-31T16:30:49Z

torch/_export/serde/schema.py

 @dataclass
 class Node:
-    target: Operator
+    target: str


Why? Because version info will reside elsewhere?

angelayi · May 31, 2023

Yup, it'll reside in the top-level GraphModule.

avikchaudhuri · 2023-05-31T16:32:08Z

torch/_export/serde/schema.py

@@ -143,7 +137,7 @@ class Graph:
 @dataclass
 class BackwardSignature:
    gradients_to_parameters: Dict[str, str]
-    gradients_to_userInputs: Dict[str, str]
+    gradients_to_user_inputs: Dict[str, str]


What's "user" supposed to mean here?

angelayi · May 31, 2023

Not sure; it was copied from the internal Thrift file.

avikchaudhuri · 2023-05-31T16:34:07Z

torch/_export/serde/schema.py

-    buffers: Dict[str, TensorMeta]
-    parameters: Dict[str, TensorMeta]
-    metadata: Dict[str, str]
+    opset_version: Dict[str, int]


Are we keeping the dict here or somewhere else (indexed by a single int)? @larryliu0820

larryliu0820 · May 31, 2023

My naive approach is to serialize a dict directly. It will be a mapping from namespace to opset version, for example `{"aten": 3, "custom_namespace": 4}`. We can change this in the future if needed.

avikchaudhuri · Jun 1, 2023

OK, so in any case there is no mapping from op names (also str) to versions, yeah? That mapping will exist somewhere else, in a table or in code?
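A minimal sketch of how a namespace-to-version dict like the one described above could be consulted. It assumes op targets are serialized as strings of the form `namespace.op_name.overload`, which is an assumption made for illustration; the helper `opset_version_for` is hypothetical and not part of this PR.

```python
from typing import Dict


def opset_version_for(target: str, opset_version: Dict[str, int]) -> int:
    """Return the opset version recorded for the namespace that owns `target`.

    Hypothetical helper: assumes `target` looks like "namespace.op.overload".
    """
    namespace = target.split(".", 1)[0]
    if namespace not in opset_version:
        raise KeyError(f"no opset version recorded for namespace {namespace!r}")
    return opset_version[namespace]


# The example dict from the discussion above.
opset_version = {"aten": 3, "custom_namespace": 4}

aten_version = opset_version_for("aten.add.Tensor", opset_version)
custom_version = opset_version_for("custom_namespace.my_op.default", opset_version)
```

Because the dict is keyed by namespace only, every op in a namespace shares one version; a per-op table, if needed, would have to live elsewhere, as the last comment points out.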

avikchaudhuri · 2023-05-31T16:36:44Z

torch/_export/serde/schema.py

    signature: GraphSignature
+    call_spec: CallSpec


Can you describe what the purpose of call_spec is? Won't graph_module have the call_spec mappings burned in?

Is there a standard transform that uses this information?

angelayi · May 31, 2023

call_spec contains the in_spec and out_spec, which are used when running the graph module eagerly so that the input and output formats match those of the original function when it is run eagerly. We use it in ExportedProgram's call function: https://github.com/pytorch/pytorch/blob/main/torch/_export/exported_program.py#L94

avikchaudhuri · Jun 1, 2023

Bit confused about why we then also have flatten / unflatten calls in a graph module...

angelayi · Jun 1, 2023

Mm, it defines how we pass inputs to and receive outputs from the graph module. Zhengxu stated below that it makes more sense to put the GraphSignature in GraphModule, and I think CallSpec should go wherever the signature goes.
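The role of in_spec/out_spec can be illustrated with a toy, dependency-free sketch. This is not the PyTorch pytree implementation: `flatten`, `unflatten`, `flat_graph`, and `call_with_spec` are all hypothetical names, and the spec encoding is invented for this example. The idea shown is only that a flat callable is wrapped so that callers pass and receive nested structures.

```python
from typing import Any, Callable, List, Tuple


def flatten(tree: Any) -> Tuple[List[Any], Any]:
    """Flatten nested lists/tuples/dicts into leaves plus a toy spec."""
    if isinstance(tree, (list, tuple)):
        leaves, child_specs = [], []
        for child in tree:
            child_leaves, child_spec = flatten(child)
            leaves.extend(child_leaves)
            child_specs.append(child_spec)
        return leaves, (type(tree), child_specs)
    if isinstance(tree, dict):
        leaves, child_specs = [], []
        for key in tree:
            child_leaves, child_spec = flatten(tree[key])
            leaves.extend(child_leaves)
            child_specs.append((key, child_spec))
        return leaves, (dict, child_specs)
    return [tree], None  # None marks a leaf in this toy spec


def unflatten(leaves: List[Any], spec: Any) -> Any:
    """Rebuild a nested structure from leaves and a toy spec."""
    it = iter(leaves)

    def build(s: Any) -> Any:
        if s is None:
            return next(it)
        kind, children = s
        if kind is dict:
            return {key: build(child) for key, child in children}
        return kind(build(child) for child in children)

    return build(spec)


def flat_graph(a: int, b: int, c: int) -> List[int]:
    """Stand-in for a graph module: flat inputs in, flat outputs out."""
    return [a + b, c * 2]


def call_with_spec(flat_fn: Callable, in_spec: Any, out_spec: Any, *args: Any) -> Any:
    """Flatten args, check them against in_spec, run, then unflatten the result."""
    flat_args, received_spec = flatten(args)
    if received_spec != in_spec:
        raise ValueError("inputs do not match the recorded in_spec")
    return unflatten(flat_fn(*flat_args), out_spec)


# Record specs from example inputs/outputs, as an export step might.
_, in_spec = flatten(((1, 2), {"c": 3}))
_, out_spec = flatten({"sum": 0, "double": 0})

result = call_with_spec(flat_graph, in_spec, out_spec, (10, 20), {"c": 5})
```

Here `result` has the same nested shape as the structure used to record `out_spec`, even though `flat_graph` itself only sees and returns flat values.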

ghstack-source-id: 91b93b80ebc856e95c83015c12993bccdd6b2161 Pull Request resolved: #102125

zhxchen17 · 2023-05-31T17:57:38Z

torch/_export/serde/schema.py

-    buffers: Dict[str, TensorMeta]
-    parameters: Dict[str, TensorMeta]
-    metadata: Dict[str, str]
+    opset_version: Dict[str, int]


@angelayi I think it makes more sense to put opset_version in ExportedProgram and signature in GraphModule. The rationale is to closely mirror the schema in sigmoid.
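The layout proposed in this comment could look roughly like the sketch below. These dataclasses are illustrative only and are not the contents of torch/_export/serde/schema.py: the class names mirror the discussion, while the fields `nodes`, `user_inputs`, and `user_outputs` are assumptions added to make the example self-contained.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    # Plain operator name, e.g. "aten.add.Tensor"; no per-node version info.
    target: str


@dataclass
class GraphSignature:
    user_inputs: List[str] = field(default_factory=list)   # hypothetical field
    user_outputs: List[str] = field(default_factory=list)  # hypothetical field


@dataclass
class GraphModule:
    nodes: List[Node]
    # Proposal: the signature travels with the graph module.
    signature: GraphSignature


@dataclass
class ExportedProgram:
    graph_module: GraphModule
    # Proposal: namespace -> version, stored once at the top level.
    opset_version: Dict[str, int]


program = ExportedProgram(
    graph_module=GraphModule(
        nodes=[Node(target="aten.add.Tensor")],
        signature=GraphSignature(user_inputs=["x", "y"], user_outputs=["out"]),
    ),
    opset_version={"aten": 3},
)
```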

Summary: v2 of #102125 because of git issues. Corresponding deserialization diff: #102716

Implementing serialization of the exported program to a Python dataclass, and then from that dataclass to JSON. This is split into a couple of sections:

- `serialize(ep: ep.ExportedProgram, opset_version: Dict[str, int]) -> Tuple[bytes, bytes]`: takes an exported program object and a dictionary mapping opset namespaces to versions, and returns the serialized exported program in bytes and, separately, the state dict serialized in bytes
- `GraphModuleSerializer`: class that serializes torch.fx.GraphModule to the schema.GraphModule dataclass
- `ExportedProgramSerializer`: class that serializes torch._export.exported_program.ExportedProgram to the schema.ExportedProgram dataclass

Serialization TODOs:

- [x] pytree spec: #102577
- [ ] higher order ops
- [ ] node metadata (specifically nn_module_stack/source_fn)
- [ ] constraints
- [ ] graph module metadata

The tests are not super comprehensive, but that's because I think it'll be better tested + easier to test once deserialization is implemented.

Pull Request resolved: #102707
Reviewed By: zhxchen17
Differential Revision: D46362466
Pulled By: angelayi
Approved by: https://github.com/avikchaudhuri, https://github.com/zhxchen17

fbshipit-source-id:

- 1d3fc157a7a5c2e615dbcc7f0e87d76f2f4c43ed
- 033cb9a22d905d944e182dba3b191df4c52413c8
- 22b0c38ddf3887e5966c0fe0b00c6984c30d98a9
- 32766639106abc0c4cea03bd298254140e7f3a1a
- 8e7d5cd4769bd6b4dcf64036dab43d54d7d4493a
- 8627d9f783cea5af9c36b09f4216c7effc021593
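The dataclass-then-JSON flow described in the summary can be sketched with the standard library alone. This is not the PR's `serialize` function: `ToyNode`, `ToyProgram`, and `serialize_toy` are hypothetical, and the state dict is encoded as JSON here only to keep the sketch dependency-free (the summary says only that the state dict is returned separately as bytes).

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple


@dataclass
class ToyNode:
    target: str
    inputs: List[str]


@dataclass
class ToyProgram:
    nodes: List[ToyNode]
    opset_version: Dict[str, int]


def serialize_toy(
    program: ToyProgram, state_dict: Dict[str, List[float]]
) -> Tuple[bytes, bytes]:
    """Return (program bytes, state dict bytes), both as UTF-8 encoded JSON."""
    # asdict() recursively converts nested dataclasses into plain dicts/lists.
    program_bytes = json.dumps(asdict(program), sort_keys=True).encode("utf-8")
    state_bytes = json.dumps(state_dict, sort_keys=True).encode("utf-8")
    return program_bytes, state_bytes


toy = ToyProgram(
    nodes=[ToyNode(target="aten.add.Tensor", inputs=["x", "y"])],
    opset_version={"aten": 3},
)
program_bytes, state_bytes = serialize_toy(toy, {"linear.weight": [0.5, -1.0]})

# Decoding the bytes recovers plain dicts, which a deserializer could map
# back onto the dataclasses.
decoded_program = json.loads(program_bytes)
decoded_state = json.loads(state_bytes)
```

Going through a dataclass first keeps the wire format tied to an explicit schema, so the JSON step stays a mechanical conversion.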

[export] Initial serialization

9e3b3a1

[ghstack-poisoned]

angelayi mentioned this pull request May 23, 2023

[export] Initial deserialization #102126

Closed

angelayi requested review from zhxchen17, mergennachin, avikchaudhuri and tugsbayasgalan May 23, 2023 23:10

Update on "[export] Initial serialization"

53f05fa

tugsbayasgalan reviewed May 23, 2023

Update on "[export] Initial serialization"

42870ee

This was referenced May 25, 2023

[export] ExportedProgram #102259

Closed

[export] Rename graph_module.py to exported_program.py #102260

Closed

angelayi requested a review from larryliu0820 May 25, 2023 07:12

angelayi added module: export release notes: export labels May 25, 2023

larryliu0820 reviewed May 25, 2023

angelayi added 5 commits May 25, 2023 20:10

Update on "[export] Initial serialization"

06251b6

Update on "[export] Initial serialization"

9ae3ea0

Update on "[export] Initial serialization"

ab22dac

Update on "[export] Initial serialization"

3750235

Update on "[export] Initial serialization"

092aa93

zhxchen17 reviewed May 30, 2023

larryliu0820 reviewed May 30, 2023

Update on "[export] Initial serialization"

c785ca4

angelayi added a commit that referenced this pull request May 31, 2023

[export] Initial serialization

ff61a59

angelayi requested a review from zhxchen17 May 31, 2023 14:25

avikchaudhuri approved these changes May 31, 2023

Update on "[export] Initial serialization"

3c0676b

angelayi added a commit that referenced this pull request May 31, 2023

[export] Initial serialization

8ed8e05

zhxchen17 reviewed May 31, 2023

angelayi mentioned this pull request Jun 1, 2023

[export] Initial serialization v2 #102707

Closed

5 tasks

angelayi closed this Jun 7, 2023

facebook-github-bot deleted the gh/angelayi/45/head branch July 8, 2023 14:15
