[export] Initial serialization v2 #102707
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/102707

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Unrelated Failures as of commit 0d44425: BROKEN TRUNK. The following jobs failed but were present on the merge base 12cd1db. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
@angelayi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Can you add some high-level description of this PR in the PR description? Just to keep a record in case we need to find out why we did it this way in the future, and also to encourage people to add more meaningful reviews. An example description could include: 1. What's the motivation for this change? 2. How are users supposed to use this API? 3. Does the test cover all corner cases? Anything on your mind but not tested yet? 4. What implementation choices did we make, and why did we choose this implementation instead of the alternatives? 5. Are there any limitations of the current design? Feel free to skip some if you find them meaningless, and it would be better if new points could be added if you feel they may help people understand better.
I understand this was discussed offline, and adding more description could slow down our development a little bit, but it would definitely be helpful to keep everyone (e.g. those who didn't participate in the discussion, like me) on the same page and help us maintain the code in the long run :).
This pull request was exported from Phabricator. Differential Revision: D46362466
Summary:
v2 of #102125 because of git issues

Corresponding deserialization diff: #102716

Implementing serialization of the exported program to a Python dataclass, and then from that dataclass to JSON. This is split into a couple of sections:
- `serialize(ep: ep.ExportedProgram, opset_version: Dict[str, int]) -> Tuple[bytes, bytes]` -- takes an exported program object and a dictionary mapping opset namespaces to versions, and returns the serialized exported program in bytes, and separately the state dict serialized in bytes
- `GraphModuleSerializer` -- class that serializes torch.fx.GraphModule to the schema.GraphModule dataclass
- `ExportedProgramSerializer` -- class that serializes torch._export.exported_program.ExportedProgram to the schema.ExportedProgram dataclass

Serialization TODOs:
- [x] pytree spec: #102577
- [ ] higher order ops
- [ ] node metadata (specifically nn_module_stack/source_fn)
- [ ] constraints
- [ ] graph module metadata

The tests are not super comprehensive, but that's because I think it'll be better tested + easier to test once deserialization is implemented.

Pull Request resolved: #102707
Reviewed By: zhxchen17
Differential Revision: D46362466
Pulled By: angelayi
fbshipit-source-id: 1d3fc157a7a5c2e615dbcc7f0e87d76f2f4c43ed
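The two-stage design described in the summary (ExportedProgram → schema dataclass → JSON bytes, with the state dict serialized separately) can be sketched with stdlib dataclasses. Note this is a minimal illustration of the shape of the API, not PyTorch's actual `torch._export` serialization code: the `Node`, `GraphModule`, and `ExportedProgram` dataclasses below are hypothetical stand-ins for the real schema, which has many more fields.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Tuple

# Illustrative stand-ins for the schema dataclasses (hypothetical,
# greatly simplified relative to the real torch._export schema).
@dataclass
class Node:
    target: str
    inputs: List[str]
    outputs: List[str]

@dataclass
class GraphModule:
    nodes: List[Node] = field(default_factory=list)

@dataclass
class ExportedProgram:
    graph_module: GraphModule
    opset_version: Dict[str, int]

def serialize(ep: ExportedProgram, state_dict: Dict[str, list]) -> Tuple[bytes, bytes]:
    """Serialize the program to JSON bytes; the state dict is serialized separately."""
    # asdict() recursively converts nested dataclasses to plain dicts,
    # which json.dumps can then encode.
    program_bytes = json.dumps(asdict(ep)).encode("utf-8")
    state_bytes = json.dumps(state_dict).encode("utf-8")
    return program_bytes, state_bytes

ep = ExportedProgram(
    graph_module=GraphModule(nodes=[Node("aten.add", ["x", "y"], ["z"])]),
    opset_version={"aten": 11},
)
program_bytes, state_bytes = serialize(ep, {"weight": [1.0, 2.0]})
print(json.loads(program_bytes)["opset_version"])  # {'aten': 11}
```

Returning the program and the state dict as two separate byte strings mirrors the signature in the summary, and lets the (often much larger) weights be stored or versioned independently of the graph structure.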
@pytorchbot merge -f 'Landed internally' (Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
v2 of #102126. mentally stacked on top of #102707 Pull Request resolved: #102716 Approved by: https://github.com/avikchaudhuri, https://github.com/zhxchen17
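The deserialization side referenced above (#102716) would be the inverse of the serialization sketched earlier: parse the JSON bytes back into the schema dataclass. A minimal hedged sketch, again using a hypothetical simplified schema rather than the real torch._export one:

```python
import json
from dataclasses import dataclass
from typing import Dict

@dataclass
class ExportedProgram:  # hypothetical stand-in for the schema dataclass
    opset_version: Dict[str, int]

def deserialize(program_bytes: bytes) -> ExportedProgram:
    """Rebuild the schema dataclass from JSON bytes produced by serialize()."""
    data = json.loads(program_bytes.decode("utf-8"))
    return ExportedProgram(opset_version=data["opset_version"])

ep = deserialize(b'{"opset_version": {"aten": 11}}')
print(ep.opset_version)  # {'aten': 11}
```

A round trip through serialize/deserialize is also the natural test harness, which is consistent with the summary's point that the serializer is easier to test comprehensively once deserialization exists.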