Open
What needs to happen?
Currently, a Python Row (encoded with RowCoder) that goes through serialization and deserialization comes back as a schema-generated named tuple type. This behavior has several caveats:
- The original type is lost.
- [Bug]: Python schema generated types cannot be pickled #22714
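To make the first caveat concrete, here is a minimal stdlib-only sketch of the type loss. It does not use Beam: `roundtrip_without_registry` and `GeneratedSchemaType` are hypothetical stand-ins for an encode/decode pass that keeps only field names and values, which is the assumption being illustrated.

```python
from collections import namedtuple
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


def roundtrip_without_registry(value):
    # Stand-in for encode: keep only what a schema would carry,
    # i.e. the field names and the field values.
    field_names, field_values = value._fields, tuple(value)
    # Stand-in for decode: rebuild a fresh named tuple from the schema alone.
    generated = namedtuple("GeneratedSchemaType", field_names)
    return generated(*field_values)


decoded = roundtrip_without_registry(Point(1, 2))
same_values = tuple(decoded) == (1, 2)        # the data survives
type_preserved = isinstance(decoded, Point)   # the user type does not
```

The values survive, but `decoded` is no longer a `Point`, so any methods or `isinstance` checks tied to the original class stop working downstream.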
With cloudpickle becoming the default and the schema registry and coder registry saved on pipeline submission, we should be able to use the schema id registered in the schema registry to obtain the user type, and then use the coder registry to get the (row) coder registered for that user type. That way, user_type -> GBK still produces user_type.
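The lookup chain proposed here (schema id -> user type -> registered coder) could look roughly like the sketch below. Everything in it is hypothetical: `SCHEMA_REGISTRY`, `CODER_REGISTRY`, `ToyRowCoder`, `register`, and `decode` are plain-Python stand-ins chosen for illustration, not Beam's actual registries or APIs.

```python
from typing import NamedTuple

# Hypothetical stand-ins, not Beam's real registries.
SCHEMA_REGISTRY = {}  # schema id -> user type
CODER_REGISTRY = {}   # user type -> coder


class ToyRowCoder:
    """Toy coder that remembers which user type and schema id it serves."""

    def __init__(self, user_type, schema_id):
        self.user_type = user_type
        self.schema_id = schema_id

    def encode(self, value):
        # Ship the schema id alongside the field values.
        return (self.schema_id, tuple(value))

    def decode_values(self, values):
        # Rebuild the original user type instead of a generated one.
        return self.user_type(*values)


def register(user_type, schema_id):
    SCHEMA_REGISTRY[schema_id] = user_type
    CODER_REGISTRY[user_type] = ToyRowCoder(user_type, schema_id)


def decode(encoded):
    schema_id, values = encoded
    user_type = SCHEMA_REGISTRY[schema_id]   # schema id -> user type
    coder = CODER_REGISTRY[user_type]        # user type -> registered coder
    return coder.decode_values(values)


class Point(NamedTuple):
    x: int
    y: int


register(Point, "point-schema-id")
encoded = CODER_REGISTRY[Point].encode(Point(1, 2))
decoded = decode(encoded)  # comes back as a Point, not a generated type
```

In this toy version both registries live in one process. The issue's premise is that they would be saved at pipeline submission so the same lookup can succeed on workers.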
Issue Priority
Priority: 2 (default / most normal work should be filed as P2)
Issue Components
- Component: Python SDK
- Component: Java SDK
- Component: Go SDK
- Component: Typescript SDK
- Component: IO connector
- Component: Beam YAML
- Component: Beam examples
- Component: Beam playground
- Component: Beam katas
- Component: Website
- Component: Infrastructure
- Component: Spark Runner
- Component: Flink Runner
- Component: Samza Runner
- Component: Twister2 Runner
- Component: Hazelcast Jet Runner
- Component: Google Cloud Dataflow Runner