
Conversation

@dianfu (Contributor) commented May 17, 2019

What is the purpose of the change

This pull request adds a from_elements API to TableEnvironment, a convenience API for creating a table from a collection of elements.
It works as follows (a usage sketch follows the list):

  1. Serializes the Python data to a local file via PickleSerializer
  2. Reads the local file, deserializes the Python objects, and creates a DataStream from the deserialized elements on the Java side
  3. Creates a Table from the created DataStream
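
A minimal usage sketch of the new API (assuming PyFlink is installed; the environment setup follows the Flink 1.9-era Python API, and the element values and field names are illustrative, not taken from the PR):

from pyflink.dataset import ExecutionEnvironment
from pyflink.table import BatchTableEnvironment

# Set up a batch TableEnvironment (1.9-era API; assumed here, not shown in the PR).
env = ExecutionEnvironment.get_execution_environment()
t_env = BatchTableEnvironment.create(env)

# Each tuple becomes one row; the list of names is used as the schema.
table = t_env.from_elements([(1, 'Hi'), (2, 'Hello')], ['id', 'message'])

Under the hood, this triggers exactly the three steps above: the elements are pickled to a local file, deserialized on the Java side into a DataStream, and wrapped in a Table.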

Brief change log

  • Adds the from_elements API to TableEnvironment
  • Adds PythonUtil to create a DataStream/DataSet from a file containing serialized Python objects
  • Adds PythonTableUtil to create a Table from a DataStream
  • Adds PythonTypeUtil to convert the deserialized objects according to the schema

Verifying this change

This change added tests and can be verified as follows:

  • Added unit test test_calc.test_from_elements

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

@flinkbot (Collaborator) commented May 17, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ✅ 1. The [description] looks good.
  • ✅ 2. There is [consensus] that the contribution should go into Flink.
  • ❗ 3. Needs [attention] from.
  • ✅ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details

The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands

The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@sunjincheng121 (Member) commented May 27, 2019

The CI threw a TypeError for a Python test case. I have restarted it. Please make sure all the test cases pass before the review.
@flinkbot approve-until architecture

@sunjincheng121 (Member) left a comment

Thanks for the PR @dianfu!
The PR overall looks good! I have left a few comments. Please let me know what you think.

One open question:
Should we put the Python code into the flink-table-api-java-bridge module? Also, these days I have been thinking about whether we can add Java code to flink-python.

Best,
Jincheng

def __init__(self):
# On Python 2.6, we can't write bytearrays to streams, so we need to convert them
# to strings first. Check if the version number is that old.
self._only_write_strings = sys.version_info[0:2] <= (2, 6)
@sunjincheng121 (Member):

I think we do not need this check, since we only support Python 2.x from 2.7 onward. What do you think?

class FramedSerializer(Serializer):
"""
Serializer that writes objects as a stream of (length, data) pairs,
where C{length} is a 32-bit integer and data is C{length} bytes.
@sunjincheng121 (Member):

What does C{length} mean, i.e., the C prefix?

"""
Serializes objects using Python's pickle serializer:
http://docs.python.org/2/library/pickle.html
@sunjincheng121 (Member):

It's better to follow Python 3, so how about referring to https://docs.python.org/3/library/pickle.html instead?
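
For reference, a minimal round-trip with the Python 3 pickle module the link points to (standard library only; not the PR's code):

import pickle

data = {'id': 1, 'message': 'Hi'}
blob = pickle.dumps(data)          # serialize to bytes
assert pickle.loads(blob) == data  # deserialize back to an equal object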

return map(lambda x: [x], self.load_stream(stream))


class FramedSerializer(Serializer):
@sunjincheng121 (Member):

FramedSerializer is fine with me; I guess the name refers to network terminology (the concept of data frames), and Spark also uses this name. But I would like to leave a suggestion: name it VarLengthDataSerializer, since it is a very specific serializer, i.e., (length, data) pairs, for our case. What do you think?

for obj in iterator:
self._write_with_length(obj, stream)

def load_stream(self, stream):
@sunjincheng121 (Member):

Currently, "stream" and "batch" are a bit confusing in Flink. So how about using dump_to_stream and load_from_stream?
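
For context, a minimal sketch of the (length, data) framing these comments discuss; the struct format and method names are illustrative, not the PR's actual code:

import struct

class FramedSerializerSketch(object):
    # Writes each object as a 4-byte big-endian length prefix followed by
    # that many bytes of data; dumps/loads would come from a concrete
    # subclass, e.g. one based on pickle.

    def _write_with_length(self, obj, stream):
        data = self.dumps(obj)
        stream.write(struct.pack('>i', len(data)))  # 32-bit length prefix
        stream.write(data)

    def _read_with_length(self, stream):
        length = struct.unpack('>i', stream.read(4))[0]
        return self.loads(stream.read(length))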


<!-- python dependencies -->

<dependency>
@sunjincheng121 (Member):

I think we should add the NOTICE file for this dependency.
We can take a look at: https://cwiki.apache.org/confluence/display/FLINK/Licensing

* the objects are ints, etc. Returns an ArrayList if it needs to
* contain arbitrary objects (such as lists).
*
* <p>Unpickle array.array generated by Python 2.6.
@sunjincheng121 (Member):

Do Python 2.7 and 3.x also need these utils?

@dianfu (Contributor, author):

I see your point. +1 to remove the code handling Python <= 2.6.

import org.apache.flink.table.typeutils.PythonTypeUtil
import org.apache.flink.types.Row

object PythonTableUtil {
@sunjincheng121 (Member):

PythonTableUtil -> PythonTableUtils?
Same as TypeCheckUtils.

* Utility class that contains helper methods to create a DataStream/DataSet from
* a file which contains Python objects.
*/
public final class PythonUtil {
@sunjincheng121 (Member):

PythonUtil -> PythonUtils?

import org.apache.flink.table.api.Types
import org.apache.flink.types.Row

object PythonTypeUtil {
@sunjincheng121 (Member):

Can we place this util in org.apache.flink.table.python?
BTW: since this class contains only one public method, convertTo, and cannot be used by Java users, I suggest moving convertTo into PythonTableUtil as a private method. What do you think?

@aljoscha (Contributor) commented May 27, 2019

I agree with @sunjincheng121, we must not put Python-related code into unrelated packages.

@sunjincheng121 (Member):

Thanks for the reply @aljoscha!

As we (@aljoscha, @dianfu, and I) discussed offline, we reached a consistent solution: we will put the Python-related Java code into the flink-python module and add a classifier named java-binding for the JAR. So, I would appreciate it if you could update the PR according to our decision. @dianfu :)
Best, Jincheng

@dianfu (Contributor, author) commented May 29, 2019

Sounds good. I will rebase the PR.

@dianfu (Contributor, author) commented May 29, 2019

@sunjincheng121 Have addressed your comments and also moved all the Python related Java files to module flink-python.

@dianfu force-pushed the FLINK-12409 branch 2 times, most recently from 1f294dd to 40015b6 on May 29, 2019 at 12:03.
@sunjincheng121 (Member) left a comment

Thanks for the update @dianfu!
I only left a few suggestions. One thing I am not quite sure about is LICENSE.py4j: I think we should put it into both the licenses and licenses-binary folders. I think @zentol and @aljoscha can give us more authoritative advice.

Best,
Jincheng

FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
@sunjincheng121 (Member):

Do you think we should also put LICENSE.py4j into the licenses folder, since we have put py4j-xxx-src.zip into the Flink source repo? What do you think?

@dianfu (Contributor, author), May 30, 2019:

Good catch. I think we should, as py4j-xxx-src.zip is currently bundled in the source distribution.

docs/ops/cli.md (outdated):
- Run Python Table program:

./bin/flink run -py examples/python/table/batch/word_count.py -j <path/to/flink-table.jar>
./bin/flink run -py examples/python/table/batch/word_count.py -C <path/to/flink-table.jar> -C <path/to/flink-python-*-java-binding.jar>
@sunjincheng121 (Member):

Here is one concern: this way, we ask the user to keep the path consistent with the cluster, which is a bit inconvenient. So how about having the Flink framework deal with flink-python-*-java-binding.jar and keeping -j for flink-table.jar? In the future a Python user may be a Python Table user or a Python DataStream user, so flink-table.jar is not always needed. What do you think?

@dianfu (Contributor, author):

Makes sense to me, as flink-python-*-java-binding.jar is always needed for Python jobs.

:param elements: The elements to create table from.
:param schema: The schema of the table.
:param verify_schema: Whether to verify the elements against the schema.
@sunjincheng121 (Member):

Okay, cool: if some data is generated by the program and we are already sure it does not need checking, in that case we can set verify_schema=False.
+1
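
A minimal sketch of that case (reusing the t_env from the earlier sketch; the generated rows are illustrative):

# Rows produced by the program are already known to match the schema,
# so the per-element verification can safely be skipped.
table = t_env.from_elements(
    [(i, str(i)) for i in range(1000)],
    ['id', 'value'],
    verify_schema=False)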

@sunjincheng121 (Member):

@flinkbot attention @zentol

@dianfu (Contributor, author) commented May 30, 2019

@sunjincheng121 Thanks a lot for your review. Have updated the PR accordingly.

@sunjincheng121 (Member):

Thanks for the quick update! @dianfu
LGTM.
From my side, +1 to merge.

@sunjincheng121 (Member):

Thanks for the rebase, Merging...

sunjincheng121 pushed a commit to sunjincheng121/flink that referenced this pull request Jun 3, 2019
Note: currently only the Flink planner is supported. The Blink planner will be supported after planner discovery is supported, which is part of the work of FLIP-32.

This closes apache#8474
asfgit closed this in f27c40d on Jun 3, 2019.
This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).

flink-python
@zentol (Contributor):

We already have a flink-python module in flink-dist, so we now have ambiguous licensing declarations. @sunjincheng121

@sunjincheng121 (Member):

Thanks for pointing this out @zentol!
In our (@aljoscha and me) plan, we want to remove flink-libraries/flink-python later; for now, we add the classifier "java-binding" for the JAR. On the licensing I agree with you: we really have ambiguous licensing declarations here. :(
We thought carefully about whether to change the module name, but the name flink-python is too apt, other names are not suitable, and the long-term goal is to remove flink-libraries/flink-python, so we still want to use the name flink-python. In order to eliminate the ambiguity, could we add a short explanation in the licensing? Or do you have better suggestions?

@zentol (Contributor):

Well, you could just call it flink-api-python or something; that doesn't seem that inappropriate.

I don't believe we should be re-using artifactIds between releases if they are fundamentally different artifacts; it is for the same reason that we have not added another flink-ml module but named it flink-ml-parent instead.

@sunjincheng121 (Member), Jun 9, 2019:

Currently, module names in Flink follow a flink-<language> pattern, such as flink-java and flink-scala. The goal for flink-python is to add user-defined function support in the next release; for UDFs we will add a data service, a state service, etc., so it is not only at the API level. These functions need to be integrated with Beam. The amount of code inside Flink will not be very large, so the current plan is to distinguish these pieces by packages or folders (for Python). So, if we should not re-use the artifactId, how about using flink-py? @zentol @aljoscha

BTW: I would appreciate it if we could discuss this module name issue in the [DISCUSS] FLIP-38 Support python language in flink Table API thread: https://docs.google.com/document/d/1ybYt-0xWRMa1Yf5VsuqGRtOfJBz4p74ZmDxZYg3j_h8/edit?usp=sharing

@rmetzger requested a review from zentol on June 5, 2019 at 11:37.
@dianfu deleted the FLINK-12409 branch on June 10, 2020 at 03:04.