Skip to content

Conversation

@HuangXingBo
Copy link
Contributor

@HuangXingBo HuangXingBo commented Aug 26, 2022

What is the purpose of the change

This pull request will optimize PyFlink tests

Brief change log

  • Remove the warnings in the Python tests
  • Make pyflink tests share same Minicluster to reduce the startup time

Verifying this change

This change added tests and can be verified as follows:

  • Original tests

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

@flinkbot
Copy link
Collaborator

flinkbot commented Aug 26, 2022

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@HuangXingBo
Copy link
Contributor Author

@flinkbot run azure

@zentol
Copy link
Contributor

zentol commented Aug 26, 2022

This doesn't seem to have the desired effect; the runtime of flink-python is still 51m, like on master. :(

@HuangXingBo
Copy link
Contributor Author

Due to the different performance of machines in the Azure, there is actually a deviation of about 30~40 minutes at most. Sometimes, the test time on the master can be exceed to 1.5 hours. As a whole, the average test time of running three versions of Python can be controlled within 2 hours, which can effectively avoid the situation that the nightly test exceeds 4 hours https://dev.azure.com/hxbks2ks/FLINK-TEST/_build/results?buildId=2036&view=logs&j=fba17979-6d2e-591d-72f1-97cf42797c11 . However, I will try again to see if the test time can be controlled within 30 minutes, so as to deal with the situation of timeout after more tests added in the future. The longer the test time of this release than last release is due to many new tests are added.

@HuangXingBo
Copy link
Contributor Author

@HuangXingBo
Copy link
Contributor Author

@flinkbot run azure

1 similar comment
@HuangXingBo
Copy link
Contributor Author

@flinkbot run azure

Copy link
Contributor

@dianfu dianfu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HuangXingBo Nice work! Have left a few minor comments.

fastavro>=1.1.0,<1.4.8
grpcio>=1.29.0,<1.47
grpcio-tools>=1.3.5,<=1.14.2
grpcio-tools>=1.29.0,<=1.46.3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to keep the version limit consistent between grpcio and grpcio-tools? That's either >=1.29.0,<1.47 or >=1.29.0,<1.46.3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Make sense.

* Starts a Flink mini cluster as a resource and registers the respective
* StreamExecutionEnvironment.
*/
public class MiniClusterWithClientResource {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not reuse the MiniClusterWithClientResource defined in flink-test-utils module?

HuangXingBo added a commit that referenced this pull request Sep 5, 2022
HuangXingBo added a commit that referenced this pull request Sep 5, 2022
HuangXingBo added a commit that referenced this pull request Sep 5, 2022
HuangXingBo added a commit that referenced this pull request Sep 5, 2022
HuangXingBo added a commit that referenced this pull request Sep 5, 2022
HuangXingBo added a commit that referenced this pull request Sep 5, 2022
HuangXingBo added a commit that referenced this pull request Sep 5, 2022
@kgrimsby
Copy link

kgrimsby commented Sep 26, 2022

Hi, doesn't this commit introduce a bug?

I've tried to setup from flink release-1.16 branch and found that the changes in pyflink/fn_execution/flink_fn_execution_pb2.py at DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'...') will never return a descriptor since the code for AddSerializedFile in protbuf <3.18 has no return on that function:

def AddSerializedFile(self, serialized_file_desc_proto):
    """Adds the FileDescriptorProto and its types to this pool.
    Args:
      serialized_file_desc_proto (bytes): A bytes string, serialization of the
        :class:`FileDescriptorProto` to add.
    """

    # pylint: disable=g-import-not-at-top
    from google.protobuf import descriptor_pb2
    file_desc_proto = descriptor_pb2.FileDescriptorProto.FromString(
        serialized_file_desc_proto)
    self.Add(file_desc_proto)

Am I missing something here?

huangxiaofeng10047 pushed a commit to huangxiaofeng10047/flink that referenced this pull request Nov 3, 2022
huangxiaofeng10047 pushed a commit to huangxiaofeng10047/flink that referenced this pull request Nov 3, 2022
huangxiaofeng10047 pushed a commit to huangxiaofeng10047/flink that referenced this pull request Nov 3, 2022
huangxiaofeng10047 pushed a commit to huangxiaofeng10047/flink that referenced this pull request Nov 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants