Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-22052][python] Add FLIP-142 public classes to python API #15441

Closed
wants to merge 3 commits into from

Conversation

sjwiesman
Copy link
Contributor

What is the purpose of the change

This adds the new public APIs from FLIP-142 to the Python DataStream API. It is based on top of #15429 so only the last 3 commits are relevant.

Brief change log

96aa09c Add new state backend classes and sync updated JavaDoc

7193a21 Add new checkpoint storage classes

57c27b5 Add new methods to StreamExecutionEnvironment

Verifying this change

Added new Python Unit Tests. Note to the reviewer, I had some issues running the python tests locally. I will monitor the CI to ensure everything passes.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 31, 2021

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 43c1bd1 (Thu Sep 23 17:28:42 UTC 2021)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 31, 2021

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

Copy link
Contributor

@dianfu dianfu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjwiesman Thanks for the PR.

flink-python/pyflink/datastream/checkpoint_storage.py Outdated Show resolved Hide resolved
flink-python/pyflink/datastream/checkpoint_storage.py Outdated Show resolved Hide resolved
flink-python/pyflink/datastream/checkpoint_storage.py Outdated Show resolved Hide resolved
flink-python/pyflink/datastream/checkpoint_storage.py Outdated Show resolved Hide resolved
flink-python/pyflink/datastream/checkpoint_storage.py Outdated Show resolved Hide resolved
file_state_size_threshold,
write_buffer_size)

self._j_filesystem_checkpoint_storage = j_filesystem_checkpoint_storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_j_filesystem_checkpoint_storage could be removed

flink-python/pyflink/datastream/checkpoint_storage.py Outdated Show resolved Hide resolved
flink-python/pyflink/datastream/state_backend.py Outdated Show resolved Hide resolved
flink-python/pyflink/datastream/state_backend.py Outdated Show resolved Hide resolved
j_predefined_options = self._j_embedded_rocks_db_state_backend.getPredefinedOptions()
return PredefinedOptions._from_j_predefined_options(j_predefined_options)

def set_options(self, options_factory_class_name: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's not enough to only pass the class name of the options_factory. For example, for DefaultConfigurableOptionsFactory, there are many methods in it to configure it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is fine (it's actually copied directly from RocksDBStateBackend). This is just loading a factory class that is defined in Java. We could think about making the factory definable in Python but that is a separate issue.

@sjwiesman
Copy link
Contributor Author

@dianfu thank you for the ver fast review. I've applied all your suggestions. Please take another look and let me know if you have any other concerns.

Copy link
Contributor

@dianfu dianfu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjwiesman Thanks a lot for the update. LGTM overall with just one minor comment. Feel free to merge it after addressing this issue.

:param directory The savepoint directory
:return: This object.
"""
return self._j_stream_execution_environment.setDefaultSavepointDirectory(directory)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should return self

@sjwiesman
Copy link
Contributor Author

Thanks @dianfu. I've made the final fix. Since this is based #15429 I will wait on that PR to finish before merging.

@sjwiesman
Copy link
Contributor Author

@dianfu there are some relevant test failures that seem to be related. I am having trouble debugging and was hoping you could take another look.

@dianfu
Copy link
Contributor

dianfu commented Apr 2, 2021

@sjwiesman Sure. I'll take a look asap!

Copy link
Contributor

@dianfu dianfu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjwiesman The test failures should be caused by the following issues.

raise TypeError("%s is not an instance of CheckpointStorage." % j_checkpoint_storage)

if get_java_class(JJobManagerCheckpointStorage).isAssignableFrom(j_clz):
return JobManagerCheckpointStorage(j_jobmanager_checkpoint_storage=j_clz)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return JobManagerCheckpointStorage(j_jobmanager_checkpoint_storage=j_clz)
return JobManagerCheckpointStorage(j_jobmanager_checkpoint_storage= j_checkpoint_storage)

if get_java_class(JJobManagerCheckpointStorage).isAssignableFrom(j_clz):
return JobManagerCheckpointStorage(j_jobmanager_checkpoint_storage=j_clz)
elif get_java_class(JFileSystemCheckpointStorage).isAssignableFrom(j_clz):
return FileSystemCheckpointStorage(j_filesystem_checkpoint_storage=j_clz)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return FileSystemCheckpointStorage(j_filesystem_checkpoint_storage=j_clz)
return FileSystemCheckpointStorage(j_filesystem_checkpoint_storage= j_checkpoint_storage)


def test_create_jobmanager_checkpoint_storage(self):

self.assertItNotNone(JobManagerCheckpointStorage())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.assertItNotNone(JobManagerCheckpointStorage())
self.assertIsNotNone(JobManagerCheckpointStorage())

checkpoint_path = JPath(checkpoint_path)
if max_state_size is None:
max_state_size = JJobManagerCheckpointStorage.DEFAULT_MAX_STATE_SIZE
j_jobmanager_checkpoint_storage = JobManagerCheckpointStorage(checkpoint_path,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
j_jobmanager_checkpoint_storage = JobManagerCheckpointStorage(checkpoint_path,
j_jobmanager_checkpoint_storage = JJobManagerCheckpointStorage(checkpoint_path,

checkpoint_path = JPath(checkpoint_path)

if file_state_size_threshold is None:
file_state_size_threshold = FileSystemCheckpointStorage.MAX_FILE_STATE_THRESHOLD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that we should set it as -1 if not specified.


:return: The number of threads used to transfer files while snapshotting/restoring.
"""
return self._j_state_backend.getNumberOfTransferingThreads()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return self._j_state_backend.getNumberOfTransferingThreads()
return self._j_state_backend.getNumberOfTransferThreads()

@sjwiesman sjwiesman force-pushed the state-backend-py branch 2 times, most recently from 8b19876 to a4daf4b Compare April 2, 2021 12:51
@dianfu
Copy link
Contributor

dianfu commented Jun 8, 2021

@sjwiesman What's the status of this PR?

@sjwiesman
Copy link
Contributor Author

@dianfu I'm very sorry, this fell off my radar. I've just rebased the branch, if CI is good we can go ahead and merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants