
Conversation


@jaslou jaslou commented Aug 2, 2019

What is the purpose of the change

(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring the task information on each deployment (during recovery).)

Brief change log

(for example:)

  • The TaskInfo is stored in the blob store on job creation time as a persistent artifact
  • Deployments RPC transmits only the blob storage reference
  • TaskManagers retrieve the TaskInfo from the blob cache (a minimal sketch of this pattern follows below)
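
Purely to illustrate the pattern this example describes, here is a minimal Java sketch with hypothetical BlobStore, TaskInfo, and TaskManagerBlobCache types (the real Flink classes differ): the deployment RPC carries only a blob key, and the TaskManager resolves and caches the payload locally.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for the real Flink classes.
record TaskInfo(String jobId, byte[] payload) {}

class BlobStore {
    private final Map<String, TaskInfo> blobs = new ConcurrentHashMap<>();

    // Stored once at job creation time; only the key travels over RPC.
    String put(TaskInfo info) {
        String key = info.jobId() + "/task-info";
        blobs.put(key, info);
        return key;
    }

    TaskInfo get(String key) {
        return blobs.get(key);
    }
}

class TaskManagerBlobCache {
    private final BlobStore store;
    private final Map<String, TaskInfo> cache = new ConcurrentHashMap<>();

    TaskManagerBlobCache(BlobStore store) {
        this.store = store;
    }

    // The deployment RPC transmits only 'key'; repeated deployments
    // (e.g. during recovery) hit the local cache instead of re-transferring
    // the payload.
    TaskInfo resolve(String key) {
        return cache.computeIfAbsent(key, store::get);
    }
}
```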

Verifying this change

(Please pick one of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (100MB)
  • Extended integration test for recovery after master (JobManager) failure
  • Added test that validates that TaskInfo is transferred only once across recoveries
  • Manually verified the change by running a 4-node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

zentol and others added 30 commits July 12, 2019 11:48
…ng to PubSubSink serializer and emulator settings
  - Some JavaDoc comments
  - Make the class final, because several methods are not designed to handle inheritance well.
  - Avoid repeated string concatenation/building
…resource profile

This change is covered by various existing integration tests that failed prior to this fix.
Before, exceptions that occurred after cancelling a source (as the
KafkaConsumer did, for example) would make a job fail when attempting a
"stop-with-savepoint". Now we ignore those exceptions.
…ial revert of FLINK-11458): use single threaded Task's dispatcher thread pool
…on in the blocking method in case of spurious wakeups
This commit reworks the JSON format to use a runtime converter created based
on the given TypeInformation. Before this commit, the conversion logic was
based on reference comparison of TypeInformation, which did not work after
the format had been serialized.

This also introduces a builder pattern to ensure future immutability
of schemas.

This closes #7932.
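
As a hedged illustration of that builder pattern (hypothetical JsonSchema name; the actual format classes differ), the idea is that all mutation happens on the builder and the built schema is an immutable snapshot:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of an immutable schema with a builder; the real
// Flink JSON format classes and names differ.
final class JsonSchema {
    private final List<String> fieldNames;

    private JsonSchema(List<String> fieldNames) {
        this.fieldNames = Collections.unmodifiableList(new ArrayList<>(fieldNames));
    }

    List<String> fieldNames() {
        return fieldNames;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private final List<String> fieldNames = new ArrayList<>();

        Builder field(String name) {
            fieldNames.add(name);
            return this;
        }

        // All mutation happens on the builder; build() hands out an
        // immutable snapshot that later builder calls cannot affect.
        JsonSchema build() {
            return new JsonSchema(fieldNames);
        }
    }
}
```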
This PR makes HiveTableSink implement OverwritableTableSink.

This closes #9067.
…talog when creating sink for CatalogTable

The planner should first try to get the table factory from the catalog when creating table sinks for a CatalogTable.

This closes #9039.
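
A minimal sketch of that lookup order, with hypothetical interfaces standing in for the planner and catalog APIs: the catalog's own factory wins, and generic factory discovery is only the fallback.

```java
import java.util.Optional;

// Hypothetical interfaces; the real planner and catalog APIs differ.
interface TableSink {}
interface TableFactory { TableSink createSink(String tablePath); }
interface Catalog { Optional<TableFactory> getTableFactory(); }

class SinkResolver {
    // First ask the catalog for its table factory; fall back to a
    // discovery-based factory only if the catalog does not provide one.
    static TableSink createSink(Catalog catalog, String tablePath) {
        return catalog.getTableFactory()
                .map(factory -> factory.createSink(tablePath))
                .orElseGet(() -> discoverFactory().createSink(tablePath));
    }

    private static TableFactory discoverFactory() {
        // Placeholder for a ServiceLoader-style factory lookup.
        return tablePath -> new TableSink() {};
    }
}
```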
This PR adds comprehensive documentation for unified catalog APIs and catalogs.

The ticket for corresponding Chinese documentation is FLINK-13086.

This closes #8976.
This PR integrates FunctionCatalog with Catalog APIs.

This closes #8920.
…LI SessionContext

This PR supports remembering the current catalog and database that users set in the SQL CLI SessionContext.

This closes #9049.
wuchong and others added 23 commits August 2, 2019 10:09
Add an areTypesCompatible() method to LogicalTypeChecks. It compares two LogicalTypes while ignoring field names and other logical attributes (e.g. description, isFinal).
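
A toy illustration of that comparison (hypothetical RowType/Field model; the real LogicalType hierarchy is much richer): compatibility compares the logical structure positionally and deliberately ignores field names and the description.

```java
import java.util.List;

// Toy type model; the real LogicalType hierarchy is far richer.
record Field(String name, String typeRoot) {}
record RowType(List<Field> fields, String description) {}

class LogicalTypeChecksSketch {
    // Two row types are "compatible" if their field types match
    // positionally; field names and the description are ignored.
    static boolean areTypesCompatible(RowType a, RowType b) {
        if (a.fields().size() != b.fields().size()) {
            return false;
        }
        for (int i = 0; i < a.fields().size(); i++) {
            String leftRoot = a.fields().get(i).typeRoot();
            String rightRoot = b.fields().get(i).typeRoot();
            if (!leftRoot.equals(rightRoot)) {
                return false;
            }
        }
        return true;
    }
}
```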
This commit combines HBaseTableSourceITCase, HBaseLookupFunctionITCase, and HBaseConnectorITCase into one class.
This saves us considerable cluster initialization time.

This closes #9275
…emantics fixed per partition type

In the long term we do not need auto-release semantics for blocking (persistent) partitions. We expect them always to be released externally by the JM and assume they can be consumed multiple times.

Pipelined partitions always have only one consumer and one consumption attempt. Afterwards they can always be released automatically.

ShuffleDescriptor.ReleaseType was introduced to make release semantics more flexible, but it is not needed in the long term.

FORCE_PARTITION_RELEASE_ON_CONSUMPTION was introduced as a safety net to be able to fall back to the 1.8 behaviour, without the partition tracker and the JM taking care of blocking partition release. We can make this option specific to NettyShuffleEnvironment, which was the only existing shuffle service before. If it is activated, the blocking partition is also auto-released on a consumption attempt, as it was before. In this case, fine-grained recovery will simply not find the partition after the job restart and will restart the producer.
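
A simplified sketch of these semantics (hypothetical names; the real NettyShuffleEnvironment logic is more involved): pipelined partitions are always released after their single consumption, while blocking partitions are left to the JM unless the fallback option forces release on consumption.

```java
// Hypothetical sketch of the release decision; the actual Flink code differs.
enum PartitionType { PIPELINED, BLOCKING }

class ReleasePolicy {
    // Models FORCE_PARTITION_RELEASE_ON_CONSUMPTION, the safety-net option
    // restoring the pre-partition-tracker (1.8) behaviour.
    private final boolean forceReleaseOnConsumption;

    ReleasePolicy(boolean forceReleaseOnConsumption) {
        this.forceReleaseOnConsumption = forceReleaseOnConsumption;
    }

    // Pipelined partitions have exactly one consumer and one consumption
    // attempt, so they are always released once consumed. Blocking
    // partitions are released externally by the JM, unless the fallback
    // option forces release on consumption.
    boolean releaseOnConsumption(PartitionType type) {
        return type == PartitionType.PIPELINED || forceReleaseOnConsumption;
    }
}
```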
Make MultiTaskSlot unavailable for allocation while it is releasing its children,
to avoid a ConcurrentModificationException.

This closes #9288.
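
A minimal sketch of that guard, with hypothetical names: while the children are being released, the slot reports itself as unavailable, so the collection being iterated is never mutated by a concurrent allocation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real MultiTaskSlot is considerably more involved.
class MultiTaskSlotSketch {
    private final List<Object> children = new ArrayList<>();
    private boolean releasingChildren;

    boolean isAvailableForAllocation() {
        // Refuse new allocations while children are being released, so the
        // loop in release() never iterates a list that is mutated underneath
        // it (the ConcurrentModificationException the fix avoids).
        return !releasingChildren;
    }

    void allocateChild(Object child) {
        if (!isAvailableForAllocation()) {
            throw new IllegalStateException("Slot is releasing its children");
        }
        children.add(child);
    }

    void release() {
        releasingChildren = true;
        try {
            for (Object child : children) {
                // Release each child here.
            }
            children.clear();
        } finally {
            releasingChildren = false;
        }
    }
}
```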

flinkbot commented Aug 2, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit bf99b26 (Tue Aug 06 15:59:02 UTC 2019)

Warnings:

  • 14 pom.xml files were touched: Check for build and licensing issues.

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier


flinkbot commented Aug 2, 2019

CI report:


jaslou commented Aug 4, 2019

I'm so sorry about that; I didn't mean to do it. It was an accidental operation on my part.
