
Conversation

@Paper-plane123

What is the purpose of the change

(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)

Brief change log

(for example:)

  • The TaskInfo is stored in the blob store at job creation time as a persistent artifact
  • Deployments RPC transmits only the blob storage reference
  • TaskManagers retrieve the TaskInfo from the blob cache
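
A rough illustration of that example workflow follows; all names here are hypothetical stand-ins for illustration, not real Flink APIs:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    final class BlobDeploymentSketch {

        // Stand-in for a persistent blob store keyed by an opaque reference.
        static final Map<String, byte[]> BLOB_STORE = new ConcurrentHashMap<>();

        // On job creation: persist the serialized TaskInfo once; afterwards
        // only this small key travels over RPC, not the payload itself.
        static String storeTaskInfo(byte[] serializedTaskInfo) {
            String blobKey = "task-info-" + java.util.Arrays.hashCode(serializedTaskInfo);
            BLOB_STORE.put(blobKey, serializedTaskInfo);
            return blobKey;
        }

        // On the TaskManager: resolve the reference from the (cached) blob store.
        static byte[] fetchTaskInfo(String blobKey) {
            return BLOB_STORE.get(blobKey); // a real cache would fetch on miss
        }
    }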

Verifying this change

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (100MB)
  • Extended integration test for recovery after master (JobManager) failure
  • Added test that validates that TaskInfo is transferred only once across recoveries
  • Manually verified the change by running a 4-node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

StephanEwen and others added 30 commits July 26, 2019 19:16
FLINK-13249 was a bug where a deadlock occurred when the network thread got blocked on a lock
while requesting partitions to be read by remote channels. The test mimics that situation
to guard the fix applied in an earlier commit.
…stUtils

The method Unsafe.defineClass() is removed in Java 11. To support Java 11, we rework the method
"CommonTestUtils.createClassNotInClassPath()" to use a different mechanism.

This commit now writes the class byte code out to a temporary file and creates a new URLClassLoader that
loads the class from that file. That solution is not a complete drop-in replacement, because it cannot
add the class to an existing class loader, but can only create a new pair of (classloader & new-class-in-that-classloader).
Because of that, the commit also adjusts the existing tests to work with that new mechanism.

This closes #9251
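
A hedged sketch of the temp-file mechanism described above (the class and method names are illustrative, not the actual CommonTestUtils code):

    import java.io.IOException;
    import java.net.URL;
    import java.net.URLClassLoader;
    import java.nio.file.Files;
    import java.nio.file.Path;

    final class TempFileClassLoading {

        // Writes the class byte code to a temp directory laid out like a
        // class path, then builds a fresh URLClassLoader over it. The class
        // ends up in a NEW class loader; it cannot be injected into an
        // existing one.
        static ClassLoader writeAndLoad(String className, byte[] byteCode) throws IOException {
            Path tempDir = Files.createTempDirectory("generated-classes");
            Path classFile = tempDir.resolve(className.replace('.', '/') + ".class");
            Files.createDirectories(classFile.getParent());
            Files.write(classFile, byteCode);
            return new URLClassLoader(new URL[] {tempDir.toUri().toURL()});
        }
    }
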
…ing partition request

On the producer side, the netty handler receives the CancelPartitionRequest for releasing the SubpartitionView resource.
In the previous implementation we tried to find the corresponding view via the available queue in PartitionRequestQueue. But
in reality the view does not always stay in that queue, so the view would never be released.

Furthermore, the release of ResultPartition/ResultSubpartitions is based on the reference counter in ReleaseOnConsumptionResultPartition,
but while handling the CancelPartitionRequest in PartitionRequestQueue, the ReleaseOnConsumptionResultPartition is never
notified of the consumed subpartition. That means the reference counter would never decrease to 0 to trigger the partition release,
which would cause a file resource leak in the case of BoundedBlockingSubpartition.

To fix the above two issues, the corresponding view is now released via the queue of all readers instead, which also calls
ReleaseOnConsumptionResultPartition#onConsumedSubpartition and thereby solves this bug.
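
A minimal sketch of the reference-counting release described above (illustrative only, not the actual ReleaseOnConsumptionResultPartition code):

    import java.util.concurrent.atomic.AtomicInteger;

    final class RefCountedPartition {

        private final AtomicInteger pendingSubpartitions;
        private final Runnable releaseAction;

        RefCountedPartition(int numSubpartitions, Runnable releaseAction) {
            this.pendingSubpartitions = new AtomicInteger(numSubpartitions);
            this.releaseAction = releaseAction;
        }

        // Must be called once per consumed (or cancelled) subpartition. If a
        // cancellation path skips this call, the counter never reaches zero
        // and the partition's resources (e.g. the files of a blocking
        // subpartition) leak.
        void onConsumedSubpartition() {
            if (pendingSubpartitions.decrementAndGet() == 0) {
                releaseAction.run();
            }
        }
    }
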
Currently, test cases fail when trying to close the output stream if all data has been written
but a ClosedByInterruptException occurs in the ending phase. This commit fixes that.

This closes #9235
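
A small sketch of the kind of close-path tolerance the commit describes, assuming a FileChannel-based stream (illustrative, not the actual fix):

    import java.io.IOException;
    import java.nio.channels.ClosedByInterruptException;
    import java.nio.channels.FileChannel;

    final class InterruptTolerantClose {

        // If the writing thread is interrupted while closing, the channel
        // throws ClosedByInterruptException even though every byte was
        // already written. Treat that as a successful close and restore the
        // interrupt flag for the caller.
        static void close(FileChannel channel) throws IOException {
            try {
                channel.close();
            } catch (ClosedByInterruptException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
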
Only kill the Yarn application if it does not terminate properly.

This closes #9175.
…ed memory size into wrong configuration instance.

[FLINK-13241][yarn][test] Update YarnResourceManagerTest#testCreateSlotsPerWorker to compute tmCalculatedResourceProfile based on the RM altered configuration.

[FLINK-13241][yarn][test] Update YarnConfigurationITCase to verify that TMs are started with correct managed memory size.

[FLINK-13241][runtime] Calculate and set the managed memory size outside of the ResourceManager.

[FLINK-13241][runtime/yarn][test] Move YarnResourceManagerTest#testCreateSlotsPerWorker to ResourceManagerTest#testCreateWorkerSlotProfiles, and update it to verify slot profile calculation with a determinate managed memory size.

[FLINK-13241][runtime] Move getResourceManagerConfiguration from ResourceManagerFactory to ResourceManagerUtil.

This closes #9246.
docete and others added 26 commits August 9, 2019 11:05
… function and DIV(), DIV_INT() function from blink planner

This commit removes the BITAND, BITOR, BITNOT, BITXOR scalar functions because they are not standard.
It also removes DIV() and DIV_INT() because we already have the "/" and "/INT" operators.
… keep it compatible with old planner

The AVG aggregate function in the blink planner always returned a double/decimal type, which is not standard.
…de of "explainTerms" to generate operator names
…to keep it compatible with old planner

CONCAT(string1, string2, ...) should return NULL if any argument is NULL.
CONCAT_WS(sep, string1, string2, ...) should return NULL if sep is NULL, and it automatically skips NULL arguments.
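
A minimal Java sketch of these NULL semantics (illustrative, not the planner's implementation):

    import java.util.StringJoiner;

    final class ConcatSemantics {

        // CONCAT returns NULL if any argument is NULL.
        static String concat(String... args) {
            StringBuilder sb = new StringBuilder();
            for (String a : args) {
                if (a == null) {
                    return null;
                }
                sb.append(a);
            }
            return sb.toString();
        }

        // CONCAT_WS returns NULL only if the separator is NULL; NULL
        // arguments are skipped rather than propagated.
        static String concatWs(String sep, String... args) {
            if (sep == null) {
                return null;
            }
            StringJoiner joiner = new StringJoiner(sep);
            for (String a : args) {
                if (a != null) {
                    joiner.add(a);
                }
            }
            return joiner.toString();
        }
    }
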
…RING type instead of BINARY

This fixes the behavior of FROM_BASE64() to align with the old planner.
…ALUE(), SUBSTR() builtin functions which are not standard.

LENGTH, SUBSTR, KEYVALUE can be covered by existing functions, e.g. CHAR_LENGTH, SUBSTRING, STR_TO_MAP(str)[key].
…nCallResolver for class name more meaningful.

This closes #9281
… stream group aggregate in FlinkRelMdColumnInterval

This closes #9346
…nt toString method to explain more info

This closes #9347
…base crashes sql-client

Avoid crashing the sql-client when switching to a non-existing catalog or database.

This closes #9399.
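
A minimal sketch of that defensive behavior; the CatalogManager interface here is a hypothetical stand-in, not the actual sql-client code:

    final class UseCatalogCommand {

        // Hypothetical stand-in for the catalog API the CLI talks to;
        // setCurrentCatalog throws if the catalog is unknown.
        interface CatalogManager {
            void setCurrentCatalog(String name);
        }

        // Report the failure to the user and keep the CLI loop alive instead
        // of letting the exception propagate and crash the client.
        static void callUseCatalog(CatalogManager catalogManager, String name) {
            try {
                catalogManager.setCurrentCatalog(name);
            } catch (RuntimeException e) {
                System.err.println("Could not switch catalog: " + e.getMessage());
            }
        }
    }
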
Hive documentation is currently spread across a number of pages and fragmented. In particular:

- An example was added to getting-started/examples; however, that section is being removed
- There is a dedicated page on Hive integration, but a lot of Hive-specific information also lives on the catalog page

This closes #9308.
Fix the issue that Flink cannot access Hive tables with decimal columns.

This closes #9390.
…in blink planner to fix TPC-H e2e test failed

This closes #9427
@flinkbot
Collaborator

flinkbot commented Nov 20, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 5a5966b (Wed Dec 04 15:52:27 UTC 2019)

Warnings:

  • 148 pom.xml files were touched: Check for build and licensing issues.
  • Invalid pull request title: No valid Jira ID provided

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Collaborator

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run travis to re-run the last Travis build
