
Conversation

@ArtHustonHitachi

Thank you very much for contributing to Apache Flink - we are happy that you want to help us improve Flink. To help the community review your contribution in the best possible way, please go through the checklist below, which will get the contribution into a shape in which it can be best reviewed.

Please understand that we do not do this to make contributions to Flink a hassle. In order to uphold a high standard of quality for code contributions, while at the same time managing a large number of contributions, we need contributors to prepare the contributions well, and give reviewers enough contextual information for the review. Please also understand that contributions that do not follow this guide will take longer to review and thus typically be picked up with lower priority by the community.

Contribution Checklist

  • Make sure that the pull request corresponds to a JIRA issue. Exceptions are made for typos in JavaDoc or documentation files, which need no JIRA issue.

  • Name the pull request in the form "[FLINK-XXXX] [component] Title of the pull request", where FLINK-XXXX should be replaced by the actual issue number. Skip component if you are unsure about which is the best component.
    Typo fixes that have no associated JIRA issue should be named following this pattern: [hotfix] [docs] Fix typo in event time introduction or [hotfix] [javadocs] Expand JavaDoc for PunctuatedWatermarkGenerator.

  • Fill out the template below to describe the changes contributed by the pull request. That will give reviewers the context they need to do the review.

  • Make sure that the change passes the automated tests, i.e., mvn clean verify passes. You can set up Travis CI to do that following this guide.

  • Each pull request should address only one issue, not mix up code from multiple issues.

  • Each commit in the pull request should have a meaningful commit message (including the JIRA id).

  • Once all items of the checklist are addressed, remove the above text and this checklist, leaving only the filled out template below.

(The sections below can be removed for hotfixes of typos)

What is the purpose of the change

(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)

Brief change log

(for example:)

  • The TaskInfo is stored in the blob store at job creation time as a persistent artifact
  • The deployment RPC transmits only the blob storage reference
  • TaskManagers retrieve the TaskInfo from the blob cache
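
A minimal, hypothetical sketch of the example above (the BlobStore interface, the key handling, and the method names are illustrative stand-ins, not actual Flink classes): the TaskInfo is written to the blob store once, and only its key travels over the deployment RPC.

```java
// Illustrative only: models the flow described in the example change log.
final class BlobDeploymentSketch {

    /** Hypothetical blob store abstraction; the TaskManager side is a cache. */
    interface BlobStore {
        String put(byte[] payload);   // returns a blob key
        byte[] get(String blobKey);   // TaskManagers read through their blob cache
    }

    /** Done once at job creation time; the blob is a persistent artifact. */
    static String storeTaskInfo(BlobStore blobStore, byte[] serializedTaskInfo) {
        return blobStore.put(serializedTaskInfo);
    }

    /** The deployment RPC carries only the key; the payload is fetched locally. */
    static byte[] fetchTaskInfoOnDeployment(BlobStore blobCache, String blobKey) {
        return blobCache.get(blobKey);
    }
}
```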

Verifying this change

(Please pick one of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (100MB)
  • Extended integration test for recovery after master (JobManager) failure
  • Added test that validates that TaskInfo is transferred only once across recoveries
  • Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

aljoscha and others added 30 commits November 13, 2017 12:05
Before, we had it in places that require it. This doesn't work when
running mvn javadoc:aggregate because this only runs for the root
pom and then cannot find the "bundle" dependencies.
- do not shade everything, especially not JDK classes!
-> instead define include patterns explicitly
- do not shade core Flink classes (only those imported from flink-hadoop-fs)
- hack around Hadoop loading (unshaded/non-relocated) classes based on names in
  the core-default.xml by overwriting the Configuration class (we may need to
  extend this for the mapred-default.xml and hdfs-default.xml):
-> provide a core-default-shaded.xml file with shaded class names and copy and
  adapt the Configuration class of the respective Hadoop version to load this
  file instead of core-default.xml.
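
A minimal sketch of the idea in the last bullet above, under the assumption that registering an extra default resource is enough for illustration (the actual change copies and adapts Hadoop's Configuration class; the class name ShadedHadoopConfig below is hypothetical):

```java
import org.apache.hadoop.conf.Configuration;

// Illustrative only: points Hadoop at a default resource that lists the
// shaded (relocated) class names instead of the original ones.
public final class ShadedHadoopConfig {

    static {
        // Registers core-default-shaded.xml as a default resource for all
        // Configuration instances created afterwards.
        Configuration.addDefaultResource("core-default-shaded.xml");
    }

    private ShadedHadoopConfig() {}

    public static Configuration create() {
        return new Configuration();
    }
}
```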

Add checkstyle suppression pattern for the Hadoop Configuration classes

Also fix the (integration) tests not working because they tried to load the
relocated classes, which are apparently not available there.

Remove minimizeJar from shading of flink-s3-fs-presto because this was
causing "java.lang.ClassNotFoundException:
org.apache.flink.fs.s3presto.shaded.org.apache.commons.logging.impl.LogFactoryImpl"
since these classes are not statically imported and thus removed when
minimizing.

Fix s3-fs-presto not shading org.HdrHistogram

Fix log4j being relocated in the S3 fs implementations

Add shading checks to travis
This uses traps to ensure that we properly do cleanups, remove config
values, and shut down things.
…ateBackend and MemoryStateBackend.

(cherry picked from commit 2906698)
This is a workaround for a strange javassist bug. The issue should go away
once we upgrade the netty dependency.

Please check the ticket for more information.

This closes #5007.
As of FLINK-4500 the Cassandra connector will wait for pending updates to finish upon checkpoint.

This closes #5002.
tzulitai and others added 29 commits February 6, 2018 18:30
…store

Previously, the key and namespace serializers for the
HeapInternalTimerService were not reconfigured on restore to be compatible
with previously written serializers.

This caused an immediate error when restoring savepoints in Flink 1.4.0,
since in Flink 1.4.0 we changed the base registrations in the Kryo
serializer. That change requires serializer reconfiguration.

This commit fixes this by also writing the serializer configuration
snapshots of the key and namespace serializers into savepoints, and using
them to reconfigure the new serializers on restore. This improvement also
comes along with making the written data for timer service snapshots
versioned. Backwards compatibility with previous non-versioned formats
is not broken.
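
A rough sketch of the versioned layout described above, using plain java.io types and hypothetical names (the real code uses Flink's serialization utilities, not these helpers):

```java
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative only: version header first, then the serializer config
// snapshots for the key and namespace serializers, then the timer data itself.
final class TimerSnapshotWriteSketch {

    static final int VERSION = 1;

    static void write(DataOutputStream out,
                      byte[] keySerializerConfigSnapshot,
                      byte[] namespaceSerializerConfigSnapshot,
                      byte[] timerData) throws IOException {
        out.writeInt(VERSION);                               // new: versioned format
        writeBlock(out, keySerializerConfigSnapshot);        // new: key serializer config
        writeBlock(out, namespaceSerializerConfigSnapshot);  // new: namespace serializer config
        writeBlock(out, timerData);                          // the timers, as before
    }

    private static void writeBlock(DataOutputStream out, byte[] block) throws IOException {
        out.writeInt(block.length);
        out.write(block);
    }
}
```

On restore, the reader would check the version and fall back to the old, non-versioned parsing for older snapshots, which is how backwards compatibility can be preserved.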
… AbstractEventTimeWindowCheckpointingITCase

After adding the TypeSerializerConfigSnapshots of timer serializers to
the timer snapshots, the size of the timer snapshots has potentially
doubled. This caused the AbstractEventTimeWindowCheckpointingITCase to
fail, because the configured max memory state size and Akka
framesize were too small. This commit doubles those sizes.

This closes #5362.
…askManagerRunner

Previously, the YarnTaskManagerRunner contained a code path that existed
for the sole purpose of injecting mock runners. Having code paths in
production code just to facilitate tests is in general a bad idea.

This commit fixes this by making YarnTaskManagerRunner a factory-like
class, which creates a Runner that contains all the runner's properties,
such as configuration. Unit tests can then test against the contained
configuration in the created Runner to validate that everything is
configured properly.

This closes #5172.
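
A hedged sketch of the described shape (class and method names below are illustrative, not the actual Flink API): the factory builds a Runner that carries its effective configuration, so a unit test can inspect it without any mock-injection code path in production.

```java
import org.apache.flink.configuration.Configuration;

// Illustrative only.
final class TaskManagerRunnerFactorySketch {

    /** The created runner exposes the configuration it was built with. */
    static final class Runner {
        private final Configuration configuration;
        private final Runnable startAction;

        Runner(Configuration configuration, Runnable startAction) {
            this.configuration = configuration;
            this.startAction = startAction;
        }

        Configuration getConfiguration() {
            return configuration;
        }

        void run() {
            startAction.run();
        }
    }

    static Runner createRunner(Configuration baseConfig, String resourceId) {
        Configuration effective = new Configuration(baseConfig);
        effective.setString("illustrative.resource-id", resourceId); // hypothetical key
        return new Runner(effective, () -> { /* start the actual task manager here */ });
    }
}
```

A test can then call createRunner(...) and assert on getConfiguration() instead of injecting mocks into the production start-up path.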
Before, the condition was being read via in.read() and not
in.readFully().
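
For context, a small illustration of the difference (the helper below is hypothetical, not the code this commit touches): read(byte[]) may return after filling only part of the buffer, while readFully(byte[]) does not return until the whole buffer is filled (or throws EOFException).

```java
import java.io.DataInputStream;
import java.io.IOException;

final class ReadFullyIllustration {

    static byte[] readRecord(DataInputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];

        // Buggy pattern: a single read() call gives no guarantee that
        // 'length' bytes were actually read into the buffer.
        // int bytesRead = in.read(buffer);

        // Correct pattern: readFully() blocks until the buffer is full.
        in.readFully(buffer);
        return buffer;
    }
}
```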
…onnector shading

- Do not shade Elasticsearch dependencies
- Do not shade Flink Elasticsearch Connector classes
- Also shade log4j-api dependency in Elasticsearch 5 connector. This is
  required for the log4j-to-slf4j bridge adapter to work properly.
- Add NOTICE files for license statements for all ES connectors

This closes #5426.
This closes #5243.
…s a data stream as keyed stream (backport from 1.5 branch)

This closes #5439.
…rterIsClosed()

The test is inherently unstable as it will always fail if any other
server is started on the port between the closing of the reporter and
the polling of metrics.

This closes #5473.
…aultRegistry

It appeared as if the HTTPServer wasn't actually doing anything, but it
internally accessed the singleton registry that we also access to
register metrics.
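
Assuming the HTTPServer here is the Prometheus simpleclient one (an assumption; the commit text does not name the library), a minimal sketch of how both sides meet in the CollectorRegistry.defaultRegistry singleton:

```java
import io.prometheus.client.Counter;
import io.prometheus.client.exporter.HTTPServer;

import java.io.IOException;

// Illustrative only.
final class DefaultRegistrySketch {

    public static void main(String[] args) throws IOException {
        // register() with no arguments registers the metric with
        // CollectorRegistry.defaultRegistry.
        Counter requests = Counter.build()
                .name("requests_total")
                .help("Total requests.")
                .register();
        requests.inc();

        // The HTTPServer scrapes that same default registry internally, so the
        // counter above is exposed even though no registry is passed explicitly.
        HTTPServer server = new HTTPServer(9091);
    }
}
```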
This is preparation for modifying a new ITCase to use modern state
features.
This new test does not pretend to use legacy state but instead uses
the more modern operator state varieties.
…aConsumer

This commit fixes the incorrect use of the parent of the user code class
loader. Since the Kafka 010 / 011 versions directly reuse 09 code, this fix
resolves the issue for all versions.

This commit also extends the Kafka010Example so that it uses a custom
watermark assigner. This allows our end-to-end tests to catch this
bug.
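
A hedged sketch of the class-loading point (hypothetical helper, not the actual Kafka connector code): user-supplied classes have to be resolved against the user code class loader itself, not its parent, which cannot see the user jars.

```java
final class UserClassLoadingSketch {

    static Object instantiateUserClass(String className, ClassLoader userCodeClassLoader)
            throws ReflectiveOperationException {
        // Buggy variant: userCodeClassLoader.getParent() does not know the user jars.
        // Class<?> clazz = Class.forName(className, true, userCodeClassLoader.getParent());

        // Fixed variant: resolve against the user code class loader itself.
        Class<?> clazz = Class.forName(className, true, userCodeClassLoader);
        return clazz.getDeclaredConstructor().newInstance();
    }
}
```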
This allows the test to perform the cleanup procedure (as well as
printing any error logs) if an interruption occurs while waiting for
the test data to be written to Kafka, thereby increasing visibility
into why the test was stalling.

This closes #5568.
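
An illustrative pattern for the behavior described above (not the actual test code): run the cleanup, and surface the error logs, even when the wait for the Kafka writes is interrupted.

```java
import java.util.concurrent.CountDownLatch;

// Illustrative only.
final class CleanupOnInterruptSketch {

    static void awaitWritesThenCleanup(CountDownLatch dataWritten,
                                       Runnable dumpErrorLogs,
                                       Runnable cleanup) {
        try {
            dataWritten.await();                 // wait for test data to land in Kafka
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt flag
            dumpErrorLogs.run();                 // make the stall reason visible
        } finally {
            cleanup.run();                       // always perform the cleanup procedure
        }
    }
}
```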