
Conversation


@lpn666 lpn666 commented Jul 7, 2022

What is the purpose of the change

When I use the sql-jdbc connector to transfer a big table from MySQL to another database, the Flink program loads the entire table into memory. The source table is too big (16 GB), and the TaskManager crashed.
So what can I do? Or how about adding a new option to limit the speed of reading data (or to read it in batches)?
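For reference, the Flink JDBC SQL connector already exposes options that avoid loading the whole table at once: the `scan.partition.*` options split the scan into parallel range queries, and `scan.fetch-size` asks the driver to fetch rows in chunks. A minimal sketch in batch mode (URL, table, column names, and bounds are placeholders; with MySQL the fetch-size hint typically only takes effect when `useCursorFetch=true` is set on the JDBC URL):

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class JdbcPartitionedScanExample {

    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.newInstance().inBatchMode().build());

        // 'scan.partition.*' splits the source table into parallel range queries,
        // 'scan.fetch-size' asks the driver to fetch rows in chunks instead of all at once.
        tEnv.executeSql(
                "CREATE TABLE big_source (\n"
                        + "  id BIGINT,\n"
                        + "  payload STRING\n"
                        + ") WITH (\n"
                        + "  'connector' = 'jdbc',\n"
                        + "  'url' = 'jdbc:mysql://localhost:3306/db?useCursorFetch=true',\n"
                        + "  'table-name' = 'big_table',\n"
                        + "  'scan.partition.column' = 'id',\n"
                        + "  'scan.partition.num' = '100',\n"
                        + "  'scan.partition.lower-bound' = '0',\n"
                        + "  'scan.partition.upper-bound' = '100000000',\n"
                        + "  'scan.fetch-size' = '1000'\n"
                        + ")");

        // ... then define a sink table and run: INSERT INTO sink SELECT * FROM big_source
    }
}
```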

Brief change log

Verifying this change

Does this pull request potentially affect one of the following parts:

Documentation

HuangXingBo and others added 30 commits December 7, 2021 10:06
Use a dedicated thread to run each jar, so that pooled threads can't keep references to user-code (e.g., in a ThreadLocal).
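A minimal sketch of that idea (class and method names here are illustrative, not the actual Flink code):

```java
import java.util.concurrent.atomic.AtomicReference;

public final class DedicatedThreadRunner {

    /**
     * Runs the given user code in a freshly created thread and waits for it to finish.
     * Any ThreadLocal state the user code installs dies with the thread instead of
     * lingering in a reused pool worker.
     */
    public static void runInDedicatedThread(Runnable userCode, ClassLoader userClassLoader)
            throws InterruptedException {
        AtomicReference<Throwable> error = new AtomicReference<>();
        Thread thread = new Thread(userCode, "user-jar-runner");
        thread.setContextClassLoader(userClassLoader);
        thread.setUncaughtExceptionHandler((t, e) -> error.set(e));
        thread.start();
        thread.join();
        Throwable failure = error.get();
        if (failure != null) {
            throw new RuntimeException("User code failed", failure);
        }
    }
}
```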
… TableSourceTable

This closes #18040

Co-authored-by: guanghxu <xuguangheng1995@gmail.com>
…s not the same with query result's changelog upsert key

This closes #18048
…rofile to `reservedAllocations` to avoid confusion
…kable out to be a standalone test class in flink-runtime
…r batch jobs even if local recovery is enabled
…to recover after losing and regaining leadership.
…perLeaderElectionITCase#testJobExecutionOnClusterWithLeaderChange` (FLINK-25235)

This closes #18066.
…ternalProducer if transaction finalization fails

In the KafkaCommitter we retry transactions if they failed during
committing. Since we reuse the KafkaProducers, we update the used
transactionalId to continue committing other transactions. To prevent
accidental overwrites, we track the transaction state inside the
FlinkKafkaInternalProducer.
Before this change, the state was not reset on a failure during
transaction finalization, and setting a new transactionalId then failed.
The state is now always reset, regardless of whether finalizing the
transaction (commit, abort) fails.
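A rough sketch of the resulting pattern (not the actual FlinkKafkaInternalProducer code, just the shape of the fix):

```java
/** Illustrative only: the tracked transaction flag is cleared in a finally block,
 *  so a failing commit/abort no longer blocks switching to a new transactional id. */
final class TransactionalState {

    private boolean inTransaction;
    private String transactionalId;

    void beginTransaction(String id) {
        this.transactionalId = id;
        this.inTransaction = true;
    }

    void commitTransaction(Runnable doCommit) {
        try {
            doCommit.run(); // may throw, e.g. on broker-side errors
        } finally {
            inTransaction = false; // reset even if finalization fails
        }
    }

    void setTransactionalId(String newId) {
        if (inTransaction) {
            throw new IllegalStateException("Cannot change transactional id mid-transaction");
        }
        this.transactionalId = newId;
    }
}
```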
…explicitly for job to finish.

(cherry picked from commit 7976be0)
While using TableEnvironment in the ITCase, a Flink MiniCluster is started/stopped automatically in the background. Since the shutdown of the MiniCluster is called asynchronously, CollectResultFetcher sometimes loses data due to race conditions, and an unchecked java.lang.IllegalStateException is thrown that we were not aware of.

The solution is to control the lifecycle of the MiniCluster manually in this test. The MiniClusterWithClientResource is a good fit in this case.

(cherry picked from commit fca04c3)
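For context, a minimal sketch of pinning the MiniCluster lifecycle to the test with MiniClusterWithClientResource (class name and sizes are illustrative):

```java
import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.test.util.MiniClusterWithClientResource;
import org.junit.ClassRule;

public class MyITCase {

    // The cluster lifecycle is bound to the JUnit rule instead of being
    // started/stopped implicitly (and asynchronously) by the TableEnvironment.
    @ClassRule
    public static final MiniClusterWithClientResource MINI_CLUSTER =
            new MiniClusterWithClientResource(
                    new MiniClusterResourceConfiguration.Builder()
                            .setNumberTaskManagers(1)
                            .setNumberSlotsPerTaskManager(4)
                            .build());

    // ... test methods run TableEnvironment jobs against this shared cluster ...
}
```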
…CKPOINTS as default for externalized-checkpoint-retention
This consistency level is only available on write, so we need to create one builder for reading and one for writing. Some sinks are used for both reading and writing; in that case, the reading builder is used.

(cherry picked from commit c40bbf1)
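Assuming this refers to the Cassandra connector and the DataStax 3.x driver it uses, a rough sketch of the read/write builder split (names are illustrative):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;

public final class ConsistencyBuilders {

    // Writer builder: ConsistencyLevel.ANY is valid for writes only.
    static Cluster.Builder writeBuilder(String host) {
        return Cluster.builder()
                .addContactPoint(host)
                .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.ANY));
    }

    // Reader builder: a consistency level that is also valid for reads.
    static Cluster.Builder readBuilder(String host) {
        return Cluster.builder()
                .addContactPoint(host)
                .withQueryOptions(
                        new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM));
    }

    // A sink that also reads falls back to the read builder,
    // since the write-only consistency level would be rejected on reads.
    static Cluster.Builder builderFor(boolean alsoReads, String host) {
        return alsoReads ? readBuilder(host) : writeBuilder(host);
    }
}
```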
We suspect that the NetworkFailureProxy is causing constant connectivity
problems to the brokers during testing, resulting in either network
timeouts or corrupted results.
Since the NetworkFailureProxy is only used for testing the deprecated
FlinkKafkaProducer/Consumer, we can safely remove it, because we will not
add new features to these connectors.
MartijnVisser and others added 28 commits May 11, 2022 14:21
…e avro schema (#19705)

Co-authored-by: Haizhou Zhao <haizhou_zhao@apple.com>
…to-end tests to avoid flooding the disk space
…kages to clean up more disk space before starting the E2E tests. Also removing the line that removes `^ghc-8.*`, since that no longer exists on the machines.

(cherry picked from commit db6baf4)
… PyFlink Table API jobs in batch mode

This closes #19816.
…verride method 'merge' is used in cases where 'merge' is used

This closes #19817.
…rtitionSplitReader.fetch() to handle no valid partition case

This closes #19979.
…onsumer invocations in split assignment

This closes #19982.
…ttl enabled in RetractableTopNFunction

This closes #19997
…an continue to use Kubernetes 1.24+ and the `none` driver, since Kubernetes 1.24 has dropped support for Dockershim.
… DataStream and SQL connector (#19904)

(cherry picked from commit 5d564b1)

@lpn666 lpn666 closed this Jul 7, 2022