Skip to content

Conversation

@JingsongLi
Copy link
Contributor

What is the purpose of the change

limit the parallelism of HiveCatalogUseBlinkITCase to avoid too many slot requests by default parallelism (use the core size of machine).

Verifying this change

This change is already covered by existing tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no

@flinkbot
Copy link
Collaborator

flinkbot commented Aug 12, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 4df01ef (Fri Aug 23 10:18:58 UTC 2019)

Warnings:

  • 1 pom.xml files were touched: Check for build and licensing issues.
  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Aug 12, 2019

CI report:

@KurtYoung
Copy link
Contributor

Another way is to extend AbstractTestBase, which will configure mini cluster in a more predictable way.

@KurtYoung
Copy link
Contributor

And I'm not sure the root cause is the parallelism is too high for this case. We should support running very high parallelism batch job with only one slot by design.

@JingsongLi
Copy link
Contributor Author

@KurtYoung Yeah, the main problem is memory request is too high(128MB). I think 8dacb85 will solve it.

@JingsongLi JingsongLi changed the title [FLINK-13688][hive] Limit the parallelism of HiveCatalogUseBlinkITCase [FLINK-13688][hive] Limit the parallelism/memory of HiveCatalogUseBlinkITCase Aug 12, 2019
@KurtYoung
Copy link
Contributor

How about we use a pre configured mini cluster instead of lower the memory request? Because it's not 100% safe to just rely on lowering the sort memory, because we might choose hash aggregate.

@JingsongLi
Copy link
Contributor Author

How about we use a pre configured mini cluster instead of lower the memory request? Because it's not 100% safe to just rely on lowering the sort memory, because we might choose hash aggregate.

Yeah, that is a way too, but in this case, I think use lower memory is OK. Because the data is too few.
1.It must rely on sort now, because UDAF only can be support by sort agg.
2.I think I can extract the configuration in BatchTestBase, to config all memory things.

@KurtYoung
Copy link
Contributor

I would like to repeat my first question: why not just extending AbstractTestBase

@JingsongLi
Copy link
Contributor Author

I would like to repeat my first question: why not just extending AbstractTestBase

OK.

@JingsongLi
Copy link
Contributor Author

This case makes me feel that the first time a user uses blink batch sql, he has to think about resources.....

@bowenli86
Copy link
Member

+1 for fixing it. @KurtYoung @JingsongLi

@KurtYoung
Copy link
Contributor

Verified locally, +1.

KurtYoung pushed a commit that referenced this pull request Aug 16, 2019
@KurtYoung KurtYoung closed this in a194b37 Aug 16, 2019
@JingsongLi JingsongLi deleted the hivetest branch August 16, 2019 07:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants