@KurtYoung
Contributor

What is the purpose of the change

  • Fix the hardcoded Flink version in the tpch end-to-end test.

Brief change log

  • Use the automatically detected Flink version in the tpch end-to-end test.
  • Change the test parallelism to 2 to cover the case where the parallelism is higher than the number of slots, since the testing cluster has only one task manager, which contains only one slot.

Verifying this change

Tested on a Linux server; the test passed.

@flinkbot
Collaborator

flinkbot commented Aug 6, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 4e442c2 (Tue Aug 06 16:04:02 UTC 2019)

Warnings:

  • 1 pom.xml files were touched: Check for build and licensing issues.
  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Collaborator

flinkbot commented Aug 6, 2019

CI report:

Also changed test parallelism to 2 to cover the situation that
parallelism is higher than the slot number, since the testing cluster
has only one task manager, which contains only one slot
TARGET_DIR="$END_TO_END_DIR/flink-tpch-test/target"
TPCH_DATA_DIR="$END_TO_END_DIR/test-scripts/test-data/tpch"
java -cp "$TARGET_DIR/flink-tpch-test-1.10-SNAPSHOT.jar:$TARGET_DIR/lib/*" org.apache.flink.table.tpch.TpchDataGenerator "$SCALE" "$TARGET_DIR"
FLINK_VERSION=`ls "${END_TO_END_DIR}/../flink-table/flink-table-api-java/target" | sed -n "s/.*flink-table-api-java-\(.*\)-tests\.jar/\1/p" | uniq`
Member

FLINK_VERSION is already defined in common.sh; maybe we can use it directly.
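
For readers unfamiliar with the pipeline in the diff above, here is a minimal sketch of what the `ls | sed | uniq` expression does. The temporary directory and the file names in it are hypothetical stand-ins for the real `flink-table-api-java/target` directory, and `$(...)` is used in place of backticks; the `sed` expression itself is the one from the diff.

```shell
# Hypothetical stand-in for flink-table-api-java/target; the file names are made up.
demo_target="$(mktemp -d)"
touch "$demo_target/flink-table-api-java-1.10-SNAPSHOT.jar" \
      "$demo_target/flink-table-api-java-1.10-SNAPSHOT-tests.jar" \
      "$demo_target/original-flink-table-api-java-1.10-SNAPSHOT.jar"

# sed -n with the p flag prints only lines where the substitution matched,
# so only the "-tests.jar" file contributes; \1 is the captured version string.
detected_version="$(ls "$demo_target" \
  | sed -n 's/.*flink-table-api-java-\(.*\)-tests\.jar/\1/p' \
  | uniq)"

echo "detected version: $detected_version"

# Remove the temporary directory again.
rm -rf "$demo_target"
```

If common.sh already exports the same value, as the comment suggests, this detection step would be redundant in the test script.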

java -cp "$TARGET_DIR/flink-tpch-test-1.10-SNAPSHOT.jar:$TARGET_DIR/lib/*" org.apache.flink.table.tpch.TpchDataGenerator "$SCALE" "$TARGET_DIR"
FLINK_VERSION=`ls "${END_TO_END_DIR}/../flink-table/flink-table-api-java/target" | sed -n "s/.*flink-table-api-java-\(.*\)-tests\.jar/\1/p" | uniq`

java -cp "$TARGET_DIR/flink-tpch-test-"$FLINK_VERSION".jar:$TARGET_DIR/lib/*" org.apache.flink.table.tpch.TpchDataGenerator "$SCALE" "$TARGET_DIR"
Member

Suggested change
java -cp "$TARGET_DIR/flink-tpch-test-"$FLINK_VERSION".jar:$TARGET_DIR/lib/*" org.apache.flink.table.tpch.TpchDataGenerator "$SCALE" "$TARGET_DIR"
java -cp "${TARGET_DIR}/flink-tpch-test-${FLINK_VERSION}.jar:$TARGET_DIR/lib/*" org.apache.flink.table.tpch.TpchDataGenerator "$SCALE" "$TARGET_DIR"

Member

nit
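
The difference between the two forms in the suggested change is only in quoting. A small sketch, assuming deliberately unrealistic values that contain spaces (a real version such as 1.10-SNAPSHOT has none, so both forms would behave the same in practice); `count_args` is a hypothetical helper, not part of the test scripts.

```shell
# Hypothetical values, chosen to contain spaces so the difference is visible.
TARGET_DIR="/tmp/my target"
FLINK_VERSION="1.10 SNAPSHOT"

# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

# Original form: the double quotes close before $FLINK_VERSION and reopen
# after it, so that expansion is unquoted and subject to word splitting.
split_count="$(count_args "$TARGET_DIR/flink-tpch-test-"$FLINK_VERSION".jar")"

# Suggested form: the whole path stays inside one pair of double quotes.
quoted_count="$(count_args "${TARGET_DIR}/flink-tpch-test-${FLINK_VERSION}.jar")"

echo "original form:  $split_count argument(s)"
echo "suggested form: $quoted_count argument(s)"
```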

Contributor

@dawidwys dawidwys left a comment

How about taking the same approach as the other e2e tests, e.g. test_streaming_bucketing.sh? We usually just rename the final jar so that it doesn't include the version.

We use your approach only when we need jars from the dist, from which we can't strip the version.

WDYT?
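
A rough sketch of the effect of the renaming approach, under stated assumptions: the temporary directory, the versioned file name, and the name `TpchTestProgram.jar` are all hypothetical, and the `mv` merely stands in for whatever the build does to produce a version-less jar (in Maven builds this is commonly configured in the pom, for example through a `finalName` setting).

```shell
# Hypothetical build output; directory and file names are illustrative only.
demo_target="$(mktemp -d)"
touch "$demo_target/flink-tpch-test-1.10-SNAPSHOT.jar"

# Stand-in for the build step that gives the final jar a version-less name.
mv "$demo_target/flink-tpch-test-1.10-SNAPSHOT.jar" "$demo_target/TpchTestProgram.jar"

# The test script can then use a fixed path with no version detection at all.
TEST_PROGRAM_JAR="$demo_target/TpchTestProgram.jar"
if [ -f "$TEST_PROGRAM_JAR" ]; then
  jar_status="found"
else
  jar_status="missing"
fi
echo "test program jar: $jar_status"

# Remove the temporary directory again.
rm -rf "$demo_target"
```

With a fixed jar name, the script no longer has to change when the project version is bumped.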

planner: blink
type: batch
result-mode: table
parallelism: 2
Contributor

Do we need this change? It's unrelated to the subject of the commit.

Contributor Author

It's a supplement to this issue: https://issues.apache.org/jira/browse/FLINK-13441. I think this is also a good place to cover such scenarios, and we don't need to open a separate JIRA issue for it.

Contributor

How is this related to the linked issue? Honestly, I still don't get what the purpose of this change is.

I disagree that this is a good place to add such cases. It's impossible to link this change to the reason why it was introduced. I'm not saying we need a new JIRA issue, but it should at least be in a separate hotfix commit with at least a brief explanation of what it does.

Contributor Author

I explained this change in the commit message, in case you missed it:

Also changed test parallelism to 2 to cover the situation that parallelism is higher than the slot number, since the testing cluster has only one task manager, which contains only one slot

I agree this can be isolated to another commit.

Contributor

@dawidwys dawidwys Aug 6, 2019

You are right, I missed it; sorry about that. Now I get it. I think this would be a perfect description for the extracted commit (and easier to find).

Contributor Author

Yeah, I already did exactly that in my last commit.

@wuchong
Member

wuchong commented Aug 6, 2019

+1 for stripping off the version.

@KurtYoung
Contributor Author

Stripping off the version sounds good; I will change it.

This can help to cover the situation that parallelism is higher than the slot number, since the testing cluster has only one task manager, which contains only one slot.
@KurtYoung
Contributor Author

@dawidwys Thanks for the review. I tested locally again on my laptop and it passed. I will merge this if you don't have any further comments.

Contributor

@dawidwys dawidwys left a comment

No, I have no more comments; feel free to merge it. Thank you @KurtYoung for the updates.

@KurtYoung KurtYoung closed this in 648fc72 Aug 6, 2019
KurtYoung added a commit that referenced this pull request Aug 6, 2019
@KurtYoung KurtYoung deleted the tpch-e2e branch August 6, 2019 14:07
becketqin pushed a commit to becketqin/flink that referenced this pull request Aug 17, 2019
becketqin pushed a commit to becketqin/flink that referenced this pull request Aug 19, 2019
5 participants