Skip to content

Conversation

@zhijiangW
Copy link
Contributor

@zhijiangW zhijiangW commented May 24, 2019

What is the purpose of the change

In order to make abstract InputGate simple for extending new implementations in shuffle service architecture, we could remove unnecessary methods from it.

Currently InputGate#getOwningTaskName is only used for debugging log in BarrierBuffer and StreamInputProcessor. This task name could be got from Environment#getTaskInfo and then be passed into the constructor of BarrierBuffer/StreamInputProcessor for use.

Brief change log

  • Add the full task name in the TaskInfo
  • Remove getOwningTaskName from InputGate abstract methods
  • Refactor the related process to get task name for BarrierBuffer and StreamInputProcessor

Verifying this change

covered by existing tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Copy link
Collaborator

flinkbot commented May 24, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❗ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@zhijiangW
Copy link
Contributor Author

@flinkbot attention @azagrebin

Copy link
Contributor

@azagrebin azagrebin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zhijiangW ! I like the refactoring and left some smaller comments. One bigger concern is that it seems we still depends directly on some network specific options and I am not sure whether they make sense for all shuffle services.

@zhijiangW
Copy link
Contributor Author

zhijiangW commented May 29, 2019

Thanks for the reviews and good suggestions @azagrebin !

Yes, we still rely on some network options in BarrierBuffer implementation. And I also think it is not good to do so. This PR is mainly making the interface seem more general and clean. We could further refactor the BarrierBuffer implementation future as I suggested inline comments.

I pushed and squashed the commits for addressing the other issues.

Copy link
Contributor

@azagrebin azagrebin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing comments @zhijiangW !

getOwningTaskName looks good to me.

We can consider NETWORK_CREDIT_MODEL in other PR. Maybe, we add ShuffleService#supportsCachingOrBlockingInputChannels API flag or something like that. Then it could be propagated somehow to StreamTask. getPage might refactored similar way.

At the moment I would suggest we do not remove getPage in this PR because querying network configuration directly exposes its details more than justing having InputGate.getPage. I also discussed with @pnowojski using Buffer.size for this but it looks like a bigger effort which could be a right way to go at the end but not obvious and probably needs a separate issue. Removing of InputGate.getPage does not look like a big blocker for Shuffle API and could stay in InputGate for now although I agree it is not perfect to have it as InputGate.getPage.

could we reduce the scope of this PR by addressing only getOwningTaskName at the moment? because this part looks already mergable, WDYT?

@zhijiangW
Copy link
Contributor Author

Thanks for further reviews @azagrebin !
I already reverted the changes for getPage issue which could be solved in a separate PR future in a better way.

@zhijiangW zhijiangW force-pushed the FLINK-12603 branch 2 times, most recently from b7707c8 to 14aa43b Compare June 3, 2019 02:48
@zhijiangW zhijiangW changed the title [FLINK-12603][network] Refactor InputGate interface to remove unnecessary methods [FLINK-12603][network] Remove getOwningTaskName method from InputGate Jun 3, 2019
Copy link
Contributor

@azagrebin azagrebin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zhijiangW ! LGTM 👍

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @zhijiangW. I had some comments which we should resolve before merging.

@zhijiangW
Copy link
Contributor Author

@tillrohrmann thanks for the confirmation and I have addressed the comments by squashing the commits.

Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments @zhijiangW. I had one last comment. Moreover, the build is not going through because of

[ERROR] /home/travis/build/apache/flink/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/InputGateWithMetrics.java:[58,9] method does not override or implement a method from a supertype
11:30:03.090 [ERROR] /home/travis/build/apache/flink/flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/InputGateWithMetrics.java:[60,33] cannot find symbol
  symbol:   method getOwningTaskName()
  location: variable inputGate of type org.apache.flink.runtime.io.network.partition.consumer.InputGate

@zhijiangW
Copy link
Contributor Author

zhijiangW commented Jun 4, 2019

@tillrohrmann I have addressed the last comment and the travis issue is caused by another merged PR which is also solved now.

zhijiangW added 2 commits June 5, 2019 10:10
In order to make abstract InputGate simple for extending new implementations in shuffle service architecture, we could remove unnecessary methods from it.
InputGate#getOwningTaskName is only used for debugging log in BarrierBuffer and StreamInputProcessor. This task name could also be generated in StreamTask
via Environment#getTaskInfo and Environment#getExecutionId. Then it could be passed into the constructors of BarrierBuffer/StreamInputProcessor for use.
Copy link
Contributor

@tillrohrmann tillrohrmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for addressing my comments @zhijiangW. LGTM. Merging this PR now.

@zhijiangW zhijiangW deleted the FLINK-12603 branch June 10, 2020 10:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants