Skip to content

[BEAM-4056] Identify side inputs by transform id and local name#5118

Merged
tgroh merged 2 commits intoapache:masterfrom
bsidhom:side-input-ref
Apr 16, 2018
Merged

[BEAM-4056] Identify side inputs by transform id and local name#5118
tgroh merged 2 commits intoapache:masterfrom
bsidhom:side-input-ref

Conversation

@bsidhom
Copy link
Contributor

@bsidhom bsidhom commented Apr 12, 2018

This is necessary to identify side inputs during portable pipeline translation and execution.


Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue.
  • Write a pull request description that is detailed enough to understand:
    • What the pull request does
    • Why it does it
    • How it does it
    • Why this approach
  • Each commit in the pull request should have a meaningful subject line and body.
  • Run mvn clean verify to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

@bsidhom
Copy link
Contributor Author

bsidhom commented Apr 12, 2018

R: @tgroh

// Side Input PCollection ids. Each must be present as a value in the inputs of
// any PTransform the ExecutableStagePayload is the payload of.
repeated string side_inputs = 3;
// The side inputs required for this executable stage. Each must be prsent as a side input of
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spelling

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't you familiar with that common abbreviation??? It shaves off a whole character!

String outerLocalName = String.format("%s:%s",
sideInput.transformId(), sideInput.localName());
pt.putInputs(outerLocalName, sideInput.getCollection().getId());
payload.addSideInputs(SideInputId.newBuilder()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your formatting looks funky here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if this is better.

public abstract class SideInputReference {

/** Create a side input reference. */
public static SideInputReference of(String transformId, String localName,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a PTransformNode? Would that be available everywhere we're constructing this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's available. We already require components everywhere due to PCollectionNode. Done.

Copy link
Contributor Author

@bsidhom bsidhom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed comments. PTAL.

// Side Input PCollection ids. Each must be present as a value in the inputs of
// any PTransform the ExecutableStagePayload is the payload of.
repeated string side_inputs = 3;
// The side inputs required for this executable stage. Each must be prsent as a side input of
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't you familiar with that common abbreviation??? It shaves off a whole character!

String outerLocalName = String.format("%s:%s",
sideInput.transformId(), sideInput.localName());
pt.putInputs(outerLocalName, sideInput.getCollection().getId());
payload.addSideInputs(SideInputId.newBuilder()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if this is better.

public abstract class SideInputReference {

/** Create a side input reference. */
public static SideInputReference of(String transformId, String localName,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's available. We already require components everywhere due to PCollectionNode. Done.

Copy link
Member

@tgroh tgroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple of nits

// any PTransform the ExecutableStagePayload is the payload of.
repeated string side_inputs = 3;
// The side inputs required for this executable stage. Each must be present as a side input of
// exactly one PTransform within this ExecutableStagePayload.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would modify this spec a little - "Each Side Input of each PTransform within this ExecutableStagePayload must be represented within this field." or thereabouts.

That way it represents the minimum contents instead of the maximum contents; if we make more side inputs available than required, I don't expect either harness to break, just to be very confused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

public abstract class SideInputReference {

/** Create a side input reference. */
public static SideInputReference of(PTransformNode transform, String localName,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed. Let me know if this is any better. I'm sad that there's no Beam auto-formatter. 😭

Copy link
Contributor Author

@bsidhom bsidhom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressing review comments.

// any PTransform the ExecutableStagePayload is the payload of.
repeated string side_inputs = 3;
// The side inputs required for this executable stage. Each must be present as a side input of
// exactly one PTransform within this ExecutableStagePayload.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

public abstract class SideInputReference {

/** Create a side input reference. */
public static SideInputReference of(PTransformNode transform, String localName,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed. Let me know if this is any better. I'm sad that there's no Beam auto-formatter. 😭

Copy link
Member

@tgroh tgroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Please clean up history pre-merge.

@bsidhom
Copy link
Contributor Author

bsidhom commented Apr 16, 2018

Done.

bsidhom added 2 commits April 16, 2018 10:37
This is necessary to identify side inputs during portable pipeline
translation and execution.
@tgroh tgroh merged commit a7819fe into apache:master Apr 16, 2018
@bsidhom bsidhom deleted the side-input-ref branch April 16, 2018 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants