Skip to content

[BEAM-2966] Allow subclasses of tuple, list, and dict as pvaluish inputs/outputs.#3831

Closed
robertwb wants to merge 4 commits intoapache:masterfrom
robertwb:pvalueish
Closed

[BEAM-2966] Allow subclasses of tuple, list, and dict as pvaluish inputs/outputs.#3831
robertwb wants to merge 4 commits intoapache:masterfrom
robertwb:pvalueish

Conversation

@robertwb
Copy link
Copy Markdown
Contributor

@robertwb robertwb commented Sep 9, 2017

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Run mvn clean verify to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

@robertwb
Copy link
Copy Markdown
Contributor Author

robertwb commented Sep 9, 2017

R: @KesterTong

@robertwb
Copy link
Copy Markdown
Contributor Author

jenkins: retest this please

@robertwb
Copy link
Copy Markdown
Contributor Author

jenkins: retest this please

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage decreased (-0.2%) to 69.509% when pulling 2c98ab8 on robertwb:pvalueish into e9d3a4a on apache:master.

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage decreased (-0.2%) to 69.523% when pulling 541b4b3 on robertwb:pvalueish into e9d3a4a on apache:master.

return {key: self.visit(value, *args) for (key, value) in node.items()}
def visit_nested(self, node, *args):
if isinstance(node, (tuple, list)):
# namedtuples require unpacked arguments in their constructor,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear that this supports subclasses of a namedtuple. Such subclasses will inherit the _make class method from the namedtuple, but the _make method will produce the namedtuple not the subclass. Instead, could we test for the existence of _make (as a test of whether we are dealing with a subclass of namedtuple or a direct subclass of tuple, which we assume has the usual tuple constructor) but still invoke the constructor node.__class__ when dealing with a subclass of a namedtuple? i.e.

if isinstance(node, tuple) and hasattr(node.__class__, '_make'):
  # node is an instance of a subclass of a namedtuple.
  return node.__class__(*[self.visit(x, *args) for x in node])
elif isinstance(node, (tuple, list)):
  ...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Copy Markdown

@KesterTong KesterTong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding "Support multiple materializations of the smae pvalue." can you clarify what the new behavior of the cache is? It seems to me that the cache is now not really functioning as a cache because we never decrement the refcount. If so that's not a problem I just want to understand the new behavior.

@KesterTong
Copy link
Copy Markdown

is there a JIRA issue for this PR? If not I could open one. Would be helpful as reference for tf.Transform release notes etc. since tf.Transform will rely on older versions of Beam which will hit the bug this PR fixes.

@robertwb robertwb changed the title Allow subclasses of tuple, list, and dict as pvaluish inputs/outputs. [BEAM-2966] Allow subclasses of tuple, list, and dict as pvaluish inputs/outputs. Sep 18, 2017
@robertwb
Copy link
Copy Markdown
Contributor Author

PTAL

@KesterTong
Copy link
Copy Markdown

Thanks, I think you missed my question above regarding the commit titled "Support multiple materializations of the smae pvalue."?

@robertwb
Copy link
Copy Markdown
Contributor Author

Sorry, I added comments but did not address it directly. The cache is solely an implementation detail that is thrown away as soon as the pipeline is gc'd (and, hopefully, will simply go away completely when we clean things up). In particular, the ref-counting is only used during pipeline execution.

@robertwb
Copy link
Copy Markdown
Contributor Author

retest this please

@coveralls
Copy link
Copy Markdown

Coverage Status

Changes Unknown when pulling cbe8dd8 on robertwb:pvalueish into ** on apache:master**.

@asfgit asfgit closed this in cfbdb61 Sep 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants