[BEAM-2966] Allow subclasses of tuple, list, and dict as pvaluish inputs/outputs. #3831
robertwb wants to merge 4 commits into apache:master

R: @KesterTong

jenkins: retest this please

jenkins: retest this please

    return {key: self.visit(value, *args) for (key, value) in node.items()}

  def visit_nested(self, node, *args):
    if isinstance(node, (tuple, list)):
      # namedtuples require unpacked arguments in their constructor,
It's not clear that this supports subclasses of a namedtuple. Such subclasses will inherit the _make class method from the namedtuple, but the _make method will produce the namedtuple not the subclass. Instead, could we test for the existence of _make (as a test of whether we are dealing with a subclass of namedtuple or a direct subclass of tuple, which we assume has the usual tuple constructor) but still invoke the constructor node.__class__ when dealing with a subclass of a namedtuple? i.e.
    if isinstance(node, tuple) and hasattr(node.__class__, '_make'):
      # node is an instance of a subclass of a namedtuple.
      return node.__class__(*[self.visit(x, *args) for x in node])
    elif isinstance(node, (tuple, list)):
      ...
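
For reference, the dispatch proposed above can be tried outside Beam with a minimal standalone sketch. The names `rebuild`, `Point`, `LabeledPoint`, and `MyList` are hypothetical and exist only for illustration; this is not the code in the PR. It assumes that a direct subclass of tuple, list, or dict keeps the usual constructor signature, as the comment above does.

```python
import collections

# Hypothetical types used only for this illustration.
Point = collections.namedtuple('Point', ['x', 'y'])


class LabeledPoint(Point):
    """A subclass of a namedtuple; it inherits _make from Point."""

    def describe(self):
        return 'x=%s, y=%s' % (self.x, self.y)


class MyList(list):
    """A direct subclass of list with the usual list constructor."""


def rebuild(node, fn):
    """Applies fn to every leaf while preserving each container's class."""
    if isinstance(node, tuple) and hasattr(node.__class__, '_make'):
        # A namedtuple or a subclass of one: the constructor takes
        # unpacked fields, so call the instance's own class with *args.
        return node.__class__(*[rebuild(x, fn) for x in node])
    elif isinstance(node, (tuple, list)):
        # Assumed to accept a single iterable, like tuple and list do.
        return node.__class__([rebuild(x, fn) for x in node])
    elif isinstance(node, dict):
        # Assumed to accept an iterable of (key, value) pairs, like dict.
        return node.__class__((k, rebuild(v, fn)) for k, v in node.items())
    else:
        return fn(node)


result = rebuild({'p': LabeledPoint(1, 2), 'l': MyList([3, 4])},
                 lambda v: v * 10)
```

With this sketch, `result['p']` is still a `LabeledPoint` and `result['l']` is still a `MyList`. Calling `node.__class__(...)` also runs any `__new__` that the subclass overrides, whereas in current CPython `_make` builds the instance through `tuple.__new__` directly and would bypass such an override.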
KesterTong left a comment:
Regarding "Support multiple materializations of the smae pvalue.", can you clarify what the new behavior of the cache is? It seems to me that the cache no longer really functions as a cache, because we never decrement the refcount. If so, that's not a problem; I just want to understand the new behavior.

Is there a JIRA issue for this PR? If not, I could open one. It would be helpful as a reference for tf.Transform release notes etc., since tf.Transform will rely on older versions of Beam, which will hit the bug this PR fixes.

PTAL

Thanks, I think you missed my question above regarding the commit titled "Support multiple materializations of the smae pvalue."?

Sorry, I added comments but did not address it directly. The cache is solely an implementation detail that is thrown away as soon as the pipeline is gc'd (and, hopefully, will simply go away completely when we clean things up). In particular, the ref-counting is only used during pipeline execution.

retest this please

Changes Unknown when pulling cbe8dd8 on robertwb:pvalueish into apache:master.