[BEAM-7678] Fixes bug in output element_type generation in Kv PipelineVisitor #9238
@aaltay would you have time to review or recommend another reviewer?
I believe it makes sense to preserve the previous output.element_type if one is not set.
It looks like you have a test case in BEAM-7678 (https://issues.apache.org/jira/browse/BEAM-7678). Would it be possible to convert that to a unit test? It would be good to see a passing test.
I agree that we should have a unit test for this. In the manual test, the Message class lives in a separate library and is very complex (several hundred lines of code). My guess is that, for some reason, type inference does not work for that class. Before submitting the PR, I spent some time trying to write a test doing exactly what you suggested. The problem is that I can't find an easy way to create a type that would fail
    class TestOutputType(object):
        def __init__(self, value):
            self.value = value
I had created a stateful DoFn that would hit the Kv PipelineVisitor (verified via debugger) but the
b12f1ca to a70de1d
I was able to write a unit test for this feature. I rebased the code because of some wrong commits I accidentally pushed to my PR.
Thank you, this looks good. I will merge it after the tests pass. Just to be explicit, this test was failing before your fix, right?
Correct, I verified that the test failed before the fix and passed after it.
The test error looks like a flake. Filed: https://issues.apache.org/jira/browse/BEAM-7911
Run Python PreCommit
@ecanzonieri I assigned the issue to you and closed it. I added you as a contributor, so you should be able to assign JIRA issues to yourself from now on.
This review tries to address BEAM-7678. When we use typehints on the output, the element_type is already defined, so in theory there is no need to try to infer it. There are some types that cannot be inferred by the function infer_output_type; in these cases the typehint should be used to bypass the inferred type.
I'm not sure how to test this code. I can reproduce the issue in a manual test, using as the type a class that, for some reason, infer_output_type is not able to infer.
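The idea described above (prefer a declared element_type and infer only when none is set) can be sketched in plain Python. This is a minimal illustration, not Beam's actual code: `Output`, `resolve_element_type`, `Message`, and `failing_inference` are hypothetical names, and the failed inference is simulated by returning `typing.Any`.

```python
from typing import Any, Callable, Optional


class Output:
    """Hypothetical stand-in for a transform output carrying an element_type."""

    def __init__(self, element_type: Optional[type] = None):
        self.element_type = element_type


def resolve_element_type(output: Output, infer: Callable[[], Any]) -> Any:
    """Return the declared element_type if set; otherwise fall back to inference."""
    if output.element_type is not None:
        # A typehint was supplied, so inference is bypassed entirely.
        return output.element_type
    return infer()


class Message:
    """Hypothetical class standing in for a type that inference cannot handle."""


def failing_inference() -> Any:
    # Assumption for this sketch: failed inference yields the catch-all type.
    return Any


# With a declared type, the hint wins even though inference would give up.
hinted = resolve_element_type(Output(Message), failing_inference)
# Without a declared type, the inferred result is used as before.
unhinted = resolve_element_type(Output(), failing_inference)
```

Here `hinted` is `Message` and `unhinted` is `Any`, which mirrors the behavior the description asks for: inference still runs for outputs with no typehint, so existing pipelines are unaffected.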