Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-7747] Close the file handle owned by fastavro.write.Writer in _FastAvroSink.close(). #9111

Merged
merged 1 commit into from Jul 24, 2019

Conversation

tvalentyn
Copy link
Contributor

@tvalentyn tvalentyn commented Jul 19, 2019

Currently, _FastAvroSink.close() calls flush() method on FastAvro writer, which does not close the file handle: https://github.com/fastavro/fastavro/blob/53ed64d95d4f82875d1238b1f76d4d87547e40e1/fastavro/_write.pyx#L548. On Windows platform, the temporary output file created by the sink cannot be accessed until it is closed.

Post-Commit Tests Status (on master branch)

Lang SDK Apex Dataflow Flink Gearpump Samza Spark
Go Build Status --- --- Build Status --- --- Build Status
Java Build Status Build Status Build Status Build Status
Build Status
Build Status
Build Status Build Status Build Status
Build Status
Python Build Status
Build Status
Build Status
Build Status
--- Build Status
Build Status
Build Status --- --- Build Status

Pre-Commit Tests Status (on master branch)

--- Java Python Go Website
Non-portable Build Status Build Status Build Status Build Status
Portable --- Build Status --- ---

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

@tvalentyn
Copy link
Contributor Author

Run Python_PVR_Flink PreCommit

@tvalentyn
Copy link
Contributor Author

Run Python 3.5 PostCommit

@tvalentyn
Copy link
Contributor Author

R: @chamikaramj

@tvalentyn
Copy link
Contributor Author

Run Seed Job

@tvalentyn
Copy link
Contributor Author

Run Python 3.5 PostCommit

2 similar comments
@tvalentyn
Copy link
Contributor Author

Run Python 3.5 PostCommit

@tvalentyn
Copy link
Contributor Author

Run Python 3.5 PostCommit

@@ -629,3 +629,4 @@ def write_record(self, writer, value):

def close(self, writer):
writer.flush()
writer.fo.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably just do a "super(_FastAvroSink, self).close()" since file handle is closed at FileBasedSink.

def close(self, file_handle):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FileBasedSink.close(writer) calls writer.close():

However FastAvro writer only has flush(), but does not have close(): https://github.com/fastavro/fastavro/blob/53ed64d95d4f82875d1238b1f76d4d87547e40e1/fastavro/_write_py.py#L336

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks.

@tvalentyn
Copy link
Contributor Author

Run Python 3.5 PostCommit

Copy link
Contributor

@chamikaramj chamikaramj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -629,3 +629,4 @@ def write_record(self, writer, value):

def close(self, writer):
writer.flush()
writer.fo.close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks.

@tvalentyn
Copy link
Contributor Author

tvalentyn commented Jul 22, 2019

Postcommit failure surfaces on a clean build: BEAM-7798.

@tvalentyn
Copy link
Contributor Author

Run Python 2 PostCommit

@aaltay
Copy link
Member

aaltay commented Jul 23, 2019

Run Python 3.5 PostCommit

@aaltay
Copy link
Member

aaltay commented Jul 23, 2019

Run Python 2 PostCommit

@tvalentyn
Copy link
Contributor Author

We need to re-run postsubmits after #9122 is merged.

@tvalentyn
Copy link
Contributor Author

Run Python 2 PostCommit

@tvalentyn
Copy link
Contributor Author

Run Python 3.5 PostCommit

@tvalentyn
Copy link
Contributor Author

Run Python 2 PostCommit

@robertwb robertwb merged commit 09fe498 into apache:master Jul 24, 2019
@tvalentyn tvalentyn deleted the patch-54 branch August 27, 2019 00:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants