Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Don't close upstream on HttpFile::Flush #1201

Merged

Conversation

petzeb
Copy link
Contributor

@petzeb petzeb commented Apr 24, 2023

Closing the upstream on flush will effectively terminate the ongoing curl connection. This means that we would need re-establish the connection in order to resume writing, this is not what we want. In the spirit of the documentation of File::Flush

/// Flush the file so that recently written data will survive an 
/// application crash (but not necessarily an OS crash). For 
/// instance, in LocalFile the data is flushed into the OS but not 
/// necessarily to disk.

We will instead wait for the curl thread to finish consuming what ever might be in the upload cache, but leave the connection open for subsequent writes.

Fixes #1196

@google-cla
Copy link

google-cla bot commented Apr 24, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Closing the upstream on flush will effectively terminate the ongoing
curl connection. This means that we would need re-establish the
connection in order to resume writing, this is not what we want. In the
spirit of the documentation of File::Flush
/// Flush the file so that recently written data will survive an
/// application crash (but not necessarily an OS crash). For
/// instance, in LocalFile the data is flushed into the OS but not
/// necessarily to disk.
We will instead wait for the curl thread to finish consuming what ever
might be in the upload cache, but leave the connection open for
subsequent writes.
@petzeb petzeb force-pushed the dont_close_http_upstream_on_flush branch from bd957c3 to 1b5429e Compare April 24, 2023 14:22
@petzeb petzeb changed the title http_file: Don't close upstream on Flush fix: Don't close upstream on HttpFile::Flush Apr 24, 2023
@petzeb
Copy link
Contributor Author

petzeb commented Apr 24, 2023

This is an attempt to fix #1196
Open for other suggestions.

@joeyparrish
Copy link
Member

Should I tag this PR as closing issue #1196?

Could you add a unit test for this?

@petzeb
Copy link
Contributor Author

petzeb commented May 4, 2023

Should I tag this PR as closing issue #1196?

Yes, if this approach seems reasonable. I am quite new to the shaka-packager code base.

Could you add a unit test for this?

I'll see what I can do 👍

@petzeb
Copy link
Contributor Author

petzeb commented May 18, 2023

@joeyparrish Sorry for the delay, I was just about to start looking at adding a unit test but I noticed now that all the unit tests for http_file seem to be disabled. Any idea why this is?

@joeyparrish
Copy link
Member

I don't know the history for those tests being disabled, but in the cmake branch, they have been re-enabled. When the cmake porting work is complete and merges back into main, everything should be enabled and working.

I will merge this as is, but follow up with a new test in the cmake branch.

@joeyparrish
Copy link
Member

Assigned myself #1224 to follow up with a new test in the cmake branch, but if you want to do it yourself, please feel free to comment on #1224 so I can assign it to you.

Thanks!

@joeyparrish joeyparrish merged commit 53d91cd into shaka-project:main Jul 5, 2023
22 of 24 checks passed
@joeyparrish
Copy link
Member

This is causing issues with upload tests. Essentially, they hang forever because curl thinks there will be more chunks to upload.

I see how your change is more in line with the original intent of Flush(), and I have yet to find evidence of Flush() being used in the normal pipeline to write data to the server during an upload. It should perhaps be Close() that signals the last chunk has been sent, but there will need to be more changes to make that work, since right now Close() terminates the HTTP connection, possibly without reading the response.

joeyparrish added a commit to joeyparrish/shaka-packager that referenced this pull request Jul 13, 2023
All HTTP-based tests now use an embedded test server instead of
httpbin.org, which makes them much faster and more reliable.

These more reliable tests also exposed some issues that began recently
with PR shaka-project#1201.  HttpFile's Flush() semantics were different than those
documented for files in general.  Flush() used to close the file for
uploading, so that no further writes were allowed, but the documentation
stated that it would only flush data to its destination.  PR shaka-project#1201
brought HttpFile's Flush() in line with the docs, but gave us no way to
terminate a chunked upload.

This adds a new method to File called CloseForWriting(), which
terminates a chunked upload for HttpFile.  The only other implementation
that does anything is UdpFile, which uses the socket library function
shutdown() to terminate writes while allowing reads.

This also tweaks HttpFile::CloseWithStatus() so that it will not
generate an error if the file is closed before the HTTP response is
written to the download cache.

This modifies the test HttpFileTest.MultipleWrites so that the file is
Flushed after each chunk.  This adds test coverage for the changes
introduced in PR shaka-project#1201.

Fixes shaka-project#1224 (missing test coverage for HttpFile::Flush)
joeyparrish added a commit to joeyparrish/shaka-packager that referenced this pull request Jul 13, 2023
All HTTP-based tests now use an embedded test server instead of
httpbin.org, which makes them much faster and more reliable.

These more reliable tests also exposed some issues that began recently
with PR shaka-project#1201.  HttpFile's Flush() semantics were different than those
documented for files in general.  Flush() used to close the file for
uploading, so that no further writes were allowed, but the documentation
stated that it would only flush data to its destination.  PR shaka-project#1201
brought HttpFile's Flush() in line with the docs, but gave us no way to
terminate a chunked upload.

This adds a new method to File called CloseForWriting(), which
terminates a chunked upload for HttpFile.  The only other implementation
that does anything is UdpFile, which uses the socket library function
shutdown() to terminate writes while allowing reads.

This also tweaks HttpFile::CloseWithStatus() so that it will not
generate an error if the file is closed before the HTTP response is
written to the download cache.

This modifies the test HttpFileTest.MultipleWrites so that the file is
Flushed after each chunk.  This adds test coverage for the changes
introduced in PR shaka-project#1201.

Fixes shaka-project#1224 (missing test coverage for HttpFile::Flush)
@joeyparrish
Copy link
Member

I believe I have a solution now. Please see #1230, #1231, and #1232.

joeyparrish added a commit to joeyparrish/shaka-packager that referenced this pull request Jul 13, 2023
All HTTP-based tests now use an embedded test server instead of
httpbin.org, which makes them much faster and more reliable.

These more reliable tests also exposed some issues that began recently
with PR shaka-project#1201.  HttpFile's Flush() semantics were different than those
documented for files in general.  Flush() used to close the file for
uploading, so that no further writes were allowed, but the documentation
stated that it would only flush data to its destination.  PR shaka-project#1201
brought HttpFile's Flush() in line with the docs, but gave us no way to
terminate a chunked upload.

This adds a new method to File called CloseForWriting(), which
terminates a chunked upload for HttpFile.  The only other implementation
that does anything is UdpFile, which uses the socket library function
shutdown() to terminate writes while allowing reads.

This also tweaks HttpFile::CloseWithStatus() so that it will not
generate an error if the file is closed before the HTTP response is
written to the download cache.

This modifies the test HttpFileTest.MultipleWrites so that the file is
Flushed after each chunk.  This adds test coverage for the changes
introduced in PR shaka-project#1201.

Fixes shaka-project#1224 (missing test coverage for HttpFile::Flush)
joeyparrish added a commit that referenced this pull request Jul 14, 2023
All HTTP-based tests now use an embedded test server instead of
httpbin.org, which makes them much faster and more reliable.

These more reliable tests also exposed some issues that began recently
with PR #1201.  HttpFile's Flush() semantics were different than those
documented for files in general.  Flush() used to close the file for
uploading, so that no further writes were allowed, but the documentation
stated that it would only flush data to its destination.  PR #1201
brought HttpFile's Flush() in line with the docs, but gave us no way to
terminate a chunked upload.

This adds a new method to File called CloseForWriting(), which
terminates a chunked upload for HttpFile.  The only other implementation
that does anything is UdpFile, which uses the socket library function
shutdown() to terminate writes while allowing reads.

This also tweaks HttpFile::CloseWithStatus() so that it will not
generate an error if the file is closed before the HTTP response is
written to the download cache.

This modifies the test HttpFileTest.MultipleWrites so that the file is
Flushed after each chunk.  This adds test coverage for the changes
introduced in PR #1201.

Fixes #1224 (missing test coverage for HttpFile::Flush)
@github-actions github-actions bot added the status: archived Archived and locked; will not be updated label Sep 3, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
status: archived Archived and locked; will not be updated
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WEBM: Init segment with HTTP PUT destination hangs
2 participants