Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

importccl: Resume interrupted http download during import. #43374

Merged
merged 1 commit into from
Dec 20, 2019

Conversation

miretskiy
Copy link
Contributor

Make importing data into cockroach from external http servers
resilient to connection interruption.

If the download is interrupted, and if the external server
supports "Accept-Ranges: bytes" (and most servers do support that),
we attempt to request the next "Content-Range" from the server.

Release note (enterprise change): more resilient http import

@miretskiy miretskiy requested a review from dt December 19, 2019 21:52
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the change! However I'm confused by the failing test in CI.

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @miretskiy)


pkg/storage/cloud/external_storage.go, line 563 at r1 (raw file):

		//   there is really nothing we can do: http standard says that
		//   the stream ends when the server terminates connection.
		if err == io.ErrUnexpectedEOF && r.h.canResume {

the new idiom is if errors.Is(err, io.ErrUnexpectedEOF) && ...

Copy link
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @knz)


pkg/storage/cloud/external_storage.go, line 563 at r1 (raw file):

Previously, knz (kena) wrote…

the new idiom is if errors.Is(err, io.ErrUnexpectedEOF) && ...

Done.

Copy link
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at the failures -- those were legit. I had to make some changes in the code to make sure I store retry state in the bodyReader and not on the httpstorage object.
PTAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @knz)

@miretskiy miretskiy force-pushed the resume_interrupted branch 2 times, most recently from c63192b to cc8a611 Compare December 20, 2019 01:30
@miretskiy miretskiy requested a review from knz December 20, 2019 02:08
@@ -474,13 +474,120 @@ func (h *httpStorage) Conf() roachpb.ExternalStorage {
}
}

type reqSender = func(reqHeaders map[string]string) (io.ReadCloser, http.Header, error)

type bodyReader struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2¢: I might call this httpReader or resumingHTTPReader since there are other things in this package that might have bodies that can be read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// checkContentRangeHeader parses Content-Range header and
// ensures that range start offset is the same as 'pos'
// See https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Range
func checkContentRangeHeader(h string, pos int64) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here: I’d add “HTTP” to the name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Make importing data into cockroach from external http servers
resilient to connection interruption.

If the download is interrupted, and if the external server
supports "Accept-Ranges: bytes" (and most servers do support that),
we attempt to request the next "Content-Range" from the server.

Release note (enterprise change): more resilient http import
Copy link
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @knz)


pkg/storage/cloud/external_storage.go, line 479 at r2 (raw file):

Previously, dt (David Taylor) wrote…

2¢: I might call this httpReader or resumingHTTPReader since there are other things in this package that might have bodies that can be read.

Done.


pkg/storage/cloud/external_storage.go, line 497 at r2 (raw file):

Previously, dt (David Taylor) wrote…

Ditto here: I’d add “HTTP” to the name

Done.

@miretskiy
Copy link
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Dec 20, 2019
43374: importccl: Resume interrupted http download during import. r=miretskiy a=miretskiy

Make importing data into cockroach from external http servers
resilient to connection interruption.

If the download is interrupted, and if the external server
supports "Accept-Ranges: bytes" (and most servers do support that),
we attempt to request the next "Content-Range" from the server.

Release note (enterprise change): more resilient http import

Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Dec 20, 2019

Build succeeded

@craig craig bot merged commit b4f52dd into cockroachdb:master Dec 20, 2019
craig bot pushed a commit that referenced this pull request Jan 2, 2020
43687: release-19.2: importccl: resilient import over http r=miretskiy a=miretskiy

Backport:
  * 1/1 commits from "importccl: Resume interrupted http download during import." (#43374)
  * 1/1 commits from "importccl: improve handling of transient errors when importing over http" (#43558)

Please see individual PRs for details.

/cc @cockroachdb/release


Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>
@miretskiy miretskiy deleted the resume_interrupted branch April 27, 2020 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants