-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: Support nodelocal file access using NodeID #41990
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @g3orgia, and @lucy-zhang)
pkg/storage/cloud/external_storage.go, line 309 at r1 (raw file):
cfg roachpb.ExternalStorage_LocalFilePath // contains un-prefixed filepath -- DO NOT use for I/O ops. base string // relative filepath prefixed with externalIODir, for I/O ops on this node. blobClient blobs.BlobClient // inter-node file sharing service
One thing I was feeling when looking at the implementation of the blobs.BlobClient is that it's sort of a bummer that we create a new gRPC client connection for every RPC. I was worried to complicate that diff then but I wonder if what we really want here is a factory that takes a node ID and gives you back a client whose interface doesn't know about blob ID. It's not a big change and I think will clean up this code and make it more efficient when used to make a large number of RPCs.
pkg/storage/cloud/external_storage.go, line 321 at r1 (raw file):
func makeNodeLocalURIWithNodeID(nodeID roachpb.NodeID, path string) string { return fmt.Sprintf("nodelocal://%d%s", nodeID, path)
isn't there supposed to be a slash here between the %d
and the %s
?
pkg/storage/cloud/external_storage_test.go, line 106 at r1 (raw file):
sampleBytes := "hello world" // TODO(georgiah): is this a bug?
say more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @lucy-zhang)
pkg/storage/cloud/external_storage.go, line 309 at r1 (raw file):
Previously, ajwerner wrote…
One thing I was feeling when looking at the implementation of the blobs.BlobClient is that it's sort of a bummer that we create a new gRPC client connection for every RPC. I was worried to complicate that diff then but I wonder if what we really want here is a factory that takes a node ID and gives you back a client whose interface doesn't know about blob ID. It's not a big change and I think will clean up this code and make it more efficient when used to make a large number of RPCs.
Hmmm I'm unsure if I'm understanding the scope of these changes correctly. To do this, wouldn't we need to change the existing interface of the blob client? The instead of creating the grpc client connection within each call, we could have a client cache in the factory and pass that to every blob client created.
pkg/storage/cloud/external_storage.go, line 321 at r1 (raw file):
Previously, ajwerner wrote…
isn't there supposed to be a slash here between the
%d
and the%s
?
This takes in paths from blobClient.List() which already has a slash. I can add a trim, then add the slash to make this function more usable in other situations.
pkg/storage/cloud/external_storage_test.go, line 106 at r1 (raw file):
Previously, ajwerner wrote…
say more?
It was indeed a bug which it has since been fixed. Will remove this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @g3orgia, and @lucy-zhang)
pkg/storage/cloud/external_storage.go, line 309 at r1 (raw file):
To do this, wouldn't we need to change the existing interface of the blob client?
Sure, but nobody currently uses that interface and it will make the interface align with its one client's use case and seems like it will generally be an improvement.
The instead of creating the grpc client connection within each call, we could have a client cache in the factory and pass that to every blob client created.
Yes this could be a nice follow-on optimization but might require reference counting.
c1e1748
to
a9ec9d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems much cleaner, thanks for making the change!
Reviewed 3 of 12 files at r1, 10 of 10 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @dt, @g3orgia, and @lucy-zhang)
pkg/blobs/client.go, line 118 at r2 (raw file):
} func (c *remoteClient) GetNodeID() roachpb.NodeID {
Do we need this anymore?
pkg/blobs/client.go, line 159 at r2 (raw file):
} // BlobClientFactory creates a blob client based on the nodeID we are dialing.
How about moving the context up to here rather than taking it in NewBlobClientFactory?
pkg/storage/cloud/external_storage.go, line 321 at r1 (raw file):
Previously, g3orgia (Georgia Hong) wrote…
This takes in paths from blobClient.List() which already has a slash. I can add a trim, then add the slash to make this function more usable in other situations.
Oh it was okay then I just wanted to make sure that it was an invariant that you'd always have a leading slash. I think I slightly prefer the explicitness of this and it's pretty much free so 👍
pkg/storage/cloud/external_storage.go, line 374 at r2 (raw file):
for _, fileName := range matches { fileList = append(fileList, makeNodeLocalURIWithNodeID(l.cfg.NodeID, fileName))
What about the case where l.cfg.NodeID
is 0? I don't think this is going to do what we want, is it?
a9ec9d5
to
bb3a0dd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is definitely much cleaner :) Seems like there's a few bugs (CI previously didn't run) with existing tests, fixing them right now.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @lucy-zhang)
pkg/blobs/client.go, line 159 at r2 (raw file):
Previously, ajwerner wrote…
How about moving the context up to here rather than taking it in NewBlobClientFactory?
Hmm as in make the Factory an object rather than a function?
pkg/storage/cloud/external_storage.go, line 374 at r2 (raw file):
Previously, ajwerner wrote…
What about the case where
l.cfg.NodeID
is 0? I don't think this is going to do what we want, is it?
Ah yes, good catch. Will check for that case in that function makeNodeLocalURIWithNodeID
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @g3orgia, and @lucy-zhang)
pkg/blobs/client.go, line 159 at r2 (raw file):
Previously, g3orgia (Georgia Hong) wrote…
Hmm as in make the Factory an object rather than a function?
As in:
type BlobClientFactory func(ctx context.Context, dialing roachpb.NodeID) (BlobClient, error)
and
func NewBlobClientFactory(
localNodeID roachpb.NodeID, dialer *nodedialer.Dialer, externalIODir string,
) BlobClientFactory {
19e5c34
to
1e32933
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 13 of 13 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @dt, @g3orgia, and @lucy-zhang)
pkg/blobs/local_storage.go, line 99 at r3 (raw file):
var fileList []string for _, file := range matches { fileList = append(fileList, strings.TrimPrefix(file, l.externalIODir+"/"))
This is rather unfortunate because you're going to allocate a new string on every iteration through this loop.
How about:
strings.TrimPrefix(strings.TrimPrefix(file, l.externalIODir), "/")
1e32933
to
7377283
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactoring looks good, just a few minor comments.
Reviewed 1 of 12 files at r1, 3 of 10 files at r2, 12 of 13 files at r3, 1 of 1 files at r4.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @dt and @g3orgia)
pkg/blobs/testutils.go, line 20 at r4 (raw file):
// TestBlobServiceClient can be used as a mock BlobClient // in tests that use nodelocal storage
nit: add a period (same for comment below)
pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):
} func joinWithoutClean(path string, file string) string {
What is this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @dt and @lucy-zhang)
pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
What is this for?
filepath.Join()
also cleans the path. However, Join("/../../blob", "file")
would turn into /blob/file
, but we actually want to keep /../../
since it is now simplified within the blob service.
7377283
to
08ae4a4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @dt and @lucy-zhang)
pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):
Previously, g3orgia (Georgia Hong) wrote…
filepath.Join()
also cleans the path. However,Join("/../../blob", "file")
would turn into/blob/file
, but we actually want to keep/../../
since it is now simplified within the blob service.
My understanding is that the only place where the behavior of this function as it differs from filepath.Join
are cases which would return an error.
filepath.Join
is indeed going to turn /../../
into /
because it's an absolute path. I suppose in this case we'd like to retain that prefix so in the implementation of the blob server it becomes<external io dir>/../../
which will lead to an error in the server. One approach might instead be to do filepath.Join(strings.TrimLeft(path, "/"), file)
which will make the path relative and will retain any leading /../
. That ends up being pretty similar to this in complexity so I think I'm okay with it but a comment explaining why this exists would be helpful.
pkg/storage/cloud/external_storage.go, line 1 at r5 (raw file):
// Copyright 2019 The Cockroach Authors.
I feel like now that I know this file exists I really want to break it up into different files for the different implementations (not an action item for you, just a remark).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @g3orgia)
pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):
Previously, ajwerner wrote…
My understanding is that the only place where the behavior of this function as it differs from
filepath.Join
are cases which would return an error.
filepath.Join
is indeed going to turn/../../
into/
because it's an absolute path. I suppose in this case we'd like to retain that prefix so in the implementation of the blob server it becomes<external io dir>/../../
which will lead to an error in the server. One approach might instead be to dofilepath.Join(strings.TrimLeft(path, "/"), file)
which will make the path relative and will retain any leading/../
. That ends up being pretty similar to this in complexity so I think I'm okay with it but a comment explaining why this exists would be helpful.
I guess you could have a path like /../<external-io-dir>
or something, in which case we don't expect an error (I think). Anyway, I agree that a comment would be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)
pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
I guess you could have a path like
/../<external-io-dir>
or something, in which case we don't expect an error (I think). Anyway, I agree that a comment would be good.
nit: I think everything in here (that is delegating to blobClient as opposed hitting disk directly after this change) needs to use path.{Join,Trim, ...}
instead of filepath
-- the former always uses the standard, host-independent separator, whereas the latter is host-specific, so as long as a path still has the potential to be put in a network request, it should only use path.Join
on it.
I'm still not quiet clear why the wrapping doing an extra pass of cleaning is wrong though? like, sure, it collapses /foo/../
to /
, but that still ends up the same path on the remote side after collapsing. and if they started their path relative (../
), that won't be collapsed here at all (and will trigger an error when it is collapsed when joined with the prefix since it'll stop matching the prefix). Can ou give an example of where doing the cleaning here results in reading the wrong path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)
pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):
Previously, dt (David Taylor) wrote…
nit: I think everything in here (that is delegating to blobClient as opposed hitting disk directly after this change) needs to use
path.{Join,Trim, ...}
instead offilepath
-- the former always uses the standard, host-independent separator, whereas the latter is host-specific, so as long as a path still has the potential to be put in a network request, it should only usepath.Join
on it.I'm still not quiet clear why the wrapping doing an extra pass of cleaning is wrong though? like, sure, it collapses
/foo/../
to/
, but that still ends up the same path on the remote side after collapsing. and if they started their path relative (../
), that won't be collapsed here at all (and will trigger an error when it is collapsed when joined with the prefix since it'll stop matching the prefix). Can ou give an example of where doing the cleaning here results in reading the wrong path?
The example georgia provided I think is valid. Imagine the path is not relative but rather is absolute, say /foo/../../
and the file is bar
. filepath.Join
will turn it into /bar
which will then turn into <externalIODir>/bar
which might be a valid file. It seems like the correct thing for it to turn into is into <externalIODir>/../bar
which will be an error. I think the trick is that we want to make sure path
here is not absolute.
I think filepath.Join("./", path, file)
will do what we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)
pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):
Previously, ajwerner wrote…
The example georgia provided I think is valid. Imagine the path is not relative but rather is absolute, say
/foo/../../
and the file isbar
.filepath.Join
will turn it into/bar
which will then turn into<externalIODir>/bar
which might be a valid file. It seems like the correct thing for it to turn into is into<externalIODir>/../bar
which will be an error. I think the trick is that we want to make surepath
here is not absolute.I think
filepath.Join("./", path, file)
will do what we want.
Or rather filepath.Join(".", path, file)
08ae4a4
to
a61add6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @lucy-zhang)
pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):
Previously, ajwerner wrote…
Or rather
filepath.Join(".", path, file)
This was discovered in this failing test. It backs up to nodelocal:///../../blah
and expects an external directory error, but could not find this error because the backup went to <externalIODir>/blah
.
Seems like @ajwerner 's suggestion, path.Join(".", path, file)
works, since it's a clean way to make this path relative. Going to change to that, making sure we use path
instead of filepath
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @lucy-zhang)
pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):
Previously, g3orgia (Georgia Hong) wrote…
This was discovered in this failing test. It backs up to
nodelocal:///../../blah
and expects an external directory error, but could not find this error because the backup went to<externalIODir>/blah
.Seems like @ajwerner 's suggestion,
path.Join(".", path, file)
works, since it's a clean way to make this path relative. Going to change to that, making sure we usepath
instead offilepath
.
sounds good
a61add6
to
fecf6e3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r6.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @g3orgia and @lucy-zhang)
pkg/testutils/lint/lint_test.go, line 1194 at r6 (raw file):
stream.GrepNot(`cockroach/pkg/server/debug/pprofui: path$`), stream.GrepNot(`cockroach/pkg/util/caller: path$`), stream.GrepNot(`cockroach/pkg/storage/cloud: path$`),
What's going on here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @lucy-zhang)
pkg/testutils/lint/lint_test.go, line 1194 at r6 (raw file):
Previously, ajwerner wrote…
What's going on here?
This lint test checks that we import path/filepath
instead of path
because that's what we should be using in most cases. This test used ignoring that check for storageccl
where thestorage/cloud
package was recently moved from. As David mentioned, we should be using path
here, so we need to ignore this package from the path
check.
bors r=ajwerner,lucy-zhang |
Build failed |
In order to support importing from a specific node's local file system, `ExternalStorage` needs an access to the blob service. This PR plumbs through the `blobClient` to `ExternalStorage` and uses that to access nodelocal files. Many of the changes are related to tests that depended on nodelocal and now need mock blob clients to run. The majority of code changes are in `external_storage.go`. Release notes (enterprise change): When specifying a nodelocal file URI (ex. within `IMPORT`,`BACKUP`), you can now specify which node's local file system you would like to use. This is done by specifying that node's ID in the URI: `nodelocal://<nodeID>/path/file.csv`
fecf6e3
to
60ccc62
Compare
bors r+ |
1 similar comment
bors r+ |
41990: storage: Support nodelocal file access using NodeID r=g3orgia a=g3orgia In order to support importing from a specific node's local file system, `ExternalStorage` needs an access to the blob service. This PR plumbs through the `blobClient` to `ExternalStorage` and uses that to access nodelocal files. Many of the changes are related to tests that depended on nodelocal and now need mock blob clients to run. The majority of code changes are in `external_storage.go`. Release notes (enterprise change): When specifying a nodelocal file URI (ex. within `IMPORT`,`BACKUP`), you can now specify which node's local file system you would like to use. This is done by specifying that node's ID: `nodelocal://<nodeID>/path/file.csv` Co-authored-by: Georgia Hong <georgiah@cockroachlabs.com>
Build succeeded |
In order to support importing from a specific node's local file
system,
ExternalStorage
needs an access to the blob service.This PR plumbs through the
blobClient
toExternalStorage
anduses that to access nodelocal files.
Many of the changes are related to tests that depended on nodelocal
and now need mock blob clients to run. The majority of code changes
are in
external_storage.go
.Release notes (enterprise change):
When specifying a nodelocal file URI (ex. within
IMPORT
,BACKUP
),you can now specify which node's local file system you would like to
use. This is done by specifying that node's ID:
nodelocal://<nodeID>/path/file.csv