Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: Support nodelocal file access using NodeID #41990

Merged
merged 1 commit into from
Nov 4, 2019

Conversation

g3orgia
Copy link
Contributor

@g3orgia g3orgia commented Oct 29, 2019

In order to support importing from a specific node's local file
system, ExternalStorage needs an access to the blob service.
This PR plumbs through the blobClient to ExternalStorage and
uses that to access nodelocal files.

Many of the changes are related to tests that depended on nodelocal
and now need mock blob clients to run. The majority of code changes
are in external_storage.go.

Release notes (enterprise change):
When specifying a nodelocal file URI (ex. within IMPORT,BACKUP),
you can now specify which node's local file system you would like to
use. This is done by specifying that node's ID:
nodelocal://<nodeID>/path/file.csv

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @g3orgia, and @lucy-zhang)


pkg/storage/cloud/external_storage.go, line 309 at r1 (raw file):

	cfg        roachpb.ExternalStorage_LocalFilePath // contains un-prefixed filepath -- DO NOT use for I/O ops.
	base       string                                // relative filepath prefixed with externalIODir, for I/O ops on this node.
	blobClient blobs.BlobClient                      // inter-node file sharing service

One thing I was feeling when looking at the implementation of the blobs.BlobClient is that it's sort of a bummer that we create a new gRPC client connection for every RPC. I was worried to complicate that diff then but I wonder if what we really want here is a factory that takes a node ID and gives you back a client whose interface doesn't know about blob ID. It's not a big change and I think will clean up this code and make it more efficient when used to make a large number of RPCs.


pkg/storage/cloud/external_storage.go, line 321 at r1 (raw file):

func makeNodeLocalURIWithNodeID(nodeID roachpb.NodeID, path string) string {
	return fmt.Sprintf("nodelocal://%d%s", nodeID, path)

isn't there supposed to be a slash here between the %d and the %s?


pkg/storage/cloud/external_storage_test.go, line 106 at r1 (raw file):

		sampleBytes := "hello world"

		// TODO(georgiah): is this a bug?

say more?

Copy link
Contributor Author

@g3orgia g3orgia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @lucy-zhang)


pkg/storage/cloud/external_storage.go, line 309 at r1 (raw file):

Previously, ajwerner wrote…

One thing I was feeling when looking at the implementation of the blobs.BlobClient is that it's sort of a bummer that we create a new gRPC client connection for every RPC. I was worried to complicate that diff then but I wonder if what we really want here is a factory that takes a node ID and gives you back a client whose interface doesn't know about blob ID. It's not a big change and I think will clean up this code and make it more efficient when used to make a large number of RPCs.

Hmmm I'm unsure if I'm understanding the scope of these changes correctly. To do this, wouldn't we need to change the existing interface of the blob client? The instead of creating the grpc client connection within each call, we could have a client cache in the factory and pass that to every blob client created.


pkg/storage/cloud/external_storage.go, line 321 at r1 (raw file):

Previously, ajwerner wrote…

isn't there supposed to be a slash here between the %d and the %s?

This takes in paths from blobClient.List() which already has a slash. I can add a trim, then add the slash to make this function more usable in other situations.


pkg/storage/cloud/external_storage_test.go, line 106 at r1 (raw file):

Previously, ajwerner wrote…

say more?

It was indeed a bug which it has since been fixed. Will remove this comment.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @g3orgia, and @lucy-zhang)


pkg/storage/cloud/external_storage.go, line 309 at r1 (raw file):

To do this, wouldn't we need to change the existing interface of the blob client?

Sure, but nobody currently uses that interface and it will make the interface align with its one client's use case and seems like it will generally be an improvement.

The instead of creating the grpc client connection within each call, we could have a client cache in the factory and pass that to every blob client created.

Yes this could be a nice follow-on optimization but might require reference counting.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems much cleaner, thanks for making the change!

Reviewed 3 of 12 files at r1, 10 of 10 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @g3orgia, and @lucy-zhang)


pkg/blobs/client.go, line 118 at r2 (raw file):

}

func (c *remoteClient) GetNodeID() roachpb.NodeID {

Do we need this anymore?


pkg/blobs/client.go, line 159 at r2 (raw file):

}

// BlobClientFactory creates a blob client based on the nodeID we are dialing.

How about moving the context up to here rather than taking it in NewBlobClientFactory?


pkg/storage/cloud/external_storage.go, line 321 at r1 (raw file):

Previously, g3orgia (Georgia Hong) wrote…

This takes in paths from blobClient.List() which already has a slash. I can add a trim, then add the slash to make this function more usable in other situations.

Oh it was okay then I just wanted to make sure that it was an invariant that you'd always have a leading slash. I think I slightly prefer the explicitness of this and it's pretty much free so 👍


pkg/storage/cloud/external_storage.go, line 374 at r2 (raw file):

	for _, fileName := range matches {
		fileList = append(fileList, makeNodeLocalURIWithNodeID(l.cfg.NodeID, fileName))

What about the case where l.cfg.NodeID is 0? I don't think this is going to do what we want, is it?

Copy link
Contributor Author

@g3orgia g3orgia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is definitely much cleaner :) Seems like there's a few bugs (CI previously didn't run) with existing tests, fixing them right now.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @lucy-zhang)


pkg/blobs/client.go, line 159 at r2 (raw file):

Previously, ajwerner wrote…

How about moving the context up to here rather than taking it in NewBlobClientFactory?

Hmm as in make the Factory an object rather than a function?


pkg/storage/cloud/external_storage.go, line 374 at r2 (raw file):

Previously, ajwerner wrote…

What about the case where l.cfg.NodeID is 0? I don't think this is going to do what we want, is it?

Ah yes, good catch. Will check for that case in that function makeNodeLocalURIWithNodeID.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @dt, @g3orgia, and @lucy-zhang)


pkg/blobs/client.go, line 159 at r2 (raw file):

Previously, g3orgia (Georgia Hong) wrote…

Hmm as in make the Factory an object rather than a function?

As in:

type BlobClientFactory func(ctx context.Context, dialing roachpb.NodeID) (BlobClient, error)

and

func NewBlobClientFactory(
	localNodeID roachpb.NodeID, dialer *nodedialer.Dialer, externalIODir string,
) BlobClientFactory {

@g3orgia g3orgia force-pushed the nodelocal-file-sharing branch 3 times, most recently from 19e5c34 to 1e32933 Compare October 31, 2019 15:53
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 13 of 13 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @g3orgia, and @lucy-zhang)


pkg/blobs/local_storage.go, line 99 at r3 (raw file):

	var fileList []string
	for _, file := range matches {
		fileList = append(fileList, strings.TrimPrefix(file, l.externalIODir+"/"))

This is rather unfortunate because you're going to allocate a new string on every iteration through this loop.

How about:

strings.TrimPrefix(strings.TrimPrefix(file, l.externalIODir), "/")

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactoring looks good, just a few minor comments.

Reviewed 1 of 12 files at r1, 3 of 10 files at r2, 12 of 13 files at r3, 1 of 1 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @g3orgia)


pkg/blobs/testutils.go, line 20 at r4 (raw file):

// TestBlobServiceClient can be used as a mock BlobClient
// in tests that use nodelocal storage

nit: add a period (same for comment below)


pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):

}

func joinWithoutClean(path string, file string) string {

What is this for?

Copy link
Contributor Author

@g3orgia g3orgia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @lucy-zhang)


pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

What is this for?

filepath.Join() also cleans the path. However, Join("/../../blob", "file") would turn into /blob/file, but we actually want to keep /../../ since it is now simplified within the blob service.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r5.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt and @lucy-zhang)


pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):

Previously, g3orgia (Georgia Hong) wrote…

filepath.Join() also cleans the path. However, Join("/../../blob", "file") would turn into /blob/file, but we actually want to keep /../../ since it is now simplified within the blob service.

My understanding is that the only place where the behavior of this function as it differs from filepath.Join are cases which would return an error.

filepath.Join is indeed going to turn /../../ into / because it's an absolute path. I suppose in this case we'd like to retain that prefix so in the implementation of the blob server it becomes<external io dir>/../../ which will lead to an error in the server. One approach might instead be to do filepath.Join(strings.TrimLeft(path, "/"), file) which will make the path relative and will retain any leading /../. That ends up being pretty similar to this in complexity so I think I'm okay with it but a comment explaining why this exists would be helpful.


pkg/storage/cloud/external_storage.go, line 1 at r5 (raw file):

// Copyright 2019 The Cockroach Authors.

I feel like now that I know this file exists I really want to break it up into different files for the different implementations (not an action item for you, just a remark).

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @dt, and @g3orgia)


pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):

Previously, ajwerner wrote…

My understanding is that the only place where the behavior of this function as it differs from filepath.Join are cases which would return an error.

filepath.Join is indeed going to turn /../../ into / because it's an absolute path. I suppose in this case we'd like to retain that prefix so in the implementation of the blob server it becomes<external io dir>/../../ which will lead to an error in the server. One approach might instead be to do filepath.Join(strings.TrimLeft(path, "/"), file) which will make the path relative and will retain any leading /../. That ends up being pretty similar to this in complexity so I think I'm okay with it but a comment explaining why this exists would be helpful.

I guess you could have a path like /../<external-io-dir> or something, in which case we don't expect an error (I think). Anyway, I agree that a comment would be good.

Copy link
Member

@dt dt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)


pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

I guess you could have a path like /../<external-io-dir> or something, in which case we don't expect an error (I think). Anyway, I agree that a comment would be good.

nit: I think everything in here (that is delegating to blobClient as opposed hitting disk directly after this change) needs to use path.{Join,Trim, ...} instead of filepath -- the former always uses the standard, host-independent separator, whereas the latter is host-specific, so as long as a path still has the potential to be put in a network request, it should only use path.Join on it.

I'm still not quiet clear why the wrapping doing an extra pass of cleaning is wrong though? like, sure, it collapses /foo/../ to /, but that still ends up the same path on the remote side after collapsing. and if they started their path relative (../), that won't be collapsed here at all (and will trigger an error when it is collapsed when joined with the prefix since it'll stop matching the prefix). Can ou give an example of where doing the cleaning here results in reading the wrong path?

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)


pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):

Previously, dt (David Taylor) wrote…

nit: I think everything in here (that is delegating to blobClient as opposed hitting disk directly after this change) needs to use path.{Join,Trim, ...} instead of filepath -- the former always uses the standard, host-independent separator, whereas the latter is host-specific, so as long as a path still has the potential to be put in a network request, it should only use path.Join on it.

I'm still not quiet clear why the wrapping doing an extra pass of cleaning is wrong though? like, sure, it collapses /foo/../ to /, but that still ends up the same path on the remote side after collapsing. and if they started their path relative (../), that won't be collapsed here at all (and will trigger an error when it is collapsed when joined with the prefix since it'll stop matching the prefix). Can ou give an example of where doing the cleaning here results in reading the wrong path?

The example georgia provided I think is valid. Imagine the path is not relative but rather is absolute, say /foo/../../ and the file is bar. filepath.Join will turn it into /bar which will then turn into <externalIODir>/bar which might be a valid file. It seems like the correct thing for it to turn into is into <externalIODir>/../bar which will be an error. I think the trick is that we want to make sure path here is not absolute.

I think filepath.Join("./", path, file) will do what we want.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)


pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):

Previously, ajwerner wrote…

The example georgia provided I think is valid. Imagine the path is not relative but rather is absolute, say /foo/../../ and the file is bar. filepath.Join will turn it into /bar which will then turn into <externalIODir>/bar which might be a valid file. It seems like the correct thing for it to turn into is into <externalIODir>/../bar which will be an error. I think the trick is that we want to make sure path here is not absolute.

I think filepath.Join("./", path, file) will do what we want.

Or rather filepath.Join(".", path, file)

Copy link
Contributor Author

@g3orgia g3orgia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @lucy-zhang)


pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):

Previously, ajwerner wrote…

Or rather filepath.Join(".", path, file)

This was discovered in this failing test. It backs up to nodelocal:///../../blah and expects an external directory error, but could not find this error because the backup went to <externalIODir>/blah.

Seems like @ajwerner 's suggestion, path.Join(".", path, file) works, since it's a clean way to make this path relative. Going to change to that, making sure we use path instead of filepath.

Copy link
Member

@dt dt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @lucy-zhang)


pkg/storage/cloud/external_storage.go, line 361 at r4 (raw file):

Previously, g3orgia (Georgia Hong) wrote…

This was discovered in this failing test. It backs up to nodelocal:///../../blah and expects an external directory error, but could not find this error because the backup went to <externalIODir>/blah.

Seems like @ajwerner 's suggestion, path.Join(".", path, file) works, since it's a clean way to make this path relative. Going to change to that, making sure we use path instead of filepath.

sounds good

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r6.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @g3orgia and @lucy-zhang)


pkg/testutils/lint/lint_test.go, line 1194 at r6 (raw file):

			stream.GrepNot(`cockroach/pkg/server/debug/pprofui: path$`),
			stream.GrepNot(`cockroach/pkg/util/caller: path$`),
			stream.GrepNot(`cockroach/pkg/storage/cloud: path$`),

What's going on here?

Copy link
Contributor Author

@g3orgia g3orgia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @lucy-zhang)


pkg/testutils/lint/lint_test.go, line 1194 at r6 (raw file):

Previously, ajwerner wrote…

What's going on here?

This lint test checks that we import path/filepath instead of path because that's what we should be using in most cases. This test used ignoring that check for storageccl where thestorage/cloud package was recently moved from. As David mentioned, we should be using path here, so we need to ignore this package from the path check.

@g3orgia
Copy link
Contributor Author

g3orgia commented Nov 4, 2019

bors r=ajwerner,lucy-zhang

@craig
Copy link
Contributor

craig bot commented Nov 4, 2019

Build failed

In order to support importing from a specific node's local file
system, `ExternalStorage` needs an access to the blob service.
This PR plumbs through the `blobClient` to `ExternalStorage` and
uses that to access nodelocal files.

Many of the changes are related to tests that depended on nodelocal
and now need mock blob clients to run. The majority of code changes
are in `external_storage.go`.

Release notes (enterprise change):
When specifying a nodelocal file URI (ex. within `IMPORT`,`BACKUP`),
you can now specify which node's local file system you would like to
use. This is done by specifying that node's ID in the URI:
`nodelocal://<nodeID>/path/file.csv`
@g3orgia
Copy link
Contributor Author

g3orgia commented Nov 4, 2019

bors r+

1 similar comment
@g3orgia
Copy link
Contributor Author

g3orgia commented Nov 4, 2019

bors r+

craig bot pushed a commit that referenced this pull request Nov 4, 2019
41990: storage: Support nodelocal file access using NodeID r=g3orgia a=g3orgia

In order to support importing from a specific node's local file
system, `ExternalStorage` needs an access to the blob service.
This PR plumbs through the `blobClient` to `ExternalStorage` and
uses that to access nodelocal files.

Many of the changes are related to tests that depended on nodelocal
and now need mock blob clients to run. The majority of code changes
are in `external_storage.go`.

Release notes (enterprise change):
When specifying a nodelocal file URI (ex. within `IMPORT`,`BACKUP`),
you can now specify which node's local file system you would like to
use. This is done by specifying that node's ID:
`nodelocal://<nodeID>/path/file.csv`

Co-authored-by: Georgia Hong <georgiah@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Nov 4, 2019

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants