-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql,externalconn: introduce CREATE EXTERNAL CONNECTION #84310
Conversation
6ab9782
to
066577b
Compare
Sorry for the large review but the first two commits are just the general motions of adding a system table. If you would prefer the third commit in a separate PR I'm happy to pull it out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great.
"net/url" | ||
|
||
"github.com/cockroachdb/cockroach/pkg/cloud/externalconn/connectionpb" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a "mouthful"... maybe just proto dir?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting renaming connectionpb
? I just went with the nomenclature some other packages like spanconfig
use.
if err != nil { | ||
// If we see a UniqueViolation it means that we are attempting to create an | ||
// External Connection with a `connection_name` that already exists. | ||
if pgerror.GetPGCode(err) == pgcode.UniqueViolation { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what was the resolution re uniqueness? Do we need to also associate
"version" (i.e. some sort of unique number/id) with the record so that if some conn is dropped and then recreated with the same name as something else.... Or it's v2 stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the main concern was around the renaming of External Connections. However, @dt rightly pointed out that we should not allow this since it opens us up to having to rewrite all persisted references of this object.
As far as reusing a name goes: this should be tackled by the v2 work around not allowing a drop if someone is actively using the connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand these concerns.. However, just like you have e.g. column or table names, and those get converted to IDs during query execution, simply having a unique ID associated with each resource and then storing that ID with the job record allows you to handle "renames" -- those are drop/create, or name reuse after old one is dropped.
|
||
UNSPECIFIED = 0 [(gogoproto.enumvalue_customname) = "TypeUnspecified"]; | ||
STORAGE = 1 [(gogoproto.enumvalue_customname) = "TypeStorage"]; | ||
KMS = 2 [(gogoproto.enumvalue_customname) = "TypeKMS"]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where/how is this used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KMS is not used anywhere so I've deleted it. Storage is returned by nodelocal. As such the type isn't used anywhere today, but it will be used by SHOW BACKUP EXTERNAL CONNECTIONS
when we do add that in.
066577b
to
7452a3b
Compare
7452a3b
to
1d2dfcc
Compare
@@ -56,16 +56,20 @@ func TestDescriptorRepairOrphanedDescriptors(t *testing.T) { | |||
// DROP DATABASE db CASCADE; | |||
// | |||
// This, due to #51782, leads to the table remaining public but with no | |||
// parent database (52). | |||
// parent database. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WHERE name = 't' | ||
); | ||
|
||
NB: As of 07/15/22 the injected descriptor has been edited to have an ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eb070cf
to
aba7945
Compare
This change introduces the `system.external_connections` table and its associated migration. The system table will be responsible for storing External Connection objects that represent resources that reside outside of CockroachDB, for example, a bucket being backed up into or a sink being written to by a changefeed. Informs: cockroachdb#84225 Release note: None
aba7945
to
2002878
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! thank you!
2002878
to
6129bcb
Compare
This change introduces the `CREATE EXTERNAL CONNECTION` syntax to CockroachDB. This statement can be used to create an External Connection object that represents an external resource. Majority of this change is introducing the required interfaces to persist an External Connection object in the `system.external_connections` table. We only register `nodelocal` as a supported External Connection to allow for end to end testing of functionality. A user can now execute: `CREATE EXTERNAL CONNECTION foo AS 'nodelocal://1/foo';` All the other providers will be registered as part of cockroachdb#84228. Furthermore, none of the permission model outlined in the RFC cockroachdb#84209 has been implemented, to minimize the scope of this change. Release note (sql change): introduce `CREATE EXTERNAL CONNECTION` syntax that can be used to create an External Connection representing a resource that resides outside of CockroachDB. The only supported resource at the moment is a `nodelocal` URI that can be represented as an External Connection object using: `CREATE EXTERNAL CONNECTION foo AS 'nodelocal://1/foo'`; Fixes: cockroachdb#84225
6129bcb
to
fd4e5ba
Compare
Bazel is only failing on the known windows build error. TFTR!! bors r=miretskiy,benbardin |
84310: sql,externalconn: introduce CREATE EXTERNAL CONNECTION r=miretskiy,benbardin a=adityamaru systemschema: introduce system.external_connections table This change introduces the `system.external_connections` table and its associated migration. The system table will be responsible for storing External Connection objects that represent resources that reside outside of CockroachDB, for example, a bucket being backed up into or a sink being written to by a changefeed. Informs: #84225 Release note: None sql,cli,spanconfigccl: test updates when adding system table This is a purely mechanical commit to update all the tests that need updating when introducing a new system table. Release note: None sql,externalconn: introduce CREATE EXTERNAL CONNECTION This change introduces the `CREATE EXTERNAL CONNECTION` syntax to CockroachDB. This statement can be used to create an External Connection object that represents an external resource. Majority of this change is introducing the required interfaces to persist an External Connection object in the `system.external_connections` table. We only register `nodelocal` as a supported External Connection to allow for end to end testing of functionality. A user can now execute: `CREATE EXTERNAL CONNECTION foo AS 'nodelocal://1/foo';` All the other providers will be registered as part of #84228. Furthermore, none of the permission model outlined in the RFC #84209 has been implemented, to minimize the scope of this change. Release note (sql change): introduce `CREATE EXTERNAL CONNECTION` syntax that can be used to create an External Connection representing a resource that resides outside of CockroachDB. The only supported resource at the moment is a `nodelocal` URI that can be represented as an External Connection object using: `CREATE EXTERNAL CONNECTION foo AS 'nodelocal://1/foo'`; Fixes: #84225 Co-authored-by: Aditya Maru <adityamaru@gmail.com>
Build failed: |
bors r+ |
Build failed (retrying...): |
Build succeeded: |
In cockroachdb#84310 we added the ability to create an external connection to represent a `nodelocal` endpoint. This diff is the second piece of the puzzle that allows systems in CRDB to interact with the external connection object. We introduce an `external` URI scheme to the `ExternalStorage` registry. URIs with an `external` scheme are required to contain a host component referring to the unique name of an existing extenral connection object. Optionally, the URI can also contain a path component that will be appended to the endpoint that was specified at the time the external connection object was created. This is necessary for operations such as backup and restore that read/write to subdirectories in the endpoint inputted by the user. A nice UX win is the abilility to have a single external connection object for the base bucket, and then interact with all the subdirectories without having to create an object for each directory. In the future we may want to clamp down on this, and allow the user to specify which objects permit subdirectory access. The `external://<object-name>/<optional-path>` URI is parsed and the underlying object is fetched from the `system.external_connections` table. The resource represented by the object is then `Dial()`ed to return an `ExternalStorage` handle that can be used to read, write, list etc. With this change all bulk operations and cdc are able to use external connections to represent a `nodelocal` endpoint. For example, a backup can now be run as: ``` CREATE EXTERNAL CONNECTION foo AS 'nodelocal://1/foo'; BACKUP INTO 'external://foo'; RESTORE FROM LATEST IN 'external://foo'; ``` Gradually, we will add support for all other external storage endpoints as well. Note, we do not register limiter settings for the `external` provider `ExternalStorage` interface, nor do we wrap it with an ioRecorder. This is because the underlying resource of the external connection object will already have its registered limiters and recorder. Informs: cockroachdb#84753 Release note (sql change): Bulk operations and CDC will accept an `external` scheme URI that points to a previously created External Connection object, These operations can then interact with the underlying resource represented by the object as they did before.
In cockroachdb#84310 we added the ability to create an external connection to represent a `nodelocal` endpoint. This diff is the second piece of the puzzle that allows systems in CRDB to interact with the external connection object. We introduce an `external` URI scheme to the `ExternalStorage` registry. URIs with an `external` scheme are required to contain a host component referring to the unique name of an existing extenral connection object. Optionally, the URI can also contain a path component that will be appended to the endpoint that was specified at the time the external connection object was created. This is necessary for operations such as backup and restore that read/write to subdirectories in the endpoint inputted by the user. A nice UX win is the abilility to have a single external connection object for the base bucket, and then interact with all the subdirectories without having to create an object for each directory. In the future we may want to clamp down on this, and allow the user to specify which objects permit subdirectory access. The `external://<object-name>/<optional-path>` URI is parsed and the underlying object is fetched from the `system.external_connections` table. The resource represented by the object is then `Dial()`ed to return an `ExternalStorage` handle that can be used to read, write, list etc. With this change all bulk operations and cdc are able to use external connections to represent a `nodelocal` endpoint. For example, a backup can now be run as: ``` CREATE EXTERNAL CONNECTION foo AS 'nodelocal://1/foo'; BACKUP INTO 'external://foo'; RESTORE FROM LATEST IN 'external://foo'; ``` Gradually, we will add support for all other external storage endpoints as well. Note, we do not register limiter settings for the `external` provider `ExternalStorage` interface, nor do we wrap it with an ioRecorder. This is because the underlying resource of the external connection object will already have its registered limiters and recorder. Informs: cockroachdb#84753 Release note (sql change): Bulk operations and CDC will accept an `external` scheme URI that points to a previously created External Connection object, These operations can then interact with the underlying resource represented by the object as they did before.
In cockroachdb#84310 we added the ability to create an external connection to represent a `nodelocal` endpoint. This diff is the second piece of the puzzle that allows systems in CRDB to interact with the external connection object. We introduce an `external` URI scheme to the `ExternalStorage` registry. URIs with an `external` scheme are required to contain a host component referring to the unique name of an existing extenral connection object. Optionally, the URI can also contain a path component that will be appended to the endpoint that was specified at the time the external connection object was created. This is necessary for operations such as backup and restore that read/write to subdirectories in the endpoint inputted by the user. A nice UX win is the abilility to have a single external connection object for the base bucket, and then interact with all the subdirectories without having to create an object for each directory. In the future we may want to clamp down on this, and allow the user to specify which objects permit subdirectory access. The `external://<object-name>/<optional-path>` URI is parsed and the underlying object is fetched from the `system.external_connections` table. The resource represented by the object is then `Dial()`ed to return an `ExternalStorage` handle that can be used to read, write, list etc. With this change all bulk operations and cdc are able to use external connections to represent a `nodelocal` endpoint. For example, a backup can now be run as: ``` CREATE EXTERNAL CONNECTION foo AS 'nodelocal://1/foo'; BACKUP INTO 'external://foo'; RESTORE FROM LATEST IN 'external://foo'; ``` Gradually, we will add support for all other external storage endpoints as well. Note, we do not register limiter settings for the `external` provider `ExternalStorage` interface, nor do we wrap it with an ioRecorder. This is because the underlying resource of the external connection object will already have its registered limiters and recorder. Informs: cockroachdb#84753 Release note (sql change): Bulk operations and CDC will accept an `external` scheme URI that points to a previously created External Connection object, These operations can then interact with the underlying resource represented by the object as they did before.
84931: externalconn,nodelocal: add `external` ExternalStorage provider r=adityamaru a=adityamaru In #84310 we added the ability to create an external connection to represent a `nodelocal` endpoint. This diff is the second piece of the puzzle that allows systems in CRDB to interact with the external connection object. We introduce an `external` URI scheme to the `ExternalStorage` registry. URIs with an `external` scheme are required to contain a host component referring to the unique name of an existing extenral connection object. Optionally, the URI can also contain a path component that will be appended to the endpoint that was specified at the time the external connection object was created. This is necessary for operations such as backup and restore that read/write to subdirectories in the endpoint inputted by the user. A nice UX win is the abilility to have a single external connection object for the base bucket, and then interact with all the subdirectories without having to create an object for each directory. In the future we may want to clamp down on this, and allow the user to specify which objects permit subdirectory access. The `external://<object-name>/<optional-path>` URI is parsed and the underlying object is fetched from the `system.external_connections` table. The resource represented by the object is then `Dial()`ed to return an `ExternalStorage` handle that can be used to read, write, list etc. With this change all bulk operations and cdc are able to use external connections to represent a `nodelocal` endpoint. For example, a backup can now be run as: ``` CREATE EXTERNAL CONNECTION foo AS 'nodelocal://1/foo'; BACKUP INTO 'external://foo'; RESTORE FROM LATEST IN 'external://foo'; ``` Gradually, we will add support for all other external storage endpoints as well. Note, we do not register limiter settings for the `external` provider `ExternalStorage` interface, nor do we wrap it with an ioRecorder. This is because the underlying resource of the external connection object will already have its registered limiters and recorder. Informs: #84753 Release note (sql change): Bulk operations and CDC will accept an `external` scheme URI that points to a previously created External Connection object, These operations can then interact with the underlying resource represented by the object as they did before. 85041: sql: better logging about the optimizer as well as some cancellation checks r=yuzefovich a=yuzefovich **sql: add logging about the optimizer into the trace** This commit derives a separate tracing span on the main query path that covers the whole optimizer as well as adds a few logging messages around the optbuilder and the optimization. It also audits some of the already present logs to be written down before performing possibly long operations. Release note: None **xform: check for context cancellation** reviously, the optimizer didn't check the context cancellation ever which could lead to surprising behavior - e.g. if the statement timeout occurs in the middle of the optimization, we would still finish the whole optimization phase only to realize that we were canceled at the execution time. This situation is now improved by periodically checking for the context cancellation every time a group is optimized by `optimizeGroup`. It's still possible for the optimizer to continue running long after cancellation if there is a long running procedure at a deeper level. For example, a long-running, individual invocation of an exploration rule will not halt due to context cancellation. However, the cancellation improvement in this commit should halt long-running optimization in most cases. Fixes: #70245. Addresses: #70314. Release note: None 85101: kvcoord: remove an artifact of unsetting WriteTooOld flag r=yuzefovich a=yuzefovich It seems like during 20.1 release cycle we had a change so that the errors don't carry `WriteTooOld` flag set to `true` and now only the BatchResponse can have that. We kept the ability of unsetting that flag so that in a mixed-version scenario we wouldn't run into any issues, but now that unsetting behavior is no longer needed and is removed in this commit. My particular interest in removing this is because it is the only modification of the `SetupFlowRequest` proto (which carries the `Txn` proto inside) that occurs during the distributed query setup, and I want to have more parallelization there. Release note: None 85118: roachpb: remove `RangeDescriptor.deprecated_generation_comparable` r=pavelkalinnikov,andreimatei a=erikgrinaker This patch removes the unused field `deprecated_generation_comparable` from `RangeDescriptor`. The preparatory work for this was done in 20.2, see a578719. Release note: None Co-authored-by: Aditya Maru <adityamaru@gmail.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
systemschema: introduce system.external_connections table
This change introduces the
system.external_connections
tableand its associated migration. The system table will be responsible
for storing External Connection objects that represent resources
that reside outside of CockroachDB, for example, a bucket being backed
up into or a sink being written to by a changefeed.
Informs: #84225
Release note: None
sql,cli,spanconfigccl: test updates when adding system table
This is a purely mechanical commit to update all the tests that
need updating when introducing a new system table.
Release note: None
sql,externalconn: introduce CREATE EXTERNAL CONNECTION
This change introduces the
CREATE EXTERNAL CONNECTION
syntaxto CockroachDB. This statement can be used to create an External
Connection object that represents an external resource.
Majority of this change is introducing the required interfaces
to persist an External Connection object in the
system.external_connections
table. We only register
nodelocal
as a supported External Connectionto allow for end to end testing of functionality. A user can now execute:
CREATE EXTERNAL CONNECTION foo AS 'nodelocal://1/foo';
All the other providers will be registered as part of #84228.
Furthermore, none of the permission model outlined in the
RFC #84209 has been implemented, to minimize the scope of this change.
Release note (sql change): introduce
CREATE EXTERNAL CONNECTION
syntax that can be used to create an External Connection representing
a resource that resides outside of CockroachDB. The only supported resource
at the moment is a
nodelocal
URI that can be represented as an ExternalConnection object using:
CREATE EXTERNAL CONNECTION foo AS 'nodelocal://1/foo'
;Fixes: #84225