Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

colexec: add casts between natively supported types #48135

Open
yuzefovich opened this issue Apr 29, 2020 · 1 comment
Open

colexec: add casts between natively supported types #48135

yuzefovich opened this issue Apr 29, 2020 · 1 comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team
Projects

Comments

@yuzefovich
Copy link
Member

yuzefovich commented Apr 29, 2020

After #67733, we are only missing casts between natively supported types. We should add those.

Jira issue: CRDB-4353

@yuzefovich yuzefovich added the C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) label Apr 29, 2020
@yuzefovich yuzefovich moved this from Triage to [VECTORIZED BACKLOG] Enhancements/Features/Investigations in BACKLOG, NO NEW ISSUES: SQL Execution Apr 30, 2020
craig bot pushed a commit that referenced this issue Jun 15, 2020
49284: opt: synthesize check constraints on enum columns r=rohany a=rohany

Fixes #49263.

This PR teaches the optimizer how to synthesize check constraints on
columns of an ENUM type, allowing queries like:

```
CREATE TYPE t AS ENUM ('howdy', 'hello');
CREATE TABLE tt (x t, y INT, PRIMARY KEY (x, y));
SELECT x, y FROM tt WHERE y = 2
```

to be planned using constrained spans on the enum values, rather than a
full table scan.

Release note (performance improvement): Allow the optimizer to use enum
information to generate better query plans.

50001: colexec: minor miscellaneous additions r=yuzefovich a=yuzefovich

**colexec: add casts from datum-backed types to bools**

While investigating unrelated test failures, I added this cast, so we
might as well merge it.

Addresses: #48135.

Release note: None

**colexec: add support for Values core with zero rows**

Release note: None

50188: geo/geogfn: implement ST_Azimuth({geometry,geometry}) r=otan a=hueypark

Fixes #48887

Release note (sql change): This PR implement adds the ST_Azimuth({geometry,geometry})

50205: Makefile: ensure redact_safe.md generation is stable r=RichardJCai a=knz

Fixes #50146

On some platforms, the default behavior of `git grep` is to number
the lines (i.e. `-n` is implicitly added).

This patch makes the `-n` explicit and tweaks the sed patterns to
ensure that the behavior is stable.

Release note: None

50239: kv/kvserver: disable the below-raft proto tracking in race builds r=petermattis a=petermattis

The below-raft proto tracking is meant to catch bugs where we
inadvertently starting marshaling a proto below Raft. This tracking is a
source of signficant memory allocations which measurably slow down race
builds. Since we're already doing this tracking in non-race builds,
there is little benefit to also doing it in race builds. Disabling
below-raft proto tracking for race builds reduces the time to run
`testrace` on the `kv/kvserver` package from 605s to 517s on my laptop.

Release note: None

50243: kv/kvserver: fix TestSideloadingSideloadedStorage w/ RocksDB r=jbowens a=jbowens

Fix the TestSideloadingSideloadedStorage test when running with
COCKROACH_STORAGE_ENGINE=rocksdb. The test has always depended on the
exact error message surfaced from os.Remove. In #49717,
diskSideloadStorage was modified to use the engine's RemoveDir operation
rather than interacting with the filesystem directly. Since Pebble uses
os.Remove for its implementation and emulates its error messages in its
MemFS implementation, the exact message comparison continued to succeed.

After #49717, when running with RocksDB, the RocksDB env's error message
was surfaced, with its own context prefixing. This updates the test to
case-insensitively check for 'directory not empty' instead.

Fixes #50226.

Release note: None

Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Jaewan Park <jaewan.huey.park@gmail.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: Peter Mattis <petermattis@gmail.com>
Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Jun 16, 2021
66303: colexec: optimize the external sort for top K case r=yuzefovich a=yuzefovich

**colexec: extend external sort benchmark for top K case**

Release note: None

**colexec: optimize the external sort for top K case**

Previously, if the top K sort spilled to disk, we used the general
external sort. However, we could easily optimize that case with the
knowledge that only K tuples are needed by the output.

Namely, we can use the in-memory top K sort (in order to create each new
partition of the desired size) and also limit the size of each merged
partition by K tuples. This commit adds these optimizations.

Addresses: #45192.

Release note: None

66379: colexecbase: add casts from decimals to ints and floats r=yuzefovich a=yuzefovich

This commit adds vectorized casts from decimals to ints (of all widths)
and floats.

Addresses: #48135.

Release note: None

66412: sqlproxyccl: minor fixes and enhancements to the proxy handler and denylist r=jaylim-crl a=jaylim-crl

#### sqlproxyccl: allow denylist entries that do not expire

Previously, we assumed that all denylist entries have an expiration key. When
denylist entries do not specify an expiration key, the entries are marked as
expired right away since their values default to the zero instant time. This
might be cumbersome for operators to specify an expiration when the intention
was to not allow the rule to expire at all. This patch changes the behavior of
the denylist such that entries without any expiration keys represent rules
that do not expire.

#### sqlproxyccl: minor fixes around the proxy handler

In #65164, we migrated the sqlproxy in the CC code to the DB repository, and
there were a few buglets:
- sqlproxy crashes when the tenant ID supplied in the connection string is 0
  because roachpb.MakeTenantID panics when the tenant ID is 0.
- sqlproxy leaks internal parsing errors to the client.

This patch hides internal parsing errors, and replaces them with friendly
user-facing errors (e.g. "Invalid cluster name"). We also add a bounds check
to the parsed tenant ID so that the process does not crash on an invalid
tenant ID. More tests were added as well.

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Jay Lim <jay@cockroachlabs.com>
@jlinder jlinder added the T-sql-queries SQL Queries Team label Jun 16, 2021
craig bot pushed a commit that referenced this issue Jul 19, 2021
66459: colexecbase: refactor cast templating to be type-specific r=yuzefovich a=yuzefovich

See individual commits for details.

Informs: #48135.

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@yuzefovich yuzefovich changed the title colexec: add missing casts colexec: add casts between natively supported types Jul 22, 2021
craig bot pushed a commit that referenced this issue Jul 22, 2021
67733: colexecbase: extend support of casts r=yuzefovich a=yuzefovich

Addresses: #48135

See individual commits for details. After this PR we only need to add
more casts between natively supported types.

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
craig bot pushed a commit that referenced this issue Jul 22, 2021
67067: server: require admin role to access node status r=bdarnell a=knz

Release note (security update): The node status retrieval endpoints
over HTTP (`/_status/nodes`, `/_status/nodes/<N>` and the web UI
`/#/reports/nodes`) have been updated to require the `admin` role from
the requesting user. This ensures that operational details such as
network addresses and command-line flags do not leak to unprivileged
users.

67733: colexecbase: extend support of casts r=yuzefovich a=yuzefovich

Addresses: #48135

See individual commits for details. After this PR we only need to add
more casts between natively supported types.

67768: sql, server: add skeleton TokenBucket connector and tenant resource limits configuration APIs r=RaduBerinde a=RaduBerinde

This PR is a scaled back version of #67508 where we don't use the system table at all. It's meant to put some of the infrastructure pieces in place and provide a stub API for reconfiguration.

The plan is to add consumption metrics on top of this soon so that CC can develop in parallel.

---

#### server: add TokenBucket connector API

This change adds the TokenBucket API proposed in the RFC (#66436), a
stub implementation and client for it, and the corresponding KV
connector interface.

The client and server-side code lives in
ccl/multitenantccl/tenantcostclient and tenantcostserver.

Release note: None

#### sql: tenant resource limits configuration API

This commit adds a `crdb_internal.update_tenant_resource_limits`
internal SQL function (to be used by the system tenant) which updates
the token bucket configuration for a specific tenant.

Release note: None


67840: sql: add test for creating stats on tables with expression indexes r=mgartner a=mgartner

Release note: None

Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
@yuzefovich yuzefovich removed this from [VECTORIZED BACKLOG] Enhancements/Features/Investigations in BACKLOG, NO NEW ISSUES: SQL Execution Jul 7, 2022
@yuzefovich yuzefovich added this to Triage in SQL Queries via automation Jul 7, 2022
@yuzefovich yuzefovich moved this from Triage to Backlog in SQL Queries Jul 7, 2022
craig bot pushed a commit that referenced this issue Jul 11, 2022
83719: storage: remove dependency to sql/catalog/bootstrap r=Xiang-Gu a=Xiang-Gu

Previously, tests in `pkg/storage` depended on `sql/catalog/bootstrap`.
This was inadequate/weird because `storage` is a much lower layer in the
architectural stack. It will also help prevent dependency cycles in
other PRs when we introduce depencies (see #82172 if interested).

Release note: None

83958: colexecbase: add all remaining casts from strings r=yuzefovich a=yuzefovich

**tree: minor cleanup**

This commit does a few minor things:
- actually uses the error in a few places when constructing a ParseError
- refactors some of the interval-parsing functions to expose them to be
used in the follow-up commit
- extracts a helper method to construct an error when timestamp exceeds
bounds.

Release note: None

**colexecbase: sort native cast info lexicographically**

This commit sorts the information about natively supported casts
lexicographically so that it is easier to see what is actually
supported. This is simply a mechanical change.

Release note: None

**colexecbase: add all remaining casts from strings**

This commit adds the native casts from strings to all remaining
natively-supported types (dates, decimals, floats, ints, intervals,
timestamps, jsons).

I was inspired to do this because the combination of this
commit and the vectorized rendering on top of the wrapped row-by-row
processors would expose some bugs (e.g. #83094).

Addresses: #48135.

Release note: None

83984: storageccl: use NewPebbleIterator in restore data processor r=erikgrinaker a=msbutler

This PR replaces the multiIterator used in the restore data processor with the
PebbleSSTIterator, which has baked in range key support.

This patch is apart of a larger effort to teach backup and restore about MVCC
bulk operations. Next, the readAsOfIterator will need to learn how to
deal with range keys.

Informs #71155

Release note: none

84049: sql: fix memory accounting of prepared statements and portals in error cases r=yuzefovich a=yuzefovich

**sql: make sure to close mem acc of prepared stmt in case of an error**

Previously, it was possible that we would not close the memory account
created for a prepared statement when an error is encountered. This was
the case because we would not include the prepared stmt into the prep
stmts namespace, so it would just get lost. However, up until recently
this was not an issue since we mistakenly cleared that memory account
when creating the prepared statement.

Release note: None

**sql: only increment ref count of prep stmt in non-error case of portals**

Previously, it was possible to "leak" a reference to a prepared
statement if we made a portal from it (i.e. took a reference to the
prepared statement) and the memory reservation was denied. This could
lead to "leftover bytes" errors when stopping the "session" monitors.
However, the impact is minor because on release builds we'd still return
those "leftover bytes" and would just file a sentry issue. This is now
fixed.

Fixes: #83935

Release note: None

84097: DOC-4899: Remove linking on subdirectory from show_backup diagram r=RichardJCai a=nickvigilante

Release note: None

84143: Revert "kvstreamer: reuse incomplete Get requests on resume batches" r=yuzefovich a=yuzefovich

This reverts commit 21f2390.

Previously, I didn't realize that the KV layer would modify all requests
included into the BatchRequest in `txnSeqNumAllocator`, so we cannot
reuse even incomplete GetRequests. It is unfortunate, but not a big
deal.

Fixes: #83974.

Release note: None

84169: opt: fix incorrect column indexing in index recommendations r=mgartner a=mgartner

#### opt: fix incorrect column indexing in index recommendations

Fixes #83965

Release note (bug fix): A minor bug has been fixed that caused internal
errors and poor index recommendations when running `EXPLAIN` statements.

#### opt: clarify logic in Metadata.UpdateTableMeta

Release note: None


84188: roachtest: update supported tag for gorm r=ZhouXing19 a=ZhouXing19

fixes #83794
fixes #83797
fixes #83885

Release note: None

Co-authored-by: Xiang Gu <xiang@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Michael Butler <butler@cockroachlabs.com>
Co-authored-by: Nick Vigilante <vigilante@cockroachlabs.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Jane Xing <zhouxing@uchicago.edu>
craig bot pushed a commit that referenced this issue Aug 8, 2022
85681: colexecbase: add all casts of native types to strings r=yuzefovich a=yuzefovich

**colexecbase: propagate bytes.Buffer as argument in cast templates**

This is not used currently but will be in the following commit.

Release note: None

**colexecbase: add all casts of native types to strings**

This commit adds all of the casts from the types natively supported by
the vectorized engine to strings. This has an additional benefit (apart
from better performance on a lot of data: the concat binary projection
with a string on one side is now supported natively (we recently fixed
an issue to plan a cast of a non-string type to a string for a concat
operation, but since we didn't have the cast support, we'd be falling
back to the row-by-row engine in most cases).

Addresses: #48135.
Fixes: #49463.
Fixes: #55841.

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@github-actions
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team
Projects
Status: Backlog
SQL Queries
Backlog (DO NOT ADD NEW ISSUES)
Development

No branches or pull requests

2 participants