Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: queries with a gin index discard predicate #88047

Closed
dbist opened this issue Sep 16, 2022 · 2 comments · Fixed by #88079
Closed

opt: queries with a gin index discard predicate #88047

dbist opened this issue Sep 16, 2022 · 2 comments · Fixed by #88079
Assignees
Labels
branch-release-22.1 Used to mark release blockers and technical advisories for 22.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-technical-advisory Caused a technical advisory S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. T-sql-queries SQL Queries Team
Projects

Comments

@dbist
Copy link
Contributor

dbist commented Sep 16, 2022

Describe the problem

Considering a jsonb column with a payload like

{"$k": ["_id", "theaterId", "location"], "_id": {"$o": "59a47286cfa9a3a73e51e732"}, "location": {"$k": ["address", "geo"], "address": {"$k": ["street1", "city", "state", "zipcode"], "city": "Sparks", "state": "NV", "street1": "155 Los Altos Pkwy", "zipcode": "89436"}, "geo": {"$k": ["type", "coordinates"], "coordinates": [{"$f": -119.7412}, {"$f": 39.579536}], "type": "Point"}}, "theaterId": 1014}

Having a GIN index on the column

CREATE INDEX ON sample_mflix.theaters_cf846063 USING GIN(_jsonb);

Executing a query like

SELECT gateway_region(), "_jsonb"->'location'->'address'->>'state' FROM sample_mflix.theaters_cf846063 WHERE "_jsonb"->'location'->'address' @> '{"state":"NY"}' LIMIT 10;

returns the following output

  gateway_region | ?column?
-----------------+-----------
  aws-us-east-1  | MI
  aws-us-east-1  | MN
  aws-us-east-1  | MI
  aws-us-east-1  | MD
  aws-us-east-1  | MI
  aws-us-east-1  | MI
  aws-us-east-1  | MI
  aws-us-east-1  | MI
  aws-us-east-1  | MI
  aws-us-east-1  | NV

The incorrect result is returned irrelevant of the LIMIT.

Dropping the index

DROP INDEX sample_mflix.theaters_cf846063@theaters_cf846063__jsonb_idx;

The result is incorrect again

artem@artem-mr-7xw.aws-us-east-1.cockroachlabs.cloud:26257/ferretdb> SELECT gateway_region(), "_jsonb"->'location'->'address'->>'state' FROM sample_mflix.theaters_cf846063 WHERE "_jsonb"->'location'->'address' @> '{"state":"NY"}' LIMIT 1;
  gateway_region | ?column?
-----------------+-----------
  aws-us-east-1  | MI

The table has RBR locality

CREATE TABLE sample_mflix.theaters_cf846063 (
      _jsonb JSONB NULL,
      rowid INT8 NOT VISIBLE NOT NULL DEFAULT unique_rowid(),
      region public.crdb_internal_region NOT NULL AS (CASE WHEN (((_jsonb->'location':::STRING)->'address':::STRING)->>'state':::STRING) IN ('AK':::STRING, 'AL':::STRING, 'AR':::STRING, 'AZ':::STRING, 'CA':::STRING, 'CO':::STRING, 'CT':::STRING, 'DC':::STRING, 'DE':::STRING, 'FL':::STRING, 'GA':::STRING, 'HI':::STRING, 'IA':::STRING, 'ID':::STRING, 'IL':::STRING, 'IN':::STRING, 'KS':::STRING, 'KY':::STRING, 'LA':::STRING) THEN 'aws-us-west-2':::public.crdb_internal_region WHEN (((_jsonb->'location':::STRING)->'address':::STRING)->>'state':::STRING) IN ('MA':::STRING, 'MD':::STRING, 'ME':::STRING, 'MI':::STRING, 'MN':::STRING, 'MO':::STRING, 'MS':::STRING, 'MT':::STRING, 'NC':::STRING, 'ND':::STRING, 'NE':::STRING, 'NH':::STRING, 'NJ':::STRING, 'NM':::STRING, 'NV':::STRING, 'NY':::STRING) THEN 'aws-us-east-1':::public.crdb_internal_region WHEN (((_jsonb->'location':::STRING)->'address':::STRING)->>'state':::STRING) IN ('OH':::STRING, 'OK':::STRING, 'OR':::STRING, 'PA':::STRING, 'PR':::STRING, 'RI':::STRING, 'SC':::STRING, 'SD':::STRING, 'TN':::STRING, 'TX':::STRING, 'UT':::STRING, 'VA':::STRING, 'VT':::STRING, 'WA':::STRING, 'WI':::STRING, 'WV':::STRING, 'WY':::STRING) THEN 'aws-us-east-2':::public.crdb_internal_region END) STORED,
      state STRING NOT NULL AS (((_jsonb->'location':::STRING)->'address':::STRING)->>'state':::STRING) STORED,
      CONSTRAINT theaters_cf846063_pkey PRIMARY KEY (rowid ASC),
      INDEX theaters_cf846063_state_idx (state ASC),
      INVERTED INDEX theaters_cf846063__jsonb_idx1 (_jsonb)
  ) LOCALITY REGIONAL BY ROW AS region

The table schema

ALTER TABLE sample_mflix.theaters_cf846063 ADD COLUMN region crdb_internal_region NOT NULL AS (
  CASE WHEN ("_jsonb"->'location'->'address'->>'state') IN ('AK','AL','AR','AZ','CA','CO','CT','DC','DE','FL','GA','HI','IA','ID','IL','IN','KS','KY','LA') THEN 'aws-us-west-2'
  WHEN ("_jsonb"->'location'->'address'->>'state') IN ('MA','MD','ME','MI','MN','MO','MS','MT','NC','ND','NE','NH','NJ','NM','NV','NY') THEN 'aws-us-east-1'
  WHEN ("_jsonb"->'location'->'address'->>'state') IN ('OH','OK','OR','PA','PR','RI','SC','SD','TN','TX','UT','VA','VT','WA','WI','WV','WY') THEN 'aws-us-east-2'
  END
) STORED;
ALTER TABLE sample_mflix.theaters_cf846063 SET LOCALITY REGIONAL BY ROW AS "region";

The OPT for the select query

                                     info
------------------------------------------------------------------------------
  project
   ├── index-join theaters_cf846063
   │    └── locality-optimized-search
   │         ├── scan theaters_cf846063@theaters_cf846063__jsonb_idx1
   │         │    ├── constraint: /13: [/'aws-us-east-1' - /'aws-us-east-1']
   │         │    └── limit: 10
   │         └── scan theaters_cf846063@theaters_cf846063__jsonb_idx1
   │              ├── constraint: /20
   │              │    ├── [/'aws-us-east-2' - /'aws-us-east-2']
   │              │    └── [/'aws-us-west-2' - /'aws-us-west-2']
   │              └── limit: 10
   └── projections
        ├── 'aws-us-east-1'
        └── (("_jsonb"->'location')->'address')->>'state'

Expected behavior
The query should return all rows with state NY.

Environment:

  • CockroachDB version 22.1.6 Dedicated
  • Server OS: Cockroach Cloud
  • Client app cockroach sql

Jira issue: CRDB-19660

@dbist dbist added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Sep 16, 2022
@dbist dbist added this to Triage in SQL Queries via automation Sep 16, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Sep 16, 2022
@dbist
Copy link
Contributor Author

dbist commented Sep 16, 2022

without LIMIT query returns correct results

artem@artem-mr-7xw.aws-us-east-1.cockroachlabs.cloud:26257/ferretdb_with_bug> SELECT gateway_region(), "_jsonb"->'location'->'address'->>'state' FROM sample_mflix.theaters_cf846063 WHERE "_jsonb"->'location'->'address' @> '{"state":"NY"}';
  gateway_region | ?column?
-----------------+-----------
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY
  aws-us-east-1  | NY

@msirek
Copy link
Contributor

msirek commented Sep 16, 2022

@dbist Opened #88079 for this

@msirek msirek self-assigned this Sep 16, 2022
@msirek msirek moved this from Triage to Active in SQL Queries Sep 16, 2022
msirek pushed a commit to msirek/cockroach that referenced this issue Sep 17, 2022
Fixes cockroachdb#88047

Previously, a query which scans from a table with REGIONAL BY ROW
partitioning utilizing an inverted index with an inverted constraint
and locality-optimized search fails to apply the inverted constraint
and may return wrong results.

To address this, when locality-optimized search is built from an
inverted constrained scan, the inverted constraint is copied into the
local and remote scans of the locality-optimized search.

Release note (bug fix): This patch fixes incorrect results from queries
which utilize locality-optimized search on the inverted index of a table
with REGIONAL BY ROW partitioning.
craig bot pushed a commit that referenced this issue Sep 18, 2022
88079: xform: fix wrong results from locality-optimized scan of inverted index r=msirek a=msirek

Fixes #88047

Previously, a query which scans from a table with REGIONAL BY ROW
partitioning utilizing an inverted index with an inverted constraint
and locality-optimized search fails to apply the inverted constraint
and may return wrong results.

To address this, when locality-optimized search is built from an
inverted constrained scan, the inverted constraint is copied into the
local and remote scans of the locality-optimized search.

Release note (bug fix): This patch fixes incorrect results from queries
which utilize locality-optimized search on the inverted index of a table
with REGIONAL BY ROW partitioning.

Co-authored-by: Mark Sirek <sirek@cockroachlabs.com>
@craig craig bot closed this as completed in 4482999 Sep 18, 2022
SQL Queries automation moved this from Active to Done Sep 18, 2022
blathers-crl bot pushed a commit that referenced this issue Sep 18, 2022
Fixes #88047

Previously, a query which scans from a table with REGIONAL BY ROW
partitioning utilizing an inverted index with an inverted constraint
and locality-optimized search fails to apply the inverted constraint
and may return wrong results.

To address this, when locality-optimized search is built from an
inverted constrained scan, the inverted constraint is copied into the
local and remote scans of the locality-optimized search.

Release note (bug fix): This patch fixes incorrect results from queries
which utilize locality-optimized search on the inverted index of a table
with REGIONAL BY ROW partitioning.
blathers-crl bot pushed a commit that referenced this issue Sep 18, 2022
Fixes #88047

Previously, a query which scans from a table with REGIONAL BY ROW
partitioning utilizing an inverted index with an inverted constraint
and locality-optimized search fails to apply the inverted constraint
and may return wrong results.

To address this, when locality-optimized search is built from an
inverted constrained scan, the inverted constraint is copied into the
local and remote scans of the locality-optimized search.

Release note (bug fix): This patch fixes incorrect results from queries
which utilize locality-optimized search on the inverted index of a table
with REGIONAL BY ROW partitioning.
@rytaft rytaft added the S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. label Sep 19, 2022
mgartner added a commit to mgartner/cockroach that referenced this issue Oct 5, 2022
This commit adds an assertion to ensure that inverted index scans have
inverted constraints. If they do not, there is a likely a bug that can
cause incorrect query results (e.g., cockroachdb#88047). This assertion is made in
release builds,not just test builds, because it is cheap to perform.

Fixes cockroachdb#89440

Release note: None
craig bot pushed a commit that referenced this issue Oct 6, 2022
87763: changefeedccl: mark kv senderrors retryable r=samiskin a=samiskin

Resolves #87300

Changefeeds can encounter senderrors during a normal upgrade procedure and therefore should retry.  This was done in the kvfeed however is apparently not high level enough as a send error was still observed to cause a permanent failure.

This PR moves the senderror checking to the top level IsRetryable check to handle it regardless of its source.

Release justification: low risk important bug fix

Release note (bug fix): Changefeeds will now never permanently error on a "failed to send RPC" error.

89445: opt: assert that inverted scans have inverted constraints r=mgartner a=mgartner

This commit adds an assertion to ensure that inverted index scans have inverted constraints. If they do not, there is a likely a bug that can cause incorrect query results (e.g., #88047). This assertion is made in release builds,not just test builds, because it is cheap to perform.

Fixes #89440

Release note: None

89482: roachtests: introduce admission-control/elastic-backup r=irfansharif a=irfansharif

Informs #89208. This test sets up a 3-node CRDB cluster on 8vCPU machines running 1000-warehouse TPC-C with an aggressive (every 20m) full backup schedule. We've observed latency spikes during backups because of its CPU-heavy nature -- it can elevate CPU scheduling latencies which in turn translates to an increase in foreground latency. In #86638 we introduced admission control mechanisms to dynamically pace such work while maintaining acceptable CPU scheduling latencies (sub millisecond p99s). This roachtest exercises that machinery. In future commits we'll add libraries to the roachtest package to automatically spit out the degree to which {CPU-scheduler,foreground} latencies are protected.

Release note: None

Co-authored-by: Shiranka Miskin <shiranka.miskin@gmail.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
blathers-crl bot pushed a commit that referenced this issue Oct 6, 2022
This commit adds an assertion to ensure that inverted index scans have
inverted constraints. If they do not, there is a likely a bug that can
cause incorrect query results (e.g., #88047). This assertion is made in
release builds,not just test builds, because it is cheap to perform.

Fixes #89440

Release note: None
blathers-crl bot pushed a commit that referenced this issue Oct 6, 2022
This commit adds an assertion to ensure that inverted index scans have
inverted constraints. If they do not, there is a likely a bug that can
cause incorrect query results (e.g., #88047). This assertion is made in
release builds,not just test builds, because it is cheap to perform.

Fixes #89440

Release note: None
blathers-crl bot pushed a commit that referenced this issue Oct 6, 2022
This commit adds an assertion to ensure that inverted index scans have
inverted constraints. If they do not, there is a likely a bug that can
cause incorrect query results (e.g., #88047). This assertion is made in
release builds,not just test builds, because it is cheap to perform.

Fixes #89440

Release note: None
blathers-crl bot pushed a commit that referenced this issue Oct 6, 2022
This commit adds an assertion to ensure that inverted index scans have
inverted constraints. If they do not, there is a likely a bug that can
cause incorrect query results (e.g., #88047). This assertion is made in
release builds,not just test builds, because it is cheap to perform.

Fixes #89440

Release note: None
mgartner added a commit that referenced this issue Oct 13, 2022
This commit adds an assertion to ensure that inverted index scans have
inverted constraints. If they do not, there is a likely a bug that can
cause incorrect query results (e.g., #88047). This assertion is made in
release builds,not just test builds, because it is cheap to perform.

Fixes #89440

Release note: None
mgartner added a commit that referenced this issue Oct 13, 2022
This commit adds an assertion to ensure that inverted index scans have
inverted constraints. If they do not, there is a likely a bug that can
cause incorrect query results (e.g., #88047). This assertion is made in
release builds,not just test builds, because it is cheap to perform.

Fixes #89440

Release note: None
@rytaft rytaft added C-technical-advisory Caused a technical advisory branch-release-22.1 Used to mark release blockers and technical advisories for 22.1 labels Dec 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-release-22.1 Used to mark release blockers and technical advisories for 22.1 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-technical-advisory Caused a technical advisory S-0-visible-logical-error Database stores inconsistent data in some cases, or queries return invalid results silently. T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants