Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Upgrade to 2.20.3.0 version fails with inconsistency errors. #22057

Closed
1 task done
shantanugupta-yb opened this issue Apr 18, 2024 · 7 comments
Closed
1 task done

Comments

@shantanugupta-yb
Copy link

shantanugupta-yb commented Apr 18, 2024

Jira Link: DB-10979

Description

TPCC workload running while cluster upgrade from 2.18.7.0-b38 to 2.21.1.0-b90 is failing.

16:06:35,834 (Worker.java:546) WARN - The DBMS rejected the transaction without an error code:ERROR: null value in column "w_tax" violates not-null constraint Detail: Failing row contains (999, 2308909.22, null, null, null, null, null, null, null).

Even post cluster upgrade the workload is failing cause of same error.

On manually validating the contents of the row were null/empty for the warehouse id which failed in the test
yugabyte=# select * from warehouse where w_id=999 LIMIT 1 yugabyte-# ; 999 | 2304875.01 | | | | | | |

Jenkins logs: http://10.9.9.254:54422/job/TPCC-benchmark/5847/console
Perf studio job: https://perf.dev.yugabyte.com/perfstudio-dashboard/status/5424302
Cluster: http://10.9.131.126/universes/54cea849-a569-4a4b-aae6-4f9ef2123162/nodes

To reproduce the issue perf studio job mentioned above can be retriggered.

The same perf studio job passed when the cluster was upgraded from 2.18.7.0-b38 to 2.18.7.0-b40
Perf studio job: https://perf.dev.yugabyte.com/perfstudio-dashboard/status/5430102
Perf report: Link

Issue Type

kind/bug

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
@shantanugupta-yb shantanugupta-yb added area/docdb YugabyteDB core features status/awaiting-triage Issue awaiting triage priority/highest Highest priority issue 2.21.1_blocker labels Apr 18, 2024
@yugabyte-ci yugabyte-ci added the kind/bug This issue is a bug label Apr 18, 2024
@rthallamko3 rthallamko3 added area/ysql Yugabyte SQL (YSQL) and removed area/docdb YugabyteDB core features labels Apr 19, 2024
@rthallamko3 rthallamko3 changed the title [DocDB] TPCC workload running while cluster upgrade to 2.21.1.0-b90 is failing. [YSQL] TPCC workload running while cluster upgrade to 2.21.1.0-b90 is failing. Apr 19, 2024
@sushantrmishra
Copy link

yugabyte=# \d warehouse
                      Table "public.warehouse"
   Column   |         Type          | Collation | Nullable | Default
------------+-----------------------+-----------+----------+---------
 w_id       | integer               |           | not null |
 w_ytd      | numeric(12,2)         |           | not null |
 w_tax      | numeric(4,4)          |           | not null |
 w_name     | character varying(10) |           | not null |
 w_street_1 | character varying(20) |           | not null |
 w_street_2 | character varying(20) |           | not null |
 w_city     | character varying(20) |           | not null |
 w_state    | character(2)          |           | not null |
 w_zip      | character(9)          |           | not null |
Indexes:
    "warehouse_pkey" PRIMARY KEY, lsm (w_id HASH)
Referenced by:
    TABLE "district" CONSTRAINT "d_fkey_w" FOREIGN KEY (d_w_id) REFERENCES warehouse(w_id) NOT VALID
    TABLE "stock" CONSTRAINT "s_fkey_w" FOREIGN KEY (s_w_id) REFERENCES warehouse(w_id) NOT VALID

yugabyte=# \d district
                       Table "public.district"
   Column    |         Type          | Collation | Nullable | Default
-------------+-----------------------+-----------+----------+---------
 d_w_id      | integer               |           | not null |
 d_id        | integer               |           | not null |
 d_ytd       | numeric(12,2)         |           | not null |
 d_tax       | numeric(4,4)          |           | not null |
 d_next_o_id | integer               |           | not null |
 d_name      | character varying(10) |           | not null |
 d_street_1  | character varying(20) |           | not null |
 d_street_2  | character varying(20) |           | not null |
 d_city      | character varying(20) |           | not null |
 d_state     | character(2)          |           | not null |
 d_zip       | character(9)          |           | not null |
Indexes:
    "district_pkey" PRIMARY KEY, lsm ((d_w_id, d_id) HASH)
Foreign-key constraints:
    "d_fkey_w" FOREIGN KEY (d_w_id) REFERENCES warehouse(w_id) NOT VALID
Referenced by:
    TABLE "customer" CONSTRAINT "c_fkey_d" FOREIGN KEY (c_w_id, c_d_id) REFERENCES district(d_w_id, d_id) NOT VALID
    TABLE "history" CONSTRAINT "h_fkey_d" FOREIGN KEY (h_w_id, h_d_id) REFERENCES district(d_w_id, d_id) NOT VALID
yugabyte=# \d stock
                         Table "public.stock"
    Column    |         Type          | Collation | Nullable | Default
--------------+-----------------------+-----------+----------+---------
 s_w_id       | integer               |           | not null |
 s_i_id       | integer               |           | not null |
 s_quantity   | numeric(4,0)          |           | not null |
 s_ytd        | numeric(8,2)          |           | not null |
 s_order_cnt  | integer               |           | not null |
 s_remote_cnt | integer               |           | not null |
 s_data       | character varying(50) |           | not null |
 s_dist_01    | character(24)         |           | not null |
 s_dist_02    | character(24)         |           | not null |
 s_dist_03    | character(24)         |           | not null |
 s_dist_04    | character(24)         |           | not null |
 s_dist_05    | character(24)         |           | not null |
 s_dist_06    | character(24)         |           | not null |
 s_dist_07    | character(24)         |           | not null |
 s_dist_08    | character(24)         |           | not null |
 s_dist_09    | character(24)         |           | not null |
 s_dist_10    | character(24)         |           | not null |
Indexes:
    "stock_pkey" PRIMARY KEY, lsm (s_w_id HASH, s_i_id ASC)
Foreign-key constraints:
    "s_fkey_i" FOREIGN KEY (s_i_id) REFERENCES item(i_id) NOT VALID
    "s_fkey_w" FOREIGN KEY (s_w_id) REFERENCES warehouse(w_id) NOT VALID
Referenced by:
    TABLE "order_line" CONSTRAINT "ol_fkey_s" FOREIGN KEY (ol_supply_w_id, ol_i_id) REFERENCES stock(s_w_id, s
_i_id) NOT VALID

@sushantrmishra
Copy link

yugabyte=# select count(*) from warehouse where w_zip is null;
 count
-------
     0
(1 row)

Though there are rows retrieved with NULL clause

yugabyte=# select * from warehouse where  w_zip is  null LIMIT 10;
 w_id |   w_ytd    | w_tax | w_name | w_street_1 | w_street_2 | w_city | w_state | w_zip
------+------------+-------+--------+------------+------------+--------+---------+-------
  987 | 2109908.69 |       |        |            |            |        |         |
  972 | 2221007.07 |       |        |            |            |        |         |
  957 | 2104341.40 |       |        |            |            |        |         |
  926 | 2194777.42 |       |        |            |            |        |         |
  362 | 2309262.09 |       |        |            |            |        |         |
  931 | 2263624.52 |       |        |            |            |        |         |
  900 | 2293347.70 |       |        |            |            |        |         |
  922 | 2196015.07 |       |        |            |            |        |         |
  890 | 2329138.36 |       |        |            |            |        |         |
  100 | 2317436.11 |       |        |            |            |        |         |
(10 rows)

@sushantrmishra
Copy link

yugabyte=# explain (analyze) select * from warehouse where  w_zip is  null;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on warehouse  (cost=0.00..100.00 rows=1000 width=296) (actual time=2.946..3.117 rows=334 loops=1)
   Remote Filter: (w_zip IS NULL)
 Planning Time: 0.040 ms
 Execution Time: 3.239 ms
 Peak Memory Usage: 24 kB
(5 rows)
yugabyte=# explain analyze select count(*) from warehouse where w_zip is null;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=2.347..2.347 rows=1 loops=1)
   ->  Seq Scan on warehouse  (cost=0.00..100.00 rows=1000 width=0) (actual time=2.342..2.342 rows=0 loops=1)
         Remote Filter: (w_zip IS NULL)
         Partial Aggregate: true
 Planning Time: 0.041 ms
 Execution Time: 2.400 ms
 Peak Memory Usage: 62 kB
(7 rows)

@sushantrmishra
Copy link

yugabyte=# explain (analyze, dist)  select * from warehouse where w_zip is null;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on warehouse  (cost=0.00..100.00 rows=1000 width=296) (actual time=3.087..3.254 rows=334 loops=1)
   Remote Filter: (w_zip IS NULL)
   Storage Table Read Requests: 1
   Storage Table Read Execution Time: 2.820 ms
   Storage Table Rows Scanned: 1000
 Planning Time: 0.039 ms
 Execution Time: 3.385 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 2.820 ms
 Storage Rows Scanned: 1000
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 2.820 ms
 Peak Memory Usage: 24 kB
(16 rows)
yugabyte=# explain (analyze, dist)  select count(*) from warehouse where w_zip is null;
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=2.524..2.525 rows=1 loops=1)
   ->  Seq Scan on warehouse  (cost=0.00..100.00 rows=1000 width=0) (actual time=2.519..2.519 rows=0 loops=1)
         Remote Filter: (w_zip IS NULL)
         Storage Table Read Requests: 1
         Storage Table Read Execution Time: 2.307 ms
         Storage Table Rows Scanned: 666
         Partial Aggregate: true
 Planning Time: 0.043 ms
 Execution Time: 2.575 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 2.307 ms
 Storage Rows Scanned: 666
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 2.307 ms
 Peak Memory Usage: 62 kB
(18 rows)

@sushantrmishra
Copy link

with count(*) 334 are obsolete keys and without there are no obsolete keys.

         Metric docdb_obsolete_keys_found: 334.000
         Metric docdb_obsolete_keys_found_past_cutoff: 334.000
yugabyte=# explain (analyze, dist, debug) select * from warehouse where w_zip is null;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on warehouse  (cost=0.00..100.00 rows=1000 width=296) (actual time=3.076..3.245 rows=334 loops=1)
   Remote Filter: (w_zip IS NULL)
   Storage Table Read Requests: 1
   Storage Table Read Execution Time: 2.807 ms
   Storage Table Rows Scanned: 1000
   Metric rocksdb_block_cache_hit: 196.000
   Metric rocksdb_block_cache_index_hit: 8.000
   Metric rocksdb_block_cache_data_hit: 188.000
   Metric rocksdb_block_cache_bytes_read: 5993580.000
   Metric rocksdb_number_db_seek: 1325.000
   Metric rocksdb_number_db_next: 10999.000
   Metric rocksdb_number_db_seek_found: 1325.000
   Metric rocksdb_number_db_next_found: 10996.000
   Metric rocksdb_iter_bytes_read: 517857.000
   Metric rocksdb_block_cache_single_touch_hit: 12.000
   Metric rocksdb_block_cache_single_touch_bytes_read: 381227.000
   Metric rocksdb_block_cache_multi_touch_hit: 184.000
   Metric rocksdb_block_cache_multi_touch_bytes_read: 5612353.000
   Metric docdb_keys_found: 1000.000
   Metric ql_read_latency: sum: 5404.000, count: 3.000
 Planning Time: 0.044 ms
 Execution Time: 3.386 ms
yugabyte=# explain (analyze, dist, debug) select count(*) from warehouse where w_zip is null;                [58/1755]
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=102.50..102.51 rows=1 width=8) (actual time=2.524..2.524 rows=1 loops=1)
   ->  Seq Scan on warehouse  (cost=0.00..100.00 rows=1000 width=0) (actual time=2.519..2.519 rows=0 loops=1)
         Remote Filter: (w_zip IS NULL)
         Storage Table Read Requests: 1
         Storage Table Read Execution Time: 2.285 ms
         Storage Table Rows Scanned: 666
         Metric rocksdb_block_cache_hit: 196.000
         Metric rocksdb_block_cache_index_hit: 8.000
         Metric rocksdb_block_cache_data_hit: 188.000
         Metric rocksdb_block_cache_bytes_read: 5993580.000
         Metric rocksdb_number_db_seek: 1325.000
         Metric rocksdb_number_db_next: 4999.000
         Metric rocksdb_number_db_seek_found: 1325.000
         Metric rocksdb_number_db_next_found: 4996.000
         Metric rocksdb_iter_bytes_read: 266909.000
         Metric rocksdb_block_cache_single_touch_hit: 12.000
         Metric rocksdb_block_cache_single_touch_bytes_read: 381227.000
         Metric rocksdb_block_cache_multi_touch_hit: 184.000
         Metric rocksdb_block_cache_multi_touch_bytes_read: 5612353.000
         Metric docdb_keys_found: 1000.000
         Metric docdb_obsolete_keys_found: 334.000
         Metric docdb_obsolete_keys_found_past_cutoff: 334.000
         Metric ql_read_latency: sum: 4102.000, count: 3.000
         Partial Aggregate: true

@rthallamko3 rthallamko3 removed the status/awaiting-triage Issue awaiting triage label Apr 22, 2024
@yugabyte-ci yugabyte-ci added 2024.1_blocker area/docdb YugabyteDB core features labels Apr 23, 2024
@rthallamko3 rthallamko3 removed their assignment May 3, 2024
ttyusupov added a commit that referenced this issue May 8, 2024
Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry

Reviewed By: pjain, hsunder, dmitry

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34733
ttyusupov added a commit that referenced this issue May 8, 2024
…ate to auto-flag

Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Original commit: 987163c / D34733

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry, rthallam

Reviewed By: pjain

Subscribers: ybase, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34856
ttyusupov added a commit that referenced this issue May 8, 2024
…ate to auto-flag

Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Original commit: 987163c / D34733

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry, rthallam

Reviewed By: pjain

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34857
ttyusupov added a commit that referenced this issue May 8, 2024
…pdate to auto-flag

Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Original commit: 987163c / D34733

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry, rthallam

Reviewed By: pjain

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34858
ttyusupov added a commit that referenced this issue May 8, 2024
…e to auto-flag

Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Original commit: 987163c / D34733

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry, rthallam

Reviewed By: pjain

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34865
ttyusupov added a commit that referenced this issue May 8, 2024
…ate to auto-flag

Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Original commit: 987163c / D34733

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry, rthallam

Reviewed By: pjain

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34868
@yugabyte-ci yugabyte-ci assigned pkj415 and unassigned ttyusupov May 10, 2024
pkj415 pushed a commit that referenced this issue May 12, 2024
…ate to auto-flag

Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Original commit: 987163c / D34733

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: smishra, hsunder, dmitry, timur, rthallam

Reviewed By: rthallam

Subscribers: aaruj, ybase, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34916
@rthallamko3
Copy link
Contributor

All backports have landed.

@rthallamko3
Copy link
Contributor

Background: In version 2.20.3.0, the row-locking feature was introduced to fix various issues arising from concurrent updates. While this fix was added under a gflag, the row locking feature is on by default in 2.20.3.0 build. If a cluster is being upgraded from a version lower than 2.20.0.0 , say, for example, 2.18.0, then during rolling upgrades, different nodes in the cluster could be running on different versions and that results in the row-locking feature being enabled on some nodes (upgraded nodes), and feature disabled on the remaining nodes (which are not upgraded yet), that continue to use the previous form of column level locking. The nodes on the old versions incorrectly handle the writes generated by the new version, resulting in corrupting table data.

Fix: The solution is to convert the existing gflag - ysql_skip_row_lock_for_update to an autoflag, so that the new logic is enabled only after all nodes are on the new version. (As long as all the nodes have new version that is able to interpret row level locking, toggling the gflag - ysql_skip_row_lock_for_update does not result in data loss, however the feature is essential for enforcing check constraints issue solved as part of [#15196].

@rthallamko3 rthallamko3 changed the title [YSQL] TPCC workload running while cluster upgrade to 2.21.1.0-b90 is failing. [YSQL] Upgrade to 2.20.3.0 version fails with inconsistency May 14, 2024
@rthallamko3 rthallamko3 changed the title [YSQL] Upgrade to 2.20.3.0 version fails with inconsistency [YSQL] Upgrade to 2.20.3.0 version fails with inconsistency errors. May 14, 2024
svarnau pushed a commit that referenced this issue May 25, 2024
Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry

Reviewed By: pjain, hsunder, dmitry

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34733
svarnau pushed a commit that referenced this issue May 29, 2024
…ate to auto-flag

Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Original commit: 987163c / D34733

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry, rthallam

Reviewed By: pjain

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34857
svarnau pushed a commit that referenced this issue May 30, 2024
…ate to auto-flag

Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Original commit: 987163c / D34733

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry, rthallam

Reviewed By: pjain

Subscribers: ybase, yql

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34856
svarnau pushed a commit that referenced this issue May 30, 2024
…e to auto-flag

Summary:
Combination of two commits: decb104111 + 6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Original commit: 987163c / D34733

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry, rthallam

Reviewed By: pjain

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34865
ZhenYongFan pushed a commit to ZhenYongFan/yugabyte-db that referenced this issue Jun 15, 2024
…ck_for_update to auto-flag

Summary:
Combination of two commits: yugabyte@decb104111 + yugabyte@6de97906 introduced a compatibility issue between pre-decb104111 and post-6de97906 tserver versions if they are running at the same time during upgrade.

With the post-6de97906 logic `WriteQuery::DoCompleteExecute` when isolation level is not `SERIALIZABLE_ISOLATION` it can add WRITE_OP that contains both `write_pairs` and `read_pairs` to the Raft log. Corresponding `read_pair` will have key set to encoded row key and value set to `KeyEntryTypeAsChar::kNullLow`.

This WRITE_OP is then processed by `TransactionalWriter::Apply` and corresponding intents are generated and written to the intents DB.

In post-6de97906 version `TransactionalWriter::Apply` processes such `read_pair` and as a result generates intent `<row> -> kNullLow` of type `[kStrongRead]`.
But in pre-decb104111 version `TransactionalWriter::Apply` processes such `read_pair` in a different way and generates intent `<row> -> kNullLow` of type `[kStrongRead, kStrongWrite]`.

Then `ApplyIntentsContext::Entry` processes intents and in pre-decb104111 version due to presence of `kStrongWrite` type this intent gets written into regular DB and results in the following record:
`<row> -> kNullLow`, which is incorrect regular DB record. Effect of handing this record by DocDB is different depending on WHERE filter and presence of aggregation, it may result in either affected rows be not visible by the user statement or visible but with non-PK columns set to null.

Given that `TransactionalWriter::Apply` only writes `read_pairs` for non-SERIALIZABLE_ISOLATION when new logic is enabled by `ysql_skip_row_lock_for_update`, the solution is to convert it to an auto-flag, so new logic is only enabled after all nodes are on the new version.

Also added `PgsqlWriteOperation::use_row_lock_for_update_` which is initialized in constructor to avoid changing behaviour in context of the same `PgsqlWriteOperation` instance (since flag is now runtime and the auto-flag will change at runtime from false to true).
Jira: DB-10979

Original commit: 987163c / D34733

Test Plan: Run TPCC workload in parallel with upgrade from 2.18.7.0-b38 to build with this fix incorporated.

Reviewers: pjain, smishra, hsunder, dmitry, rthallam

Reviewed By: pjain

Subscribers: yql, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34858
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants