Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL] Columns with default values may be initialized to null instead #2021

Closed
aphyr opened this issue Aug 12, 2019 · 8 comments
Closed

[YSQL] Columns with default values may be initialized to null instead #2021

aphyr opened this issue Aug 12, 2019 · 8 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug
Milestone

Comments

@aphyr
Copy link

aphyr commented Aug 12, 2019

In YB 1.3.1.0, on Debian, it looks like columns with a default value of NOW() may, occasionally, be initialized to NULL instead. With a schema like

CREATE TABLE foo (
  k TIMESTAMP DEFAULT NOW(),
  v INT
)  IF NOT EXISTS;

And a workload comprised of inserts (INSERT INTO foo (v) VALUES (?)), and reads (SELECT k, v FROM foo ORDER BY k), it appears that sometimes, k may wind up NULL, rather than being a timestamp. This log line shows a series of [k, v] pairs from a read; notice that it contains one row (13) with a timestamp, and one row (9) with null. 20190812T154729.000-0400.zip

2019-08-12 15:50:01,257{GMT}	INFO	[jepsen worker 5] yugabyte.ysql.append-table: table append11 has ([#inst "2019-08-12T19:49:57.609143000-00:00" 13] [nil 9])

Other tables may contain no null entries, or many nulls. These null values appear stable over multiple seconds. To reproduce, try Jepsen e324b33, and run

lein run test --os debian --version 1.3.1.0 -w ysql/append-table --concurrency 2n --time-limit 300

The checker doesn't verify this property, so the test failing or succeeding isn't meaningful. Instead, grep the logs for a line with a table with nil entries.

grep -e 'table append.\+ has.\+nil' store/latest/jepsen.log 
@kmuthukk kmuthukk added the area/ysql Yugabyte SQL (YSQL) label Aug 12, 2019
@kmuthukk kmuthukk added this to To do in YSQL via automation Aug 12, 2019
@kmuthukk kmuthukk added this to To do in Jepsen Testing via automation Aug 12, 2019
@yugabyte-ci yugabyte-ci added community/request Issues created by external users and removed area/ysql Yugabyte SQL (YSQL) labels Aug 12, 2019
@kmuthukk
Copy link
Collaborator

Notes:

@ndeodhar ndeodhar assigned frozenspider and unassigned ndeodhar Aug 12, 2019
@frozenspider frozenspider added kind/bug This issue is a bug area/ysql Yugabyte SQL (YSQL) and removed community/request Issues created by external users labels Aug 13, 2019
@frozenspider frozenspider changed the title Tables with a foo timestamp default NOW() column may not be initialized [YSQL] Tables with a foo timestamp default NOW() column may not be initialized Aug 13, 2019
@frozenspider
Copy link
Contributor

frozenspider commented Aug 14, 2019

I've changed PG NOW function code (timestamp.c) to log its state:

Datum
now(PG_FUNCTION_ARGS)
{
    elog(LOG, " === Calling NOW() ===");
    TimestampTz txnStartTs = GetCurrentTransactionStartTimestamp();
    Datum result = TimestampTzGetDatum(txnStartTs);
    elog(LOG, " === Start TS: %ld, result: %ld ===", txnStartTs, result);
    return result;
}

When using ysqlsh, it triggers both when using original schema/query, or even simple SELECT NOW().

When tested with Jepsen, let's take a look at this log excerpt

(click to expand)
2019-08-14 15:37:32,297{GMT}	INFO	[jepsen worker 5] jepsen.util: 5	:invoke	:txn	[[:append 1 3] [:r 1 nil]]
2019-08-14 15:37:32,297{GMT}	INFO	[jepsen worker 0] jepsen.util: 0	:invoke	:txn	[[:append 0 2]]
2019-08-14 15:37:32,297{GMT}	INFO	[jepsen worker 4] jepsen.util: 4	:invoke	:txn	[[:r 1 nil] [:r 1 nil] [:r 1 nil]]
2019-08-14 15:37:32,299{GMT}	INFO	[jepsen worker 6] jepsen.util: 6	:invoke	:txn	[[:append 1 5] [:r 2 nil]]
2019-08-14 15:37:32,299{GMT}	INFO	[jepsen worker 1] jepsen.util: 1	:invoke	:txn	[[:append 1 1] [:append 1 2] [:append 0 1] [:append 2 1]]
2019-08-14 15:37:32,299{GMT}	INFO	[jepsen worker 9] jepsen.util: 9	:invoke	:txn	[[:r 2 nil] [:append 0 3]]
2019-08-14 15:37:32,299{GMT}	INFO	[jepsen worker 3] jepsen.util: 3	:invoke	:txn	[[:r 0 nil] [:append 1 4] [:append 2 2] [:append 2 3]]
2019-08-14 15:37:32,300{GMT}	INFO	[jepsen worker 8] jepsen.util: 8	:invoke	:txn	[[:append 1 6] [:r 1 nil] [:r 2 nil]]
2019-08-14 15:37:32,300{GMT}	INFO	[jepsen worker 2] jepsen.util: 2	:invoke	:txn	[[:append 2 4] [:append 1 7] [:r 0 nil] [:r 2 nil]]
2019-08-14 15:37:32,300{GMT}	INFO	[jepsen worker 7] jepsen.util: 7	:invoke	:txn	[[:r 1 nil] [:r 1 nil]]
2019-08-14 15:37:32,345{GMT}	INFO	[jepsen worker 4] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,345{GMT}	INFO	[jepsen worker 3] yugabyte.ysql.append-table: Creating table append0 and retrying
2019-08-14 15:37:32,346{GMT}	INFO	[jepsen worker 9] yugabyte.ysql.append-table: Creating table append2 and retrying
2019-08-14 15:37:32,346{GMT}	INFO	[jepsen worker 7] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,347{GMT}	INFO	[jepsen worker 1] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,350{GMT}	INFO	[jepsen worker 6] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,354{GMT}	INFO	[jepsen worker 8] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,360{GMT}	INFO	[jepsen worker 5] yugabyte.ysql.append-table: Creating table append1 and retrying
2019-08-14 15:37:32,360{GMT}	INFO	[jepsen worker 2] yugabyte.ysql.append-table: Creating table append2 and retrying
2019-08-14 15:37:32,362{GMT}	INFO	[jepsen worker 0] yugabyte.ysql.append-table: Creating table append0 and retrying
2019-08-14 15:37:40,485{GMT}	WARN	[jepsen worker 7] yugabyte.ysql.append-table: Encountered error with conn "n3"; reopening
2019-08-14 15:37:40,499{GMT}	WARN	[jepsen worker 8] yugabyte.ysql.append-table: Encountered error with conn "n4"; reopening
2019-08-14 15:37:40,500{GMT}	WARN	[jepsen worker 2] yugabyte.ysql.append-table: Encountered error with conn "n3"; reopening
2019-08-14 15:37:40,527{GMT}	WARN	[jepsen worker 0] yugabyte.ysql.append-table: Encountered error with conn "n1"; reopening
2019-08-14 15:37:40,531{GMT}	WARN	[jepsen worker 1] yugabyte.ysql.append-table: Encountered error with conn "n2"; reopening
2019-08-14 15:37:40,546{GMT}	WARN	[jepsen worker 5] yugabyte.ysql.append-table: Encountered error with conn "n1"; reopening
2019-08-14 15:37:40,672{GMT}	INFO	[jepsen worker 8] jepsen.util: 8	:info	:txn	[[:append 1 6] [:r 1 nil] [:r 2 nil]]	[:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,680{GMT}	INFO	[jepsen worker 2] jepsen.util: 2	:info	:txn	[[:append 2 4] [:append 1 7] [:r 0 nil] [:r 2 nil]]	[:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,685{GMT}	INFO	[jepsen worker 8] jepsen.util: 18	:invoke	:txn	[[:r 1 nil] [:r 1 nil]]
2019-08-14 15:37:40,685{GMT}	INFO	[jepsen worker 7] jepsen.util: 7	:info	:txn	[[:r 1 nil] [:r 1 nil]]	[:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,690{GMT}	INFO	[jepsen worker 7] jepsen.util: 17	:invoke	:txn	[[:r 2 nil]]
2019-08-14 15:37:40,690{GMT}	INFO	[jepsen worker 1] jepsen.util: 1	:info	:txn	[[:append 1 1] [:append 1 2] [:append 0 1] [:append 2 1]]	[:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,697{GMT}	INFO	[jepsen worker 2] jepsen.util: 12	:invoke	:txn	[[:append 2 5] [:append 2 6]]
2019-08-14 15:37:40,697{GMT}	INFO	[jepsen worker 1] jepsen.util: 11	:invoke	:txn	[[:append 0 4] [:append 2 7]]
2019-08-14 15:37:40,730{GMT}	INFO	[jepsen worker 0] jepsen.util: 0	:info	:txn	[[:append 0 2]]	[:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,734{GMT}	INFO	[jepsen worker 0] jepsen.util: 10	:invoke	:txn	[[:append 0 5]]
2019-08-14 15:37:40,757{GMT}	INFO	[jepsen worker 5] jepsen.util: 5	:info	:txn	[[:append 1 3] [:r 1 nil]]	[:psql-exception "ERROR: duplicate key value violates unique constraint \"pg_type_typname_nsp_index\""]
2019-08-14 15:37:40,764{GMT}	INFO	[jepsen worker 5] jepsen.util: 15	:invoke	:txn	[[:append 1 8] [:append 0 6] [:r 0 nil] [:r 1 nil]]
2019-08-14 15:37:40,964{GMT}	INFO	[jepsen worker 2] jepsen.util: 12	:ok	:txn	[[:append 2 5] [:append 2 6]]
2019-08-14 15:37:40,964{GMT}	INFO	[jepsen worker 2] jepsen.util: 12	:invoke	:txn	[[:r 1 nil]]
2019-08-14 15:37:41,010{GMT}	INFO	[jepsen worker 0] jepsen.util: 10	:ok	:txn	[[:append 0 5]]
2019-08-14 15:37:41,010{GMT}	INFO	[jepsen worker 0] jepsen.util: 10	:invoke	:txn	[[:r 1 nil] [:r 2 nil]]
2019-08-14 15:37:41,090{GMT}	INFO	[jepsen worker 1] jepsen.util: 11	:ok	:txn	[[:append 0 4] [:append 2 7]]
2019-08-14 15:37:41,091{GMT}	INFO	[jepsen worker 1] jepsen.util: 11	:invoke	:txn	[[:append 2 8] [:r 2 nil] [:r 0 nil] [:append 0 7]]
2019-08-14 15:37:41,484{GMT}	INFO	[jepsen worker 7] yugabyte.ysql.append-table: table append2 has ([nil 5] [nil 6] [nil 7])
2019-08-14 15:37:41,484{GMT}	INFO	[jepsen worker 7] jepsen.util: 17	:ok	:txn	[[:r 2 [5 6 7]]]
2019-08-14 15:37:41,485{GMT}	INFO	[jepsen worker 7] jepsen.util: 17	:invoke	:txn	[[:append 1 9] [:r 1 nil] [:append 2 9] [:r 1 nil]]
2019-08-14 15:37:41,556{GMT}	INFO	[jepsen worker 5] yugabyte.ysql.append-table: table append0 has ([nil 6] [nil 4] [nil 5])
2019-08-14 15:37:43,273{GMT}	WARN	[jepsen worker 0] yugabyte.ysql.append-table: Encountered error with conn "n1"; reopening
2019-08-14 15:37:43,889{GMT}	WARN	[jepsen worker 3] yugabyte.ysql.append-table: Encountered error with conn "n4"; reopening
2019-08-14 15:37:43,985{GMT}	WARN	[jepsen worker 8] yugabyte.ysql.append-table: Encountered error with conn "n4"; reopening
2019-08-14 15:37:44,389{GMT}	INFO	[jepsen worker 9] yugabyte.ysql.append-table: table append2 has ([nil 5] [nil 6] [nil 7])

We can search for the first NOW() invocation in the logs:

alex@alex-debian-jessie:~/code/jepsen/yugabyte$ grep -iR '=== Start TS' store/latest/ | cut -d' ' -f2 | sort | head -n 1
15:37:43.851

Note that first append transaction returned :ok at 15:37:40,964 - meaning that NOW() wasn't invoked at all!
Logs: logs-jepsen_2019-08-14_20-40-08__append-table_c2n_t60_logs2.tar.gz

@frozenspider frozenspider moved this from To do to In progress in Jepsen Testing Aug 14, 2019
@frozenspider
Copy link
Contributor

Interesting thing: we're actually writing tombstone instead of NOW

I0815 11:58:14.998617  3349 docdb.cc:1409] === Apply: SubDocKey(DocKey(0x04cf, ["B\xe8Z2mUO\xba\xb1R\xeaO\xb0\x18v\xff"], []), [SystemColumnId(0)]) = null (type = kNullLow) (int64 = 4067611415479844864), time: { physical: 1565870294982314 }, txn: 7af7b080-2dd3-4dbb-9a3b-f371dd089576, raw: 4704CF5342E85A326D554FBAB152EA4FB01876FF000021214A800D08238001C41760777966804A
I0815 11:58:15.000723  3349 docdb.cc:1409] === Apply: SubDocKey(DocKey(0x04cf, ["B\xe8Z2mUO\xba\xb1R\xeaO\xb0\x18v\xff"], []), [ColumnId(1)]) = DEL (type = kTombstone) (int64 = 232), time: { physical: 1565870294982314 }, txn: 7af7b080-2dd3-4dbb-9a3b-f371dd089576, raw: 4704CF5342E85A326D554FBAB152EA4FB01876FF000021214B810D08238001C41760777966803FAB
I0815 11:58:15.001297  3349 docdb.cc:1409] === Apply: SubDocKey(DocKey(0x04cf, ["B\xe8Z2mUO\xba\xb1R\xeaO\xb0\x18v\xff"], []), [ColumnId(2)]) = 9 (type = kInt32) (int64 = 38654705824), time: { physical: 1565870294982314 }, txn: 7af7b080-2dd3-4dbb-9a3b-f371dd089576, raw: 4704CF5342E85A326D554FBAB152EA4FB01876FF000021214B820D08238001C41760777966803F8B

@frozenspider
Copy link
Contributor

@m-iancu made an assumption about the cause:

What can happen is that:
a. conn 1 creates the table (this does multiple ops internally):

  1. create YB table
  2. add pg_class/pg_attribute entries
  3. modify entries to set default (done as a separate step even when part of create table query itself)

b. conn 2 does an insert between step 2 and 3 then it could cause this issue.

This appears to be correct, the following Java test reliably reproduces this:
https://gist.github.com/frozenspider/9c5719968f5acf21a89c82fdf22f37a3

@ndeodhar ndeodhar moved this from To do to In progress in YSQL Aug 28, 2019
@frozenspider
Copy link
Contributor

Should be fixed as part of #1404

@aphyr aphyr changed the title [YSQL] Tables with a foo timestamp default NOW() column may not be initialized [YSQL] Columns with default values may be initialized to null instead Sep 5, 2019
@aphyr
Copy link
Author

aphyr commented Sep 5, 2019

I've expanded the test which uncovered this issue into a workload which specifically demonstrates the issue and reports its presence. Note that this problem affects any default value, not just timestamps or NOW(). With Jepsen d9e6a09, try

lein run test --os debian --url https://s3-us-west-2.amazonaws.com/downloads.yugabyte.com/new/yugabyte-1.3.1.2-b1-release-centos-x86_64.tar.gz -w ysql/default-value --time-limit 300

@ndeodhar ndeodhar assigned mbautin and unassigned frozenspider Sep 18, 2019
@ddorian ddorian added this to To do in Query optimization Dec 12, 2019
mbautin added a commit that referenced this issue Dec 12, 2019
…ction

Summary:
This diff makes a step towards enabling transactional DDL in YugabyteDB
by switching to transactional execution of multiple YSQL system catalog
read/write operations that happen as part of the same DDL operation.

We are making YSQL system catalog tables transactional at the DocDB
level, backed by the usual transaction status table that resides on
tablet servers. This introduces an additional dependency of the master
on tablet servers, so that accessing YSQL system catalog tables in the
common case now requires tablet servers to be up and running and the
transaction status table to exist, because there could be provisional
records whose status is determined by a status tablet living on tablet
servers. This new dependency could be removed later by making the system
catalog tablet a transaction status tablet in addition to everything
else it already stores.

Previously, we used to create the transaction status table whenever the
first request to create a transactional table (including any YSQL table)
was received. Now, the transaction status table is also required by
operations such as creating a role in YSQL, or anything that reads or
changes the state of YSQL system catalog tables. Therefore, if YSQL is
enabled, we now create the transaction status table automatically on the
master leader right after the number of tablet servers equal to the
cluster's replication factor have registered.

After we've added DocDB transactional capabilities to YSQL system
catalog tables, there are multiple possible choices in how we could wrap
formerly non-transactional DDL operations that access the YSQL system
catalog into transactions.  Ideally, we would want to allow arbitrary
user-controlled transactions to include DDL statements that could be
committed/rolled back similarly to DML statements. That would require us
to commit and roll back the corresponding Yugabyte-level operations
(creating namespaces, tablets, and tablets in the Yugabyte system
catalog, which is also stored in the master but is distinct from the
YSQL system catalog) along with YSQL system catalog changes. This more
complete feature is tracked at #3109 and is not part of this diff.

This diff's main goal is an incremental improvement over the existing
state of completely non-transactional YSQL system catalog changes that
would allow a reader to e.g. see a state of a half-done CREATE TABLE
statement. With this diff, all DDL statements are wrapped in a separate
DocDB-level transaction that is distinct from the regular DocDB-level
transaction matching the user's YSQL transaction. One such separate
DDL-only transaction is created per each top-level DML statement,
including any internal sub-statements that it issues. This should solve
the problem of leaving the effects of a half-done CREATE TABLE statement
in the system catalog, and the problem of observing results of an
incomplete CREATE TABLE statement (e.g. before it sets the default value
of a column, as in #2021). However, this could produce some anomalies,
e.g. #3110, that will be addressed as we implement full transactional
DDL support (see #3109).  Even though the example given in #3110
involves row-level security, this diff does not negatively affect row
level security specificially. Any CREATE TABLE statement executed as
part of a user-controlled transaction will now create and commit rows as
part of a separate DDL-only tranasction, and if the enclosing
user-controlled transaction is running at REPEATABLE READ isolation,
further statements in the user-controlled transaction won't see those
rows. They would still see the newly created table, because internal
read operations on system catalog tables are always done outside of any
transaction.  YugabyteDB's SERIALIZABLE transactions always read the
most recent state of the data and lock it against concurrent changes, so
switching the isolation level to SERIALIZABLE is sufficient to fix
the yb_pg_rowsecurity test.

These separate DDL-only transactions are executed at the REPEATABLE READ
isolation level by default, but can be set to serializable by the
setting the new `--ysql_enable_manual_sys_table_txn_ctl` flag.

We are adding a new flag to table properties, is_ysql_catalog_table. To
initialize this flag for existing clusters, we loop over all YSQL system
catalog tables and replicate the necessary changes to table entities in
the YugabyteDB system catalog, as well as updates to the system catalog
tablet metadata. For the tablet metadata change we use the existing
add_table field in ChangeMetadataRequestPB. This was originally intended
for use with new tables only, but would only produce an error message on
cluster upgrades in release builds (due to a DCHECK), and effectively
replace the old metadata protobuf with the new one.

Test Plan:
Jenkins

Upgrade test:

~/code/yugabyte-db-old/bin/yb-ctl create

Load the Northwind database as described at
https://docs.yugabyte.com/latest/sample-data/northwind/

~/code/yugabyte-db-old/bin/yb-ctl stop

~/code/yugabyte-db-new/bin/yb-ctl start

Look for these messages in the master log:

I1212 22:13:36.832412 2963256 sys_catalog_initialization.cc:312] Made 655 YSQL sys catalog tables transactional
I1212 22:13:36.832451 2963256 sys_catalog_initialization.cc:315] Marking YSQL system catalog as transactional in YSQL catalog config

Verify that the Northwind data is intact.

Repeat the above steps with `--replication_factor=3` as well.

Reviewers: mihnea, sergei, neha

Reviewed By: sergei, neha

Subscribers: bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D7156
@m-iancu m-iancu added this to To do in DDL improvements via automation Mar 4, 2020
@m-iancu m-iancu added this to the v2.2 milestone Mar 4, 2020
@frozenspider
Copy link
Contributor

frozenspider commented May 14, 2020

This has been resolved and is no longer reproducible. This was confirmed by verifying the following:

  • These issues still reproduce when test is run against the original version v1.3.1.0
  • These issues no longer reproduce when test is run against v2.1.3 (as of commit 9f4448d)

Note that in order to be safe, the testing was done using 100 iterations of the Jepsen test

lein run test --os debian ... -w ysql/append-table --concurrency 2n --time-limit 300

(using Jepsen commit e324b33 and corresponding --version or --url)

Jepsen Testing automation moved this from In progress to Done May 14, 2020
YSQL automation moved this from In progress to Done May 14, 2020
@aphyr
Copy link
Author

aphyr commented May 14, 2020

Heyyyy congats!!! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug
Projects
YSQL
  
Done
Development

No branches or pull requests

7 participants