Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changefeedccl: row updates being emitting at the wrong timestamp after a column drop #41961

Closed
aayushshah15 opened this issue Oct 28, 2019 · 0 comments · Fixed by #42053
Closed
Assignees

Comments

@aayushshah15
Copy link
Contributor

Currently, the changefeed poller emits row updates at the ModificationTime of the table descriptor where the column backfill mutation has been completely applied. If the schema change was started when the table descriptor was at version 1, the table descriptor version at which the changfeed level backfill row updates are emitted would be version 4.

The above is correct in case of ADD COLUMN since descriptor version 4 is the first table descriptor version such that the new column is logically available. However, in case of a DROP COLUMN, the dropped column would logically get dropped way before this (at version 2 if the schema change started on version 1).

We should be emitting backfill row updates for dropped columns at the ModificationTime of the table descriptor at which they're logically dropped (which is version 2 in the above example).

@aayushshah15 aayushshah15 self-assigned this Oct 28, 2019
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Oct 31, 2019
Currently, the changefeed poller detects a scan boundary when it
detects that the last version of a table descriptor has a pending
mutation but the current version doesn't. In case of an `ALTER TABLE
DROP COLUMN` statement, the point at which this happens is the point at
which the schema change backfill completes. However, this is incorrect
since the dropped column gets logically dropped before that.

This PR corrects this problem by instead using the set of column
descriptors within the current and previous table descriptors to detect a
scan boundary.

Fixes cockroachdb#41961

Release note (bug fix): Changefeeds now emit backfill row updates for
                        dropped column when the table descriptor drops
                        that column.
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Nov 19, 2019
Currently, the changefeed poller detects a scan boundary when it
detects that the last version of a table descriptor has a pending
mutation but the current version doesn't. In case of an `ALTER TABLE
DROP COLUMN` statement, the point at which this happens is the point at
which the schema change backfill completes. However, this is incorrect
since the dropped column gets logically dropped before that.

This PR corrects this problem by instead using the set of column
descriptors within the current and previous table descriptors to detect a
scan boundary.

Fixes cockroachdb#41961

Release note (bug fix): Changefeeds now emit backfill row updates for
                        dropped column when the table descriptor drops
                        that column.
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Nov 19, 2019
Currently, the changefeed poller detects a scan boundary when it
detects that the last version of a table descriptor has a pending
mutation but the current version doesn't. In case of an `ALTER TABLE
DROP COLUMN` statement, the point at which this happens is the point at
which the schema change backfill completes. However, this is incorrect
since the dropped column gets logically dropped before that.

This PR corrects this problem by instead using the set of column
descriptors within the current and previous table descriptors to detect a
scan boundary.

Fixes cockroachdb#41961

Release note (bug fix): Changefeeds now emit backfill row updates for
                        dropped column when the table descriptor drops
                        that column.
craig bot pushed a commit that referenced this issue Nov 21, 2019
41806: docs/RFCs: draft RFC for protected timestamps r=nvanbenschoten a=ajwerner

This commit provides a draft RFC for a subsystem to protect values in spans
of data alive at specific timestamps from being garbage collected.

Release note: None

42053: changefeedccl: add scan boundaries based on change in set of columns r=danhhz a=aayushshah15

Currently, the changefeed poller detects a scan boundary when 
detects that the last version of a table descriptor has a 
mutation but the current version doesn't. In case of an `ALTER TABLE
DROP COLUMN` statement, the point at which this happens is the point
at which the schema change backfill completes. This is incorrect since 
the dropped column is logically dropped before this point.

This PR corrects this problem by instead checking that the last version of the descriptor has a mutation AND that the number of columns in the current table descriptor is different from the number of columns in the last table descriptor.

Fixes #41961

Release note (bug fix): Changefeeds now emit backfill row updates for
                            dropped column when the table descriptor drops
                            that column.

42494: storage: Implement teeing Engine to test/compare pebble and rocksdb r=itsbilal a=itsbilal

This change adds a new Engine implementation: TeePebbleRocksDB, as well
as associated objects (batch, snapshots, files, iterators). This engine
type spins up instances of both pebble and rocksdb under the hood, writes
to both of them in all write paths, and compares their outputs in the
read path. A panic is thrown if there's a mismatch.

This engine can be used in practice with `--storage-engine pebble+rocksdb`, or
the related env variable `COCKROACH_STORAGE_ENGINE` as in
`COCKROACH_STORAGE_ENGINE=pebble+rocksdb make test ...`.

Fixes #42381.

Release note: None

42635: importccl: don't diasable nullif option when escape character is spec… r=spaskob a=spaskob

…ified

By default the string 'NULL' is parsed as null value in DELIMITED IMPORT mode.
It's usefull to treat 'NULL' as just a normal string and parse it verbatim;
this can be accomplished by providing option `WITH fields_escaped_by = '\'` and
using '\N' to specify null values and so in this case the parser upon seeing
the string 'NULL' will treat it as a normal string. However if the customer
provides their own null string via `WITH nullif = 'my_null'` it does not make
sense to disregard it if they use escaping as well for some other purposes,
for example escaping the field separator. A typical use case is when
the empty string should be treated as null.

Fixes: https://github.com/cockroachlabs/support/issues/345.

Release note (bug fix): when custom nullif is provided always treat it as null.

Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
Co-authored-by: Aayush Shah <aayush.shah15@gmail.com>
Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Co-authored-by: Spas Bojanov <spas@cockroachlabs.com>
@craig craig bot closed this as completed in 7f59958 Nov 21, 2019
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Nov 21, 2019
Currently, the changefeed poller detects a scan boundary when it
detects that the last version of a table descriptor has a pending
mutation but the current version doesn't. In case of an `ALTER TABLE
DROP COLUMN` statement, the point at which this happens is the point at
which the schema change backfill completes. However, this is incorrect
since the dropped column gets logically dropped before that.

This PR corrects this problem by instead using the set of column
descriptors within the current and previous table descriptors to detect a
scan boundary.

Fixes cockroachdb#41961

Release note (bug fix): Changefeeds now emit backfill row updates for
                        dropped column when the table descriptor drops
                        that column.
aayushshah15 added a commit to aayushshah15/cockroach that referenced this issue Dec 2, 2019
Currently, the changefeed poller detects a scan boundary when it
detects that the last version of a table descriptor has a pending
mutation but the current version doesn't. In case of an `ALTER TABLE
DROP COLUMN` statement, the point at which this happens is the point at
which the schema change backfill completes. However, this is incorrect
since the dropped column gets logically dropped before that.

This PR corrects this problem by instead using the set of column
descriptors within the current and previous table descriptors to detect a
scan boundary.

Fixes cockroachdb#41961

Release note (bug fix): Changefeeds now emit backfill row updates for
                        dropped column when the table descriptor drops
                        that column.
danhhz pushed a commit to danhhz/cockroach that referenced this issue Dec 6, 2019
Currently, the changefeed poller detects a scan boundary when it
detects that the last version of a table descriptor has a pending
mutation but the current version doesn't. In case of an `ALTER TABLE
DROP COLUMN` statement, the point at which this happens is the point at
which the schema change backfill completes. However, this is incorrect
since the dropped column gets logically dropped before that.

This PR corrects this problem by instead using the set of column
descriptors within the current and previous table descriptors to detect a
scan boundary.

Fixes cockroachdb#41961

Release note (bug fix): Changefeeds now emit backfill row updates for
                        dropped column when the table descriptor drops
                        that column.
danhhz pushed a commit to danhhz/cockroach that referenced this issue Dec 10, 2019
Currently, the changefeed poller detects a scan boundary when it
detects that the last version of a table descriptor has a pending
mutation but the current version doesn't. In case of an `ALTER TABLE
DROP COLUMN` statement, the point at which this happens is the point at
which the schema change backfill completes. However, this is incorrect
since the dropped column gets logically dropped before that.

This PR corrects this problem by instead using the set of column
descriptors within the current and previous table descriptors to detect a
scan boundary.

Fixes cockroachdb#41961

Release note (bug fix): Changefeeds now emit backfill row updates for
                        dropped column when the table descriptor drops
                        that column.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant