20.1 release blockers list #45599

Closed
83 tasks done
jordanlewis opened this issue Mar 2, 2020 · 50 comments

Comments

@jordanlewis
Member

jordanlewis commented Mar 2, 2020

PSA: The 20.1 release branch is now cut. The lucky winner SHA is 1225203.

As we are entering an important stabilization period, do not backport anything into the release-20.1 branch unless it is part of the release blockers list. Please also add a comment if it blocks the beta releases.

Possible blockers for the 20.1 release.

General

  • Mint the 20.1 cluster version before picking the 20.1 SHA [rel-eng / @dt]

AppDev

Bulk IO

Admin UI (Observability)

Server / UI

KV

SQL Schema

SQL Execution

SQL Planning

Storage

Performance Regressions

@jordanlewis jordanlewis added the meta-issue Contains a list of several other issues. label Mar 2, 2020
@isaactwong isaactwong self-assigned this Mar 3, 2020
@nvanbenschoten nvanbenschoten pinned this issue Mar 11, 2020
@ajwerner ajwerner unpinned this issue Mar 16, 2020
@ajwerner ajwerner pinned this issue Mar 16, 2020
@thoszhang
Contributor

I'm adding #46018, which is a correctness bug in schema change rollbacks, to the list. I think it should be considered blocking for the beta since it also affects the correctness of #46152.

@andreimatei
Contributor

Added kv: idempotency failure across implicit commit breaks refreshing a parallel commit request #46341

@andreimatei
Contributor

Added sql: rollback to savepoint broken in mixed-version (19.2/20.1) clusters #46372

@ajwerner
Contributor

Added importccl,gcjob: failed and canceled imports don't seem to clean up their data #46684

@andreimatei
Contributor

Added kvserver: v20.1.0-beta.3: received ... results, limit was ... #46652

@tbg
Member

tbg commented Mar 30, 2020

I am mitigating #46652 by turning it into a sentry-reported error instead of a crash (https://github.com/cockroachdb/cockroach/pull/46720/files). This may not be enough to remove it from the release blocker list, though.

@tbg
Member

tbg commented Mar 30, 2020

@andreimatei looks like you didn't actually add that issue to the list? Done now, though.

@nvanbenschoten
Member

Added the following three blockers:

@yuzefovich
Member

Added two release blockers (#46664 and #46646).

@thoszhang
Contributor

Adding #46715 (PR to fix bug where table/index GC could be delayed indefinitely)

@RaduBerinde
Member

Added sql telemetry fixes as a "soft" blocker.

@dhartunian
Collaborator

Added Admin UI/Observability blockers.

@otan otan mentioned this issue Mar 30, 2020
24 tasks
@RaduBerinde
Member

@dhartunian none of the diagnostics-related ones are on there. Most importantly #46331. Are we not planning to finish those?

@dhartunian
Collaborator

Added statement diagnostics bundle download to Admin UI release blockers as per @RaduBerinde's comment above.

@thoszhang
Contributor

Adding #46792 which is a test-only change that fixes a flake.

@thoszhang
Contributor

Adding #46818.

@dhartunian
Collaborator

Adding two more regressions on our end. Both have PRs already up for review.

@nvanbenschoten
Member

Adding #46752 to the list.

@andreimatei
Contributor

I've added "sql: recent regression in stopper quiescence time #47011 [knz]". Depending on what ends up being the problem, it might turn out to not be a release blocker, but also perhaps it will be.

@andreimatei
Contributor

andreimatei commented Apr 3, 2020

I've added "[ ] kvserver/closedts: setting kv.closed_timestamp.target_duration to 0 does not disable #47010 [ajwerner]"
Not a new issue.

@ajwerner
Contributor

ajwerner commented Apr 6, 2020

Added sql: WITH HASH hash column is not null but formula can return NULL #47055

@irfansharif
Contributor

Added "kv: invalid Raft truncation decision panic in cli unittests #43605", which I think can cause panics out in the wild.

@miretskiy
Contributor

miretskiy commented Apr 6, 2020 via email

@irfansharif
Contributor

Will you be able to get #43605 fix in in a day or two?

I think so, I'm typing it up today. I was just going to downgrade the panic.

@tbg
Member

tbg commented Apr 7, 2020

Checking off: kvserver: v20.1.0-beta.3: received ... results, limit was ... #46652 [tbg]

@nvanbenschoten will give this another look to see if he can spot the bug, but either way we don't want to block the release on it. Previous code inspection plus randomized testing was not able to reproduce it. Not satisfying, but better than not releasing.

@thoszhang
Contributor

I checked off #46152 (19.2 to 20.1 migration for schema change jobs) earlier. The fix for #46818 (sqlmigrations: create GC jobs for failed import/restore jobs from 19.2) is being merged right now; the backport PR is #47144.

@miretskiy
Contributor

We are all clear on the release blockers list; starting rc1.

@yuzefovich
Member

Added the need to backport #47165 as a release blocker (it doesn't need to go into rc1 though).

@jordanlewis
Member Author

Added: system.namespace unreadable from SQL from 19.2 nodes in mixed-version state #47167

This issue causes problems for 19.2 nodes in a mixed-version cluster. The nodes will not be able to read from crdb_internal.zones, which causes problems for debug zip, some parts of the admin ui, and a few other related areas.

@otan
Contributor

otan commented Apr 8, 2020

adding #47156 - not important for RC.

@dt
Member

dt commented Apr 8, 2020

#47167 is a hard blocker -- it probably doesn't block a beta, but it does block release, so that means any sha we start qualifying right now would be considered a beta, not an RC.
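The rule stated here can be sketched in Go as follows. This is only an illustration under assumptions: the `Blocker` type, its fields, and the `classify` function are hypothetical, the treatment of open soft blockers as not preventing an RC is an assumption, and the thread itself leaves the exact definition open to discussion.

```go
package main

import "fmt"

// Blocker is a hypothetical record for one entry on the blockers list.
type Blocker struct {
	Issue      int
	Hard       bool // must be fixed before the final release
	BlocksBeta bool // flagged in a comment as blocking beta releases
	Done       bool // already checked off
}

// classify reports what a SHA that starts qualifying now could be called:
// "blocked" if an open item blocks betas, "beta" if an open hard blocker
// remains, and "rc" otherwise (assumption: soft blockers do not prevent an RC).
func classify(blockers []Blocker) string {
	label := "rc"
	for _, b := range blockers {
		if b.Done {
			continue
		}
		if b.BlocksBeta {
			return "blocked"
		}
		if b.Hard {
			label = "beta"
		}
	}
	return label
}

func main() {
	// An open hard blocker that does not block a beta, as described for #47167.
	open := []Blocker{{Issue: 47167, Hard: true}}
	fmt.Println(classify(open)) // prints "beta"
}
```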

@dt
Member

dt commented Apr 8, 2020

cc @miretskiy re above w.r.t. #47167 and RC.1

@nvanbenschoten
Member

Adding #47219, which is an old bug responsible for a new assertion failure.

@miretskiy
Contributor

@nvanbenschoten is 47219 rc1 blocker as well?

@pbardea
Contributor

pbardea commented Apr 9, 2020

Adding #44453.

@nvanbenschoten
Member

@nvanbenschoten is 47219 rc1 blocker as well?

I don't know exactly how that's being defined. It is a hard release blocker, so it should block the final release. But I don't think it should stop any intermediate release. Whether this means we should consider a currently qualifying release as a beta instead of an RC is up for discussion (#45599 (comment))

@nvanbenschoten
Member

Also, adding "kv: avoid excessively wide range tombstones during Raft snapshot reception" deemed unsafe #44048 (comment) as a hard release blocker. We need to back that change out of the release, as it's too risky to do anything more with it at this point in the release cycle and it hypothetically risks replica inconsistencies.

@knz
Contributor

knz commented Apr 9, 2020

Adding:

sql: upgrade from 19.x to 20.x eliminates all computed columns #47263

@RaduBerinde
Member

Added sql: hard scan limit removed incorrectly #47283. ETA for a fix is today.

@thoszhang
Contributor

Adding #47324 as a possible blocker.

@yuzefovich
Member

yuzefovich commented Apr 11, 2020

Adding the need to backport #47365 (sql: ignore soft limits on scan nodes for distsql planning) as a release blocker (not RC blocker though).

@petermattis
Collaborator

Adding the need to backport #47350. I think we could technically release without this backport as it only affects Pebble, but we'd have to recommend against using Pebble and it would set back our rollout plans for Pebble. These fixes should only affect Pebble.

@otan
Contributor

otan commented Apr 13, 2020

Adding #47425 for release blocker - not important for RC.

@nvanbenschoten
Member

Adding #47187 as a hard release blocker.

@andreimatei
Contributor

Added "kv: txn recovery false positive #47337 [andrei]". It's a bug causing transactions to be wrongfully committed, so it's quite bad I think.

@nvanbenschoten
Member

Adding #47471 as a soft release blocker. It's an old bug, but it's worse now that we perform significantly more ranged intent resolution. The fix is targeted and should be up today.

@thoszhang
Contributor

thoszhang commented Apr 14, 2020

Adding #47312 as a soft release blocker. Edit: Forgot to mention that the fix is targeted and already merged to master; #47490 is the backport PR.

@thoszhang
Contributor

Adding #47532 as a potential "soft" release blocker.

For context, this is a regression I introduced as part of fixing #47324, and it'll take a one-line change to fix.

@petermattis
Collaborator

Added #47406, which is fixed by cockroachdb/pebble#629. There is a subtle incompatibility between RocksDB and Pebble bloom filters which was causing the failures in #47406. While we've only seen problems in a test which switches back and forth between RocksDB and Pebble, I'm very anxious that this could be causing other rare problems, as Pebble is used to create sstables used for ingestion (e.g. during rebalancing, import, and restore) even when RocksDB is the storage engine in use.

@otan
Contributor

otan commented May 11, 2020

considering 20.1 released.

@otan otan closed this as completed May 11, 2020
@otan otan unpinned this issue May 11, 2020
@otan otan added this to Released / Cancelled in Release (obsolete) May 12, 2020