[YSQL] Spurious "Catalog Version Mismatch: A DDL occurred while processing this query" #1457
Comments
Same problem with a very basic table:

postgres=# \c postgres
postgres=# select * from employee;

In my case, there is something strange between varchar and text ...

W0613 15:44:12.054183 2119 tablet_rpc.cc:327] Query error (yb/tserver/tablet_service.cc:1090): Failed Read(tablet: 18b0d1e813314be988f355007142cc25, num_ops: 1, num_attempts: 1, txn: 00000000-0000-0000-0000-000000000000) to tablet 18b0d1e813314be988f355007142cc25 on tablet server { uuid: 9a71da20626a4691b75c9c92b512e41b private: [host: "10.39.0.2" port: 9100] public: [host: "yb-tserver-2.yb-tservers" port: 9100] cloud_info: placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" after 1 attempt(s): Catalog Version Mismatch: A DDL occurred while processing this query. Try Again.
W0613 15:45:11.522013 35 tablet_server.h:171] Ignoring ysql catalog version update: new version too old. New: 288, Old: 620
I encountered the issue when using the ysqlsh binary from yugabyte-ce-1.2.11.0. I used the ysqlsh in the ~/bin folder on the t-server. Every query fails, even after a retry. Some SQL workloads were running when the issue first occurred. After stopping the workload and reconnecting ysqlsh, the issue still occurs. The workload is 'SqlInserts' from yb-sample-apps (https://github.com/YugaByte/yb-sample-apps).
… server shared memory

Summary: This diff adds a generic `SharedMemorySegment` class, which can be used to create or open anonymous shared memory segments. `SharedMemorySegment` provides an abstraction around platform specific details of shared memory, provides ownership (RAII), and allows anonymous shared memory to be passed through the `exec` family of system calls (using a file descriptor).

Another class `TServerSharedMemory` was added as a thin wrapper around a `SharedMemorySegment`, which is used to share memory between a tablet server and any local Postgres backends.

Each tablet server stores its `ysql_catalog_version` in shared memory, which is accessed by the local postgres instance before every query. This allows postgres to refresh the catalog cache before executing a query, rather than checking on the t-server, which fails and retries if a refresh is needed. This change is also required for #869, as certain queries like `SET ROLE` never reach a tablet server. May fix/mitigate #1457 + #1358 as well.

Test Plan: Added integration tests to `TestPgCacheConsistency.java`. Without the refreshing change made in this diff, most of the added tests fail. Some fail due to failed catalog version checks during a non-retryable query, and others simply slip under the radar and proceed with the query using the stale catalog (giving an incorrect result).

Added shared memory specific unit tests to `shared_mem-test.cc` and `tserver_shared_mem-test.cc`.

Manually tested in the following configurations:
- Building from source and running in OSX, Centos 7, and Ubuntu 18.04.
- Building from source and running in Centos 7 and Ubuntu 18.04 docker containers, running on Centos 7.
- Building a release on Centos 7 and deploying the release in an Ubuntu 18.04 docker container, running on Centos 7 and OSX.

Manual testing was performed primarily to ensure that the system-specific compile-time and run-time checks in `shared_mem.cc` work as expected. The expected (and observed) results were as follows:
- When running outside docker, OSX creates shared memory files under `/tmp`, Centos 7 creates shared memory files under `/dev/shm`, and Ubuntu 18.04 uses `memfd_create`.
- When running inside Ubuntu 18.04 docker containers, `/dev/shm` is used when the container runs on Centos 7 (using host kernel), and `memfd_create` is used when running on OSX.
- Releases built on a version of linux which does not support `memfd_create` still use `memfd_create` when deployed on a version of the kernel which does support the system call.

Reviewers: mihnea, dmitry, sergei, mikhail
Reviewed By: sergei, mikhail
Subscribers: kannan, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D6716
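To make the idea in this commit message concrete, here is a minimal Python sketch of a backend that checks a catalog version held in shared memory before each query and refreshes its cache when the version has moved. It is an illustration only, not the actual C++ implementation: the class names `TServerSharedState` and `Backend`, the single 64-bit counter layout, and the `refresh_count` bookkeeping are all assumptions made for this example.

```python
import mmap
import struct

# Assumed layout: a single unsigned 64-bit counter at offset 0.
VERSION_FORMAT = "<Q"


class TServerSharedState:
    """Hypothetical stand-in for the tablet server's shared memory segment."""

    def __init__(self):
        # Anonymous mapping: it is not backed by a named file.
        self._mem = mmap.mmap(-1, struct.calcsize(VERSION_FORMAT))
        self.set_catalog_version(0)

    def set_catalog_version(self, version):
        struct.pack_into(VERSION_FORMAT, self._mem, 0, version)

    def catalog_version(self):
        return struct.unpack_from(VERSION_FORMAT, self._mem, 0)[0]


class Backend:
    """Hypothetical Postgres backend holding a local catalog cache."""

    def __init__(self, shared):
        self._shared = shared
        self.cached_version = shared.catalog_version()
        self.refresh_count = 0

    def execute(self, query):
        # Compare versions before running the query, instead of letting
        # the tablet server reject it and force a retry.
        current = self._shared.catalog_version()
        if current != self.cached_version:
            self.cached_version = current  # stands in for a cache reload
            self.refresh_count += 1
        return (query, self.cached_version)


shared = TServerSharedState()
backend = Backend(shared)
backend.execute("select 1")
shared.set_catalog_version(5)  # a DDL elsewhere bumps the version
result = backend.execute("select * from employee")
```

In this sketch the second query runs against version 5 after exactly one refresh, so it never reaches the point where a mismatch error would be raised.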
Summary: When the catalog config was being changed (e.g. incrementing the metadata version), in some instances we only updated the in-memory state, but not the sys catalog. So, if the master leader changed, the catalog config ended up being reset. This can lead to a `Catalog Version Mismatch` error that does not get fixed by retries or cache refreshes, because the (new) master leader keeps returning an old version that doesn't match the (logically correct) one that tservers cached from the original master leader.

Test Plan: TestPgCatalogVersionPersistence.java

Reviewers: mikhail, neha, robert, srhickma, bogdan
Reviewed By: bogdan
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D6656
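The failure mode described in this summary can be modelled in a few lines of Python. This is a toy simulation under stated assumptions, not YugaByte code: `Master`, `TServer`, `on_heartbeat`, and the dictionary standing in for the sys catalog are all hypothetical names. The only behavior taken from the thread is that a tserver ignores a version update that is older than the one it already holds, as in the "new version too old" log line above.

```python
class Master:
    """Hypothetical master; `durable` stands in for the sys catalog."""

    def __init__(self, durable):
        self.durable = durable
        # A newly elected leader can only load what was persisted.
        self.in_memory_version = durable["version"]

    def bump_version(self, persist):
        self.in_memory_version += 1
        if persist:
            self.durable["version"] = self.in_memory_version


class TServer:
    """Hypothetical tablet server that never moves its version backwards."""

    def __init__(self):
        self.version = 0
        self.ignored = 0

    def on_heartbeat(self, master_version):
        if master_version < self.version:
            self.ignored += 1  # corresponds to "new version too old"
            return
        self.version = master_version


def run(persist):
    sys_catalog = {"version": 0}
    leader = Master(sys_catalog)
    tserver = TServer()
    for _ in range(3):
        leader.bump_version(persist)
    tserver.on_heartbeat(leader.in_memory_version)
    # Leader change: the new leader reloads from durable state only.
    new_leader = Master(sys_catalog)
    tserver.on_heartbeat(new_leader.in_memory_version)
    return tserver.version, new_leader.in_memory_version, tserver.ignored
```

With `persist=False` the tserver stays at version 3 while the new leader reports 0, so the two never agree again and retries cannot help. With `persist=True` both sides hold 3 after the leader change.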
I reproduced the issue described here using a single-node YugaByte DB cluster created using Version 1.2.9.0. I used a MacBook running macOS Mojave Version 10.14.5.
The following account provides the text of two `.sql` files. Save them as: (1) `Startup.sql`; and (2) `Catalog_Version_Mismatch.sql` as indicated. The second file uses the first file via the `\i` command. You will need two concurrent `ysqlsh` sessions as explained in comments in the second file. I refer to these as "BLUE" and "YELLOW" because I found it helpful to set the background colors of my two terminal windows differently.

The issue manifests, or does not, according to the precise moment at which you start the second `ysqlsh` session.

Step through `Catalog_Version_Mismatch.sql` by hand, pasting the sections of code into the BLUE or the YELLOW session as explained in the comments.