[YSQL] Spurious "Catalog Version Mismatch: A DDL occurred while processing this query" #1457

Closed
bllewell opened this issue May 30, 2019 · 3 comments
Labels: area/ysql Yugabyte SQL (YSQL)

Comments

@bllewell
Contributor

bllewell commented May 30, 2019

I reproduced the issue described here on a single-node YugaByte DB cluster created with version 1.2.9.0. I used a MacBook running macOS Mojave version 10.14.5.

The following account provides the text of two .sql files. Save them as: (1) Startup.sql; and (2) Catalog_Version_Mismatch.sql as indicated. The second file uses the first file via the \i command. You will need two concurrent ysqlsh sessions as explained in comments in the second file. I refer to these as "BLUE" and "YELLOW" because I found it helpful to set the background colors of my two terminal windows differently.

The issue manifests, or does not, according to the precise moment at which you start the second ysqlsh session.

Step through Catalog_Version_Mismatch.sql by hand, pasting the sections of code into the BLUE or the YELLOW session as explained in the comments.

----- Startup.sql ------------------------------------------
-- Improve readability 
\set ECHO 'none'
\set QUIET 'on'
\pset footer off
\set PROMPT1 '%R%  '
\set PROMPT2 '%R%  '

-- Reduce confusion in demos of "isolation level" functionality
\set AUTOCOMMIT 'off'
------------------------------------------------------------
----- Catalog_Version_Mismatch.sql -------------------------
--
-- ERROR: Catalog Version Mismatch:
-- A DDL occurred while processing this query. Try Again.
--
-- To provoke the error, start ysqlsh in two sessions so that
-- each sees the same schema in the same database and is running
-- as the same user who can create, drop, and populate tables.
-- Then follow the instructions.
--
-- To avoid the error, start only the BLUE ysqlsh session.
-- Follow the "In the BLUE ysqlsh session" instructions.
-- Only then, start the YELLOW ysqlsh session and continue from
-- "In the YELLOW ysqlsh session"

-- In the BLUE ysqlsh session
\i Startup.sql
begin isolation level read committed;
drop table if exists t;
create table t(n int primary key);
insert into t(n)
select generate_series(0, 9);
commit;

select n from t order by n; rollback;

begin isolation level read committed;
insert into t(n) values(17);
select n from t order by n;

-- In the YELLOW ysqlsh session
\i Startup.sql
begin isolation level read committed;

-- OK so far.
-- But the "insert" causes this:
-- ERROR: Catalog Version Mismatch: A DDL occurred while processing this query. Try Again.
insert into t(n) values(42);

------------------------------------------------------------
@bllewell bllewell added the area/ysql Yugabyte SQL (YSQL) label May 30, 2019
@bllewell bllewell changed the title Spurious "Catalog Version Mismatch: A DDL occurred while processing this query" [YSQL] Spurious "Catalog Version Mismatch: A DDL occurred while processing this query" May 30, 2019
@eskuai

eskuai commented Jun 13, 2019

Same problem with a very basic table in the postgres database:

postgres=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# \d employee
                 Table "public.employee"
  Column  |       Type        | Collation | Nullable | Default
----------+-------------------+-----------+----------+---------
 id       | integer           |           | not null |
 name     | character varying |           |          |
 age      | integer           |           |          |
 language | character varying |           |          |
Indexes:
    "employee_pkey" PRIMARY KEY, lsm (id HASH)

postgres=# select * from employee;
select * from employee;
ERROR: Query error: Catalog Version Mismatch: A DDL occurred while processing this query. Try Again.
postgres=#

in my case, something strange between varchar .. text ...

W0613 15:44:12.054183 2119 tablet_rpc.cc:327] Query error (yb/tserver/tablet_service.cc:1090): Failed Read(tablet: 18b0d1e813314be988f355007142cc25, num_ops: 1, num_attempts: 1, txn: 00000000-0000-0000-0000-000000000000) to tablet 18b0d1e813314be988f355007142cc25 on tablet server { uuid: 9a71da20626a4691b75c9c92b512e41b private: [host: "10.39.0.2" port: 9100] public: [host: "yb-tserver-2.yb-tservers" port: 9100] cloud_info: placement_cloud: "cloud1" placement_region: "datacenter1" placement_zone: "rack1" after 1 attempt(s): Catalog Version Mismatch: A DDL occurred while processing this query. Try Again.
2019-06-13 15:44:12.054 UTC [2110] ERROR: Query error: Catalog Version Mismatch: A DDL occurred while processing this query. Try Again.
2019-06-13 15:44:12.054 UTC [2110] STATEMENT: select * from users;

W0613 15:45:11.522013 35 tablet_server.h:171] Ignoring ysql catalog version update: new version too old. New: 288, Old: 620
W0613 15:45:12.523046 35 tablet_server.h:171] Ignoring ysql catalog version update: new version too old. New: 288, Old: 620
W0613 15:45:13.523861 35 tablet_server.h:171] Ignoring ysql catalog version update: new version too old. New: 288, Old: 620
W0613 15:45:14.524498 35 tablet_server.h:171] Ignoring ysql catalog version update: new version too old. New: 288, Old: 620
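
One way to read the repeated warning above: the tablet server already holds catalog version 620, the master keeps reporting 288, and each update is dropped because it would move the version backwards. The Python sketch below only illustrates that reading. The class and method names are hypothetical, the strict "less than" comparison is an assumption, and none of this is YugaByte DB code.

```python
class TabletServerSketch:
    """Hypothetical stand-in for a tablet server's cached catalog version."""

    def __init__(self, catalog_version):
        self.catalog_version = catalog_version
        self.warnings = []

    def on_master_heartbeat(self, new_version):
        """Apply a version reported by the master; reject one that goes backwards."""
        if new_version < self.catalog_version:  # assumed comparison
            self.warnings.append(
                "Ignoring ysql catalog version update: new version too old. "
                f"New: {new_version}, Old: {self.catalog_version}"
            )
            return False
        self.catalog_version = new_version
        return True


# Values taken from the log lines above.
tserver = TabletServerSketch(catalog_version=620)
accepted = tserver.on_master_heartbeat(288)
print(accepted, tserver.catalog_version)
print(tserver.warnings[-1])
```

Under this reading the two sides never converge on their own, which would match queries failing on every attempt.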

@louissheng

I encountered the issue when using the ysqlsh binary from yugabyte-ce-1.2.11.0. I used the ysqlsh in the ~/bin folder on the t-server. Every query fails, even after retrying.
ysqlsh (11.2)
Type "help" for help.
postgres=# \d link;
id | integer | | not null | nextval('link_id_seq1'::regclass)
url | character varying(255) | | not null |
name | character varying(255) | | not null |
description | character varying(255) | | |
rel | character varying(50) | | |
postgres=# select * from link;
ERROR: Query error: Catalog Version Mismatch: A DDL occurred while processing this query. Try Again.

Initially, some SQL workloads were running when the issue occurred. After stopping the workload and reconnecting with ysqlsh, the issue still occurs. The workload is 'SqlInserts' from yb-sample-apps (https://github.com/YugaByte/yb-sample-apps).

yugabyte-ci pushed a commit that referenced this issue Jul 10, 2019
… server shared memory

Summary:
This diff adds a generic `SharedMemorySegment` class, which can be used to create or open anonymous shared memory segments. `SharedMemorySegment` provides an abstraction around platform-specific details of shared memory, provides ownership (RAII), and allows anonymous shared memory to be passed through the `exec` family of system calls (using a file descriptor). Another class, `TServerSharedMemory`, was added as a thin wrapper around a `SharedMemorySegment`, which is used to share memory between a tablet server and any local Postgres backends.

Each tablet server stores its `ysql_catalog_version` in shared memory, which is accessed by the local postgres instance before every query. This allows postgres to refresh the catalog cache before executing a query, rather than checking on the t-server, which fails and retries if a refresh is needed.
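
The check-before-query idea in the paragraph above can be sketched roughly as follows. This is a Python illustration, not the C++ code from the diff: `SharedCatalogVersion` stands in for the shared memory segment, and `BackendSketch` and its methods are hypothetical names.

```python
class SharedCatalogVersion:
    """Stand-in for the ysql_catalog_version a tablet server keeps in shared memory."""

    def __init__(self, value=0):
        self.value = value


class BackendSketch:
    """Hypothetical Postgres backend that compares versions before each query."""

    def __init__(self, shared):
        self.shared = shared
        self.cached_version = shared.value
        self.refreshes = 0

    def refresh_catalog_cache(self, version):
        # A real backend would reload its catalog cache here.
        self.cached_version = version
        self.refreshes += 1

    def execute(self, query):
        # Read the shared version first, so a stale cache is refreshed
        # before the query runs instead of failing on the tablet server.
        current = self.shared.value
        if current != self.cached_version:
            self.refresh_catalog_cache(current)
        return f"executed {query!r} at catalog version {self.cached_version}"


shared = SharedCatalogVersion(value=1)
backend = BackendSketch(shared)
shared.value = 2  # a DDL in another session bumps the version
result = backend.execute("insert into t(n) values(42)")
print(result, backend.refreshes)
```

The point of the design is that the comparison is a local memory read, so it can run before every statement, including ones that never reach a tablet server.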

This change is also required for #869, as certain queries like `SET ROLE` never reach a tablet server. May fix/mitigate #1457 + #1358 as well.

Test Plan:
Added integration tests to `TestPgCacheConsistency.java`. Without the refreshing change made in this diff, most of the added tests fail. Some fail due to failed catalog version checks during a non-retryable query, and others simply slip under the radar and proceed with the query using the stale catalog (giving an incorrect result).

Added shared memory specific unit tests to `shared_mem-test.cc` and `tserver_shared_mem-test.cc`.

Manually tested in the following configurations:
 - Building from source and running in OSX, Centos 7, and Ubuntu 18.04.
 - Building from source and running in Centos 7 and Ubuntu 18.04 docker containers, running on Centos 7.
 - Building a release on Centos 7 and deploying the release in an Ubuntu 18.04 docker container, running on Centos 7 and OSX.
Manual testing was performed primarily to ensure that the system-specific compile-time and run-time checks in `shared_mem.cc` work as expected. The expected (and observed) results were as follows:
 - When running outside docker, OSX creates shared memory files under `/tmp`, Centos 7 creates shared memory files under `/dev/shm`, and Ubuntu 18.04 uses `memfd_create`.
 - When running inside Ubuntu 18.04 docker containers, `/dev/shm` is used when the container runs on Centos 7 (using host kernel), and `memfd_create` is used when running on OSX.
 - Releases built on a version of linux which does not support `memfd_create` still use `memfd_create` when deployed on a version of the kernel which does support the system call.
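
The runtime fallback described in the results above (use `memfd_create` when the running kernel supports it, otherwise an unlinked file under `/dev/shm` or the temporary directory) can be imitated in Python. This is a sketch under assumptions, not a port of `shared_mem.cc`: the function names, the segment name, and the 8-byte version layout are all hypothetical.

```python
import mmap
import os
import struct
import tempfile

VERSION_SIZE = 8  # assumed layout: one unsigned 64-bit catalog version


def create_anonymous_segment(size, name="catalog_version_sketch"):
    """Return (fd, method) for an anonymous, file-descriptor-backed segment."""
    if hasattr(os, "memfd_create"):
        try:
            fd = os.memfd_create(name)
            os.ftruncate(fd, size)
            return fd, "memfd_create"
        except OSError:
            pass  # the running kernel lacks the system call; fall back
    for directory in ("/dev/shm", tempfile.gettempdir()):
        if os.path.isdir(directory) and os.access(directory, os.W_OK):
            fd, path = tempfile.mkstemp(dir=directory)
            os.unlink(path)  # drop the name; the open descriptor keeps the data
            os.ftruncate(fd, size)
            return fd, directory
    raise OSError("no usable location for a shared memory segment")


def write_version(fd, version):
    with mmap.mmap(fd, VERSION_SIZE) as view:
        view[:VERSION_SIZE] = struct.pack("<Q", version)


def read_version(fd):
    with mmap.mmap(fd, VERSION_SIZE) as view:
        return struct.unpack("<Q", view[:VERSION_SIZE])[0]


fd, method = create_anonymous_segment(VERSION_SIZE)
write_version(fd, 620)
stored = read_version(fd)
print(method, stored)
os.close(fd)
```

Python creates descriptors as non-inheritable by default, so handing one to a child through `exec` would also need something like `os.set_inheritable(fd, True)`.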

Reviewers: mihnea, dmitry, sergei, mikhail

Reviewed By: sergei, mikhail

Subscribers: kannan, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D6716
yugabyte-ci pushed a commit that referenced this issue Jul 11, 2019
Summary:
When the catalog config was being changed (e.g. increment metadata version),
in some instances, we only updated the in-memory state, but not the sys catalog.
So, if the master leader changed, the catalog config ended up being reset.

This can lead to a `Catalog Version Mismatch` error that does not get fixed by
retries/cache refreshes -- because the (new) master leader will keep returning an old version
that doesn't match the (logically correct) one that tservers cached from the original master leader.
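
The failure mode in the summary above can be illustrated with a toy model. All names are hypothetical, and the `persist` flag exists only to contrast the buggy path with the fixed one.

```python
class SysCatalogSketch:
    """Stand-in for the persisted sys catalog that survives a leader change."""

    def __init__(self):
        self.version = 0


class MasterLeaderSketch:
    """Hypothetical master leader that loads its state from the sys catalog."""

    def __init__(self, sys_catalog):
        self.sys_catalog = sys_catalog
        self.version = sys_catalog.version  # in-memory copy

    def increment_version(self, persist):
        self.version += 1
        if persist:  # the buggy path skipped this write
            self.sys_catalog.version = self.version


def run_scenario(persist):
    """Bump the version three times, fail over, and compare both views."""
    sys_catalog = SysCatalogSketch()
    leader = MasterLeaderSketch(sys_catalog)
    for _ in range(3):
        leader.increment_version(persist=persist)
    tserver_cached = leader.version  # what tablet servers saw before failover
    new_leader = MasterLeaderSketch(sys_catalog)  # leader change
    return tserver_cached, new_leader.version


buggy = run_scenario(persist=False)
fixed = run_scenario(persist=True)
print(buggy, fixed)
```

In the buggy scenario the new leader reports an older version than the one the tablet servers cached, so retries and cache refreshes cannot make the two agree.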

Test Plan: TestPgCatalogVersionPersistence.java

Reviewers: mikhail, neha, robert, srhickma, bogdan

Reviewed By: bogdan

Subscribers: yql

Differential Revision: https://phabricator.dev.yugabyte.com/D6656
@m-iancu
Contributor

m-iancu commented Jul 11, 2019

Should be fixed now, there were two separate issues fixed in c7b3b0d and ee336d6.
