Problem with query result on db different than the default #12489

natalia-hw · 2020-07-14T10:39:15Z

A specific query returns a wrong count when it is run in a database other than default on our ClickHouse cluster (sharded and replicated).

Describe the bug
We have two DBs: default and test.
In each DB we have the same tables:

foo (ReplicatedMergeTree)
bar (Distributed on bar_shard) and bar_shard (ReplicatedMergeTree)

We fill tables with the same data:

for foo, separately on each shard
for bar, through the Distributed table
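
For readers without access to the linked repository, the layout described above might look roughly like the sketch below. The cluster name `my_cluster`, the `{shard}` and `{replica}` macros, the ZooKeeper paths, the `UInt64` column types, and the `rand()` sharding key are all assumptions made for illustration; only the table names, engines, and the two column names come from this report. The real definitions are in the reproduction scripts.

```sql
-- Hypothetical layout for the `test` database; `default` would mirror it.
-- Assumed: cluster `my_cluster`, macros {shard}/{replica}, UInt64 columns.
CREATE DATABASE IF NOT EXISTS test ON CLUSTER my_cluster;

-- Local replicated table, filled separately on each shard.
CREATE TABLE test.foo ON CLUSTER my_cluster
(
    something UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/test/foo', '{replica}')
ORDER BY something;

-- Local replicated table that stores each shard's part of `bar`.
CREATE TABLE test.bar_shard ON CLUSTER my_cluster
(
    something_else UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/test/bar_shard', '{replica}')
ORDER BY something_else;

-- Distributed table that routes reads and writes to `bar_shard` on every shard.
CREATE TABLE test.bar ON CLUSTER my_cluster AS test.bar_shard
ENGINE = Distributed(my_cluster, test, bar_shard, rand());
```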

We run the query:
SELECT count(something_else) FROM bar WHERE something_else IN (SELECT something FROM foo);

On DB default, the resulting count is as expected.
On DB test, the resulting count equals the number of rows in bar_shard (not in bar, as we expect).

Maybe there is some misunderstanding on our side, but with DB default it works just as we expect; we face the issue only with a DB other than default.

How to reproduce
I've scripted the reproduction steps on the LTS version 20.3.10.75. Please see https://github.com/nfpp/another_db_issue_repro. It creates the DBs and tables, inserts some data, and performs the queries.

Expected behavior
The query returns the same result regardless of which DB is in use.

Additional context
To us, it looks like part of the query is evaluated in the default DB on the shards (without a default DB in the cluster, the query fails).
As a temporary workaround, we replace foo with "{db_name}".foo in the query.
It is worth mentioning that with GLOBAL IN the result is correct, but by design we place that data on each shard separately.
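
The two variants mentioned above can be sketched as follows, assuming the non-default database is named `test` as in this report. Neither changes how the server resolves names; both only avoid relying on an unqualified table name inside the subquery that is sent to the shards.

```sql
-- Variant 1: qualify the table inside the subquery with its database,
-- so a shard does not resolve `foo` against its own default database.
SELECT count(something_else)
FROM bar
WHERE something_else IN (SELECT something FROM test.foo);

-- Variant 2: GLOBAL IN. The subquery runs once on the initiating server
-- and its result is sent to every shard as a temporary table.
-- Note: it then reads only the `foo` data local to the initiating server.
SELECT count(something_else)
FROM bar
WHERE something_else GLOBAL IN (SELECT something FROM foo);
```

The second variant trades the per-shard lookup for a single lookup on the initiator, so it is equivalent only when every shard holds the same contents in `foo`.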

den-crane · 2020-07-14T12:58:55Z

Probably the same as #10471.
The default database is used on the shards.
Technically it's not a bug, because this behavior has existed from the beginning, though I believe it's not documented.

use system

select * from remote(two_shards,system,one) where dummy in (select dummy from one);
(version 19.13.7): DB::Exception: Table default.one doesn't exist..


select * from remote(two_shards,system,one) where dummy in (select dummy from system.one);
┌─dummy─┐
│     0 │
└───────┘
┌─dummy─┐
│     0 │
└───────┘

select * from remote(two_shards,system,one) where dummy global in (select dummy from one);
┌─dummy─┐
│     0 │
└───────┘
┌─dummy─┐
│     0 │
└───────┘

natalia-hw · 2020-07-27T06:15:51Z

@den-crane I understand, but from our point of view this looks like a critical bug, because the query is run in the wrong database. Especially when some kind of ORM is in use, it is really inconvenient to replace the plain table name with the database-qualified table name. Are you planning to change this behavior? If this is a duplicate, I can close it and watch the appropriate issue.

alexey-milovidov · 2020-11-18T17:17:15Z

It's not obvious how to solve this, because there is no guarantee that the database exists on the remote server.

alexey-milovidov · 2020-11-18T17:19:56Z

Maybe we can always substitute the current database when sending a query to a remote server?
It will break some queries but will look more logical.

dgzdot · 2021-01-20T02:57:20Z

@alexey-milovidov Are there any plans for this issue?

alexey-milovidov · 2021-06-13T23:29:47Z

No. It is in the "discussion" stage.

filimonov · 2022-12-07T08:16:59Z

One more option is to pass the current database with the query and make the target server look first in that database, and then in its default database.

natalia-hw added the bug Confirmed user-visible misbehaviour in official release label Jul 14, 2020

den-crane added the comp-distributed Distributed tables label Jul 14, 2020

alexey-milovidov added the st-discussion The story requires discussion /research / expert help / design & decomposition before will be taken label Nov 18, 2020

den-crane mentioned this issue Jan 19, 2021

"Table default.xxx doesn't exist.." Exception occurs when both distributed and local tables are involved #19278

Closed

alexey-milovidov added unexpected behaviour and removed bug Confirmed user-visible misbehaviour in official release labels Jun 13, 2021

den-crane mentioned this issue Jul 16, 2022

databases are not resolved in left join on a multi nodes cluster #39278

Closed

msmans mentioned this issue Oct 6, 2022

Clickhouse: Table default.person_distinct_id2 doesn't exist PostHog/posthog#12117

Closed

guidoiaquinti mentioned this issue Oct 24, 2022

ClickHouse: specify 'default_database' PostHog/charts-clickhouse#607

Merged
