Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Columns not properly resolving on clusterAllReplicas + merge + INNER JOIN + Analyzer #63105

Open
jrdi opened this issue Apr 29, 2024 · 0 comments
Labels
analyzer Issues and pull-requests related to new analyzer unexpected behaviour

Comments

@jrdi
Copy link
Contributor

jrdi commented Apr 29, 2024

First of all, I recognize the query is a bit weird, but I guess it's worth sharing the issue and deciding later whether it should be tackled.

You can reproduce the setup with the following snippet:

DROP DATABASE IF EXISTS analyzer ON CLUSTER tinybird;
CREATE DATABASE analyzer ON CLUSTER tinybird;

CREATE TABLE analyzer.ids ON CLUSTER tinybird (id UUID, whatever String) Engine=MergeTree ORDER BY tuple();
CREATE TABLE analyzer.data ON CLUSTER tinybird (id UUID, event_time DateTime, status String) Engine=MergeTree ORDER BY tuple();

INSERT INTO analyzer.ids VALUES ('a1451105-722e-4fe7-bfaa-65ad2ae249c2', 'whatever');
INSERT INTO analyzer.data VALUES ('a1451105-722e-4fe7-bfaa-65ad2ae249c2', now(), 'CREATED');

When running the following query with the analyzer (in master) you get this error:

SELECT
    id,
    whatever
FROM analyzer.ids AS l
INNER JOIN clusterAllReplicas(tinybird, merge(analyzer, 'data*')) AS s ON l.id = s.id
WHERE (status IN ['CREATED', 'CREATING'])
ORDER BY event_time DESC
SETTINGS allow_experimental_analyzer = 1;

Received exception from server (version 24.4.1):
Code: 8. DB::Exception: Received from localhost:9000. DB::Exception: Cannot find column `__table2.event_time` in source stream, there are only columns: [id, event_time, status]. (THERE_IS_NO_COLUMN)

I have tried different approaches to see if I can get those columns properly resolved:

  • Removing table aliases: FAILS
SELECT
    id,
    whatever
FROM analyzer.ids
INNER JOIN clusterAllReplicas(tinybird, merge(analyzer, 'data*')) USING id
WHERE (status IN ['CREATED', 'CREATING'])
ORDER BY event_time DESC
SETTINGS allow_experimental_analyzer = 1;

Received exception from server (version 24.4.1):
Code: 8. DB::Exception: Received from localhost:9000. DB::Exception: Cannot find column `__table2.event_time` in source stream, there are only columns: [id, event_time, status]. (THERE_IS_NO_COLUMN)
  • Using LEFT JOIN: WORKS
SELECT
    id,
    whatever
FROM analyzer.ids AS l
LEFT JOIN clusterAllReplicas(tinybird, merge(analyzer, 'data*')) AS s ON l.id = s.id
WHERE (status IN ['CREATED', 'CREATING'])
ORDER BY event_time DESC
SETTINGS allow_experimental_analyzer = 1;
  • Removing the merge: WORKS
SELECT
    id,
    whatever
FROM analyzer.ids
INNER JOIN clusterAllReplicas(tinybird, analyzer.data) USING id
WHERE (status IN ['CREATED', 'CREATING'])
ORDER BY event_time DESC
SETTINGS allow_experimental_analyzer = 1;
  • Removing the clusterAllReplicas: WORKS
SELECT
    id,
    whatever
FROM analyzer.ids AS l
INNER JOIN merge(analyzer, 'data*') AS s ON l.id = s.id
WHERE (status IN ['CREATED', 'CREATING'])
ORDER BY event_time DESC
SETTINGS allow_experimental_analyzer = 1;
  • Removing the analyzer: WORKS
SELECT
    id,
    whatever
FROM analyzer.ids AS l
INNER JOIN clusterAllReplicas(tinybird, merge(analyzer, 'data*')) AS s ON l.id = s.id
WHERE (status IN ['CREATED', 'CREATING'])
ORDER BY event_time DESC
SETTINGS allow_experimental_analyzer = 0;

I can't figure out why the combinations of these three are making the query fail, but it's related somehow.

@Algunenano Algunenano added the analyzer Issues and pull-requests related to new analyzer label Apr 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
analyzer Issues and pull-requests related to new analyzer unexpected behaviour
Projects
None yet
Development

No branches or pull requests

2 participants