Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Question]Receive inconsistent query results when connected to Clickhouse(?) #16276

Closed
3 tasks
IKkSssDendi opened this issue Aug 16, 2021 · 15 comments
Closed
3 tasks
Assignees
Labels
data:connect:clickhouse Related to Clickhouse inactive Inactive for >= 30 days need:followup Requires followup

Comments

@IKkSssDendi
Copy link

Using the same Clickhouse database and the same SQL statement, the results obtained by superset and idea are inconsistent; Specifically, superset should only query some data, but I don't know why

Expected results

When I used the pie chart to visualize the data in a Clickhouse database, I found this problem. The queried data was incorrect. To be exact, I only got some data, but I didn't make any settings. I felt that there was a problem somewhere

Screenshots

PIE BAR
REAL RESULT
WRONG

Environment

(please complete the following information):

  • superset version: docker pull apache/superset:master
  • python version: python --version 3.8

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.
@IKkSssDendi IKkSssDendi added the #bug Bug report label Aug 16, 2021
@junlincc junlincc added the data:connect:clickhouse Related to Clickhouse label Aug 16, 2021
@junlincc
Copy link
Member

junlincc commented Aug 16, 2021

Is it only happening in clickhouse data?
What's the expected results from your query? the ones showing in the spreadsheet screenshot? both results from Explore and SQL lab are incorrect? can you double check if any filters, same as in superset was added to your query in excel?

@zhaoyongjie please follow up in the issue, if it's validated by the users with more info come back, let me know.

@junlincc junlincc added need:followup Requires followup and removed #bug Bug report labels Aug 16, 2021
@junlincc junlincc changed the title For the same SQL statement, the results in superset and idea are different [Question]Receive inconsistent query results when connected to Clickhouse(?) Aug 16, 2021
@IKkSssDendi
Copy link
Author

Is it only happening in clickhouse data?
What's the expected results from your query? the ones showing in the spreadsheet screenshot? both results from Explore and SQL lab are incorrect? can you double check if any filters, same as in superset was added to your query in excel?

@zhaoyongjie please follow up in the issue, if it's validated by the users with more info come back, let me know.

So far, this problem has only been found in Clickhouse. The data is migrated from Mysql to Clickhouse. I have used MySQL data for the same query and the same chart, and have not found such problems; Overdue results should look like the second picture

@IKkSssDendi
Copy link
Author

Is it only happening in clickhouse data?
What's the expected results from your query? the ones showing in the spreadsheet screenshot? both results from Explore and SQL lab are incorrect? can you double check if any filters, same as in superset was added to your query in excel?

@zhaoyongjie please follow up in the issue, if it's validated by the users with more info come back, let me know.
In the third figure, the field name 印花税 should be one of the data, but it is regarded as the field name by superset. Is this related to the data storage structure of Clickhouse?

@IKkSssDendi
Copy link
Author

Is it only happening in clickhouse data?
What's the expected results from your query? the ones showing in the spreadsheet screenshot? both results from Explore and SQL lab are incorrect? can you double check if any filters, same as in superset was added to your query in excel?

@zhaoyongjie please follow up in the issue, if it's validated by the users with more info come back, let me know.

same data,same sql.

The first figure queries the data in MySQL

result of superset connect mysql
The second figure queries the data in the Clickhouse
result of superset connect clickhouse

@zhaoyongjie
Copy link
Member

zhaoyongjie commented Aug 17, 2021

@IKkSssDendi Let us double-check this issue

  1. SQLLab treats first data as column name in Clickhous with Superset.
  2. I had observed that the first Pie Chart was missing 2 lines of data, "印花税" and “所得税”.

Can you please confirm which issue is the specific one?

@IKkSssDendi
Copy link
Author

@IKkSssDendi Let us double-check this issue

1. SQLLab treats first data as column name in Clickhous with Superset.

2. I had observed that the first Pie Chart was missing 2 lines of data, "印花税" and “所得税”.

Can you please confirm which issue is the specific one?

Both。Found a problem that not only first data as column name ,lost some data

@IKkSssDendi
Copy link
Author

@IKkSssDendi Let us double-check this issue

1. SQLLab treats first data as column name in Clickhous with Superset.

2. I had observed that the first Pie Chart was missing 2 lines of data, "印花税" and “所得税”.

Can you please confirm which issue is the specific one?
just now,i found that all query of clickhouse , the first data alaways be the colunm name

@zhaoyongjie
Copy link
Member

@IKkSssDendi Let us double-check this issue

1. SQLLab treats first data as column name in Clickhous with Superset.

2. I had observed that the first Pie Chart was missing 2 lines of data, "印花税" and “所得税”.

Can you please confirm which issue is the specific one?

Both。Found a problem that not only first data as column name ,lost some data

Okay, thank you for the report.
Which Clickhouse version do you use? Which SQLAlchemy driver do you use?

@IKkSssDendi
Copy link
Author

@IKkSssDendi Let us double-check this issue

1. SQLLab treats first data as column name in Clickhous with Superset.

2. I had observed that the first Pie Chart was missing 2 lines of data, "印花税" and “所得税”.

Can you please confirm which issue is the specific one?

Both。Found a problem that not only first data as column name ,lost some data

Okay, thank you for the report.
Which Clickhouse version do you use? Which SQLAlchemy driver do you use?

Clickhouse --version 20.12.5.14
SQLAlchemy --version 1.3.24
Clickhouse-sqlalchemy --version 0.1.6
clickhouse-driver --verison 0.2.1

Thank you

@IKkSssDendi
Copy link
Author

@IKkSssDendi Let us double-check this issue

1. SQLLab treats first data as column name in Clickhous with Superset.

2. I had observed that the first Pie Chart was missing 2 lines of data, "印花税" and “所得税”.

Can you please confirm which issue is the specific one?

Both。Found a problem that not only first data as column name ,lost some data

Okay, thank you for the report.
Which Clickhouse version do you use? Which SQLAlchemy driver do you use?

If the problem is solved or the root cause of the problem is found, please let me know。Thank you

@IKkSssDendi
Copy link
Author

@IKkSssDendi Let us double-check this issue

1. SQLLab treats first data as column name in Clickhous with Superset.

2. I had observed that the first Pie Chart was missing 2 lines of data, "印花税" and “所得税”.

Can you please confirm which issue is the specific one?

Both。Found a problem that not only first data as column name ,lost some data

Okay, thank you for the report.
Which Clickhouse version do you use? Which SQLAlchemy driver do you use?

Through my test, I found that it is related to clickhouse-sqlalchemy. When I use sqlalchemy-clickhouse and reduce the version of infi.clickhouse_orm to 1.0.4, the query results are normal. However, the above problems will occur when using the latest clickhouse-sqlalchemy

@save-my-heart
Copy link

When you use clickhouse-sqlalchemy, it is better to use native tcp interface.
Change your sqlalchemy uri to this: clickhouse+native://<user>:<password>@<host>:<port>/<database>[?key=value..] .
If you prefer http interface, see this: clickhouse-sqlalchemy #134

@IKkSssDendi
Copy link
Author

When you use clickhouse-sqlalchemy, it is better to use native tcp interface.
Change your sqlalchemy uri to this: clickhouse+native://<user>:<password>@<host>:<port>/<database>[?key=value..] .
If you prefer http interface, see this: clickhouse-sqlalchemy #134

Thank you ,let me try it

@stale
Copy link

stale bot commented Apr 19, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label Apr 19, 2022
@rusackas
Copy link
Member

Assuming since this thread went quiet that the driver change fixed the situation. We actually now support clickhouse-connect for what it's worth.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
data:connect:clickhouse Related to Clickhouse inactive Inactive for >= 30 days need:followup Requires followup
Projects
None yet
Development

No branches or pull requests

5 participants