Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How do I join two tables in Superset ? I need to combine the result in order to visualize the data #8645

Closed
ghost opened this issue Nov 25, 2019 · 16 comments
Labels
inactive Inactive for >= 30 days

Comments

@ghost
Copy link

ghost commented Nov 25, 2019

How do I join two tables in Superset ? I need to combine the result in order to visualize the data

@issue-label-bot
Copy link

Issue-Label Bot is automatically applying the label #question to this issue, with a confidence of 0.98. Please mark this comment with 👍 or 👎 to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

@suddjian
Copy link
Member

You can use SQL Lab (+new button > SQL Query) to write any sql you want, and then hit the "Explore" button to make a chart from your query.

@mistercrunch
Copy link
Member

You may also want to create a view a view in your database and use this view as a datasource. It's effectively the same as the solution above, except the view is stored in the database and can/could be reused outside of Superset

@barmic12
Copy link

barmic12 commented Nov 26, 2019

I'm not sure that I understand SQL Lab View correctly. It doesn't create SQL view, right? I checked that in my database, but I didn't find them.
So, why don't you mention about SQL Lab View in FAQ?
https://superset.incubator.apache.org/faq.html#can-i-query-join-multiple-tables-at-one-time

It took me a long time to discover that with a button Vizualize I can create graph from multiple tables. Before that I was sure that I had to create views in SQL to achieve this.

@ghost
Copy link
Author

ghost commented Nov 27, 2019 via email

@stale
Copy link

stale bot commented Jan 26, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label Jan 26, 2020
@mistercrunch
Copy link
Member

mistercrunch commented Jan 28, 2020

The "Explore" button in SQL Lab is effectively creating a Superset-managed view. The "datasource" object that gets created stores that SQL in its definition, and runs queries against it the same way that it would against a table, but as a subquery.

We did a lot of design work as part of SIP-34 that should make this flow much more clear.

@stale stale bot removed the inactive Inactive for >= 30 days label Jan 28, 2020
@ManZzup
Copy link

ManZzup commented Mar 4, 2020

Is this still open? like to work on this since it aligns with a current requirement we are having

@thunderbird57
Copy link

thunderbird57 commented Mar 9, 2020

Is this still open? like to work on this since it aligns with a current requirement we are having

I don't think we need to do something here. There are 2 options

  • 1 Join in SQL Lab - It will be stored in Superset internal database and the view will not be available outside Superset.
  • 2 Create a view in source database by joining those tables and use the view inside Superset

Number 2 is preffered option mostly as it pushes data prep to Database itself and make it available for other BI platforms too.

@stale
Copy link

stale bot commented May 9, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. For admin, please label this issue .pinned to prevent stale bot from closing the issue.

@stale stale bot added the inactive Inactive for >= 30 days label May 9, 2020
@stale stale bot closed this as completed May 16, 2020
@linganmm
Copy link

Can the 2nd option be done with superset? Or does the user have to log on the backend database and create a view on it?

@asemprini
Copy link

So... if I have a fact table with dimensions, I have to create a super view with all the fields I will be using? It's the only way?

@m-ocean-it
Copy link

@linganmm:

Can the 2nd option be done with superset? Or does the user have to log on the backend database and create a view on it?

You can do that right from Superset! You just need to make sure the "Allow CREATE VIEW AS" is enabled for that specific database.

@ttauzo
Copy link

ttauzo commented Mar 18, 2024

it is still not an efficient solution because:

  1. you should be able to join by click all fact and dimension tables bases on relationships (why to use sql for sth that can be clickable in a tool for business users)
  2. still if you join all tables you can apply filter either:
  • on technical ids that use db index but are not understandable for business user or
  • on "business name" but then supserset will run a query with condition on a column without index (you would have to duplicate indexes and have indexes on both technical and business columns which is waste of resources)
    Techncial and business "layer" should be separated

@mistercrunch
Copy link
Member

This write up explains a bit more where Superset stands:
https://preset.io/blog/dataset-centric-visualization/

@rusackas
Copy link
Member

I'll add that there are many tools and techniques to do this lower in the stack, including semantic layers like Cube or DB-like tools such as Trino. It might also be worth mentioning here that Superset does have an experimental in-memory "meta database" that one can use to join... but I wouldn't use it on large joins... it's more for lighter tasks like CSV-driven workflows.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
inactive Inactive for >= 30 days
Projects
None yet
Development

No branches or pull requests

10 participants