Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom SQL queries with filters #1996

Closed
merkylove opened this issue Jan 17, 2017 · 10 comments · Fixed by #5547
Closed

Custom SQL queries with filters #1996

merkylove opened this issue Jan 17, 2017 · 10 comments · Fixed by #5547

Comments

@merkylove
Copy link

merkylove commented Jan 17, 2017

Is it possible to specify custom SQL query like using SQL Lab but with an opportunity to apply interactive FilterBox to it?

It would be really nice in tasks where you have to visualize aggregative metrics.
For example, if you have smth like that:
user1, log1,
user2, log2,
...
userN, logM

And you want to plot a distribution of counters of logs for users in a following way:

SELECT counter, COUNT(*)
FROM
(SELECT user, COUNT(*) AS counter 
FROM table_name 
WHERE conditions_from_interactive_filter_box 
GROUP BY user
) AS foo
GROUP BY counter;

If I am not mistaken you can not do it directly from Superset slices but it is a very common case.

@tianlinzx
Copy link

I am also interesting on this request. Otherwise, a tableview based on the SQL query without Filtering will do scan on the whole table we are querying. It will be a large cost for the database.

@mistercrunch
Copy link
Member

Well so the tables defined with SQL, or "Superset Views" if we can call them that, do support templating, and from the template it's possible to read the form_data. It's not super well documented but you can certainly experiment with it.

Here's what we squeeze in as template args:
https://github.com/apache/incubator-superset/blob/master/superset/connectors/sqla/models.py#L370

We did use that to make sure partition pruning was occurring in complex use cases, specifically on the date which was the partition column.

You can imagine a table defined as SQL like

SELECT  * FROM tbl
WHERE 
    ds >= '{{ from_dttm.isoformat() }}' AND
    ds < '{{ to_dttm.isoformat() }}' 

It's also possible to do conditional logic and much more in Jinja (templating language) so you could do something like this (this is not tested at all, would have to be teaked...)

{% set country = form_data.get('filters')[0].get('val')[0] %}
SELECT  * FROM tbl
WHERE 
    0=0
{% if country %}
    AND country = '{{ country }}'
{% endif %}

@gdeepali
Copy link

Is it possible to get current logged in user using jinja templates?
For example, I want to do something like

SELECT * FROM tbl
WHERE
uername = {{current_username}}

@mistercrunch
Copy link
Member

more like:

SELECT * FROM tbl
WHERE
uername = '{{ current_username() }}'

@pranavoyo
Copy link

@mistercrunch, I am getting this error in sql lab when running {{ current_username() }},

Template rendering failed: '_AppCtxGlobals' object has no attribute 'user'

Please help!

@gwbx
Copy link

gwbx commented May 2, 2018

I tried to access "time_grain_sqla" (or any other key) in form_data dict in Explore but I am afraid that it is empty.

I think it was added at some point:
a0ddbb9#diff-f25a00842d88e3e5cc4e4bcc822d40f7

but removed since i don't know why:

  def get_sqla_query(  # sqla
           self,
           groupby, metrics,
           granularity,
           from_dttm, to_dttm,
           filter=None,  # noqa
           is_timeseries=True,
           timeseries_limit=15,
           timeseries_limit_metric=None,
           row_limit=None,
           inner_from_dttm=None,
           inner_to_dttm=None,
           orderby=None,
           extras=None,
           columns=None,
           order_desc=True,
           prequeries=None,
           is_prequery=False,
       ):
       """Querying any sqla table from this common interface"""
       template_kwargs = {
           'from_dttm': from_dttm,
           'groupby': groupby,
           'metrics': metrics,
           'row_limit': row_limit,
           'to_dttm': to_dttm,
           'filter': filter,
           'columns': {col.column_name: col for col in self.columns},

       }

So what I get is the empty dict from the constructor:

        self.context = {
            'url_param': url_param,
            'current_user_id': current_user_id,
            'current_username': current_username,
            'form_data': {},
        }

https://github.com/apache/incubator-superset/blob/510ae84b3b3c3880393b84ba1db6d9979ecb63e0/superset/jinja_context.py#L92

@ksaagariconic
Copy link
Contributor

Hello there - We had a huge stack of our reports built using 0.20.4 which let us use custom filters that were not part of the data source in our queries, and now when we upgrade to 0.24.3 or higher we find that these custom filters are rejected or made blank at the jinja_parser stage.

Question - Is there a reason this change was done to exclude external custom filters, because we are wondering if there has been a more elegant and cleaner way to do this?

We are concerned that this feature might have been completely deprecated which sometimes aggravates our use cases by pulling a lot more data at every pull and making all the dashboards slower. Can someone please help? Thank you!

@villebro
Copy link
Member

villebro commented Aug 1, 2018

@ksaagariconic Could you elaborate with an example what the custom filters looked like, and where they were defined? I believe form_data was deprecated a few versions ago, but some of those fields were made available as own variables, e.g. groupby and filter.

@ksaagariconic
Copy link
Contributor

Thanks for the instant response @villebro - we had a lot of queries structured with custom filter predicates like this,

SELECT country, sum(population) as population 
FROM 
tbl
WHERE 0=0
{% if  form_data.get('filters')[0].get('val')[0] %}
    AND continent = '{{  form_data.get('filters')[0].get('val')[0] }}'
{% endif %}
GROUP BY 1

What's happening now is that this predicate is completely getting ignored. It shows up in the frontend as "filter" but doesn't show up in the queryobj or eventual query.

We tried this alternative, to use the new filter context, but it says filter is not defined.

SELECT country, sum(population) as population 
FROM 
tbl
WHERE 0=0
{% if  filter[0].get('val')[0] %}
    AND continent = '{{  filter[0].get('val')[0] }}'
{% endif %}
GROUP BY 1

What however works is if we have a column by the same name as the filter, e.g. a filter called country with the same variable name works dynamically - but as a predicate on the super query, and not on the actual query in itself.

Does this make it clearer? Thanks for offering to help

@thuylevn
Copy link

thuylevn commented Mar 1, 2019

Hi all,
I am using superset. Do you know how to Get value of Grain in filter from a query ( as a parameter {{ ??? }} )?

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants