Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variable support #5918

Open
gregology opened this issue Aug 22, 2016 · 43 comments
Open

Variable support #5918

gregology opened this issue Aug 22, 2016 · 43 comments

Comments

@gregology
Copy link

gregology commented Aug 22, 2016

We're writing adhoc queries for data interested business users. Variable support or alike would be great, something like MySQL;

SET @user_email = 'foo@bar.com';
SELECT *
FROM users
WHERE email_address = @user_email;

Alternatively parameters similar to Impala's implementation;

SELECT *
FROM users
WHERE email_address = $user_email;

In the meantime, I've come up with a hacky solution using WITH statements :(

WITH variables AS (SELECT 'foo@bar.com' AS user_email)
SELECT *
FROM
  users u
  JOIN variables v ON u.email_address = v.user_email
@GodenYao
Copy link

when this is going to get merged and released?

@textboy
Copy link

textboy commented Nov 17, 2017

When could it be ready ?

@talgalili
Copy link
Contributor

(PING)
I'm also interested in this feature. Any chance this could get added in the near future?
Thanks.

@kokosing
Copy link
Contributor

kokosing commented Apr 5, 2018

You can also use PREPARED STATEMENTS, see https://prestodb.io/docs/current/sql/prepare.html

@talgalili
Copy link
Contributor

Hi @kokosing
I tried and couldn't figure out how to use it as an alternative to SET, can you suggest the code to do it?

@kokosing
Copy link
Contributor

kokosing commented Apr 5, 2018

The above example would look like:

PREPARE my_select FROM
SELECT *
FROM users
WHERE email_address = ?;
EXECUTE my_select USING 'foo@bar.com';
EXECUTE my_select USING 'foo2@bar.com';
DEALLOCATE PREPARE my_select;

@talgalili
Copy link
Contributor

Interesting, thanks. Could it be done with more than one argument? (?1, ?2 etc?)

@kokosing
Copy link
Contributor

kokosing commented Apr 5, 2018

Please see the second example from https://docs.starburstdata.com/latest/sql/execute.html

@hockeydave
Copy link

+1.. This would really be great to have.

@ilopezfr
Copy link

+1 Would make my life much easier!

@ankitdixit
Copy link
Contributor

ankitdixit commented Aug 13, 2018

@kokosing The issue with using prepared statements is that it does not serve all usecases.
For example:

  • Let us say I am using presto via jupyter/Zeppelin notebooks and and execution 4 queries in order.
set var_a = 'xxx';
Q1: select var_a from T1;
Q2: select var_a from T2;
and so on...

And i need to see outputs after execution of each of those queries. Since prepared statements support just once command at a time, I will have to write a prepared statement for each one of them, which defeats the purpose.
Please let me know if i am somehow mistaken.
The above solution server the usecase of one query multiple values for params but now multiple queries with one value for params.
I can take this up, if this makes sense.

@yotamoron
Copy link

query variable will surely make my queries much shorter and human readable - +1!

@rongrong
Copy link
Contributor

PostgreSQL supports this on the client side (https://www.postgresql.org/docs/current/app-psql.html). Seems like a reasonable thing to do on presto cli, thoughts?

@alexWhitworth
Copy link

I'd love to be able to do something like below:

WITH CTE_p80 as (
  SELECT APPROX_PERCENTILE(<column>, 0.8) as col_p80
  FROM <table>
)

SET my_var_p80 = SELECT col_p80 FROM CTE_p80;

SELECT 
...
FROM
...
WHERE <colnew>   >= '${hiveconf:my_var_p80}'

@dwgillies
Copy link

I just need variables with numbers, such as

DAYS=180
...
WHERE (ds >= CAST(date_add('day', -DAYS, NOW()) as VARCHAR))

It's extremely paintful to have to edit this number in literally 6 different lines of my script every time PRESTO cluster times out because the query exceeds the standard interactive time limit.

@BrazilForever11
Copy link

+1, it would be very helpful

@rongrong
Copy link
Contributor

I'd propose to support the basic form of this, which seems to be what people are asking for most:
SET variable = constant.

If we can agree on this syntax we can move forward to discuss implementation. Once we settle on how this should be implemented, we can mark this as available-to-pickup and see whether anyone would volunteer to work on this.

@rschlussel
Copy link
Contributor

This generally looks good to me. My only question is whether we should add another word to the syntax to distinguish it from SET SESSION/ SET ROLE/ SET PATH

@talgalili
Copy link
Contributor

talgalili commented Aug 26, 2019 via email

@rschlussel
Copy link
Contributor

set var sounds good to me

@rongrong
Copy link
Contributor

rongrong commented Aug 26, 2019

SQL spec seems to suggest it's just SET variable_name = variable_value. But I'm not that good at reading spec, so if someone else want to double check that would be great.

15.2
This Subclause modifies Subclause 14.5, “”, in ISO/IEC 9075-4.
Function
Assign a value to an SQL variable, SQL parameter, host parameter, or host variable.
Format
<singleton variable assignment> ::=
SET <assignment target> <equals operator> <assignment source>
[ <XML passing mechanism> ]

@talgalili
Copy link
Contributor

talgalili commented Aug 26, 2019 via email

@rongrong
Copy link
Contributor

Do you have a link?

Nope, SQL Spec is only available for money. 🤣

@rschlussel
Copy link
Contributor

(From 9075-4:2011, which is the part on sql/psm, sections 14.4 and 14.5)

The standard also requires you to first declare your variables using

14.4 <SQL variable declaration> Function 
Declare one or more variables. 
Format 
<SQL variable declaration> ::=
  DECLARE <SQL variable name list> <data type> [ <default clause> ]
<SQL variable name list> ::=
  <SQL variable name> [ { <comma> <SQL variable name> }... ]
Syntax Rules 
1) The specified <data type> is the declared type of each variable declared by the <SQL variable declaration>. 
...
General Rules 
1) If <SQL variable declaration> contains <default clause> DC, then let DV be the <default option> contained in DC. Otherwise let DV be <null specification>. Let SV be the variable defined by the <SQL variable declaration>. The following SQL-statement is effectively executed: 
SET SV = DV 
Conformance Rules 
1) Without Feature P002, “Computational completeness”, conforming SQL language shall not contain a <SQL variable declaration>.

Then you have the variable assignment syntax. (There are a bunch of rules about it, but they aren't relevant if we only allow constants)

14.5 <assignment statement>
This Subclause is modified by Subclause 11.16, “<assignment statement>”, in ISO/IEC 9075-10. This Subclause is modified by Subclause 15.2, “<assignment statement>”, in ISO/IEC 9075-14.
Function
Assign a value to an SQL variable, SQL parameter, host parameter, or host variable.
Format
<assignment statement> ::=
    <singleton variable assignment>
  | <multiple variable assignment>
<multiple variable assignment> ::=
  SET <assignment target list> <equals operator> <assigned row>
<assignment target list> ::=
  <left paren> <assignment target> [ { <comma> <assignment target> }... ] <right paren>

  SET <assignment target> <equals operator> <assignment source>
<assignment target> ::=
    <target specification>
  | <modified field reference>
  | <mutator reference>
<assignment source> ::=
    <value expression>
  | <contextually typed source>
<contextually typed source> ::=
    <implicitly typed value specification>
  | <contextually typed row value expression>

So I guess the syntax should be as follows (though it's clunkier than a one line statement)

DECLARE my_var type
SET my_var=value

@talgalili
Copy link
Contributor

talgalili commented Aug 26, 2019 via email

@rongrong
Copy link
Contributor

I was thinking about the DECLARE. It seems not necessary for type since we are only allowing constant, but it would solve the problem of distinguish from other SET statements. So I'm ok either ways. If we ever want to support other type of variables, it makes sense to introduce DECLARE from the beginning.

@talgalili I don't think @ is part of SQL spec so I don't think we need to require @. I didn't find any specification on the format of variable name. There is a mention that it should not be equivalent to any other SQL parameter name or column name. So maybe any identifier would do. What do you think @rschlussel?

@talgalili
Copy link
Contributor

talgalili commented Aug 27, 2019 via email

@rschlussel
Copy link
Contributor

I don't think VAR would be required either way (the grammar should parse fine without it). given that there is a specification for this feature, I think we shouldn't have var in the syntax. I would be okay leaving out the DECLARE and only requiring it later for non-constants, but we should be careful to make sure that plan is feasible.

Implementation-wise, I think this should be similar to prepared statements, since the server doesn't keep any session state. That means it requires changes both on the server side and on the client side.

@rongrong
Copy link
Contributor

I don't know what's the business with "@" in general and how to justify whether to allow it or not. It's not a valid identifier. If we want to allow it, what is a "variable name"?

@rschlussel
Copy link
Contributor

The spec says a variable should be a <basic identifier chain>, which is just an identifier or period separated chain of identifiers. Identifiers only allow letters and numbers, so @ wouldn't be allowed

@carlshan
Copy link

Hi, I'd like to check in on what the current process/consensus by the Presto team is on including variables is. Is there a thread I could follow?

I would love to see variables in Presto. This would save me a lot of headache.

@rongrong
Copy link
Contributor

@kaikalur Any thoughts on the syntax? @carlshan are you interested in working on this?

@talgalili
Copy link
Contributor

@rongrong just to say, I still hope this feature can be introduced.
I come across the need for it at least once a month (I just wrote a bunch of code that I shared with others that needed to be run regularly with different dates (DS) manually, and having this feature would have made the process more robust (since DS was used in several places in the query).

So I do hope someone can take over developing this feature (I would, accept that I have no familiarity with the code architecture or language, so it's not feasible that I'll work on it).

@JuanSuarezMino
Copy link

Is there an ETA for this to be available?

@rongrong
Copy link
Contributor

Nobody is actively working on this. Contribution is much appreciated!

@Myvar
Copy link

Myvar commented Apr 27, 2020

Thanks Rebecca, What you wrote makes sense. Also, while searching I came across also the following text about T-SQL: https://en.m.wikipedia.org/wiki/Transact-SQL The structure there is similar but uses @ before the variable name, i.e.: DECLARE @Myvar VARCHAR(10) SET @Myvar = 'a string' So I'm wondering if it makes sense to allow (but not force) the use of @ before the var name, for people coming from some other SQL flavors.

My GitHub username is Myvar so its very funny that i randomly got mentioned on some random project's issue, this happens why to often. :)

@talgalili
Copy link
Contributor

talgalili commented Apr 27, 2020 via email

@coupcoup
Copy link

+1 would love to have this functionality!

@dingyunxing
Copy link

It's 2022 now and just wonder if this feature is available or not?

@rschlussel
Copy link
Contributor

We don't have anyone working on this, but if you would like to contribute this feature, we'd be happy to review.

@evanvolgas
Copy link

evanvolgas commented Feb 27, 2022 via email

@AlexXQJ
Copy link

AlexXQJ commented Aug 17, 2023

2016 - 2023 LOL

@yosiasz
Copy link

yosiasz commented Sep 5, 2023

time to migrate to influxdb y'all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests