[SPARK-31465][SQL][DOCS] Document Literal in SQL Reference #28237

Closed
wants to merge 12 commits into from

Contributor

huaxingao commented Apr 17, 2020

What changes were proposed in this pull request?

Document Literal in SQL Reference

Why are the changes needed?

Make SQL Reference complete

Does this PR introduce any user-facing change?

Yes
[Screenshots: 11 images, "Screen Shot 2020-04-22 at 8 50 04 PM" through "Screen Shot 2020-04-22 at 8 54 12 PM"]

How was this patch tested?

Manually build and check

SparkQA commented Apr 17, 2020

Test build #121391 has finished for PR 28237 at commit 71353a0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@huaxingao
Contributor Author

cc @maropu

<dt><code><em>Format:</em></code></dt>
<dd>
<code>
INTERVAL value { YEAR | MONTH | DAY | HOUR | MINUTE | SECOND | MILLISECOND | MICROSECOND }
Contributor

should be {value unit}+

Contributor

and we also support INTERVAL str unit TO unit.
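
To make the `{value unit}+` suggestion concrete, here is a rough Python sketch of the multi-unit form. It is not Spark's parser: the function name `parse_multi_unit_interval` is hypothetical, accepting plural unit names is an assumption, and the `INTERVAL str unit TO unit` form is not covered.

```python
# Units taken from the format line under review; plural spellings
# (e.g. DAYS) are accepted here as an assumption.
UNITS = {"YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "SECOND",
         "MILLISECOND", "MICROSECOND"}


def parse_multi_unit_interval(text):
    """Parse 'INTERVAL {value unit}+' into a list of (value, unit) pairs."""
    tokens = text.split()
    if not tokens or tokens[0].upper() != "INTERVAL":
        raise ValueError("literal must start with INTERVAL")
    body = tokens[1:]
    # The '+' in {value unit}+ means at least one complete pair.
    if not body or len(body) % 2 != 0:
        raise ValueError("expected one or more 'value unit' pairs")
    pairs = []
    for value, unit in zip(body[0::2], body[1::2]):
        name = unit.upper()
        if name not in UNITS and name.endswith("S"):
            name = name[:-1]  # normalise a plural such as HOURS to HOUR
        if name not in UNITS:
            raise ValueError("unknown unit: " + unit)
        pairs.append((int(value), name))
    return pairs
```

A single `value unit` pair is the smallest valid input, which is why the original one-unit format is a special case of the repeated form.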

SparkQA commented Apr 17, 2020

Test build #121431 has finished for PR 28237 at commit 7628fb8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Apr 18, 2020

Test build #121432 has finished for PR 28237 at commit c8c768f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Apr 18, 2020

Test build #121435 has finished for PR 28237 at commit 2059414.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Apr 18, 2020

Test build #121438 has finished for PR 28237 at commit 39862ae.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Apr 20, 2020

Test build #121549 has finished for PR 28237 at commit 0809b1e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


decimal literals:
{% highlight sql %}
{ decimal_digits { [ BD ] | [ exponent BD ] } | digit [ ... ] [ exponent ] BD }
Contributor

do we really need the outer-most {}?

Contributor Author

removed


double literals:
{% highlight sql %}
{ decimal_digits { D | exponent [ D ] } | digit [ ... ] { exponent [ D ] | [ exponent ] D }
Contributor

ditto

Contributor Author

removed

<dl>
<dt><code><em>BD</em></code></dt>
<dd>
Case insensitive, indicates <code>BIGDECIMAL</code>, which is an arbitrary-precision signed decimal number.
Contributor

there is no BIGDECIMAL in the type system, we should say DECIMAL.

And the precision is not arbitrary. It depends on how many digits the number has.

Contributor Author

I guess I am trying to say that user can have any precision he wants. The data types doc has DecimalType: Represents arbitrary-precision signed decimal numbers.

I changed to indicates DECIMAL, with the total number of digits as precision and the number of digits to right of decimal point as scale.
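
As a rough illustration of "total number of digits as precision, digits to the right of the decimal point as scale", the following Python sketch derives both from the literal text using the standard `decimal` module. It is not Spark code: the helper name `precision_and_scale` is hypothetical, and the handling of exponents and of values such as `0.001` (precision never reported below scale) is an assumption.

```python
from decimal import Decimal


def precision_and_scale(literal):
    """Return (precision, scale) for a literal such as '12.578BD'."""
    text = literal.strip()
    # Drop the case-insensitive BD suffix if present.
    if text.upper().endswith("BD"):
        text = text[:-2]
    _sign, digits, exponent = Decimal(text).as_tuple()
    # Digits to the right of the decimal point.
    scale = max(-exponent, 0)
    # Total digits; a positive exponent adds trailing zeros (1E2 -> 100).
    precision = len(digits) + max(exponent, 0)
    # Assumption: precision is never smaller than scale (e.g. 0.001).
    return max(precision, scale), scale


print(precision_and_scale("12.578BD"))  # (5, 3)
```

The point the review makes is visible here: the precision is fixed by the digits written in the literal, not chosen freely.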

#### Parameters

<dl>
<dt><code><em>c</em></code></dt>
Contributor

there is no c, how about y/m/d?

Contributor Author

I will remove this format 'yyyy-[m]m-[d]d[T]c[...]' . I don't think this is a recommended format. Seems we don't check the stuff after yyyy-[m]m-[d]d. Anything put after that is legal but will be cut off.

<dl>
<dt><code><em>c</em></code></dt>
<dd>
One character from the character set.
Contributor

One character from '0' to '9'.

|1997-01-01|
+----------+

SELECT TIMESTAMP '1997-01' AS col;
Contributor

it's not a date literal.

Contributor Author

fixed

'yyyy-[m]m-[d]d[T][h]h:[m]m[:]' |
'yyyy-[m]m-[d]d[T][h]h:[m]m:[s]s[.]' |
'yyyy-[m]m-[d]d[T][h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]' |
'[T][h]h:' |
Contributor

It's allowed but not recommended to skip the date part in a timestamp literal. Shall we not document it?

Contributor

Then we can simplify the doc a lot

'yyyy-[m]m-[d]d' |
            'yyyy-[m]m-[d]d ' |
            'yyyy-[m]m-[d]d[T][h]h[:]' |
            'yyyy-[m]m-[d]d[T][h]h:[m]m[:]' |
            'yyyy-[m]m-[d]d[T][h]h:[m]m:[s]s[.]' |
            'yyyy-[m]m-[d]d[T][h]h:[m]m:[s]s.[ms][ms][ms][us][us][us][zone_id]'


A numeric literal is used to specify a fixed or floating-point number.

#### Integer Literal
Member

Integral?

+---+
{% endhighlight %}

#### Non-integer Literals
Member

Fractional?

</li>
<li>Region-based zone IDs in the form <code>area/city</code>, such as <code>Europe/Paris</code></li>
</ul>
Note: defaults to system time-zone if <code>zone_id</code> is not specified.
Contributor

also mention the default value of hour, minute, second

Contributor

cloud-fan Apr 22, 2020

and timezone should default to the session local timezone (set via spark.sql.session.timeZone).
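
To show what documenting the defaults would amount to, here is a small Python sketch that fills in missing fields of a partial timestamp string. It is not Spark's implementation: the helper name `fill_timestamp_defaults` is hypothetical, and the defaults (month and day `01`; hour, minute and second `0`) are assumptions to be checked against Spark before they go into the doc. Fractional seconds and `zone_id` are left out, so the result is a naive `datetime` and the session time zone default is not modeled.

```python
import re
from datetime import datetime

# yyyy[-[m]m[-[d]d[(space or T)[h]h[:[m]m[:[s]s]]]]]
_PARTIAL_TS = re.compile(
    r"^(\d{4})"
    r"(?:-(\d{1,2})"
    r"(?:-(\d{1,2})"
    r"(?:[ T](\d{1,2})"
    r"(?::(\d{1,2})"
    r"(?::(\d{1,2}))?)?)?)?)?$"
)


def fill_timestamp_defaults(text):
    """Parse a partial timestamp string, filling missing fields with defaults."""
    match = _PARTIAL_TS.match(text.strip())
    if match is None:
        raise ValueError("unsupported timestamp string: " + text)
    year, month, day, hour, minute, second = match.groups()
    return datetime(
        int(year),
        int(month) if month is not None else 1,    # assumed default: 01
        int(day) if day is not None else 1,        # assumed default: 01
        int(hour) if hour is not None else 0,      # assumed default: 0
        int(minute) if minute is not None else 0,  # assumed default: 0
        int(second) if second is not None else 0,  # assumed default: 0
    )


print(fill_timestamp_defaults("1997-01"))  # 1997-01-01 00:00:00
```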

<ul>
<li>Z - Zulu time zone UTC+0</li>
<li>+|-[h]h:[m]m</li>
<li>A short id:
Contributor

This is also not recommended as it's ambiguous, let's not document it.

#### Syntax
{% highlight sql %}
{ INTERVAL interval_value interval_unit [ interval_value interval_unit ... ] |
INTERVAL ' [ INTERVAL ] interval_value interval_unit [ interval_value interval_unit ... ] ' |
Contributor

let's remove [ INTERVAL ] as it's not recommended to specify it.

<dl>
<dt><code><em>interval_string_value</em></code></dt>
<dd>
SQL standard year-month/date-time string.
Contributor

year-month/day-time interval string

SparkQA commented Apr 22, 2020

Test build #121638 has finished for PR 28237 at commit 59cef7d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Apr 22, 2020

Test build #121639 has finished for PR 28237 at commit 968338e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

|1997-01-31 09:26:56.123|
+-----------------------+

SELECT TIMESTAMP '1997-01-31 09:26:56.66666666CST' AS col;
Contributor

let's not use CST as short id is not documented. How about UTC+8:00?

|-1 hours -57 minutes|
+--------------------+

SELECT INTERVAL 'INTERVAL 1 YEAR 2 DAYS 3 HOURS';
Contributor

let's omit the INTERVAL in the string

|1 years 2 months 25 days 5 hours 6 minutes 7.008009 seconds|
+-----------------------------------------------------------+

SELECT INTERVAL '20 15:40:32.99899999' DAY TO SECOND AS col;
Contributor

let's have an example for year-month interval as well.

maropu closed this in 03fe9ee Apr 23, 2020
maropu pushed a commit that referenced this pull request Apr 23, 2020
### What changes were proposed in this pull request?
Document Literal in SQL Reference

### Why are the changes needed?
Make SQL Reference complete

### Does this PR introduce any user-facing change?
Yes
<img width="1049" alt="Screen Shot 2020-04-22 at 8 50 04 PM" src="https://user-images.githubusercontent.com/13592258/80057912-9ecb0c00-84dc-11ea-881e-1415108d674f.png">

<img width="1050" alt="Screen Shot 2020-04-22 at 8 50 29 PM" src="https://user-images.githubusercontent.com/13592258/80057917-a12d6600-84dc-11ea-8884-81f2a94644d5.png">

<img width="1050" alt="Screen Shot 2020-04-22 at 8 50 54 PM" src="https://user-images.githubusercontent.com/13592258/80057922-a4c0ed00-84dc-11ea-9857-75db50f7b054.png">

<img width="1050" alt="Screen Shot 2020-04-22 at 8 51 15 PM" src="https://user-images.githubusercontent.com/13592258/80057927-a7234700-84dc-11ea-9124-45ae1f6143fd.png">

<img width="1050" alt="Screen Shot 2020-04-22 at 8 51 44 PM" src="https://user-images.githubusercontent.com/13592258/80057932-ab4f6480-84dc-11ea-8393-cf005af13ce9.png">

<img width="1050" alt="Screen Shot 2020-04-22 at 8 52 03 PM" src="https://user-images.githubusercontent.com/13592258/80057936-ad192800-84dc-11ea-8d78-9f071a82f1df.png">

<img width="1050" alt="Screen Shot 2020-04-22 at 8 52 28 PM" src="https://user-images.githubusercontent.com/13592258/80057940-b0141880-84dc-11ea-97a7-f787cad0ee03.png">

<img width="1050" alt="Screen Shot 2020-04-22 at 8 53 14 PM" src="https://user-images.githubusercontent.com/13592258/80057945-b30f0900-84dc-11ea-985f-c070609e2329.png">

<img width="1050" alt="Screen Shot 2020-04-22 at 8 53 34 PM" src="https://user-images.githubusercontent.com/13592258/80057949-b5716300-84dc-11ea-9452-3f51137fe03d.png">

<img width="1050" alt="Screen Shot 2020-04-22 at 8 53 56 PM" src="https://user-images.githubusercontent.com/13592258/80057957-b904ea00-84dc-11ea-8b12-a6f00362aa55.png">

<img width="1049" alt="Screen Shot 2020-04-22 at 8 54 12 PM" src="https://user-images.githubusercontent.com/13592258/80057962-bacead80-84dc-11ea-94da-916b1d1c1756.png">

### How was this patch tested?
Manually build and check

Closes #28237 from huaxingao/literal.

Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
(cherry picked from commit 03fe9ee)
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
Member

maropu commented Apr 23, 2020

Thanks, all! Merged to master/3.0.

Contributor Author

huaxingao commented Apr 23, 2020

Thank you all for the help!
Actually I need to address a couple of more comments. Sorry I was not fast enough. I will have a follow up in a few minutes.

7 participants