[SPARK-31383][SQL][DOC] Clean up the SQL documents in docs/sql-ref*
### What changes were proposed in this pull request?

This PR cleans up the SQL documents in `docs/sql-ref*`.
The main changes are as follows:

 - Fixes incorrect syntax and capitalizes subtitles
 - Adds some DDL queries in `Examples` so that users can run the examples there
 - Makes query output in `Examples` follow the `Dataset.showString` (right-aligned) format
 - Adds/removes spaces, indents, or blank lines to follow the format below:

```
---
license...
---

### Description

Describes what the syntax does.

### Syntax

{% highlight sql %}
SELECT...
    WHERE... // 4 indents after the second line
    ...
{% endhighlight %}

### Parameters

<dl>

  <dt><code><em>Param Name</em></code></dt>
  <dd>
    Param Description
  </dd>
  ...
</dl>

### Examples

{% highlight sql %}
-- It is better that users are able to execute example queries here.
-- So, we prepare test data in the first section if possible.
CREATE TABLE t (key STRING, value DOUBLE);
INSERT INTO t VALUES
    ('a', 1.0), ('a', 2.0), ('b', 3.0), ('c', 4.0);

-- query output has 2 indents and it follows the `Dataset.showString`
-- format (right-aligned).
SELECT * FROM t;
  +---+-----+
  |key|value|
  +---+-----+
  |  a|  1.0|
  |  a|  2.0|
  |  b|  3.0|
  |  c|  4.0|
  +---+-----+

-- Query statements after the second line have 4 indents.
SELECT key, SUM(value)
    FROM t
    GROUP BY key;
  +---+----------+
  |key|sum(value)|
  +---+----------+
  |  c|       4.0|
  |  b|       3.0|
  |  a|       3.0|
  +---+----------+
...
{% endhighlight %}

### Related Statements

 * [XXX](xxx.html)
 * ...
```
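
As a rough illustration of the right-aligned layout described above, the following Python sketch renders rows in that style. It is not Spark's `Dataset.showString` implementation; the helper name `show_string`, its `indent` parameter, and the minimum column width of 3 (inferred from the `+---+` borders in the examples) are assumptions made for this sketch.

```python
def show_string(columns, rows, indent=2, min_width=3):
    """Render rows as a right-aligned text table (illustrative sketch only)."""
    cells = [[str(v) for v in row] for row in rows]
    # Each column is as wide as its longest cell or header,
    # but never narrower than min_width (an assumption of this sketch).
    widths = [
        max([min_width, len(col)] + [len(r[i]) for r in cells])
        for i, col in enumerate(columns)
    ]
    pad = " " * indent  # query output is indented by 2 spaces in the docs
    sep = pad + "+" + "+".join("-" * w for w in widths) + "+"

    def line(values):
        # rjust gives the right-aligned cells used in the examples
        return pad + "|" + "|".join(v.rjust(w) for v, w in zip(values, widths)) + "|"

    out = [sep, line(columns), sep]
    out += [line(r) for r in cells]
    out.append(sep)
    return "\n".join(out)


print(show_string(["key", "value"],
                  [("a", 1.0), ("a", 2.0), ("b", 3.0), ("c", 4.0)]))
```

Calling it with the sample rows from the template prints the same table shown after `SELECT * FROM t;`, including the two-space indent.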

### Why are the changes needed?

Most of the changes in this PR are fairly minor, but I think consistent formats and rules for writing documents are important for long-term maintenance in our community.

### Does this PR introduce any user-facing change?

Yes.

### How was this patch tested?

Manually checked.

Closes #28151 from maropu/MakeRightAligned.

Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Sean Owen <srowen@gmail.com>
maropu authored and srowen committed Apr 13, 2020
1 parent 310bef1 commit 179289f
Showing 78 changed files with 2,203 additions and 2,024 deletions.
10 changes: 0 additions & 10 deletions docs/sql-ref-ansi-compliance.md
@@ -69,18 +69,15 @@ When `spark.sql.ansi.enabled` is set to `true` and an overflow occurs in numeric
{% highlight sql %}
-- `spark.sql.ansi.enabled=true`
SELECT 2147483647 + 1;

java.lang.ArithmeticException: integer overflow

-- `spark.sql.ansi.enabled=false`
SELECT 2147483647 + 1;

+----------------+
|(2147483647 + 1)|
+----------------+
|     -2147483648|
+----------------+

{% endhighlight %}

### Type Conversion
@@ -97,24 +94,20 @@ In future releases, the behaviour of type coercion might change along with the o

-- `spark.sql.ansi.enabled=true`
SELECT CAST('a' AS INT);

java.lang.NumberFormatException: invalid input syntax for type numeric: a

SELECT CAST(2147483648L AS INT);

java.lang.ArithmeticException: Casting 2147483648 to int causes overflow

-- `spark.sql.ansi.enabled=false` (This is a default behaviour)
SELECT CAST('a' AS INT);

+--------------+
|CAST(a AS INT)|
+--------------+
|          null|
+--------------+

SELECT CAST(2147483648L AS INT);

+-----------------------+
|CAST(2147483648 AS INT)|
+-----------------------+
@@ -126,20 +119,17 @@ CREATE TABLE t (v INT);

-- `spark.sql.storeAssignmentPolicy=ANSI`
INSERT INTO t VALUES ('1');

org.apache.spark.sql.AnalysisException: Cannot write incompatible data to table '`default`.`t`':
- Cannot safely cast 'v': StringType to IntegerType;

-- `spark.sql.storeAssignmentPolicy=LEGACY` (This is a legacy behaviour until Spark 2.x)
INSERT INTO t VALUES ('1');
SELECT * FROM t;

+---+
|  v|
+---+
|  1|
+---+

{% endhighlight %}

### SQL Functions
1 change: 0 additions & 1 deletion docs/sql-ref-datatypes.md
@@ -19,7 +19,6 @@ license: |
limitations under the License.
---


Spark SQL and DataFrames support the following data types:

* Numeric types
6 changes: 3 additions & 3 deletions docs/sql-ref-functions-builtin.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,6 @@ license: |

Spark SQL defines built-in functions to use, a complete list of which can be found [here](api/sql/). Among them, Spark SQL has several special categories of built-in functions: [Aggregate Functions](sql-ref-functions-builtin-aggregate.html) to operate on a group of rows, [Array Functions](sql-ref-functions-builtin-array.html) to operate on Array columns, and [Date and Time Functions](sql-ref-functions-builtin-date-time.html) to operate on Date and Time.

* [Aggregate Functions](sql-ref-functions-builtin-aggregate.html)
* [Array Functions](sql-ref-functions-builtin-array.html)
* [Date and Time Functions](sql-ref-functions-builtin-date-time.html)
* [Aggregate Functions](sql-ref-functions-builtin-aggregate.html)
* [Array Functions](sql-ref-functions-builtin-array.html)
* [Date and Time Functions](sql-ref-functions-builtin-date-time.html)
6 changes: 3 additions & 3 deletions docs/sql-ref-functions-udf.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,6 @@ license: |

User-Defined Functions (UDFs) are a feature of Spark SQL that allows users to define their own functions when the system's built-in functions are not enough to perform the desired task. To use UDFs in Spark SQL, users must first define the function, then register the function with Spark, and finally call the registered function. The User-Defined Functions can act on a single row or act on multiple rows at once. Spark SQL also supports integration of existing Hive implementations of UDFs, UDAFs and UDTFs.

* [Scalar User-Defined Functions (UDFs)](sql-ref-functions-udf-scalar.html)
* [User-Defined Aggregate Functions (UDAFs)](sql-ref-functions-udf-aggregate.html)
* [Integration with Hive UDFs/UDAFs/UDTFs](sql-ref-functions-udf-hive.html)
* [Scalar User-Defined Functions (UDFs)](sql-ref-functions-udf-scalar.html)
* [User-Defined Aggregate Functions (UDAFs)](sql-ref-functions-udf-aggregate.html)
* [Integration with Hive UDFs/UDAFs/UDTFs](sql-ref-functions-udf-hive.html)