diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md index 7543180f5db35..3a0c2c123d81b 100644 --- a/docs/sql-ref-ansi-compliance.md +++ b/docs/sql-ref-ansi-compliance.md @@ -69,18 +69,15 @@ When `spark.sql.ansi.enabled` is set to `true` and an overflow occurs in numeric {% highlight sql %} -- `spark.sql.ansi.enabled=true` SELECT 2147483647 + 1; - java.lang.ArithmeticException: integer overflow -- `spark.sql.ansi.enabled=false` SELECT 2147483647 + 1; - +----------------+ |(2147483647 + 1)| +----------------+ | -2147483648| +----------------+ - {% endhighlight %} ### Type Conversion @@ -97,16 +94,13 @@ In future releases, the behaviour of type coercion might change along with the o -- `spark.sql.ansi.enabled=true` SELECT CAST('a' AS INT); - java.lang.NumberFormatException: invalid input syntax for type numeric: a SELECT CAST(2147483648L AS INT); - java.lang.ArithmeticException: Casting 2147483648 to int causes overflow -- `spark.sql.ansi.enabled=false` (This is a default behaviour) SELECT CAST('a' AS INT); - +--------------+ |CAST(a AS INT)| +--------------+ @@ -114,7 +108,6 @@ SELECT CAST('a' AS INT); +--------------+ SELECT CAST(2147483648L AS INT); - +-----------------------+ |CAST(2147483648 AS INT)| +-----------------------+ @@ -126,20 +119,17 @@ CREATE TABLE t (v INT); -- `spark.sql.storeAssignmentPolicy=ANSI` INSERT INTO t VALUES ('1'); - org.apache.spark.sql.AnalysisException: Cannot write incompatible data to table '`default`.`t`': - Cannot safely cast 'v': StringType to IntegerType; -- `spark.sql.storeAssignmentPolicy=LEGACY` (This is a legacy behaviour until Spark 2.x) INSERT INTO t VALUES ('1'); SELECT * FROM t; - +---+ | v| +---+ | 1| +---+ - {% endhighlight %} ### SQL Functions diff --git a/docs/sql-ref-datatypes.md b/docs/sql-ref-datatypes.md index 9700608fe8a34..1e0d0511a114f 100644 --- a/docs/sql-ref-datatypes.md +++ b/docs/sql-ref-datatypes.md @@ -19,7 +19,6 @@ license: | limitations under the License. 
--- - Spark SQL and DataFrames support the following data types: * Numeric types diff --git a/docs/sql-ref-functions-builtin.md b/docs/sql-ref-functions-builtin.md index 48e5c0e6e13e2..917c081adb7a0 100644 --- a/docs/sql-ref-functions-builtin.md +++ b/docs/sql-ref-functions-builtin.md @@ -21,6 +21,6 @@ license: | Spark SQL defines built-in functions to use, a complete list of which can be found [here](api/sql/). Among them, Spark SQL has several special categories of built-in functions: [Aggregate Functions](sql-ref-functions-builtin-aggregate.html) to operate on a group of rows, [Array Functions](sql-ref-functions-builtin-array.html) to operate on Array columns, and [Date and Time Functions](sql-ref-functions-builtin-date-time.html) to operate on Date and Time. -* [Aggregate Functions](sql-ref-functions-builtin-aggregate.html) -* [Array Functions](sql-ref-functions-builtin-array.html) -* [Date and Time Functions](sql-ref-functions-builtin-date-time.html) + * [Aggregate Functions](sql-ref-functions-builtin-aggregate.html) + * [Array Functions](sql-ref-functions-builtin-array.html) + * [Date and Time Functions](sql-ref-functions-builtin-date-time.html) diff --git a/docs/sql-ref-functions-udf.md b/docs/sql-ref-functions-udf.md index 91c04f16049e2..2c5f204d25736 100644 --- a/docs/sql-ref-functions-udf.md +++ b/docs/sql-ref-functions-udf.md @@ -21,6 +21,6 @@ license: | User-Defined Functions (UDFs) are a feature of Spark SQL that allows users to define their own functions when the system's built-in functions are not enough to perform the desired task. To use UDFs in Spark SQL, users must first define the function, then register the function with Spark, and finally call the registered function. The User-Defined Functions can act on a single row or act on multiple rows at once. Spark SQL also supports integration of existing Hive implementations of UDFs, UDAFs and UDTFs. 
-* [Scalar User-Defined Functions (UDFs)](sql-ref-functions-udf-scalar.html) -* [User-Defined Aggregate Functions (UDAFs)](sql-ref-functions-udf-aggregate.html) -* [Integration with Hive UDFs/UDAFs/UDTFs](sql-ref-functions-udf-hive.html) + * [Scalar User-Defined Functions (UDFs)](sql-ref-functions-udf-scalar.html) + * [User-Defined Aggregate Functions (UDAFs)](sql-ref-functions-udf-aggregate.html) + * [Integration with Hive UDFs/UDAFs/UDTFs](sql-ref-functions-udf-hive.html) diff --git a/docs/sql-ref-null-semantics.md b/docs/sql-ref-null-semantics.md index 37b4081d6b27b..dc48a36cadb3c 100644 --- a/docs/sql-ref-null-semantics.md +++ b/docs/sql-ref-null-semantics.md @@ -20,6 +20,7 @@ license: | --- ### Description + A table consists of a set of rows and each row contains a set of columns. A column is associated with a data type and represents a specific attribute of an entity (for example, `age` is a column of an @@ -61,7 +62,7 @@ the `age` column and this table will be used in various examples in the sections 700Dan50 -### Comparison operators +### Comparison Operators Apache spark supports the standard comparison operators such as '>', '>=', '=', '<' and '<='. The result of these operators is unknown or `NULL` when one of the operands or both the operands are @@ -114,13 +115,14 @@ one or both operands are `NULL`: ### Examples + {% highlight sql %} -- Normal comparison operators return `NULL` when one of the operand is `NULL`. SELECT 5 > null AS expression_output; +-----------------+ |expression_output| +-----------------+ - |null | + | null| +-----------------+ -- Normal comparison operators return `NULL` when both the operands are `NULL`. 
@@ -128,7 +130,7 @@ SELECT null = null AS expression_output; +-----------------+ |expression_output| +-----------------+ - |null | + | null| +-----------------+ -- Null-safe equal operator return `False` when one of the operand is `NULL` @@ -136,7 +138,7 @@ SELECT 5 <=> null AS expression_output; +-----------------+ |expression_output| +-----------------+ - |false | + | false| +-----------------+ -- Null-safe equal operator return `True` when one of the operand is `NULL` @@ -144,11 +146,12 @@ SELECT NULL <=> NULL; +-----------------+ |expression_output| +-----------------+ - |true | + | true| +-----------------+ {% endhighlight %} -### Logical operators +### Logical Operators + Spark supports standard logical operators such as `AND`, `OR` and `NOT`. These operators take `Boolean` expressions as the arguments and return a `Boolean` value. @@ -205,13 +208,14 @@ The following tables illustrate the behavior of logical operators when one or bo ### Examples + {% highlight sql %} -- Normal comparison operators return `NULL` when one of the operands is `NULL`. SELECT (true OR null) AS expression_output; +-----------------+ |expression_output| +-----------------+ - |true | + | true| +-----------------+ -- Normal comparison operators return `NULL` when both the operands are `NULL`. @@ -219,7 +223,7 @@ SELECT (null OR false) AS expression_output +-----------------+ |expression_output| +-----------------+ - |null | + | null| +-----------------+ -- Null-safe equal operator returns `False` when one of the operands is `NULL` @@ -227,11 +231,12 @@ SELECT NOT(null) AS expression_output; +-----------------+ |expression_output| +-----------------+ - |null | + | null| +-----------------+ {% endhighlight %} ### Expressions + The comparison operators and logical operators are treated as expressions in Spark. Other than these two kinds of expressions, Spark supports other form of expressions such as function expressions, cast expressions, etc. 
The expressions @@ -240,35 +245,37 @@ in Spark can be broadly classified as : - Expressions that can process `NULL` value operands - The result of these expressions depends on the expression itself. -#### Null intolerant expressions +#### Null Intolerant Expressions + Null intolerant expressions return `NULL` when one or more arguments of expression are `NULL` and most of the expressions fall in this category. ##### Examples + {% highlight sql %} -SELECT concat('John', null) as expression_output; +SELECT concat('John', null) AS expression_output; +-----------------+ |expression_output| +-----------------+ - |null | + | null| +-----------------+ -SELECT positive(null) as expression_output; +SELECT positive(null) AS expression_output; +-----------------+ |expression_output| +-----------------+ - |null | + | null| +-----------------+ -SELECT to_date(null) as expression_output; +SELECT to_date(null) AS expression_output; +-----------------+ |expression_output| +-----------------+ - |null | + | null| +-----------------+ {% endhighlight %} -#### Expressions that can process null value operands. +#### Expressions That Can Process Null Value Operands This class of expressions are designed to handle `NULL` values. The result of the expressions depends on the expression itself. As an example, function expression `isnull` @@ -287,14 +294,14 @@ returns the first non `NULL` value in its list of operands. However, `coalesce` - ATLEASTNNONNULLS - IN - ##### Examples + {% highlight sql %} SELECT isnull(null) AS expression_output; +-----------------+ |expression_output| +-----------------+ - |true | + | true| +-----------------+ -- Returns the first occurrence of non `NULL` value. @@ -302,7 +309,7 @@ SELECT coalesce(null, null, 3, null) AS expression_output; +-----------------+ |expression_output| +-----------------+ - |3 | + | 3| +-----------------+ -- Returns `NULL` as all its operands are `NULL`. 
@@ -310,18 +317,19 @@ SELECT coalesce(null, null, null, null) AS expression_output; +-----------------+ |expression_output| +-----------------+ - |null | + | null| +-----------------+ -SELECT isnan(null) as expression_output; +SELECT isnan(null) AS expression_output; +-----------------+ |expression_output| +-----------------+ - |false | + | false| +-----------------+ {% endhighlight %} #### Builtin Aggregate Expressions + Aggregate functions compute a single result by processing a set of input rows. Below are the rules of how `NULL` values are handled by aggregate functions. - `NULL` values are ignored from processing by all the aggregate functions. @@ -337,13 +345,14 @@ the rules of how `NULL` values are handled by aggregate functions. - SOME #### Examples + {% highlight sql %} -- `count(*)` does not skip `NULL` values. SELECT count(*) FROM person; +--------+ |count(1)| +--------+ - |7 | + | 7| +--------+ -- `NULL` values in column `age` are skipped from processing. @@ -351,7 +360,7 @@ SELECT count(age) FROM person; +----------+ |count(age)| +----------+ - |5 | + | 5| +----------+ -- `count(*)` on an empty input set returns 0. This is unlike the other @@ -360,7 +369,7 @@ SELECT count(*) FROM person where 1 = 0; +--------+ |count(1)| +--------+ - |0 | + | 0| +--------+ -- `NULL` values are excluded from computation of maximum value. @@ -368,7 +377,7 @@ SELECT max(age) FROM person; +--------+ |max(age)| +--------+ - |50 | + | 50| +--------+ -- `max` returns `NULL` on an empty input set. @@ -376,44 +385,45 @@ SELECT max(age) FROM person where 1 = 0; +--------+ |max(age)| +--------+ - |null | + | null| +--------+ - {% endhighlight %} -### Condition expressions in WHERE, HAVING and JOIN clauses. +### Condition Expressions in WHERE, HAVING and JOIN Clauses + `WHERE`, `HAVING` operators filter rows based on the user specified condition. A `JOIN` operator is used to combine rows from two tables based on a join condition. 
For all the three operators, a condition expression is a boolean expression and can return True, False or Unknown (NULL). They are "satisfied" if the result of the condition is `True`. #### Examples + {% highlight sql %} -- Persons whose age is unknown (`NULL`) are filtered out from the result set. SELECT * FROM person WHERE age > 0; +--------+---+ - |name |age| + | name|age| +--------+---+ - |Michelle|30 | - |Fred |50 | - |Mike |18 | - |Dan |50 | - |Joe |30 | + |Michelle| 30| + | Fred| 50| + | Mike| 18| + | Dan| 50| + | Joe| 30| +--------+---+ -- `IS NULL` expression is used in disjunction to select the persons -- with unknown (`NULL`) records. SELECT * FROM person WHERE age > 0 OR age IS NULL; +--------+----+ - |name |age | + | name| age| +--------+----+ - |Albert |null| - |Michelle|30 | - |Fred |50 | - |Mike |18 | - |Dan |50 | - |Marry |null| - |Joe |30 | + | Albert|null| + |Michelle| 30| + | Fred| 50| + | Mike| 18| + | Dan| 50| + | Marry|null| + | Joe| 30| +--------+----+ -- Person with unknown(`NULL`) ages are skipped from processing. @@ -421,135 +431,139 @@ SELECT * FROM person GROUP BY age HAVING max(age) > 18; +---+--------+ |age|count(1)| +---+--------+ - |50 |2 | - |30 |2 | + | 50| 2| + | 30| 2| +---+--------+ -- A self join case with a join condition `p1.age = p2.age AND p1.name = p2.name`. -- The persons with unknown age (`NULL`) are filtered out by the join operator. 
SELECT * FROM person p1, person p2 -WHERE p1.age = p2.age - AND p1.name = p2.name; + WHERE p1.age = p2.age + AND p1.name = p2.name; +--------+---+--------+---+ - |name |age|name |age| + | name|age| name|age| +--------+---+--------+---+ - |Michelle|30 |Michelle|30 | - |Fred |50 |Fred |50 | - |Mike |18 |Mike |18 | - |Dan |50 |Dan |50 | - |Joe |30 |Joe |30 | + |Michelle| 30|Michelle| 30| + | Fred| 50| Fred| 50| + | Mike| 18| Mike| 18| + | Dan| 50| Dan| 50| + | Joe| 30| Joe| 30| +--------+---+--------+---+ -- The age column from both legs of join are compared using null-safe equal which -- is why the persons with unknown age (`NULL`) are qualified by the join. SELECT * FROM person p1, person p2 -WHERE p1.age <=> p2.age - AND p1.name = p2.name; -+--------+----+--------+----+ -| name| age| name| age| -+--------+----+--------+----+ -| Albert|null| Albert|null| -|Michelle| 30|Michelle| 30| -| Fred| 50| Fred| 50| -| Mike| 18| Mike| 18| -| Dan| 50| Dan| 50| -| Marry|null| Marry|null| -| Joe| 30| Joe| 30| -+--------+----+--------+----+ - + WHERE p1.age <=> p2.age + AND p1.name = p2.name; + +--------+----+--------+----+ + | name| age| name| age| + +--------+----+--------+----+ + | Albert|null| Albert|null| + |Michelle| 30|Michelle| 30| + | Fred| 50| Fred| 50| + | Mike| 18| Mike| 18| + | Dan| 50| Dan| 50| + | Marry|null| Marry|null| + | Joe| 30| Joe| 30| + +--------+----+--------+----+ {% endhighlight %} -### Aggregate operator (GROUP BY, DISTINCT) +### Aggregate Operator (GROUP BY, DISTINCT) + As discussed in the previous section [comparison operator](sql-ref-null-semantics.html#comparison-operators), two `NULL` values are not equal. However, for the purpose of grouping and distinct processing, the two or more values with `NULL data`are grouped together into the same bucket. This behaviour is conformant with SQL standard and with other enterprise database management systems. #### Examples + {% highlight sql %} -- `NULL` values are put in one bucket in `GROUP BY` processing. 
SELECT age, count(*) FROM person GROUP BY age; +----+--------+ - |age |count(1)| + | age|count(1)| +----+--------+ - |null|2 | - |50 |2 | - |30 |2 | - |18 |1 | + |null| 2| + | 50| 2| + | 30| 2| + | 18| 1| +----+--------+ -- All `NULL` ages are considered one distinct value in `DISTINCT` processing. SELECT DISTINCT age FROM person; +----+ - |age | + | age| +----+ |null| - |50 | - |30 | - |18 | + | 50| + | 30| + | 18| +----+ - {% endhighlight %} -### Sort operator (ORDER BY Clause) +### Sort Operator (ORDER BY Clause) + Spark SQL supports null ordering specification in `ORDER BY` clause. Spark processes the `ORDER BY` clause by placing all the `NULL` values at first or at last depending on the null ordering specification. By default, all the `NULL` values are placed at first. #### Examples + {% highlight sql %} -- `NULL` values are shown at first and other values -- are sorted in ascending way. SELECT age, name FROM person ORDER BY age; +----+--------+ - |age |name | + | age| name| +----+--------+ - |null|Marry | - |null|Albert | - |18 |Mike | - |30 |Michelle| - |30 |Joe | - |50 |Fred | - |50 |Dan | + |null| Marry| + |null| Albert| + | 18| Mike| + | 30|Michelle| + | 30| Joe| + | 50| Fred| + | 50| Dan| +----+--------+ -- Column values other than `NULL` are sorted in ascending -- way and `NULL` values are shown at the last. SELECT age, name FROM person ORDER BY age NULLS LAST; +----+--------+ - |age |name | + | age| name| +----+--------+ - |18 |Mike | - |30 |Michelle| - |30 |Joe | - |50 |Dan | - |50 |Fred | - |null|Marry | - |null|Albert | + | 18| Mike| + | 30|Michelle| + | 30| Joe| + | 50| Dan| + | 50| Fred| + |null| Marry| + |null| Albert| +----+--------+ -- Columns other than `NULL` values are sorted in descending -- and `NULL` values are shown at the last. 
SELECT age, name FROM person ORDER BY age DESC NULLS LAST; +----+--------+ - |age |name | + | age| name| +----+--------+ - |50 |Fred | - |50 |Dan | - |30 |Michelle| - |30 |Joe | - |18 |Mike | - |null|Marry | - |null|Albert | + | 50| Fred| + | 50| Dan| + | 30|Michelle| + | 30| Joe| + | 18| Mike| + |null| Marry| + |null| Albert| +----+--------+ {% endhighlight %} -### Set operators (UNION, INTERSECT, EXCEPT) +### Set Operators (UNION, INTERSECT, EXCEPT) + `NULL` values are compared in a null-safe manner for equality in the context of set operations. That means when comparing rows, two `NULL` values are considered equal unlike the regular `EqualTo`(`=`) operator. #### Examples + {% highlight sql %} CREATE VIEW unknown_age SELECT * FROM person WHERE age IS NULL; @@ -557,51 +571,51 @@ CREATE VIEW unknown_age SELECT * FROM person WHERE age IS NULL; -- result set. The comparison between columns of the row are done -- in a null-safe manner. SELECT name, age FROM person -INTERSECT -SELECT name, age from unknown_age; + INTERSECT + SELECT name, age from unknown_age; +------+----+ - |name |age | + | name| age| +------+----+ |Albert|null| - |Marry |null| + | Marry|null| +------+----+ -- `NULL` values from two legs of the `EXCEPT` are not in output. -- This basically shows that the comparison happens in a null-safe manner. SELECT age, name FROM person -EXCEPT -SELECT age FROM unknown_age; + EXCEPT + SELECT age FROM unknown_age; +---+--------+ - |age|name | + |age| name| +---+--------+ - |30 |Joe | - |50 |Fred | - |30 |Michelle| - |18 |Mike | - |50 |Dan | + | 30| Joe| + | 50| Fred| + | 30|Michelle| + | 18| Mike| + | 50| Dan| +---+--------+ -- Performs `UNION` operation between two sets of data. -- The comparison between columns of the row ae done in -- null-safe manner. 
SELECT name, age FROM person -UNION -SELECT name, age FROM unknown_age; + UNION + SELECT name, age FROM unknown_age; +--------+----+ - |name |age | + | name| age| +--------+----+ - |Albert |null| - |Joe |30 | - |Michelle|30 | - |Marry |null| - |Fred |50 | - |Mike |18 | - |Dan |50 | + | Albert|null| + | Joe| 30| + |Michelle| 30| + | Marry|null| + | Fred| 50| + | Mike| 18| + | Dan| 50| +--------+----+ {% endhighlight %} - ### EXISTS/NOT EXISTS Subquery + In Spark, EXISTS and NOT EXISTS expressions are allowed inside a WHERE clause. These are boolean expressions which return either `TRUE` or `FALSE`. In other words, EXISTS is a membership condition and returns `TRUE` @@ -614,20 +628,21 @@ the subquery. They are normally faster because they can be converted to semijoins / anti-semijoins without special provisions for null awareness. #### Examples + {% highlight sql %} -- Even if subquery produces rows with `NULL` values, the `EXISTS` expression -- evaluates to `TRUE` as the subquery produces 1 row. SELECT * FROM person WHERE EXISTS (SELECT null); +--------+----+ - |name |age | + | name| age| +--------+----+ - |Albert |null| - |Michelle|30 | - |Fred |50 | - |Mike |18 | - |Dan |50 | - |Marry |null| - |Joe |30 | + | Albert|null| + |Michelle| 30| + | Fred| 50| + | Mike| 18| + | Dan| 50| + | Marry|null| + | Joe| 30| +--------+----+ -- `NOT EXISTS` expression returns `FALSE`. It returns `TRUE` only when @@ -641,19 +656,20 @@ SELECT * FROM person WHERE NOT EXISTS (SELECT null); -- `NOT EXISTS` expression returns `TRUE`. 
SELECT * FROM person WHERE NOT EXISTS (SELECT 1 WHERE 1 = 0); +--------+----+ - |name |age | + | name| age| +--------+----+ - |Albert |null| - |Michelle|30 | - |Fred |50 | - |Mike |18 | - |Dan |50 | - |Marry |null| - |Joe |30 | + | Albert|null| + |Michelle| 30| + | Fred| 50| + | Mike| 18| + | Dan| 50| + | Marry|null| + | Joe| 30| +--------+----+ {% endhighlight %} ### IN/NOT IN Subquery + In Spark, `IN` and `NOT IN` expressions are allowed inside a WHERE clause of a query. Unlike the `EXISTS` expression, `IN` expression can return a `TRUE`, `FALSE` or `UNKNOWN (NULL)` value. Conceptually a `IN` expression is semantically @@ -675,6 +691,7 @@ This is because IN returns UNKNOWN if the value is not in the list containing `N and because NOT UNKNOWN is again UNKNOWN. #### Examples + {% highlight sql %} -- The subquery has only `NULL` value in its result set. Therefore, -- the result of `IN` predicate is UNKNOWN. @@ -687,22 +704,21 @@ SELECT * FROM person WHERE age IN (SELECT null); -- The subquery has `NULL` value in the result set as well as a valid -- value `50`. Rows with age = 50 are returned. SELECT * FROM person -WHERE age IN (SELECT age FROM VALUES (50), (null) sub(age)); + WHERE age IN (SELECT age FROM VALUES (50), (null) sub(age)); +----+---+ |name|age| +----+---+ - |Fred|50 | - |Dan |50 | + |Fred| 50| + | Dan| 50| +----+---+ -- Since subquery has `NULL` value in the result set, the `NOT IN` -- predicate would return UNKNOWN. Hence, no rows are -- qualified for this query. 
SELECT * FROM person -WHERE age NOT IN (SELECT age FROM VALUES (50), (null) sub(age)); + WHERE age NOT IN (SELECT age FROM VALUES (50), (null) sub(age)); +----+---+ |name|age| +----+---+ +----+---+ - {% endhighlight %} diff --git a/docs/sql-ref-syntax-aux-analyze-table.md b/docs/sql-ref-syntax-aux-analyze-table.md index 40513e836b026..739e692680233 100644 --- a/docs/sql-ref-syntax-aux-analyze-table.md +++ b/docs/sql-ref-syntax-aux-analyze-table.md @@ -24,13 +24,14 @@ license: | The `ANALYZE TABLE` statement collects statistics about the table to be used by the query optimizer to find a better query execution plan. ### Syntax + {% highlight sql %} ANALYZE TABLE table_identifier [ partition_spec ] COMPUTE STATISTICS [ NOSCAN | FOR COLUMNS col [ , ... ] | FOR ALL COLUMNS ] - {% endhighlight %} ### Parameters +
table_identifier
@@ -69,41 +70,69 @@ ANALYZE TABLE table_identifier [ partition_spec ]
### Examples -{% highlight sql %} - ANALYZE TABLE students COMPUTE STATISTICS NOSCAN; - - DESC EXTENDED students; - ...... - Statistics 2820 bytes - ...... - - ANALYZE TABLE students COMPUTE STATISTICS; - - DESC EXTENDED students; - ...... - Statistics 2820 bytes, 3 rows - ...... - - ANALYZE TABLE students PARTITION (student_id = 111111) COMPUTE STATISTICS; - - DESC EXTENDED students PARTITION (student_id = 111111); - ...... - Partition Statistics 919 bytes, 1 rows - ...... - - ANALYZE TABLE students COMPUTE STATISTICS FOR COLUMNS name; - - DESC EXTENDED students name; - =default tbl=students - col_name name - data_type string - comment NULL - min NULL - max NULL - num_nulls 0 - distinct_count 3 - avg_col_len 11 - max_col_len 13 - histogram NULL +{% highlight sql %} +CREATE TABLE students (name STRING, student_id INT) PARTITIONED BY (student_id); +INSERT INTO students PARTITION (student_id = 111111) VALUES ('Mark'); +INSERT INTO students PARTITION (student_id = 222222) VALUES ('John'); + +ANALYZE TABLE students COMPUTE STATISTICS NOSCAN; + +DESC EXTENDED students; + +--------------------+--------------------+-------+ + | col_name| data_type|comment| + +--------------------+--------------------+-------+ + | name| string| null| + | student_id| int| null| + | ...| ...| ...| + | Statistics| 864 bytes| | + | ...| ...| ...| + | Partition Provider| Catalog| | + +--------------------+--------------------+-------+ + +ANALYZE TABLE students COMPUTE STATISTICS; + +DESC EXTENDED students; + +--------------------+--------------------+-------+ + | col_name| data_type|comment| + +--------------------+--------------------+-------+ + | name| string| null| + | student_id| int| null| + | ...| ...| ...| + | Statistics| 864 bytes, 2 rows| | + | ...| ...| ...| + | Partition Provider| Catalog| | + +--------------------+--------------------+-------+ + +ANALYZE TABLE students PARTITION (student_id = 111111) COMPUTE STATISTICS; + +DESC EXTENDED students PARTITION (student_id = 111111); + 
+--------------------+--------------------+-------+ + | col_name| data_type|comment| + +--------------------+--------------------+-------+ + | name| string| null| + | student_id| int| null| + | ...| ...| ...| + |Partition Statistics| 432 bytes, 1 rows| | + | ...| ...| ...| + | OutputFormat|org.apache.hadoop...| | + +--------------------+--------------------+-------+ + +ANALYZE TABLE students COMPUTE STATISTICS FOR COLUMNS name; + +DESC EXTENDED students name; + +--------------+----------+ + | info_name|info_value| + +--------------+----------+ + | col_name| name| + | data_type| string| + | comment| NULL| + | min| NULL| + | max| NULL| + | num_nulls| 0| + |distinct_count| 2| + | avg_col_len| 4| + | max_col_len| 4| + | histogram| NULL| + +--------------+----------+ {% endhighlight %} diff --git a/docs/sql-ref-syntax-aux-analyze.md b/docs/sql-ref-syntax-aux-analyze.md index b1bdc73657724..4c68e6b9ec974 100644 --- a/docs/sql-ref-syntax-aux-analyze.md +++ b/docs/sql-ref-syntax-aux-analyze.md @@ -19,4 +19,4 @@ license: | limitations under the License. --- -* [ANALYZE TABLE statement](sql-ref-syntax-aux-analyze-table.html) + * [ANALYZE TABLE statement](sql-ref-syntax-aux-analyze-table.html) diff --git a/docs/sql-ref-syntax-aux-cache-cache-table.md b/docs/sql-ref-syntax-aux-cache-cache-table.md index 27cc77b938fbe..11f682cc10891 100644 --- a/docs/sql-ref-syntax-aux-cache-cache-table.md +++ b/docs/sql-ref-syntax-aux-cache-cache-table.md @@ -20,16 +20,19 @@ license: | --- ### Description + `CACHE TABLE` statement caches contents of a table or output of a query with the given storage level. If a query is cached, then a temp view will be created for this query. This reduces scanning of the original files in future queries. ### Syntax + {% highlight sql %} CACHE [ LAZY ] TABLE table_identifier [ OPTIONS ( 'storageLevel' [ = ] value ) ] [ [ AS ] query ] {% endhighlight %} ### Parameters +
LAZY
Only cache the table when it is first used, instead of immediately.
@@ -80,13 +83,14 @@ CACHE [ LAZY ] TABLE table_identifier
### Examples + {% highlight sql %} CACHE TABLE testCache OPTIONS ('storageLevel' 'DISK_ONLY') SELECT * FROM testData; {% endhighlight %} ### Related Statements - * [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html) - * [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html) - * [REFRESH TABLE](sql-ref-syntax-aux-refresh-table.html) - * [REFRESH](sql-ref-syntax-aux-cache-refresh.html) + * [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html) + * [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html) + * [REFRESH TABLE](sql-ref-syntax-aux-refresh-table.html) + * [REFRESH](sql-ref-syntax-aux-cache-refresh.html) diff --git a/docs/sql-ref-syntax-aux-cache-clear-cache.md b/docs/sql-ref-syntax-aux-cache-clear-cache.md index 15ba3c787c177..47889691148b7 100644 --- a/docs/sql-ref-syntax-aux-cache-clear-cache.md +++ b/docs/sql-ref-syntax-aux-cache-clear-cache.md @@ -20,21 +20,24 @@ license: | --- ### Description + `CLEAR CACHE` removes the entries and associated data from the in-memory and/or on-disk cache for all cached tables and views. ### Syntax + {% highlight sql %} CLEAR CACHE {% endhighlight %} ### Examples + {% highlight sql %} CLEAR CACHE; {% endhighlight %} ### Related Statements + * [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html) * [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html) * [REFRESH TABLE](sql-ref-syntax-aux-refresh-table.html) * [REFRESH](sql-ref-syntax-aux-cache-refresh.html) - diff --git a/docs/sql-ref-syntax-aux-cache-refresh.md b/docs/sql-ref-syntax-aux-cache-refresh.md index 4c56893aeca98..25f7ede1d324e 100644 --- a/docs/sql-ref-syntax-aux-cache-refresh.md +++ b/docs/sql-ref-syntax-aux-cache-refresh.md @@ -20,35 +20,39 @@ license: | --- ### Description + `REFRESH` is used to invalidate and refresh all the cached data (and the associated metadata) for all Datasets that contains the given data source path. Path matching is by prefix, i.e. "/" would invalidate everything that is cached. 
### Syntax + {% highlight sql %} REFRESH resource_path {% endhighlight %} ### Parameters +
resource_path
The path of the resource that is to be refreshed.
### Examples + {% highlight sql %} - -- The Path is resolved using the datasource's File Index. +-- The Path is resolved using the datasource's File Index. CREATE TABLE test(ID INT) using parquet; INSERT INTO test SELECT 1000; CACHE TABLE test; INSERT INTO test SELECT 100; REFRESH "hdfs://path/to/table"; - {% endhighlight %} ### Related Statements -- [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html) -- [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html) -- [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html) -- [REFRESH TABLE](sql-ref-syntax-aux-refresh-table.html) + + * [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html) + * [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html) + * [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html) + * [REFRESH TABLE](sql-ref-syntax-aux-refresh-table.html) diff --git a/docs/sql-ref-syntax-aux-cache-uncache-table.md b/docs/sql-ref-syntax-aux-cache-uncache-table.md index 7e4b8fbc35aa8..95fd91c3c4807 100644 --- a/docs/sql-ref-syntax-aux-cache-uncache-table.md +++ b/docs/sql-ref-syntax-aux-cache-uncache-table.md @@ -20,15 +20,18 @@ license: | --- ### Description + `UNCACHE TABLE` removes the entries and associated data from the in-memory and/or on-disk cache for a given table or view. The underlying entries should already have been brought to cache by previous `CACHE TABLE` operation. `UNCACHE TABLE` on a non-existent table throws an exception if `IF EXISTS` is not specified. ### Syntax + {% highlight sql %} UNCACHE TABLE [ IF EXISTS ] table_identifier {% endhighlight %} ### Parameters +
table_identifier
@@ -41,11 +44,13 @@ UNCACHE TABLE [ IF EXISTS ] table_identifier
### Examples + {% highlight sql %} UNCACHE TABLE t1; {% endhighlight %} ### Related Statements + * [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html) * [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html) * [REFRESH TABLE](sql-ref-syntax-aux-refresh-table.html) diff --git a/docs/sql-ref-syntax-aux-cache.md b/docs/sql-ref-syntax-aux-cache.md index 1a48fceeb1f6b..418b8cc3403b5 100644 --- a/docs/sql-ref-syntax-aux-cache.md +++ b/docs/sql-ref-syntax-aux-cache.md @@ -19,8 +19,8 @@ license: | limitations under the License. --- -* [CACHE TABLE statement](sql-ref-syntax-aux-cache-cache-table.html) -* [UNCACHE TABLE statement](sql-ref-syntax-aux-cache-uncache-table.html) -* [CLEAR CACHE statement](sql-ref-syntax-aux-cache-clear-cache.html) -* [REFRESH TABLE statement](sql-ref-syntax-aux-refresh-table.html) -* [REFRESH statement](sql-ref-syntax-aux-cache-refresh.html) \ No newline at end of file + * [CACHE TABLE statement](sql-ref-syntax-aux-cache-cache-table.html) + * [UNCACHE TABLE statement](sql-ref-syntax-aux-cache-uncache-table.html) + * [CLEAR CACHE statement](sql-ref-syntax-aux-cache-clear-cache.html) + * [REFRESH TABLE statement](sql-ref-syntax-aux-refresh-table.html) + * [REFRESH statement](sql-ref-syntax-aux-cache-refresh.html) \ No newline at end of file diff --git a/docs/sql-ref-syntax-aux-conf-mgmt-reset.md b/docs/sql-ref-syntax-aux-conf-mgmt-reset.md index 5ebc7b97ef64f..e7e6dda4e25ee 100644 --- a/docs/sql-ref-syntax-aux-conf-mgmt-reset.md +++ b/docs/sql-ref-syntax-aux-conf-mgmt-reset.md @@ -20,19 +20,22 @@ license: | --- ### Description + Reset any runtime configurations specific to the current session which were set via the [SET](sql-ref-syntax-aux-conf-mgmt-set.html) command to their default values. ### Syntax + {% highlight sql %} RESET {% endhighlight %} - ### Examples + {% highlight sql %} -- Reset any runtime configurations specific to the current session which were set via the SET command to their default values. 
RESET; {% endhighlight %} ### Related Statements -- [SET](sql-ref-syntax-aux-conf-mgmt-set.html) + + * [SET](sql-ref-syntax-aux-conf-mgmt-set.html) diff --git a/docs/sql-ref-syntax-aux-conf-mgmt-set.md b/docs/sql-ref-syntax-aux-conf-mgmt-set.md index f05dde3f567ee..2ca51307c3aae 100644 --- a/docs/sql-ref-syntax-aux-conf-mgmt-set.md +++ b/docs/sql-ref-syntax-aux-conf-mgmt-set.md @@ -20,9 +20,11 @@ license: | --- ### Description + The SET command sets a property, returns the value of an existing property or returns all SQLConf properties with value and meaning. ### Syntax + {% highlight sql %} SET SET [ -v ] @@ -30,6 +32,7 @@ SET property_key[ = property_value ] {% endhighlight %} ### Parameters +
-v
Outputs the key, value and meaning of existing SQLConf properties.
@@ -46,9 +49,10 @@ SET property_key[ = property_value ]
### Examples + {% highlight sql %} -- Set a property. -SET spark.sql.variable.substitute=false; +SET spark.sql.variable.substitute=false; -- List all SQLConf properties with value and meaning. SET -v; @@ -57,13 +61,14 @@ SET -v; SET; -- List the value of specified property key. -SET spark.sql.variable.substitute; - +--------------------------------+--------+ - | key | value | - +--------------------------------+--------+ - | spark.sql.variable.substitute | false | - +--------------------------------+--------+ +SET spark.sql.variable.substitute; + +-----------------------------+-----+ + | key|value| + +-----------------------------+-----+ + |spark.sql.variable.substitute|false| + +-----------------------------+-----+ {% endhighlight %} ### Related Statements -- [RESET](sql-ref-syntax-aux-conf-mgmt-reset.html) + + * [RESET](sql-ref-syntax-aux-conf-mgmt-reset.html) diff --git a/docs/sql-ref-syntax-aux-conf-mgmt.md b/docs/sql-ref-syntax-aux-conf-mgmt.md index 7c5d9cc895c10..f5e48ef2fee30 100644 --- a/docs/sql-ref-syntax-aux-conf-mgmt.md +++ b/docs/sql-ref-syntax-aux-conf-mgmt.md @@ -19,5 +19,5 @@ license: | limitations under the License. --- -* [SET](sql-ref-syntax-aux-conf-mgmt-set.html) -* [UNSET](sql-ref-syntax-aux-conf-mgmt-reset.html) + * [SET](sql-ref-syntax-aux-conf-mgmt-set.html) + * [RESET](sql-ref-syntax-aux-conf-mgmt-reset.html) diff --git a/docs/sql-ref-syntax-aux-describe-database.md b/docs/sql-ref-syntax-aux-describe-database.md index 05a64ab2060b4..2f7b1ce984d3e 100644 --- a/docs/sql-ref-syntax-aux-describe-database.md +++ b/docs/sql-ref-syntax-aux-describe-database.md @@ -18,6 +18,7 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description `DESCRIBE DATABASE` statement returns the metadata of an existing database. The metadata information includes database @@ -26,11 +27,13 @@ returns the basic metadata information along with the database properties. The ` interchangeable. 
### Syntax + {% highlight sql %} { DESC | DESCRIBE } DATABASE [ EXTENDED ] db_name {% endhighlight %} ### Parameters +
db_name
@@ -40,6 +43,7 @@ interchangeable.
### Example + {% highlight sql %} -- Create employees DATABASE CREATE DATABASE employees COMMENT 'For software companies'; @@ -49,11 +53,11 @@ CREATE DATABASE employees COMMENT 'For software companies'; -- for the employees DATABASE. DESCRIBE DATABASE employees; +-------------------------+-----------------------------+ - |database_description_item|database_description_value | + |database_description_item| database_description_value| +-------------------------+-----------------------------+ - |Database Name |employees | - |Description |For software companies | - |Location |file:/Users/Temp/employees.db| + | Database Name| employees| + | Description| For software companies| + | Location|file:/Users/Temp/employees.db| +-------------------------+-----------------------------+ -- Create employees DATABASE @@ -65,12 +69,12 @@ ALTER DATABASE employees SET DBPROPERTIES ('Create-by' = 'Kevin', 'Create-date' -- Describe employees DATABASE with EXTENDED option to return additional database properties DESCRIBE DATABASE EXTENDED employees; +-------------------------+---------------------------------------------+ - |database_description_item|database_description_value | + |database_description_item| database_description_value| +-------------------------+---------------------------------------------+ - |Database Name |employees | - |Description |For software companies | - |Location |file:/Users/Temp/employees.db | - |Properties |((Create-by,kevin), (Create-date,09/01/2019))| + | Database Name| employees| + | Description| For software companies| + | Location| file:/Users/Temp/employees.db| + | Properties|((Create-by,kevin), (Create-date,09/01/2019))| +-------------------------+---------------------------------------------+ -- Create deployment SCHEMA @@ -81,14 +85,14 @@ DESC DATABASE deployment; +-------------------------+------------------------------+ |database_description_item|database_description_value | +-------------------------+------------------------------+ - |Database 
Name |deployment | - |Description |Deployment environment | - |Location |file:/Users/Temp/deployment.db| + | Database Name| deployment| + | Description| Deployment environment| + | Location|file:/Users/Temp/deployment.db| +-------------------------+------------------------------+ - {% endhighlight %} ### Related Statements -- [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) -- [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) -- [DESCRIBE QUERY](sql-ref-syntax-aux-describe-query.html) + + * [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) + * [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) + * [DESCRIBE QUERY](sql-ref-syntax-aux-describe-query.html) diff --git a/docs/sql-ref-syntax-aux-describe-function.md b/docs/sql-ref-syntax-aux-describe-function.md index f3c9c625b97b8..a4ff76bddf782 100644 --- a/docs/sql-ref-syntax-aux-describe-function.md +++ b/docs/sql-ref-syntax-aux-describe-function.md @@ -18,6 +18,7 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description `DESCRIBE FUNCTION` statement returns the basic metadata information of an @@ -26,11 +27,13 @@ class and the usage details. If the optional `EXTENDED` option is specified, th metadata information is returned along with the extended usage information. ### Syntax + {% highlight sql %} { DESC | DESCRIBE } FUNCTION [ EXTENDED ] function_name {% endhighlight %} ### Parameters +
function_name
@@ -46,6 +49,7 @@ metadata information is returned along with the extended usage information.
### Examples + {% highlight sql %} -- Describe a builtin scalar function. -- Returns function name, implementing class and usage @@ -107,6 +111,7 @@ DESC FUNCTION EXTENDED explode {% endhighlight %} ### Related Statements -- [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) -- [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) -- [DESCRIBE QUERY](sql-ref-syntax-aux-describe-query.html) + + * [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) + * [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) + * [DESCRIBE QUERY](sql-ref-syntax-aux-describe-query.html) diff --git a/docs/sql-ref-syntax-aux-describe-query.md b/docs/sql-ref-syntax-aux-describe-query.md index b07ebe78193d1..f64416adc556d 100644 --- a/docs/sql-ref-syntax-aux-describe-query.md +++ b/docs/sql-ref-syntax-aux-describe-query.md @@ -20,16 +20,19 @@ license: | --- ### Description + The `DESCRIBE QUERY` statement is used to return the metadata of output of a query. A shorthand `DESC` may be used instead of `DESCRIBE` to describe the query output. ### Syntax + {% highlight sql %} { DESC | DESCRIBE } [ QUERY ] input_statement {% endhighlight %} ### Parameters +
QUERY
This clause is optional and may be omitted.
@@ -49,6 +52,7 @@ describe the query output.
### Examples + {% highlight sql %} -- Create table `person` CREATE TABLE person (name STRING , age INT COMMENT 'Age column', address STRING); @@ -56,19 +60,19 @@ CREATE TABLE person (name STRING , age INT COMMENT 'Age column', address STRING) -- Returns column metadata information for a simple select query DESCRIBE QUERY select age, sum(age) FROM person GROUP BY age; +--------+---------+----------+ - |col_name|data_type|comment | + |col_name|data_type| comment| +--------+---------+----------+ - |age |int |Age column| - |sum(age)|bigint |null | + | age| int|Age column| + |sum(age)| bigint| null| +--------+---------+----------+ -- Returns column metadata information for common table expression (`CTE`). DESCRIBE QUERY WITH all_names_cte - AS (SELECT name from person) SELECT * FROM all_names_cte; + AS (SELECT name from person) SELECT * FROM all_names_cte; +--------+---------+-------+ |col_name|data_type|comment| +--------+---------+-------+ - |name |string |null | + | name| string| null| +--------+---------+-------+ -- Returns column metadata information for a inline table. @@ -76,32 +80,33 @@ DESC QUERY VALUES(100, 'John', 10000.20D) AS employee(id, name, salary); +--------+---------+-------+ |col_name|data_type|comment| +--------+---------+-------+ - |id |int |null | - |name |string |null | - |salary |double |null | + | id| int| null| + | name| string| null| + | salary| double| null| +--------+---------+-------+ -- Returns column metadata information for `TABLE` statement. DESC QUERY TABLE person; +--------+---------+----------+ - |col_name|data_type|comment | + |col_name|data_type| comment| +--------+---------+----------+ - |name |string |null | - |age |int |Age column| - |address |string |null | + | name| string| null| + | age| int|Age column| + | address| string| null| +--------+---------+----------+ -- Returns column metadata information for a `FROM` statement. -- `QUERY` clause is optional and can be omitted. 
DESCRIBE FROM person SELECT age; +--------+---------+----------+ - |col_name|data_type|comment | + |col_name|data_type| comment| +--------+---------+----------+ - |age |int |Age column| + | age| int|Age column| +--------+---------+----------+ {% endhighlight %} ### Related Statements -- [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) -- [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) -- [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) + + * [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) + * [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) + * [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) diff --git a/docs/sql-ref-syntax-aux-describe-table.md b/docs/sql-ref-syntax-aux-describe-table.md index 4e6aeb5b6f349..a8eee97b4dc1e 100644 --- a/docs/sql-ref-syntax-aux-describe-table.md +++ b/docs/sql-ref-syntax-aux-describe-table.md @@ -18,18 +18,22 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description + `DESCRIBE TABLE` statement returns the basic metadata information of a table. The metadata information includes column name, column type and column comment. Optionally a partition spec or column name may be specified to return the metadata pertaining to a partition or column respectively. ### Syntax + {% highlight sql %} { DESC | DESCRIBE } [ TABLE ] [ format ] table_identifier [ partition_spec ] [ col_name ] {% endhighlight %} ### Parameters +
format
@@ -69,101 +73,105 @@ to return the metadata pertaining to a partition or column respectively.
### Examples + {% highlight sql %} -- Creates a table `customer`. Assumes current database is `salesdb`. CREATE TABLE customer( - cust_id INT, - state VARCHAR(20), - name STRING COMMENT 'Short name' - ) - USING parquet - PARTITION BY state; - ; - + cust_id INT, + state VARCHAR(20), + name STRING COMMENT 'Short name' + ) + USING parquet + PARTITIONED BY (state); + +INSERT INTO customer PARTITION (state = 'AR') VALUES (100, 'Mike'); + -- Returns basic metadata information for unqualified table `customer` DESCRIBE TABLE customer; +-----------------------+---------+----------+ - |col_name |data_type|comment | + | col_name|data_type| comment| +-----------------------+---------+----------+ - |cust_id |int |null | - |name |string |Short name| - |state |string |null | + | cust_id| int| null| + | name| string|Short name| + | state| string| null| |# Partition Information| | | - |# col_name |data_type|comment | - |state |string |null | + | # col_name|data_type| comment| + | state| string| null| +-----------------------+---------+----------+ -- Returns basic metadata information for qualified table `customer` DESCRIBE TABLE salesdb.customer; +-----------------------+---------+----------+ - |col_name |data_type|comment | + | col_name|data_type| comment| +-----------------------+---------+----------+ - |cust_id |int |null | - |name |string |Short name| - |state |string |null | + | cust_id| int| null| + | name| string|Short name| + | state| string| null| |# Partition Information| | | - |# col_name |data_type|comment | - |state |string |null | + | # col_name|data_type| comment| + | state| string| null| +-----------------------+---------+----------+ -- Returns additional metadata such as parent database, owner, access time etc. 
DESCRIBE TABLE EXTENDED customer; +----------------------------+------------------------------+----------+ - |col_name |data_type |comment | + | col_name| data_type| comment| +----------------------------+------------------------------+----------+ - |cust_id |int |null | - |name |string |Short name| - |state |string |null | - |# Partition Information | | | - |# col_name |data_type |comment | - |state |string |null | + | cust_id| int| null| + | name| string|Short name| + | state| string| null| + | # Partition Information| | | + | # col_name| data_type| comment| + | state| string| null| | | | | |# Detailed Table Information| | | - |Database |salesdb | | - |Table |customer | | - |Owner | | | - |Created Time |Fri Aug 30 09:26:04 PDT 2019 | | - |Last Access |Wed Dec 31 16:00:00 PST 1969 | | - |Created By | | | - |Type |MANAGED | | - |Provider |parquet | | - |Location |file:.../salesdb.db/customer | | - |Serde Library |...serde.ParquetHiveSerDe | | - |InputFormat |...MapredParquetInputFormat | | - |OutputFormat |...MapredParquetOutputFormat | | + | Database| default| | + | Table| customer| | + | Owner|
| | + | Created Time| Tue Apr 07 22:56:34 JST 2020| | + | Last Access| UNKNOWN| | + | Created By| | | + | Type| MANAGED| | + | Provider| parquet| | + | Location|file:/tmp/salesdb.db/custom...| | + | Serde Library|org.apache.hadoop.hive.ql.i...| | + | InputFormat|org.apache.hadoop.hive.ql.i...| | + | OutputFormat|org.apache.hadoop.hive.ql.i...| | + | Partition Provider| Catalog| | +----------------------------+------------------------------+----------+ -- Returns partition metadata such as partitioning column name, column type and comment. -DESCRIBE TABLE customer PARTITION (state = 'AR'); - - +--------------------------------+-----------------------------------------+----------+ - |col_name |data_type |comment | - +--------------------------------+-----------------------------------------+----------+ - |cust_id |int |null | - |name |string |Short name| - |state |string |null | - |# Partition Information | | | - |# col_name |data_type |comment | - |state |string |null | - | | | | - |# Detailed Partition Information| | | - |Database |salesdb | | - |Table |customer | | - |Partition Values |[state=AR] | | - |Location |file:.../salesdb.db/customer/state=AR | | - |Serde Library |...serde.ParquetHiveSerDe | | - |InputFormat |...parquet.MapredParquetInputFormat | | - |OutputFormat |...parquet.MapredParquetOutputFormat | | - |Storage Properties |[path=file:.../salesdb.db/customer, | | - | | serialization.format=1] | | - |Partition Parameters |{rawDataSize=-1, numFiles=1l, | | - | | transient_lastDdlTime=1567185245, | | - | | totalSize=688, | | - | | COLUMN_STATS_ACCURATE=false, numRows=-1}| | - |Created Time |Fri Aug 30 10:14:05 PDT 2019 | | - |Last Access |Wed Dec 31 16:00:00 PST 1969 | | - |Partition Statistics |688 bytes | | - +--------------------------------+-----------------------------------------+----------+ +DESCRIBE TABLE EXTENDED customer PARTITION (state = 'AR'); + +------------------------------+------------------------------+----------+ + | col_name| 
data_type| comment| + +------------------------------+------------------------------+----------+ + | cust_id| int| null| + | name| string|Short name| + | state| string| null| + | # Partition Information| | | + | # col_name| data_type| comment| + | state| string| null| + | | | | + |# Detailed Partition Inform...| | | + | Database| default| | + | Table| customer| | + | Partition Values| [state=AR]| | + | Location|file:/tmp/salesdb.db/custom...| | + | Serde Library|org.apache.hadoop.hive.ql.i...| | + | InputFormat|org.apache.hadoop.hive.ql.i...| | + | OutputFormat|org.apache.hadoop.hive.ql.i...| | + | Storage Properties|[serialization.format=1, pa...| | + | Partition Parameters|{transient_lastDdlTime=1586...| | + | Created Time| Tue Apr 07 23:05:43 JST 2020| | + | Last Access| UNKNOWN| | + | Partition Statistics| 659 bytes| | + | | | | + | # Storage Information| | | + | Location|file:/tmp/salesdb.db/custom...| | + | Serde Library|org.apache.hadoop.hive.ql.i...| | + | InputFormat|org.apache.hadoop.hive.ql.i...| | + | OutputFormat|org.apache.hadoop.hive.ql.i...| | + +------------------------------+------------------------------+----------+ -- Returns the metadata for `name` column. -- Optional `TABLE` clause is omitted and column is fully qualified. 
@@ -171,13 +179,14 @@ DESCRIBE customer salesdb.customer.name; +---------+----------+ |info_name|info_value| +---------+----------+ - |col_name |name | - |data_type|string | - |comment |Short name| + | col_name| name| + |data_type| string| + | comment|Short name| +---------+----------+ {% endhighlight %} ### Related Statements -- [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) -- [DESCRIBE QUERY](sql-ref-syntax-aux-describe-query.html) -- [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) + + * [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) + * [DESCRIBE QUERY](sql-ref-syntax-aux-describe-query.html) + * [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) diff --git a/docs/sql-ref-syntax-aux-describe.md b/docs/sql-ref-syntax-aux-describe.md index 9f17746316480..723943f97aa07 100644 --- a/docs/sql-ref-syntax-aux-describe.md +++ b/docs/sql-ref-syntax-aux-describe.md @@ -19,7 +19,7 @@ license: | limitations under the License. --- -* [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) -* [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) -* [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) -* [DESCRIBE QUERY](sql-ref-syntax-aux-describe-query.html) + * [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) + * [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) + * [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) + * [DESCRIBE QUERY](sql-ref-syntax-aux-describe-query.html) diff --git a/docs/sql-ref-syntax-aux-refresh-table.md b/docs/sql-ref-syntax-aux-refresh-table.md index b248ee67fa12a..165ca68309f4a 100644 --- a/docs/sql-ref-syntax-aux-refresh-table.md +++ b/docs/sql-ref-syntax-aux-refresh-table.md @@ -20,16 +20,19 @@ license: | --- ### Description + `REFRESH TABLE` statement invalidates the cached entries, which include data and metadata of the given table or view. 
The invalidated cache is populated in lazy manner when the cached table or the query associated with it is executed again. ### Syntax + {% highlight sql %} REFRESH [TABLE] table_identifier {% endhighlight %} ### Parameters +
table_identifier
@@ -42,6 +45,7 @@ REFRESH [TABLE] table_identifier
### Examples + {% highlight sql %} -- The cached entries of the table will be refreshed -- The table is resolved from the current database as the table name is unqualified. @@ -53,7 +57,8 @@ REFRESH TABLE tempDB.view1; {% endhighlight %} ### Related Statements -- [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html) -- [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html) -- [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html) -- [REFRESH](sql-ref-syntax-aux-cache-refresh.html) + + * [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html) + * [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html) + * [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html) + * [REFRESH](sql-ref-syntax-aux-cache-refresh.html) diff --git a/docs/sql-ref-syntax-aux-resource-mgmt-add-file.md b/docs/sql-ref-syntax-aux-resource-mgmt-add-file.md index 7e485cbafe709..0028884308890 100644 --- a/docs/sql-ref-syntax-aux-resource-mgmt-add-file.md +++ b/docs/sql-ref-syntax-aux-resource-mgmt-add-file.md @@ -20,20 +20,24 @@ license: | --- ### Description + `ADD FILE` can be used to add a single file as well as a directory to the list of resources. The added resource can be listed using [LIST FILE](sql-ref-syntax-aux-resource-mgmt-list-file.html). ### Syntax + {% highlight sql %} ADD FILE resource_name {% endhighlight %} ### Parameters +
resource_name
The name of the file or directory to be added.
### Examples + {% highlight sql %} ADD FILE /tmp/test; ADD FILE "/path/to/file/abc.txt"; @@ -43,6 +47,7 @@ ADD FILE "/path/to/some/directory"; {% endhighlight %} ### Related Statements + * [LIST FILE](sql-ref-syntax-aux-resource-mgmt-list-file.html) * [LIST JAR](sql-ref-syntax-aux-resource-mgmt-list-jar.html) * [ADD JAR](sql-ref-syntax-aux-resource-mgmt-add-jar.html) diff --git a/docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md b/docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md index db0a85013321d..c4020347c1be0 100644 --- a/docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md +++ b/docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md @@ -20,20 +20,24 @@ license: | --- ### Description + `ADD JAR` adds a JAR file to the list of resources. The added JAR file can be listed using [LIST JAR](sql-ref-syntax-aux-resource-mgmt-list-jar.html). ### Syntax + {% highlight sql %} ADD JAR file_name {% endhighlight %} ### Parameters +
file_name
The name of the JAR file to be added. It could be either on a local file system or a distributed file system.
### Examples + {% highlight sql %} ADD JAR /tmp/test.jar; ADD JAR "/path/to/some.jar"; @@ -42,6 +46,7 @@ ADD JAR "/path with space/abc.jar"; {% endhighlight %} ### Related Statements + * [LIST JAR](sql-ref-syntax-aux-resource-mgmt-list-jar.html) * [ADD FILE](sql-ref-syntax-aux-resource-mgmt-add-file.html) * [LIST FILE](sql-ref-syntax-aux-resource-mgmt-list-file.html) diff --git a/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md b/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md index c42bf7ae8dd41..eec98e1fbffb5 100644 --- a/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md +++ b/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md @@ -20,14 +20,17 @@ license: | --- ### Description + `LIST FILE` lists the resources added by [ADD FILE](sql-ref-syntax-aux-resource-mgmt-add-file.html). ### Syntax + {% highlight sql %} LIST FILE {% endhighlight %} ### Examples + {% highlight sql %} ADD FILE /tmp/test; ADD FILE /tmp/test_2; @@ -42,6 +45,7 @@ file:/private/tmp/test {% endhighlight %} ### Related Statements + * [ADD FILE](sql-ref-syntax-aux-resource-mgmt-add-file.html) * [ADD JAR](sql-ref-syntax-aux-resource-mgmt-add-jar.html) * [LIST JAR](sql-ref-syntax-aux-resource-mgmt-list-jar.html) diff --git a/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md b/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md index 9d1739753099e..dca4252c90ef2 100644 --- a/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md +++ b/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md @@ -20,14 +20,17 @@ license: | --- ### Description + `LIST JAR` lists the JARs added by [ADD JAR](sql-ref-syntax-aux-resource-mgmt-add-jar.html). 
### Syntax + {% highlight sql %} LIST JAR {% endhighlight %} ### Examples + {% highlight sql %} ADD JAR /tmp/test.jar; ADD JAR /tmp/test_2.jar; @@ -42,6 +45,7 @@ spark://192.168.1.112:62859/jars/test.jar {% endhighlight %} ### Related Statements + * [ADD JAR](sql-ref-syntax-aux-resource-mgmt-add-jar.html) * [ADD FILE](sql-ref-syntax-aux-resource-mgmt-add-file.html) * [LIST FILE](sql-ref-syntax-aux-resource-mgmt-list-file.html) diff --git a/docs/sql-ref-syntax-aux-resource-mgmt.md b/docs/sql-ref-syntax-aux-resource-mgmt.md index 0885f56bdb7cf..50c12ef7c2beb 100644 --- a/docs/sql-ref-syntax-aux-resource-mgmt.md +++ b/docs/sql-ref-syntax-aux-resource-mgmt.md @@ -19,7 +19,7 @@ license: | limitations under the License. --- -* [ADD FILE](sql-ref-syntax-aux-resource-mgmt-add-file.html) -* [ADD JAR](sql-ref-syntax-aux-resource-mgmt-add-jar.html) -* [LIST FILE](sql-ref-syntax-aux-resource-mgmt-list-file.html) -* [LIST JAR](sql-ref-syntax-aux-resource-mgmt-list-jar.html) + * [ADD FILE](sql-ref-syntax-aux-resource-mgmt-add-file.html) + * [ADD JAR](sql-ref-syntax-aux-resource-mgmt-add-jar.html) + * [LIST FILE](sql-ref-syntax-aux-resource-mgmt-list-file.html) + * [LIST JAR](sql-ref-syntax-aux-resource-mgmt-list-jar.html) diff --git a/docs/sql-ref-syntax-aux-show-columns.md b/docs/sql-ref-syntax-aux-show-columns.md index 0c8aba83a8403..8f73aac0e3a61 100644 --- a/docs/sql-ref-syntax-aux-show-columns.md +++ b/docs/sql-ref-syntax-aux-show-columns.md @@ -18,15 +18,19 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description + Return the list of columns in a table. If the table does not exist, an exception is thrown. ### Syntax + {% highlight sql %} SHOW COLUMNS table_identifier [ database ] {% endhighlight %} ### Parameters +
table_identifier
@@ -54,44 +58,47 @@ SHOW COLUMNS table_identifier [ database ]
### Examples + {% highlight sql %} -- Create `customer` table in `salesdb` database; USE salesdb; -CREATE TABLE customer(cust_cd INT, - name VARCHAR(100), - cust_addr STRING); +CREATE TABLE customer( + cust_cd INT, + name VARCHAR(100), + cust_addr STRING); -- List the columns of `customer` table in current database. SHOW COLUMNS IN customer; +---------+ - |col_name | + | col_name| +---------+ - |cust_cd | - |name | + | cust_cd| + | name| |cust_addr| +---------+ -- List the columns of `customer` table in `salesdb` database. SHOW COLUMNS IN salesdb.customer; +---------+ - |col_name | + | col_name| +---------+ - |cust_cd | - |name | + | cust_cd| + | name| |cust_addr| +---------+ -- List the columns of `customer` table in `salesdb` database SHOW COLUMNS IN customer IN salesdb; +---------+ - |col_name | + | col_name| +---------+ - |cust_cd | - |name | + | cust_cd| + | name| |cust_addr| +---------+ {% endhighlight %} ### Related Statements -- [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) -- [SHOW TABLE](sql-ref-syntax-aux-show-table.html) + + * [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) + * [SHOW TABLE](sql-ref-syntax-aux-show-table.html) diff --git a/docs/sql-ref-syntax-aux-show-create-table.md b/docs/sql-ref-syntax-aux-show-create-table.md index 24aba602ab3cf..0a37c96bfc5ab 100644 --- a/docs/sql-ref-syntax-aux-show-create-table.md +++ b/docs/sql-ref-syntax-aux-show-create-table.md @@ -20,14 +20,17 @@ license: | --- ### Description + `SHOW CREATE TABLE` returns the [CREATE TABLE statement](sql-ref-syntax-ddl-create-table.html) or [CREATE VIEW statement](sql-ref-syntax-ddl-create-view.html) that was used to create a given table or view. `SHOW CREATE TABLE` on a non-existent table or a temporary view throws an exception. ### Syntax + {% highlight sql %} SHOW CREATE TABLE table_identifier {% endhighlight %} ### Parameters +
table_identifier
@@ -40,31 +43,26 @@ SHOW CREATE TABLE table_identifier
### Examples + {% highlight sql %} CREATE TABLE test (c INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE TBLPROPERTIES ('prop1' = 'value1', 'prop2' = 'value2'); -show create table test; - --- the result of SHOW CREATE TABLE test -CREATE TABLE `test`(`c` INT) -ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' -WITH SERDEPROPERTIES ( - 'field.delim' = ',', - 'serialization.format' = ',' -) -STORED AS - INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' - OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' -TBLPROPERTIES ( - 'transient_lastDdlTime' = '1569350233', - 'prop1' = 'value1', - 'prop2' = 'value2' -) - +SHOW CREATE TABLE test; + +----------------------------------------------------+ + | createtab_stmt| + +----------------------------------------------------+ + |CREATE TABLE `default`.`test` (`c` INT) + USING text + TBLPROPERTIES ( + 'transient_lastDdlTime' = '1586269021', + 'prop1' = 'value1', + 'prop2' = 'value2') + +----------------------------------------------------+ {% endhighlight %} ### Related Statements + * [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) * [CREATE VIEW](sql-ref-syntax-ddl-create-view.html) diff --git a/docs/sql-ref-syntax-aux-show-databases.md b/docs/sql-ref-syntax-aux-show-databases.md index 9d4be21daeabc..0ed34527bcbf3 100644 --- a/docs/sql-ref-syntax-aux-show-databases.md +++ b/docs/sql-ref-syntax-aux-show-databases.md @@ -20,17 +20,20 @@ license: | --- ### Description + Lists the databases that match an optionally supplied string pattern. If no pattern is supplied then the command lists all the databases in the system. Please note that the usage of `SCHEMAS` and `DATABASES` are interchangeable and mean the same thing. ### Syntax + {% highlight sql %} SHOW { DATABASES | SCHEMAS } [ LIKE string_pattern ] {% endhighlight %} ### Parameters +
LIKE string_pattern
@@ -40,6 +43,7 @@ SHOW { DATABASES | SCHEMAS } [ LIKE string_pattern ]
### Examples + {% highlight sql %} -- Create database. Assumes a database named `default` already exists in -- the system. @@ -55,6 +59,7 @@ SHOW DATABASES; | payments_db| | payroll_db| +------------+ + -- Lists databases with name starting with string pattern `pay` SHOW DATABASES LIKE 'pay*'; +------------+ @@ -63,6 +68,7 @@ SHOW DATABASES LIKE 'pay*'; | payments_db| | payroll_db| +------------+ + -- Lists all databases. Keywords SCHEMAS and DATABASES are interchangeable. SHOW SCHEMAS; +------------+ @@ -73,7 +79,9 @@ SHOW SCHEMAS; | payroll_db| +------------+ {% endhighlight %} + ### Related Statements -- [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) -- [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) -- [ALTER DATABASE](sql-ref-syntax-ddl-alter-database.html) + + * [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) + * [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) + * [ALTER DATABASE](sql-ref-syntax-ddl-alter-database.html) diff --git a/docs/sql-ref-syntax-aux-show-functions.md b/docs/sql-ref-syntax-aux-show-functions.md index d6f9df9896afe..da33d999f0b38 100644 --- a/docs/sql-ref-syntax-aux-show-functions.md +++ b/docs/sql-ref-syntax-aux-show-functions.md @@ -20,6 +20,7 @@ license: | --- ### Description + Returns the list of functions after applying an optional regex pattern. Given number of functions supported by Spark is quite large, this statement in conjunction with [describe function](sql-ref-syntax-aux-describe-function.html) @@ -27,11 +28,13 @@ may be used to quickly find the function and understand its usage. The `LIKE` clause is optional and supported only for compatibility with other systems. ### Syntax + {% highlight sql %} SHOW [ function_kind ] FUNCTIONS ( [ LIKE ] function_name | regex_pattern ) {% endhighlight %} ### Parameters +
function_kind
@@ -66,6 +69,7 @@ SHOW [ function_kind ] FUNCTIONS ( [ LIKE ] function_name | regex_pattern )
### Examples + {% highlight sql %} -- List a system function `trim` by searching both user defined and system -- defined functions. @@ -73,7 +77,7 @@ SHOW FUNCTIONS trim; +--------+ |function| +--------+ - |trim | + | trim| +--------+ -- List a system function `concat` by searching system defined functions. @@ -81,7 +85,7 @@ SHOW SYSTEM FUNCTIONS concat; +--------+ |function| +--------+ - |concat | + | concat| +--------+ -- List a qualified function `max` from database `salesdb`. @@ -89,30 +93,31 @@ SHOW SYSTEM FUNCTIONS salesdb.max; +--------+ |function| +--------+ - |max | + | max| +--------+ -- List all functions starting with `t` SHOW FUNCTIONS LIKE 't*'; +-----------------+ - |function | + | function| +-----------------+ - |tan | - |tanh | - |timestamp | - |tinyint | - |to_csv | - |to_date | - |to_json | - |to_timestamp | + | tan| + | tanh| + | timestamp| + | tinyint| + | to_csv| + | to_date| + | to_json| + | to_timestamp| |to_unix_timestamp| - |to_utc_timestamp | - |transform | - |transform_keys | - |transform_values | - |translate | - |trim | - |trunc | + | to_utc_timestamp| + | transform| + | transform_keys| + | transform_values| + | translate| + | trim| + | trunc| + | typeof| +-----------------+ -- List all functions starting with `yea` or `windo` @@ -120,8 +125,8 @@ SHOW FUNCTIONS LIKE 'yea*|windo*'; +--------+ |function| +--------+ - |window | - |year | + | window| + | year| +--------+ -- Use normal regex pattern to list function names that has 4 characters @@ -130,10 +135,11 @@ SHOW FUNCTIONS LIKE 't[a-z][a-z][a-z]'; +--------+ |function| +--------+ - |tanh | - |trim | + | tanh| + | trim| +--------+ {% endhighlight %} -### Related statements -- [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) +### Related Statements + + * [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) diff --git a/docs/sql-ref-syntax-aux-show-partitions.md b/docs/sql-ref-syntax-aux-show-partitions.md index 6c8401e8e4718..31b881ea8f141 100644 --- 
a/docs/sql-ref-syntax-aux-show-partitions.md +++ b/docs/sql-ref-syntax-aux-show-partitions.md @@ -18,6 +18,7 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description The `SHOW PARTITIONS` statement is used to list partitions of a table. An optional @@ -25,11 +26,13 @@ partition spec may be specified to return the partitions matching the supplied partition spec. ### Syntax + {% highlight sql %} SHOW PARTITIONS table_identifier [ partition_spec ] {% endhighlight %} ### Parameters +
table_identifier
@@ -53,6 +56,7 @@ SHOW PARTITIONS table_identifier [ partition_spec ]
### Examples + {% highlight sql %} -- create a partitioned table and insert a few rows. USE salesdb; @@ -64,27 +68,27 @@ INSERT INTO customer PARTITION (state = 'AZ', city = 'Peoria') VALUES (300, 'Dan -- Lists all partitions for table `customer` SHOW PARTITIONS customer; +----------------------+ - |partition | + | partition| +----------------------+ - |state=AZ/city=Peoria | - |state=CA/city=Fremont | + | state=AZ/city=Peoria| + | state=CA/city=Fremont| |state=CA/city=San Jose| +----------------------+ -- Lists all partitions for the qualified table `customer` SHOW PARTITIONS salesdb.customer; +----------------------+ - |partition | + | partition| +----------------------+ - |state=AZ/city=Peoria | - |state=CA/city=Fremont | + | state=AZ/city=Peoria| + | state=CA/city=Fremont| |state=CA/city=San Jose| +----------------------+ -- Specify a full partition spec to list specific partition SHOW PARTITIONS customer PARTITION (state = 'CA', city = 'Fremont'); +---------------------+ - |partition | + | partition| +---------------------+ |state=CA/city=Fremont| +---------------------+ @@ -92,23 +96,24 @@ SHOW PARTITIONS customer PARTITION (state = 'CA', city = 'Fremont'); -- Specify a partial partition spec to list the specific partitions SHOW PARTITIONS customer PARTITION (state = 'CA'); +----------------------+ - |partition | + | partition| +----------------------+ - |state=CA/city=Fremont | + | state=CA/city=Fremont| |state=CA/city=San Jose| +----------------------+ -- Specify a partial spec to list specific partition SHOW PARTITIONS customer PARTITION (city = 'San Jose'); +----------------------+ - |partition | + | partition| +----------------------+ |state=CA/city=San Jose| +----------------------+ {% endhighlight %} -### Related statements -- [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) -- [INSERT STATEMENT](sql-ref-syntax-dml-insert.html) -- [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) -- [SHOW TABLE](sql-ref-syntax-aux-show-table.html) +### 
Related Statements + + * [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) + * [INSERT STATEMENT](sql-ref-syntax-dml-insert.html) + * [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) + * [SHOW TABLE](sql-ref-syntax-aux-show-table.html) diff --git a/docs/sql-ref-syntax-aux-show-table.md b/docs/sql-ref-syntax-aux-show-table.md index 49696585ba581..1aa44d3ab30ea 100644 --- a/docs/sql-ref-syntax-aux-show-table.md +++ b/docs/sql-ref-syntax-aux-show-table.md @@ -18,6 +18,7 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description `SHOW TABLE EXTENDED` will show information for all tables matching the given regular expression. @@ -30,12 +31,14 @@ information such as `Partition Parameters` and `Partition Statistics`. Note that cannot be used with a partition specification. ### Syntax + {% highlight sql %} SHOW TABLE EXTENDED [ IN | FROM database_name ] LIKE 'identifier_with_wildcards' [ partition_spec ] {% endhighlight %} ### Parameters +
IN|FROM database_name
@@ -61,123 +64,123 @@ SHOW TABLE EXTENDED [ IN | FROM database_name ] LIKE 'identifier_with_wildcards'
+ ### Examples + {% highlight sql %} -- Assumes `employee` table created with partitioned by column `grade` --- +-------+--------+--+ --- | name | grade | --- +-------+--------+--+ --- | sam | 1 | --- | suj | 2 | --- +-------+--------+--+ +CREATE TABLE employee(name STRING, grade INT) PARTITIONED BY (grade); +INSERT INTO employee PARTITION (grade = 1) VALUES ('sam'); +INSERT INTO employee PARTITION (grade = 2) VALUES ('suj'); -- Show the details of the table -SHOW TABLE EXTENDED LIKE `employee`; -+--------+---------+-----------+--------------------------------------------------------------- -|database|tableName|isTemporary| information -+--------+---------+-----------+--------------------------------------------------------------- -|default |employee |false |Database: default - Table: employee - Owner: root - Created Time: Fri Aug 30 15:10:21 IST 2019 - Last Access: Thu Jan 01 05:30:00 IST 1970 - Created By: Spark 3.0.0-SNAPSHOT - Type: MANAGED - Provider: hive - Table Properties: [transient_lastDdlTime=1567158021] - Location: file:/opt/spark1/spark/spark-warehouse/employee - Serde Library: org.apache.hadoop.hive.serde2.lazy - .LazySimpleSerDe - InputFormat: org.apache.hadoop.mapred.TextInputFormat - OutputFormat: org.apache.hadoop.hive.ql.io - .HiveIgnoreKeyTextOutputFormat - Storage Properties: [serialization.format=1] - Partition Provider: Catalog - Partition Columns: [`grade`] - Schema: root - |-- name: string (nullable = true) - |-- grade: integer (nullable = true) - -+--------+---------+-----------+--------------------------------------------------------------- +SHOW TABLE EXTENDED LIKE 'employee'; + +--------+---------+-----------+--------------------------------------------------------------+ + |database|tableName|isTemporary| information | + +--------+---------+-----------+--------------------------------------------------------------+ + |default |employee |false |Database: default + Table: employee + Owner: root + Created Time: Fri Aug 30 15:10:21 IST 
2019 + Last Access: Thu Jan 01 05:30:00 IST 1970 + Created By: Spark 3.0.0-SNAPSHOT + Type: MANAGED + Provider: hive + Table Properties: [transient_lastDdlTime=1567158021] + Location: file:/opt/spark1/spark/spark-warehouse/employee + Serde Library: org.apache.hadoop.hive.serde2.lazy + .LazySimpleSerDe + InputFormat: org.apache.hadoop.mapred.TextInputFormat + OutputFormat: org.apache.hadoop.hive.ql.io + .HiveIgnoreKeyTextOutputFormat + Storage Properties: [serialization.format=1] + Partition Provider: Catalog + Partition Columns: [`grade`] + Schema: root + |-- name: string (nullable = true) + |-- grade: integer (nullable = true) + + +--------+---------+-----------+--------------------------------------------------------------+ -- showing the multiple table details with pattern matching SHOW TABLE EXTENDED LIKE `employe*`; -+--------+---------+-----------+--------------------------------------------------------------- -|database|tableName|isTemporary| information -+--------+---------+-----------+--------------------------------------------------------------- -|default |employee |false |Database: default - Table: employee - Owner: root - Created Time: Fri Aug 30 15:10:21 IST 2019 - Last Access: Thu Jan 01 05:30:00 IST 1970 - Created By: Spark 3.0.0-SNAPSHOT - Type: MANAGED - Provider: hive - Table Properties: [transient_lastDdlTime=1567158021] - Location: file:/opt/spark1/spark/spark-warehouse/employee - Serde Library: org.apache.hadoop.hive.serde2.lazy - .LazySimpleSerDe - InputFormat: org.apache.hadoop.mapred.TextInputFormat - OutputFormat: org.apache.hadoop.hive.ql.io - .HiveIgnoreKeyTextOutputFormat - Storage Properties: [serialization.format=1] - Partition Provider: Catalog - Partition Columns: [`grade`] - Schema: root - |-- name: string (nullable = true) - |-- grade: integer (nullable = true) - -|default |employee1|false |Database: default - Table: employee1 - Owner: root - Created Time: Fri Aug 30 15:22:33 IST 2019 - Last Access: Thu Jan 01 05:30:00 IST 1970 - 
Created By: Spark 3.0.0-SNAPSHOT - Type: MANAGED - Provider: hive - Table Properties: [transient_lastDdlTime=1567158753] - Location: file:/opt/spark1/spark/spark-warehouse/employee1 - Serde Library: org.apache.hadoop.hive.serde2.lazy - .LazySimpleSerDe - InputFormat: org.apache.hadoop.mapred.TextInputFormat - OutputFormat: org.apache.hadoop.hive.ql.io - .HiveIgnoreKeyTextOutputFormat - Storage Properties: [serialization.format=1] - Partition Provider: Catalog - Schema: root - |-- name: string (nullable = true) - -+--------+---------+----------+---------------------------------------------------------------- + +--------+---------+-----------+--------------------------------------------------------------+ + |database|tableName|isTemporary| information | + +--------+---------+-----------+--------------------------------------------------------------+ + |default |employee |false |Database: default + Table: employee + Owner: root + Created Time: Fri Aug 30 15:10:21 IST 2019 + Last Access: Thu Jan 01 05:30:00 IST 1970 + Created By: Spark 3.0.0-SNAPSHOT + Type: MANAGED + Provider: hive + Table Properties: [transient_lastDdlTime=1567158021] + Location: file:/opt/spark1/spark/spark-warehouse/employee + Serde Library: org.apache.hadoop.hive.serde2.lazy + .LazySimpleSerDe + InputFormat: org.apache.hadoop.mapred.TextInputFormat + OutputFormat: org.apache.hadoop.hive.ql.io + .HiveIgnoreKeyTextOutputFormat + Storage Properties: [serialization.format=1] + Partition Provider: Catalog + Partition Columns: [`grade`] + Schema: root + |-- name: string (nullable = true) + |-- grade: integer (nullable = true) + + |default |employee1|false |Database: default + Table: employee1 + Owner: root + Created Time: Fri Aug 30 15:22:33 IST 2019 + Last Access: Thu Jan 01 05:30:00 IST 1970 + Created By: Spark 3.0.0-SNAPSHOT + Type: MANAGED + Provider: hive + Table Properties: [transient_lastDdlTime=1567158753] + Location: file:/opt/spark1/spark/spark-warehouse/employee1 + Serde Library: 
org.apache.hadoop.hive.serde2.lazy + .LazySimpleSerDe + InputFormat: org.apache.hadoop.mapred.TextInputFormat + OutputFormat: org.apache.hadoop.hive.ql.io + .HiveIgnoreKeyTextOutputFormat + Storage Properties: [serialization.format=1] + Partition Provider: Catalog + Schema: root + |-- name: string (nullable = true) + + +--------+---------+----------+---------------------------------------------------------------+ -- show partition file system details SHOW TABLE EXTENDED IN `default` LIKE `employee` PARTITION (`grade=1`); -+--------+---------+-----------+--------------------------------------------------------------- -|database|tableName|isTemporary| information -+--------+---------+-----------+--------------------------------------------------------------- -|default |employee |false | Partition Values: [grade=1] - Location: file:/opt/spark1/spark/spark-warehouse/employee - /grade=1 - Serde Library: org.apache.hadoop.hive.serde2.lazy - .LazySimpleSerDe - InputFormat: org.apache.hadoop.mapred.TextInputFormat - OutputFormat: org.apache.hadoop.hive.ql.io - .HiveIgnoreKeyTextOutputFormat - Storage Properties: [serialization.format=1] - Partition Parameters: {rawDataSize=-1, numFiles=1, - transient_lastDdlTime=1567158221, totalSize=4, - COLUMN_STATS_ACCURATE=false, numRows=-1} - Created Time: Fri Aug 30 15:13:41 IST 2019 - Last Access: Thu Jan 01 05:30:00 IST 1970 - Partition Statistics: 4 bytes - | -+--------+---------+-----------+--------------------------------------------------------------- + +--------+---------+-----------+--------------------------------------------------------------+ + |database|tableName|isTemporary| information | + +--------+---------+-----------+--------------------------------------------------------------+ + |default |employee |false |Partition Values: [grade=1] + Location: file:/opt/spark1/spark/spark-warehouse/employee + /grade=1 + Serde Library: org.apache.hadoop.hive.serde2.lazy + .LazySimpleSerDe + InputFormat: 
org.apache.hadoop.mapred.TextInputFormat + OutputFormat: org.apache.hadoop.hive.ql.io + .HiveIgnoreKeyTextOutputFormat + Storage Properties: [serialization.format=1] + Partition Parameters: {rawDataSize=-1, numFiles=1, + transient_lastDdlTime=1567158221, totalSize=4, + COLUMN_STATS_ACCURATE=false, numRows=-1} + Created Time: Fri Aug 30 15:13:41 IST 2019 + Last Access: Thu Jan 01 05:30:00 IST 1970 + Partition Statistics: 4 bytes + | + +--------+---------+-----------+--------------------------------------------------------------+ -- show partition file system details with regex fails as shown below SHOW TABLE EXTENDED IN `default` LIKE `empl*` PARTITION (`grade=1`); -Error: Error running query: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: - Table or view 'emplo*' not found in database 'default'; (state=,code=0) - + Error: Error running query: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: + Table or view 'emplo*' not found in database 'default'; (state=,code=0) {% endhighlight %} + ### Related Statements -- [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) -- [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) + + * [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) + * [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html) diff --git a/docs/sql-ref-syntax-aux-show-tables.md b/docs/sql-ref-syntax-aux-show-tables.md index 311401cf580be..0b7062ec8eff7 100644 --- a/docs/sql-ref-syntax-aux-show-tables.md +++ b/docs/sql-ref-syntax-aux-show-tables.md @@ -18,6 +18,7 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description The `SHOW TABLES` statement returns all the tables for an optionally specified database. @@ -26,11 +27,13 @@ pattern. If no database is specified then the tables are returned from the current database. ### Syntax + {% highlight sql %} SHOW TABLES [ { FROM | IN } database_name ] [ LIKE 'regex_pattern' ] {% endhighlight %} ### Parameters +
{ FROM | IN } database_name
@@ -50,58 +53,59 @@ SHOW TABLES [ { FROM | IN } database_name ] [ LIKE 'regex_pattern' ]
### Example + {% highlight sql %} -- List all tables in default database SHOW TABLES; - +-----------+------------+--------------+--+ - | database | tableName | isTemporary | - +-----------+------------+--------------+--+ - | default | sam | false | - | default | sam1 | false | - | default | suj | false | - +-----------+------------+--------------+--+ + +--------+---------+-----------+ + |database|tableName|isTemporary| + +--------+---------+-----------+ + | default| sam| false| + | default| sam1| false| + | default| suj| false| + +--------+---------+-----------+ -- List all tables from userdb database SHOW TABLES FROM userdb; - +-----------+------------+--------------+--+ - | database | tableName | isTemporary | - +-----------+------------+--------------+--+ - | userdb | user1 | false | - | userdb | user2 | false | - +-----------+------------+--------------+--+ + +--------+---------+-----------+ + |database|tableName|isTemporary| + +--------+---------+-----------+ + | userdb| user1| false| + | userdb| user2| false| + +--------+---------+-----------+ -- List all tables in userdb database SHOW TABLES IN userdb; - +-----------+------------+--------------+--+ - | database | tableName | isTemporary | - +-----------+------------+--------------+--+ - | userdb | user1 | false | - | userdb | user2 | false | - +-----------+------------+--------------+--+ + +--------+---------+-----------+ + |database|tableName|isTemporary| + +--------+---------+-----------+ + | userdb| user1| false| + | userdb| user2| false| + +--------+---------+-----------+ -- List all tables from default database matching the pattern `sam*` SHOW TABLES FROM default LIKE 'sam*'; - +-----------+------------+--------------+--+ - | database | tableName | isTemporary | - +-----------+------------+--------------+--+ - | default | sam | false | - | default | sam1 | false | - +-----------+------------+--------------+--+ + +--------+---------+-----------+ + |database|tableName|isTemporary| + 
+--------+---------+-----------+ + | default| sam| false| + | default| sam1| false| + +--------+---------+-----------+ -- List all tables matching the pattern `sam*|suj` SHOW TABLES LIKE 'sam*|suj'; - +-----------+------------+--------------+--+ - | database | tableName | isTemporary | - +-----------+------------+--------------+--+ - | default | sam | false | - | default | sam1 | false | - | default | suj | false | - +-----------+------------+--------------+--+ - + +--------+---------+-----------+ + |database|tableName|isTemporary| + +--------+---------+-----------+ + | default| sam| false| + | default| sam1| false| + | default| suj| false| + +--------+---------+-----------+ {% endhighlight %} -### Related statements -- [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) -- [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) -- [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) -- [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) +### Related Statements + + * [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) + * [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) + * [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) + * [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) diff --git a/docs/sql-ref-syntax-aux-show-tblproperties.md b/docs/sql-ref-syntax-aux-show-tblproperties.md index 451fd4abc85bb..26e57ef0aba24 100644 --- a/docs/sql-ref-syntax-aux-show-tblproperties.md +++ b/docs/sql-ref-syntax-aux-show-tblproperties.md @@ -20,16 +20,19 @@ license: | --- ### Description + This statement returns the value of a table property given an optional value for a property key. If no key is specified then all the properties are returned. ### Syntax + {% highlight sql %} SHOW TBLPROPERTIES table_identifier [ ( unquoted_property_key | property_key_as_string_literal ) ] {% endhighlight %} ### Parameters +
table_identifier
@@ -64,19 +67,20 @@ SHOW TBLPROPERTIES table_identifier properties are: `numFiles`, `numPartitions`, `numRows`. ### Examples + {% highlight sql %} -- create a table `customer` in database `salesdb` USE salesdb; CREATE TABLE customer(cust_code INT, name VARCHAR(100), cust_addr STRING) - TBLPROPERTIES ('created.by.user' = 'John', 'created.date' = '01-01-2001'); + TBLPROPERTIES ('created.by.user' = 'John', 'created.date' = '01-01-2001'); -- show all the user specified properties for table `customer` SHOW TBLPROPERTIES customer; +---------------------+----------+ - |key |value | + | key| value| +---------------------+----------+ - |created.by.user |John | - |created.date |01-01-2001| + | created.by.user| John| + | created.date|01-01-2001| |transient_lastDdlTime|1567554931| +---------------------+----------+ @@ -84,10 +88,10 @@ SHOW TBLPROPERTIES customer; -- in database `salesdb` SHOW TBLPROPERTIES salesdb.customer; +---------------------+----------+ - |key |value | + | key| value| +---------------------+----------+ - |created.by.user |John | - |created.date |01-01-2001| + | created.by.user| John| + | created.date|01-01-2001| |transient_lastDdlTime|1567554931| +---------------------+----------+ @@ -96,20 +100,21 @@ SHOW TBLPROPERTIES customer (created.by.user); +-----+ |value| +-----+ - |John | + | John| +-----+ -- show value for property `created.date`` specified as string literal SHOW TBLPROPERTIES customer ('created.date'); +----------+ - |value | + | value| +----------+ |01-01-2001| +----------+ {% endhighlight %} ### Related Statements -- [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) -- [ALTER TABLE SET TBLPROPERTIES](sql-ref-syntax-ddl-alter-table.html) -- [SHOW TABLES](sql-ref-syntax-aux-show-tables.html) -- [SHOW TABLE EXTENDED](sql-ref-syntax-aux-show-table.html) + + * [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) + * [ALTER TABLE SET TBLPROPERTIES](sql-ref-syntax-ddl-alter-table.html) + * [SHOW TABLES](sql-ref-syntax-aux-show-tables.html) + * 
[SHOW TABLE EXTENDED](sql-ref-syntax-aux-show-table.html) diff --git a/docs/sql-ref-syntax-aux-show-views.md b/docs/sql-ref-syntax-aux-show-views.md index a5e840d97bf2d..aec3716c2889f 100644 --- a/docs/sql-ref-syntax-aux-show-views.md +++ b/docs/sql-ref-syntax-aux-show-views.md @@ -18,6 +18,7 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description The `SHOW VIEWS` statement returns all the views for an optionally specified database. diff --git a/docs/sql-ref-syntax-aux-show.md b/docs/sql-ref-syntax-aux-show.md index dd56d467815c7..424fe71370897 100644 --- a/docs/sql-ref-syntax-aux-show.md +++ b/docs/sql-ref-syntax-aux-show.md @@ -18,12 +18,13 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- -* [SHOW COLUMNS](sql-ref-syntax-aux-show-columns.html) -* [SHOW DATABASES](sql-ref-syntax-aux-show-databases.html) -* [SHOW FUNCTIONS](sql-ref-syntax-aux-show-functions.html) -* [SHOW TABLE EXTENDED](sql-ref-syntax-aux-show-table.html) -* [SHOW TABLES](sql-ref-syntax-aux-show-tables.html) -* [SHOW TBLPROPERTIES](sql-ref-syntax-aux-show-tblproperties.html) -* [SHOW PARTITIONS](sql-ref-syntax-aux-show-partitions.html) -* [SHOW CREATE TABLE](sql-ref-syntax-aux-show-create-table.html) -* [SHOW VIEWS](sql-ref-syntax-aux-show-views.html) + + * [SHOW COLUMNS](sql-ref-syntax-aux-show-columns.html) + * [SHOW DATABASES](sql-ref-syntax-aux-show-databases.html) + * [SHOW FUNCTIONS](sql-ref-syntax-aux-show-functions.html) + * [SHOW TABLE EXTENDED](sql-ref-syntax-aux-show-table.html) + * [SHOW TABLES](sql-ref-syntax-aux-show-tables.html) + * [SHOW TBLPROPERTIES](sql-ref-syntax-aux-show-tblproperties.html) + * [SHOW PARTITIONS](sql-ref-syntax-aux-show-partitions.html) + * [SHOW CREATE TABLE](sql-ref-syntax-aux-show-create-table.html) + * [SHOW VIEWS](sql-ref-syntax-aux-show-views.html) diff --git a/docs/sql-ref-syntax-aux.md 
b/docs/sql-ref-syntax-aux.md index ba09d70b437a9..3cd758f076e99 100644 --- a/docs/sql-ref-syntax-aux.md +++ b/docs/sql-ref-syntax-aux.md @@ -21,9 +21,9 @@ license: | Besides the major SQL statements such as Data Definition Statements, Data Manipulation Statements and Data Retrieval Statements, Spark SQL also supports the following Auxiliary Statements: -- [ANALYZE](sql-ref-syntax-aux-analyze.html) -- [CACHE](sql-ref-syntax-aux-cache.html) -- [DESCRIBE](sql-ref-syntax-aux-describe.html) -- [SHOW](sql-ref-syntax-aux-show.html) -- [CONFIGURATION MANAGEMENT](sql-ref-syntax-aux-conf-mgmt.html) -- [RESOURCE MANAGEMENT](sql-ref-syntax-aux-resource-mgmt.html) + * [ANALYZE](sql-ref-syntax-aux-analyze.html) + * [CACHE](sql-ref-syntax-aux-cache.html) + * [DESCRIBE](sql-ref-syntax-aux-describe.html) + * [SHOW](sql-ref-syntax-aux-show.html) + * [CONFIGURATION MANAGEMENT](sql-ref-syntax-aux-conf-mgmt.html) + * [RESOURCE MANAGEMENT](sql-ref-syntax-aux-resource-mgmt.html) diff --git a/docs/sql-ref-syntax-ddl-alter-database.md b/docs/sql-ref-syntax-ddl-alter-database.md index a32343674feb0..520aba35567e8 100644 --- a/docs/sql-ref-syntax-ddl-alter-database.md +++ b/docs/sql-ref-syntax-ddl-alter-database.md @@ -18,7 +18,9 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description + You can alter metadata associated with a database by setting `DBPROPERTIES`. The specified property values override any existing value with the same property name. Please note that the usage of `SCHEMA` and `DATABASE` are interchangeable and one can be used in place of the other. An error message @@ -26,18 +28,21 @@ is issued if the database is not found in the system. This command is mostly use for a database and may be used for auditing purposes. ### Syntax + {% highlight sql %} ALTER { DATABASE | SCHEMA } database_name SET DBPROPERTIES ( property_name = property_value, ... ) {% endhighlight %} ### Parameters +
database_name
Specifies the name of the database to be altered.
### Examples + {% highlight sql %} -- Creates a database named `inventory`. CREATE DATABASE inventory; @@ -47,16 +52,16 @@ ALTER DATABASE inventory SET DBPROPERTIES ('Edited-by' = 'John', 'Edit-date' = ' -- Verify that properties are set. DESCRIBE DATABASE EXTENDED inventory; - - +-------------------------+--------------------------------------------+ - |database_description_item|database_description_value | - +-------------------------+--------------------------------------------+ - |Database Name |inventory | - |Description | | - |Location |file:/temp/spark-warehouse/inventory.db | - |Properties |((Edit-date,01/01/2001), (Edited-by,John)) | - +-------------------------+--------------------------------------------+ + +-------------------------+------------------------------------------+ + |database_description_item| database_description_value| + +-------------------------+------------------------------------------+ + | Database Name| inventory| + | Description| | + | Location| file:/temp/spark-warehouse/inventory.db| + | Properties|((Edit-date,01/01/2001), (Edited-by,John))| + +-------------------------+------------------------------------------+ {% endhighlight %} ### Related Statements -- [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) + + * [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 2dd808b131ef2..edb081b7f45c0 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -20,12 +20,15 @@ license: | --- ### Description + `ALTER TABLE` statement changes the schema or properties of a table. ### RENAME + `ALTER TABLE RENAME TO` statement changes the table name of an existing table in the database. 
#### Syntax + {% highlight sql %} ALTER TABLE table_identifier RENAME TO table_identifier @@ -33,6 +36,7 @@ ALTER TABLE table_identifier partition_spec RENAME TO partition_spec {% endhighlight %} #### Parameters +
table_identifier
@@ -55,16 +59,18 @@ ALTER TABLE table_identifier partition_spec RENAME TO partition_spec
- ### ADD COLUMNS + `ALTER TABLE ADD COLUMNS` statement adds mentioned columns to an existing table. #### Syntax + {% highlight sql %} ALTER TABLE table_identifier ADD COLUMNS ( col_spec [ , col_spec ... ] ) {% endhighlight %} #### Parameters +
table_identifier
@@ -81,16 +87,18 @@ ALTER TABLE table_identifier ADD COLUMNS ( col_spec [ , col_spec ... ] )
Specifies the columns to be added.
- ### ALTER OR CHANGE COLUMN + `ALTER TABLE ALTER COLUMN` or `ALTER TABLE CHANGE COLUMN` statement changes column's comment. #### Syntax + {% highlight sql %} ALTER TABLE table_identifier { ALTER | CHANGE } [ COLUMN ] col_spec alterColumnAction {% endhighlight %} #### Parameters +
table_identifier
@@ -118,19 +126,21 @@ ALTER TABLE table_identifier { ALTER | CHANGE } [ COLUMN ] col_spec alterColumnA
- ### ADD AND DROP PARTITION #### ADD PARTITION + `ALTER TABLE ADD` statement adds partition to the partitioned table. ##### Syntax + {% highlight sql %} ALTER TABLE table_identifier ADD [IF NOT EXISTS] ( partition_spec [ partition_spec ... ] ) {% endhighlight %} ##### Parameters +
table_identifier
@@ -154,14 +164,17 @@ ALTER TABLE table_identifier ADD [IF NOT EXISTS]
#### DROP PARTITION + `ALTER TABLE DROP` statement drops the partition of the table. ##### Syntax + {% highlight sql %} ALTER TABLE table_identifier DROP [ IF EXISTS ] partition_spec [PURGE] {% endhighlight %} ##### Parameters +
table_identifier
@@ -183,35 +196,35 @@ ALTER TABLE table_identifier DROP [ IF EXISTS ] partition_spec [PURGE]
- ### SET AND UNSET #### SET TABLE PROPERTIES + `ALTER TABLE SET` command is used for setting the table properties. If a particular property was already set, this overrides the old value with the new one. `ALTER TABLE UNSET` is used to drop the table property. ##### Syntax -{% highlight sql %} ---Set Table Properties +{% highlight sql %} +-- Set Table Properties ALTER TABLE table_identifier SET TBLPROPERTIES ( key1 = val1, key2 = val2, ... ) ---Unset Table Properties +-- Unset Table Properties ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ) - {% endhighlight %} #### SET SERDE + `ALTER TABLE SET` command is used for setting the SERDE or SERDE properties in Hive tables. If a particular property was already set, this overrides the old value with the new one. ##### Syntax -{% highlight sql %} ---Set SERDE Properties +{% highlight sql %} +-- Set SERDE Properties ALTER TABLE table_identifier [ partition_spec ] SET SERDEPROPERTIES ( key1 = val1, key2 = val2, ... ) @@ -221,21 +234,22 @@ ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name {% endhighlight %} #### SET LOCATION And SET FILE FORMAT + `ALTER TABLE SET` command can also be used for changing the file location and file format for existing tables. ##### Syntax -{% highlight sql %} ---Changing File Format +{% highlight sql %} +-- Changing File Format ALTER TABLE table_identifier [ partition_spec ] SET FILEFORMAT file_format ---Changing File Location +-- Changing File Location ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' - {% endhighlight %} #### Parameters +
table_identifier
@@ -263,205 +277,198 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location'
Specifies the SERDE properties to be set.
- ### Examples -{% highlight sql %} ---RENAME table +{% highlight sql %} +-- RENAME table DESC student; -+--------------------------+------------+----------+--+ -| col_name | data_type | comment | -+--------------------------+------------+----------+--+ -| name | string | NULL | -| rollno | int | NULL | -| age | int | NULL | -| # Partition Information | | | -| # col_name | data_type | comment | -| age | int | NULL | -+--------------------------+------------+----------+--+ + +-----------------------+---------+-------+ + | col_name|data_type|comment| + +-----------------------+---------+-------+ + | name| string| NULL| + | rollno| int| NULL| + | age| int| NULL| + |# Partition Information| | | + | # col_name|data_type|comment| + | age| int| NULL| + +-----------------------+---------+-------+ ALTER TABLE Student RENAME TO StudentInfo; ---After Renaming the table - +-- After Renaming the table DESC StudentInfo; -+--------------------------+------------+----------+--+ -| col_name | data_type | comment | -+--------------------------+------------+----------+--+ -| name | string | NULL | -| rollno | int | NULL | -| age | int | NULL | -| # Partition Information | | | -| # col_name | data_type | comment | -| age | int | NULL | -+--------------------------+------------+----------+--+ - ---RENAME partition + +-----------------------+---------+-------+ + | col_name|data_type|comment| + +-----------------------+---------+-------+ + | name| string| NULL| + | rollno| int| NULL| + | age| int| NULL| + |# Partition Information| | | + | # col_name|data_type|comment| + | age| int| NULL| + +-----------------------+---------+-------+ + +-- RENAME partition SHOW PARTITIONS StudentInfo; -+------------+--+ -| partition | -+------------+--+ -| age=10 | -| age=11 | -| age=12 | -+------------+--+ + +---------+ + |partition| + +---------+ + | age=10| + | age=11| + | age=12| + +---------+ ALTER TABLE default.StudentInfo PARTITION (age='10') RENAME TO PARTITION (age='15'); ---After renaming 
Partition +-- After renaming Partition SHOW PARTITIONS StudentInfo; -+------------+--+ -| partition | -+------------+--+ -| age=11 | -| age=12 | -| age=15 | -+------------+--+ + +---------+ + |partition| + +---------+ + | age=11| + | age=12| + | age=15| + +---------+ -- Add new columns to a table - DESC StudentInfo; -+--------------------------+------------+----------+--+ -| col_name | data_type | comment | -+--------------------------+------------+----------+--+ -| name | string | NULL | -| rollno | int | NULL | -| age | int | NULL | -| # Partition Information | | | -| # col_name | data_type | comment | -| age | int | NULL | -+--------------------------+------------+----------+ + +-----------------------+---------+-------+ + | col_name|data_type|comment| + +-----------------------+---------+-------+ + | name| string| NULL| + | rollno| int| NULL| + | age| int| NULL| + |# Partition Information| | | + | # col_name|data_type|comment| + | age| int| NULL| + +-----------------------+---------+-------+ ALTER TABLE StudentInfo ADD columns (LastName string, DOB timestamp); ---After Adding New columns to the table +-- After Adding New columns to the table DESC StudentInfo; -+--------------------------+------------+----------+--+ -| col_name | data_type | comment | -+--------------------------+------------+----------+--+ -| name | string | NULL | -| rollno | int | NULL | -| LastName | string | NULL | -| DOB | timestamp | NULL | -| age | int | NULL | -| # Partition Information | | | -| # col_name | data_type | comment | -| age | int | NULL | -+--------------------------+------------+----------+--+ + +-----------------------+---------+-------+ + | col_name|data_type|comment| + +-----------------------+---------+-------+ + | name| string| NULL| + | rollno| int| NULL| + | LastName| string| NULL| + | DOB|timestamp| NULL| + | age| int| NULL| + |# Partition Information| | | + | # col_name|data_type|comment| + | age| int| NULL| + +-----------------------+---------+-------+ -- Add a 
new partition to a table - SHOW PARTITIONS StudentInfo; -+------------+--+ -| partition | -+------------+--+ -| age=11 | -| age=12 | -| age=15 | -+------------+--+ + +---------+ + |partition| + +---------+ + | age=11| + | age=12| + | age=15| + +---------+ ALTER TABLE StudentInfo ADD IF NOT EXISTS PARTITION (age=18); -- After adding a new partition to the table SHOW PARTITIONS StudentInfo; -+------------+--+ -| partition | -+------------+--+ -| age=11 | -| age=12 | -| age=15 | -| age=18 | -+------------+--+ + +---------+ + |partition| + +---------+ + | age=11| + | age=12| + | age=15| + | age=18| + +---------+ -- Drop a partition from the table - SHOW PARTITIONS StudentInfo; -+------------+--+ -| partition | -+------------+--+ -| age=11 | -| age=12 | -| age=15 | -| age=18 | -+------------+--+ + +---------+ + |partition| + +---------+ + | age=11| + | age=12| + | age=15| + | age=18| + +---------+ ALTER TABLE StudentInfo DROP IF EXISTS PARTITION (age=18); -- After dropping the partition of the table SHOW PARTITIONS StudentInfo; -+------------+--+ -| partition | -+------------+--+ -| age=11 | -| age=12 | -| age=15 | -+------------+--+ + +---------+ + |partition| + +---------+ + | age=11| + | age=12| + | age=15| + +---------+ -- Adding multiple partitions to the table - SHOW PARTITIONS StudentInfo; -+------------+--+ -| partition | -+------------+--+ -| age=11 | -| age=12 | -| age=15 | -+------------+--+ + +---------+ + |partition| + +---------+ + | age=11| + | age=12| + | age=15| + +---------+ ALTER TABLE StudentInfo ADD IF NOT EXISTS PARTITION (age=18) PARTITION (age=20); -- After adding multiple partitions to the table SHOW PARTITIONS StudentInfo; -+------------+--+ -| partition | -+------------+--+ -| age=11 | -| age=12 | -| age=15 | -| age=18 | -| age=20 | -+------------+--+ + +---------+ + |partition| + +---------+ + | age=11| + | age=12| + | age=15| + | age=18| + | age=20| + +---------+ -- ALTER OR CHANGE COLUMNS - DESC StudentInfo; 
-+--------------------------+------------+----------+--+ -| col_name | data_type | comment | -+--------------------------+------------+----------+--+ -| name | string | NULL | -| rollno | int | NULL | -| LastName | string | NULL | -| DOB | timestamp | NULL | -| age | int | NULL | -| # Partition Information | | | -| # col_name | data_type | comment | -| age | int | NULL | -+--------------------------+------------+----------+--+ + +-----------------------+---------+-------+ + | col_name|data_type|comment| + +-----------------------+---------+-------+ + | name| string| NULL| + | rollno| int| NULL| + | LastName| string| NULL| + | DOB|timestamp| NULL| + | age| int| NULL| + |# Partition Information| | | + | # col_name|data_type|comment| + | age| int| NULL| + +-----------------------+---------+-------+ ALTER TABLE StudentInfo ALTER COLUMN name COMMENT "new comment"; --After ALTER or CHANGE COLUMNS DESC StudentInfo; -+--------------------------+------------+------------+--+ -| col_name | data_type | comment | -+--------------------------+------------+------------+--+ -| name | string | new comment| -| rollno | int | NULL | -| LastName | string | NULL | -| DOB | timestamp | NULL | -| age | int | NULL | -| # Partition Information | | | -| # col_name | data_type | comment | -| age | int | NULL | -+--------------------------+------------+------------+--+ - ---Change the fileformat + +-----------------------+---------+-----------+ + | col_name|data_type| comment| + +-----------------------+---------+-----------+ + | name| string|new comment| + | rollno| int| NULL| + | LastName| string| NULL| + | DOB|timestamp| NULL| + | age| int| NULL| + |# Partition Information| | | + | # col_name|data_type| comment| + | age| int| NULL| + +-----------------------+---------+-----------+ + +-- Change the fileformat ALTER TABLE loc_orc SET fileformat orc; ALTER TABLE p1 partition (month=2, day=2) SET fileformat parquet; ---Change the file Location +-- Change the file Location ALTER TABLE dbx.tab1 
PARTITION (a='1', b='2') SET LOCATION '/path/to/part/ways' -- SET SERDE/ SERDE Properties @@ -469,17 +476,14 @@ ALTER TABLE test_tab SET SERDE 'org.apache.hadoop.hive.serde2.columnar.LazyBinar ALTER TABLE dbx.tab1 SET SERDE 'org.apache.hadoop' WITH SERDEPROPERTIES ('k' = 'v', 'kay' = 'vee') ---SET TABLE PROPERTIES +-- SET TABLE PROPERTIES ALTER TABLE dbx.tab1 SET TBLPROPERTIES ('winner' = 'loser') ---DROP TABLE PROPERTIES +-- DROP TABLE PROPERTIES ALTER TABLE dbx.tab1 UNSET TBLPROPERTIES ('winner') - {% endhighlight %} - ### Related Statements -- [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) -- [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) - + * [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) + * [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) diff --git a/docs/sql-ref-syntax-ddl-alter-view.md b/docs/sql-ref-syntax-ddl-alter-view.md index a29f2b4f632a1..8116c97cc2f41 100644 --- a/docs/sql-ref-syntax-ddl-alter-view.md +++ b/docs/sql-ref-syntax-ddl-alter-view.md @@ -113,6 +113,8 @@ and the `view_identifier` must exist. ALTER VIEW view_identifier AS select_statement {% endhighlight %} +Note that `ALTER VIEW` statement does not support `SET SERDE` or `SET SERDEPROPERTIES` properties. + #### Parameters
view_identifier
@@ -139,98 +141,88 @@ ALTER VIEW tempdb1.v1 RENAME TO tempdb1.v2; -- Verify that the new view is created. DESCRIBE TABLE EXTENDED tempdb1.v2; - -+----------------------------+----------+-------+ -|col_name |data_type |comment| -+----------------------------+----------+-------+ -|c1 |int |null | -|c2 |string |null | -| | | | -|# Detailed Table Information| | | -|Database |tempdb1 | | -|Table |v2 | | -+----------------------------+----------+-------+ + +----------------------------+----------+-------+ + | col_name|data_type |comment| + +----------------------------+----------+-------+ + | c1| int| null| + | c2| string| null| + | | | | + |# Detailed Table Information| | | + | Database| tempdb1| | + | Table| v2| | + +----------------------------+----------+-------+ -- Before ALTER VIEW SET TBLPROPERTIES DESC TABLE EXTENDED tempdb1.v2; - -+----------------------------+----------+-------+ -|col_name |data_type |comment| -+----------------------------+----------+-------+ -|c1 |int |null | -|c2 |string |null | -| | | | -|# Detailed Table Information| | | -|Database |tempdb1 | | -|Table |v2 | | -|Table Properties |[....] 
| | -+----------------------------+----------+-------+ + +----------------------------+----------+-------+ + | col_name| data_type|comment| + +----------------------------+----------+-------+ + | c1| int| null| + | c2| string| null| + | | | | + |# Detailed Table Information| | | + | Database| tempdb1| | + | Table| v2| | + | Table Properties| [....]| | + +----------------------------+----------+-------+ -- Set properties in TBLPROPERTIES ALTER VIEW tempdb1.v2 SET TBLPROPERTIES ('created.by.user' = "John", 'created.date' = '01-01-2001' ); -- Use `DESCRIBE TABLE EXTENDED tempdb1.v2` to verify DESC TABLE EXTENDED tempdb1.v2; - -+----------------------------+-----------------------------------------------------+-------+ -|col_name |data_type |comment| -+----------------------------+-----------------------------------------------------+-------+ -|c1 |int |null | -|c2 |string |null | -| | | | -|# Detailed Table Information| | | -|Database |tempdb1 | | -|Table |v2 | | -|Table Properties |[created.by.user=John, created.date=01-01-2001, ....]| | -+----------------------------+-----------------------------------------------------+-------+ + +----------------------------+-----------------------------------------------------+-------+ + | col_name| data_type|comment| + +----------------------------+-----------------------------------------------------+-------+ + | c1| int| null| + | c2| string| null| + | | | | + |# Detailed Table Information| | | + | Database| tempdb1| | + | Table| v2| | + | Table Properties|[created.by.user=John, created.date=01-01-2001, ....]| | + +----------------------------+-----------------------------------------------------+-------+ -- Remove the key `created.by.user` and `created.date` from `TBLPROPERTIES` ALTER VIEW tempdb1.v2 UNSET TBLPROPERTIES ('created.by.user', 'created.date'); --Use `DESC TABLE EXTENDED tempdb1.v2` to verify the changes DESC TABLE EXTENDED tempdb1.v2; - -+----------------------------+----------+-------+ -|col_name |data_type 
|comment| -+----------------------------+----------+-------+ -|c1 |int |null | -|c2 |string |null | -| | | | -|# Detailed Table Information| | | -|Database |tempdb1 | | -|Table |v2 | | -|Table Properties |[....] | | -+----------------------------+----------+-------+ + +----------------------------+----------+-------+ + | col_name| data_type|comment| + +----------------------------+----------+-------+ + | c1| int| null| + | c2| string| null| + | | | | + |# Detailed Table Information| | | + | Database| tempdb1| | + | Table| v2| | + | Table Properties| [....]| | + +----------------------------+----------+-------+ -- Change the view definition ALTER VIEW tempdb1.v2 AS SELECT * FROM tempdb1.v1; -- Use `DESC TABLE EXTENDED` to verify DESC TABLE EXTENDED tempdb1.v2; - -+----------------------------+---------------------------+-------+ -|col_name |data_type |comment| -+----------------------------+---------------------------+-------+ -|c1 |int |null | -|c2 |string |null | -| | | | -|# Detailed Table Information| | | -|Database |tempdb1 | | -|Table |v2 | | -|Type |VIEW | | -|View Text |select * from tempdb1.v1 | | -|View Original Text |select * from tempdb1.v1 | | -+----------------------------+---------------------------+-------+ + +----------------------------+---------------------------+-------+ + | col_name| data_type|comment| + +----------------------------+---------------------------+-------+ + | c1| int| null| + | c2| string| null| + | | | | + |# Detailed Table Information| | | + | Database| tempdb1| | + | Table| v2| | + | Type| VIEW| | + | View Text| select * from tempdb1.v1| | + | View Original Text| select * from tempdb1.v1| | + +----------------------------+---------------------------+-------+ {% endhighlight %} ### Related Statements -- [describe-table](sql-ref-syntax-aux-describe-table.html) -- [create-view](sql-ref-syntax-ddl-create-view.html) -- [drop-view](sql-ref-syntax-ddl-drop-view.html) -- [show-views](sql-ref-syntax-aux-show-views.html) - -#### Note: - 
-`ALTER VIEW` statement does not support `SET SERDE` or `SET SERDEPROPERTIES` properties - + * [describe-table](sql-ref-syntax-aux-describe-table.html) + * [create-view](sql-ref-syntax-ddl-create-view.html) + * [drop-view](sql-ref-syntax-ddl-drop-view.html) + * [show-views](sql-ref-syntax-aux-show-views.html) diff --git a/docs/sql-ref-syntax-ddl-create-database.md b/docs/sql-ref-syntax-ddl-create-database.md index 4d2211c650953..6f74acdb60bf7 100644 --- a/docs/sql-ref-syntax-ddl-create-database.md +++ b/docs/sql-ref-syntax-ddl-create-database.md @@ -20,17 +20,20 @@ license: | --- ### Description + Creates a database with the specified name. If database with the same name already exists, an exception will be thrown. ### Syntax + {% highlight sql %} CREATE { DATABASE | SCHEMA } [ IF NOT EXISTS ] database_name - [ COMMENT database_comment ] - [ LOCATION database_directory ] - [ WITH DBPROPERTIES ( property_name = property_value [ , ... ] ) ] + [ COMMENT database_comment ] + [ LOCATION database_directory ] + [ WITH DBPROPERTIES ( property_name = property_value [ , ... ] ) ] {% endhighlight %} ### Parameters +
database_name
Specifies the name of the database to be created.
@@ -49,6 +52,7 @@ CREATE { DATABASE | SCHEMA } [ IF NOT EXISTS ] database_name
### Examples + {% highlight sql %} -- Create database `customer_db`. This throws exception if database with name customer_db -- already exists. @@ -60,20 +64,21 @@ CREATE DATABASE IF NOT EXISTS customer_db; -- Create database `customer_db` only if database with same name doesn't exist with -- `Comments`,`Specific Location` and `Database properties`. CREATE DATABASE IF NOT EXISTS customer_db COMMENT 'This is customer database' LOCATION '/user' - WITH DBPROPERTIES (ID=001, Name='John'); + WITH DBPROPERTIES (ID=001, Name='John'); -- Verify that properties are set. DESCRIBE DATABASE EXTENDED customer_db; - +----------------------------+-----------------------------+ - | database_description_item | database_description_value | - +----------------------------+-----------------------------+ - | Database Name | customer_db | - | Description | This is customer database | - | Location | hdfs://hacluster/user | - | Properties | ((ID,001), (Name,John)) | - +----------------------------+-----------------------------+ + +-------------------------+--------------------------+ + |database_description_item|database_description_value| + +-------------------------+--------------------------+ + | Database Name| customer_db| + | Description| This is customer database| + | Location| hdfs://hacluster/user| + | Properties| ((ID,001), (Name,John))| + +-------------------------+--------------------------+ {% endhighlight %} ### Related Statements -- [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) -- [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) + + * [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) + * [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) diff --git a/docs/sql-ref-syntax-ddl-create-function.md b/docs/sql-ref-syntax-ddl-create-function.md index 1f94bf6d25aa5..2bd26d18c7736 100644 --- a/docs/sql-ref-syntax-ddl-create-function.md +++ b/docs/sql-ref-syntax-ddl-create-function.md @@ -20,6 +20,7 @@ license: | --- ### Description + The `CREATE 
FUNCTION` statement is used to create a temporary or permanent function in Spark. Temporary functions are scoped at a session level whereas permanent functions are created in the persistent catalog and are made available to @@ -31,12 +32,14 @@ aggregate functions using Scala, Python and Java APIs. Please refer to [aggregate functions](sql-getting-started#aggregations) for more information. ### Syntax + {% highlight sql %} CREATE [ OR REPLACE ] [ TEMPORARY ] FUNCTION [ IF NOT EXISTS ] function_name AS class_name [ resource_locations ] {% endhighlight %} ### Parameters +
OR REPLACE
@@ -90,6 +93,7 @@ CREATE [ OR REPLACE ] [ TEMPORARY ] FUNCTION [ IF NOT EXISTS ]
### Examples + {% highlight sql %} -- 1. Create a simple UDF `SimpleUdf` that increments the supplied integral value by 10. -- import org.apache.hadoop.hive.ql.exec.UDF; @@ -106,7 +110,7 @@ INSERT INTO test VALUES (1), (2); -- Create a permanent function called `simple_udf`. CREATE FUNCTION simple_udf AS 'SimpleUdf' - USING JAR '/tmp/SimpleUdf.jar'; + USING JAR '/tmp/SimpleUdf.jar'; -- Verify that the function is in the registry. SHOW USER FUNCTIONS; @@ -127,7 +131,7 @@ SELECT simple_udf(c1) AS function_return_value FROM t1; -- Created a temporary function. CREATE TEMPORARY FUNCTION simple_temp_udf AS 'SimpleUdf' - USING JAR '/tmp/SimpleUdf.jar'; + USING JAR '/tmp/SimpleUdf.jar'; -- Verify that the newly created temporary function is in the registry. -- Please note that the temporary function does not have a qualified @@ -152,20 +156,20 @@ SHOW USER FUNCTIONS; -- Replace the implementation of `simple_udf` CREATE OR REPLACE FUNCTION simple_udf AS 'SimpleUdfR' - USING JAR '/tmp/SimpleUdfR.jar'; + USING JAR '/tmp/SimpleUdfR.jar'; -- Invoke the function. Every selected value should be incremented by 20. 
SELECT simple_udf(c1) AS function_return_value FROM t1; -+---------------------+ -|function_return_value| -+---------------------+ -| 21| -| 22| -+---------------------+ - + +---------------------+ + |function_return_value| + +---------------------+ + | 21| + | 22| + +---------------------+ {% endhighlight %} -### Related statements -- [SHOW FUNCTIONS](sql-ref-syntax-aux-show-functions.html) -- [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) -- [DROP FUNCTION](sql-ref-syntax-ddl-drop-function.html) +### Related Statements + + * [SHOW FUNCTIONS](sql-ref-syntax-aux-show-functions.html) + * [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) + * [DROP FUNCTION](sql-ref-syntax-ddl-drop-function.html) diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index 532377d7fcec3..715b64c33baed 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -24,19 +24,20 @@ license: | The `CREATE TABLE` statement defines a new table using a Data Source. ### Syntax + {% highlight sql %} CREATE TABLE [ IF NOT EXISTS ] table_identifier - [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] - [USING data_source] - [ OPTIONS ( key1=val1, key2=val2, ... ) ] - [ PARTITIONED BY ( col_name1, col_name2, ... ) ] - [ CLUSTERED BY ( col_name3, col_name4, ... ) - [ SORTED BY ( col_name [ ASC | DESC ], ... ) ] - INTO num_buckets BUCKETS ] - [ LOCATION path ] - [ COMMENT table_comment ] - [ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] - [ AS select_statement ] + [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] + [USING data_source] + [ OPTIONS ( key1=val1, key2=val2, ... ) ] + [ PARTITIONED BY ( col_name1, col_name2, ... ) ] + [ CLUSTERED BY ( col_name3, col_name4, ... ) + [ SORTED BY ( col_name [ ASC | DESC ], ... 
) ] + INTO num_buckets BUCKETS ] + [ LOCATION path ] + [ COMMENT table_comment ] + [ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] + [ AS select_statement ] {% endhighlight %} Note that, the clauses between the USING clause and the AS SELECT clause can come in @@ -95,6 +96,7 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
### Data Source Interaction + A Data Source table acts like a pointer to the underlying data source. For example, you can create a table "foo" in Spark which points to a table "bar" in MySQL using JDBC Data Source. When you read/write table "foo", you actually read/write table "bar". @@ -107,6 +109,7 @@ For CREATE TABLE AS SELECT, Spark will overwrite the underlying data source with input query, to make sure the table gets created contains exactly the same data as the input query. ### Examples + {% highlight sql %} --Use data source @@ -114,29 +117,29 @@ CREATE TABLE student (id INT, name STRING, age INT) USING CSV; --Use data from another table CREATE TABLE student_copy USING CSV - AS SELECT * FROM student; + AS SELECT * FROM student; --Omit the USING clause, which uses the default data source (parquet by default) CREATE TABLE student (id INT, name STRING, age INT); --Specify table comment and properties CREATE TABLE student (id INT, name STRING, age INT) USING CSV - COMMENT 'this is a comment' - TBLPROPERTIES ('foo'='bar'); + COMMENT 'this is a comment' + TBLPROPERTIES ('foo'='bar'); --Specify table comment and properties with different clauses order CREATE TABLE student (id INT, name STRING, age INT) USING CSV - TBLPROPERTIES ('foo'='bar') - COMMENT 'this is a comment'; + TBLPROPERTIES ('foo'='bar') + COMMENT 'this is a comment'; --Create partitioned and bucketed table CREATE TABLE student (id INT, name STRING, age INT) - USING CSV - PARTITIONED BY (age) - CLUSTERED BY (Id) INTO 4 buckets; - + USING CSV + PARTITIONED BY (age) + CLUSTERED BY (Id) INTO 4 buckets; {% endhighlight %} ### Related Statements -* [CREATE TABLE USING HIVE FORMAT](sql-ref-syntax-ddl-create-table-hiveformat.html) -* [CREATE TABLE LIKE](sql-ref-syntax-ddl-create-table-like.html) + + * [CREATE TABLE USING HIVE FORMAT](sql-ref-syntax-ddl-create-table-hiveformat.html) + * [CREATE TABLE LIKE](sql-ref-syntax-ddl-create-table-like.html) diff --git 
a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index 0425bafd94398..06f353ad2f103 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -18,23 +18,24 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description The `CREATE TABLE` statement defines a new table using Hive format. ### Syntax + {% highlight sql %} CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier - [ ( col_name1[:] col_type1 [ COMMENT col_comment1 ], ... ) ] - [ COMMENT table_comment ] - [ PARTITIONED BY ( col_name2[:] col_type2 [ COMMENT col_comment2 ], ... ) - | ( col_name1, col_name2, ... ) ] - [ ROW FORMAT row_format ] - [ STORED AS file_format ] - [ LOCATION path ] - [ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] - [ AS select_statement ] - + [ ( col_name1[:] col_type1 [ COMMENT col_comment1 ], ... ) ] + [ COMMENT table_comment ] + [ PARTITIONED BY ( col_name2[:] col_type2 [ COMMENT col_comment2 ], ... ) + | ( col_name1, col_name2, ... ) ] + [ ROW FORMAT row_format ] + [ STORED AS file_format ] + [ LOCATION path ] + [ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] + [ AS select_statement ] {% endhighlight %} Note that, the clauses between the columns definition clause and the AS SELECT clause can come in @@ -93,47 +94,45 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI
The table is populated using the data from the select statement.
- ### Examples -{% highlight sql %} +{% highlight sql %} --Use hive format CREATE TABLE student (id INT, name STRING, age INT) STORED AS ORC; --Use data from another table CREATE TABLE student_copy STORED AS ORC - AS SELECT * FROM student; + AS SELECT * FROM student; --Specify table comment and properties CREATE TABLE student (id INT, name STRING, age INT) - COMMENT 'this is a comment' - STORED AS ORC - TBLPROPERTIES ('foo'='bar'); + COMMENT 'this is a comment' + STORED AS ORC + TBLPROPERTIES ('foo'='bar'); --Specify table comment and properties with different clauses order CREATE TABLE student (id INT, name STRING, age INT) - STORED AS ORC - TBLPROPERTIES ('foo'='bar') - COMMENT 'this is a comment'; + STORED AS ORC + TBLPROPERTIES ('foo'='bar') + COMMENT 'this is a comment'; --Create partitioned table CREATE TABLE student (id INT, name STRING) - PARTITIONED BY (age INT) - STORED AS ORC; + PARTITIONED BY (age INT) + STORED AS ORC; --Create partitioned table with different clauses order CREATE TABLE student (id INT, name STRING) - STORED AS ORC - PARTITIONED BY (age INT); + STORED AS ORC + PARTITIONED BY (age INT); --Use Row Format and file format CREATE TABLE student (id INT,name STRING) - ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' - STORED AS TEXTFILE; - + ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' + STORED AS TEXTFILE; {% endhighlight %} - ### Related Statements -* [CREATE TABLE USING DATASOURCE](sql-ref-syntax-ddl-create-table-datasource.html) -* [CREATE TABLE LIKE](sql-ref-syntax-ddl-create-table-like.html) + + * [CREATE TABLE USING DATASOURCE](sql-ref-syntax-ddl-create-table-datasource.html) + * [CREATE TABLE LIKE](sql-ref-syntax-ddl-create-table-like.html) diff --git a/docs/sql-ref-syntax-ddl-create-table-like.md b/docs/sql-ref-syntax-ddl-create-table-like.md index f49fd7fb24c91..fe1dc4b1ef258 100644 --- a/docs/sql-ref-syntax-ddl-create-table-like.md +++ b/docs/sql-ref-syntax-ddl-create-table-like.md @@ -18,21 +18,24 @@ license: | See the License for 
the specific language governing permissions and limitations under the License. --- + ### Description The `CREATE TABLE` statement defines a new table using the definition/metadata of an existing table or view. ### Syntax + {% highlight sql %} CREATE TABLE [IF NOT EXISTS] table_identifier LIKE source_table_identifier -USING data_source -[ ROW FORMAT row_format ] -[ STORED AS file_format ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] -[ LOCATION path ] + USING data_source + [ ROW FORMAT row_format ] + [ STORED AS file_format ] + [ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] + [ LOCATION path ] {% endhighlight %} ### Parameters +
table_identifier
@@ -70,28 +73,27 @@ USING data_source
Path to the directory where table data is stored, which could be a path on distributed storage like HDFS, etc. Location to create an external table.
- ### Examples -{% highlight sql %} ---Create table using an existing table +{% highlight sql %} +-- Create table using an existing table CREATE TABLE Student_Dupli like Student; ---Create table like using a data source +-- Create table like using a data source CREATE TABLE Student_Dupli like Student USING CSV; ---Table is created as external table at the location specified +-- Table is created as external table at the location specified CREATE TABLE Student_Dupli like Student location '/root1/home'; ---Create table like using a rowformat +-- Create table like using a rowformat CREATE TABLE Student_Dupli like Student - ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' - STORED AS TEXTFILE - TBLPROPERTIES ('owner'='xxxx'); - + ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' + STORED AS TEXTFILE + TBLPROPERTIES ('owner'='xxxx'); {% endhighlight %} ### Related Statements -* [CREATE TABLE USING DATASOURCE](sql-ref-syntax-ddl-create-table-datasource.html) -* [CREATE TABLE USING HIVE FORMAT](sql-ref-syntax-ddl-create-table-hiveformat.html) + + * [CREATE TABLE USING DATASOURCE](sql-ref-syntax-ddl-create-table-datasource.html) + * [CREATE TABLE USING HIVE FORMAT](sql-ref-syntax-ddl-create-table-hiveformat.html) diff --git a/docs/sql-ref-syntax-ddl-create-table.md b/docs/sql-ref-syntax-ddl-create-table.md index 20aff6fb823cb..b0388adbc9a38 100644 --- a/docs/sql-ref-syntax-ddl-create-table.md +++ b/docs/sql-ref-syntax-ddl-create-table.md @@ -20,13 +20,16 @@ license: | --- ### Description + `CREATE TABLE` statement is used to define a table in an existing database. 
The CREATE statements: -* [CREATE TABLE USING DATA_SOURCE](sql-ref-syntax-ddl-create-table-datasource.html) -* [CREATE TABLE USING HIVE FORMAT](sql-ref-syntax-ddl-create-table-hiveformat.html) -* [CREATE TABLE LIKE](sql-ref-syntax-ddl-create-table-like.html) + + * [CREATE TABLE USING DATA_SOURCE](sql-ref-syntax-ddl-create-table-datasource.html) + * [CREATE TABLE USING HIVE FORMAT](sql-ref-syntax-ddl-create-table-hiveformat.html) + * [CREATE TABLE LIKE](sql-ref-syntax-ddl-create-table-like.html) ### Related Statements -- [ALTER TABLE](sql-ref-syntax-ddl-alter-table.html) -- [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) + + * [ALTER TABLE](sql-ref-syntax-ddl-alter-table.html) + * [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) diff --git a/docs/sql-ref-syntax-ddl-create-view.md b/docs/sql-ref-syntax-ddl-create-view.md index 53b402771e421..5d9db2ab36b5d 100644 --- a/docs/sql-ref-syntax-ddl-create-view.md +++ b/docs/sql-ref-syntax-ddl-create-view.md @@ -20,17 +20,20 @@ license: | --- ### Description + Views are based on the result-set of an `SQL` query. `CREATE VIEW` constructs a virtual table that has no physical data therefore other operations like `ALTER VIEW` and `DROP VIEW` only change metadata. ### Syntax + {% highlight sql %} CREATE [ OR REPLACE ] [ [ GLOBAL ] TEMPORARY ] VIEW [ IF NOT EXISTS ] view_identifier create_view_clauses AS query {% endhighlight %} ### Parameters +
OR REPLACE
If a view of the same name already exists, it will be replaced.
@@ -71,6 +74,7 @@ CREATE [ OR REPLACE ] [ [ GLOBAL ] TEMPORARY ] VIEW [ IF NOT EXISTS ] view_ident
### Examples + {% highlight sql %} -- Create or replace view for `experienced_employee` with comments. CREATE OR REPLACE VIEW experienced_employee @@ -87,6 +91,7 @@ CREATE GLOBAL TEMPORARY VIEW IF NOT EXISTS subscribed_movies {% endhighlight %} ### Related Statements -- [ALTER VIEW](sql-ref-syntax-ddl-alter-view.html) -- [DROP VIEW](sql-ref-syntax-ddl-drop-view.html) -- [SHOW VIEWS](sql-ref-syntax-aux-show-views.html) + + * [ALTER VIEW](sql-ref-syntax-ddl-alter-view.html) + * [DROP VIEW](sql-ref-syntax-ddl-drop-view.html) + * [SHOW VIEWS](sql-ref-syntax-aux-show-views.html) diff --git a/docs/sql-ref-syntax-ddl-drop-database.md b/docs/sql-ref-syntax-ddl-drop-database.md index 0bdb98f2b129c..431139101eba4 100644 --- a/docs/sql-ref-syntax-ddl-drop-database.md +++ b/docs/sql-ref-syntax-ddl-drop-database.md @@ -30,7 +30,6 @@ exception will be thrown if the database does not exist in the system. DROP ( DATABASE | SCHEMA ) [ IF EXISTS ] dbname [ RESTRICT | CASCADE ] {% endhighlight %} - ### Parameters
@@ -54,27 +53,20 @@ DROP ( DATABASE | SCHEMA ) [ IF EXISTS ] dbname [ RESTRICT | CASCADE ]
### Example + {% highlight sql %} -- Create `inventory_db` Database CREATE DATABASE inventory_db COMMENT 'This database is used to maintain Inventory'; -- Drop the database and its tables DROP DATABASE inventory_db CASCADE; -+---------+ -| Result | -+---------+ -+---------+ -- Drop the database using IF EXISTS DROP DATABASE IF EXISTS inventory_db CASCADE; -+---------+ -| Result | -+---------+ -+---------+ - {% endhighlight %} -### Related statements -- [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) -- [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) -- [SHOW DATABASES](sql-ref-syntax-aux-show-databases.html) +### Related Statements + + * [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) + * [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html) + * [SHOW DATABASES](sql-ref-syntax-aux-show-databases.html) diff --git a/docs/sql-ref-syntax-ddl-drop-function.md b/docs/sql-ref-syntax-ddl-drop-function.md index 16d08d1ae8e99..f7ad18553e304 100644 --- a/docs/sql-ref-syntax-ddl-drop-function.md +++ b/docs/sql-ref-syntax-ddl-drop-function.md @@ -20,15 +20,16 @@ license: | --- ### Description + The `DROP FUNCTION` statement drops a temporary or user defined function (UDF). An exception will - be thrown if the function does not exist. +be thrown if the function does not exist. ### Syntax + {% highlight sql %} DROP [ TEMPORARY ] FUNCTION [ IF EXISTS ] [ db_name. ] function_name {% endhighlight %} - ### Parameters
@@ -47,36 +48,33 @@ DROP [ TEMPORARY ] FUNCTION [ IF EXISTS ] [ db_name. ] function_name
### Example + {% highlight sql %} -- Create a permanent function `test_avg` CREATE FUNCTION test_avg as 'org.apache.hadoop.hive.ql.udf.generic.GenericUDAFAverage'; -- List user functions SHOW USER FUNCTIONS; - +-------------------+ - | function | - +-------------------+ - | default.test_avg | - +-------------------+ + +----------------+ + | function| + +----------------+ + |default.test_avg| + +----------------+ -- Create Temporary function `test_avg` CREATE TEMPORARY FUNCTION test_avg as 'org.apache.hadoop.hive.ql.udf.generic.GenericUDAFAverage'; -- List user functions SHOW USER FUNCTIONS; - +-------------------+ - | function | - +-------------------+ - | default.test_avg | - | test_avg | - +-------------------+ + +----------------+ + | function| + +----------------+ + |default.test_avg| + | test_avg| + +----------------+ -- Drop Permanent function DROP FUNCTION test_avg; - +---------+ - | Result | - +---------+ - +---------+ -- Try to drop Permanent function which is not present DROP FUNCTION test_avg; @@ -86,20 +84,18 @@ DROP FUNCTION test_avg; -- List the functions after dropping, it should list only temporary function SHOW USER FUNCTIONS; - +-----------+ - | function | - +-----------+ - | test_avg | - +-----------+ + +--------+ + |function| + +--------+ + |test_avg| + +--------+ -- Drop Temporary function DROP TEMPORARY FUNCTION IF EXISTS test_avg; - +---------+ - | Result | - +---------+ - +---------+ {% endhighlight %} -### Related statements -- [CREATE FUNCTION](sql-ref-syntax-ddl-create-function.html) -- [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) -- [SHOW FUNCTION](sql-ref-syntax-aux-show-functions.html) + +### Related Statements + + * [CREATE FUNCTION](sql-ref-syntax-ddl-create-function.html) + * [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html) + * [SHOW FUNCTION](sql-ref-syntax-aux-show-functions.html) diff --git a/docs/sql-ref-syntax-ddl-drop-table.md b/docs/sql-ref-syntax-ddl-drop-table.md index 
d1d8534efe7a2..32a9cc7bb27db 100644 --- a/docs/sql-ref-syntax-ddl-drop-table.md +++ b/docs/sql-ref-syntax-ddl-drop-table.md @@ -27,11 +27,13 @@ if the table is not `EXTERNAL` table. If the table is not present it throws an e In case of an external table, only the associated metadata information is removed from the metastore database. ### Syntax + {% highlight sql %} DROP TABLE [ IF EXISTS ] table_identifier {% endhighlight %} ### Parameter +
IF EXISTS
@@ -48,40 +50,27 @@ DROP TABLE [ IF EXISTS ] table_identifier
### Example + {% highlight sql %} -- Assumes a table named `employeetable` exists. DROP TABLE employeetable; -+---------+--+ -| Result | -+---------+--+ -+---------+--+ -- Assumes a table named `employeetable` exists in the `userdb` database DROP TABLE userdb.employeetable; -+---------+--+ -| Result | -+---------+--+ -+---------+--+ -- Assumes a table named `employeetable` does not exist. -- Throws exception DROP TABLE employeetable; -Error: org.apache.spark.sql.AnalysisException: Table or view not found: employeetable; -(state=,code=0) + Error: org.apache.spark.sql.AnalysisException: Table or view not found: employeetable; + (state=,code=0) -- Assumes a table named `employeetable` does not exist. Try with IF EXISTS; -- this time it will not throw an exception DROP TABLE IF EXISTS employeetable; -+---------+--+ -| Result | -+---------+--+ -+---------+--+ - {% endhighlight %} ### Related Statements -- [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) -- [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) -- [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) - + * [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) + * [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) + * [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) diff --git a/docs/sql-ref-syntax-ddl-drop-view.md b/docs/sql-ref-syntax-ddl-drop-view.md index f313995022089..ae976c125f5f8 100644 --- a/docs/sql-ref-syntax-ddl-drop-view.md +++ b/docs/sql-ref-syntax-ddl-drop-view.md @@ -20,14 +20,17 @@ license: | --- ### Description + `DROP VIEW` removes the metadata associated with a specified view from the catalog. ### Syntax + {% highlight sql %} DROP VIEW [ IF EXISTS ] view_identifier {% endhighlight %} ### Parameter +
IF EXISTS
@@ -44,40 +47,29 @@ DROP VIEW [ IF EXISTS ] view_identifier
### Example + {% highlight sql %} -- Assumes a view named `employeeView` exists. DROP VIEW employeeView; -+---------+--+ -| Result | -+---------+--+ -+---------+--+ -- Assumes a view named `employeeView` exists in the `userdb` database DROP VIEW userdb.employeeView; -+---------+--+ -| Result | -+---------+--+ -+---------+--+ -- Assumes a view named `employeeView` does not exist. -- Throws exception DROP VIEW employeeView; -Error: org.apache.spark.sql.AnalysisException: Table or view not found: employeeView; -(state=,code=0) + Error: org.apache.spark.sql.AnalysisException: Table or view not found: employeeView; + (state=,code=0) -- Assumes a view named `employeeView` does not exist. Try with IF EXISTS; -- this time it will not throw an exception DROP VIEW IF EXISTS employeeView; -+---------+--+ -| Result | -+---------+--+ -+---------+--+ - {% endhighlight %} ### Related Statements -- [CREATE VIEW](sql-ref-syntax-ddl-create-view.html) -- [ALTER VIEW](sql-ref-syntax-ddl-alter-view.html) -- [SHOW VIEWS](sql-ref-syntax-aux-show-views.html) -- [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) -- [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) + + * [CREATE VIEW](sql-ref-syntax-ddl-create-view.html) + * [ALTER VIEW](sql-ref-syntax-ddl-alter-view.html) + * [SHOW VIEWS](sql-ref-syntax-aux-show-views.html) + * [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) + * [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) diff --git a/docs/sql-ref-syntax-ddl-repair-table.md b/docs/sql-ref-syntax-ddl-repair-table.md index daa6a46fca58f..499b2bff54d8d 100644 --- a/docs/sql-ref-syntax-ddl-repair-table.md +++ b/docs/sql-ref-syntax-ddl-repair-table.md @@ -20,14 +20,17 @@ license: | --- ### Description + `MSCK REPAIR TABLE` recovers all the partitions in the directory of a table and updates the Hive metastore. When creating a table using the `PARTITIONED BY` clause, partitions are generated and registered in the Hive metastore.
However, if the partitioned table is created from existing data, partitions are not registered automatically in the Hive metastore. The user needs to run `MSCK REPAIR TABLE` to register the partitions. `MSCK REPAIR TABLE` on a non-existent table or a table without partitions throws an exception. Another way to recover partitions is to use `ALTER TABLE RECOVER PARTITIONS`. ### Syntax + {% highlight sql %} MSCK REPAIR TABLE table_identifier {% endhighlight %} ### Parameters +
table_identifier
@@ -40,30 +43,31 @@ MSCK REPAIR TABLE table_identifier
### Examples -{% highlight sql %} - -- create a partitioned table from existing data /tmp/namesAndAges.parquet - CREATE TABLE t1 (name STRING, age INT) USING parquet PARTITIONED BY (age) - location "/tmp/namesAndAges.parquet"; - -- SELECT * FROM t1 does not return results - SELECT * FROM t1; - - -- run MSCK REPAIR TABLE to recovers all the partitions - MSCK REPAIR TABLE t1; +{% highlight sql %} +-- create a partitioned table from existing data /tmp/namesAndAges.parquet +CREATE TABLE t1 (name STRING, age INT) USING parquet PARTITIONED BY (age) + LOCATION "/tmp/namesAndAges.parquet"; - -- SELECT * FROM t1 returns results - SELECT * FROM t1; +-- SELECT * FROM t1 does not return results +SELECT * FROM t1; - + -------------- + ------+ - | name | age | - + -------------- + ------+ - | Michael | 20 | - + -------------- + ------+ - | Justin | 19 | - + -------------- + ----- + - | Andy | 30 | - + -------------- + ----- + +-- run MSCK REPAIR TABLE to recover all the partitions +MSCK REPAIR TABLE t1; +-- SELECT * FROM t1 returns results +SELECT * FROM t1; + +-------+---+ + | name|age| + +-------+---+ + |Michael| 20| + +-------+---+ + | Justin| 19| + +-------+---+ + | Andy| 30| + +-------+---+ {% endhighlight %} + ### Related Statements + * [ALTER TABLE](sql-ref-syntax-ddl-alter-table.html) diff --git a/docs/sql-ref-syntax-ddl-truncate-table.md b/docs/sql-ref-syntax-ddl-truncate-table.md index 3a0569e809d84..6377e83570207 100644 --- a/docs/sql-ref-syntax-ddl-truncate-table.md +++ b/docs/sql-ref-syntax-ddl-truncate-table.md @@ -20,16 +20,19 @@ license: | --- ### Description + The `TRUNCATE TABLE` statement removes all the rows from a table or partition(s). The table must not be a view or an external/temporary table. In order to truncate multiple partitions at once, the user can specify the partitions in `partition_spec`. If no `partition_spec` is specified it will remove all partitions in the table.
### Syntax + {% highlight sql %} TRUNCATE TABLE table_identifier [ partition_spec ] {% endhighlight %} ### Parameters +
table_identifier
@@ -52,47 +55,43 @@ TRUNCATE TABLE table_identifier [ partition_spec ]
- ### Examples -{% highlight sql %} ---Create table Student with partition -CREATE TABLE Student ( name String, rollno INT) PARTITIONED BY (age int); +{% highlight sql %} +-- Create table Student with partition +CREATE TABLE Student (name STRING, rollno INT) PARTITIONED BY (age INT); SELECT * from Student; -+-------+---------+------+--+ -| name | rollno | age | -+-------+---------+------+--+ -| ABC | 1 | 10 | -| DEF | 2 | 10 | -| XYZ | 3 | 12 | -+-------+---------+------+--+ + +----+------+---+ + |name|rollno|age| + +----+------+---+ + | ABC| 1| 10| + | DEF| 2| 10| + | XYZ| 3| 12| + +----+------+---+ -- Removes all rows from the table in the partition specified TRUNCATE TABLE Student partition(age=10); ---After truncate execution, records belonging to partition age=10 are removed +-- After truncate execution, records belonging to partition age=10 are removed SELECT * from Student; -+-------+---------+------+--+ -| name | rollno | age | -+-------+---------+------+--+ -| XYZ | 3 | 12 | -+-------+---------+------+--+ + +----+------+---+ + |name|rollno|age| + +----+------+---+ + | XYZ| 3| 12| + +----+------+---+ -- Removes all rows from the table from all partitions TRUNCATE TABLE Student; SELECT * from Student; -+-------+---------+------+--+ -| name | rollno | age | -+-------+---------+------+--+ -+-------+---------+------+--+ -No rows selected - + +----+------+---+ + |name|rollno|age| + +----+------+---+ + +----+------+---+ {% endhighlight %} - ### Related Statements -- [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) -- [ALTER TABLE](sql-ref-syntax-ddl-alter-table.html) + * [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) + * [ALTER TABLE](sql-ref-syntax-ddl-alter-table.html) diff --git a/docs/sql-ref-syntax-ddl.md b/docs/sql-ref-syntax-ddl.md index ab4e95a1539ff..82fbf0498a20f 100644 --- a/docs/sql-ref-syntax-ddl.md +++ b/docs/sql-ref-syntax-ddl.md @@ -19,21 +19,19 @@ license: | limitations under the License. 
--- - Data Definition Statements are used to create or modify the structure of database objects in a database. Spark SQL supports the following Data Definition Statements: - -- [ALTER DATABASE](sql-ref-syntax-ddl-alter-database.html) -- [ALTER TABLE](sql-ref-syntax-ddl-alter-table.html) -- [ALTER VIEW](sql-ref-syntax-ddl-alter-view.html) -- [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) -- [CREATE FUNCTION](sql-ref-syntax-ddl-create-function.html) -- [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) -- [CREATE VIEW](sql-ref-syntax-ddl-create-view.html) -- [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) -- [DROP FUNCTION](sql-ref-syntax-ddl-drop-function.html) -- [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) -- [DROP VIEW](sql-ref-syntax-ddl-drop-view.html) -- [TRUNCATE TABLE](sql-ref-syntax-ddl-truncate-table.html) -- [REPAIR TABLE](sql-ref-syntax-ddl-repair-table.html) -- [USE DATABASE](sql-ref-syntax-qry-select-usedb.html) + * [ALTER DATABASE](sql-ref-syntax-ddl-alter-database.html) + * [ALTER TABLE](sql-ref-syntax-ddl-alter-table.html) + * [ALTER VIEW](sql-ref-syntax-ddl-alter-view.html) + * [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) + * [CREATE FUNCTION](sql-ref-syntax-ddl-create-function.html) + * [CREATE TABLE](sql-ref-syntax-ddl-create-table.html) + * [CREATE VIEW](sql-ref-syntax-ddl-create-view.html) + * [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) + * [DROP FUNCTION](sql-ref-syntax-ddl-drop-function.html) + * [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) + * [DROP VIEW](sql-ref-syntax-ddl-drop-view.html) + * [TRUNCATE TABLE](sql-ref-syntax-ddl-truncate-table.html) + * [REPAIR TABLE](sql-ref-syntax-ddl-repair-table.html) + * [USE DATABASE](sql-ref-syntax-qry-select-usedb.html) diff --git a/docs/sql-ref-syntax-dml-insert-into.md b/docs/sql-ref-syntax-dml-insert-into.md index 715f43c9b80ea..ba65334ef8f61 100644 --- a/docs/sql-ref-syntax-dml-insert-into.md +++ b/docs/sql-ref-syntax-dml-insert-into.md @@ 
-24,12 +24,14 @@ license: | The `INSERT INTO` statement inserts new rows into a table. The inserted rows can be specified by value expressions or result from a query. ### Syntax + {% highlight sql %} INSERT INTO [ TABLE ] table_identifier [ partition_spec ] { { VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] } | query } {% endhighlight %} ### Parameters +
table_identifier
@@ -70,149 +72,148 @@ INSERT INTO [ TABLE ] table_identifier [ partition_spec ]
### Examples -#### Single Row Insert Using a VALUES Clause -{% highlight sql %} - CREATE TABLE students (name VARCHAR(64), address VARCHAR(64), student_id INT) - USING PARQUET PARTITIONED BY (student_id); - - INSERT INTO students - VALUES ('Amy Smith', '123 Park Ave, San Jose', 111111); - SELECT * FROM students; +#### Single Row Insert Using a VALUES Clause - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Amy Smith | 123 Park Ave, San Jose | 111111 | - + -------------- + ------------------------------ + -------------- + +{% highlight sql %} +CREATE TABLE students (name VARCHAR(64), address VARCHAR(64), student_id INT) + USING PARQUET PARTITIONED BY (student_id); + +INSERT INTO students VALUES + ('Amy Smith', '123 Park Ave, San Jose', 111111); + +SELECT * FROM students; + +---------+---------------------+----------+ + | name| address|student_id| + +---------+---------------------+----------+ + |Amy Smith|123 Park Ave,San Jose| 111111| + +---------+---------------------+----------+ {% endhighlight %} #### Multi-Row Insert Using a VALUES Clause + {% highlight sql %} - INSERT INTO students - VALUES ('Bob Brown', '456 Taylor St, Cupertino', 222222), - ('Cathy Johnson', '789 Race Ave, Palo Alto', 333333); - - SELECT * FROM students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Amy Smith | 123 Park Ave, San Jose | 111111 | - + -------------- + ------------------------------ + -------------- + - | Bob Brown | 456 Taylor St, Cupertino | 222222 | - + -------------- + ------------------------------ + -------------- + - | Cathy Johnson | 789 Race Ave, Palo Alto | 333333 | - + -------------- + ------------------------------ + -------------- + +INSERT INTO students VALUES + ('Bob Brown', '456 Taylor St, Cupertino', 
222222), + ('Cathy Johnson', '789 Race Ave, Palo Alto', 333333); + +SELECT * FROM students; + +-------------+------------------------+----------+ + | name| address|student_id| + +-------------+------------------------+----------+ + | Amy Smith| 123 Park Ave, San Jose| 111111| + +-------------+------------------------+----------+ + | Bob Brown|456 Taylor St, Cupertino| 222222| + +-------------+------------------------+----------+ + |Cathy Johnson| 789 Race Ave, Palo Alto| 333333| + +--------------+-----------------------+----------+ {% endhighlight %} #### Insert Using a SELECT Statement + {% highlight sql %} - -- Assuming the persons table has already been created and populated. - SELECT * FROM persons; - - + -------------- + ------------------------------ + -------------- + - | name | address | ssn | - + -------------- + ------------------------------ + -------------- + - | Dora Williams | 134 Forest Ave, Melo Park | 123456789 | - + -------------- + ------------------------------ + -------------- + - | Eddie Davis | 245 Market St, Milpitas | 345678901 | - + -------------- + ------------------------------ + ---------------+ - - INSERT INTO students PARTITION (student_id = 444444) - SELECT name, address FROM persons WHERE name = "Dora Williams"; - - SELECT * FROM students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Amy Smith | 123 Park Ave, San Jose | 111111 | - + -------------- + ------------------------------ + -------------- + - | Bob Brown | 456 Taylor St, Cupertino | 222222 | - + -------------- + ------------------------------ + -------------- + - | Cathy Johnson | 789 Race Ave, Palo Alto | 333333 | - + -------------- + ------------------------------ + -------------- + - | Dora Williams | 134 Forest Ave, Melo Park | 444444 | - + -------------- + ------------------------------ + -------------- + +-- Assuming the persons table 
has already been created and populated. +SELECT * FROM persons; + +-------------+-------------------------+---------+ + | name| address| ssn| + +-------------+-------------------------+---------+ + |Dora Williams|134 Forest Ave, Melo Park|123456789| + +-------------+-------------------------+---------+ + | Eddie Davis| 245 Market St, Milpitas|345678901| + +-------------+-------------------------+---------+ + +INSERT INTO students PARTITION (student_id = 444444) + SELECT name, address FROM persons WHERE name = "Dora Williams"; + +SELECT * FROM students; + +-------------+-------------------------+----------+ + | name| address|student_id| + +-------------+-------------------------+----------+ + | Amy Smith| 123 Park Ave, San Jose| 111111| + +-------------+-------------------------+----------+ + | Bob Brown| 456 Taylor St, Cupertino| 222222| + +-------------+-------------------------+----------+ + |Cathy Johnson| 789 Race Ave, Palo Alto| 333333| + +-------------+-------------------------+----------+ + |Dora Williams|134 Forest Ave, Melo Park| 444444| + +-------------+-------------------------+----------+ {% endhighlight %} #### Insert Using a TABLE Statement + {% highlight sql %} - -- Assuming the visiting_students table has already been created and populated. 
- SELECT * FROM visiting_students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Fleur Laurent | 345 Copper St, London | 777777 | - + -------------- + ------------------------------ + -------------- + - | Gordon Martin | 779 Lake Ave, Oxford | 888888 | - + -------------- + ------------------------------ + -------------- + - - INSERT INTO students TABLE visiting_students; - - SELECT * FROM students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Amy Smith | 123 Park Ave, San Jose | 111111 | - + -------------- + ------------------------------ + -------------- + - | Bob Brown | 456 Taylor St, Cupertino | 222222 | - + -------------- + ------------------------------ + -------------- + - | Cathy Johnson | 789 Race Ave, Palo Alto | 333333 | - + -------------- + ------------------------------ + -------------- + - | Dora Williams | 134 Forest Ave, Melo Park | 444444 | - + -------------- + ------------------------------ + -------------- + - | Fleur Laurent | 345 Copper St, London | 777777 | - + -------------- + ------------------------------ + -------------- + - | Gordon Martin | 779 Lake Ave, Oxford | 888888 | - + -------------- + ------------------------------ + -------------- + +-- Assuming the visiting_students table has already been created and populated. 
+SELECT * FROM visiting_students; + +-------------+---------------------+----------+ + | name| address|student_id| + +-------------+---------------------+----------+ + |Fleur Laurent|345 Copper St, London| 777777| + +-------------+---------------------+----------+ + |Gordon Martin| 779 Lake Ave, Oxford| 888888| + +-------------+---------------------+----------+ + +INSERT INTO students TABLE visiting_students; + +SELECT * FROM students; + +-------------+-------------------------+----------+ + | name| address|student_id| + +-------------+-------------------------+----------+ + | Amy Smith| 123 Park Ave,San Jose| 111111| + +-------------+-------------------------+----------+ + | Bob Brown| 456 Taylor St, Cupertino| 222222| + +-------------+-------------------------+----------+ + |Cathy Johnson| 789 Race Ave, Palo Alto| 333333| + +-------------+-------------------------+----------+ + |Dora Williams|134 Forest Ave, Melo Park| 444444| + +-------------+-------------------------+----------+ + |Fleur Laurent| 345 Copper St, London| 777777| + +-------------+-------------------------+----------+ + |Gordon Martin| 779 Lake Ave, Oxford| 888888| + +-------------+-------------------------+----------+ {% endhighlight %} #### Insert Using a FROM Statement + {% highlight sql %} - -- Assuming the applicants table has already been created and populated. 
- SELECT * FROM applicants; - - + -------------- + ------------------------------ + -------------- + -------------- + - | name | address | student_id | qualified | - + -------------- + ------------------------------ + -------------- + -------------- + - | Helen Davis | 469 Mission St, San Diego | 999999 | true | - + -------------- + ------------------------------ + -------------- + -------------- + - | Ivy King | 367 Leigh Ave, Santa Clara | 101010 | false | - + -------------- + ------------------------------ + -------------- + -------------- + - | Jason Wang | 908 Bird St, Saratoga | 121212 | true | - + -------------- + ------------------------------ + -------------- + -------------- + - - INSERT INTO students - FROM applicants SELECT name, address, id applicants WHERE qualified = true; - - SELECT * FROM students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Amy Smith | 123 Park Ave, San Jose | 111111 | - + -------------- + ------------------------------ + -------------- + - | Bob Brown | 456 Taylor St, Cupertino | 222222 | - + -------------- + ------------------------------ + -------------- + - | Cathy Johnson | 789 Race Ave, Palo Alto | 333333 | - + -------------- + ------------------------------ + -------------- + - | Dora Williams | 134 Forest Ave, Melo Park | 444444 | - + -------------- + ------------------------------ + -------------- + - | Fleur Laurent | 345 Copper St, London | 777777 | - + -------------- + ------------------------------ + -------------- + - | Gordon Martin | 779 Lake Ave, Oxford | 888888 | - + -------------- + ------------------------------ + -------------- + - | Helen Davis | 469 Mission St, San Diego | 999999 | - + -------------- + ------------------------------ + -------------- + - | Jason Wang | 908 Bird St, Saratoga | 121212 | - + -------------- + ------------------------------ + -------------- + 
+-- Assuming the applicants table has already been created and populated. +SELECT * FROM applicants; + +-----------+--------------------------+----------+---------+ + | name| address|student_id|qualified| + +-----------+--------------------------+----------+---------+ + |Helen Davis| 469 Mission St, San Diego| 999999| true| + +-----------+--------------------------+----------+---------+ + | Ivy King|367 Leigh Ave, Santa Clara| 101010| false| + +-----------+--------------------------+----------+---------+ + | Jason Wang| 908 Bird St, Saratoga| 121212| true| + +-----------+--------------------------+----------+---------+ + +INSERT INTO students + FROM applicants SELECT name, address, student_id WHERE qualified = true; + +SELECT * FROM students; + +-------------+-------------------------+----------+ + | name| address|student_id| + +-------------+-------------------------+----------+ + | Amy Smith| 123 Park Ave, San Jose| 111111| + +-------------+-------------------------+----------+ + | Bob Brown| 456 Taylor St, Cupertino| 222222| + +-------------+-------------------------+----------+ + |Cathy Johnson| 789 Race Ave, Palo Alto| 333333| + +-------------+-------------------------+----------+ + |Dora Williams|134 Forest Ave, Melo Park| 444444| + +-------------+-------------------------+----------+ + |Fleur Laurent| 345 Copper St, London| 777777| + +-------------+-------------------------+----------+ + |Gordon Martin| 779 Lake Ave, Oxford| 888888| + +-------------+-------------------------+----------+ + | Helen Davis|469 Mission St, San Diego| 999999| + +-------------+-------------------------+----------+ + | Jason Wang| 908 Bird St, Saratoga| 121212| + +-------------+-------------------------+----------+ {% endhighlight %} ### Related Statements - * [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html) - * [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html) - * [INSERT OVERWRITE DIRECTORY with Hive format
statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html) + + * [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html) + * [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html) + * [INSERT OVERWRITE DIRECTORY with Hive format statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html) diff --git a/docs/sql-ref-syntax-dml-insert-overwrite-directory-hive.md b/docs/sql-ref-syntax-dml-insert-overwrite-directory-hive.md index 3b0475aef1015..3ab0994cf06e7 100644 --- a/docs/sql-ref-syntax-dml-insert-overwrite-directory-hive.md +++ b/docs/sql-ref-syntax-dml-insert-overwrite-directory-hive.md @@ -20,17 +20,20 @@ license: | --- ### Description + The `INSERT OVERWRITE DIRECTORY` with Hive format overwrites the existing data in the directory with the new values using Hive `SerDe`. Hive support must be enabled to use this command. The inserted rows can be specified by value expressions or result from a query. ### Syntax + {% highlight sql %} INSERT OVERWRITE [ LOCAL ] DIRECTORY directory_path - [ ROW FORMAT row_format ] [ STORED AS file_format ] - { { VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] } | query } + [ ROW FORMAT row_format ] [ STORED AS file_format ] + { { VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] } | query } {% endhighlight %} ### Parameters +
directory_path
@@ -71,17 +74,19 @@ INSERT OVERWRITE [ LOCAL ] DIRECTORY directory_path
### Examples + {% highlight sql %} - INSERT OVERWRITE LOCAL DIRECTORY '/tmp/destination' - STORED AS orc - SELECT * FROM test_table; +INSERT OVERWRITE LOCAL DIRECTORY '/tmp/destination' + STORED AS orc + SELECT * FROM test_table; - INSERT OVERWRITE LOCAL DIRECTORY '/tmp/destination' - ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' - SELECT * FROM test_table; +INSERT OVERWRITE LOCAL DIRECTORY '/tmp/destination' + ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' + SELECT * FROM test_table; {% endhighlight %} ### Related Statements - * [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html) - * [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html) - * [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html) + + * [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html) + * [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html) + * [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html) diff --git a/docs/sql-ref-syntax-dml-insert-overwrite-directory.md b/docs/sql-ref-syntax-dml-insert-overwrite-directory.md index 7f3224de0ccee..645396620f21a 100644 --- a/docs/sql-ref-syntax-dml-insert-overwrite-directory.md +++ b/docs/sql-ref-syntax-dml-insert-overwrite-directory.md @@ -18,10 +18,13 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description + The `INSERT OVERWRITE DIRECTORY` statement overwrites the existing data in the directory with the new values using a given Spark file format. The inserted rows can be specified by value expressions or result from a query. ### Syntax + {% highlight sql %} INSERT OVERWRITE [ LOCAL ] DIRECTORY [ directory_path ] USING file_format [ OPTIONS ( key = val [ , ... ] ) ] @@ -29,6 +32,7 @@ INSERT OVERWRITE [ LOCAL ] DIRECTORY [ directory_path ] {% endhighlight %} ### Parameters +
directory_path
@@ -67,6 +71,7 @@ INSERT OVERWRITE [ LOCAL ] DIRECTORY [ directory_path ]
### Examples + {% highlight sql %} INSERT OVERWRITE DIRECTORY '/tmp/destination' USING parquet @@ -80,6 +85,7 @@ INSERT OVERWRITE DIRECTORY {% endhighlight %} ### Related Statements - * [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html) - * [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html) - * [INSERT OVERWRITE DIRECTORY with Hive format statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html) + + * [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html) + * [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html) + * [INSERT OVERWRITE DIRECTORY with Hive format statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html) diff --git a/docs/sql-ref-syntax-dml-insert-overwrite-table.md b/docs/sql-ref-syntax-dml-insert-overwrite-table.md index 2318a8b622202..added8e1976be 100644 --- a/docs/sql-ref-syntax-dml-insert-overwrite-table.md +++ b/docs/sql-ref-syntax-dml-insert-overwrite-table.md @@ -24,12 +24,14 @@ license: | The `INSERT OVERWRITE` statement overwrites the existing data in the table using the new values. The inserted rows can be specified by value expressions or result from a query. ### Syntax + {% highlight sql %} INSERT OVERWRITE [ TABLE ] table_identifier [ partition_spec [ IF NOT EXISTS ] ] { { VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] } | query } {% endhighlight %} ### Parameters +
table_identifier
@@ -70,129 +72,120 @@ INSERT OVERWRITE [ TABLE ] table_identifier [ partition_spec [ IF NOT EXISTS ] ]
### Examples + #### Insert Using a VALUES Clause + {% highlight sql %} - -- Assuming the students table has already been created and populated. - SELECT * FROM students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Amy Smith | 123 Park Ave, San Jose | 111111 | - + -------------- + ------------------------------ + -------------- + - | Bob Brown | 456 Taylor St, Cupertino | 222222 | - + -------------- + ------------------------------ + -------------- + - | Cathy Johnson | 789 Race Ave, Palo Alto | 333333 | - + -------------- + ------------------------------ + -------------- + - | Dora Williams | 134 Forest Ave, Melo Park | 444444 | - + -------------- + ------------------------------ + -------------- + - | Fleur Laurent | 345 Copper St, London | 777777 | - + -------------- + ------------------------------ + -------------- + - | Gordon Martin | 779 Lake Ave, Oxford | 888888 | - + -------------- + ------------------------------ + -------------- + - | Helen Davis | 469 Mission St, San Diego | 999999 | - + -------------- + ------------------------------ + -------------- + - | Jason Wang | 908 Bird St, Saratoga | 121212 | - + -------------- + ------------------------------ + -------------- + - - INSERT OVERWRITE students - VALUES ('Ashua Hill', '456 Erica Ct, Cupertino', 111111), - ('Brian Reed', '723 Kern Ave, Palo Alto', 222222); - - SELECT * FROM students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Ashua Hill | 456 Erica Ct, Cupertino | 111111 | - + -------------- + ------------------------------ + -------------- + - | Brian Reed | 723 Kern Ave, Palo Alto | 222222 | - + -------------- + ------------------------------ + -------------- + +-- Assuming the students table has already been created and 
populated. +SELECT * FROM students; + +-------------+-------------------------+----------+ + | name| address|student_id| + +-------------+-------------------------+----------+ + | Amy Smith| 123 Park Ave, San Jose| 111111| + | Bob Brown| 456 Taylor St, Cupertino| 222222| + |Cathy Johnson| 789 Race Ave, Palo Alto| 333333| + |Dora Williams|134 Forest Ave, Melo Park| 444444| + |Fleur Laurent| 345 Copper St, London| 777777| + |Gordon Martin| 779 Lake Ave, Oxford| 888888| + | Helen Davis|469 Mission St, San Diego| 999999| + | Jason Wang| 908 Bird St, Saratoga| 121212| + +-------------+-------------------------+----------+ + +INSERT OVERWRITE students VALUES + ('Ashua Hill', '456 Erica Ct, Cupertino', 111111), + ('Brian Reed', '723 Kern Ave, Palo Alto', 222222); + +SELECT * FROM students; + +----------+-----------------------+----------+ + | name| address|student_id| + +----------+-----------------------+----------+ + |Ashua Hill|456 Erica Ct, Cupertino| 111111| + |Brian Reed|723 Kern Ave, Palo Alto| 222222| + +----------+-----------------------+----------+ + {% endhighlight %} #### Insert Using a SELECT Statement + {% highlight sql %} - -- Assuming the persons table has already been created and populated. 
- SELECT * FROM persons; - - + -------------- + ------------------------------ + -------------- + - | name | address | ssn | - + -------------- + ------------------------------ + -------------- + - | Dora Williams | 134 Forest Ave, Melo Park | 123456789 | - + -------------- + ------------------------------ + -------------- + - | Eddie Davis | 245 Market St, Milpitas | 345678901 | - + -------------- + ------------------------------ + ---------------+ - - INSERT OVERWRITE students PARTITION (student_id = 222222) - SELECT name, address FROM persons WHERE name = "Dora Williams"; - - SELECT * FROM students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Ashua Hill | 456 Erica Ct, Cupertino | 111111 | - + -------------- + ------------------------------ + -------------- + - | Dora Williams | 134 Forest Ave, Melo Park | 222222 | - + -------------- + ------------------------------ + -------------- + +-- Assuming the persons table has already been created and populated. 
+SELECT * FROM persons; + +-------------+-------------------------+---------+ + | name| address| ssn| + +-------------+-------------------------+---------+ + |Dora Williams|134 Forest Ave, Melo Park|123456789| + +-------------+-------------------------+---------+ + | Eddie Davis| 245 Market St, Milpitas|345678901| + +-------------+-------------------------+---------+ + +INSERT OVERWRITE students PARTITION (student_id = 222222) + SELECT name, address FROM persons WHERE name = "Dora Williams"; + +SELECT * FROM students; + +-------------+-------------------------+----------+ + | name| address|student_id| + +-------------+-------------------------+----------+ + | Ashua Hill| 456 Erica Ct, Cupertino| 111111| + +-------------+-------------------------+----------+ + |Dora Williams|134 Forest Ave, Melo Park| 222222| + +-------------+-------------------------+----------+ {% endhighlight %} #### Insert Using a TABLE Statement + {% highlight sql %} - -- Assuming the visiting_students table has already been created and populated. 
- SELECT * FROM visiting_students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Fleur Laurent | 345 Copper St, London | 777777 | - + -------------- + ------------------------------ + -------------- + - | Gordon Martin | 779 Lake Ave, Oxford | 888888 | - + -------------- + ------------------------------ + -------------- + - - INSERT OVERWRITE students TABLE visiting_students; - - SELECT * FROM students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Fleur Laurent | 345 Copper St, London | 777777 | - + -------------- + ------------------------------ + -------------- + - | Gordon Martin | 779 Lake Ave, Oxford | 888888 | - + -------------- + ------------------------------ + -------------- + +-- Assuming the visiting_students table has already been created and populated. +SELECT * FROM visiting_students; + +-------------+---------------------+----------+ + | name| address|student_id| + +-------------+---------------------+----------+ + |Fleur Laurent|345 Copper St, London| 777777| + +-------------+---------------------+----------+ + |Gordon Martin| 779 Lake Ave, Oxford| 888888| + +-------------+---------------------+----------+ + +INSERT OVERWRITE students TABLE visiting_students; + +SELECT * FROM students; + +-------------+---------------------+----------+ + | name| address|student_id| + +-------------+---------------------+----------+ + |Fleur Laurent|345 Copper St, London| 777777| + +-------------+---------------------+----------+ + |Gordon Martin| 779 Lake Ave, Oxford| 888888| + +-------------+---------------------+----------+ {% endhighlight %} #### Insert Using a FROM Statement + {% highlight sql %} - -- Assuming the applicants table has already been created and populated. 
- SELECT * FROM applicants; - - + -------------- + ------------------------------ + -------------- + -------------- + - | name | address | student_id | qualified | - + -------------- + ------------------------------ + -------------- + -------------- + - | Helen Davis | 469 Mission St, San Diego | 999999 | true | - + -------------- + ------------------------------ + -------------- + -------------- + - | Ivy King | 367 Leigh Ave, Santa Clara | 101010 | false | - + -------------- + ------------------------------ + -------------- + -------------- + - | Jason Wang | 908 Bird St, Saratoga | 121212 | true | - + -------------- + ------------------------------ + -------------- + -------------- + - - INSERT OVERWRITE students - FROM applicants SELECT name, address, id applicants WHERE qualified = true; - - SELECT * FROM students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Helen Davis | 469 Mission St, San Diego | 999999 | - + -------------- + ------------------------------ + -------------- + - | Jason Wang | 908 Bird St, Saratoga | 121212 | - + -------------- + ------------------------------ + -------------- + +-- Assuming the applicants table has already been created and populated. 
+SELECT * FROM applicants; + +-----------+--------------------------+----------+---------+ + | name| address|student_id|qualified| + +-----------+--------------------------+----------+---------+ + |Helen Davis| 469 Mission St, San Diego| 999999| true| + +-----------+--------------------------+----------+---------+ + | Ivy King|367 Leigh Ave, Santa Clara| 101010| false| + +-----------+--------------------------+----------+---------+ + | Jason Wang| 908 Bird St, Saratoga| 121212| true| + +-----------+--------------------------+----------+---------+ + +INSERT OVERWRITE students + FROM applicants SELECT name, address, student_id WHERE qualified = true; + +SELECT * FROM students; + +-----------+-------------------------+----------+ + | name| address|student_id| + +-----------+-------------------------+----------+ + |Helen Davis|469 Mission St, San Diego| 999999| + +-----------+-------------------------+----------+ + | Jason Wang| 908 Bird St, Saratoga| 121212| + +-----------+-------------------------+----------+ {% endhighlight %} ### Related Statements - * [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html) - * [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html) - * [INSERT OVERWRITE DIRECTORY with Hive format statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html) + + * [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html) + * [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html) + * [INSERT OVERWRITE DIRECTORY with Hive format statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html) diff --git a/docs/sql-ref-syntax-dml-insert.md b/docs/sql-ref-syntax-dml-insert.md index 15a2e28896943..2345add2460c8 100644 --- a/docs/sql-ref-syntax-dml-insert.md +++ b/docs/sql-ref-syntax-dml-insert.md @@ -20,7 +20,8 @@ license: | --- The INSERT statements: -* [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html) -* [INSERT OVERWRITE
statement](sql-ref-syntax-dml-insert-overwrite-table.html) -* [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html) -* [INSERT OVERWRITE DIRECTORY with Hive format statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html) + + * [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html) + * [INSERT OVERWRITE statement](sql-ref-syntax-dml-insert-overwrite-table.html) + * [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html) + * [INSERT OVERWRITE DIRECTORY with Hive format statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html) diff --git a/docs/sql-ref-syntax-dml-load.md b/docs/sql-ref-syntax-dml-load.md index 090c49dc5d082..9a9bf230e3101 100644 --- a/docs/sql-ref-syntax-dml-load.md +++ b/docs/sql-ref-syntax-dml-load.md @@ -20,14 +20,17 @@ license: | --- ### Description + `LOAD DATA` statement loads the data into a Hive serde table from the user specified directory or file. If a directory is specified then all the files from the directory are loaded. If a file is specified then only the single file is loaded. Additionally the `LOAD DATA` statement takes an optional partition specification. When a partition is specified, the data files (when input source is a directory) or the single file (when input source is a file) are loaded into the partition of the target table. ### Syntax + {% highlight sql %} LOAD DATA [ LOCAL ] INPATH path [ OVERWRITE ] INTO TABLE table_identifier [ partition_spec ] {% endhighlight %} ### Parameters +
path
Path of the file system. It can be either an absolute or a relative path.
@@ -67,65 +70,57 @@ LOAD DATA [ LOCAL ] INPATH path [ OVERWRITE ] INTO TABLE table_identifier [ part
### Examples -{% highlight sql %} - -- Example without partition specification. - -- Assuming the students table has already been created and populated. - SELECT * FROM students; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Amy Smith | 123 Park Ave, San Jose | 111111 | - + -------------- + ------------------------------ + -------------- + - - CREATE TABLE test_load (name VARCHAR(64), address VARCHAR(64), student_id INT) USING HIVE; - - -- Assuming the students table is in '/user/hive/warehouse/' - LOAD DATA LOCAL INPATH '/user/hive/warehouse/students' OVERWRITE INTO TABLE test_load; - - SELECT * FROM test_load; - - + -------------- + ------------------------------ + -------------- + - | name | address | student_id | - + -------------- + ------------------------------ + -------------- + - | Amy Smith | 123 Park Ave, San Jose | 111111 | - + -------------- + ------------------------------ + -------------- + - - -- Example with partition specification. 
- CREATE TABLE test_partition (c1 INT, c2 INT, c3 INT) PARTITIONED BY (c2, c3); - - INSERT INTO test_partition PARTITION (c2 = 2, c3 = 3) VALUES (1); - - INSERT INTO test_partition PARTITION (c2 = 5, c3 = 6) VALUES (4); - - INSERT INTO test_partition PARTITION (c2 = 8, c3 = 9) VALUES (7); - - SELECT * FROM test_partition; - - + ------- + ------- + ----- + - | c1 | c2 | c3 | - + ------- + --------------- + - | 1 | 2 | 3 | - + ------- + ------- + ----- + - | 4 | 5 | 6 | - + ------- + ------- + ----- + - | 7 | 8 | 9 | - + ------- + ------- + ----- + - - CREATE TABLE test_load_partition (c1 INT, c2 INT, c3 INT) USING HIVE PARTITIONED BY (c2, c3); - - -- Assuming the test_partition table is in '/user/hive/warehouse/' - LOAD DATA LOCAL INPATH '/user/hive/warehouse/test_partition/c2=2/c3=3' - OVERWRITE INTO TABLE test_load_partition PARTITION (c2=2, c3=3); - - SELECT * FROM test_load_partition; - - + ------- + ------- + ----- + - | c1 | c2 | c3 | - + ------- + --------------- + - | 1 | 2 | 3 | - + ------- + ------- + ----- + - +{% highlight sql %} +-- Example without partition specification. +-- Assuming the students table has already been created and populated. +SELECT * FROM students; + +---------+----------------------+----------+ + | name| address|student_id| + +---------+----------------------+----------+ + |Amy Smith|123 Park Ave, San Jose| 111111| + +---------+----------------------+----------+ + +CREATE TABLE test_load (name VARCHAR(64), address VARCHAR(64), student_id INT) USING HIVE; + +-- Assuming the students table is in '/user/hive/warehouse/' +LOAD DATA LOCAL INPATH '/user/hive/warehouse/students' OVERWRITE INTO TABLE test_load; + +SELECT * FROM test_load; + +---------+----------------------+----------+ + | name| address|student_id| + +---------+----------------------+----------+ + |Amy Smith|123 Park Ave, San Jose| 111111| + +---------+----------------------+----------+ + +-- Example with partition specification. 
+CREATE TABLE test_partition (c1 INT, c2 INT, c3 INT) PARTITIONED BY (c2, c3); + +INSERT INTO test_partition PARTITION (c2 = 2, c3 = 3) VALUES (1); + +INSERT INTO test_partition PARTITION (c2 = 5, c3 = 6) VALUES (4); + +INSERT INTO test_partition PARTITION (c2 = 8, c3 = 9) VALUES (7); + +SELECT * FROM test_partition; + +---+---+---+ + | c1| c2| c3| + +---+---+---+ + | 1| 2| 3| + | 4| 5| 6| + | 7| 8| 9| + +---+---+---+ + +CREATE TABLE test_load_partition (c1 INT, c2 INT, c3 INT) USING HIVE PARTITIONED BY (c2, c3); + +-- Assuming the test_partition table is in '/user/hive/warehouse/' +LOAD DATA LOCAL INPATH '/user/hive/warehouse/test_partition/c2=2/c3=3' + OVERWRITE INTO TABLE test_load_partition PARTITION (c2=2, c3=3); + +SELECT * FROM test_load_partition; + +---+---+---+ + | c1| c2| c3| + +---+---+---+ + | 1| 2| 3| + +---+---+---+ {% endhighlight %} - diff --git a/docs/sql-ref-syntax-dml.md b/docs/sql-ref-syntax-dml.md index b5dd45f8962c9..9f75990555f64 100644 --- a/docs/sql-ref-syntax-dml.md +++ b/docs/sql-ref-syntax-dml.md @@ -21,5 +21,5 @@ license: | Data Manipulation Statements are used to add, change, or delete data. Spark SQL supports the following Data Manipulation Statements: -- [INSERT](sql-ref-syntax-dml-insert.html) -- [LOAD](sql-ref-syntax-dml-load.html) + * [INSERT](sql-ref-syntax-dml-insert.html) + * [LOAD](sql-ref-syntax-dml-load.html) diff --git a/docs/sql-ref-syntax-qry-explain.md b/docs/sql-ref-syntax-qry-explain.md index 7e18e16bc8ea6..6a7c2ace8223b 100644 --- a/docs/sql-ref-syntax-qry-explain.md +++ b/docs/sql-ref-syntax-qry-explain.md @@ -24,8 +24,8 @@ license: | The `EXPLAIN` statement is used to provide logical/physical plans for an input statement. By default, this clause provides information about a physical plan only. 
- ### Syntax + {% highlight sql %} EXPLAIN [EXTENDED | CODEGEN | COST | FORMATTED] statement {% endhighlight %} @@ -64,76 +64,72 @@ EXPLAIN [EXTENDED | CODEGEN | COST | FORMATTED] statement ### Examples -{% highlight sql %} - ---Default Output +{% highlight sql %} +-- Default Output EXPLAIN select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k; -+----------------------------------------------------+ -| plan | -+----------------------------------------------------+ -| == Physical Plan == -*(2) HashAggregate(keys=[k#33], functions=[sum(cast(v#34 as bigint))]) -+- Exchange hashpartitioning(k#33, 200), true, [id=#59] - +- *(1) HashAggregate(keys=[k#33], functions=[partial_sum(cast(v#34 as bigint))]) - +- *(1) LocalTableScan [k#33, v#34] -| - +---------------------------------------------------- + +----------------------------------------------------+ + | plan| + +----------------------------------------------------+ + | == Physical Plan == + *(2) HashAggregate(keys=[k#33], functions=[sum(cast(v#34 as bigint))]) + +- Exchange hashpartitioning(k#33, 200), true, [id=#59] + +- *(1) HashAggregate(keys=[k#33], functions=[partial_sum(cast(v#34 as bigint))]) + +- *(1) LocalTableScan [k#33, v#34] + | + +---------------------------------------------------- -- Using Extended - EXPLAIN EXTENDED select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k; -+----------------------------------------------------+ -| plan | -+----------------------------------------------------+ -| == Parsed Logical Plan == -'Aggregate ['k], ['k, unresolvedalias('sum('v), None)] -+- 'SubqueryAlias `t` - +- 'UnresolvedInlineTable [k, v], [List(1, 2), List(1, 3)] - -== Analyzed Logical Plan == -k: int, sum(v): bigint -Aggregate [k#47], [k#47, sum(cast(v#48 as bigint)) AS sum(v)#50L] -+- SubqueryAlias `t` + +----------------------------------------------------+ + | plan| + +----------------------------------------------------+ + | == Parsed Logical Plan == + 'Aggregate ['k], ['k, 
unresolvedalias('sum('v), None)] + +- 'SubqueryAlias `t` + +- 'UnresolvedInlineTable [k, v], [List(1, 2), List(1, 3)] + + == Analyzed Logical Plan == + k: int, sum(v): bigint + Aggregate [k#47], [k#47, sum(cast(v#48 as bigint)) AS sum(v)#50L] + +- SubqueryAlias `t` + +- LocalRelation [k#47, v#48] + + == Optimized Logical Plan == + Aggregate [k#47], [k#47, sum(cast(v#48 as bigint)) AS sum(v)#50L] +- LocalRelation [k#47, v#48] - -== Optimized Logical Plan == -Aggregate [k#47], [k#47, sum(cast(v#48 as bigint)) AS sum(v)#50L] -+- LocalRelation [k#47, v#48] - -== Physical Plan == -*(2) HashAggregate(keys=[k#47], functions=[sum(cast(v#48 as bigint))], output=[k#47, sum(v)#50L]) -+- Exchange hashpartitioning(k#47, 200), true, [id=#79] - +- *(1) HashAggregate(keys=[k#47], functions=[partial_sum(cast(v#48 as bigint))], output=[k#47, sum#52L]) + + == Physical Plan == + *(2) HashAggregate(keys=[k#47], functions=[sum(cast(v#48 as bigint))], output=[k#47, sum(v)#50L]) + +- Exchange hashpartitioning(k#47, 200), true, [id=#79] + +- *(1) HashAggregate(keys=[k#47], functions=[partial_sum(cast(v#48 as bigint))], output=[k#47, sum#52L]) +- *(1) LocalTableScan [k#47, v#48] - | -+----------------------------------------------------+ - ---Using Formatted + | + +----------------------------------------------------+ +-- Using Formatted EXPLAIN FORMATTED select k, sum(v) from values (1, 2), (1, 3) t(k, v) group by k; -+----------------------------------------------------+ -| plan | -+----------------------------------------------------+ -| == Physical Plan == -* HashAggregate (4) -+- Exchange (3) - +- * HashAggregate (2) - +- * LocalTableScan (1) - - -(1) LocalTableScan [codegen id : 1] -Output: [k#19, v#20] - -(2) HashAggregate [codegen id : 1] -Input: [k#19, v#20] - -(3) Exchange -Input: [k#19, sum#24L] - -(4) HashAggregate [codegen id : 2] -Input: [k#19, sum#24L] - | -+----------------------------------------------------+ - + +----------------------------------------------------+ + | 
plan| + +----------------------------------------------------+ + | == Physical Plan == + * HashAggregate (4) + +- Exchange (3) + +- * HashAggregate (2) + +- * LocalTableScan (1) + + + (1) LocalTableScan [codegen id : 1] + Output: [k#19, v#20] + + (2) HashAggregate [codegen id : 1] + Input: [k#19, v#20] + + (3) Exchange + Input: [k#19, sum#24L] + + (4) HashAggregate [codegen id : 2] + Input: [k#19, sum#24L] + | + +----------------------------------------------------+ {% endhighlight %} diff --git a/docs/sql-ref-syntax-qry-sampling.md b/docs/sql-ref-syntax-qry-sampling.md index 061f21c3d16dd..3bc45cc48b78f 100644 --- a/docs/sql-ref-syntax-qry-sampling.md +++ b/docs/sql-ref-syntax-qry-sampling.md @@ -31,9 +31,9 @@ Note: `TABLESAMPLE` returns the approximate number of rows or fraction requested ### Syntax {% highlight sql %} - TABLESAMPLE ((integer_expression | decimal_expression) PERCENT) - | TABLESAMPLE (integer_expression ROWS) - | TABLESAMPLE (BUCKET integer_expression OUT OF integer_expression) +TABLESAMPLE ((integer_expression | decimal_expression) PERCENT) + | TABLESAMPLE (integer_expression ROWS) + | TABLESAMPLE (BUCKET integer_expression OUT OF integer_expression) {% endhighlight %} ### Examples diff --git a/docs/sql-ref-syntax-qry-select-clusterby.md b/docs/sql-ref-syntax-qry-select-clusterby.md index 8f1dc59806f80..687b2b512cd90 100644 --- a/docs/sql-ref-syntax-qry-select-clusterby.md +++ b/docs/sql-ref-syntax-qry-select-clusterby.md @@ -18,6 +18,9 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + +### Description + The CLUSTER BY clause is used to first repartition the data based on the input expressions and then sort the data within each partition. This is semantically equivalent to performing a @@ -26,11 +29,13 @@ semantically equivalent to performing a resultant rows are sorted within each partition and does not guarantee a total order of output. 
### Syntax + {% highlight sql %} CLUSTER BY { expression [ , ... ] } {% endhighlight %} ### Parameters +
expression
@@ -39,6 +44,7 @@ CLUSTER BY { expression [ , ... ] }
### Examples + {% highlight sql %} CREATE TABLE person (name STRING, age INT); INSERT INTO person VALUES @@ -58,16 +64,15 @@ SET spark.sql.shuffle.partitions = 2; -- of a query when `CLUSTER BY` is not used vs when it's used. The query below produces rows -- where age column is not sorted. SELECT age, name FROM person; - +---+-------+ - |age|name | + |age| name| +---+-------+ - |16 |Shone S| - |25 |Zen Hui| - |16 |Jack N | - |25 |Mike A | - |18 |John A | - |18 |Anil B | + | 16|Shone S| + | 25|Zen Hui| + | 16| Jack N| + | 25| Mike A| + | 18| John A| + | 18| Anil B| +---+-------+ -- Produces rows clustered by age. Persons with same age are clustered together. @@ -75,25 +80,25 @@ SELECT age, name FROM person; -- persons with age 16 are in the second partition. The rows are sorted based -- on age within each partition. SELECT age, name FROM person CLUSTER BY age; - +---+-------+ - |age|name | + |age| name| +---+-------+ - |18 |John A | - |18 |Anil B | - |25 |Zen Hui| - |25 |Mike A | - |16 |Shone S| - |16 |Jack N | + | 18| John A| + | 18| Anil B| + | 25|Zen Hui| + | 25| Mike A| + | 16|Shone S| + | 16| Jack N| +---+-------+ {% endhighlight %} -### Related Clauses -- [SELECT Main](sql-ref-syntax-qry-select.html) -- [WHERE Clause](sql-ref-syntax-qry-select-where.html) -- [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) -- [HAVING Clause](sql-ref-syntax-qry-select-having.html) -- [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) -- [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) -- [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) -- [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) +### Related Statements + + * [SELECT Main](sql-ref-syntax-qry-select.html) + * [WHERE Clause](sql-ref-syntax-qry-select-where.html) + * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) + * [HAVING Clause](sql-ref-syntax-qry-select-having.html) + * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) + * [SORT BY 
Clause](sql-ref-syntax-qry-select-sortby.html) + * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) + * [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) diff --git a/docs/sql-ref-syntax-qry-select-distribute-by.md b/docs/sql-ref-syntax-qry-select-distribute-by.md index 957df9c48eee7..18d73c7cdff19 100644 --- a/docs/sql-ref-syntax-qry-select-distribute-by.md +++ b/docs/sql-ref-syntax-qry-select-distribute-by.md @@ -18,16 +18,21 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + +### Description + The DISTRIBUTE BY clause is used to repartition the data based on the input expressions. Unlike the [CLUSTER BY](sql-ref-syntax-qry-select-clusterby.html) clause, this does not sort the data within each partition. ### Syntax + {% highlight sql %} DISTRIBUTE BY { expression [ , ... ] } {% endhighlight %} ### Parameters +
expression
@@ -36,6 +41,7 @@ DISTRIBUTE BY { expression [ , ... ] }
### Examples + {% highlight sql %} CREATE TABLE person (name STRING, age INT); INSERT INTO person VALUES @@ -55,40 +61,39 @@ SET spark.sql.shuffle.partitions = 2; -- behavior of `DISTRIBUTE BY`. The query below produces rows where age columns are not -- clustered together. SELECT age, name FROM person; - +---+-------+ - |age|name | + |age| name| +---+-------+ - |16 |Shone S| - |25 |Zen Hui| - |16 |Jack N | - |25 |Mike A | - |18 |John A | - |18 |Anil B | + | 16|Shone S| + | 25|Zen Hui| + | 16| Jack N| + | 25| Mike A| + | 18| John A| + | 18| Anil B| +---+-------+ -- Produces rows clustered by age. Persons with same age are clustered together. -- Unlike `CLUSTER BY` clause, the rows are not sorted within a partition. SELECT age, name FROM person DISTRIBUTE BY age; - +---+-------+ - |age|name | + |age| name| +---+-------+ - |25 |Zen Hui| - |25 |Mike A | - |18 |John A | - |18 |Anil B | - |16 |Shone S| - |16 |Jack N | + | 25|Zen Hui| + | 25| Mike A| + | 18| John A| + | 18| Anil B| + | 16|Shone S| + | 16| Jack N| +---+-------+ {% endhighlight %} -### Related Clauses -- [SELECT Main](sql-ref-syntax-qry-select.html) -- [WHERE Clause](sql-ref-syntax-qry-select-where.html) -- [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) -- [HAVING Clause](sql-ref-syntax-qry-select-having.html) -- [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) -- [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) -- [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) -- [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) +### Related Statements + + * [SELECT Main](sql-ref-syntax-qry-select.html) + * [WHERE Clause](sql-ref-syntax-qry-select-where.html) + * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) + * [HAVING Clause](sql-ref-syntax-qry-select-having.html) + * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) + * [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) + * [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) + * [LIMIT 
Clause](sql-ref-syntax-qry-select-limit.html) diff --git a/docs/sql-ref-syntax-qry-select-groupby.md b/docs/sql-ref-syntax-qry-select-groupby.md index c461a18e004a4..1676ca9c6d6aa 100644 --- a/docs/sql-ref-syntax-qry-select-groupby.md +++ b/docs/sql-ref-syntax-qry-select-groupby.md @@ -18,15 +18,19 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + +### Description + The GROUP BY clause is used to group the rows based on a set of specified grouping expressions and compute aggregations on the group of rows based on one or more specified aggregate functions. Spark also supports advanced aggregations to do multiple aggregations for the same input record set via `GROUPING SETS`, `CUBE`, `ROLLUP` clauses. When a FILTER clause is attached to an aggregate function, only the matching rows are passed to that function. ### Syntax + {% highlight sql %} GROUP BY group_expression [ , group_expression [ , ... ] ] - [ { WITH ROLLUP | WITH CUBE | GROUPING SETS (grouping_set [ , ...]) } ] + [ { WITH ROLLUP | WITH CUBE | GROUPING SETS (grouping_set [ , ...]) } ] GROUP BY GROUPING SETS (grouping_set [ , ...]) {% endhighlight %} @@ -37,6 +41,7 @@ aggregate_name ( [ DISTINCT ] expression [ , ... ] ) [ FILTER ( WHERE boolean_ex {% endhighlight %} ### Parameters +
GROUPING SETS
@@ -92,6 +97,7 @@ aggregate_name ( [ DISTINCT ] expression [ , ... ] ) [ FILTER ( WHERE boolean_ex
### Examples + {% highlight sql %} CREATE TABLE dealer (id INT, city STRING, car_model STRING, quantity INT); INSERT INTO dealer VALUES @@ -106,42 +112,38 @@ INSERT INTO dealer VALUES -- Sum of quantity per dealership. Group by `id`. SELECT id, sum(quantity) FROM dealer GROUP BY id ORDER BY id; - +---+-------------+ - |id |sum(quantity)| + | id|sum(quantity)| +---+-------------+ - |100|32 | - |200|33 | - |300|13 | + |100| 32| + |200| 33| + |300| 13| +---+-------------+ -- Use column position in GROUP by clause. SELECT id, sum(quantity) FROM dealer GROUP BY 1 ORDER BY 1; - +---+-------------+ - |id |sum(quantity)| + | id|sum(quantity)| +---+-------------+ - |100|32 | - |200|33 | - |300|13 | + |100| 32| + |200| 33| + |300| 13| +---+-------------+ -- Multiple aggregations. -- 1. Sum of quantity per dealership. -- 2. Max quantity per dealership. SELECT id, sum(quantity) AS sum, max(quantity) AS max FROM dealer GROUP BY id ORDER BY id; - +---+---+---+ - |id |sum|max| + | id|sum|max| +---+---+---+ - |100|32 |15 | - |200|33 |20 | - |300|13 |8 | + |100| 32| 15| + |200| 33| 20| + |300| 13| 8| +---+---+---+ -- Count the number of distinct dealer cities per car_model. SELECT car_model, count(DISTINCT city) AS count FROM dealer GROUP BY car_model; - +------------+-----+ | car_model|count| +------------+-----+ @@ -155,14 +157,13 @@ SELECT id, sum(quantity) FILTER ( WHERE car_model IN ('Honda Civic', 'Honda CRV') ) AS `sum(quantity)` FROM dealer GROUP BY id ORDER BY id; - - +---+-------------+ - | id|sum(quantity)| - +---+-------------+ - |100| 17| - |200| 23| - |300| 5| - +---+-------------+ + +---+-------------+ + | id|sum(quantity)| + +---+-------------+ + |100| 17| + |200| 23| + |300| 5| + +---+-------------+ -- Aggregations using multiple sets of grouping columns in a single statement. -- Following performs aggregations based on four sets of grouping columns. @@ -171,112 +172,108 @@ SELECT id, sum(quantity) FILTER ( -- 3. car_model -- 4. Empty grouping set. 
Returns quantities for all city and car models. SELECT city, car_model, sum(quantity) AS sum FROM dealer - GROUP BY GROUPING SETS ((city, car_model), (city), (car_model), ()) - ORDER BY city; - + GROUP BY GROUPING SETS ((city, car_model), (city), (car_model), ()) + ORDER BY city; +--------+------------+---+ - |city |car_model |sum| + | city| car_model|sum| +--------+------------+---+ - |null |null |78 | - |null |Honda Accord|33 | - |null |Honda CRV |10 | - |null |Honda Civic |35 | - |Dublin |null |33 | - |Dublin |Honda Accord|10 | - |Dublin |Honda CRV |3 | - |Dublin |Honda Civic |20 | - |Fremont |null |32 | - |Fremont |Honda Accord|15 | - |Fremont |Honda CRV |7 | - |Fremont |Honda Civic |10 | - |San Jose|null |13 | - |San Jose|Honda Accord|8 | - |San Jose|Honda Civic |5 | + | null| null| 78| + | null|Honda Accord| 33| + | null| Honda CRV| 10| + | null| Honda Civic| 35| + | Dublin| null| 33| + | Dublin|Honda Accord| 10| + | Dublin| Honda CRV| 3| + | Dublin| Honda Civic| 20| + | Fremont| null| 32| + | Fremont|Honda Accord| 15| + | Fremont| Honda CRV| 7| + | Fremont| Honda Civic| 10| + |San Jose| null| 13| + |San Jose|Honda Accord| 8| + |San Jose| Honda Civic| 5| +--------+------------+---+ -- Alternate syntax for `GROUPING SETS` in which both `GROUP BY` and `GROUPING SETS` -- specifications are present. 
SELECT city, car_model, sum(quantity) AS sum FROM dealer - GROUP BY city, car_model GROUPING SETS ((city, car_model), (city), (car_model), ()) - ORDER BY city, car_model; - + GROUP BY city, car_model GROUPING SETS ((city, car_model), (city), (car_model), ()) + ORDER BY city, car_model; +--------+------------+---+ - |city |car_model |sum| + | city| car_model|sum| +--------+------------+---+ - |null |null |78 | - |null |Honda Accord|33 | - |null |Honda CRV |10 | - |null |Honda Civic |35 | - |Dublin |null |33 | - |Dublin |Honda Accord|10 | - |Dublin |Honda CRV |3 | - |Dublin |Honda Civic |20 | - |Fremont |null |32 | - |Fremont |Honda Accord|15 | - |Fremont |Honda CRV |7 | - |Fremont |Honda Civic |10 | - |San Jose|null |13 | - |San Jose|Honda Accord|8 | - |San Jose|Honda Civic |5 | + | null| null| 78| + | null|Honda Accord| 33| + | null| Honda CRV| 10| + | null| Honda Civic| 35| + | Dublin| null| 33| + | Dublin|Honda Accord| 10| + | Dublin| Honda CRV| 3| + | Dublin| Honda Civic| 20| + | Fremont| null| 32| + | Fremont|Honda Accord| 15| + | Fremont| Honda CRV| 7| + | Fremont| Honda Civic| 10| + |San Jose| null| 13| + |San Jose|Honda Accord| 8| + |San Jose| Honda Civic| 5| +--------+------------+---+ -- Group by processing with `ROLLUP` clause. 
-- Equivalent GROUP BY GROUPING SETS ((city, car_model), (city), ()) SELECT city, car_model, sum(quantity) AS sum FROM dealer - GROUP BY city, car_model WITH ROLLUP - ORDER BY city, car_model; - + GROUP BY city, car_model WITH ROLLUP + ORDER BY city, car_model; +--------+------------+---+ - |city |car_model |sum| + | city| car_model|sum| +--------+------------+---+ - |null |null |78 | - |Dublin |null |33 | - |Dublin |Honda Accord|10 | - |Dublin |Honda CRV |3 | - |Dublin |Honda Civic |20 | - |Fremont |null |32 | - |Fremont |Honda Accord|15 | - |Fremont |Honda CRV |7 | - |Fremont |Honda Civic |10 | - |San Jose|null |13 | - |San Jose|Honda Accord|8 | - |San Jose|Honda Civic |5 | + | null| null| 78| + | Dublin| null| 33| + | Dublin|Honda Accord| 10| + | Dublin| Honda CRV| 3| + | Dublin| Honda Civic| 20| + | Fremont| null| 32| + | Fremont|Honda Accord| 15| + | Fremont| Honda CRV| 7| + | Fremont| Honda Civic| 10| + |San Jose| null| 13| + |San Jose|Honda Accord| 8| + |San Jose| Honda Civic| 5| +--------+------------+---+ -- Group by processing with `CUBE` clause. 
-- Equivalent GROUP BY GROUPING SETS ((city, car_model), (city), (car_model), ()) SELECT city, car_model, sum(quantity) AS sum FROM dealer - GROUP BY city, car_model WITH CUBE - ORDER BY city, car_model; - + GROUP BY city, car_model WITH CUBE + ORDER BY city, car_model; +--------+------------+---+ - |city |car_model |sum| + | city| car_model|sum| +--------+------------+---+ - |null |null |78 | - |null |Honda Accord|33 | - |null |Honda CRV |10 | - |null |Honda Civic |35 | - |Dublin |null |33 | - |Dublin |Honda Accord|10 | - |Dublin |Honda CRV |3 | - |Dublin |Honda Civic |20 | - |Fremont |null |32 | - |Fremont |Honda Accord|15 | - |Fremont |Honda CRV |7 | - |Fremont |Honda Civic |10 | - |San Jose|null |13 | - |San Jose|Honda Accord|8 | - |San Jose|Honda Civic |5 | + | null| null| 78| + | null|Honda Accord| 33| + | null| Honda CRV| 10| + | null| Honda Civic| 35| + | Dublin| null| 33| + | Dublin|Honda Accord| 10| + | Dublin| Honda CRV| 3| + | Dublin| Honda Civic| 20| + | Fremont| null| 32| + | Fremont|Honda Accord| 15| + | Fremont| Honda CRV| 7| + | Fremont| Honda Civic| 10| + |San Jose| null| 13| + |San Jose|Honda Accord| 8| + |San Jose| Honda Civic| 5| +--------+------------+---+ - {% endhighlight %} -### Related clauses -- [SELECT Main](sql-ref-syntax-qry-select.html) -- [WHERE Clause](sql-ref-syntax-qry-select-where.html) -- [HAVING Clause](sql-ref-syntax-qry-select-having.html) -- [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) -- [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) -- [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) -- [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) -- [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) +### Related Statements + + * [SELECT Main](sql-ref-syntax-qry-select.html) + * [WHERE Clause](sql-ref-syntax-qry-select-where.html) + * [HAVING Clause](sql-ref-syntax-qry-select-having.html) + * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) + * [SORT BY
Clause](sql-ref-syntax-qry-select-sortby.html) + * [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) + * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) + * [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) diff --git a/docs/sql-ref-syntax-qry-select-having.md b/docs/sql-ref-syntax-qry-select-having.md index dee1e3c0e39b9..b84ad17955864 100644 --- a/docs/sql-ref-syntax-qry-select-having.md +++ b/docs/sql-ref-syntax-qry-select-having.md @@ -18,17 +18,22 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + +### Description + The HAVING clause is used to filter the results produced by GROUP BY based on the specified condition. It is often used in conjunction with a [GROUP BY](sql-ref-syntax-qry-select-groupby.html) clause. ### Syntax + {% highlight sql %} HAVING boolean_expression {% endhighlight %} ### Parameters +
boolean_expression
@@ -47,6 +52,7 @@ HAVING boolean_expression
### Examples + {% highlight sql %} CREATE TABLE dealer (id INT, city STRING, car_model STRING, quantity INT); INSERT INTO dealer VALUES @@ -61,16 +67,14 @@ INSERT INTO dealer VALUES -- `HAVING` clause referring to column in `GROUP BY`. SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING city = 'Fremont'; - +-------+---+ - |city |sum| + | city|sum| +-------+---+ - |Fremont|32 | + |Fremont| 32| +-------+---+ -- `HAVING` clause referring to aggregate function. SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING sum(quantity) > 15; - +-------+---+ | city|sum| +-------+---+ @@ -80,7 +84,6 @@ SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING sum(quantity) -- `HAVING` clause referring to aggregate function by its alias. SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING sum > 15; - +-------+---+ | city|sum| +-------+---+ @@ -91,16 +94,14 @@ SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING sum > 15; -- `HAVING` clause referring to a different aggregate function than what is present in -- `SELECT` list. SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING max(quantity) > 15; - +------+---+ - |city |sum| + | city|sum| +------+---+ - |Dublin|33 | + |Dublin| 33| +------+---+ -- `HAVING` clause referring to constant expression. 
SELECT city, sum(quantity) AS sum FROM dealer GROUP BY city HAVING 1 > 0 ORDER BY city; - +--------+---+ | city|sum| +--------+---+ @@ -116,15 +117,15 @@ SELECT sum(quantity) AS sum FROM dealer HAVING sum(quantity) > 10; +---+ | 78| +---+ - {% endhighlight %} -### Related Clauses -- [SELECT Main](sql-ref-syntax-qry-select.html) -- [WHERE Clause](sql-ref-syntax-qry-select-where.html) -- [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) -- [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) -- [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) -- [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) -- [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) -- [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) +### Related Statements + + * [SELECT Main](sql-ref-syntax-qry-select.html) + * [WHERE Clause](sql-ref-syntax-qry-select-where.html) + * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) + * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) + * [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) + * [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) + * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) + * [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) diff --git a/docs/sql-ref-syntax-qry-select-hints.md b/docs/sql-ref-syntax-qry-select-hints.md index 688ba10c3b1f7..16f4f95f90ea1 100644 --- a/docs/sql-ref-syntax-qry-select-hints.md +++ b/docs/sql-ref-syntax-qry-select-hints.md @@ -18,6 +18,7 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description Join Hints allow users to suggest the join strategy that Spark should use. Prior to Spark 3.0, only the `BROADCAST` Join Hint was supported. `MERGE`, `SHUFFLE_HASH` and `SHUFFLE_REPLICATE_NL` Joint Hints support was added in 3.0. 
When different join strategy hints are specified on both sides of a join, Spark prioritizes hints in the following order: `BROADCAST` over `MERGE` over `SHUFFLE_HASH` over `SHUFFLE_REPLICATE_NL`. When both sides are specified with the `BROADCAST` hint or the `SHUFFLE_HASH` hint, Spark will pick the build side based on the join type and the sizes of the relations. Since a given strategy may not support all join types, Spark is not guaranteed to use the join strategy suggested by the hint. @@ -55,7 +56,6 @@ Join Hints allow users to suggest the join strategy that Spark should use. Prior ### Examples {% highlight sql %} - -- Join Hints for broadcast join SELECT /*+ BROADCAST(t1) */ * FROM t1 INNER JOIN t2 ON t1.key = t2.key; SELECT /*+ BROADCASTJOIN (t1) */ * FROM t1 left JOIN t2 ON t1.key = t2.key; @@ -79,9 +79,9 @@ SELECT /*+ SHUFFLE_REPLICATE_NL(t1) */ * FROM t1 INNER JOIN t2 ON t1.key = t2.ke -- org.apache.spark.sql.catalyst.analysis.HintErrorLogger: Hint (strategy=merge) -- is overridden by another hint and will not take effect. SELECT /*+ BROADCAST(t1) */ /*+ MERGE(t1, t2) */ * FROM t1 INNER JOIN t2 ON t1.key = t2.key; - {% endhighlight %} ### Related Statements -- [JOIN](sql-ref-syntax-qry-select-join.html) -- [SELECT](sql-ref-syntax-qry-select.html) + + * [JOIN](sql-ref-syntax-qry-select-join.html) + * [SELECT](sql-ref-syntax-qry-select.html) diff --git a/docs/sql-ref-syntax-qry-select-join.md b/docs/sql-ref-syntax-qry-select-join.md index 4759b12f4db70..41b7603a3a25e 100644 --- a/docs/sql-ref-syntax-qry-select-join.md +++ b/docs/sql-ref-syntax-qry-select-join.md @@ -18,6 +18,7 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + ### Description A SQL join is used to combine rows from two relations based on join criteria. The following section describes the overall join syntax and the sub-sections cover different types of joins along with examples. 
@@ -84,6 +85,7 @@ A left join returns all values from the left relation and the matched values fro
 #### Right Join
+
A right join returns all values from the right relation and the matched values from the left relation, or appends NULL if there is no match. It is also referred to as a right outer join.

Syntax:
@@ -254,6 +256,7 @@ SELECT * FROM employee ANTI JOIN department ON employee.deptno = department.dept
 +---+-----+------+
 {% endhighlight %}

-### Related Statement
+### Related Statements
+
 * [SELECT](sql-ref-syntax-qry-select.html)
 * [Join Hints](sql-ref-syntax-qry-select-hints.html)
diff --git a/docs/sql-ref-syntax-qry-select-limit.md b/docs/sql-ref-syntax-qry-select-limit.md
index 356930c879d28..0ceb705889b47 100644
--- a/docs/sql-ref-syntax-qry-select-limit.md
+++ b/docs/sql-ref-syntax-qry-select-limit.md
@@ -18,17 +18,22 @@ license: |
   See the License for the specific language governing permissions and
   limitations under the License.
 ---
+
+### Description
+
 The LIMIT clause is used to constrain the number of rows returned by the [SELECT](sql-ref-syntax-qry-select.html) statement. In general, this clause is used in conjunction with [ORDER BY](sql-ref-syntax-qry-select-orderby.html) to ensure that the results are deterministic.

 ### Syntax
+
 {% highlight sql %}
 LIMIT { ALL | integer_expression }
 {% endhighlight %}

 ### Parameters
+
ALL
@@ -42,6 +47,7 @@ LIMIT { ALL | integer_expression }
### Examples + {% highlight sql %} CREATE TABLE person (name STRING, age INT); INSERT INTO person VALUES @@ -54,31 +60,28 @@ INSERT INTO person VALUES -- Select the first two rows. SELECT name, age FROM person ORDER BY name LIMIT 2; - +------+---+ - |name |age| + | name|age| +------+---+ - |Anil B|18 | - |Jack N|16 | + |Anil B| 18| + |Jack N| 16| +------+---+ -- Specifying ALL option on LIMIT returns all the rows. SELECT name, age FROM person ORDER BY name LIMIT ALL; - +-------+---+ - |name |age| + | name|age| +-------+---+ - |Anil B |18 | - |Jack N |16 | - |John A |18 | - |Mike A |25 | - |Shone S|16 | - |Zen Hui|25 | + | Anil B| 18| + | Jack N| 16| + | John A| 18| + | Mike A| 25| + |Shone S| 16| + |Zen Hui| 25| +-------+---+ -- A function expression as an input to LIMIT. -SELECT name, age FROM person ORDER BY name LIMIT length('SPARK') - +SELECT name, age FROM person ORDER BY name LIMIT length('SPARK'); +-------+---+ | name|age| +-------+---+ @@ -90,17 +93,17 @@ SELECT name, age FROM person ORDER BY name LIMIT length('SPARK') +-------+---+ -- A non-foldable expression as an input to LIMIT is not allowed. -SELECT name, age FROM person ORDER BY name LIMIT length(name) - -org.apache.spark.sql.AnalysisException: The limit expression must evaluate to a constant value ... +SELECT name, age FROM person ORDER BY name LIMIT length(name); + org.apache.spark.sql.AnalysisException: The limit expression must evaluate to a constant value ... 
 {% endhighlight %}

-### Related Clauses
-- [SELECT Main](sql-ref-syntax-qry-select.html)
-- [WHERE Clause](sql-ref-syntax-qry-select-where.html)
-- [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html)
-- [HAVING Clause](sql-ref-syntax-qry-select-having.html)
-- [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html)
-- [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html)
-- [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html)
-- [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html)
+### Related Statements
+
+ * [SELECT Main](sql-ref-syntax-qry-select.html)
+ * [WHERE Clause](sql-ref-syntax-qry-select-where.html)
+ * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html)
+ * [HAVING Clause](sql-ref-syntax-qry-select-having.html)
+ * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html)
+ * [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html)
+ * [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html)
+ * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html)
diff --git a/docs/sql-ref-syntax-qry-select-orderby.md b/docs/sql-ref-syntax-qry-select-orderby.md
index eb99dbb06096d..cc75367968053 100644
--- a/docs/sql-ref-syntax-qry-select-orderby.md
+++ b/docs/sql-ref-syntax-qry-select-orderby.md
@@ -18,16 +18,21 @@ license: |
   See the License for the specific language governing permissions and
   limitations under the License.
 ---
+
+### Description
+
 The ORDER BY clause is used to return the result rows in a sorted manner in the user specified order. Unlike the [SORT BY](sql-ref-syntax-qry-select-sortby.html) clause, this clause guarantees a total order in the output.

 ### Syntax
+
 {% highlight sql %}
 ORDER BY { expression [ sort_direction | nulls_sort_oder ] [ , ... ] }
 {% endhighlight %}

 ### Parameters
+
ORDER BY
@@ -64,6 +69,7 @@ ORDER BY { expression [ sort_direction | nulls_sort_oder ] [ , ... ] }
### Examples + {% highlight sql %} CREATE TABLE person (id INT, name STRING, age INT); INSERT INTO person VALUES @@ -75,77 +81,73 @@ INSERT INTO person VALUES -- Sort rows by age. By default rows are sorted in ascending manner with NULL FIRST. SELECT name, age FROM person ORDER BY age; - +-----+----+ - |name |age | + | name| age| +-----+----+ |Jerry|null| - |Mary |null| - |John |30 | - |Dan |50 | - |Mike |80 | + | Mary|null| + | John| 30| + | Dan| 50| + | Mike| 80| +-----+----+ -- Sort rows in ascending manner keeping null values to be last. SELECT name, age FROM person ORDER BY age NULLS LAST; - +-----+----+ - |name |age | + | name| age| +-----+----+ - |John |30 | - |Dan |50 | - |Mike |80 | - |Mary |null| + | John| 30| + | Dan| 50| + | Mike| 80| + | Mary|null| |Jerry|null| +-----+----+ -- Sort rows by age in descending manner, which defaults to NULL LAST. SELECT name, age FROM person ORDER BY age DESC; - +-----+----+ - |name |age | + | name| age| +-----+----+ - |Mike |80 | - |Dan |50 | - |John |30 | + | Mike| 80| + | Dan| 50| + | John| 30| |Jerry|null| - |Mary |null| + | Mary|null| +-----+----+ -- Sort rows in ascending manner keeping null values to be first. SELECT name, age FROM person ORDER BY age DESC NULLS FIRST; - +-----+----+ - |name |age | + | name| age| +-----+----+ |Jerry|null| - |Mary |null| - |Mike |80 | - |Dan |50 | - |John |30 | + | Mary|null| + | Mike| 80| + | Dan| 50| + | John| 30| +-----+----+ -- Sort rows based on more than one column with each column having different -- sort direction. 
SELECT * FROM person ORDER BY name ASC, age DESC; - +---+-----+----+ - |id |name |age | + | id| name| age| +---+-----+----+ - |500|Dan |50 | + |500| Dan| 50| |400|Jerry|null| - |100|John |30 | - |200|Mary |null| - |300|Mike |80 | + |100| John| 30| + |200| Mary|null| + |300| Mike| 80| +---+-----+----+ {% endhighlight %} -### Related Clauses -- [SELECT Main](sql-ref-syntax-qry-select.html) -- [WHERE Clause](sql-ref-syntax-qry-select-where.html) -- [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) -- [HAVING Clause](sql-ref-syntax-qry-select-having.html) -- [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) -- [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) -- [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) -- [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) +### Related Statements + + * [SELECT Main](sql-ref-syntax-qry-select.html) + * [WHERE Clause](sql-ref-syntax-qry-select-where.html) + * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) + * [HAVING Clause](sql-ref-syntax-qry-select-having.html) + * [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) + * [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) + * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) + * [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) diff --git a/docs/sql-ref-syntax-qry-select-setops.md b/docs/sql-ref-syntax-qry-select-setops.md index 8ed6e4880f7db..09a207a70c149 100644 --- a/docs/sql-ref-syntax-qry-select-setops.md +++ b/docs/sql-ref-syntax-qry-select-setops.md @@ -19,161 +19,178 @@ license: | limitations under the License. --- +### Description + Set operators are used to combine two input relations into a single one. Spark SQL supports three types of set operators: -- `EXCEPT` or `MINUS` -- `INTERSECT` -- `UNION` + + - `EXCEPT` or `MINUS` + - `INTERSECT` + - `UNION` Note that input relations must have the same number of columns and compatible data types for the respective columns. 
### EXCEPT + `EXCEPT` and `EXCEPT ALL` return the rows that are found in one relation but not the other. `EXCEPT` (alternatively, `EXCEPT DISTINCT`) takes only distinct rows while `EXCEPT ALL` does not remove duplicates from the result rows. Note that `MINUS` is an alias for `EXCEPT`. #### Syntax + {% highlight sql %} [ ( ] relation [ ) ] EXCEPT | MINUS [ ALL | DISTINCT ] [ ( ] relation [ ) ] {% endhighlight %} -### INTERSECT -`INTERSECT` and `INTERSECT ALL` return the rows that are found in both relations. `INTERSECT` (alternatively, `INTERSECT DISTINCT`) takes only distinct rows while `INTERSECT ALL` does not remove duplicates from the result rows. +#### Examples -#### Syntax {% highlight sql %} -[ ( ] relation [ ) ] INTERSECT [ ALL | DISTINCT ] [ ( ] relation [ ) ] -{% endhighlight %} - -### UNION -`UNION` and `UNION ALL` return the rows that are found in either relation. `UNION` (alternatively, `UNION DISTINCT`) takes only distinct rows while `UNION ALL` does not remove duplicates from the result rows. - -#### Syntax -{% highlight sql %} -[ ( ] relation [ ) ] UNION [ ALL | DISTINCT ] [ ( ] relation [ ) ] -{% endhighlight %} - -### Examples -{% highlight sql %} --- Use number1 and number2 tables to demonstrate set operators. +-- Use number1 and number2 tables to demonstrate set operators in this page. 
SELECT * FROM number1; -+---+ -| c| -+---+ -| 3| -| 1| -| 2| -| 2| -| 3| -| 4| -+---+ - + +---+ + | c| + +---+ + | 3| + | 1| + | 2| + | 2| + | 3| + | 4| + +---+ + SELECT * FROM number2; -+---+ -| c| -+---+ -| 5| -| 1| -| 2| -| 2| -+---+ + +---+ + | c| + +---+ + | 5| + | 1| + | 2| + | 2| + +---+ SELECT c FROM number1 EXCEPT SELECT c FROM number2; -+---+ -| c| -+---+ -| 3| -| 4| -+---+ + +---+ + | c| + +---+ + | 3| + | 4| + +---+ SELECT c FROM number1 MINUS SELECT c FROM number2; -+---+ -| c| -+---+ -| 3| -| 4| -+---+ + +---+ + | c| + +---+ + | 3| + | 4| + +---+ SELECT c FROM number1 EXCEPT ALL (SELECT c FROM number2); -+---+ -| c| -+---+ -| 3| -| 3| -| 4| -+---+ + +---+ + | c| + +---+ + | 3| + | 3| + | 4| + +---+ SELECT c FROM number1 MINUS ALL (SELECT c FROM number2); -+---+ -| c| -+---+ -| 3| -| 3| -| 4| -+---+ + +---+ + | c| + +---+ + | 3| + | 3| + | 4| + +---+ +{% endhighlight %} +### INTERSECT + +`INTERSECT` and `INTERSECT ALL` return the rows that are found in both relations. `INTERSECT` (alternatively, `INTERSECT DISTINCT`) takes only distinct rows while `INTERSECT ALL` does not remove duplicates from the result rows. + +#### Syntax + +{% highlight sql %} +[ ( ] relation [ ) ] INTERSECT [ ALL | DISTINCT ] [ ( ] relation [ ) ] +{% endhighlight %} + +#### Examples + +{% highlight sql %} (SELECT c FROM number1) INTERSECT (SELECT c FROM number2); -+---+ -| c| -+---+ -| 1| -| 2| -+---+ + +---+ + | c| + +---+ + | 1| + | 2| + +---+ (SELECT c FROM number1) INTERSECT DISTINCT (SELECT c FROM number2); -+---+ -| c| -+---+ -| 1| -| 2| -+---+ + +---+ + | c| + +---+ + | 1| + | 2| + +---+ (SELECT c FROM number1) INTERSECT ALL (SELECT c FROM number2); -+---+ -| c| -+---+ -| 1| -| 2| -| 2| -+---+ + +---+ + | c| + +---+ + | 1| + | 2| + | 2| + +---+ +{% endhighlight %} + +### UNION +`UNION` and `UNION ALL` return the rows that are found in either relation. 
`UNION` (alternatively, `UNION DISTINCT`) takes only distinct rows while `UNION ALL` does not remove duplicates from the result rows. + +#### Syntax + +{% highlight sql %} +[ ( ] relation [ ) ] UNION [ ALL | DISTINCT ] [ ( ] relation [ ) ] +{% endhighlight %} + +### Examples + +{% highlight sql %} (SELECT c FROM number1) UNION (SELECT c FROM number2); -+---+ -| c| -+---+ -| 1| -| 3| -| 5| -| 4| -| 2| -+---+ + +---+ + | c| + +---+ + | 1| + | 3| + | 5| + | 4| + | 2| + +---+ (SELECT c FROM number1) UNION DISTINCT (SELECT c FROM number2); -+---+ -| c| -+---+ -| 1| -| 3| -| 5| -| 4| -| 2| -+---+ + +---+ + | c| + +---+ + | 1| + | 3| + | 5| + | 4| + | 2| + +---+ SELECT c FROM number1 UNION ALL (SELECT c FROM number2); -+---+ -| c| -+---+ -| 3| -| 1| -| 2| -| 2| -| 3| -| 4| -| 5| -| 1| -| 2| -| 2| -+---+ - + +---+ + | c| + +---+ + | 3| + | 1| + | 2| + | 2| + | 3| + | 4| + | 5| + | 1| + | 2| + | 2| + +---+ {% endhighlight %} -### Related Statement -- [SELECT Statement](sql-ref-syntax-qry-select.html) +### Related Statements + * [SELECT Statement](sql-ref-syntax-qry-select.html) diff --git a/docs/sql-ref-syntax-qry-select-sortby.md b/docs/sql-ref-syntax-qry-select-sortby.md index 9b52738ee7926..315faa5e7d501 100644 --- a/docs/sql-ref-syntax-qry-select-sortby.md +++ b/docs/sql-ref-syntax-qry-select-sortby.md @@ -18,6 +18,9 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + +### Description + The SORT BY clause is used to return the result rows sorted within each partition in the user specified order. When there is more than one partition SORT BY may return result that is partially ordered. This is different @@ -25,11 +28,13 @@ than [ORDER BY](sql-ref-syntax-qry-select-orderby.html) clause which guarantees total order of the output. ### Syntax + {% highlight sql %} SORT BY { expression [ sort_direction | nulls_sort_order ] [ , ... ] } {% endhighlight %} ### Parameters +
SORT BY
@@ -66,6 +71,7 @@ SORT BY { expression [ sort_direction | nulls_sort_order ] [ , ... ] }
### Examples + {% highlight sql %} CREATE TABLE person (zip_code INT, name STRING, age INT); INSERT INTO person VALUES @@ -83,103 +89,98 @@ INSERT INTO person VALUES -- Sort rows by `name` within each partition in ascending manner SELECT /*+ REPARTITION(zip_code) */ name, age, zip_code FROM person SORT BY name; - +--------+----+--------+ - |name |age |zip_code| + | name| age|zip_code| +--------+----+--------+ - |Anil K |27 |94588 | - |Dan Li |18 |94588 | - |John V |null|94588 | - |Zen Hui |50 |94588 | - |Aryan B.|18 |94511 | - |David K |42 |94511 | - |Lalit B.|null|94511 | + | Anil K| 27| 94588| + | Dan Li| 18| 94588| + | John V|null| 94588| + | Zen Hui| 50| 94588| + |Aryan B.| 18| 94511| + | David K| 42| 94511| + |Lalit B.|null| 94511| +--------+----+--------+ -- Sort rows within each partition using column position. SELECT /*+ REPARTITION(zip_code) */ name, age, zip_code FROM person SORT BY 1; - +--------+----+--------+ - |name |age |zip_code| + | name| age|zip_code| +--------+----+--------+ - |Anil K |27 |94588 | - |Dan Li |18 |94588 | - |John V |null|94588 | - |Zen Hui |50 |94588 | - |Aryan B.|18 |94511 | - |David K |42 |94511 | - |Lalit B.|null|94511 | + | Anil K| 27| 94588| + | Dan Li| 18| 94588| + | John V|null| 94588| + | Zen Hui| 50| 94588| + |Aryan B.| 18| 94511| + | David K| 42| 94511| + |Lalit B.|null| 94511| +--------+----+--------+ -- Sort rows within partition in ascending manner keeping null values to be last. 
SELECT /*+ REPARTITION(zip_code) */ age, name, zip_code FROM person SORT BY age NULLS LAST; - +----+--------+--------+ - |age |name |zip_code| + | age| name|zip_code| +----+--------+--------+ - |18 |Dan Li |94588 | - |27 |Anil K |94588 | - |50 |Zen Hui |94588 | - |null|John V |94588 | - |18 |Aryan B.|94511 | - |42 |David K |94511 | - |null|Lalit B.|94511 | + | 18| Dan Li| 94588| + | 27| Anil K| 94588| + | 50| Zen Hui| 94588| + |null| John V| 94588| + | 18|Aryan B.| 94511| + | 42| David K| 94511| + |null|Lalit B.| 94511| +----+--------+--------+ -- Sort rows by age within each partition in descending manner, which defaults to NULL LAST. SELECT /*+ REPARTITION(zip_code) */ age, name, zip_code FROM person SORT BY age DESC; - +----+--------+--------+ - |age |name |zip_code| + | age| name|zip_code| +----+--------+--------+ - |50 |Zen Hui |94588 | - |27 |Anil K |94588 | - |18 |Dan Li |94588 | - |null|John V |94588 | - |42 |David K |94511 | - |18 |Aryan B.|94511 | - |null|Lalit B.|94511 | + | 50| Zen Hui| 94588| + | 27| Anil K| 94588| + | 18| Dan Li| 94588| + |null| John V| 94588| + | 42| David K| 94511| + | 18|Aryan B.| 94511| + |null|Lalit B.| 94511| +----+--------+--------+ -- Sort rows by age within each partition in descending manner keeping null values to be first. SELECT /*+ REPARTITION(zip_code) */ age, name, zip_code FROM person SORT BY age DESC NULLS FIRST; - +----+--------+--------+ - |age |name |zip_code| + | age| name|zip_code| +----+--------+--------+ - |null|John V |94588 | - |50 |Zen Hui |94588 | - |27 |Anil K |94588 | - |18 |Dan Li |94588 | - |null|Lalit B.|94511 | - |42 |David K |94511 | - |18 |Aryan B.|94511 | + |null| John V| 94588| + | 50| Zen Hui| 94588| + | 27| Anil K| 94588| + | 18| Dan Li| 94588| + |null|Lalit B.| 94511| + | 42| David K| 94511| + | 18|Aryan B.| 94511| +----+--------+--------+ -- Sort rows within each partition based on more than one column with each column having -- different sort direction. 
SELECT /*+ REPARTITION(zip_code) */ name, age, zip_code FROM person - SORT BY name ASC, age DESC; - + SORT BY name ASC, age DESC; +--------+----+--------+ - |name |age |zip_code| + | name| age|zip_code| +--------+----+--------+ - |Anil K |27 |94588 | - |Dan Li |18 |94588 | - |John V |null|94588 | - |Zen Hui |50 |94588 | - |Aryan B.|18 |94511 | - |David K |42 |94511 | - |Lalit B.|null|94511 | + | Anil K| 27| 94588| + | Dan Li| 18| 94588| + | John V|null| 94588| + | Zen Hui| 50| 94588| + |Aryan B.| 18| 94511| + | David K| 42| 94511| + |Lalit B.|null| 94511| +--------+----+--------+ {% endhighlight %} -### Related Clauses -- [SELECT Main](sql-ref-syntax-qry-select.html) -- [WHERE Clause](sql-ref-syntax-qry-select-where.html) -- [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) -- [HAVING Clause](sql-ref-syntax-qry-select-having.html) -- [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) -- [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) -- [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) -- [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) +### Related Statements + + * [SELECT Main](sql-ref-syntax-qry-select.html) + * [WHERE Clause](sql-ref-syntax-qry-select-where.html) + * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) + * [HAVING Clause](sql-ref-syntax-qry-select-having.html) + * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) + * [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) + * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) + * [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) diff --git a/docs/sql-ref-syntax-qry-select-usedb.md b/docs/sql-ref-syntax-qry-select-usedb.md index 92ac91ac51769..2a05085218978 100644 --- a/docs/sql-ref-syntax-qry-select-usedb.md +++ b/docs/sql-ref-syntax-qry-select-usedb.md @@ -20,12 +20,14 @@ license: | --- ### Description + `USE` statement is used to set the current database. 
After the current database is set, the unqualified database artifacts such as tables, functions and views that are referenced by SQLs are resolved from the current database. The default database name is 'default'. ### Syntax + {% highlight sql %} USE database_name {% endhighlight %} @@ -40,21 +42,18 @@ USE database_name ### Example + {% highlight sql %} -- Use the 'userdb' which exists. USE userdb; -+---------+--+ -| Result | -+---------+--+ -+---------+--+ -- Use the 'userdb1' which doesn't exist USE userdb1; -Error: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'userdb1' not found;(state=,code=0) + Error: org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'userdb1' not found;(state=,code=0) {% endhighlight %} -### Related statements. -- [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) -- [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) -- [CREATE TABLE ](sql-ref-syntax-ddl-create-table.html) +### Related Statements + * [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html) + * [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html) + * [CREATE TABLE ](sql-ref-syntax-ddl-create-table.html) diff --git a/docs/sql-ref-syntax-qry-select-where.md b/docs/sql-ref-syntax-qry-select-where.md index 106053d16f8bd..1960367cd42f0 100644 --- a/docs/sql-ref-syntax-qry-select-where.md +++ b/docs/sql-ref-syntax-qry-select-where.md @@ -18,15 +18,20 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + +### Description + The WHERE clause is used to limit the results of the FROM clause of a query or a subquery based on the specified condition. ### Syntax + {% highlight sql %} WHERE boolean_expression {% endhighlight %} ### Parameters +
boolean_expression
@@ -37,6 +42,7 @@ WHERE boolean_expression
### Examples + {% highlight sql %} CREATE TABLE person (id INT, name STRING, age INT); INSERT INTO person VALUES @@ -48,38 +54,38 @@ INSERT INTO person VALUES -- Comparison operator in `WHERE` clause. SELECT * FROM person WHERE id > 200 ORDER BY id; +---+----+---+ - |id |name|age| + | id|name|age| +---+----+---+ - |300|Mike|80 | - |400|Dan |50 | + |300|Mike| 80| + |400| Dan| 50| +---+----+---+ -- Comparison and logical operators in `WHERE` clause. SELECT * FROM person WHERE id = 200 OR id = 300 ORDER BY id; +---+----+----+ - |id |name|age | + | id|name| age| +---+----+----+ |200|Mary|null| - |300|Mike|80 | + |300|Mike| 80| +---+----+----+ -- IS NULL expression in `WHERE` clause. SELECT * FROM person WHERE id > 300 OR age IS NULL ORDER BY id; +---+----+----+ - |id |name|age | + | id|name| age| +---+----+----+ |200|Mary|null| - |400|Dan |50 | + |400| Dan| 50| +---+----+----+ -- Function expression in `WHERE` clause. SELECT * FROM person WHERE length(name) > 3 ORDER BY id; +---+----+----+ - |id |name|age | + | id|name| age| +---+----+----+ - |100|John|30 | + |100|John| 30| |200|Mary|null| - |300|Mike|80 | + |300|Mike| 80| +---+----+----+ -- `BETWEEN` expression in `WHERE` clause. @@ -94,31 +100,31 @@ SELECT * FROM person WHERE id BETWEEN 200 AND 300 ORDER BY id; -- Scalar Subquery in `WHERE` clause. SELECT * FROM person WHERE age > (SELECT avg(age) FROM person); +---+----+---+ - |id |name|age| + | id|name|age| +---+----+---+ - |300|Mike|80 | + |300|Mike| 80| +---+----+---+ -- Correlated Subquery in `WHERE` clause. 
SELECT * FROM person AS parent -WHERE EXISTS ( - SELECT 1 FROM person AS child - WHERE parent.id = child.id AND child.age IS NULL - ); + WHERE EXISTS ( + SELECT 1 FROM person AS child + WHERE parent.id = child.id AND child.age IS NULL + ); +---+----+----+ |id |name|age | +---+----+----+ |200|Mary|null| +---+----+----+ - {% endhighlight %} -### Related Clauses -- [SELECT Main](sql-ref-syntax-qry-select.html) -- [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) -- [HAVING Clause](sql-ref-syntax-qry-select-having.html) -- [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) -- [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) -- [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) -- [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) -- [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) +### Related Statements + + * [SELECT Main](sql-ref-syntax-qry-select.html) + * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html) + * [HAVING Clause](sql-ref-syntax-qry-select-having.html) + * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html) + * [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html) + * [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html) + * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html) + * [LIMIT Clause](sql-ref-syntax-qry-select-limit.html) diff --git a/docs/sql-ref-syntax-qry-select.md b/docs/sql-ref-syntax-qry-select.md index 8ecc2c630f090..94f69d4d733c4 100644 --- a/docs/sql-ref-syntax-qry-select.md +++ b/docs/sql-ref-syntax-qry-select.md @@ -18,33 +18,38 @@ license: | See the License for the specific language governing permissions and limitations under the License. --- + +### Description + Spark supports a `SELECT` statement and conforms to the ANSI SQL standard. Queries are used to retrieve result sets from one or more tables. The following section describes the overall query syntax and the sub-sections cover different constructs of a query along with examples. 
### Syntax + {% highlight sql %} [ WITH with_query [ , ... ] ] select_statement [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select_statement, ... ] -[ ORDER BY { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ...] } ] -[ SORT BY { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ...] } ] -[ CLUSTER BY { expression [ , ...] } ] -[ DISTRIBUTE BY { expression [, ...] } ] -[ WINDOW { named_window [ , WINDOW named_window, ... ] } ] -[ LIMIT { ALL | expression } ] + [ ORDER BY { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ...] } ] + [ SORT BY { expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [ , ...] } ] + [ CLUSTER BY { expression [ , ...] } ] + [ DISTRIBUTE BY { expression [, ...] } ] + [ WINDOW { named_window [ , WINDOW named_window, ... ] } ] + [ LIMIT { ALL | expression } ] {% endhighlight %} While `select_statement` is defined as {% highlight sql %} SELECT [ hints , ... ] [ ALL | DISTINCT ] { named_expression [ , ... ] } - FROM { from_item [ , ...] } - [ WHERE boolean_expression ] - [ GROUP BY expression [ , ...] ] - [ HAVING boolean_expression ] + FROM { from_item [ , ...] } + [ WHERE boolean_expression ] + [ GROUP BY expression [ , ...] ] + [ HAVING boolean_expression ] {% endhighlight %} ### Parameters +
with_query
@@ -141,15 +146,16 @@ SELECT [ hints , ... ] [ ALL | DISTINCT ] { named_expression [ , ... ] }
-### Related Clauses
-- [WHERE Clause](sql-ref-syntax-qry-select-where.html)
-- [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html)
-- [HAVING Clause](sql-ref-syntax-qry-select-having.html)
-- [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html)
-- [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html)
-- [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html)
-- [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html)
-- [LIMIT Clause](sql-ref-syntax-qry-select-limit.html)
-- [TABLESAMPLE](sql-ref-syntax-qry-sampling.html)
-- [JOIN](sql-ref-syntax-qry-select-join.html)
-- [SET Operators](sql-ref-syntax-qry-select-setops.html)
+### Related Statements
+
+ * [WHERE Clause](sql-ref-syntax-qry-select-where.html)
+ * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html)
+ * [HAVING Clause](sql-ref-syntax-qry-select-having.html)
+ * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html)
+ * [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html)
+ * [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html)
+ * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html)
+ * [LIMIT Clause](sql-ref-syntax-qry-select-limit.html)
+ * [TABLESAMPLE](sql-ref-syntax-qry-sampling.html)
+ * [JOIN](sql-ref-syntax-qry-select-join.html)
+ * [SET Operators](sql-ref-syntax-qry-select-setops.html)
diff --git a/docs/sql-ref-syntax-qry.md b/docs/sql-ref-syntax-qry.md
index 37414acd57a38..477c347eed800 100644
--- a/docs/sql-ref-syntax-qry.md
+++ b/docs/sql-ref-syntax-qry.md
@@ -26,13 +26,12 @@ and brief description of supported clauses are explained in
 ability to generate logical and physical plan for a given query using [EXPLAIN](sql-ref-syntax-qry-explain.html) statement.
-
-- [WHERE Clause](sql-ref-syntax-qry-select-where.html)
-- [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html)
-- [HAVING Clause](sql-ref-syntax-qry-select-having.html)
-- [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html)
-- [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html)
-- [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html)
-- [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html)
-- [LIMIT Clause](sql-ref-syntax-qry-select-limit.html)
-- [EXPLAIN Statement](sql-ref-syntax-qry-explain.html)
+ * [WHERE Clause](sql-ref-syntax-qry-select-where.html)
+ * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html)
+ * [HAVING Clause](sql-ref-syntax-qry-select-having.html)
+ * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html)
+ * [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html)
+ * [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html)
+ * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html)
+ * [LIMIT Clause](sql-ref-syntax-qry-select-limit.html)
+ * [EXPLAIN Statement](sql-ref-syntax-qry-explain.html)
diff --git a/docs/sql-ref-syntax.md b/docs/sql-ref-syntax.md
index dd611d75d3429..94bd476ffb7b1 100644
--- a/docs/sql-ref-syntax.md
+++ b/docs/sql-ref-syntax.md
@@ -22,62 +22,66 @@ license: |
 Spark SQL is Apache Spark's module for working with structured data. The SQL Syntax section describes the SQL syntax in detail along with usage examples when applicable. This document provides a list of Data Definition and Data Manipulation Statements, as well as Data Retrieval and Auxiliary Statements.
 ### DDL Statements
-- [ALTER DATABASE](sql-ref-syntax-ddl-alter-database.html)
-- [ALTER TABLE](sql-ref-syntax-ddl-alter-table.html)
-- [ALTER VIEW](sql-ref-syntax-ddl-alter-view.html)
-- [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html)
-- [CREATE FUNCTION](sql-ref-syntax-ddl-create-function.html)
-- [CREATE TABLE](sql-ref-syntax-ddl-create-table.html)
-- [CREATE VIEW](sql-ref-syntax-ddl-create-view.html)
-- [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html)
-- [DROP FUNCTION](sql-ref-syntax-ddl-drop-function.html)
-- [DROP TABLE](sql-ref-syntax-ddl-drop-table.html)
-- [DROP VIEW](sql-ref-syntax-ddl-drop-view.html)
-- [REPAIR TABLE](sql-ref-syntax-ddl-repair-table.html)
-- [TRUNCATE TABLE](sql-ref-syntax-ddl-truncate-table.html)
-- [USE DATABASE](sql-ref-syntax-qry-select-usedb.html)
+
+ * [ALTER DATABASE](sql-ref-syntax-ddl-alter-database.html)
+ * [ALTER TABLE](sql-ref-syntax-ddl-alter-table.html)
+ * [ALTER VIEW](sql-ref-syntax-ddl-alter-view.html)
+ * [CREATE DATABASE](sql-ref-syntax-ddl-create-database.html)
+ * [CREATE FUNCTION](sql-ref-syntax-ddl-create-function.html)
+ * [CREATE TABLE](sql-ref-syntax-ddl-create-table.html)
+ * [CREATE VIEW](sql-ref-syntax-ddl-create-view.html)
+ * [DROP DATABASE](sql-ref-syntax-ddl-drop-database.html)
+ * [DROP FUNCTION](sql-ref-syntax-ddl-drop-function.html)
+ * [DROP TABLE](sql-ref-syntax-ddl-drop-table.html)
+ * [DROP VIEW](sql-ref-syntax-ddl-drop-view.html)
+ * [REPAIR TABLE](sql-ref-syntax-ddl-repair-table.html)
+ * [TRUNCATE TABLE](sql-ref-syntax-ddl-truncate-table.html)
+ * [USE DATABASE](sql-ref-syntax-qry-select-usedb.html)

 ### DML Statements
-- [INSERT INTO](sql-ref-syntax-dml-insert-into.html)
-- [INSERT OVERWRITE](sql-ref-syntax-dml-insert-overwrite-table.html)
-- [INSERT OVERWRITE DIRECTORY](sql-ref-syntax-dml-insert-overwrite-directory.html)
-- [INSERT OVERWRITE DIRECTORY with Hive format](sql-ref-syntax-dml-insert-overwrite-directory-hive.html)
-- [LOAD](sql-ref-syntax-dml-load.html)
+
+ * [INSERT INTO](sql-ref-syntax-dml-insert-into.html)
+ * [INSERT OVERWRITE](sql-ref-syntax-dml-insert-overwrite-table.html)
+ * [INSERT OVERWRITE DIRECTORY](sql-ref-syntax-dml-insert-overwrite-directory.html)
+ * [INSERT OVERWRITE DIRECTORY with Hive format](sql-ref-syntax-dml-insert-overwrite-directory-hive.html)
+ * [LOAD](sql-ref-syntax-dml-load.html)

 ### Data Retrieval Statements
-- [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html)
-- [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html)
-- [EXPLAIN](sql-ref-syntax-qry-explain.html)
-- [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html)
-- [HAVING Clause](sql-ref-syntax-qry-select-having.html)
-- [LIMIT Clause](sql-ref-syntax-qry-select-limit.html)
-- [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html)
-- [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html)
-- [WHERE Clause](sql-ref-syntax-qry-select-where.html)
+
+ * [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html)
+ * [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html)
+ * [EXPLAIN](sql-ref-syntax-qry-explain.html)
+ * [GROUP BY Clause](sql-ref-syntax-qry-select-groupby.html)
+ * [HAVING Clause](sql-ref-syntax-qry-select-having.html)
+ * [LIMIT Clause](sql-ref-syntax-qry-select-limit.html)
+ * [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html)
+ * [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html)
+ * [WHERE Clause](sql-ref-syntax-qry-select-where.html)

 ### Auxiliary Statements
-- [ADD FILE](sql-ref-syntax-aux-resource-mgmt-add-file.html)
-- [ADD JAR](sql-ref-syntax-aux-resource-mgmt-add-jar.html)
-- [ANALYZE TABLE](sql-ref-syntax-aux-analyze-table.html)
-- [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html)
-- [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html)
-- [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html)
-- [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html)
-- [DESCRIBE QUERY](sql-ref-syntax-aux-describe-query.html)
-- [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html)
-- [LIST FILE](sql-ref-syntax-aux-resource-mgmt-list-file.html)
-- [LIST JAR](sql-ref-syntax-aux-resource-mgmt-list-jar.html)
-- [REFRESH](sql-ref-syntax-aux-cache-refresh.html)
-- [REFRESH TABLE](sql-ref-syntax-aux-refresh-table.html)
-- [SET](sql-ref-syntax-aux-conf-mgmt-set.html)
-- [SHOW COLUMNS](sql-ref-syntax-aux-show-columns.html)
-- [SHOW CREATE TABLE](sql-ref-syntax-aux-show-create-table.html)
-- [SHOW DATABASES](sql-ref-syntax-aux-show-databases.html)
-- [SHOW FUNCTIONS](sql-ref-syntax-aux-show-functions.html)
-- [SHOW PARTITIONS](sql-ref-syntax-aux-show-partitions.html)
-- [SHOW TABLE EXTENDED](sql-ref-syntax-aux-show-table.html)
-- [SHOW TABLES](sql-ref-syntax-aux-show-tables.html)
-- [SHOW TBLPROPERTIES](sql-ref-syntax-aux-show-tblproperties.html)
-- [SHOW VIEWS](sql-ref-syntax-aux-show-views.html)
-- [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html)
-- [UNSET](sql-ref-syntax-aux-conf-mgmt-reset.html)
+
+ * [ADD FILE](sql-ref-syntax-aux-resource-mgmt-add-file.html)
+ * [ADD JAR](sql-ref-syntax-aux-resource-mgmt-add-jar.html)
+ * [ANALYZE TABLE](sql-ref-syntax-aux-analyze-table.html)
+ * [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html)
+ * [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html)
+ * [DESCRIBE DATABASE](sql-ref-syntax-aux-describe-database.html)
+ * [DESCRIBE FUNCTION](sql-ref-syntax-aux-describe-function.html)
+ * [DESCRIBE QUERY](sql-ref-syntax-aux-describe-query.html)
+ * [DESCRIBE TABLE](sql-ref-syntax-aux-describe-table.html)
+ * [LIST FILE](sql-ref-syntax-aux-resource-mgmt-list-file.html)
+ * [LIST JAR](sql-ref-syntax-aux-resource-mgmt-list-jar.html)
+ * [REFRESH](sql-ref-syntax-aux-cache-refresh.html)
+ * [REFRESH TABLE](sql-ref-syntax-aux-refresh-table.html)
+ * [SET](sql-ref-syntax-aux-conf-mgmt-set.html)
+ * [SHOW COLUMNS](sql-ref-syntax-aux-show-columns.html)
+ * [SHOW CREATE TABLE](sql-ref-syntax-aux-show-create-table.html)
+ * [SHOW DATABASES](sql-ref-syntax-aux-show-databases.html)
+ * [SHOW FUNCTIONS](sql-ref-syntax-aux-show-functions.html)
+ * [SHOW PARTITIONS](sql-ref-syntax-aux-show-partitions.html)
+ * [SHOW TABLE EXTENDED](sql-ref-syntax-aux-show-table.html)
+ * [SHOW TABLES](sql-ref-syntax-aux-show-tables.html)
+ * [SHOW TBLPROPERTIES](sql-ref-syntax-aux-show-tblproperties.html)
+ * [SHOW VIEWS](sql-ref-syntax-aux-show-views.html)
+ * [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html)
+ * [UNSET](sql-ref-syntax-aux-conf-mgmt-reset.html)