These statements return information about the currently registered tables and their columns. The output format is mostly compatible with that of Presto.
SHOW SCHEMAS
SHOW TABLES FROM <schema-name>
SHOW COLUMNS FROM <table-name>
DESCRIBE <table-name>
ANALYZE TABLE <table-name> COMPUTE STATISTICS [ FOR ALL COLUMNS | FOR COLUMNS <column> [ , ... ] ]
See :ref:`sql` for information on how to reference schemas and tables correctly.
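The statement forms listed above can also be assembled programmatically. The following is a minimal sketch in Python. The helper names `quote_identifier`, `show_columns_statement` and `analyze_statement` are hypothetical and not part of dask-sql, and the doubling of embedded double quotes follows the common SQL convention, which is an assumption here rather than something this page specifies. Column names are passed through unquoted, as in the example on this page.

```python
from typing import Optional, Sequence


def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def show_columns_statement(table: str) -> str:
    """Build a SHOW COLUMNS statement for one table."""
    return f"SHOW COLUMNS FROM {quote_identifier(table)}"


def analyze_statement(table: str, columns: Optional[Sequence[str]] = None) -> str:
    """Build an ANALYZE TABLE statement for all columns or a chosen subset."""
    prefix = f"ANALYZE TABLE {quote_identifier(table)} COMPUTE STATISTICS"
    if not columns:
        # No explicit columns given: use the FOR ALL COLUMNS form.
        return f"{prefix} FOR ALL COLUMNS"
    # Column names are joined as-is (unquoted), matching the page's example.
    return f"{prefix} FOR COLUMNS {', '.join(columns)}"


if __name__ == "__main__":
    print(show_columns_statement("timeseries"))
    print(analyze_statement("timeseries"))
    print(analyze_statement("timeseries", ["x", "y"]))
```

The resulting strings can then be passed to whatever interface is used to submit SQL to dask-sql.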
Show the schemas registered in dask-sql.
This statement is included only for compatibility reasons.
There is always exactly one data schema, called "schema", which holds all the data, plus an additional schema called "information_schema",
which is empty and exists only because some BI tools require it.
Example:
SHOW SCHEMAS
Result:
Schema |
---|
schema |
information_schema |
Show the registered tables in a given schema.
Example:
SHOW TABLES FROM "schema"
Result:
Table |
---|
timeseries |
Show column information on a specific table.
Example:
SHOW COLUMNS FROM "timeseries"
Result:
Column | Type | Extra Comment |
---|---|---|
id | bigint | |
name | varchar | |
x | double | |
y | double | |
The "Extra Comment" column is shown for compatibility with Presto.
Calculate statistics on a given table (for the given columns or for all columns)
and return them as a query result.
Note that this process can be time-consuming on large tables.
Although this statement is very similar to the ANALYZE TABLE
statement in, for example, Apache Spark, it does not optimize subsequent queries (as its Spark counterpart does).
Example:
ANALYZE TABLE "timeseries" COMPUTE STATISTICS FOR COLUMNS x, y
Result:
 | x | y |
---|---|---|
count | 30 | 30 |
mean | 0.140374 | -0.107481 |
std | 0.568248 | 0.573106 |
min | -0.795112 | -0.966043 |
25% | -0.379635 | -0.561234 |
50% | 0.0104101 | -0.237795 |
75% | 0.70208 | 0.263459 |
max | 0.990747 | 0.947069 |
data_type | double | double |
col_name | x | y |
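To make the numeric rows of this result concrete, here is a rough sketch of how such a summary could be computed for a single column using only the Python standard library. It is not dask-sql's implementation: the function name `describe_column` is hypothetical, the sample standard deviation and linearly interpolated quartiles are assumptions, and a distributed engine may compute quantiles approximately, so values on real data can differ slightly.

```python
import statistics
from typing import Dict, Sequence


def describe_column(values: Sequence[float]) -> Dict[str, float]:
    """Return summary statistics for one numeric column (needs >= 2 values)."""
    ordered = sorted(values)
    # method="inclusive" interpolates linearly between the sorted data points.
    q25, q50, q75 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "count": len(ordered),
        "mean": statistics.fmean(ordered),
        "std": statistics.stdev(ordered),  # sample standard deviation (n - 1)
        "min": ordered[0],
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "max": ordered[-1],
    }


if __name__ == "__main__":
    for name, value in describe_column([1.0, 2.0, 3.0, 4.0, 5.0]).items():
        print(f"{name}: {value}")
```

Every statistic here requires reading the whole column, which is consistent with the note above that the statement can be slow on large tables.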