/
lexical-structure.rst
341 lines (267 loc) · 8.92 KB
/
lexical-structure.rst
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
.. highlight:: psql
.. _sql_lexical:
=================
Lexical structure
=================
An SQL input consists of a sequence of commands each of which is a sequence of
tokens, terminated by a semicolon (``;``).
The syntax of a command defines its set of valid tokens. A token can be a key
word, an identifier, a quoted identifier, a literal (or constant), or a special
character symbol.
.. rubric:: Table of contents
.. contents::
:local:
.. _string_literal:
String literal
==============
String literals are defined as an arbitrary sequence of characters that are
delimited with single quotes ``'`` as defined in ANSI SQL, for example
``'This is a string'``.
In addition, CrateDB supports dollar quoted strings to help avoid escaping
single quotes within single quoted strings.
For example, ``'I''m a string'`` can be re-written as
``$<tag>$I'm a string$<tag>$``, where the matching pair of ``<tag>`` can be
zero or more characters in length.
::
cr> select 'I''m a string' = $tag1$I'm a string$tag1$;
+------+
| true |
+------+
| TRUE |
+------+
SELECT 1 row in set (... sec)
.. NOTE::
Nested dollar quoted strings are currently not supported.
Escape strings
--------------
The escape character in CrateDB is the single-quote ``'``. A character gets
escaped when adding a single-quote before it. For example a single quote
character within a string literal can be included by writing two adjacent
single quotes, e.g., ``'Jack''s car'``.
.. NOTE::
Two adjacent single quotes are **not** equivalent to the double-quote
character ``"``.
.. _sql_escape_string_literals:
String literals with C-Style escapes
------------------------------------
In addition to the escaped character ``'``, CrateDB supports C-Style escaped
string sequences. Such a sequence is constructed by prefixing the string
literal with the letter ``E`` or ``e``, for example, ``e'hello\nWorld'``.
The following escaped sequences are supported:
================================================== ================
Escape Sequence Interpretation
================================================== ================
``\b`` backspace
``\f`` form feed
``\n`` newline
``\r`` carriage return
``\t`` tab
``\o``, ``\oo``, ``\ooo`` (``o`` = [0-7]) octal byte value
``\xh``, ``xhh`` (``h`` = [0-9,A-F,a-f]) hexadecimal byte value
``\uxxxx``, ``\Uxxxxxxxx`` (``x`` = [0-9,A-F,a-f]) 16 or 32-bit hexadecimal Unicode character value
================================================== ================
For instance, the escape string literal ``e'\u0061\x61\141'`` is equivalent to
the ``'aaa'`` string literal.
::
cr> select e'\u0061\x61\141' as col1;
+------+
| col1 |
+------+
| aaa |
+------+
SELECT 1 row in set (... sec)
Any other character following a backslash is taken literally. Thus, to include
a backslash character ``\``, two adjacent backslashes need to be used
(i.e. ``\\``).
::
cr> select e'aa\\nbb' as col1;
+--------+
| col1 |
+--------+
| aa\nbb |
+--------+
SELECT 1 row in set (... sec)
Finally, a single quote can be included in an escape string literal by also
using the escape backslash character: ``\'``, in addition to the single-quote
described in the :ref:`string literals <string_literal>` section.
::
cr> select e'aa\'bb' as col1;
+-------+
| col1 |
+-------+
| aa'bb |
+-------+
SELECT 1 row in set (... sec)
.. _sql_lexical_keywords_identifiers:
Key words and identifiers
=========================
The table below lists all *reserved key words* in CrateDB. These need to be
quoted if used as identifiers::
cr> SELECT word FROM pg_catalog.pg_get_keywords() WHERE catcode = 'R' ORDER BY 1;
+-------------------+
| word |
+-------------------+
| add |
| all |
| alter |
| and |
| any |
| array |
| as |
| asc |
| between |
| by |
| called |
| case |
| cast |
| column |
| constraint |
| costs |
| create |
| cross |
| current_date |
| current_role |
| current_schema |
| current_time |
| current_timestamp |
| current_user |
| default |
| delete |
| deny |
| desc |
| describe |
| directory |
| distinct |
| drop |
| else |
| end |
| escape |
| except |
| exists |
| extract |
| false |
| first |
| for |
| from |
| full |
| function |
| grant |
| group |
| having |
| if |
| in |
| index |
| inner |
| input |
| insert |
| intersect |
| into |
| is |
| join |
| last |
| left |
| like |
| limit |
| match |
| natural |
| not |
| null |
| nulls |
| object |
| offset |
| on |
| or |
| order |
| outer |
| persistent |
| recursive |
| reset |
| returns |
| revoke |
| right |
| select |
| session_user |
| set |
| some |
| stratify |
| table |
| then |
| transient |
| true |
| try_cast |
| unbounded |
| union |
| update |
| user |
| using |
| when |
| where |
| with |
+-------------------+
SELECT 96 rows in set (... sec)
Tokens such as ``my_table``, ``id``, ``name``, or ``data`` in the example below
are *identifiers*, which identify names of tables, columns, and other database
objects.
Example::
CREATE TABLE my_table (
id INTEGER,
name STRING,
data OBJECT
) WITH (number_of_replicas = 0);
.. NOTE::
Key words and unquoted identifiers are case insensitive while quoted
identifiers are case sensitive.
This means that::
select foo from t;
is equivalent to::
select Foo from t;
or::
select FOO from t;
To query a table named ``Foo``::
select "Foo" from t;
A widely used convention is to write key words in uppercase and identifiers in
lowercase, such as
::
ALTER TABLE foo ADD COLUMN new_column INTEGER;
::
INSERT INTO foo (id, name) VALUES (1, 'bar');
Quoted identifiers can contain an arbitrary sequence of characters enclosed by
double quotes (``"``). Quoted identifiers are never keywords, so you can use
``"update"`` as a table or column name.
.. _sql_lexical_special_chars:
Special characters
==================
Some non-alphanumeric characters do have a special meaning. For their usage
please refer to the sections where the respective syntax elements are
described.
:Semicolon:
The semicolon (``;``) terminates an SQL statement. It cannot appear
anywhere else within the command, except within a string or quoted
identifier.
:Comma:
The comma (``,``) is used in various syntactical elements to separate
elements of a list.
:Brackets:
Square brackets (``[]``) are used to select elements of arrays and objects,
e.g. ``arr[1]`` or ``obj['key']``.
:Asterisk:
The asterisk (``*``) is used in some contexts to denote all columns of a
table. As an argument in global :ref:`aggregate functions
<aggregation-functions>` it has the meaning of *any field*,
e.g. ``COUNT(*)``.
:Period:
The period (``.``) is used for numeric values and to separate schema and
table names, e.g. ``blob.my_blob_table``.
.. _sql_lexical_comments:
Comments
========
An SQL statement can contain comments. Single line comments start with a double
dash (``--``) and end at the end of that line. Multi line comments start with
``/*`` and end with ``*/``.
Example::
/*
* Retrieve information about all tables in the 'doc' schema.
*/
SELECT *
FROM information_schema.tables
WHERE table_schema = 'doc'; -- query information schema for doc tables