-
Notifications
You must be signed in to change notification settings - Fork 16
/
Copy pathwindow.html
215 lines (191 loc) · 13.5 KB
/
window.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>6.15. Window Functions — Presto 0.184 Documentation</title>
<link rel="stylesheet" href="../_static/presto.css" type="text/css" />
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: '../',
VERSION: '0.184',
COLLAPSE_INDEX: false,
FILE_SUFFIX: '.html',
HAS_SOURCE: true
};
</script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="top" title="Presto 0.184 Documentation" href="../index.html" />
<link rel="up" title="6. Functions and Operators" href="../functions.html" />
<link rel="next" title="6.16. Color Functions" href="color.html" />
<link rel="prev" title="6.14. Aggregate Functions" href="aggregate.html" />
</head>
<body role="document">
<div class="header">
<h1 class="heading"><a href="../index.html">
<span>Presto 0.184 Documentation</span></a></h1>
<h2 class="heading"><span>6.15. Window Functions</span></h2>
</div>
<div class="topnav">
<p class="nav">
<span class="left">
« <a href="aggregate.html">6.14. Aggregate Functions</a>
</span>
<span class="right">
<a href="color.html">6.16. Color Functions</a> »
</span>
</p>
</div>
<div class="content">
<div class="section" id="window-functions">
<h1>6.15. Window Functions</h1>
<p>Window functions perform calculations across rows of the query result.
They run after the <code class="docutils literal"><span class="pre">HAVING</span></code> clause but before the <code class="docutils literal"><span class="pre">ORDER</span> <span class="pre">BY</span></code> clause.
Invoking a window function requires special syntax using the <code class="docutils literal"><span class="pre">OVER</span></code>
clause to specify the window. A window has three components:</p>
<ul class="simple">
<li>The partition specification, which separates the input rows into different
partitions. This is analogous to how the <code class="docutils literal"><span class="pre">GROUP</span> <span class="pre">BY</span></code> clause separates rows
into different groups for aggregate functions.</li>
<li>The ordering specification, which determines the order in which input rows
will be processed by the window function.</li>
<li>The window frame, which specifies a sliding window of rows to be processed
by the function for a given row. If the frame is not specified, it defaults
to <code class="docutils literal"><span class="pre">RANGE</span> <span class="pre">UNBOUNDED</span> <span class="pre">PRECEDING</span></code>, which is the same as
<code class="docutils literal"><span class="pre">RANGE</span> <span class="pre">BETWEEN</span> <span class="pre">UNBOUNDED</span> <span class="pre">PRECEDING</span> <span class="pre">AND</span> <span class="pre">CURRENT</span> <span class="pre">ROW</span></code>. This frame contains all
rows from the start of the partition up to the last peer of the current row.</li>
</ul>
<p>For example, the following query ranks orders for each clerk by price:</p>
<div class="highlight-sql"><div class="highlight"><pre><span></span><span class="k">SELECT</span> <span class="n">orderkey</span><span class="p">,</span> <span class="n">clerk</span><span class="p">,</span> <span class="n">totalprice</span><span class="p">,</span>
<span class="n">rank</span><span class="p">()</span> <span class="n">OVER</span> <span class="p">(</span><span class="n">PARTITION</span> <span class="k">BY</span> <span class="n">clerk</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">totalprice</span> <span class="k">DESC</span><span class="p">)</span> <span class="k">AS</span> <span class="n">rnk</span>
<span class="k">FROM</span> <span class="n">orders</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">clerk</span><span class="p">,</span> <span class="n">rnk</span>
</pre></div>
</div>
<div class="section" id="aggregate-functions">
<h2>Aggregate Functions</h2>
<p>All <a class="reference internal" href="aggregate.html"><span class="doc">Aggregate Functions</span></a> can be used as window functions by adding the <code class="docutils literal"><span class="pre">OVER</span></code>
clause. The aggregate function is computed for each row over the rows within
the current row’s window frame.</p>
<p>For example, the following query produces a rolling sum of order prices
by day for each clerk:</p>
<div class="highlight-sql"><div class="highlight"><pre><span></span><span class="k">SELECT</span> <span class="n">clerk</span><span class="p">,</span> <span class="n">orderdate</span><span class="p">,</span> <span class="n">orderkey</span><span class="p">,</span> <span class="n">totalprice</span><span class="p">,</span>
<span class="k">sum</span><span class="p">(</span><span class="n">totalprice</span><span class="p">)</span> <span class="n">OVER</span> <span class="p">(</span><span class="n">PARTITION</span> <span class="k">BY</span> <span class="n">clerk</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">orderdate</span><span class="p">)</span> <span class="k">AS</span> <span class="n">rolling_sum</span>
<span class="k">FROM</span> <span class="n">orders</span>
<span class="k">ORDER</span> <span class="k">BY</span> <span class="n">clerk</span><span class="p">,</span> <span class="n">orderdate</span><span class="p">,</span> <span class="n">orderkey</span>
</pre></div>
</div>
</div>
<div class="section" id="ranking-functions">
<h2>Ranking Functions</h2>
<dl class="function">
<dt id="cume_dist">
<code class="descname">cume_dist</code><span class="sig-paren">(</span><span class="sig-paren">)</span> → bigint</dt>
<dd><p>Returns the cumulative distribution of a value in a group of values.
The result is the number of rows preceding or peer with the row in the
window ordering of the window partition divided by the total number of
rows in the window partition. Thus, any tie values in the ordering will
evaluate to the same distribution value.</p>
</dd></dl>
<dl class="function">
<dt id="dense_rank">
<code class="descname">dense_rank</code><span class="sig-paren">(</span><span class="sig-paren">)</span> → bigint</dt>
<dd><p>Returns the rank of a value in a group of values. This is similar to
<a class="reference internal" href="#rank" title="rank"><code class="xref py py-func docutils literal"><span class="pre">rank()</span></code></a>, except that tie values do not produce gaps in the sequence.</p>
</dd></dl>
<dl class="function">
<dt id="ntile">
<code class="descname">ntile</code><span class="sig-paren">(</span><em>n</em><span class="sig-paren">)</span> → bigint</dt>
<dd><p>Divides the rows for each window partition into <code class="docutils literal"><span class="pre">n</span></code> buckets ranging
from <code class="docutils literal"><span class="pre">1</span></code> to at most <code class="docutils literal"><span class="pre">n</span></code>. Bucket values will differ by at most <code class="docutils literal"><span class="pre">1</span></code>.
If the number of rows in the partition does not divide evenly into the
number of buckets, then the remainder values are distributed one per
bucket, starting with the first bucket.</p>
<p>For example, with <code class="docutils literal"><span class="pre">6</span></code> rows and <code class="docutils literal"><span class="pre">4</span></code> buckets, the bucket values would
be as follows: <code class="docutils literal"><span class="pre">1</span></code> <code class="docutils literal"><span class="pre">1</span></code> <code class="docutils literal"><span class="pre">2</span></code> <code class="docutils literal"><span class="pre">2</span></code> <code class="docutils literal"><span class="pre">3</span></code> <code class="docutils literal"><span class="pre">4</span></code></p>
</dd></dl>
<dl class="function">
<dt id="percent_rank">
<code class="descname">percent_rank</code><span class="sig-paren">(</span><span class="sig-paren">)</span> → bigint</dt>
<dd><p>Returns the percentage ranking of a value in group of values. The result
is <code class="docutils literal"><span class="pre">(r</span> <span class="pre">-</span> <span class="pre">1)</span> <span class="pre">/</span> <span class="pre">(n</span> <span class="pre">-</span> <span class="pre">1)</span></code> where <code class="docutils literal"><span class="pre">r</span></code> is the <a class="reference internal" href="#rank" title="rank"><code class="xref py py-func docutils literal"><span class="pre">rank()</span></code></a> of the row and
<code class="docutils literal"><span class="pre">n</span></code> is the total number of rows in the window partition.</p>
</dd></dl>
<dl class="function">
<dt id="rank">
<code class="descname">rank</code><span class="sig-paren">(</span><span class="sig-paren">)</span> → bigint</dt>
<dd><p>Returns the rank of a value in a group of values. The rank is one plus
the number of rows preceding the row that are not peer with the row.
Thus, tie values in the ordering will produce gaps in the sequence.
The ranking is performed for each window partition.</p>
</dd></dl>
<dl class="function">
<dt id="row_number">
<code class="descname">row_number</code><span class="sig-paren">(</span><span class="sig-paren">)</span> → bigint</dt>
<dd><p>Returns a unique, sequential number for each row, starting with one,
according to the ordering of rows within the window partition.</p>
</dd></dl>
</div>
<div class="section" id="value-functions">
<h2>Value Functions</h2>
<dl class="function">
<dt id="first_value">
<code class="descname">first_value</code><span class="sig-paren">(</span><em>x</em><span class="sig-paren">)</span> → [same as input]</dt>
<dd><p>Returns the first value of the window.</p>
</dd></dl>
<dl class="function">
<dt id="last_value">
<code class="descname">last_value</code><span class="sig-paren">(</span><em>x</em><span class="sig-paren">)</span> → [same as input]</dt>
<dd><p>Returns the last value of the window.</p>
</dd></dl>
<dl class="function">
<dt id="nth_value">
<code class="descname">nth_value</code><span class="sig-paren">(</span><em>x</em>, <em>offset</em><span class="sig-paren">)</span> → [same as input]</dt>
<dd><p>Returns the value at the specified offset from beginning the window.
Offsets start at <code class="docutils literal"><span class="pre">1</span></code>. The offset can be any scalar
expression. If the offset is null or greater than the number of values in
the window, null is returned. It is an error for the offset to be zero or
negative.</p>
</dd></dl>
<dl class="function">
<dt id="lead">
<code class="descname">lead</code><span class="sig-paren">(</span><em>x</em><span class="optional">[</span>, <em>offset</em><span class="optional">[</span>, <em>default_value</em><span class="optional">]</span><span class="optional">]</span><span class="sig-paren">)</span> → [same as input]</dt>
<dd><p>Returns the value at <code class="docutils literal"><span class="pre">offset</span></code> rows after the current row in the window.
Offsets start at <code class="docutils literal"><span class="pre">0</span></code>, which is the current row. The
offset can be any scalar expression. The default <code class="docutils literal"><span class="pre">offset</span></code> is <code class="docutils literal"><span class="pre">1</span></code>. If the
offset is null or larger than the window, the <code class="docutils literal"><span class="pre">default_value</span></code> is returned,
or if it is not specified <code class="docutils literal"><span class="pre">null</span></code> is returned.</p>
</dd></dl>
<dl class="function">
<dt id="lag">
<code class="descname">lag</code><span class="sig-paren">(</span><em>x</em><span class="optional">[</span>, <em>offset</em><span class="optional">[</span>, <em>default_value</em><span class="optional">]</span><span class="optional">]</span><span class="sig-paren">)</span> → [same as input]</dt>
<dd><p>Returns the value at <code class="docutils literal"><span class="pre">offset</span></code> rows before the current row in the window
Offsets start at <code class="docutils literal"><span class="pre">0</span></code>, which is the current row. The
offset can be any scalar expression. The default <code class="docutils literal"><span class="pre">offset</span></code> is <code class="docutils literal"><span class="pre">1</span></code>. If the
offset is null or larger than the window, the <code class="docutils literal"><span class="pre">default_value</span></code> is returned,
or if it is not specified <code class="docutils literal"><span class="pre">null</span></code> is returned.</p>
</dd></dl>
</div>
</div>
</div>
<div class="bottomnav">
<p class="nav">
<span class="left">
« <a href="aggregate.html">6.14. Aggregate Functions</a>
</span>
<span class="right">
<a href="color.html">6.16. Color Functions</a> »
</span>
</p>
</div>
<div class="footer" role="contentinfo">
</div>
</body>
</html>