Skip to content

Commit

Permalink
- Slightly improved WINDOW function documentation. Created a new chapter
Browse files Browse the repository at this point in the history
  and cross linked to the Functions and Arguments chapter.
  • Loading branch information
jconway committed Aug 18, 2011
1 parent b2f332d commit c1db504
Showing 1 changed file with 130 additions and 23 deletions.
153 changes: 130 additions & 23 deletions doc/plr.sgml
Original file line number Diff line number Diff line change
Expand Up @@ -204,20 +204,6 @@ CREATE OR REPLACE FUNCTION sd(vals float8[]) RETURNS float AS '
For example:

<programlisting>
-- create test table
CREATE TABLE test_data (
fyear integer,
firm float8,
eps float8
);

-- insert randomly pertubated data for test
INSERT INTO test_data
SELECT (b.f + 1) % 10 + 2000 AS fyear,
floor((b.f+1)/10) + 50 AS firm,
f::float8/100 + random()/10 AS eps
FROM generate_series(-500,499,1) b(f);

CREATE OR REPLACE
FUNCTION r_regr_slope(float8, float8)
RETURNS float8 AS
Expand All @@ -229,22 +215,22 @@ $BODY$
return(slope)
$BODY$
LANGUAGE plr WINDOW;

SELECT *, r_regr_slope(eps, lag_eps) OVER w AS slope_R
FROM (SELECT firm, fyear, eps,
lag(eps) OVER (ORDER BY firm, fyear) AS lag_eps
FROM test_data) AS a
WHERE eps IS NOT NULL
WINDOW w AS (ORDER BY firm, fyear ROWS 8 PRECEDING);
</programlisting>

In the preceding example, <literal>farg1</> and <literal>farg2</> are
R vectors containing the current row's data plus that of the preceding
8 rows. The example also illustrates one of two additional autogenerated
R vectors containing the current row's data plus that of related
rows. The determination as to which rows qualify as related is
determined by the frame specification of the query at run time.
</para>

<para>
The example also illustrates one of two additional autogenerated
arguments. <literal>fnumrows</> is the number of rows in the current
<literal>WINDOW</> frame. The other (not shown) auto-argument is called
<literal>prownum</>. This argument provides the 1-based row offset of the
current row in the current <literal>PARTITION</>.
See <xref linkend="plr-window-funcs"> for more information and a more
complete example.
</para>

<para>
Expand Down Expand Up @@ -1342,6 +1328,127 @@ select f1, median(f2) from foo group by f1 order by f1;
</para>
</chapter>

<chapter id="plr-window-funcs">
<title>Window Functions</title>
<para>
Starting with version 8.4, PostgreSQL supports <literal>WINDOW</literal>
functions which provide the ability to perform calculations across sets
of rows that are related to the current query row. This is comparable to
the type of calculation that can be done with an aggregate function. But
unlike regular aggregate functions, use of a window function does not
cause rows to become grouped into a single output row; the rows retain
their separate identities. Behind the scenes, the window function is able
to access more than just the current row of the query result. See the
PostgreSQL documentation for more general information related to the use
of this capability.
</para>

<para>
PL/R functions may be defined as <literal>WINDOW</literal>. For example:
<programlisting>
CREATE OR REPLACE
FUNCTION r_regr_slope(float8, float8)
RETURNS float8 AS
$BODY$
slope <- NA
y <- farg1
x <- farg2
if (fnumrows==9) try (slope <- lm(y ~ x)$coefficients[2])
return(slope)
$BODY$
LANGUAGE plr WINDOW;
</programlisting>
</para>

<para>
A number of variables are automatically provided by PL/R to the R
interpreter:
</para>

<variablelist>
<varlistentry>
<term><literal>farg</><replaceable>N</replaceable>
</term>
<listitem>
<para>
<literal>farg1</> and <literal>farg2</> are R vectors containing
the current row's data plus that of the related rows.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>fnumrows</>
</term>
<listitem>
<para>
The number of rows in the current <literal>WINDOW</> frame.
</para>
</listitem>
</varlistentry>

<varlistentry>
<term><literal>prownum</> (not shown)
</term>
<listitem>
<para>
Provides the 1-based row offset of the current row in the current
<literal>PARTITION</>.
</para>
</listitem>
</varlistentry>
</variablelist>

<para>
A more complete example follows:

<programlisting>
-- create test table
CREATE TABLE test_data (
fyear integer,
firm float8,
eps float8
);

-- insert randomly pertubated data for test
INSERT INTO test_data
SELECT (b.f + 1) % 10 + 2000 AS fyear,
floor((b.f+1)/10) + 50 AS firm,
f::float8/100 + random()/10 AS eps
FROM generate_series(-500,499,1) b(f);

CREATE OR REPLACE
FUNCTION r_regr_slope(float8, float8)
RETURNS float8 AS
$BODY$
slope <- NA
y <- farg1
x <- farg2
if (fnumrows==9) try (slope <- lm(y ~ x)$coefficients[2])
return(slope)
$BODY$
LANGUAGE plr WINDOW;

SELECT *, r_regr_slope(eps, lag_eps) OVER w AS slope_R
FROM (SELECT firm, fyear, eps,
lag(eps) OVER (ORDER BY firm, fyear) AS lag_eps
FROM test_data) AS a
WHERE eps IS NOT NULL
WINDOW w AS (ORDER BY firm, fyear ROWS 8 PRECEDING);
</programlisting>
</para>

<para>
In this example, the variables <literal>farg1</> and <literal>farg2</>
contain the current row value for eps and lag_eps, as well as the
preceding 8 rows which are also in the same <literal>WINDOW</> frame
within the same <literal>PARTITION</>. In this case since no
<literal>PARTITION</> is explicitly defined, the <literal>PARTITION</>
is the entire set of rows returned from the inner sub-select.
</para>

</chapter>

<chapter id="plr-module-funcs">
<title>Loading R Modules at Startup</title>
<para>
Expand Down

0 comments on commit c1db504

Please sign in to comment.