# PostgreSQL Summary Stats and Window Functions

# Window Functions

When you work with aggregate values, you must GROUP BY **all** non-aggregate columns in the SELECT list.
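
As a minimal sketch of that rule, the query below groups by `season`, the only non-aggregate column it selects. The `sample_match` CTE and its rows are made-up values for illustration, not data from the real `match` table.

~~~~sql
-- Hypothetical sample rows standing in for the match table
WITH sample_match(season, date, home_goal, away_goal) AS (
    VALUES ('2011/2012', '2011-08-13'::date, 2, 1),
           ('2011/2012', '2011-08-14'::date, 0, 0),
           ('2012/2013', '2012-08-18'::date, 3, 2)
)
SELECT season,                                  -- non-aggregate column, so it must be grouped
       COUNT(*) AS matches,
       AVG(home_goal + away_goal) AS avg_goals
FROM sample_match
GROUP BY season;
-- Adding date to the SELECT list without adding it to GROUP BY would raise an error.
~~~~

The result has one row per season, so the individual matches are no longer visible. Window functions avoid that collapse.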

Window functions are a class of functions that perform calculations on a result set that has already been generated. That result set is also referred to as a "window".
- You can use window functions to perform aggregate calculations without having to group your data, just as you did with a subquery in SELECT.
- You can also use them to calculate information such as running totals, rankings, and moving averages.

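For instance, you can calculate the AVG of `home_goal + away_goal` and follow it with the *OVER* clause. This clause tells SQL to *"pass this aggregate value over this existing result set."* With empty parentheses, `OVER()` treats the whole result set as one window, so every row shows the same average.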

~~~~sql
SELECT date,
       (home_goal + away_goal) AS goals,
       AVG(home_goal + away_goal) OVER() AS overall_avg
FROM match
WHERE season = '2011/2012';
~~~~

## Generate a RANK
A RANK simply creates a column numbering your data set from highest to lowest, or lowest to highest, based on a column that you specify. 
What is the rank of matches based on number of goals scored?

~~~~sql
SELECT date,
       (home_goal + away_goal) AS goals,
       RANK() OVER(ORDER BY home_goal + away_goal DESC) AS goals_rank
FROM match
WHERE season = '2011/2012';
~~~~
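
One detail worth seeing on a small data set is how RANK treats ties. The sketch below uses a hypothetical `sample_match` CTE with made-up rows so that it runs without the real `match` table.

~~~~sql
-- Hypothetical sample rows: total goals are 5, 5, 2 and 0
WITH sample_match(date, home_goal, away_goal) AS (
    VALUES ('2011-08-13'::date, 3, 2),
           ('2011-08-14'::date, 4, 1),
           ('2011-08-20'::date, 1, 1),
           ('2011-08-21'::date, 0, 0)
)
SELECT date,
       (home_goal + away_goal) AS goals,
       RANK() OVER(ORDER BY home_goal + away_goal DESC) AS goals_rank
FROM sample_match
ORDER BY goals_rank, date;
-- The two 5-goal matches share rank 1, the 2-goal match gets rank 3,
-- and the goalless match gets rank 4: RANK leaves a gap after a tie.
~~~~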

## Key Differences
- Window functions are processed after the entire query except the final ORDER BY statement, so they cannot be used in WHERE, GROUP BY, or HAVING.
- The window function uses the result set to calculate information, as opposed to using the database directly.
- Window functions are available in PostgreSQL, Oracle, MySQL (8.0 and later), and SQLite (3.25 and later). Older MySQL and SQLite versions do not support them.
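
Because a window function is evaluated late, its result cannot be filtered in the WHERE clause of the same query. One common workaround, sketched here with a hypothetical `sample_match` CTE and made-up rows, is to compute the window value in a subquery and filter in the outer query.

~~~~sql
-- Hypothetical sample rows: total goals are 5, 5, 2 and 0, so the overall average is 3
WITH sample_match(date, home_goal, away_goal) AS (
    VALUES ('2011-08-13'::date, 3, 2),
           ('2011-08-14'::date, 4, 1),
           ('2011-08-20'::date, 1, 1),
           ('2011-08-21'::date, 0, 0)
)
SELECT date, goals, overall_avg
FROM (
    -- The window function runs here, over the subquery's result set
    SELECT date,
           (home_goal + away_goal) AS goals,
           AVG(home_goal + away_goal) OVER() AS overall_avg
    FROM sample_match
) AS scored
WHERE goals > overall_avg;   -- keeps only the two 5-goal matches
~~~~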

## Window Partitions
A partition lets you calculate separate values for each category defined by the column(s) in PARTITION BY.
- The same column can then hold a different calculated value for each category.

~~~~sql
AVG(home_goal) OVER(PARTITION BY season)
~~~~

For instance, how many goals were scored in each match, and how did that compare to the season's average?

~~~~sql
SELECT date,
       (home_goal + away_goal) AS goals,
       AVG(home_goal + away_goal) OVER(PARTITION BY season) AS season_avg
FROM match;
~~~~

### PARTITION by Multiple Columns
~~~~sql
AVG(home_goal) OVER(PARTITION BY season, name)
~~~~

PARTITION BY is a straightforward addition to the OVER clause.
- You can partition calculations by one or more columns as necessary to answer a question you may have.
- You can use PARTITION BY with any kind of window function:
    - calculation, rank, or others.
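
As a sketch of both kinds of function in one query, the example below partitions an AVG and a RANK by `season`. The `sample_match` CTE and its rows are hypothetical values chosen for illustration.

~~~~sql
-- Hypothetical sample rows: two matches in each of two seasons
WITH sample_match(season, date, home_goal, away_goal) AS (
    VALUES ('2011/2012', '2011-08-13'::date, 2, 1),
           ('2011/2012', '2011-08-14'::date, 4, 1),
           ('2012/2013', '2012-08-18'::date, 0, 0),
           ('2012/2013', '2012-08-19'::date, 3, 3)
)
SELECT season,
       date,
       (home_goal + away_goal) AS goals,
       -- Average computed separately for each season
       AVG(home_goal + away_goal) OVER(PARTITION BY season) AS season_avg,
       -- Ranking restarts at 1 within each season
       RANK() OVER(PARTITION BY season
                   ORDER BY home_goal + away_goal DESC) AS season_rank
FROM sample_match
ORDER BY season, season_rank;
~~~~

With these rows, the 2011/2012 matches are compared to an average of 4 goals and the 2012/2013 matches to an average of 3, and each season has its own rank 1 and rank 2.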
    
## Sliding windows
Sliding windows are functions that perform calculations relative to the current row of a data set.
- You can use them to calculate a wide variety of information that aggregates one row at a time down your data set, such as running totals and moving averages.
- They can also be partitioned by one or more columns, just like a non-sliding window.
- They use specific keywords within the OVER clause to specify the rows you want to use in your calculations.

~~~~sql
ROWS BETWEEN <start> AND <finish>
~~~~ 
In the ROWS BETWEEN statement, you can use the following keywords for `<start>` and `<finish>`:
- `n PRECEDING` and `n FOLLOWING` specify the number of rows before, or after, the current row that you want to include in a calculation.
- UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING tell SQL that you want to include every row from the beginning, or up to the end, of the data set (or of the partition, if there is one) in your calculations.
- CURRENT ROW tells SQL that you want to start or stop your calculation at the current row.
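
To make these keywords concrete, here is a minimal sketch that computes a running total and a two-match moving average. The `sample_match` CTE, its rows, and the column aliases are hypothetical and only serve the illustration.

~~~~sql
-- Hypothetical sample rows: total goals by date are 3, 0, 5 and 2
WITH sample_match(date, home_goal, away_goal) AS (
    VALUES ('2011-08-13'::date, 2, 1),
           ('2011-08-14'::date, 0, 0),
           ('2011-08-20'::date, 4, 1),
           ('2011-08-21'::date, 1, 1)
)
SELECT date,
       (home_goal + away_goal) AS goals,
       -- Every row from the first one up to the current row: 3, 3, 8, 10
       SUM(home_goal + away_goal) OVER(ORDER BY date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
       -- The previous row plus the current row: 3, 1.5, 2.5, 3.5
       AVG(home_goal + away_goal) OVER(ORDER BY date
           ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS moving_avg_2
FROM sample_match
ORDER BY date;
~~~~

The ORDER BY inside OVER matters here: it defines which rows count as "preceding" and "following" the current row.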