In [3]:
import pandas as pd

What exactly is a **window function**? It is a function that performs a calculation across a set of table rows that are somehow related to the current row.
For example, with a window function you can compute the sum of the values in the current row, the row before it, and the row after it, as in the picture:



![image.png](attachment:image.png)
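The three-row sum from the picture can be sketched with Python's built-in `sqlite3` module (window functions need SQLite 3.25 or newer). The `readings` table, its columns, and its values are made up for illustration, and the `ORDER BY` and `ROWS BETWEEN` parts of the `OVER` clause are not explained in this section; treat them here only as one way to say "the row before, the current row, and the row after".

```python
import sqlite3

# Hypothetical table and values, chosen only to illustrate the window.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO readings (id, value) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)],
)

# For each row, sum the previous row, the current row and the next row.
# The first and last rows have only one neighbour, so their window is shorter.
query = """
SELECT
  id,
  value,
  SUM(value) OVER (
    ORDER BY id
    ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
  ) AS neighbour_sum
FROM readings
ORDER BY id;
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every input row is still present in the output; the window function only adds a column computed from the neighbouring rows.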




We call them window functions because the set of rows is called a **window** or a **window frame**. Take a look at the syntax:



`<window_function> OVER (...)`



`<window_function>` can be an **aggregate function** that you already know (`COUNT()`, `SUM()`, `AVG()`, etc.), or another function, such as a **ranking** or an **analytical function** that you'll get to know in this course.


The window frame is defined in the `OVER(...)` clause. 


# OVER() 
Let's focus on `OVER (...)`, which defines the window. The most basic form is `OVER()`, which means that the window consists of all rows in the query result. Take a look:

> `SELECT
  first_name,
  last_name,
  salary,  
  AVG(salary) OVER()
FROM employee;`

That's not a very complicated query, but take a look at the last column:

`AVG(salary) OVER()`

`AVG(salary)` means we're looking for the **average salary**. Where exactly? EVERYWHERE we can, because `OVER()` means **'for all rows in the query result'**. In other words, we're looking for the average salary in the entire company.

Note that we did **NOT** group rows. `OVER()` makes it possible to show the details of individual rows together with the result of an aggregate function. That wouldn't be so easy with `GROUP BY`: we would have to write a subquery, which is more complicated and often less efficient. `OVER()` keeps the query both simple and efficient.
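To make the comparison concrete, here is a minimal sketch that runs both formulations with Python's built-in `sqlite3` module (SQLite 3.25 or newer is assumed for window function support). The three employees are invented sample data, and the scalar subquery is only one possible way to write the version without `OVER()`.

```python
import sqlite3

# Hypothetical sample data; not the course's real employee table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (first_name TEXT, last_name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Ann", "Lee", 3000), ("Bob", "Ray", 4000), ("Cy", "Fox", 5000)],
)

# Window version: every row keeps its details and gains the overall average.
window_query = """
SELECT first_name, last_name, salary, AVG(salary) OVER() AS avg_salary
FROM employee
ORDER BY salary;
"""

# Version without OVER(): the average comes from a separate subquery.
subquery_version = """
SELECT first_name, last_name, salary,
       (SELECT AVG(salary) FROM employee) AS avg_salary
FROM employee
ORDER BY salary;
"""

window_rows = conn.execute(window_query).fetchall()
subquery_rows = conn.execute(subquery_version).fetchall()

for row in window_rows:
    print(row)
print("Same result:", window_rows == subquery_rows)
```

Both queries return one row per employee with the same average repeated in the last column; the window version simply avoids the extra subquery.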

### Exercise
For each employee, find their **first name**, **last name**, **salary** and the **sum of all salaries** in the company.

Note that the last column is an aggregated column, even though you're not using `GROUP BY`.
> `SELECT
  first_name,
  last_name,
  salary,  
  SUM(salary) OVER()
FROM employee;`

Query Results: (first 4 rows)



In [2]:
pd.read_clipboard()

| Unnamed: 0 | first_name | last_name | salary | sum |
|---|---|---|---|---|
| 0 | Diane | Turner | 5330 | 94219 |
| 1 | Clarence | Robinson | 3617 | 94219 |
| 2 | Eugene | Phillips | 4877 | 94219 |
| 3 | Philip | Mitchell | 5259 | 94219 |


### Exercise
For each **item** in the `purchase` table, select its **name** (column `item`), its `price`, and the **average price** of all items.

> `SELECT
  item,
  price,  
  AVG(price) OVER()
FROM purchase;`

Query Results: (first 5 rows)

In [5]:
pd.read_clipboard()

| Unnamed: 0 | item | price | avg |
|---|---|---|---|
| 0 | monitor | 531 | 479.95 |
| 1 | printer | 315 | 479.95 |
| 2 | whiteboard | 170 | 479.95 |
| 3 | training | 117 | 479.95 |
| 4 | computer | 2190 | 479.95 |
