# Investigating window functions in postgres

https://www.postgresql.org/docs/9.1/static/tutorial-window.html

## log into postgres server

connect with psql:

```
psql -h mids-w205.c0bx3q0sdpgn.us-east-1.rds.amazonaws.com -p 5432 -U amitb -d dvdrental2
```

when prompted, enter this password:

```
gobears92
```

## the HARD WAY to compute deviations from average

### get first letter and length

```
select substring(first_name from 1 for 1), length(first_name) 
from actor;
```

### get first letter and avg length

```
select substring(first_name from 1 for 1) as letter, avg(length(first_name)) 
from actor 
group by letter 
order by letter;
```

### join with original table
```
select 
  actor.first_name, 
  substring(actor.first_name from 1 for 1) as letter1, 
  tt.letter as letter2, 
  length(actor.first_name), 
  tt.avg_len,
  (length(actor.first_name) - tt.avg_len) as diff

from actor, 
(
  select 
    substring(first_name from 1 for 1) as letter, 
    avg(length(first_name)) as avg_len 
  from actor 
  group by letter
) as tt 
where substring(actor.first_name from 1 for 1) = tt.letter
order by actor.first_name
;
```
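
The same per-actor difference can also be written with a CTE and an explicit `JOIN ... ON`, which some readers find easier to follow than the comma join. This is only a sketch of an equivalent formulation, assuming the same `actor` table; the names `letter_avg`, `a`, `la`, and `len` are introduced here for illustration.

```sql
-- Hypothetical rewrite of the join above; letter_avg is an illustrative name.
WITH letter_avg AS (
  -- one row per first letter, with the average name length for that letter
  SELECT
    substring(first_name from 1 for 1) AS letter,
    avg(length(first_name)) AS avg_len
  FROM actor
  GROUP BY substring(first_name from 1 for 1)
)
SELECT
  a.first_name,
  la.letter,
  length(a.first_name) AS len,
  la.avg_len,
  length(a.first_name) - la.avg_len AS diff
FROM actor AS a
JOIN letter_avg AS la
  ON substring(a.first_name from 1 for 1) = la.letter
ORDER BY a.first_name;
```

Either way, the averages have to be computed in a separate grouped query and then matched back to each row, which is the bookkeeping that window functions remove.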




## now use magical window functions

### first attempt 
```
SELECT
  substring(actor.first_name from 1 for 1) as letter, 
  avg(length(first_name)) OVER (PARTITION BY substring(actor.first_name from 1 for 1)) as avg_len
FROM actor
;
```
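
Unlike the `group by` query earlier, this returns one row per actor, not one row per letter: a window function computes the aggregate over each partition without collapsing the rows. A minimal sketch on made-up data makes this easier to see. The names in the `VALUES` list and the CTE name `names` are illustrative assumptions, not rows from the `actor` table, so the query runs in any database.

```sql
-- Toy data: these names are made up for illustration.
WITH names(first_name) AS (
  VALUES ('Al'), ('Adam'), ('Bob'), ('Bette')
)
SELECT
  first_name,
  substring(first_name from 1 for 1) AS letter,
  -- the per-letter average is repeated on every row of that letter
  avg(length(first_name)) OVER (PARTITION BY substring(first_name from 1 for 1)) AS avg_len
FROM names
ORDER BY first_name;
```

All four input rows come back; the two `A` rows share an average of 3 and the two `B` rows share an average of 4.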

### check against first name

```
SELECT
  first_name,
  substring(actor.first_name from 1 for 1) as letter, 
  length(first_name) as len,
  avg(length(first_name)) OVER (PARTITION BY substring(actor.first_name from 1 for 1)) as avg_len
FROM actor
;
```

### final result

```
SELECT
  first_name,
  substring(actor.first_name from 1 for 1) as letter, 
  length(first_name) - avg(length(first_name)) OVER (PARTITION BY substring(actor.first_name from 1 for 1)) as diff
FROM actor
;
```
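
When several columns use the same window, the `PARTITION BY` expression can be written once in a `WINDOW` clause and referenced by name. The sketch below assumes the same `actor` table; the window name `w` and the extra `len` and `avg_len` columns are added here for illustration and are not part of the query above.

```sql
SELECT
  first_name,
  substring(first_name from 1 for 1) AS letter,
  length(first_name) AS len,
  -- both window calls share the named window w defined below
  avg(length(first_name)) OVER w AS avg_len,
  length(first_name) - avg(length(first_name)) OVER w AS diff
FROM actor
WINDOW w AS (PARTITION BY substring(first_name from 1 for 1))
ORDER BY first_name;
```

The `WINDOW` clause goes after `FROM` (and any `WHERE`, `GROUP BY`, or `HAVING`) and before `ORDER BY`.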


## exercise 0

make sure you understand and can run all of the above commands

## exercise 1
for each row in the **country** table, find the length of the country name and compare it to the average length of all country names.

HINT: there is no field to partition by here, so use an empty window, `OVER ()`, which treats all rows as a single partition.
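
To illustrate only the `OVER ()` syntax, here is a minimal sketch on made-up data. The CTE name `words` and its values are assumptions for illustration and are not the **country** table.

```sql
-- Toy data: these values are made up and are not the country table.
WITH words(word) AS (
  VALUES ('a'), ('bbb'), ('ccccc')
)
SELECT
  word,
  length(word) AS len,
  -- empty OVER (): the window is every row returned by the query
  avg(length(word)) OVER () AS avg_len
FROM words;
```

Each of the three rows keeps its own `len`, and every row shows the same overall average of 3.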

## exercise 2

connect to the **midstest** database
```
psql -h mids-w205.c0bx3q0sdpgn.us-east-1.rds.amazonaws.com -p 5432 -U amitb -d midstest
```

or switch to the **midstest** database
```
\c midstest
```

in the **empsalary** table, find each employee's salary as a percentage of the total of all salaries