# Lecture 6: Views, CTEs, window functions, indexing

## Announcements 
- We will have worksheets next week.
- We will spend about 15-20 minutes at the end of the lecture getting started with MQL and the setup. This will be the last lecture with iclickers.

## Theme
- Understand WHY, HOW, and WHERE to use views
- Understand how to use CTEs to simplify queries. Also, WHY to use them?
- Discuss window functions and their use cases

```{admonition} Recap iclicker: Find employees who work in departments where at least one employee earns more than $100,000
Discuss and try writing a query, or just work out the logic.
```






```sql
A. SELECT e.name
   FROM employee e
   WHERE EXISTS (SELECT * 
    FROM employee emp
    WHERE e.department_id = emp.department_id 
    AND emp.salary > 100000);
```
```sql
B. SELECT e.name
   FROM employee e
   WHERE e.department_id = ANY (
    SELECT emp.department_id 
    FROM employee emp
    WHERE emp.salary > 100000)
```
```sql
C. SELECT e.name
   FROM employee e
   WHERE e.department_id IN (
    SELECT emp.department_id 
    FROM employee emp
    WHERE emp.salary > 100000)
```
```sql
D. SELECT e.name
   FROM employee e
   JOIN employee emp ON e.department_id = emp.department_id
   WHERE emp.salary > 100000;
```

E. I tried some other logic....
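
To compare the options concretely, here is a minimal sketch that runs the `EXISTS`, `IN`, and `JOIN` variants against a tiny in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in for PostgreSQL here, the five sample rows are hypothetical, and option B is left out because SQLite does not support `= ANY (subquery)`.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    id INTEGER PRIMARY KEY,
    name TEXT,
    department_id INTEGER,
    salary REAL
);
-- Hypothetical sample rows: department 102 has two high earners.
INSERT INTO employee VALUES
    (1, 'John', 101, 9000),
    (2, 'Mark', 102, 200000),
    (22, 'Marky', 102, 200000),
    (3, 'Alice', 101, 105000),
    (4, 'Bob', 103, 80000);
""")

# Option A: correlated EXISTS, one row per qualifying employee.
exists_names = sorted(r[0] for r in conn.execute("""
    SELECT e.name
    FROM employee e
    WHERE EXISTS (SELECT 1
                  FROM employee emp
                  WHERE emp.department_id = e.department_id
                    AND emp.salary > 100000)
"""))

# Option C: IN with an uncorrelated subquery, same result as option A.
in_names = sorted(r[0] for r in conn.execute("""
    SELECT e.name
    FROM employee e
    WHERE e.department_id IN (SELECT emp.department_id
                              FROM employee emp
                              WHERE emp.salary > 100000)
"""))

# Option D: self-join, one row per (employee, high earner) pair.
join_names = sorted(r[0] for r in conn.execute("""
    SELECT e.name
    FROM employee e
    JOIN employee emp ON e.department_id = emp.department_id
    WHERE emp.salary > 100000
"""))

print(exists_names)
print(in_names)
print(join_names)
```

With this data, the join lists Mark and Marky twice because their department has two employees above the threshold, so option D would need `DISTINCT` to match options A and C.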

Let's dive into today's theme.

In [1]:
# This is how you deal with credentials in a notebook
import json
import urllib.parse

with open('../lectures/data/credentials.json') as f:
    login = json.load(f)
    
username = login['user']
password = urllib.parse.quote(login['password'])
host = login['host']
port = login['port']

In [2]:
%load_ext sql
%config SqlMagic.displaylimit = 30

In [3]:
%sql postgresql://{username}:{password}@{host}:{port}/world

Okay, ***let's start with a query we did in the last lecture.***

**Example:** Which countries speak at least one language that is not spoken in any other country on their continent?

In [4]:
%%sql

DROP TABLE IF EXISTS ccl;

CREATE TEMPORARY TABLE ccl AS (
    SELECT
        co.name, co.continent, cl.language
    FROM
        country co
    JOIN
        countrylanguage cl
    ON
        co.code = cl.countrycode
    WHERE
        co.continent IN ('Asia', 'Europe')
)
;

 * postgresql://postgres:***@localhost:5432/world
Done.
441 rows affected.


[]

The query below uses the temporary table we just created, `ccl`.

In [5]:
%%sql

SELECT
    t1.*
FROM
    ccl t1
WHERE
    NOT EXISTS (
        SELECT
            *
        FROM
            ccl t2
        WHERE
            t1.name <> t2.name
            AND
            t1.continent = t2.continent
            AND
            t1.language = t2.language
    )

 * postgresql://postgres:***@localhost:5432/world
129 rows affected.


name,continent,language
Bhutan,Asia,Dzongkha
Philippines,Asia,Pilipino
Faroe Islands,Europe,Faroese
Georgia,Asia,Georgiana
Indonesia,Asia,Javanese
Iceland,Europe,Icelandic
Japan,Asia,Japanese
Kyrgyzstan,Asia,Kirgiz
Cyprus,Asia,Greek
Latvia,Europe,Latvian


```{admonition} Discussion Question: What are some of the disadvantages of creating a temporary table?

<img src="img/discuss.png" width="120">
```

```{toggle}
- You have to create the table every time you want to use it.
- No one else can use your temporary tables. If a group of people wants to run the same query, each of them has to create their own temporary table, which is redundant work.

So, if we define a use case, it could be the case that a group of people are only interested in the following columns in the following tables:
- In the `country` table, the columns `name` and `continent`.
- In the `countrylanguage` table, just one column, `language`.

Or, if we take the use case further, it could also be the case that this group of people is only allowed to access these columns (perhaps due to security reasons).

This is when you want to think about a VIEW. Instead of granting users permissions on the master tables, we can grant them access to a view that exposes only certain columns of certain tables.
```

Let's create a view for the above query.

In [6]:
%%sql
-- I am dropping the temporary table that we created 
DROP TABLE IF EXISTS ccl;

CREATE OR REPLACE VIEW ccl_view AS (
    SELECT
        co.name, co.continent, cl.language
    FROM
        country co
    JOIN
        countrylanguage cl
    ON
        co.code = cl.countrycode
    WHERE
        co.continent IN ('Asia', 'Europe')
)
;

 * postgresql://postgres:***@localhost:5432/world
Done.
Done.


[]

In [7]:
%%sql

SELECT
    t1.*
FROM
    ccl_view t1
WHERE
    NOT EXISTS (
        SELECT
            *
        FROM
            ccl_view t2
        WHERE
            t1.name <> t2.name
            AND
            t1.continent = t2.continent
            AND
            t1.language = t2.language
    )

 * postgresql://postgres:***@localhost:5432/world
129 rows affected.


name,continent,language
Bhutan,Asia,Dzongkha
Philippines,Asia,Pilipino
Faroe Islands,Europe,Faroese
Georgia,Asia,Georgiana
Indonesia,Asia,Javanese
Iceland,Europe,Icelandic
Japan,Asia,Japanese
Kyrgyzstan,Asia,Kirgiz
Cyprus,Asia,Greek
Latvia,Europe,Latvian


Wonderful! I am happy that I can work out the query by creating views. But are there any benefits of creating views?

```{admonition} (Just to hear your thoughts) iclicker: (T/F) Simple views in a database store data on disk:

A) True
B) False
```

```{admonition} iclicker: (T/F) Simple views make queries run faster:

A) True
B) False
```

```{toggle}
- First question: B) False
- Second question: B) False

Views just store the query. They don't store the data, so they don't make queries run faster. But they do make queries easier to write and read.

But what if we want to store the query results on disk so that our query runs faster? This is when we want to think about materialized views, which we will look at next.

You can make the query run even faster by creating indexes on a materialized view, which you can't do on a simple view. We won't cover indexes in depth in this course, but you can learn more about them in the [PostgreSQL documentation](https://www.postgresql.org/docs/13/indexes.html). I will also try to give you a brief overview of indexes.

```
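
The claim that a simple view stores only the query can be checked with a small sketch. It assumes an in-memory SQLite database (via Python's `sqlite3`) as a stand-in for PostgreSQL, and the `country` rows and the name `country_view` are hypothetical. Because the view is re-evaluated on every read, a row inserted into the base table afterwards shows up immediately.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT, continent TEXT);
INSERT INTO country VALUES
    ('JPN', 'Japan', 'Asia'),
    ('ISL', 'Iceland', 'Europe'),
    ('BRA', 'Brazil', 'South America');

-- Hypothetical view: only the query text is stored, not the rows.
CREATE VIEW country_view AS
    SELECT name, continent
    FROM country
    WHERE continent IN ('Asia', 'Europe');
""")

before = conn.execute("SELECT COUNT(*) FROM country_view").fetchone()[0]

# Change the base table after the view was created.
conn.execute("INSERT INTO country VALUES ('BTN', 'Bhutan', 'Asia')")
after = conn.execute("SELECT COUNT(*) FROM country_view").fetchone()[0]

# SQLite keeps the view definition as text in its catalog.
stored_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = 'country_view'"
).fetchone()[0]

print(before, after)
print(stored_sql)
```

The view picks up the new row without being recreated, which is also why it cannot be faster than running the underlying query directly.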

## Materialized Views

Let's see who is going to win the race. 

Materialized view VS simple view

<img src="img/race.gif" width="500">

In [8]:
%sql postgresql://{username}:{password}@{host}:{port}/imdb

### Materialized Views Demo

In [15]:
%%sql

SELECT
    *
FROM
    movies m
JOIN
    movie_genres mg
ON
    m.id = mg.movie_id
WHERE        
    rating > 8
    AND
    nvotes > (
        SELECT
            AVG(m2.nvotes)
        FROM
            movies m2
        JOIN
            movie_genres mg2
        ON
            m2.id = mg2.movie_id
        WHERE
            mg.genre = mg2.genre
            AND
            m.id <> m2.id
    )
;

 * postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/world
795 rows affected.


id,title,orig_title,start_year,end_year,runtime,rating,nvotes,movie_id,genre
10042876,Rashomon,Rashômon,1950,,88,8.2,138304,10042876,crime
10042876,Rashomon,Rashômon,1950,,88,8.2,138304,10042876,drama
10042876,Rashomon,Rashômon,1950,,88,8.2,138304,10042876,mystery
10043014,Sunset Blvd.,,1950,,110,8.4,183282,10043014,drama
10043014,Sunset Blvd.,,1950,,110,8.4,183282,10043014,film-noir
10044741,Ikiru,,1952,,143,8.3,60061,10044741,drama
10044837,Limelight,,1952,,137,8.1,16445,10044837,music
10045152,Singin' in the Rain,,1952,,103,8.3,201077,10045152,comedy
10045152,Singin' in the Rain,,1952,,103,8.3,201077,10045152,musical
10045152,Singin' in the Rain,,1952,,103,8.3,201077,10045152,romance


In [17]:
%%sql

CREATE VIEW
    simple_view
AS
    SELECT
        *
    FROM
        movies m
    JOIN
        movie_genres mg
    ON
        m.id = mg.movie_id
    WHERE        
        rating > 8
        AND
        nvotes > (
            SELECT
                AVG(m2.nvotes)
            FROM
                movies m2
            JOIN
                movie_genres mg2
            ON
                m2.id = mg2.movie_id
            WHERE
                mg.genre = mg2.genre
                AND
                m.id <> m2.id
        )
    ;

 * postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/world
Done.


[]

In [18]:
%%timeit -r 1 -n 1
%%sql

SELECT * FROM simple_view;

 * postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/world
795 rows affected.
12.1 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


In [19]:
%%timeit -r 1 -n 1
%%sql

CREATE MATERIALIZED VIEW
    mat_view
AS
    SELECT
        *
    FROM
        movies m
    JOIN
        movie_genres mg
    ON
        m.id = mg.movie_id
    WHERE        
        rating > 8
        AND
        nvotes > (
            SELECT
                AVG(m2.nvotes)
            FROM
                movies m2
            JOIN
                movie_genres mg2
            ON
                m2.id = mg2.movie_id
            WHERE
                mg.genre = mg2.genre
                AND
                m.id <> m2.id
        )
;

 * postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/world
795 rows affected.
11.7 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


In [20]:
%%timeit -r 1 -n 1
%%sql

SELECT * FROM mat_view;

 * postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/world
795 rows affected.
4.93 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


In [14]:
%%sql

DROP VIEW simple_view;
DROP MATERIALIZED VIEW mat_view;

 * postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/world
Done.
Done.


[]

```{admonition} Discussion Question: What are some of the disadvantages of creating a materialized view?

<img src="img/discuss.png" width="120">
```

```{toggle}
- It takes up space on the disk.
- It is not always up to date. What if the data in the master tables change? The materialized view will not reflect the changes. You have to refresh it.
```
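
The staleness problem can be sketched as follows. SQLite has no `MATERIALIZED VIEW`, so this example emulates one with an ordinary table built from a query, and `refresh` is a hypothetical helper that plays the role of PostgreSQL's `REFRESH MATERIALIZED VIEW mat_view;`. The table layout is simplified and the row 'Some Film' is made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, rating REAL);
INSERT INTO movies VALUES
    (1, 'Rashomon', 8.2),
    (2, 'Ikiru', 8.3),
    (3, 'Some Film', 6.0);

-- Emulated materialized view: the query result is stored as a table.
CREATE TABLE mat_view AS
    SELECT * FROM movies WHERE rating > 8;
""")


def refresh(connection):
    """Hypothetical helper: rebuild the stored result from the base table."""
    with connection:  # commit both statements together
        connection.execute("DELETE FROM mat_view")
        connection.execute(
            "INSERT INTO mat_view SELECT * FROM movies WHERE rating > 8"
        )


# The base table changes after the snapshot was taken.
conn.execute("INSERT INTO movies VALUES (4, 'Limelight', 8.1)")

stale_count = conn.execute("SELECT COUNT(*) FROM mat_view").fetchone()[0]
refresh(conn)
fresh_count = conn.execute("SELECT COUNT(*) FROM mat_view").fetchone()[0]

print(stale_count, fresh_count)
```

Reads from the stored result are cheap, but the new movie only appears after the refresh, which is the trade-off described above.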

```{admonition} Discussion Question: Can you think of some advantages of a temporary table over a simple view?

<img src="img/discuss.png" width="120">
```

```{toggle}
- Temporary tables only exist for a particular session, which might be helpful in some cases. Maybe the user wants it to perform some analytics and then delete it.
- Temporary tables can be more efficient because the data is physically stored, so the defining query is not re-run on every read. They can also be indexed, which you can't do with a simple view.
```

Consider the following situation:

- I don't want to create a temporary table because
    - I don't want to store the data and take up space.
    - I only want to use it for one query.
- I also don't want to create a simple view because no one else will use that query, and I might not have permission to create a view.

What can I do here? This is where CTEs (common table expressions) come in.

## CTE

In [None]:
%sql postgresql://{username}:{password}@{host}:{port}/world

This is how we did it previously:

In [None]:
%%sql

DROP TABLE IF EXISTS ccl;

CREATE TEMPORARY TABLE ccl AS (
    SELECT
        co.name, co.continent, cl.language
    FROM
        country co
    JOIN
        countrylanguage cl
    ON
        co.code = cl.countrycode
    WHERE
        co.continent IN ('Asia', 'Europe')
)
;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/world
Done.
441 rows affected.


[]

In [None]:
%%sql

SELECT
    t1.*
FROM
    ccl t1
WHERE
    NOT EXISTS (
        SELECT
            *
        FROM
            ccl t2
        WHERE
            t1.name <> t2.name
            AND
            t1.continent = t2.continent
            AND
            t1.language = t2.language
    )

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/world
129 rows affected.


name,continent,language
Bhutan,Asia,Dzongkha
Philippines,Asia,Pilipino
Faroe Islands,Europe,Faroese
Georgia,Asia,Georgiana
Indonesia,Asia,Javanese
Iceland,Europe,Icelandic
Japan,Asia,Japanese
Kyrgyzstan,Asia,Kirgiz
Cyprus,Asia,Greek
Latvia,Europe,Latvian


Using a CTE, we can do the same thing as follows:

In [None]:
%%sql
WITH ccl AS (
    SELECT
        co.name, co.continent, cl.language
    FROM
        country co
    JOIN
        countrylanguage cl
    ON
        co.code = cl.countrycode
    WHERE
        co.continent IN ('Asia', 'Europe')
)
SELECT
    t1.*
FROM
    ccl t1
WHERE
    NOT EXISTS (
        SELECT
            *
        FROM
            ccl t2
        WHERE
            t1.name <> t2.name
            AND
            t1.continent = t2.continent
            AND
            t1.language = t2.language
    )

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/world
129 rows affected.


name,continent,language
Bhutan,Asia,Dzongkha
Philippines,Asia,Pilipino
Faroe Islands,Europe,Faroese
Georgia,Asia,Georgiana
Indonesia,Asia,Javanese
Iceland,Europe,Icelandic
Japan,Asia,Japanese
Kyrgyzstan,Asia,Kirgiz
Cyprus,Asia,Greek
Latvia,Europe,Latvian


````{note}
You can give the CTE's columns explicit names; otherwise it uses the column names from the `SELECT`.
```sql
WITH
ccl (namenew, continentnew, languagenew)
AS (...)
...
```
````
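
Here is a minimal runnable sketch of the column-renaming syntax, assuming an in-memory SQLite database (via Python's `sqlite3`) in place of the `world` database. The two sample countries and the trimmed-down table definitions are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country (code TEXT, name TEXT, continent TEXT);
CREATE TABLE countrylanguage (countrycode TEXT, language TEXT);
INSERT INTO country VALUES
    ('BTN', 'Bhutan', 'Asia'),
    ('ISL', 'Iceland', 'Europe');
INSERT INTO countrylanguage VALUES
    ('BTN', 'Dzongkha'),
    ('ISL', 'Icelandic');
""")

# The column list after the CTE name overrides the names from the SELECT.
cursor = conn.execute("""
    WITH ccl (namenew, continentnew, languagenew) AS (
        SELECT co.name, co.continent, cl.language
        FROM country co
        JOIN countrylanguage cl ON co.code = cl.countrycode
    )
    SELECT * FROM ccl ORDER BY namenew
""")

column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(column_names)
print(rows)
```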

```{admonition} Discussion Question: What is the advantage of using CTEs over temporary tables?

<img src="img/discuss.png" width="120">
```

```{toggle}
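- CTEs are often more readable than temporary tables, because the definition sits right next to the query that uses it.
- CTEs avoid the cost of creating and populating a separate table, which often makes them cheaper for one-off queries.
- You are not creating a table in the database, so you don't have to worry about table-name clashes or cleanup.
- A CTE is defined and used within a single statement; a temporary table needs a separate `CREATE` statement first.
- A CTE is evaluated each time the query runs, so it always sees the current data; a temporary table is a snapshot and does not reflect later changes to the underlying tables.
- A temporary table has to be recreated each time a new connection to the database is made.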
```

```{admonition} Discussion Question: What is the advantage of using temporary tables over CTEs?

<img src="img/discuss.png" width="120">
```

```{toggle}
- You can use temporary tables in multiple queries, as they persist for the duration of the session. You can't do that with CTEs, as a CTE persists only for the duration of the query.
- You can index temporary tables. You can't do that with CTEs. 
- You can store the data in temporary tables. You can't do that with CTEs. 
```
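
The difference in lifetime can be sketched like this, assuming an in-memory SQLite database (via Python's `sqlite3`) as a stand-in for PostgreSQL. The names `asia_tmp` and `asia_cte` and the two sample rows are hypothetical. The temporary table can be queried repeatedly, while the CTE name is unknown once its statement has finished.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT, continent TEXT)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [("Japan", "Asia"), ("Iceland", "Europe")],
)

# A temporary table survives for the whole session (connection).
conn.execute(
    "CREATE TEMP TABLE asia_tmp AS "
    "SELECT name FROM country WHERE continent = 'Asia'"
)
first_use = conn.execute("SELECT name FROM asia_tmp").fetchall()
second_use = conn.execute("SELECT COUNT(*) FROM asia_tmp").fetchone()[0]

# A CTE exists only inside the statement that defines it.
cte_rows = conn.execute("""
    WITH asia_cte AS (
        SELECT name FROM country WHERE continent = 'Asia'
    )
    SELECT name FROM asia_cte
""").fetchall()

try:
    conn.execute("SELECT name FROM asia_cte")  # a separate statement
    cte_visible_later = True
except sqlite3.OperationalError:
    # SQLite reports that no such table exists.
    cte_visible_later = False

print(first_use, second_use, cte_rows, cte_visible_later)
```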

```{admonition} Discussion Question: Why can't we use views instead of CTEs?

<img src="img/discuss.png" width="120">
```

```{toggle}
- We don't want to create a bunch of views for something that is only going to be useful for a single query or a single session.
- Creating views requires special access privileges, which we might not have as a user!
```

## Window function

Motivation: We all know by now what the following query means. It lists the continent and the maximum population in that continent. 

In [None]:
%%sql

SELECT continent, MAX(population) as maximum_population
FROM country
GROUP BY continent;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/world
7 rows affected.


continent,maximum_population
Asia,1277558000
South America,170115000
North America,278357000
Oceania,18886000
Antarctica,0
Africa,111506000
Europe,146934000


```{admonition} iclicker question: We want to list each continent's maximum population along with the name and population of each country. Does the following query work?


    SELECT name, population, continent, MAX(population)
    FROM country
    GROUP BY continent;

A) Yes

B) No
```

```{toggle}
No, it won't work: every column in the `SELECT` clause must either be aggregated or appear in the `GROUP BY` clause (revisit Lecture 3).

We can, of course, work it out by joining against a grouped subquery, written here as a CTE:
```

In [None]:
%%sql

WITH temp AS (
       SELECT
        continent,
        MAX(population) as max_population
    FROM
        country
    GROUP BY
        continent
    )
SELECT
    c.name,
    c.population,
    temp.*
FROM
    country c JOIN temp ON temp.continent = c.continent
order by continent asc
;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/world
239 rows affected.


name,population,continent,max_population
Algeria,31471000,Africa,111506000
Western Sahara,293000,Africa,111506000
Madagascar,15942000,Africa,111506000
Uganda,21778000,Africa,111506000
Malawi,10925000,Africa,111506000
Mali,11234000,Africa,111506000
Morocco,28351000,Africa,111506000
Côte dIvoire,14786000,Africa,111506000
Mauritania,2670000,Africa,111506000
Mauritius,1158000,Africa,111506000


In situations like this, we can use window functions. It is a special type of function that allows us to perform calculations across a set of rows that are related to the current row. 

In [None]:
%%sql

SELECT
    name,
    population,
    continent,
    MAX(population) OVER (PARTITION BY continent)
FROM
    country
;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/world
239 rows affected.


name,population,continent,max
Algeria,31471000,Africa,111506000
Western Sahara,293000,Africa,111506000
Madagascar,15942000,Africa,111506000
Uganda,21778000,Africa,111506000
Malawi,10925000,Africa,111506000
Mali,11234000,Africa,111506000
Morocco,28351000,Africa,111506000
Côte dIvoire,14786000,Africa,111506000
Mauritania,2670000,Africa,111506000
Mauritius,1158000,Africa,111506000


Below, we are doing it over the entire table without doing any partitioning. 

In [None]:
%%sql

SELECT
    name,
    population,
    continent,
    MAX(population) OVER ()
        -- DOES OVER ENTIRE TABLE 
FROM
    country
    ORDER BY NAME
;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/world
239 rows affected.


name,population,continent,max
Afghanistan,22720000,Asia,1277558000
Albania,3401200,Europe,1277558000
Algeria,31471000,Africa,1277558000
American Samoa,68000,Oceania,1277558000
Andorra,78000,Europe,1277558000
Angola,12878000,Africa,1277558000
Anguilla,8000,North America,1277558000
Antarctica,0,Antarctica,1277558000
Antigua and Barbuda,68000,North America,1277558000
Argentina,37032000,South America,1277558000


> **Note:** Window functions in SQL are evaluated **after** `WHERE`, `GROUP BY`, and `HAVING`, and **before** `DISTINCT`, `ORDER BY`, and `LIMIT` are applied to the `SELECT` output. This is a crucial point to understand when using window functions in SQL.

> **Note:** Window functions are only allowed in the `SELECT` and `ORDER BY` clauses.
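
The processing order can be observed directly. The sketch below assumes an in-memory SQLite database (via Python's `sqlite3`, which needs SQLite 3.25 or newer for window functions) as a stand-in for PostgreSQL, and uses a small hypothetical table `examp` with seven rows. The `WHERE` clause removes rows before `COUNT(*) OVER ()` is computed, and putting a window function inside `WHERE` is rejected.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE examp (id INTEGER, value INTEGER, samplegroup TEXT)")
conn.executemany(
    "INSERT INTO examp VALUES (?, ?, ?)",
    [(1, 10, "A"), (2, 20, "A"), (3, 30, "A"), (4, 40, "A"),
     (5, 50, "B"), (6, 60, "B"), (7, 70, "B")],
)

# No filter: the window spans all seven rows.
all_rows = conn.execute(
    "SELECT id, COUNT(*) OVER () AS n FROM examp ORDER BY id"
).fetchall()

# WHERE runs first, so the window only sees the four rows of group A.
filtered_rows = conn.execute(
    "SELECT id, COUNT(*) OVER () AS n FROM examp "
    "WHERE samplegroup = 'A' ORDER BY id"
).fetchall()

# A window function inside WHERE is not allowed.
try:
    conn.execute("SELECT id FROM examp WHERE COUNT(*) OVER () > 3")
    window_in_where_allowed = True
except sqlite3.OperationalError:
    window_in_where_allowed = False

print(all_rows)
print(filtered_rows)
print(window_in_where_allowed)
```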

---

**Example:** Using window functions, count the number of countries in which each language is officially spoken. Remove duplicates and sort the results in descending order by the count value.

---

In [None]:
%%sql

SELECT
    DISTINCT language,
    COUNT(*) OVER (PARTITION BY language) AS count
FROM
    countrylanguage
WHERE
    isofficial = TRUE
ORDER BY
    count DESC

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/world
102 rows affected.


language,count
English,44
Arabic,22
Spanish,20
French,18
German,6
Portuguese,6
Dutch,4
Italian,4
Malay,4
Danish,3


```{admonition} iclicker question: In the above query, is it possible to use a WHERE clause to limit the results to languages that are spoken in more than one country?

    SELECT
        DISTINCT language,
        COUNT(*) OVER (PARTITION BY language) AS count
    FROM countrylanguage
    WHERE isofficial = TRUE and count  > 1
    ORDER BY count DESC;

A) Yes
B) No
```

```{admonition} iclicker question: In the above query, is it possible to use a WHERE clause to limit the results to languages that are spoken in more than one country?

    SELECT
        DISTINCT language,
        COUNT(*) OVER (PARTITION BY language) AS count
    FROM countrylanguage
    WHERE isofficial = TRUE and COUNT(*) OVER (PARTITION BY language)  > 1
    ORDER BY count DESC;

A) Yes
B) No
```

```{toggle}
- No

- No

> **Note:** Window functions in SQL are evaluated **after** `WHERE`, `GROUP BY`, and `HAVING`, and **before** `DISTINCT`, `ORDER BY`, and `LIMIT` are applied to the `SELECT` output. This is a crucial point to understand when using window functions in SQL.

> **Note:** Window functions are only allowed in the `SELECT` and `ORDER BY` clauses.

So, in order to achieve this, we can wrap the query in a CTE (or a subquery) and filter on the count in the outer query.
```

In [None]:
%%sql

WITH temp AS (
    SELECT
        DISTINCT language,
        COUNT(*) OVER (PARTITION BY language) AS count
    FROM
        countrylanguage
    WHERE
        isofficial = TRUE
)
SELECT
    *
FROM
    temp
WHERE
    count > 1
-- Sort in the outer query: row order inside a CTE is not guaranteed to carry over
ORDER BY
    count DESC;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/world
23 rows affected.


language,count
English,44
Arabic,22
Spanish,20
French,18
German,6
Portuguese,6
Dutch,4
Italian,4
Malay,4
Danish,3


Here is some dummy data to help understand the concept of some window functions. 

In [None]:
%%sql
-- Creating a temporary table
DROP table if exists examp;
CREATE TEMPORARY TABLE if not exists examp (
    id INT,
    value INT,
    samplegroup VARCHAR(1)
);

INSERT INTO examp (id, value, samplegroup) VALUES
(1, 10, 'A'),
(2, 20, 'A'),
(3, 30, 'A'),
(4, 40, 'A'),
(5, 50, 'B'),
(6, 60, 'B'),
(7, 70, 'B');

SELECT
    id,
    value,
    LAG(value) OVER (ORDER BY value) AS lag_value,
    LEAD(value) OVER (ORDER BY value) AS lead_value,
    RANK() OVER (ORDER BY value) AS rank_value,
    -- Same alias as the line above, so the output shows this column as rank_value_1
    RANK() OVER (ORDER BY value DESC) AS rank_value,
    DENSE_RANK() OVER (ORDER BY value) AS dense_rank_value,
    AVG(value) OVER () AS avg_value_overall,
    COUNT(value) OVER () AS count_value_overall,
    MAX(value) OVER () AS max_value_overall,
    MAX(value) OVER (PARTITION BY samplegroup) AS max_value_partition
FROM
    examp;

   postgresql://postgres:***@localhost:5432/imdb
 * postgresql://postgres:***@localhost:5432/world
Done.
Done.
7 rows affected.
7 rows affected.


id,value,lag_value,lead_value,rank_value,rank_value_1,dense_rank_value,avg_value_overall,count_value_overall,max_value_overall,max_value_partition
1,10,,20.0,1,7,1,40.0,7,70,40
2,20,10.0,30.0,2,6,2,40.0,7,70,40
3,30,20.0,40.0,3,5,3,40.0,7,70,40
4,40,30.0,50.0,4,4,4,40.0,7,70,40
5,50,40.0,60.0,5,3,5,40.0,7,70,70
6,60,50.0,70.0,6,2,6,40.0,7,70,70
7,70,60.0,,7,1,7,40.0,7,70,70
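
In the dummy data every `value` is unique, so `RANK()` and `DENSE_RANK()` return the same numbers. The sketch below uses a hypothetical table `scores` with one tie to show where they differ. It assumes an in-memory SQLite database (via Python's `sqlite3`, SQLite 3.25 or newer) as a stand-in for PostgreSQL.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (value INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?)",
    [(10,), (20,), (20,), (30,)],  # hypothetical data: 20 appears twice
)

ranked = conn.execute("""
    SELECT
        value,
        RANK() OVER (ORDER BY value) AS rank_value,
        DENSE_RANK() OVER (ORDER BY value) AS dense_rank_value
    FROM scores
    ORDER BY value
""").fetchall()

for row in ranked:
    print(row)
```

After the tie, `RANK()` skips ahead to 4 while `DENSE_RANK()` continues with 3, which matters whenever a query filters on "top N" ranks.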


>Exercise (if time permits, else you can try it on your own):

```{admonition} iclicker: Retrieve the names of employees with the highest and second-highest salaries.
Discuss and try writing a query, or just work out the logic.
```



```sql
A. select name from employee
	order by salary desc
	LIMIT 2
```
```sql
B. with temp AS
	(select name, dense_rank() OVER (order by salary desc) as rank from employee)
   select name from temp
	where rank <=2;
```
```sql
C. SELECT name
   FROM employee
   WHERE salary IN (
    SELECT DISTINCT salary
    FROM employee
    ORDER BY salary DESC
    LIMIT 2) 
```
```sql
D. SELECT name
   FROM employee
   WHERE salary >= ANY (
    SELECT DISTINCT salary
    FROM employee
    ORDER BY salary DESC
    LIMIT 2)
```

E. I tried some other logic....


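>Note: Option A is not correct: `LIMIT 2` returns exactly two rows, so it misses employees who tie for the highest or second-highest salary. Your queries should always be robust to cases like this.

Try the data below. Here Marky and Mark share the same salary, so both should be returned along with Alice.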

```sql
-- Assumes a department table with ids 101-103 already exists (needed for the foreign key).
CREATE TABLE employee (
    id INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    department_id INT,
    salary DECIMAL(10, 2),
    FOREIGN KEY (department_id) REFERENCES department(id)
);
INSERT INTO employee (id, name, department_id, salary)
VALUES
    (1, 'John', 101, 9000),
    (2, 'Mark', 102, 200000),
    (22, 'Marky', 102, 200000),
    (3, 'Alice', 101, 105000),
    (4, 'Bob', 103, 80000);
```
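
As a quick check, the sketch below loads the same five rows into an in-memory SQLite database (via Python's `sqlite3`, SQLite 3.25 or newer) and compares options A and B. SQLite is only a stand-in for PostgreSQL, the foreign key to `department` is dropped to keep the example self-contained, and the alias `salary_rank` is a hypothetical replacement for `rank`.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified: no foreign key to a department table.
CREATE TABLE employee (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department_id INTEGER,
    salary REAL
);
INSERT INTO employee (id, name, department_id, salary) VALUES
    (1, 'John', 101, 9000),
    (2, 'Mark', 102, 200000),
    (22, 'Marky', 102, 200000),
    (3, 'Alice', 101, 105000),
    (4, 'Bob', 103, 80000);
""")

# Option A: always exactly two rows, regardless of ties.
option_a = sorted(r[0] for r in conn.execute("""
    SELECT name
    FROM employee
    ORDER BY salary DESC
    LIMIT 2
"""))

# Option B: DENSE_RANK gives tied salaries the same rank without gaps.
option_b = sorted(r[0] for r in conn.execute("""
    WITH temp AS (
        SELECT
            name,
            DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
        FROM employee
    )
    SELECT name
    FROM temp
    WHERE salary_rank <= 2
"""))

print(option_a)
print(option_b)
```

Option A returns only the two employees tied at the top and drops Alice, while option B returns all three.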

Here is another example (more practice during lab time)

---

**Example:** Write a query that returns each movie's title, production year (`start_year`), and number of votes, for all movies produced in or after 2018. Your query should also return a column that ranks each movie within its production year by number of votes, in descending order.

- Sort your results according to the rank value in ascending order

- Retrieve only 20 rows

---


In [None]:
%sql postgresql://{username}:{password}@{host}:{port}/imdb

In [None]:
%%sql

SELECT
    title,
    start_year,
    nvotes,
    RANK() OVER (PARTITION BY start_year ORDER BY nvotes DESC) AS rank
FROM
    movies
WHERE
    start_year >= 2018
ORDER BY rank
LIMIT 20;

 * postgresql://postgres:***@localhost:5432/imdb
   postgresql://postgres:***@localhost:5432/world
20 rows affected.


title,start_year,nvotes,rank
Avengers: Infinity War,2018,711500,1
Avengers: Endgame,2019,567968,1
Black Panther,2018,543002,2
Captain Marvel,2019,362609,2
Once Upon a Time... in Hollywood,2019,204266,3
Deadpool 2,2018,416811,3
Bohemian Rhapsody,2018,382622,4
Spider-Man: Far from Home,2019,198519,4
A Quiet Place,2018,325428,5
Shazam!,2019,182200,5


## Indexing

Check my additional notes if you want to know more about it.
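
As a brief illustration of what an index does, here is a minimal sketch using an in-memory SQLite database via Python's `sqlite3`. SQLite stands in for PostgreSQL, where the corresponding statements are `CREATE INDEX` and `EXPLAIN`. The `movies` table is filled with made-up rows, and the index name `idx_movies_start_year` and the helper `plan` are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movies ("
    "id INTEGER PRIMARY KEY, title TEXT, start_year INTEGER, nvotes INTEGER)"
)
# Made-up rows spread over the years 1950-2019.
conn.executemany(
    "INSERT INTO movies (title, start_year, nvotes) VALUES (?, ?, ?)",
    [(f"movie_{i}", 1950 + i % 70, i) for i in range(1000)],
)

query = "SELECT title FROM movies WHERE start_year = 2018"


def plan(connection, sql):
    """Hypothetical helper: return SQLite's query plan as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the description


plan_before = plan(conn, query)

# With an index on the filtered column, the planner can look rows up
# directly instead of scanning the whole table.
conn.execute("CREATE INDEX idx_movies_start_year ON movies (start_year)")
plan_after = plan(conn, query)

print(plan_before)
print(plan_after)
```

The first plan reports a full scan of `movies`, and the second reports a search that uses the new index. The exact wording of the plan text varies between SQLite versions.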

## Moral of the story
- CTEs are more readable than temporary tables.
- Views vs materialized views and when to use them
- Difference between CTEs vs temporary tables vs views
- Window functions and how to use them
