# <center>Big Data &ndash; Exercise 1</center>
## <center>Fall 2024 &ndash; Week 1 &ndash; ETH Zurich</center>

### Aims
- **After this exercise:** 
    - Understand the SQL language and its common query patterns.
    - Understand the 'table' data shape, normalization, and when they can (and should) be used.
    - Be able to query data in tables with the SQL language.
- **Later in the semester:** 
    - Relate these language features and query patterns to other data shapes, technologies, and the languages designed to query them.
    - Understand when tables are not the appropriate shape for your data and when you can (and should) throw normalization away!

### Prerequisites
In this exercise, you will brush up on the fundamental concepts of relational databases and SQL. If you haven't taken an introductory databases course (or want to refresh your knowledge), we recommend reading the following:

Garcia-Molina, Ullman, Widom: Database Systems: The Complete Book. Pearson, 2nd Edition, 2008. (Chapters 1, 2, 3, and 6) [Available in the ETH Library] [[Online]](https://ebookcentral.proquest.com/lib/ethz/detail.action?pq-origsite=primo&docID=5832965) [[Selected solutions]](http://infolab.stanford.edu/~ullman/dscbsols/sols.html).

Or have a look at the recordings from Information Systems for Engineers - ETH Zurich, available on [[YouTube]](https://www.youtube.com/c/GhislainFournysLectures).

### Database Set-up
We will once again be working in the ExamMagicBox (you can find it at the following [[link]](https://polybox.ethz.ch/index.php/s/wa57XqDKkxRMb0q) if you have not downloaded it yet): please drag this notebook into the folder. Just like last week, start the Docker containers for the exercise sheet with `docker compose up`, and wait for the message `PostgreSQL init process complete; ready for start up` in the Docker logs before proceeding! Alternatively, you can start the containers with `docker compose up -d` and wait for the command to finish; note that this runs the containers in the background. When you are done, stop them with `docker compose down`.

As before, we set up our connection to the database and enable use of `%sql` and `%%sql`.

In [1]:
server='db'
user='postgres'
password='example'
database='postgres'
connection_string=f'postgresql://{user}:{password}@{server}:5432/{database}'

In [2]:
%reload_ext sql
%sql $connection_string

In [3]:
%%sql
SELECT version();

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


version
"PostgreSQL 16.2 (Debian 16.2-1.pgdg120+2) on aarch64-unknown-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit"

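The connection string above is a URL, so a password containing characters such as `@`, `:` or `/` would be parsed incorrectly. The following is a minimal sketch of escaping such a password with the standard library before building the URL. The password `p@ss:word/1` and all `demo_` names are hypothetical and are not part of the exercise setup.

```python
from urllib.parse import quote_plus, urlsplit

# Hypothetical credentials for illustration only; the exercise itself uses 'example'.
demo_user = 'postgres'
demo_raw_password = 'p@ss:word/1'
demo_server = 'db'
demo_database = 'postgres'

# Percent-encode the password so '@', ':' and '/' cannot be mistaken for URL delimiters.
demo_safe_password = quote_plus(demo_raw_password)
demo_connection_string = (
    f'postgresql://{demo_user}:{demo_safe_password}@{demo_server}:5432/{demo_database}'
)

# Parsing the URL back shows that host and port are still recognised correctly.
demo_parts = urlsplit(demo_connection_string)
print(demo_connection_string)
print(demo_parts.hostname, demo_parts.port)
```

With the simple password used in this exercise, no escaping is needed and the original cell works as written.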

### Origin of the data
You can find more information on the dataset at the following links:
- [Discogs](https://www.discogs.com/)
- [Discogs XML data dumps](http://data.discogs.com/)

If you do not want to use Docker, or it does not work, you can download the dataset from this [link](https://cloud.inf.ethz.ch/s/DtjCHTLRHT39BRN/download/discogs.dump.xz) (see `postgres-init.sh` for how to import it).

## Exercise 1: Explore the dataset
We want to first understand the dataset a bit better. You will find some queries below to help you explore the schema.

### List tables
The following query retrieves a list of tables in the database from a system table describing the current database.

In [4]:
%%sql 
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public';

 * postgresql://postgres:***@db:5432/postgres
22 rows affected.


table_name
companies
jobs
employees02
badges
comments
inventory
posthistory
postlinks
posts
tags


### List attributes/columns
The following query retrieves a list of columns from the tables in the database.

In [5]:
%%sql 
SELECT table_name, column_name, data_type, is_nullable, ordinal_position
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name IN ('artists', 'released_by', 'releases', 'tracks')
AND table_name NOT LIKE 'pg_%'
ORDER BY table_name, ordinal_position;

 * postgresql://postgres:***@db:5432/postgres
17 rows affected.


table_name,column_name,data_type,is_nullable,ordinal_position
artists,artist_id,integer,NO,1
artists,name,character varying,YES,2
artists,realname,text,YES,3
artists,profile,text,YES,4
artists,url,text,YES,5
released_by,release_id,integer,NO,1
released_by,artist_id,integer,NO,2
releases,release_id,integer,NO,1
releases,released,date,NO,2
releases,title,text,NO,3


### Have a look at the datasets
The following simple query returns 5 rows of the `artists` table. Without an `ORDER BY`, which 5 rows you get is not guaranteed.

In [6]:
%%sql
SELECT * FROM artists LIMIT 5;

 * postgresql://postgres:***@db:5432/postgres
5 rows affected.


artist_id,name,realname,profile,url
1,The Persuader,Jesper Dahlbäck,,
2,Mr. James Barth & A.D.,Cari Lekebusch & Alexi Delano,,
3,Josh Wink,Joshua Winkelman,"After forming [l=Ovum Recordings] as an independent label in October 1994 with former partner [a=King Britt], Josh recorded the cult classic 'Liquid Summer'. He went on to release singles for a wide variety of revered European labels ranging from Belgium's [l=R & S Records] to England's [l=XL Recordings]. In 1995, Wink became one of the first DJ-producers to translate his hard work into mainstream success when he unleashed a string of classics including 'Don't Laugh'¸ 'I'm Ready' and 'Higher State of Consciousness' that topped charts worldwide. More recently he has had massive club hits such as 'How's Your Evening So Far' and 'Superfreak' but he has also gained a lot of attention trough his remixes for [a=FC Kahuna], [a=Paul Oakenfold], [a=Ladytron], [a=Clint Mansell], [a=Sting] and [a=Depeche Mode], among others.",http://www.joshwink.com/
4,Johannes Heil,Johannes Heil,"Electronic music producer, musician and live performer, born 3 February 1978 near the town of Bad Nauheim, Germany. Founder of [l=JH] and [l=Metatron Recordings].",http://johannes-heil.com/
5,Heiko Laux,Heiko Laux,German DJ and producer based in Berlin. He is the founder of [l=Kanzleramt].,http://www.heiko-laux.com


Naturally we could write similar queries to better understand each of the other tables.

#### With what you now know about the datasets, try to answer the following questions

1. Which concepts are modelled in the dataset and how do they relate to each other? <b>Hint</b>: how do the tables connect logically?
2. Why do you think this shape (table) was chosen for the data and why not the other shapes?
3. In which normal forms are the corresponding relations?
4. How can we denormalise the data to make some queries more efficient? <b>Hint</b>: have a look at the queries in the next section of the exercises to see if adding some columns to some tables could reduce the need to `JOIN`.
5. What potential problems could result from adding redundancy?
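
As a starting point for questions 4 and 5, here is a minimal sketch of one kind of redundancy, using Python's built-in `sqlite3` module rather than the exercise's PostgreSQL database. The table `releases_denorm`, its columns, and all rows are made up for illustration and do not exist in the Discogs schema.

```python
import sqlite3

# Toy schema for illustration only; names and rows are made up.
conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, name TEXT);
    -- Denormalised: artist_name is a redundant copy of artists.name.
    CREATE TABLE releases_denorm (release_id INTEGER PRIMARY KEY, title TEXT,
                                  artist_id INTEGER, artist_name TEXT);
    INSERT INTO artists VALUES (1, 'Old Name');
    INSERT INTO releases_denorm VALUES (10, 'First', 1, 'Old Name'),
                                       (11, 'Second', 1, 'Old Name');
''')

# The copy lets us read the artist name without a JOIN ...
no_join = conn.execute(
    'SELECT title, artist_name FROM releases_denorm ORDER BY release_id'
).fetchall()

# ... but an update to the source table does not reach the copies.
conn.execute("UPDATE artists SET name = 'New Name' WHERE artist_id = 1")
stale = conn.execute('''
    SELECT COUNT(*)
    FROM releases_denorm d JOIN artists a USING (artist_id)
    WHERE d.artist_name != a.name
''').fetchone()[0]
print(no_join, stale)
```

In this sketch both copied names go stale after the update, which is one example of the problems question 5 asks about.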

In [None]:
# 1. attributes; links?
# 2. 
# 3. 
# 4. 
# 5. 

## Exercise 2: SQL warm-up
Now that we have familiarised ourselves with the tables and their relationships, we will begin with several SQL queries to ease us back into the language.

<b>Practical tips:</b>
- You might want to begin by retrieving a few rows from each of the database tables to get a sense of what is stored. 
- When testing your queries, it is good practice to add a "LIMIT" clause to avoid inadvertently retrieving hundreds of rows.

The following is an example query that contains some common SQL expressions. A complete list can be found at: https://www.postgresql.org/docs/current/sql-select.html

In [7]:
%%sql
SELECT DISTINCT
    a.name AS column1,
    COUNT(t.track_id) AS column2,
    AVG(t.duration) AS column3
FROM
    artists a
    JOIN released_by rb USING(artist_id)
    JOIN releases r USING(release_id)
    JOIN tracks t USING(release_id)
WHERE
    t.duration > 123
    AND t.title != 'My Query'
    AND r.country = 'Switzerland'
GROUP BY
    a.artist_id, a.name
HAVING
    COUNT(t.track_id) > 0
ORDER BY
    column2 DESC,
    column3 DESC
LIMIT 5;

 * postgresql://postgres:***@db:5432/postgres
5 rows affected.


column1,column2,column3
Various Artists,2814,349.5806680881308
DJ Snowman,548,287.56204379562047
DJ Noise,504,401.86507936507934
Dave 202,425,326.88
DJ Nonsdrome,242,306.38429752066116
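
The example query above combines a row filter (`WHERE`) with a group filter (`HAVING`). The following minimal sketch shows the difference on a tiny table. It uses Python's built-in `sqlite3` module and made-up rows, not the Discogs data, so that it runs without the database container.

```python
import sqlite3

# Toy data for illustration only; these rows are made up and are not from the Discogs dump.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE tracks (release_id INTEGER, title TEXT, duration INTEGER)')
conn.executemany(
    'INSERT INTO tracks VALUES (?, ?, ?)',
    [(1, 'a', 100), (1, 'b', 200), (1, 'c', 300),
     (2, 'd', 50), (2, 'e', 400),
     (3, 'f', 500)],
)

rows = conn.execute('''
    SELECT release_id, COUNT(*) AS n_tracks, AVG(duration) AS avg_duration
    FROM tracks
    WHERE duration > 123          -- row filter, applied before grouping
    GROUP BY release_id
    HAVING COUNT(*) > 1           -- group filter, applied after grouping
    ORDER BY n_tracks DESC, avg_duration DESC
    LIMIT 5
''').fetchall()
print(rows)
```

Release 2 has two tracks in the toy table, but only one of them passes the `WHERE` filter, so its group is dropped by `HAVING`.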


The following is a visual representation of the database schema for quick reference.

<img src="https://polybox.ethz.ch/index.php/s/8CqNffQrR0EDbuC/download" width=800/>

#### 1. Retrieve all releases that were released after January 1, 2017.

In [13]:
%%sql
SELECT * 
FROM releases 
WHERE releases.released > '2017-1-1'
;

 * postgresql://postgres:***@db:5432/postgres
18 rows affected.


release_id,released,title,country,genre
264023,2018-01-01,The Bad Behaviour E.P,UK,Electronic
332457,2018-01-01,Sudd. Autumn Collection 03,Sweden,Electronic
396986,2018-01-01,You Wanna Do What / In One Hand,UK,Electronic
396989,2018-01-01,Happy To Be Sad / I Was Just Leaving,UK,Electronic
400057,2018-01-01,No-Harm,UK,Electronic
406547,2018-01-01,Melburn / Into The Storm,UK,Electronic
501126,2018-01-01,Lectronic,Sweden,Electronic
514740,2018-01-01,Invisible Agent 002,Ireland,Electronic
618428,2018-01-01,Tribal Natty (Aphrodite Remix),UK,Electronic
100000,2018-12-14,"Kizomba Mix, Vol. 2 [2018] 2 CDs",Portugal,Electronic


#### 2. Find all tracks with a duration longer than 7 hours. Assume the 'duration' column in the 'tracks' table is in seconds.

In [14]:
%%sql
SELECT *
FROM tracks
WHERE tracks.duration > 7*60*60
;

 * postgresql://postgres:***@db:5432/postgres
3 rows affected.


release_id,position,title,duration,track_id
478281,4,Live 1996.12.30.,31934,2526159
47796,9,Rapper's Relight,25579,256970
47796,11,Dialectical Transformation III Peace In Rwanda Mix,27196,256972


#### 3. Retrieve the titles of 5 releases along with the names of the artists who released them.

In [34]:
%%sql
SELECT 
    releases.title,
    artists.name
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
LIMIT 5;

 * postgresql://postgres:***@db:5432/postgres
5 rows affected.


title,name
Stockholm,The Persuader
Knockin' Boots Vol 2 Of 2,Mr. James Barth & A.D.
Profound Sounds Vol. 1,Josh Wink
Flowerhead,DATacide
Knockin' Boots (Vol 1 Of 2),Mr. James Barth & A.D.


#### 4. List each genre and the number of releases in that genre.

In [24]:
%%sql
SELECT
    releases.genre,
    COUNT(*)
FROM
    releases
GROUP BY
    releases.genre
;

 * postgresql://postgres:***@db:5432/postgres
15 rows affected.


genre,count
Blues,48
Brass & Military,4
Children's,6
Classical,257
Electronic,183766
"Folk, World, & Country",101
Funk / Soul,3674
Hip Hop,10598
Jazz,3325
Latin,153


#### 5. Identify the top 5 artists who have the most releases.

In [32]:
%%sql
SELECT
    artists.name,
    COUNT(*) as cnt
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
GROUP BY
    artists.name
ORDER BY
    cnt
DESC
LIMIT 5;

 * postgresql://postgres:***@db:5432/postgres
5 rows affected.


name,cnt
Various Artists,46123
Madonna,617
Pet Shop Boys,600
Faithless,336
Michael Jackson,332


#### 6. Find the artist who has the longest total duration of tracks across all their releases.

In [111]:
%%sql
SELECT
    artists.name,
    SUM(tracks.duration) as du
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
    JOIN tracks USING(release_id)
GROUP BY
    artists.name
ORDER BY
    du
DESC
LIMIT 1;

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


name,du
Various Artists,227180023


#### 7. Find how many releases have tracks with duplicate titles.

In [112]:
%%sql
SELECT
    COUNT(*)
FROM (
    SELECT DISTINCT
        releases.release_id
    FROM
        releases
        JOIN tracks USING(release_id)
    GROUP BY
        releases.release_id, tracks.title
    HAVING
        COUNT(*) > 1
             );

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


count
9046


#### 8. Retrieve the artists with the name of 'Coldplay'.

In [44]:
%%sql
SELECT *
FROM
    artists
WHERE
    artists.name = 'Coldplay'
;

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


artist_id,name,realname,profile,url
29735,Coldplay,,"Coldplay is an English rock band from London, England. They've been a band since January 16, 1998 when they lost a demotape competition on XFM in London. Philip Christopher Harvey is the band's manager.  [b][u]Line-up:[/u][/b]  Jonny Buckland (Jonathan Mark Buckland) - Guitar  Will Champion (William Champion) - Drums  Guy Berryman (Guy Rupert Berryman) - Bass  Chris Martin (Christopher Anthony John Martin) - Vocals",http://coldplay.com/


#### 9. List the titles of all releases by that artist in alphabetical order.
<b>Hint</b>: Ignore the fact that different releases can have the same title.

In [51]:
%%sql
SELECT DISTINCT
    releases.title
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
WHERE
    artists.name = 'Coldplay'
ORDER BY
    releases.title
;

 * postgresql://postgres:***@db:5432/postgres
40 rows affected.


title
Acoustic
A Rush Of Blood To The Head
Boot Of Sound
Brothers & Sisters
Clocks
Clocks...
Clocks / Chime Trance Remixes
Clocks (Cosmos Rmx)
Clocks (Dean Coleman Remix)
Clocks (Planet Rockers Remixes)


#### 10. How many tracks from 'Coldplay' have position '1'?

In [55]:
%%sql
SELECT
    COUNT(*)
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
    JOIN tracks USING(release_id)
WHERE
    artists.name = 'Coldplay' 
    AND tracks.position = '1'
;

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


count
32


#### 11. List the titles of all releases by Coldplay that contain fewer than 2 tracks.

In [117]:
%%sql
SELECT DISTINCT
    releases.title
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
    JOIN tracks USING(release_id)
WHERE
    artists.name = 'Coldplay' 
GROUP BY
    releases.title, releases.release_id
HAVING
    COUNT(*) < 2
;

 * postgresql://postgres:***@db:5432/postgres
14 rows affected.


title
Boot Of Sound
Clocks
Clocks (Cosmos Rmx)
Clocks (Dean Coleman Remix)
Clocks (Remix)
God
In My Place
One I Love
Speed Of Sound (Karl G Remix)
Talk


#### 12. What is the average track duration?

In [120]:
%%sql
SELECT
    AVG(tracks.duration)
FROM
    tracks
;

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


avg
325.0749298696788


#### 13. How many artists have released tracks longer than twice the average?

In [119]:
%%sql
SELECT 
    COUNT(DISTINCT artists.artist_id)
FROM artists
        JOIN released_by USING(artist_id)
        JOIN releases USING(release_id)
        JOIN tracks USING(release_id)
WHERE
    tracks.duration > 2 * (
    SELECT
        AVG(tracks.duration)
    FROM
        tracks
)
;

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


count
6386


## Exercise 3: more SQL
We will now see more complex SQL queries.

<b>Practical tips:</b>

When writing complex queries, you might want to split them into smaller parts by using <b>Common Table Expressions</b> (CTEs). A CTE is a named temporary result set that you can reference within statements (SELECT, INSERT, UPDATE, ... ). You can find more about CTEs at: https://www.postgresql.org/docs/current/queries-with.html

The following is an example of a query using two CTEs:

In [None]:
%%sql
WITH countries AS (
    SELECT DISTINCT country FROM releases
),
genres AS (
    SELECT DISTINCT genre FROM releases
)
SELECT c.country as column1, g.genre as column2
FROM countries c, genres g
LIMIT 5;

In some exercises, you might also want to use <b>subqueries</b>. A subquery is a nested query, usually with the purpose of retrieving data that will be used in the outer query. For instance, subqueries can appear in WHERE, FROM and SELECT clauses.

The following is an example of a query that includes a subquery:

In [None]:
%%sql
SELECT release_id as column1 FROM (
    SELECT release_id, title, COUNT(*) FROM tracks
    GROUP BY release_id, title
    HAVING COUNT(*) > 1
) sub
LIMIT 5;
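
A subquery in the `FROM` clause can usually be rewritten as a CTE and vice versa. The following minimal sketch runs both variants on a tiny made-up `tracks` table and compares the results. It uses Python's built-in `sqlite3` module instead of the exercise database, and the CTE name `duplicated` is an arbitrary choice.

```python
import sqlite3

# Toy data for illustration only; not taken from the Discogs dump.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE tracks (release_id INTEGER, title TEXT)')
conn.executemany(
    'INSERT INTO tracks VALUES (?, ?)',
    [(1, 'Intro'), (1, 'Intro'), (1, 'Outro'),
     (2, 'Solo'),
     (3, 'Mix'), (3, 'Mix')],
)

# Variant 1: nested subquery in the FROM clause.
with_subquery = conn.execute('''
    SELECT release_id FROM (
        SELECT release_id, title FROM tracks
        GROUP BY release_id, title
        HAVING COUNT(*) > 1
    ) AS sub
    ORDER BY release_id
''').fetchall()

# Variant 2: the same inner query, named up front as a CTE.
with_cte = conn.execute('''
    WITH duplicated AS (
        SELECT release_id, title FROM tracks
        GROUP BY release_id, title
        HAVING COUNT(*) > 1
    )
    SELECT release_id FROM duplicated
    ORDER BY release_id
''').fetchall()

print(with_subquery, with_cte)
```

Both variants return the same rows here. Which one reads better is largely a matter of how deeply the query is nested.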

#### 1. What is the title of the album from 'Coldplay' with the most tracks?

In [121]:
%%sql
SELECT
    releases.title, COUNT(*)
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
    JOIN tracks USING(release_id)
WHERE
    artists.name = 'Coldplay' 
GROUP BY 
    releases.title, releases.release_id
ORDER BY
    COUNT(*)
DESC
LIMIT 1;

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


title,count
X&Y (Special Dutch Edition),19


#### 2. What is the name of the first artist in alphabetical order with releases in the most genres? Please make sure to exclude "Various Artists".

In [86]:
%%sql
SELECT
    name, cnt
FROM
(
    SELECT 
        name, RANK() OVER (ORDER BY cnt DESC) as rk, cnt
    FROM
    (
        SELECT
            name, COUNT(*) as cnt
        FROM
        (
            SELECT DISTINCT
                artists.name as name, releases.genre
            FROM
                artists
                JOIN released_by USING(artist_id)
                JOIN releases USING(release_id)
                JOIN tracks USING(release_id)
            WHERE
                artists.name != 'Various Artists'
            GROUP BY
                artists.name, releases.genre
        )
        GROUP BY
            name
    )
)
WHERE
    rk = 1
ORDER BY
    name
LIMIT 1
;

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


name,cnt
Diana Ross,7
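
The query above uses `RANK()` to keep every artist tied for first place before picking the alphabetically first one. The following minimal sketch shows that pattern next to a shorter alternative that sorts by two keys. It uses Python's built-in `sqlite3` module (assuming a bundled SQLite of version 3.25 or newer, which is needed for window functions) and a made-up table `genre_counts` with invented names.

```python
import sqlite3

# Toy data for illustration only: two artists tie for the highest count.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE genre_counts (name TEXT, cnt INTEGER)')
conn.executemany('INSERT INTO genre_counts VALUES (?, ?)',
                 [('Zed', 7), ('Amy', 7), ('Bob', 5)])

# Pattern from the query above: rank by count, keep all ties, then pick by name.
via_rank = conn.execute('''
    SELECT name, cnt FROM (
        SELECT name, cnt, RANK() OVER (ORDER BY cnt DESC) AS rk
        FROM genre_counts
    ) AS ranked
    WHERE rk = 1
    ORDER BY name
    LIMIT 1
''').fetchone()

# Alternative: let a two-key ORDER BY break the tie directly.
via_order_by = conn.execute('''
    SELECT name, cnt FROM genre_counts
    ORDER BY cnt DESC, name ASC
    LIMIT 1
''').fetchone()

print(via_rank, via_order_by)
```

Both variants pick the same row in this sketch. The `RANK()` form is more useful when all tied rows are wanted, not just the first.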


#### 3. In what year did they (the artist from the previous question) release their first album?

In [88]:
%%sql
SELECT
    releases.released
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
WHERE
    artists.name = 'Diana Ross'
ORDER BY
    releases.released
LIMIT 1
;

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


released
1967-01-01


#### 4. How many artists have released an album with total track duration above twice the average total track duration?

<b>Hint</b>: this is not the same as exercise 2.13 since we are looking at the <b>total</b> track duration of the album.

In [125]:
%%sql
WITH info AS
(
    SELECT
        artists.name as name, releases.released as released, SUM(tracks.duration) as duration
    FROM
        artists
        JOIN released_by USING(artist_id)
        JOIN releases USING(release_id)
        JOIN tracks USING(release_id)
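    GROUP BY
        -- Caveat: this merges releases by the same artist that share a release date;
        -- grouping by releases.release_id instead would treat each release separately.
        artists.name, releases.released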
)
SELECT
    count(DISTINCT info.name)
FROM info
WHERE
    info.duration > 2 * (
        SELECT AVG(duration)
        FROM (
            SELECT SUM(duration) as duration
            FROM releases
                JOIN tracks USING(release_id)
            GROUP BY releases.release_id
        )
    )
;

 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


count
6524


#### 5. How many artists have both a release with a track longer than twice the average and one with total duration longer than twice the average?

<b>Hint</b>: you can use `INTERSECT` or `EXISTS` to write your query.

In [103]:
%%sql
SELECT COUNT(*) FROM (
WITH release_info AS
(
    SELECT
        artists.name as name, releases.released as released, SUM(tracks.duration) as duration
    FROM
        artists
        JOIN released_by USING(artist_id)
        JOIN releases USING(release_id)
        JOIN tracks USING(release_id)
    GROUP BY
        artists.name, releases.released
)
,
 track_info AS(
    SELECT
        artists.name as name, tracks.duration as duration
    FROM
        artists
        JOIN released_by USING(artist_id)
        JOIN releases USING(release_id)
        JOIN tracks USING(release_id)
)
SELECT 
    DISTINCT release_info.name
FROM release_info
WHERE
    release_info.duration > 2 * (SELECT AVG(duration) from release_info)

INTERSECT

SELECT
    DISTINCT track_info.name
FROM track_info
WHERE
    track_info.duration > 2 * (SELECT AVG(duration) from track_info)
);


 * postgresql://postgres:***@db:5432/postgres
1 rows affected.


count
1570
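
The hint mentions both `INTERSECT` and `EXISTS`. The following minimal sketch shows the two forms side by side on made-up data, using Python's built-in `sqlite3` module. The tables `long_track` and `long_release` are hypothetical stand-ins for the two intermediate results and do not exist in the exercise database.

```python
import sqlite3

# Toy data for illustration only; the two tables stand in for the two
# intermediate results (artists with a long track, artists with a long release).
conn = sqlite3.connect(':memory:')
conn.executescript('''
    CREATE TABLE long_track (name TEXT);
    CREATE TABLE long_release (name TEXT);
    INSERT INTO long_track VALUES ('A'), ('B'), ('B'), ('C');
    INSERT INTO long_release VALUES ('B'), ('C'), ('D');
''')

# INTERSECT keeps names present in both inputs and removes duplicates.
via_intersect = conn.execute('''
    SELECT name FROM long_track
    INTERSECT
    SELECT name FROM long_release
    ORDER BY name
''').fetchall()

# EXISTS checks, per row, whether a matching row exists in the other input;
# DISTINCT is needed here because EXISTS does not deduplicate by itself.
via_exists = conn.execute('''
    SELECT DISTINCT t.name FROM long_track t
    WHERE EXISTS (SELECT 1 FROM long_release r WHERE r.name = t.name)
    ORDER BY t.name
''').fetchall()

print(via_intersect, via_exists)
```

Both forms return the same names in this sketch.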


#### 6. Show the artists that have more than 200 releases in total but no releases in the genre 'Pop', in reverse alphabetical order.

In [109]:
%%sql
SELECT *
FROM
(
SELECT artists.name
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
GROUP BY
    artists.name
HAVING
    COUNT(*) > 200
EXCEPT
SELECT DISTINCT artists.name
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
WHERE
    releases.genre = 'Pop'
GROUP BY
    artists.name
)
ORDER BY
    name
DESC
;

 * postgresql://postgres:***@db:5432/postgres
11 rows affected.


name
Underworld
The Shamen
The Art Of Noise
Technotronic
Tangerine Dream
Pet Shop Boys
Orbital
Kool & The Gang
Faithless
Beastie Boys


## Exercise 4: Discuss query patterns and language features of SQL
1. What patterns did you use in many of the queries above? 

2. What is the usual pattern of an SQL query? Which operations happen pre-grouping and which ones post-grouping?

3. What makes SQL a declarative language and what advantages does that have?

4. What makes SQL a functional language and what advantages does that have?

5. How would the denormalization we talked about previously simplify the queries?

## Exercise 5: Limits of SQL (optional)
Explain what the following query does.
<b>Hint</b>: The query treats the data as if it were in graph shape.

In [110]:
%%sql
WITH RECURSIVE
    X AS (SELECT 3 AS Value),
    artist_releases AS (
        SELECT artists.artist_id, artists.name, releases.release_id, releases.title
        FROM artists, released_by, releases
        WHERE artists.artist_id = released_by.artist_id
        AND released_by.release_id = releases.release_id
    ),
    collaborations AS (
        SELECT DISTINCT ar1.artist_id AS left_id, ar1.name AS left_name, 
                ar2.artist_id AS right_id, ar2.name AS right_name, 1 AS distance
        FROM artist_releases AS ar1, artist_releases AS ar2
        WHERE ar1.release_id = ar2.release_id
        AND ar1.artist_id != ar2.artist_id
    ),
    X_hop_collaborations AS (
        SELECT * FROM collaborations  -- base case
        UNION
        SELECT c1.left_id, c1.left_name, c2.right_id, c2.right_name, c1.distance + 1 AS distance
        FROM X_hop_collaborations AS c1
        JOIN collaborations c2 ON c1.right_id = c2.left_id
        WHERE c1.distance < (SELECT * FROM X)
    )
SELECT * 
FROM X_hop_collaborations
WHERE left_name = 'Coldplay'
ORDER BY distance, right_name;

 * postgresql://postgres:***@db:5432/postgres
26 rows affected.


left_id,left_name,right_id,right_name,distance
29735,Coldplay,1654,DK,1
29735,Coldplay,392179,G Synth,1
29735,Coldplay,10916,Jan Johnston,1
29735,Coldplay,1279,Orbital,1
29735,Coldplay,10785,Angelo Badalamenti,2
29735,Coldplay,29735,Coldplay,2
29735,Coldplay,11101,Cosmic Gate,2
29735,Coldplay,7090,Freefall,2
29735,Coldplay,2604010,Jada (7),2
29735,Coldplay,18836,Kirk Hammett,2
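
To experiment with the recursive pattern on something small enough to trace by hand, here is a minimal sketch on a made-up four-node graph. It uses Python's built-in `sqlite3` module; the table `edges`, the node names, and `max_hops` are hypothetical and only mirror the structure of the query above.

```python
import sqlite3

# Toy collaboration graph for illustration only: A-B, B-C, C-D (undirected,
# stored in both directions like the `collaborations` CTE above).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE edges (left_name TEXT, right_name TEXT)')
pairs = [('A', 'B'), ('B', 'C'), ('C', 'D')]
conn.executemany('INSERT INTO edges VALUES (?, ?)',
                 pairs + [(b, a) for a, b in pairs])

max_hops = 2  # plays the role of X in the query above
rows = conn.execute('''
    WITH RECURSIVE hops(left_name, right_name, distance) AS (
        SELECT left_name, right_name, 1 FROM edges        -- base case: direct links
        UNION
        SELECT h.left_name, e.right_name, h.distance + 1  -- recursive step: extend by one edge
        FROM hops AS h
        JOIN edges AS e ON h.right_name = e.left_name
        WHERE h.distance < ?
    )
    SELECT right_name, distance FROM hops
    WHERE left_name = 'A'
    ORDER BY distance, right_name
''', (max_hops,)).fetchall()
print(rows)
```

As in the output above, the start node reappears at distance 2 in this sketch, because walking one edge out and the same edge back counts as a two-hop path.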


In [138]:
%%sql
SELECT
    artists.name, artists.artist_id, COUNT(DISTINCT releases.country) as cnt
FROM
    artists
    JOIN released_by USING(artist_id)
    JOIN releases USING(release_id)
WHERE
    artists.name != 'Various Artists'
GROUP BY
    artists.name, artists.artist_id
ORDER BY
    cnt DESC,
    name ASC
LIMIT 5;

 * postgresql://postgres:***@db:5432/postgres
5 rows affected.


name,artist_id,cnt
Technotronic,14457,24
ATB,5797,21
Madonna,8760,21
Pet Shop Boys,7552,21
Faithless,4118,20


In [147]:
%%sql
SELECT
    genre, AVG(cnt) as av
FROM (
    SELECT
        releases.genre, releases.release_id, COUNT(*) as cnt
    FROM
        releases
        JOIN tracks USING(release_id)
    GROUP BY
        releases.genre, releases.release_id
)
GROUP BY
    genre
ORDER BY
    av
DESC
;

 * postgresql://postgres:***@db:5432/postgres
15 rows affected.


genre,av
Blues,16.583333333333332
Stage & Screen,14.41025641025641
Jazz,12.555488721804512
"Folk, World, & Country",12.207920792079207
Children's,12.0
Latin,11.738562091503267
Non-Music,11.212464589235127
Reggae,10.167149059334298
Rock,10.094236360526764
Brass & Military,10.0
