# GroupBy Extensions

If you've made it this far in the course, then you already know the basics of SQL.

In particular, I assume that you know how to group rows by single or multiple columns and how to calculate aggregate functions: sums, averages, maximum values, etc.

It’s time to learn some of its advanced features! 

In this section, you’ll learn how to use ROLLUP, CUBE, and GROUPING SETS operations.

ROLLUP is an extension of the GROUP BY clause that adds extra rows representing subtotals. 

They are often called super-aggregated rows. 

They can be presented together with the total sum line. 

Thanks to the useful ROLLUP extension, you can generate multiple grouping sets using just one SQL query.

Similar to ROLLUP, the CUBE extension allows you to generate subtotals.

In addition, CUBE generates subtotals for all combinations of grouping columns specified in the GROUP BY clause.

GROUPING SETS goes a step further: it lets you compute several GROUP BY groupings in a single SQL statement.

GROUP BY GROUPING SETS is equivalent to combining two or more GROUP BY queries with UNION ALL into one result set.

# ROLLUP

Let's first get to know the tables that we'll use in this part of the course.

Check out the tables here: https://www.db-fiddle.com/f/q1yM9pKcshbvZehgqfMJGd/0

Our examples will be based on a table named contest_score. You can check its contents with the query below.

```
SELECT 
  * 
FROM contest_score;
```

This table is used in a TV show in which contestants are given points for tasks in two categories: physical (Workout) and intellectual (Knowledge). Scores from 1 to 10 are given over the course of five weeks to five contestants.

This table has the following columns: full_name, week, category, and score.


The second table is named delivery; it will be used in our exercises. It is used by a supermarket to keep track of all deliveries.

```
SELECT
  * 
FROM delivery;
```


The table has the following columns:

- supplier (the company that delivers the products).
- category (such as Toys or Office).
- delivery_date (when the products were delivered to the store).
- total_price (how much was paid for the whole delivery).


# Grouping by a single column


To quickly recap what you already know about grouping rows in SQL, here is a simple problem: suppose we want to know the average score for each TV show contestant over the course of five weeks. 

In SQL, we'd write the following query:

```
SELECT
  full_name,
  AVG(score) AS avg_score
FROM contest_score
GROUP BY full_name;
```

In this simple query, we grouped the results by a single column (full_name) and used the AVG() function to calculate the average score for each contestant. 

Nothing really difficult.


# Exercise

How much, in total, was spent on deliveries for each category? Show two columns: category and the sum of total_price (as total).

```
SELECT 
  category, 
  SUM(total_price) AS total
FROM delivery
GROUP BY category;
```

# Grouping by multiple columns

Okay, we know how well each contestant did in general, but we would also like to know how well they did on average in each category. To achieve that, we can add another column to the GROUP BY clause:

```
SELECT
  full_name,
  category, 
  AVG(score) AS avg_score
FROM contest_score
GROUP BY full_name, category;
```

With this addition, we will now get each contestant's average score in each category.


# ROLLUP introduction

Now we know how well each contestant did in general (GROUP BY full_name), and also how well they did in each category (GROUP BY full_name, category). 

But note one thing: by adding the second column to the GROUP BY clause, we lost some information from the previous query.

With the `GROUP BY full_name, category` clause, we know the average scores for each contestant in each category, but we are no longer able to check the overall average for each contestant.

That's where GROUP BY ROLLUP comes in handy. Take a look:

```
SELECT
  full_name,
  category, 
  AVG(score) AS avg_score
FROM contest_score
GROUP BY ROLLUP (full_name, category);
```

Note the change in the GROUP BY clause. We added the ROLLUP operator, followed by a pair of parentheses. Inside, we put full_name and category.

Let's see what changes when we use ROLLUP.

Apart from averages by contestant and category, we can also see averages by contestant and an overall average across all contestants and categories.

We see that ROLLUP added two new kinds of rows to the query result: an overall average for each contestant and a single general average for all contestants.

# Exercise

Show how much was spent for each category on each day, for each category in general, and for all days and all categories. 

Show the following columns: category, delivery_date, and the sum of total_price (rename the column total).

```
SELECT 
  category, 
  delivery_date,
  SUM(total_price) AS total
FROM delivery
GROUP BY ROLLUP(category, delivery_date);
```

# Order of columns in ROLLUP

Now, you may have noticed that when we wrote:

```
SELECT
  full_name,
  category, 
  AVG(score) AS avg_score
FROM contest_score
GROUP BY ROLLUP (full_name, category);
```

we got the following groupings:

1. Average by full_name and category
2. Average by full_name
3. Overall average

In other words, we saw the full_name average, but we didn't see an average for category only. 

That's because the column order matters in ROLLUP. 

Let's check that out.

Run the template code:

```
SELECT
  full_name,
  category, 
  AVG(score) AS avg_score
FROM contest_score
GROUP BY ROLLUP (category, full_name);
```

As you can see, we now have the following groupings:

1. Average by full_name and category
2. Average by category
3. Overall average

You can see that by reversing the order of the columns inside the ROLLUP's parentheses, we changed one of the groupings.

As a general rule, ROLLUP will always show new grouping combinations by removing columns one by one, starting from the right:

```
GROUP BY ROLLUP (A, B, C) =
GROUP BY (A, B, C) +
GROUP BY (A, B) +
GROUP BY (A) +
GROUP BY ()
```
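The rule above can be sketched in a few lines of Python. This only illustrates which grouping sets ROLLUP expands to, not how a database implements it; the function name `rollup_sets` is made up for this example.

```python
def rollup_sets(columns):
    """Return the grouping sets that ROLLUP(columns) expands to.

    Columns are removed one at a time from the right; the last set is
    empty and stands for the grand total.
    """
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


# Print one line per grouping level, from most to least detailed.
for grouping_set in rollup_sets(["A", "B", "C"]):
    print("GROUP BY", grouping_set)
```

Calling `rollup_sets(["category", "full_name"])` gives the three levels from the reversed query: by category and full_name, by category, and the overall total.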

# Exercise

Show how much was spent in each category on each day, on each day in general, and in general among all days and categories. Show the following columns: category, delivery_date, and the sum of total_price (as total).

```
SELECT 
  category, 
  delivery_date,
  SUM(total_price) AS total
FROM delivery
GROUP BY ROLLUP (delivery_date, category);
```

# The GROUPING() function

When using multiple columns inside ROLLUP's parentheses, it's quite easy to get lost among the resulting rows. 

SQL offers a function that tells you if the column is included in the grouping: GROUPING().

The GROUPING() function takes one column as an argument.

It returns a 0 if the column is used in the grouping and a 1 if it is not. 

Take a look:

```
SELECT
  full_name,
  category, 
  week,
  AVG(score) AS avg_score,
  GROUPING(full_name) AS F,
  GROUPING(category) AS C, 
  GROUPING(week) AS W
FROM contest_score
GROUP BY 
  ROLLUP (full_name, category, week);
```

As you can see, we added three GROUPING() functions in our SELECT clause. Inside the parentheses, we put all columns from the parentheses of ROLLUP. 

Let's see what the result is.
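If you cannot run the query right now, the Python sketch below shows which F, C, and W values to expect for each grouping level of `ROLLUP (full_name, category, week)`. The helper names `rollup_sets` and `grouping_flags` are invented for this illustration; they only mimic the rule that GROUPING() returns 0 for a column used in the grouping and 1 otherwise.

```python
def rollup_sets(columns):
    """Grouping sets produced by ROLLUP: drop columns from the right."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def grouping_flags(columns, grouping_set):
    """Mimic GROUPING(): 0 if the column is in the grouping set, else 1."""
    return tuple(0 if col in grouping_set else 1 for col in columns)


columns = ["full_name", "category", "week"]
for grouping_set in rollup_sets(columns):
    f, c, w = grouping_flags(columns, grouping_set)
    print(f"F={f} C={c} W={w}  <- grouped by {grouping_set}")
```

The flags turn to 1 from right to left, which matches the order in which ROLLUP drops columns.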

# Columns outside ROLLUP

Another thing you may be wondering about is whether you need to include all grouping columns inside the ROLLUP parentheses. 

No, you don't! 

You can leave some columns outside ROLLUP:

```
SELECT
  full_name,
  category, 
  week,
  AVG(score) AS avg_score
FROM contest_score
GROUP BY 
  ROLLUP (full_name, category),
  week;
```

In the query above, every grouping level includes week, the column left outside ROLLUP. 

This means that the following grouping levels will be applied:

1. GROUP BY full_name, category, week
2. GROUP BY full_name, week
3. GROUP BY week
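The three levels listed above can be reproduced with a small Python sketch. It assumes that a column outside ROLLUP is simply appended to every grouping set; `rollup_with_fixed` is a hypothetical helper name, not a SQL feature.

```python
def rollup_sets(columns):
    """Grouping sets produced by ROLLUP: drop columns from the right."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def rollup_with_fixed(rollup_columns, fixed_columns):
    """Grouping sets for GROUP BY ROLLUP(rollup_columns), fixed_columns."""
    return [s + tuple(fixed_columns) for s in rollup_sets(rollup_columns)]


levels = rollup_with_fixed(["full_name", "category"], ["week"])
for level in levels:
    print("GROUP BY", ", ".join(level))
```

Note that there is no empty grouping set here: because week is always present, this query has no grand total row.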


# Multiple ROLLUPs

If you need even more fine-grained control over the grouping combinations, you can also use multiple ROLLUPs:

```
SELECT
  full_name,
  category, 
  week,
  AVG(score) AS avg_score
FROM contest_score
GROUP BY 
  ROLLUP(full_name, category), 
  ROLLUP(week);
```

The query above will create even more combinations than

`ROLLUP(full_name, category, week)`

because the grouping options of ROLLUP(full_name, category) and the grouping options of ROLLUP(week) are combined together. 
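To see how many more combinations that is, here is a rough Python sketch. It assumes that comma-separated items in GROUP BY are cross-combined (one grouping set from each item, joined together); the helper names `rollup_sets` and `combine` are made up for the example.

```python
from itertools import chain, product


def rollup_sets(columns):
    """Grouping sets produced by ROLLUP: drop columns from the right."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def combine(*grouping_set_lists):
    """Pick one grouping set from each list and concatenate them."""
    return [tuple(chain.from_iterable(parts)) for parts in product(*grouping_set_lists)]


single = rollup_sets(["full_name", "category", "week"])
double = combine(rollup_sets(["full_name", "category"]), rollup_sets(["week"]))

print(len(single), len(double))  # 4 6
for grouping_set in double:
    print("GROUP BY", grouping_set)
```

The two extra levels are (full_name, week) and (week), which a single three-column ROLLUP never produces.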

# ROLLUP with COALESCE()

The last thing we'll show you in this part is how to get rid of those nasty NULL values in higher grouping levels. 

To that end, we'll use the COALESCE() function.

```
SELECT
  COALESCE(full_name, 'All Contestants') AS full_name,
  COALESCE(category, 'All Categories') AS category,
  week,
  AVG(score) AS avg_score
FROM contest_score
GROUP BY ROLLUP(full_name, category), week;
```

COALESCE() takes as many arguments as you wish and returns the first element that is not NULL. 

For instance,

```
COALESCE(full_name, 'All Contestants')
```

will produce either the respective full_name value or the string 'All Contestants' if full_name is NULL. Naturally, you can use any other text value instead of 'All Contestants'.
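You can try COALESCE() on its own, without any grouping. The sketch below uses Python's built-in `sqlite3` module only because it needs no setup; the contestant name 'Ann Smith' and the value 'fallback' are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each COALESCE call returns its first argument that is not NULL.
result = conn.execute(
    """
    SELECT
      COALESCE('Ann Smith', 'All Contestants'),
      COALESCE(NULL, 'All Contestants'),
      COALESCE(NULL, NULL, 'fallback')
    """
).fetchone()
conn.close()

print(result)  # ('Ann Smith', 'All Contestants', 'fallback')
```

Keep in mind that COALESCE() cannot tell a NULL created by ROLLUP from a NULL stored in the table. If a grouping column may contain real NULLs, GROUPING() is the safer way to detect subtotal rows.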

# ROLLUP Summary

1. GROUP BY ROLLUP() works by creating additional rows with fewer grouping columns, as shown below:
```
GROUP BY ROLLUP (A, B, C) =
GROUP BY (A, B, C) +
GROUP BY (A, B) +
GROUP BY (A) +
GROUP BY ()
```

2. By changing the order of columns inside ROLLUP, we change the grouping levels created.

3. Not all columns must be included inside ROLLUP. Those outside its parentheses will always be used for grouping.

4. We can use COALESCE(column_name, substitute_value) to show the substitute_value when column_name is NULL.

5. The GROUPING(column_name) function shows if the column_name column is used in the grouping.

# CUBE

You learned how to use ROLLUP in the previous part. 

Here, we will practice CUBE – another GROUP BY extension that is mainly used in ETL processes (i.e., when working with data warehouses and creating advanced reports).

### Our tables

Go here for the fiddle: https://www.db-fiddle.com/f/wMiWiiZBpbXDeE3362U998/0

#### Vaccination Admin Table

Fittingly for the time of writing (June 2020, COVID vaccinations), we will use a table named vaccine_administration in our examples. 

It describes vaccination efficacy in various test patients. The following columns are available:

- id – A unique identifier for each record/row in the table.
- location – A patient’s location, either Norway or Argentina.
- gender – Denoted as either Male or Female.
- risk – The patient’s risk of contracting a disease: Low, Medium, or High.
- age – The patient’s age in years.
- efficacy – An integer value ranging from 1 (poor vaccination efficacy) to 3 (excellent efficacy).

```
SELECT
  *
FROM vaccine_administration;
```

#### Wildfire Incident Table

Let’s take a look at the table that you’ll be querying in the exercises. It is named wildfire_incident and it contains the following columns:

- id – A unique identifier for each record/row in the table.
- year – The year when a given wildfire started: 2016, 2017, or 2018.
- month – The month when a given wildfire started: May, June, or July.
- cause – Lightning Strike, Arson, Spontaneous Combustion, or Unintentional Human Involvement.
- damage_repair_cost – The estimated cost to repair the damage caused by a given wildfire.
- duration – The number of hours a given wildfire lasted.

```
SELECT
  *
FROM wildfire_incident;
```
# Introduction to CUBE

In principle, ROLLUP and CUBE are similar. 

The difference is that CUBE does not remove columns from the right to create grouping levels. 

Instead, it creates every possible grouping combination based on the columns inside its parentheses. 

Take a look:

```
SELECT
  location, 
  gender, 
  risk, 
  AVG(age) AS avg_age
FROM vaccine_administration
GROUP BY CUBE (location, gender, risk);
```

Syntax-wise, we only replaced ROLLUP with CUBE. 

Let’s see what result we’ll get.


As you can see, the following rows were created:

```
GROUP BY CUBE(location, gender, risk) =
GROUP BY location, gender, risk +
GROUP BY location, gender +
GROUP BY gender, risk +
GROUP BY location, risk +
GROUP BY location +
GROUP BY gender +
GROUP BY risk +
GROUP BY ()
```


Quite a lot! 

For n columns passed in as parameters, CUBE creates 2^n grouping levels, while ROLLUP only creates n + 1 levels.

You need to be aware that CUBE can significantly lower the performance of your queries. 

As few as three columns in CUBE create eight different types of groupings. 

A single query with CUBE is usually faster than separate grouping queries merged with UNION ALL, but performance can still be an issue for large tables.
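The 2^n versus n + 1 growth is easy to check with a short Python sketch. It only counts grouping sets and says nothing about actual query speed; `cube_sets` and `rollup_sets` are illustrative helper names.

```python
from itertools import combinations


def cube_sets(columns):
    """Every subset of columns, largest first: the grouping sets of CUBE."""
    return [
        subset
        for size in range(len(columns), -1, -1)
        for subset in combinations(columns, size)
    ]


def rollup_sets(columns):
    """Grouping sets produced by ROLLUP: drop columns from the right."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


cols = ["location", "gender", "risk"]
print(len(cube_sets(cols)), len(rollup_sets(cols)))  # 8 4

# Adding a fourth column doubles the CUBE levels but adds only one to ROLLUP.
four = cols + ["age"]
print(len(cube_sets(four)), len(rollup_sets(four)))  # 16 5
```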

# Exercise

Show the sum of damage_repair_cost (as sum_damage_repair_cost) for all possible grouping combinations based on the year, month, and cause columns.

Show the following columns in the query result: year, month, cause, and sum_damage_repair_cost.

```
SELECT
  year,
  month,
  cause,
  SUM(damage_repair_cost) AS sum_damage_repair_cost
FROM wildfire_incident
GROUP BY CUBE (year, month, cause);
```

# Order of Columns in CUBE

CUBE creates all possible grouping combinations. 

This means that, unlike ROLLUP, CUBE does not change the query result when the order of columns inside the parentheses is changed.

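Let’s verify this. The query below lists the same columns as the previous exercise, but in reverse order inside CUBE. It returns the same groups, although the rows may come back in a different order because there is no ORDER BY.

```
SELECT
  year,
  month,
  cause,
  SUM(damage_repair_cost) AS sum_damage_repair_cost
FROM wildfire_incident
GROUP BY CUBE (cause, month, year);
```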
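The same point can be checked without a database. The Python sketch below compares the grouping sets for two column orders; it treats each grouping set as an unordered set of column names, which is an assumption of this illustration, and the helper names are invented.

```python
from itertools import combinations


def rollup_sets(columns):
    """ROLLUP grouping sets as unordered sets of column names."""
    return {frozenset(columns[:n]) for n in range(len(columns) + 1)}


def cube_sets(columns):
    """CUBE grouping sets as unordered sets of column names."""
    return {
        frozenset(subset)
        for size in range(len(columns) + 1)
        for subset in combinations(columns, size)
    }


original = ["year", "month", "cause"]
reordered = ["cause", "month", "year"]

cube_same = cube_sets(original) == cube_sets(reordered)
rollup_same = rollup_sets(original) == rollup_sets(reordered)
print(cube_same, rollup_same)  # True False
```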
# CUBE with GROUPING()

As with ROLLUP, you can use the GROUPING() function with CUBE.

## Exercise

Show the average damage repair cost for all possible combinations of year, month, and cause. 

Remove the row containing the total average from the results with the help of the GROUPING() function.

Show the following columns in the query result: year, month, cause, and avg_cost.

```
SELECT
  year,
  month,
  cause,
  AVG(damage_repair_cost) AS avg_cost
FROM wildfire_incident
GROUP BY CUBE (year, month, cause)
HAVING (GROUPING(year) + GROUPING(month) + GROUPING(cause) < 3)
```
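Why does the condition `< 3` remove only the total row? A quick Python sketch makes it visible: the three GROUPING() values add up to 3 only when none of the columns is used for grouping. The `grouping` helper below is a stand-in for the SQL function, written for this example.

```python
from itertools import combinations

columns = ["year", "month", "cause"]

# All grouping sets created by CUBE (year, month, cause).
cube = [
    subset
    for size in range(len(columns), -1, -1)
    for subset in combinations(columns, size)
]


def grouping(column, grouping_set):
    """Stand-in for GROUPING(): 0 if the column is grouped on, else 1."""
    return 0 if column in grouping_set else 1


# Same filter as the HAVING clause: keep sets whose flags sum to less than 3.
kept = [s for s in cube if sum(grouping(c, s) for c in columns) < 3]
print(len(cube), len(kept))  # 8 7
```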

# Columns Outside CUBE

Now, if we want to reduce the number of grouping combinations, we can exclude some columns from CUBE:

```
SELECT
  location,
  gender,
  risk,
  MIN(efficacy) AS min_efficacy
FROM vaccine_administration
GROUP BY 
  CUBE (location, gender), risk;
```

In the query above, CUBE will create grouping combinations for location and gender, but risk will be added to each grouping combination. 

As a result, we’ll get the following levels:

```
GROUP BY CUBE (location, gender), risk =
GROUP BY location, gender, risk +
GROUP BY location, risk +
GROUP BY gender, risk +
GROUP BY risk
```

Run the template query and note that risk is now always used for grouping.
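As a cross-check, this Python sketch rebuilds the four levels listed above. It assumes the column outside CUBE is appended to every grouping set; `cube_with_fixed` is a hypothetical helper name.

```python
from itertools import combinations


def cube_sets(columns):
    """Every subset of columns, largest first: the grouping sets of CUBE."""
    return [
        subset
        for size in range(len(columns), -1, -1)
        for subset in combinations(columns, size)
    ]


def cube_with_fixed(cube_columns, fixed_columns):
    """Grouping sets for GROUP BY CUBE(cube_columns), fixed_columns."""
    return [s + tuple(fixed_columns) for s in cube_sets(cube_columns)]


levels = cube_with_fixed(["location", "gender"], ["risk"])
for level in levels:
    print("GROUP BY", ", ".join(level))
```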

# CUBE with multiple pairs of parentheses

There is one more interesting modification of CUBE worth mentioning. 

Take a look:

```
SELECT
  location,
  gender,
  risk,
  AVG(age) AS avg_age
FROM vaccine_administration
GROUP BY 
  CUBE ((location, gender), risk);
```

Inside CUBE's parentheses, we put location and gender in another pair of parentheses! 

This means that location and gender will be treated as a single column – either both or neither of them will be used for grouping:

```
GROUP BY CUBE((location, gender), risk) =
GROUP BY location, gender, risk +
GROUP BY location, gender +
GROUP BY risk +
GROUP BY ()
```

Run the template query above. 

Note that each row is grouped by either location and gender together or by neither of these columns.
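The "either both or neither" behaviour can be sketched in Python by treating each parenthesised group as one unit. This is only an illustration of the expansion; the function name `cube_sets` and the way units are passed as tuples are choices made for this example.

```python
from itertools import combinations


def cube_sets(units):
    """CUBE over units; each unit is a tuple of columns kept or dropped together."""
    result = []
    for size in range(len(units), -1, -1):
        for chosen in combinations(units, size):
            # Flatten the chosen units back into a plain list of columns.
            result.append(tuple(col for unit in chosen for col in unit))
    return result


levels = cube_sets([("location", "gender"), ("risk",)])
for level in levels:
    print("GROUP BY", level)
```

With two units instead of three separate columns, CUBE creates 4 grouping levels rather than 8.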

# CUBE with COALESCE()

Lastly, the function COALESCE() can be used in the SELECT clause to replace NULL values with the values of your choice:

```
SELECT
  COALESCE(location, '--') AS location,
  COALESCE(gender, '--') AS gender,
  COALESCE(risk, '--') AS risk,
  AVG(age) AS avg_age
FROM vaccine_administration
GROUP BY 
  CUBE (location, gender, risk);
```


In the query above, '--' will be shown when location, gender, or risk is NULL.

Run the template query and note how NULLs are replaced with '--'.

# CUBE Summary

1. GROUP BY CUBE () creates all possible grouping combinations with the columns inside its parentheses.

2. The order of columns within the parentheses of CUBE doesn't matter.

3. You can use the GROUPING() function to show if the column is included in the grouping.

4. You can leave some columns outside CUBE. Such columns will always be used for grouping.

5. You can use additional pairs of parentheses inside CUBE to indicate that certain columns should be treated as a single column by the CUBE clause.

6. You can use COALESCE() to replace NULL values with something more meaningful.

# GROUPING SETS

So far, you've learned how to use CUBE and ROLLUP for advanced reporting purposes. 

This time, we'll present GROUPING SETS – the third and last GROUP BY extension in this course.

Go here for the tables: https://www.db-fiddle.com/f/cqpaYtkTHQdbdUPsJ5s1EK/1

### Warranty Repair table

We'll use a table named warranty_repair in our examples. It has the following columns:

- id – the ID of the repair.
- customer_id – the ID of the customer who ordered a given repair.
- repair_center – denotes which warranty center repaired the device (USA, Germany, or Japan).
- date_received – the date when the repair center received the device for repair.
- repair_duration – the number of days it took to repair the device.
- repair_cost – the USD cost of spare parts used to repair the device (0 in the case of software problems).

```
SELECT 
  * 
FROM warranty_repair;
```

### Loan table

Now let's take a look at the table you'll be working with in the exercises. 

It's named loan and it contains information on loans made by an imaginary bank to imaginary clients.

- id – the ID of the loan.
- client_id – the ID of the client.
- sales_person – the first and last name of the sales person.
- country – the country where the loan was made.
- year – the year in which the loan was taken out.
- quarter – the yearly quarter when the loan was taken out.
- principal – the amount borrowed.
- interest – the interest rate on the loan.

```
SELECT
  * 
FROM loan;
```

# Multiple grouping with UNION ALL


All right, let's get started with GROUPING SETS.

Imagine that you need to create a report on the average repair duration for two grouping levels:

1. per customer_id and repair_center.
2. per date_received.

You don't want to create any additional grouping levels because you need the report to stay clear and simple.

Neither ROLLUP nor CUBE alone can produce exactly these two grouping levels and nothing else. One thing you could do is write two separate queries and combine them with UNION ALL:

```
SELECT
  NULL AS date_received, 
  customer_id, 
  repair_center, 
  AVG(repair_duration) AS avg_repair_duration
FROM warranty_repair
GROUP BY customer_id, repair_center
UNION ALL
SELECT
  date_received, 
  NULL, 
  NULL, 
  AVG(repair_duration) AS avg_repair_duration
FROM warranty_repair
GROUP BY date_received
```

Note that UNION ALL requires both queries to have the same number of columns, with compatible types in each position. This is why we needed to add some NULLs as columns in the SELECT clause.
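Here is a small, self-contained way to try the UNION ALL approach. It uses Python's built-in `sqlite3` module purely for convenience, and the three rows are invented sample data, not the contents of the course table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warranty_repair ("
    "customer_id INTEGER, repair_center TEXT, "
    "date_received TEXT, repair_duration INTEGER)"
)
# Made-up sample rows, just enough to see both grouping levels.
conn.executemany(
    "INSERT INTO warranty_repair VALUES (?, ?, ?, ?)",
    [
        (1, "USA", "2020-01-05", 4),
        (1, "USA", "2020-01-06", 6),
        (2, "Japan", "2020-01-05", 2),
    ],
)

rows = conn.execute(
    """
    SELECT NULL AS date_received, customer_id, repair_center,
           AVG(repair_duration) AS avg_repair_duration
    FROM warranty_repair
    GROUP BY customer_id, repair_center
    UNION ALL
    SELECT date_received, NULL, NULL, AVG(repair_duration)
    FROM warranty_repair
    GROUP BY date_received
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

You should see two rows per grouping level: one for each customer and repair center pair, and one for each date.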


# Multiple groupings with GROUPING SETS

As you saw, using UNION ALL is one way of dealing with such reports, but it has several drawbacks:

1. The statement becomes huge as more queries are added with UNION ALL
2. The table has to be accessed multiple times, which affects performance
3. Adding NULL values in the SELECT clause is awkward and error prone

That's where GROUPING SETS come in handy. 

They allow you to perform multiple groupings within a single query; each grouping is explicitly stated, as we see below:

```
SELECT
  date_received, 
  customer_id,   
  repair_center, 
  AVG(repair_duration) AS avg_repair_duration
FROM warranty_repair
GROUP BY GROUPING SETS
(
  (customer_id, repair_center),
  (date_received)
)
```

As you can see, GROUP BY GROUPING SETS is followed by a pair of parentheses. 

Inside, we put all the grouping combinations we wish to get, each in a separate pair of parentheses and separated by a comma.
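To get a feel for what GROUPING SETS computes, here is a plain Python sketch that averages a value for several grouping sets in a single pass over the rows. It is a conceptual model under simple assumptions (rows as dictionaries, invented sample data, a made-up function name `grouping_sets_avg`), not a description of how any database executes the query.

```python
from collections import defaultdict


def grouping_sets_avg(rows, columns, grouping_sets, value):
    """Average `value` for every grouping set in one pass over rows.

    Columns that are not part of the current grouping set are reported
    as None, mirroring the NULLs SQL shows in those positions.
    """
    totals = defaultdict(lambda: [0, 0])  # (set, key) -> [sum, count]
    for row in rows:
        for gset in grouping_sets:
            key = tuple(row[c] if c in gset else None for c in columns)
            totals[(gset, key)][0] += row[value]
            totals[(gset, key)][1] += 1
    return [key + (s / n,) for (_, key), (s, n) in totals.items()]


# Invented sample data with the same column names as warranty_repair.
rows = [
    {"customer_id": 1, "repair_center": "USA", "date_received": "2020-01-05", "repair_duration": 4},
    {"customer_id": 1, "repair_center": "USA", "date_received": "2020-01-06", "repair_duration": 6},
    {"customer_id": 2, "repair_center": "Japan", "date_received": "2020-01-05", "repair_duration": 2},
]
columns = ["date_received", "customer_id", "repair_center"]
sets = [("customer_id", "repair_center"), ("date_received",)]

report = grouping_sets_avg(rows, columns, sets, "repair_duration")
for line in report:
    print(line)
```

Unlike the UNION ALL version, the rows are read only once and no placeholder NULL columns have to be written by hand.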

# Using an empty grouping set

Let's say we also need to add the general average repair duration to our report. 

To that end, we can use an empty pair of parentheses:

```
SELECT
  date_received, 
  customer_id, 
  repair_center, 
  AVG(repair_duration) AS avg_repair_duration
FROM warranty_repair
GROUP BY GROUPING SETS
(
  (customer_id, repair_center),
  (date_received),
  ()
)
```

# Exercise

Find the average principal (as avg_principal) and average interest (as avg_interest) for the following grouping combinations:

1. year and quarter
2. country
3. None (overall averages)

```
SELECT
  year,
  quarter,
  country,
  AVG(principal) AS avg_principal,
  AVG(interest) AS avg_interest
FROM loan
GROUP BY GROUPING SETS
(
  (year, quarter),
  (country),
  ()
)
```

# GROUPING SETS with GROUPING()

Of course, GROUPING SETS works with the GROUPING() function too:

```
SELECT
  date_received,
  customer_id,
  repair_center,
  AVG(repair_duration) AS avg_repair_duration,
  GROUPING(date_received) AS D,
  GROUPING(customer_id) AS C,
  GROUPING(repair_center) AS R
FROM warranty_repair
GROUP BY GROUPING SETS
(
  (customer_id, repair_center),
  (date_received)
)
```

# Exercise

Find the average interest amounts for the following grouping levels:

- year and quarter
- country

Show the following columns in the query result: year, quarter, country, avg_interest, Y, Q, and C. 

The Y, Q, and C columns show if the columns year, quarter, and country were used in the grouping.

```
SELECT
  year,
  quarter,
  country,
  AVG(interest) AS avg_interest,
  GROUPING(year) AS Y,
  GROUPING(quarter) AS Q,
  GROUPING(country) AS C
FROM loan
GROUP BY GROUPING SETS
(
  (quarter, year),
  (country)
)
```

# GROUPING SETS with COALESCE()

What can we do with those NULL values? Naturally, we can use COALESCE():

```
SELECT
  date_received,
  customer_id,
  COALESCE(repair_center, 'ALL') AS repair_center,
  AVG(repair_duration) AS avg_repair_duration
FROM warranty_repair
GROUP BY GROUPING SETS
(
  (customer_id, repair_center),
  (date_received)
)
```

In the query above, any NULL values in the repair_center column will be replaced with the word ALL.

# Exercise

Show the sum of all principal amounts for two grouping sets:

1. sales_person
2. country

Show the following columns in the query result: sales_person, country, and Sumprincipal.

Instead of NULL values in the columns sales_person and country, show a double dash (--).

```
SELECT
  COALESCE(sales_person, '--') AS sales_person,
  COALESCE(country, '--') AS country,
  SUM(principal) AS Sumprincipal
FROM loan
GROUP BY GROUPING SETS
(
  (sales_person),
  (country)
)
```

# GROUPING SETS with ROLLUP and CUBE

One more interesting thing we can do with GROUPING SETS is combine them with ROLLUP or CUBE:

```
SELECT
  date_received,
  customer_id,
  repair_center,
  AVG(repair_duration) AS avg_repair_duration
FROM warranty_repair
GROUP BY GROUPING SETS
(
  ROLLUP (customer_id, repair_center),
  (date_received)
)
```

The query above will merge ROLLUP grouping levels with a grouping level based on date_received. As a result, we'll get the following grouping combinations:

<img src='https://learnsql.com/static/sql-group-by-extensions-1.png'>
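In case the image does not load, the expansion can also be worked out with a short Python sketch. It assumes that a ROLLUP placed inside GROUPING SETS simply contributes all of its own grouping sets to the list; the helper names are invented for this example.

```python
def rollup_sets(columns):
    """Grouping sets produced by ROLLUP: drop columns from the right."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def grouping_sets(*items):
    """Flatten GROUPING SETS items into one list of grouping sets.

    An item is either a single grouping set (a tuple) or a list of sets,
    such as the result of rollup_sets().
    """
    result = []
    for item in items:
        if isinstance(item, list):
            result.extend(item)
        else:
            result.append(item)
    return result


levels = grouping_sets(
    rollup_sets(["customer_id", "repair_center"]),
    ("date_received",),
)
for level in levels:
    print("GROUP BY", level)
```

This gives four levels: by customer_id and repair_center, by customer_id, the overall total, and by date_received.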

# Summary

It's time to wrap things up.

1. GROUP BY GROUPING SETS( ) works by creating the grouping levels provided explicitly within its parentheses.
2. You can use a pair of empty parentheses ( ) to introduce a general average, sum, etc.
3. The GROUPING SETS extension can be combined with COALESCE().
4. You can nest ROLLUP and CUBE inside GROUPING SETS to create even more sophisticated grouping combinations.


# Homework

## Roll Up

Use this fiddle: https://www.db-fiddle.com/f/q1yM9pKcshbvZehgqfMJGd/0

# Exercise

Show how much was spent on average for each supplier in each category, for each supplier in general, and in general among all suppliers and categories.

Show the following columns: supplier, category, the average total_price (rename the column to avg_price), and two new columns (S and C) denoting whether the columns supplier or category are used in the grouping (0 if used, 1 otherwise).

```
SELECT
  supplier,
  category,
  AVG(total_price) as avg_price,
  GROUPING(supplier) as S,
  GROUPING(category) as C
FROM delivery
GROUP BY ROLLUP(supplier, category)
```

## Exercise

Show how much was spent on average for each category on each day and on each day in general. 

Show the following columns: category, delivery_date, and the average total_price (name the column avg_price). 

Do not show a single general average among all days and categories.

```
SELECT 
  category, 
  delivery_date,
  AVG(total_price) AS avg_price
FROM delivery
GROUP BY 
  ROLLUP(category),
  delivery_date;
```

## Exercise

Show how much was spent on average:

- in each category on each day.
- in each category.
- on each day.
- in general.

Show the following columns: category, delivery_date, and the average total_price (as avg_price).

```
SELECT 
  category, 
  delivery_date,
  AVG(total_price) AS avg_price
FROM delivery
GROUP BY 
  ROLLUP(delivery_date), 
  ROLLUP(category);
```

## Exercise

Show the total price paid for all deliveries in each category and a grand total for all categories. 

Show the category column and the sum of the total_price. 

The column names should be category and total. 

Instead of NULL values in the category column, show two hyphens (--).

```
SELECT 
  COALESCE(category, '--') AS category,
  SUM(total_price) AS total
FROM delivery
GROUP BY ROLLUP(category);
```

# CUBE Homework

Use this fiddle: https://www.db-fiddle.com/f/wMiWiiZBpbXDeE3362U998/0

# Exercise

Show the average wildfire duration for all grouping combinations based on the year and month columns. 

Also, group each row by cause.

Show the following columns in the query result: year, month, cause, and avg_duration.

```
SELECT
  year,
  month,
  cause,
  AVG(duration) AS avg_duration
FROM wildfire_incident
GROUP BY 
  CUBE (year, month), 
  cause;
```

# Exercise

Show the sum of damage_repair_cost for grouping combinations based on the columns year, month, and cause. 

Treat year and month as a single column.

Show the following columns in the query result: year, month, cause, and sum_damage_repair_cost.

```
SELECT
  year,
  month,
  cause,
  SUM(damage_repair_cost) AS sum_damage_repair_cost
FROM wildfire_incident
GROUP BY CUBE ((year, month), cause);
```

# Exercise

Show the average damage repair costs for all grouping combinations based on the year and month columns. 

Group all rows by cause. 

Replace any NULL values in month with the word 'ALL'.

Show the following columns in the query result: year, month, cause, and avg_damage_repair_cost.

```
SELECT
  year, 
  COALESCE(month, 'ALL') AS month,
  cause,
  AVG(damage_repair_cost) AS avg_damage_repair_cost
FROM wildfire_incident
GROUP BY 
  CUBE (year, month), 
  cause;
```

# Grouping Sets 

Use this fiddle: https://www.db-fiddle.com/f/cqpaYtkTHQdbdUPsJ5s1EK/1

# Exercise


Find the average principal amounts for the following two grouping combinations:

1. year and quarter
2. country

Show the following columns in the query result: year, quarter, country, and avg_principal.

```
SELECT
  year,
  quarter,
  NULL AS country,
  AVG(principal) AS avg_principal
FROM loan
GROUP BY year, quarter
UNION ALL
SELECT
  NULL AS year,
  NULL AS quarter,
  country,
  AVG(principal)
FROM loan
GROUP BY country;
```

# Exercise

Show the average interest amounts for the following grouping sets:

1. ROLLUP by year and quarter
2. country

Show the following columns in the query result: year, quarter, country, and avg_interest.

```
SELECT
  year,
  quarter,
  country,
  AVG(interest) AS avg_interest
FROM loan
GROUP BY GROUPING SETS
(
  ROLLUP (year, quarter),
  (country)
)
```