# Introduction to SQL for Excel Users – Part 7: Basic Groups

[Original post](https://www.daveondata.com/blog/introduction-to-sql-for-excel-users-part-7-basic-groups/)

## Groups in Excel

Grouping data is an extremely common and important analysis technique in Excel.

For example, grouping is the basis for Excel pivot tables.

As usual, I’m going to start with simple concepts and build upon them slowly – your continued patience is appreciated! 😁

Using the CallCenter data I can create a group by filtering on CallCenter.Shift = "PM1":

![excel group](07\excelgroup1.png)

Grouping by filtering an Excel table.

The image is a bit small; open it in a new browser tab to see it in all of its glory.

To make things easier, I will just consider LevelOneOperators for this post.

![excel group operators](07\excelgroup2.png)

Filtering tables in Excel.

Think of the ☝ screenshot as showing one possible group for a column: in this case, the group corresponding to the value "PM1" in the CallCenter.Shift column.

We can use Excel to do all kinds of interesting things with groups.

For example, we could find the total sum of LevelOneOperators for the above group (i.e., 81) or the average value of LevelOneOperators for the above group (i.e., 2.7).
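
For comparison, here is a rough sketch of the same single-group calculation in SQL. It assumes the FactCallCenter table used later in this post, a WHERE filter on the Shift column, and that LevelOneOperators is stored as a whole number (hence the CAST, since SQL Server's AVG on integers returns an integer):

```sql
-- Sketch only: filter down to one group, then summarize it.
SELECT SUM(FCC.LevelOneOperators) AS TotalLevelOneOperators
      ,AVG(CAST(FCC.LevelOneOperators AS DECIMAL(10, 2))) AS AvgLevelOneOperators
FROM FactCallCenter FCC
WHERE FCC.Shift = 'PM1'
```

Like the Excel filter, this only handles one value of Shift at a time.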

Using basic Excel tables, I could repeat this filtering process for other values of CallCenter.Shift (e.g., filtering for the value of "AM"), but Excel provides a better way of working with groups.

Enter the mighty Excel pivot table…

## Pivot Tables Work With Groups

A good way to think of Excel pivot tables is as a short-cut mechanism for working with multiple groups of data at the same time.

Everything that is done in Excel pivot tables can be done with basic tables; it’s just a lot more work.

Take the following pivot table built with CallCenter data:

![pivot table 1](07\pivottable1.png)

Excel pivot table.

The above pivot table finds every group in the CallCenter.Shift column and then calculates the sum of LevelOneOperators for each group.

Pivot tables are awesome as they quickly allow us to analyze data in a number of ways (e.g., sums, averages, min/max values, etc.) by important business categories (i.e., groups).

Right now you might be 🥱. Apologies, this is important.

We can add to the above pivot table quite easily for more insights:

![pivot table 2](07\pivottable2.png)

Excel pivot table with min values.

The above table can be thought of as, “group the data by the CallCenter.Shift column and then calculate the SUM, MIN, and MAX of the LevelOneOperators column for each group.”

Not surprisingly, groups of data are also important in SQL…

## Groups In SQL

The following is arguably the single most useful SQL snippet for the analytics professional:

```
GROUP BY
```

Think of this snippet as the SQL equivalent of Excel pivot tables.

And we all know how useful Excel pivot tables are for analyzing data!

Take the following SQL snippet:

```
FROM FactCallCenter FCC
GROUP BY FCC.Shift
```

If this doesn’t look familiar, go back and check out previous posts.

The above SQL snippet tells the DB that I want to pull data from the FactCallCenter table (using an alias of FCC) and I want the data grouped by the Shift column.

A rule that you have to follow is that any column in the SELECT list that is not inside an aggregate function must also appear in the GROUP BY. Listing a column in the GROUP BY is conceptually the same as putting that column in the Rows of an Excel pivot table.
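
As a minimal sketch of that relationship between the SELECT list and the GROUP BY, assuming SQL Server's behavior: the first query below is commented out because LevelOneOperators is neither grouped nor aggregated, so SQL Server would reject it. The second wraps the column in SUM (an aggregate function, covered further down), which makes it valid:

```sql
-- Not valid: LevelOneOperators is in the SELECT list, but it is neither
-- listed in the GROUP BY nor wrapped in an aggregate function.
-- SELECT FCC.Shift
--       ,FCC.LevelOneOperators
-- FROM FactCallCenter FCC
-- GROUP BY FCC.Shift

-- Valid: every non-aggregated column in the SELECT list (just Shift)
-- also appears in the GROUP BY.
SELECT FCC.Shift
      ,SUM(FCC.LevelOneOperators)
FROM FactCallCenter FCC
GROUP BY FCC.Shift
```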

Here is some legit, but not particularly useful, SQL:

```
SELECT FCC.Shift
FROM FactCallCenter FCC
GROUP BY FCC.Shift
```

Sweet! Notice how the above looks an awful lot like the Excel pivot tables above?

One thing I need to mention at this point.

SQL does not order data by default (e.g., ascending order). Think of the SSMS results ☝ being in ascending order alphabetically as a happy fluke.

If you want your data sorted, then you need to explicitly do so using the following SQL syntax:

```
ORDER BY
```

I will improve the query to make sure that the results are in ascending order by Shift value:

```
SELECT FCC.Shift
FROM FactCallCenter FCC
GROUP BY FCC.Shift
ORDER BY FCC.Shift ASC
```

Since the results of running the above query in SSMS are the same I won’t bother showing them again.

In the above SQL the ASC keyword denotes ascending order. You can also use DESC to denote descending order.
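
A quick sketch of the descending variant, using the same query as above with only the sort direction changed. ASC is also what you get when neither keyword is written:

```sql
SELECT FCC.Shift
FROM FactCallCenter FCC
GROUP BY FCC.Shift
ORDER BY FCC.Shift DESC  -- reverse alphabetical order by Shift value
```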

Moving on.

SQL provides a number of aggregate functions for working with groups in your queries.

Aggregate functions are special in that they always return a single value for each group.

This is conceptually the same as an Excel pivot table. It makes no sense to have more than one sum in a cell in a pivot table.

I will add some aggregate functions to the query to produce results that match the above pivot table:

```
SELECT FCC.Shift
      ,SUM(FCC.LevelOneOperators)
      ,MIN(FCC.LevelOneOperators)
      ,MAX(FCC.LevelOneOperators)
FROM FactCallCenter FCC
GROUP BY FCC.Shift
ORDER BY FCC.Shift ASC
```

Notice that unlike Excel, SQL aggregate functions do not produce column names by default.

I’ll need to add these myself:

```
SELECT FCC.Shift
      ,SUM(FCC.LevelOneOperators) AS TotalLevelOneOperators
      ,MIN(FCC.LevelOneOperators) AS MinLevelOneOperators
      ,MAX(FCC.LevelOneOperators) AS MaxLevelOneOperators
FROM FactCallCenter FCC
GROUP BY FCC.Shift
ORDER BY FCC.Shift ASC
```

Awesome!
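
If you want to check the grouping logic by hand, here is a small self-contained sketch. The six rows are made-up values (not the real CallCenter data), and the inline VALUES table with the alias Demo is an assumption that works in SQL Server 2008 and later:

```sql
-- Made-up sample rows standing in for FactCallCenter.
SELECT Demo.Shift
      ,SUM(Demo.LevelOneOperators) AS TotalLevelOneOperators
      ,MIN(Demo.LevelOneOperators) AS MinLevelOneOperators
      ,MAX(Demo.LevelOneOperators) AS MaxLevelOneOperators
FROM (VALUES ('AM',  2)
            ,('AM',  3)
            ,('PM1', 1)
            ,('PM1', 4)
            ,('PM1', 3)
            ,('PM2', 2)) AS Demo (Shift, LevelOneOperators)
GROUP BY Demo.Shift
ORDER BY Demo.Shift ASC
-- Expected rows (Shift, Total, Min, Max):
--   AM,  5, 2, 3
--   PM1, 8, 1, 4
--   PM2, 2, 2, 2
```

Six input rows collapse to three output rows, one per distinct Shift value, which is the same thing the pivot table does with its row labels.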

Just as pivot tables are a go-to analysis technique in Excel, so is GROUP BY in the SQL world.

You’ll use GROUP BY so often it will become second nature to think of the business data in your DB by groupings.

In my daily work it is rare for a query not to have at least one GROUP BY and most have more than one.

## The Learning Arc

In the next post I will continue coverage of the mighty GROUP BY.

The topic will be slicing/dicing data with multiple GROUP BY columns.

Stay healthy and happy data sleuthing!