A major advantage that SQL-based systems have over NoSQL
data storage solutions is how intuitive grouping and aggregation are in the
former category.

# Aggregate Functions

An aggregate function computes a single summary value from the rows
of a table or tables.

We have already seen the COUNT aggregate function. There are other aggregate
functions in SQL, such as:

* AVG for calculating averages;
* SUM for computing totals;
* and the extreme functions MAX and MIN for finding maximum and
minimum values respectively.

COUNT works with all data types and the extreme functions work with any type
that can be sorted, such as numbers, strings and dates, but
functions like SUM and AVG make sense only with numeric types.
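As a quick illustration of the functions listed above, here is a minimal sketch that runs all five in a single query. The table `agg_demo` and its rows are hypothetical and exist only for this example; they are not part of the chapter's `proglang_tbl`.

```sql
-- Hypothetical scratch table; names and rows are illustrative only.
CREATE TEMP TABLE agg_demo (name varchar(20), score integer);

INSERT INTO agg_demo (name, score) VALUES
    ('alpha', 10),
    ('beta',  20),
    ('gamma', 30);

SELECT COUNT(*)   AS how_many,    -- 3
       AVG(score) AS average,     -- 20, shown with trailing decimal zeros
       SUM(score) AS total,       -- 60
       MAX(score) AS highest,     -- 30
       MIN(name)  AS first_name   -- alpha (strings compare alphabetically)
FROM agg_demo;
```

Each aggregate collapses the three rows into one value, so the query returns a single row.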


Let us find the average year of creation in our
programming languages table:

```
tesdb=# SELECT AVG(year) FROM proglang_tbl;
```

The result is a decimal number, which PostgreSQL displays with 16 digits
after the decimal point by default.

A value like this is not a useful way to specify a year, so
we cast this average value to an integer:

```
tesdb=# SELECT CAST(AVG(year) AS INTEGER)
        FROM proglang_tbl;
```
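One detail worth knowing is that PostgreSQL rounds to the nearest whole number when it casts a numeric value to an integer; it does not simply drop the fraction. The sketch below uses literal values (chosen only for illustration) so that it does not depend on the contents of any table:

```sql
SELECT CAST(1972.4 AS INTEGER) AS rounds_down,    -- 1972
       CAST(1972.6 AS INTEGER) AS rounds_up,      -- 1973
       ROUND(1972.6)           AS still_numeric,  -- 1973, but the type stays numeric
       TRUNC(1972.6)           AS truncated;      -- 1972, fraction discarded
```

If the fraction should always be discarded, applying TRUNC before the cast is one option.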

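Note that CAST only works when the value can be converted to the target type,
as is the case with numerics and integers. Trying to cast the language names
to integers results in an error: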

```
tesdb=# SELECT CAST(language AS INTEGER)
        FROM proglang_tbl;
```

We can find the sum of the year values in our table in the same way:

```
tesdb=# SELECT SUM(year)
        FROM proglang_tbl;
```

Using SUM on a varchar field in PostgreSQL results in an error, since the
function is not defined for character types:

```
tesdb=# SELECT SUM(language)
        FROM proglang_tbl;
```

# Using the Extreme Functions – MAX and MIN

The MAX and MIN functions find the extreme values from a set of column values.

```
tesdb=# SELECT MIN(year)
        FROM proglang_tbl;

tesdb=# SELECT MAX(year)
        FROM proglang_tbl;

tesdb=# SELECT MAX(year),
        MIN(year)
        FROM proglang_tbl;

tesdb=# SELECT MAX(year),
        MIN(language)
        FROM proglang_tbl;        
```

In the last query, MIN(language) compares the language names alphabetically.
Notice that APT was not chosen as the minimum, since L < T when comparing letters.
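The character-by-character comparison can be tried in isolation. This sketch feeds three literal strings to MIN and MAX through a VALUES list; the alias `t(lang)` is made up for the example. Keep in mind that the ordering of mixed-case or accented strings depends on the database collation, so only the simple case is shown here.

```sql
-- 'APL' and 'APT' share the prefix 'AP'; the third letter decides (L < T).
SELECT MIN(lang) AS smallest,   -- APL
       MAX(lang) AS largest     -- Tcl
FROM (VALUES ('APT'), ('APL'), ('Tcl')) AS t(lang);
```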

# Grouping Data

The GROUP BY clause of a SELECT query is used to group records based
upon their field values. This clause is placed after the WHERE conditional.


```
tesdb=# INSERT INTO proglang_tbl
        (id, language, author, year, standard)
        VALUES
        (8, 'Fortran', 'Backus', 1957, 'ANSI');

tesdb=# SELECT standard FROM proglang_tbl
        WHERE standard IS NOT NULL
        GROUP BY standard;

tesdb=# SELECT standard FROM proglang_tbl
        WHERE standard IS NOT NULL;      
```

Let’s try to add the
language column to the output of the grouped query above:


```
tesdb=# SELECT standard,
        language
        FROM proglang_tbl
        WHERE standard IS NOT NULL
        GROUP BY standard;
```

The database engine gives us an error for this query. This makes sense:
while the grouping clause bunches the rows with the same standard together,
it is ambiguous which language should be displayed alongside each standard.

This leads to the rule that the columns listed in
the SELECT clause must be present in the GROUP BY clause, unless they
appear inside an aggregate function.

```
tesdb=# SELECT standard,
        language
        FROM proglang_tbl
        WHERE standard IS NOT NULL
        GROUP BY standard, language;
```

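To summarize the rules of thumb for grouping:

1. Group only by columns that are present in the SELECT list. PostgreSQL does
accept grouping by a column that is not selected, but the output then does
not show which group each row belongs to.
2. You must specify in the grouping clause every column of the SELECT list
that is not inside an aggregate function.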

# Grouping and Aggregate Functions

Let us try combining the GROUP BY clause with the COUNT aggregate function
to count the rows in each group:

```
tesdb=# SELECT standard,
        COUNT(*)
        FROM proglang_tbl
        GROUP BY standard;

tesdb=# SELECT year,
        MIN(language),
        COUNT(*)
        FROM proglang_tbl
        GROUP BY year;
```
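Several aggregates can be computed per group in one pass. The following is a small sketch on a hypothetical table, `group_demo`, whose placeholder rows are invented for this example so that the result is easy to check by hand:

```sql
-- Hypothetical scratch table; the rows are placeholders, not real data.
CREATE TEMP TABLE group_demo (
    language varchar(20),
    year     integer,
    standard varchar(10)
);

INSERT INTO group_demo (language, year, standard) VALUES
    ('lang_a', 1960, 'ANSI'),
    ('lang_b', 1970, 'ANSI'),
    ('lang_c', 1980, 'ISO');

SELECT standard,
       COUNT(*)  AS languages,
       MIN(year) AS earliest,
       MAX(year) AS latest
FROM group_demo
GROUP BY standard
ORDER BY standard;
-- standard | languages | earliest | latest
-- ANSI     |         2 |     1960 |   1970
-- ISO      |         1 |     1980 |   1980
```

The ORDER BY is only there to make the row order predictable; GROUP BY on its own does not guarantee any particular order.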

# The HAVING Clause

While the WHERE clause places conditions on the fields of a query, the HAVING
clause places conditions on the groups created by GROUP BY. It must be
placed immediately after the GROUP BY clause but before the ORDER BY clause.

```
tesdb=# SELECT language,
        standard,
        year
        FROM proglang_tbl
        GROUP BY standard,
        year,
        language
        HAVING year < 1980;
```

You might wonder why we need two different filtering clauses: WHERE
and HAVING. The reason is that a WHERE clause does not allow aggregate
functions in its conditionals, whereas conditions on aggregates are a prime
target for the HAVING clause.


```
tesdb=# SELECT standard
        FROM proglang_tbl
        WHERE COUNT(standard) > 1
        GROUP BY standard;

tesdb=# SELECT standard
        FROM proglang_tbl
        GROUP BY standard
        HAVING COUNT(standard) > 1;        
```

The first query fails, since WHERE cannot contain an aggregate function. The
second one correctly gives us the two standard values with
more than one occurrence. Interestingly, if we tweak the conditional to
COUNT(*), we get an additional row:

```
tesdb=# SELECT standard
        FROM proglang_tbl
        GROUP BY standard
        HAVING COUNT(*) > 1;
```
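The additional row presumably comes from the languages whose standard is NULL. GROUP BY gathers all NULL values into a single group, but COUNT(standard) ignores NULLs and therefore reports zero for that group, whereas COUNT(*) counts rows regardless of their content. A minimal sketch on a hypothetical table, `null_demo`, with invented placeholder rows, shows the difference side by side:

```sql
-- Hypothetical scratch table; two rows deliberately have no standard.
CREATE TEMP TABLE null_demo (language varchar(20), standard varchar(10));

INSERT INTO null_demo (language, standard) VALUES
    ('lang_a', 'ANSI'),
    ('lang_b', 'ANSI'),
    ('lang_c', NULL),
    ('lang_d', NULL);

SELECT standard,
       COUNT(standard) AS non_null_values,
       COUNT(*)        AS all_rows
FROM null_demo
GROUP BY standard
ORDER BY standard;   -- ascending order lists the NULL group last by default
-- standard | non_null_values | all_rows
-- ANSI     |               2 |        2
-- (null)   |               0 |        2
```

With HAVING COUNT(standard) > 1 only the ANSI group would pass, while HAVING COUNT(*) > 1 would let the NULL group through as well.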