## Aggregate functions

Instead, you might want to know how many countries are on each continent. Or, to expand on the query, the average population across all countries on each continent. This process of grouping by one column, then counting or performing some other operation on another column, is called aggregating.

Aggregate functions include:

* COUNT
* SUM
* AVG
* MAX and MIN


SUM and AVG can only be used with numeric types, while MAX and MIN can be used with numbers, dates/times, and even strings (though you will need to think about what it means to return a 'smaller' or 'larger' string). COUNT works with any data type and returns the number of non-null values in that column; COUNT(*) counts rows regardless of nulls.
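As a rough illustration of these rules, the sketch below builds a tiny in-memory SQLite table with Python's built-in `sqlite3` module. The rows are invented toy values, not real country data, and the table only mimics a few of the column names used in this lesson.

```python
import sqlite3

# Toy stand-in for the country table; every name and number is invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE country (name TEXT, continent TEXT, population INTEGER, indepyear INTEGER)"
)
conn.executemany(
    "INSERT INTO country VALUES (?, ?, ?, ?)",
    [
        ("Alpha", "North", 10, 1900),
        ("Bravo", "North", 30, None),  # NULL indepyear
        ("Charlie", "South", 50, 1950),
    ],
)

rows = conn.execute(
    """
    SELECT continent,
           COUNT(*)         AS n_rows,       -- counts every row in the group
           COUNT(indepyear) AS n_indepyear,  -- skips NULL values
           SUM(population)  AS total_pop,
           AVG(population)  AS avg_pop,
           MIN(name)        AS first_name    -- strings compare alphabetically here
    FROM country
    GROUP BY continent
    ORDER BY continent
    """
).fetchall()

for row in rows:
    print(row)
```

In the toy "North" group, `COUNT(*)` reports 2 rows while `COUNT(indepyear)` reports 1, because one of the two rows has a null `indepyear`.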

In [1]:
# SELECT continent
# FROM country
# GROUP BY continent;


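# -- name is neither grouped nor aggregated here: many databases reject this
# -- query, and others return an arbitrary name for each continent.
# SELECT name, continent
# FROM country
# GROUP BY continent;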


# SELECT COUNT(name), continent
# FROM country
# GROUP BY continent;


# SELECT SUM(population), continent
# FROM country
# GROUP BY continent;


# SELECT AVG(lifeexpectancy), continent
# FROM country
# GROUP BY continent;


# SELECT MAX(indepyear), continent
# FROM country
# GROUP BY continent;


# SELECT COUNT(*) as big_countries
# FROM country
# WHERE population > 100000000;

# SELECT SUM(population) as the_whole_world
# FROM country;

## SELECT DISTINCT

SELECT DISTINCT has some similarities to GROUP BY. It is a simple way to return all unique values of a given field. It is not meant to be used with aggregate functions; instead, its typical purpose is to remove duplicates.

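Suppose you're interested in which types of government appear on each continent. If you query continent and governmentform directly, you get one row per country, so the same combination shows up many times; adding DISTINCT returns each combination only once.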

In [2]:
# SELECT continent, governmentform
# FROM country
# ORDER BY continent, governmentform;


# SELECT DISTINCT continent, governmentform
# FROM country
# ORDER BY continent, governmentform;
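To make the difference concrete, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory table. The rows are invented toy values, not the real country data. It runs the plain query, the DISTINCT query, and, for comparison, a GROUP BY with COUNT, which is the tool to reach for when you also want to know how common each combination is.

```python
import sqlite3

# Toy stand-in for the country table; every row is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT, continent TEXT, governmentform TEXT)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?, ?)",
    [
        ("Alpha", "North", "Republic"),
        ("Bravo", "North", "Republic"),
        ("Charlie", "North", "Monarchy"),
        ("Delta", "South", "Republic"),
    ],
)

# One row per country, so repeated combinations appear repeatedly.
all_pairs = conn.execute(
    "SELECT continent, governmentform FROM country "
    "ORDER BY continent, governmentform"
).fetchall()

# Each (continent, governmentform) combination appears once.
unique_pairs = conn.execute(
    "SELECT DISTINCT continent, governmentform FROM country "
    "ORDER BY continent, governmentform"
).fetchall()

# GROUP BY yields the same combinations and can also count them.
counted_pairs = conn.execute(
    "SELECT continent, governmentform, COUNT(*) FROM country "
    "GROUP BY continent, governmentform "
    "ORDER BY continent, governmentform"
).fetchall()

print(len(all_pairs), len(unique_pairs))
print(counted_pairs)
```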

1. Use an aggregate function to return the number of countries that became independent in the year 1918.

In [3]:
# agg_1918 = '''
# SELECT COUNT(name) FROM country WHERE indepyear = 1918;
# '''


In [4]:
# agg_1918 = '''
# SELECT COUNT(code)
# FROM country
# WHERE indepyear = 1918;
# '''

2. Write a query that returns the average population of countries whose government is a Constitutional Monarchy.

In [5]:
# agg_constmon = '''
# SELECT AVG(population) FROM country WHERE governmentform IN ('Constitutional Monarchy');
# '''

In [6]:
# agg_constmon = '''
# SELECT avg(population)
# FROM country
# WHERE governmentform = 'Constitutional Monarchy';
# '''

In [7]:
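# -- LIKE with % wildcards also matches any government form that merely
# -- contains the phrase, so it can average over more countries than = does.
# SELECT avg(population)
# FROM country
# WHERE governmentform LIKE '%Constitutional Monarchy%';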
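The exact-match and LIKE versions of this query are not interchangeable. The sketch below, which assumes an invented toy table in an in-memory SQLite database (via Python's built-in `sqlite3` module), shows how a wildcard pattern can pull in extra government forms and change the average.

```python
import sqlite3

# Toy stand-in for the country table; populations are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT, governmentform TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?, ?)",
    [
        ("Alpha", "Constitutional Monarchy", 10),
        ("Bravo", "Constitutional Monarchy", 20),
        ("Charlie", "Constitutional Monarchy, Federation", 60),
        ("Delta", "Republic", 500),
    ],
)

# Exact match: only rows whose governmentform is exactly this string.
exact_avg = conn.execute(
    "SELECT AVG(population) FROM country "
    "WHERE governmentform = 'Constitutional Monarchy'"
).fetchone()[0]

# Wildcard match: any governmentform containing the phrase.
like_avg = conn.execute(
    "SELECT AVG(population) FROM country "
    "WHERE governmentform LIKE '%Constitutional Monarchy%'"
).fetchone()[0]

print(exact_avg, like_avg)
```

Which version is right depends on whether variants of the government form should count toward the answer.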

3. Write a query that returns each continent and the area of the largest country in that continent.

In [8]:
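# Caveat: IN matches any country whose area equals some continent's maximum,
# even if that maximum belongs to a different continent.
# agg_areas = '''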
# SELECT continent, surfacearea
# FROM country
# WHERE surfacearea
# IN (SELECT MAX(surfacearea) FROM country GROUP BY continent);
# '''

In [None]:
# agg_areas = '''
# SELECT continent, MAX(surfacearea)
# FROM country
# GROUP BY continent;
# '''
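The two attempts at this exercise can return different rows. As a hedged sketch of why, the example below uses Python's built-in `sqlite3` module with an invented toy table in which a smaller "North" country happens to have the same area as the largest "South" country.

```python
import sqlite3

# Toy stand-in for the country table; areas are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT, continent TEXT, surfacearea REAL)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?, ?)",
    [
        ("Alpha", "North", 100.0),
        ("Bravo", "North", 50.0),   # same area as the largest South country
        ("Charlie", "South", 50.0),
        ("Delta", "South", 20.0),
    ],
)

# Subquery version: keeps every row whose area equals ANY continent's maximum.
in_rows = conn.execute(
    "SELECT continent, surfacearea FROM country "
    "WHERE surfacearea IN (SELECT MAX(surfacearea) FROM country GROUP BY continent) "
    "ORDER BY continent, surfacearea DESC"
).fetchall()

# GROUP BY version: exactly one row per continent.
grouped_rows = conn.execute(
    "SELECT continent, MAX(surfacearea) FROM country "
    "GROUP BY continent ORDER BY continent"
).fetchall()

print(in_rows)
print(grouped_rows)
```

With this toy data the subquery version returns an extra "North" row, while the GROUP BY version returns one row per continent, which is what the exercise asks for.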