# Intermediate SQL: Aggregating Data

Here you can access every table used in the course. To access a table, specify the `cinema` schema in your queries (e.g., `cinema.reviews` for the `reviews` table).

## Chapter 7: Summarizing Data
* To understand a dataset as a whole, use aggregate functions to summarize the data
* Aggregate functions perform a calculation on several values and return a single value
    * `COUNT`
    * `AVG`
    * `SUM`
    * `MIN`/`MAX`
* Aggregate functions operate on a field (column), not on individual records
* Many can be used with both numerical and nonnumerical fields
    * `AVG` and `SUM` are numerical only; the others accept various data types
* With nonnumerical fields, `MIN` and `MAX` return the first and last values in sort order (alphabetical for text, chronological for dates), and `COUNT` counts the non-missing values
    * NOT the nonnumerical values that appear the least or most often
* By default, the result column is named after the function, but best practice is to add an alias so it is clear what the results represent

In [10]:
-- Average
SELECT AVG(budget)
FROM cinema.films;
-- Sum
SELECT SUM(budget)
FROM cinema.films;
-- Min
SELECT MIN(budget)
FROM cinema.films;
-- Max
SELECT MAX(budget)
FROM cinema.films;
-- Aliasing
SELECT AVG(budget) AS average_budget
FROM cinema.films;
-- Calculate the average gross of films whose titles start with A
SELECT AVG(gross) AS avg_gross_A
FROM cinema.films
WHERE title LIKE 'A%';
-- Find the highest gross among films released between 2000 and 2012
SELECT MAX(gross) AS highest_gross
FROM cinema.films
WHERE release_year BETWEEN 2000 AND 2012;

| highest_gross |
|---|
| 760505847 |
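
The point about `MIN`, `MAX`, and `COUNT` on nonnumerical fields can be checked with a small sketch. The `sample_films` rows below are made up for illustration and are not taken from `cinema.films`; PostgreSQL is assumed.

```sql
-- Hypothetical sample rows (not course data), built inline so the query is self-contained
WITH sample_films (title, language) AS (
    VALUES
        ('Metropolis', 'German'),
        ('Wings',      'English'),
        ('Amelie',     'French'),
        ('Amelie',     'French'),   -- duplicate on purpose: French is the most frequent language
        ('Untitled',   NULL)        -- missing language
)
SELECT
    MIN(title)      AS first_title,          -- 'Amelie': alphabetically first
    MAX(title)      AS last_title,           -- 'Wings': alphabetically last
    MAX(language)   AS last_language,        -- 'German': alphabetically last, not the most frequent
    COUNT(language) AS films_with_language,  -- 4: the NULL is not counted
    COUNT(*)        AS total_rows            -- 5: every row is counted
FROM sample_films;
```

`MAX(language)` returns `German` even though `French` appears most often, which is the distinction the notes warn about.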


## Chapter 8: Summarizing Subsets
* Combine aggregate functions with `WHERE` to summarize a subset; `WHERE` executes before `SELECT`, so rows are filtered before the aggregate is calculated
* To clean up decimals, use the `ROUND` function to round to a specified number of decimal places
    * Two parameters: the number to round and how many decimal places to keep
        * `ROUND(number_to_round, decimal_places)`
        * If `decimal_places` is omitted, rounds to a whole number by default
    * To round to the left of the decimal point, use a negative number of decimal places
    * *Can only be used with numerical fields*

In [9]:
-- Example using AVG
SELECT AVG(budget) AS avg_budget
FROM cinema.films
WHERE release_year >= 2010;
-- Rounding
SELECT ROUND(AVG(budget),2) AS avg_budget
FROM cinema.films
WHERE release_year >= 2010;
-- Negative rounding
SELECT ROUND(AVG(budget), -5) AS avg_budget
FROM cinema.films
WHERE release_year >= 2010;
-- Calculate the average gross of films whose titles start with A
SELECT AVG(gross) AS avg_gross_A 
FROM cinema.films 
WHERE title LIKE 'A%';

| avg_gross_a |
|---|
| 47893240.0 |
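
The three ways of calling `ROUND` described in this chapter can be compared side by side. This is a minimal sketch assuming PostgreSQL; the literal `1234.567` is an arbitrary illustrative value, not course data.

```sql
-- Assumes PostgreSQL; 1234.567 is an arbitrary example value
SELECT
    ROUND(1234.567, 2)  AS two_decimals,     -- 1234.57
    ROUND(1234.567)     AS whole_number,     -- 1235: no second argument, so zero decimal places
    ROUND(1234.567, -2) AS nearest_hundred;  -- 1200: negative places round left of the decimal point
```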


## Chapter 9: Aliasing Arithmetic
* Can perform basic arithmetic using `+`, `-`, `*`, and `/`
* Parentheses control the order in which calculations execute
* SQL returns an integer when an integer is divided by an integer
    * Add decimal places (e.g., `4.0/3.0`) for more precision
* Aggregate functions perform operations *vertically*, down the values of a field
* Arithmetic operators perform operations *horizontally*, across the fields of each record
* Order of execution: 
    * `FROM`
    * `WHERE`
    * `SELECT` - **Aliases defined here**
    * `LIMIT`
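
One consequence of this order is that an alias defined in `SELECT` is not yet available when `WHERE` runs. The sketch below assumes PostgreSQL and uses made-up `sample_films` rows rather than the real `cinema.films` table.

```sql
-- Hypothetical sample rows (not course data), built inline so the query is self-contained
WITH sample_films (title, budget, gross) AS (
    VALUES
        ('Film A', 100, 250),
        ('Film B', 300, 200),
        ('Film C',  50, 400)
)
SELECT title, (gross - budget) AS profit
FROM sample_films
-- WHERE profit > 0         -- fails in PostgreSQL: WHERE runs before SELECT defines the alias
WHERE (gross - budget) > 0  -- so the expression is repeated instead
LIMIT 2;
```

Only `Film A` and `Film C` pass the filter, so `LIMIT 2` returns both of them.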

In [21]:
-- Adding, Subtracting, Multiplying
SELECT (4+3);
SELECT (4-3);
SELECT (4*3);
-- Dividing
SELECT (4/3);
SELECT (4.0/3.0);
-- Find the profit of 5 unique, English-language films from 2000 to 2010
SELECT DISTINCT title, (gross-budget) AS profit
FROM cinema.films
WHERE language = 'English'
	AND release_year BETWEEN 2000 AND 2010
LIMIT 5;
-- Select the title and the duration in hours from films
SELECT title, 
	(duration/60.0) AS duration_hours 
FROM cinema.films;
-- Calculate the percentage of people who are no longer alive
SELECT COUNT(deathdate) * 100.0 / COUNT(*) AS percentage_dead
FROM cinema.people;
-- Find the number of decades in the films table
SELECT (MAX(release_year)-MIN(release_year)) / 10.0 
		AS number_of_decades 
FROM cinema.films;
-- Round duration_hours to two decimal places
SELECT title, ROUND((duration / 60.0),2) 
		AS duration_hours
FROM cinema.films;

| | title | duration_hours |
|---|---|---|
| 0 | Intolerance: Love's Struggle Throughout the Ages | 2.05 |
| 1 | Over the Hill to the Poorhouse | 1.83 |
| 2 | The Big Parade | 2.52 |
| 3 | Metropolis | 2.42 |
| 4 | Pandora's Box | 1.83 |
| ... | ... | ... |
| 4963 | Unforgotten | 0.75 |
| 4964 | Wings | 0.50 |
| 4965 | Wolf Creek | |
| 4966 | Wuthering Heights | 2.37 |
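
To contrast horizontal arithmetic with vertical aggregation, and to show integer division in the same place, here is a small sketch. It assumes PostgreSQL, and the `sample_films` rows are invented for illustration rather than taken from `cinema.films`.

```sql
-- Horizontal: arithmetic is evaluated once per record, so three rows come back
WITH sample_films (title, duration) AS (
    VALUES ('Film A', 90), ('Film B', 125), ('Film C', 100)  -- hypothetical rows
)
SELECT title,
       duration / 60   AS whole_hours,    -- integer / integer drops the remainder: 1, 2, 1
       duration / 60.0 AS duration_hours  -- a decimal operand keeps the fraction
FROM sample_films;

-- Vertical: an aggregate collapses the whole field into a single row
WITH sample_films (title, duration) AS (
    VALUES ('Film A', 90), ('Film B', 125), ('Film C', 100)  -- same hypothetical rows
)
SELECT SUM(duration) AS total_minutes,    -- 315
       MAX(duration) AS longest_minutes   -- 125
FROM sample_films;
```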
