# SQL query cheat sheet

This notebook contains some of the basic SQL commands for querying a database. You may find it a useful reference while working through notebooks 11 and 12.

## Querying a database

### `SELECT`ing columns of tables


The SQL `SELECT` statement queries one or more tables.

The simplest query:

```SQL
SELECT *
FROM movie;
```

will return the entire contents of the `movie` table: all the columns, and all the rows.


If you list specific column names, only those columns will be returned, in the order given:

```SQL
SELECT id, title
FROM movie;
```



You can rename columns with `AS`:

```SQL
SELECT id, title AS title_of_movie
FROM movie;
```


A `LIMIT` clause in the query will limit the number of rows returned:

```SQL
SELECT id, title
FROM movie
LIMIT 5;
```

(The SQL adapter will only _display_ a few rows of any returned data, even if the result is much longer.)

### Filtering rows of a table with `WHERE`

You can specify conditions in a query, and only rows matching those conditions will be returned. The condition is given by a `WHERE` clause. For instance, to find recent films you can filter by year:

```SQL
SELECT id, title, year
FROM movie
WHERE year > 1980;
```
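Conditions are not limited to numbers. As a small sketch, assuming `title` is a text column (the title used here is only an illustrative value and may not be in the table), string values are written in single quotes:

```SQL
-- 'Alien' is a made-up example value, not necessarily present in the data
SELECT id, title, year
FROM movie
WHERE title = 'Alien';
```

The other comparison operators (`<`, `<=`, `>=`, and `<>` for "not equal") are used in the same way.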



If you want to give more than one condition, they must be connected by logical connectives (`AND`, `OR`) in a single `WHERE` clause:

```SQL
SELECT id, title, year
FROM movie
WHERE (year > 1980
    AND year < 2000)
    OR budget > 1000;
```
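SQL evaluates `AND` before `OR`, so the brackets in the query above are not strictly required, but they make the intent explicit. As an illustration only, moving the brackets gives a different query over the same columns:

```SQL
-- Movies after 1980 that are either pre-2000 or have a budget over 1000
SELECT id, title, year
FROM movie
WHERE year > 1980
    AND (year < 2000
        OR budget > 1000);
```

A movie from 1975 with a large budget is excluded here, whereas the previous query would return it.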

### `JOIN`ing tables

To include data from more than one table, include additional tables in the `FROM` clause. Depending on the query, you may want to include fields from several tables in the query's output.

Most often, you'll want to join tables based on common identifiers. These joins will almost always be inner joins:

```SQL
SELECT movie.id, title, job
FROM movie, crew
WHERE movie.id = crew.movie_id;
```

You can refer to columns in different tables by prefixing them with the table name, such as `movie.id` or `person.id`. Where there is no confusion between column names in different tables, you can omit the table name.
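The same inner join can also be written with the explicit `JOIN ... ON` syntax, which keeps the join condition apart from any filtering. This is a sketch of an equivalent query, using the same tables and columns as the `movie` and `crew` join above:

```SQL
-- Equivalent to listing both tables in FROM and matching them in WHERE
SELECT movie.id, title, job
FROM movie
    JOIN crew ON movie.id = crew.movie_id;
```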



If you want to join more than two tables, you need more than one join condition in the `WHERE` clause. Additional filtering also goes in the same `WHERE` clause:

```SQL
SELECT movie.id, title, job, name
FROM movie, crew, person
WHERE movie.id = crew.movie_id
    AND crew.person_id = person.id
    AND year > 1980;
```
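With several tables, the table-name prefixes can become repetitive. One option, sketched here with the made-up aliases `m`, `c` and `p`, is to give each table a short alias in the `FROM` clause and use it everywhere else in the query:

```SQL
-- m, c and p are arbitrary alias names chosen for this example
SELECT m.id, m.title, c.job, p.name
FROM movie AS m, crew AS c, person AS p
WHERE m.id = c.movie_id
    AND c.person_id = p.id
    AND m.year > 1980;
```

This should return the same rows as the three-table query above; only the way the tables are named changes.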

### Aggregation with `GROUP BY`

You can combine several rows of data with a `GROUP BY` clause, and include aggregate data in the query's output. Use `COUNT` to return the number of items:

```SQL
SELECT year, COUNT(id) AS movies_per_year
FROM movie
GROUP BY year;
```

It is good practice to rename aggregated columns, so use (say) `COUNT(id) AS movies_per_year` rather than just `COUNT(id)`.


Other useful aggregation functions are:

- `SUM` (add up the values in a column)
- `AVG` (find the mean of the values in a column)
- `MIN` (find the smallest value in a column)
- `MAX` (find the largest value in a column)

For example, to find the average budget for making movies each year:

```SQL
SELECT year, AVG(budget) AS average_budget
FROM movie
GROUP BY year;
```
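Several aggregates can appear in the same query. As a sketch using only the `year` and `budget` columns from the examples above (the output column names are just suggestions):

```SQL
-- One output row per year, with three summary values for that year
SELECT year,
    MIN(budget) AS smallest_budget,
    MAX(budget) AS largest_budget,
    SUM(budget) AS total_budget
FROM movie
GROUP BY year;
```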

You can also filter whole groups with a `HAVING` clause. This query lists only the years in which more than 100 movies were made and their total budget exceeded $100 million:

```SQL
SELECT year, SUM(budget) AS total_spent
FROM movie
GROUP BY year
HAVING SUM(budget) > 100000000
    AND COUNT(id) > 100;
```

(You can, and will probably need to, still include a `WHERE` clause together with a `GROUP BY` clause.)
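To illustrate how the clauses fit together: `WHERE` filters individual rows before they are grouped, while `HAVING` filters the groups afterwards. A sketch, with thresholds picked arbitrarily for the example:

```SQL
SELECT year, COUNT(id) AS movies_per_year
FROM movie
WHERE budget > 1000          -- applied to each row, before grouping
GROUP BY year
HAVING COUNT(id) > 10;       -- applied to each group, after grouping
```

Only movies with a budget over 1000 are counted, and only years with more than 10 such movies are reported.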


All fields you want to report but not aggregate must be listed in the `GROUP BY` clause:

```SQL
SELECT movie.id, title, COUNT(job) AS crew_size
FROM movie, crew
WHERE movie.id = crew.movie_id
GROUP BY movie.id, title;
```

### `ORDER BY` to sort the returned values

You can order the output of a query with an `ORDER BY` clause:

```SQL
SELECT id, title, year
FROM movie
ORDER BY title;
```



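You can order by several fields: rows are sorted by the first field, and ties are broken by the next. SQL does not guarantee a [stable](https://en.wikipedia.org/wiki/Sorting_algorithm#Stability) sort, so rows that tie on every listed field may come back in any order. You can change the sort order from the default ascending order (`ASC`) to descending order with the `DESC` modifier: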

```SQL
SELECT id, title, year
FROM movie
ORDER BY year DESC, title;
```

sorts by `year` in decreasing order, followed by `title` in ascending (alphabetical) order.
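`ORDER BY` combines naturally with `LIMIT` to pick out the top few rows. As a sketch, assuming `budget` is numeric as in the earlier examples, this query returns the five movies with the largest budgets:

```SQL
SELECT id, title, budget
FROM movie
WHERE budget IS NOT NULL     -- where NULLs sort varies between databases
ORDER BY budget DESC
LIMIT 5;
```

The `WHERE budget IS NOT NULL` line is a precaution: some databases place missing values first in a descending sort, which would otherwise fill the top rows with movies that have no recorded budget.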