## Intro to SQL for Data Science

## Chapter 1. Selecting columns

```SQL
1. SELECT single_column_name FROM table_name;
```

```SQL
2. SELECT first_column_name, second_column_name, ... FROM table_name;
```

```SQL
3. SELECT * FROM table_name;
```
```
    '*' - selects all columns
```

```SQL
4. SELECT * FROM table_name LIMIT N;
```
```
    'N' - the maximum number of rows to return
```

```SQL
5. SELECT DISTINCT column_name FROM table_name;
```
```
    * 'DISTINCT' - returns only unique values (duplicates are dropped)
```

```SQL
6. SELECT COUNT(*) FROM table_name;
```
```
    * COUNT(*) counts all rows in the table
    * To count only the non-missing (non-NULL) values in a particular column, call COUNT on just that column: COUNT(column_name)
```

```SQL
7. SELECT COUNT(DISTINCT column_name) FROM table_name;
```
```
    * Counts the number of unique values in a column
```
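
The three forms of COUNT give different answers when a column contains missing values or duplicates. A minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical `people` table whose rows are made up for illustration:

```python
import sqlite3

# Hypothetical data: four people, one missing birthdate, one repeated birthdate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, birthdate TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", "1970-01-01"), ("Bob", "1970-01-01"), ("Cy", "1985-06-15"), ("Di", None)],
)

# COUNT(*) counts rows, COUNT(col) skips NULLs, COUNT(DISTINCT col) also skips duplicates.
total_rows, non_missing, unique_values = conn.execute(
    "SELECT COUNT(*), COUNT(birthdate), COUNT(DISTINCT birthdate) FROM people"
).fetchone()

print(total_rows, non_missing, unique_values)  # 4 3 2
conn.close()
```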

## Chapter 2. Filtering rows

```SQL
8. SELECT column_name FROM table_name WHERE condition;
```
```
    * Condition: column_name <operator> value, using one of these comparison operators:
        * = equal
        * <> not equal
        * < less than
        * > greater than
        * <= less than or equal to
        * >= greater than or equal to
```

```SQL
9. SELECT column_name FROM table_name WHERE condition1 AND condition2;
```
```
    * You can add as many AND conditions as you need
```

```SQL
10. SELECT column_name FROM table_name WHERE condition1 OR condition2;
```

```SQL
11. SELECT column_name FROM table_name WHERE (condition1 OR condition2) AND (condition3 OR condition4);
```
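
Parentheses matter when AND and OR are mixed, because AND is evaluated before OR. A minimal sketch, assuming Python's built-in `sqlite3` module; the `films` rows and the `certification` column are hypothetical and exist only for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical films table; the certification column is an assumption for this example.
conn.execute("CREATE TABLE films (title TEXT, release_year INTEGER, certification TEXT)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?)",
    [("A", 1994, "R"), ("B", 1994, "PG"), ("C", 1995, "R"), ("D", 1995, "PG")],
)

# Without parentheses this reads as:
# release_year = 1994 OR (release_year = 1995 AND certification = 'PG')
ungrouped = [row[0] for row in conn.execute(
    "SELECT title FROM films "
    "WHERE release_year = 1994 OR release_year = 1995 AND certification = 'PG' "
    "ORDER BY title"
)]

# With parentheses the OR is evaluated first, then the AND.
grouped = [row[0] for row in conn.execute(
    "SELECT title FROM films "
    "WHERE (release_year = 1994 OR release_year = 1995) AND certification = 'PG' "
    "ORDER BY title"
)]

print(ungrouped)  # ['A', 'B', 'D']
print(grouped)    # ['B', 'D']
conn.close()
```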

```SQL
12. SELECT column_name FROM table_name WHERE column_name BETWEEN value1 AND value2;
```
```
    * Filters values in a specified range; BETWEEN is inclusive, so value1 and value2 themselves match
```

```SQL
12. SELECT column_name FROM table_name WHERE column_name IN (value1, value2, value3, ..., valueN);
```
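
BETWEEN includes both endpoints of the range, and IN is shorthand for a chain of OR equality checks. A minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical `films` table with made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical films table with made-up titles and years.
conn.execute("CREATE TABLE films (title TEXT, release_year INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [("A", 1989), ("B", 1990), ("C", 1995), ("D", 2000), ("E", 2001)],
)

# BETWEEN is inclusive: 1990 and 2000 both match.
between_titles = [row[0] for row in conn.execute(
    "SELECT title FROM films WHERE release_year BETWEEN 1990 AND 2000 ORDER BY title"
)]

# IN matches any value in the list.
in_titles = [row[0] for row in conn.execute(
    "SELECT title FROM films WHERE release_year IN (1989, 2001) ORDER BY title"
)]

print(between_titles)  # ['B', 'C', 'D']
print(in_titles)       # ['A', 'E']
conn.close()
```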

```SQL
13. SELECT COUNT(*) FROM table1 WHERE column1 IS NULL;
```
```
    * Counts the number of missing values in column1 of table1
    * NULL represents a missing or unknown value
```
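
NULL has to be tested with IS NULL or IS NOT NULL; comparing with `= NULL` never matches a row. A minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical `people` table with made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical people table; a NULL deathdate means the value is missing.
conn.execute("CREATE TABLE people (name TEXT, deathdate TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", "2001-05-04"), ("Bob", None), ("Cy", None)],
)

def count_where(condition):
    """Return the number of rows in people that satisfy the given condition."""
    return conn.execute(f"SELECT COUNT(*) FROM people WHERE {condition}").fetchone()[0]

missing = count_where("deathdate IS NULL")
present = count_where("deathdate IS NOT NULL")
equals_null = count_where("deathdate = NULL")  # the comparison yields NULL, never true

print(missing, present, equals_null)  # 2 1 0
conn.close()
```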

```SQL
14. SELECT column1
    FROM table1
    WHERE column1 LIKE 'Data%';
```
```
    * Matches values that start with 'Data', such as 'DataCamp', 'DataC', 'Data0', 'Data01234'
    * LIKE / NOT LIKE filter rows by matching (or not matching) a pattern
    * % matches zero, one, or many characters
    * _ matches exactly one character
```
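
The two wildcards can be tried out on a handful of values. A minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical `companies` table with made-up names. Note that SQLite's LIKE ignores case for ASCII letters by default, which may differ from other databases:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table of names used only to exercise the patterns.
conn.execute("CREATE TABLE companies (name TEXT)")
conn.executemany(
    "INSERT INTO companies VALUES (?)",
    [("DataCamp",), ("Data",), ("Database",), ("Big Data",), ("Dave",)],
)

def names_where(condition):
    """Return the names matching the condition, sorted for a stable result."""
    query = f"SELECT name FROM companies WHERE {condition} ORDER BY name"
    return [row[0] for row in conn.execute(query)]

starts_with_data = names_where("name LIKE 'Data%'")   # % also matches zero characters
four_letters = names_where("name LIKE 'Da_e'")        # _ stands for exactly one character
not_data = names_where("name NOT LIKE 'Data%'")

print(starts_with_data)  # ['Data', 'DataCamp', 'Database']
print(four_letters)      # ['Dave']
print(not_data)          # ['Big Data', 'Dave']
conn.close()
```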

## Chapter 3. Aggregate Functions

```SQL
15. SELECT AVG(column1) FROM table1;
```
```
    * AVG returns the average value
    * MAX returns the largest value
    * MIN returns the smallest value
    * SUM returns the total of all values
```

```SQL
16. SELECT (4 * 3);
    SELECT (4 / 3);
    SELECT (4.0 / 3.0) AS result;
```
```
    * Multiplication returns 12
    * Dividing an integer by an integer returns 1 (integer division discards the fractional part)
    * Dividing a float by a float returns 1.333
```
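
The arithmetic can be checked directly. A minimal sketch, assuming Python's built-in `sqlite3` module; SQLite truncates integer division as described here, while some other databases (MySQL, for example) return a decimal for `4 / 3`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integer operands give an integer result; float operands keep the fraction.
product, int_quotient, float_quotient = conn.execute(
    "SELECT 4 * 3, 4 / 3, 4.0 / 3.0"
).fetchone()

print(product, int_quotient, round(float_quotient, 3))  # 12 1 1.333
conn.close()
```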

```SQL
17. SELECT MAX(column1) AS max_column1,
           MAX(column2) AS max_column2
    FROM table1;
```
```
    * AS gives the first aggregate the alias max_column1
    * AS gives the second aggregate the alias max_column2
```

```SQL
18. SELECT COUNT(deathdate) * 100.0 / COUNT(*)
    AS percentage_dead
    FROM people;
```
```
    * Gets the percentage of people who have died; COUNT(deathdate) skips NULL values, and 100.0 forces float division
```
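
Multiplying by `100.0` rather than `100` is what keeps the fractional part of the percentage. A minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical `people` table with three made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical people table: one of three people has a recorded deathdate.
conn.execute("CREATE TABLE people (name TEXT, deathdate TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", "2001-05-04"), ("Bob", None), ("Cy", None)],
)

float_pct, int_pct = conn.execute(
    "SELECT COUNT(deathdate) * 100.0 / COUNT(*), "  # float arithmetic
    "       COUNT(deathdate) * 100 / COUNT(*) "     # integer arithmetic truncates
    "FROM people"
).fetchone()

print(round(float_pct, 2), int_pct)  # 33.33 33
conn.close()
```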

```SQL
19. SELECT (MAX(release_year) - MIN(release_year)) / 10.0
    AS number_of_decades
    FROM films;
```
```
    * Get the number of decades the films table covers
```

## Chapter 4. Sorting, grouping and joins

```SQL
20. SELECT column1
    FROM table1
    ORDER BY column2 DESC;
```
```
    * ORDER BY sorts results in ascending order by default (smallest to largest, or a-z for text); add DESC for descending order
    * ORDER BY can sort by several columns: first by the first column specified, then by the next, and so on
    * To specify multiple columns, separate the column names with commas
    * The order of the columns matters!
```
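
Swapping the columns in ORDER BY changes the result. A minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical `films` table with made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical films table; two films share the year 2000.
conn.execute("CREATE TABLE films (title TEXT, release_year INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [("B", 2000), ("A", 2000), ("C", 1995), ("D", 2010)],
)

# Newest year first; ties within a year are broken by title (ascending).
by_year_then_title = [row[0] for row in conn.execute(
    "SELECT title FROM films ORDER BY release_year DESC, title"
)]

# Title first; the year only matters if two titles are equal.
by_title_then_year = [row[0] for row in conn.execute(
    "SELECT title FROM films ORDER BY title, release_year DESC"
)]

print(by_year_then_title)  # ['D', 'A', 'B', 'C']
print(by_title_then_year)  # ['A', 'B', 'C', 'D']
conn.close()
```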

```SQL
21. SELECT column1, COUNT(*)
    FROM table1
    GROUP BY column1;
```
```
    * ORDER BY always goes after GROUP BY
```

| column1 | count  |
|:--------|:-------|
| value1  | count1 |
| value2  | count2 |

```SQL
22. SELECT release_year
    FROM films
    GROUP BY release_year
    HAVING COUNT(title) > 10;
```
```
    * Shows only those years in which more than 10 films were released
    * HAVING is used instead of WHERE because WHERE cannot filter on aggregate functions; HAVING filters the groups after GROUP BY
```
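
WHERE filters individual rows before grouping, while HAVING filters whole groups afterwards. A minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical `films` table; the rows are made up and the threshold is lowered to 1 to keep the data small:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical films table: three films in 1999, one in 2005, two in 1985.
conn.execute("CREATE TABLE films (title TEXT, release_year INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [("A", 1999), ("B", 1999), ("C", 1999), ("D", 2005), ("E", 1985), ("F", 1985)],
)

busy_years = conn.execute(
    "SELECT release_year, COUNT(title) "
    "FROM films "
    "WHERE release_year > 1990 "   # drops the 1985 rows before grouping
    "GROUP BY release_year "
    "HAVING COUNT(title) > 1 "     # drops the 2005 group, which has one film
    "ORDER BY release_year"
).fetchall()

print(busy_years)  # [(1999, 3)]
conn.close()
```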

```SQL
23. SELECT release_year, AVG(budget) AS avg_budget,
    AVG(gross) AS avg_gross
    FROM films
    WHERE release_year > 1990
    GROUP BY release_year
    HAVING AVG(budget) > 60000000
    ORDER BY avg_gross DESC;
```

```SQL
24. SELECT country, AVG(budget) AS avg_budget, AVG(gross) AS avg_gross
    FROM films
    GROUP BY country
    HAVING COUNT(country) > 10
    ORDER BY country
    LIMIT 5;
```
```
Gets the country, average budget, and average gross take of countries that have made more than 10 films. The result is ordered by country name and limited to 5 rows, and the averages are aliased as avg_budget and avg_gross respectively.
```

```SQL
25. SELECT title, imdb_score
    FROM films
    JOIN reviews
    ON films.id = reviews.film_id
    WHERE title = 'To Kill a Mockingbird';
```
```
The JOIN matches each row of films to its rows in reviews through films.id = reviews.film_id, so the film can be found by title in the films table and its IMDB score read from the reviews table.
```
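
The join can be run end to end on a tiny dataset. A minimal sketch, assuming Python's built-in `sqlite3` module; the table layouts follow the query above, while the second film and both scores are made-up placeholder values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal versions of the films and reviews tables.
conn.execute("CREATE TABLE films (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE reviews (film_id INTEGER, imdb_score REAL)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?)",
    [(1, "To Kill a Mockingbird"), (2, "Some Other Film")],
)
# Placeholder scores, not real IMDB data.
conn.executemany("INSERT INTO reviews VALUES (?, ?)", [(1, 8.4), (2, 6.1)])

joined = conn.execute(
    "SELECT title, imdb_score "
    "FROM films "
    "JOIN reviews ON films.id = reviews.film_id "
    "WHERE title = 'To Kill a Mockingbird'"
).fetchall()

print(joined)  # [('To Kill a Mockingbird', 8.4)]
conn.close()
```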