# SELECT ... FROM ... WHERE ... GROUP BY ... HAVING ... ORDER BY [LIMIT]


## SELECT DISTINCT

```mysql
SELECT DISTINCT col FROM ...
```

In the example above, if col has NULL values, MySQL keeps only one NULL value because DISTINCT treats all NULL values as the same value.
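A minimal sketch of the NULL behaviour, run against SQLite through Python's `sqlite3` module as a stand-in for MySQL (the two agree on this point). The table `t` and its contents are made up for illustration.

```python
import sqlite3

# Hypothetical table t(col) with a repeated value and two NULLs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("a",), (None,), ("a",), (None,), ("b",)],
)

# DISTINCT collapses the two NULLs into a single row (None in Python).
distinct_rows = conn.execute("SELECT DISTINCT col FROM t").fetchall()
print(distinct_rows)
```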


## WHERE

```mysql
WHERE x1='a' AND x2 < '2016-05-28'
WHERE (x1='a' AND x2 < '2016-05-28') OR (x1='b' AND x3 IS NULL)
WHERE x IS NULL AND NOT (... OR ...)
WHERE x <> 'a'
WHERE x BETWEEN '2020-01-20' AND '2020-12-03'
WHERE x = (SELECT ...)

WHERE x <> (SELECT ...)     # raises an error if the subquery returns more than one row; use <> ALL or NOT IN instead.

WHERE x IN (SELECT ...)
WHERE x IN ('apple','orange')
WHERE x NOT IN (...)
WHERE x <> ALL (...)

WHERE x LIKE '_a%' OR x LIKE '%a'      # % matches any number (>= 0) of characters and _ matches exactly one character.
WHERE x LIKE '(___)___-____'

WHERE EXISTS (SELECT 1 ...)
WHERE NOT EXISTS (SELECT 1 ...)

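WHERE x REGEXP '^\\(.{3}\\).{3}-.{4}$'   # same format as the LIKE pattern above; ( and ) must be escaped to match literally, and ^ $ anchor the whole value.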
```
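The LIKE wildcards can be checked with a small sketch. It uses SQLite through Python's `sqlite3` module as a stand-in for MySQL, since `%` and `_` mean the same in both. The table `t`, its values, and the helper `matching` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("banana",), ("apple",), ("cat",), ("a",), ("(123)456-7890",), ("123-456-7890",)],
)

def matching(pattern):
    """Return the values of x that match the LIKE pattern, sorted."""
    rows = conn.execute("SELECT x FROM t WHERE x LIKE ? ORDER BY x", (pattern,))
    return [x for (x,) in rows]

second_char_a = matching("_a%")          # exactly one character, then 'a', then anything
ends_with_a = matching("%a")             # anything (possibly nothing), then 'a'
phone_like = matching("(___)___-____")   # fixed-width phone format
print(second_char_a, ends_with_a, phone_like)
```

Note that `'a'` matches `'%a'` but not `'_a%'`, because `_` requires exactly one character before the `a`.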


* Find the rows in which a column takes its maximum value:

```mysql
SELECT ... FROM tbl
WHERE col = (SELECT MAX(col) FROM tbl)
```

* Find the rows in which a column is greater than the average of that column:

```mysql
SELECT ... FROM tbl
WHERE col > (SELECT AVG(col) FROM tbl)
```
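Both scalar-subquery patterns can be tried on a toy table. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in for MySQL; the table contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (name TEXT, col INTEGER)")  # hypothetical data
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [("a", 10), ("b", 20), ("c", 30), ("d", 40), ("e", 40)],
)

# Rows holding the maximum: there can be more than one.
at_max = conn.execute(
    "SELECT name FROM tbl WHERE col = (SELECT MAX(col) FROM tbl) ORDER BY name"
).fetchall()

# Rows above the average (the average here is 140 / 5 = 28).
above_avg = conn.execute(
    "SELECT name FROM tbl WHERE col > (SELECT AVG(col) FROM tbl) ORDER BY name"
).fetchall()

print(at_max, above_avg)
```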



## GROUP BY


A GROUP BY clause used without an aggregate function behaves like DISTINCT.

The following two queries return the same set of rows:

```mysql
SELECT col FROM tbl GROUP BY col;
SELECT DISTINCT col FROM tbl;
```

We can use a column alias in the GROUP BY clause (a MySQL extension to standard SQL).

```mysql
SELECT expr AS e
FROM tbl
GROUP BY e
```


### GROUP BY ... HAVING

HAVING is similar to WHERE, but it applies to groups rather than to single rows.

HAVING can refer to column aliases defined in the SELECT list, but WHERE cannot.


```mysql
SELECT x FROM tbl 
GROUP BY x 
HAVING COUNT(x) > 1;

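# expr2 is typically an aggregate such as COUNT(*); a non-aggregated expr2 that is
# not in GROUP BY is rejected under the default ONLY_FULL_GROUP_BY mode.
SELECT expr1 AS e1, expr2 AS e2 FROM tbl
GROUP BY e1
HAVING e2 > 1;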
```
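A small sketch of both HAVING forms, run on SQLite through Python's `sqlite3` module as a stand-in for MySQL (SQLite also accepts a SELECT alias in HAVING). The table contents and the alias `n` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (x TEXT)")  # hypothetical table
# Values: a, b, a, c, b, a  ->  a occurs 3 times, b twice, c once.
conn.executemany("INSERT INTO tbl VALUES (?)", [(v,) for v in "abacba"])

# Keep only the groups with more than one row.
repeated = conn.execute(
    "SELECT x FROM tbl GROUP BY x HAVING COUNT(x) > 1 ORDER BY x"
).fetchall()

# The same filter written against the alias n.
with_alias = conn.execute(
    "SELECT x, COUNT(*) AS n FROM tbl GROUP BY x HAVING n > 1 ORDER BY x"
).fetchall()

print(repeated, with_alias)
```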

### GROUP BY ... WITH ROLLUP, GROUPING()

```mysql
SELECT * FROM t;
+------+-------+----------+
| name | size  | quantity |
+------+-------+----------+
| ball | small |       10 |
| ball | large |       20 |
| hoop | small |       15 |
| hoop | large |        5 |
| ball | small |        8 |
| ball | large |       18 |
| hoop | small |       14 |
| hoop | large |       25 |
+------+-------+----------+

SELECT name, size, SUM(quantity) total
FROM t
GROUP BY name, size WITH ROLLUP;
+------+-------+-------+
| name | size  | total |
+------+-------+-------+
| ball | large |    38 |
| ball | small |    18 |
| ball | NULL  |    56 |
| hoop | large |    30 |
| hoop | small |    29 |
| hoop | NULL  |    59 |
| NULL | NULL  |   115 |
+------+-------+-------+
```
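To see where the extra rows come from, the totals can be recomputed by hand. The following plain-Python sketch is not how MySQL implements ROLLUP; the helper `rollup` is hypothetical and only mirrors the result for `GROUP BY name, size WITH ROLLUP` on the table above, using `None` where MySQL prints NULL.

```python
from collections import defaultdict

# Rows of t from the example above: (name, size, quantity).
rows = [
    ("ball", "small", 10), ("ball", "large", 20),
    ("hoop", "small", 15), ("hoop", "large", 5),
    ("ball", "small", 8), ("ball", "large", 18),
    ("hoop", "small", 14), ("hoop", "large", 25),
]

def rollup(rows):
    """Hypothetical helper: SUM(quantity) per (name, size), per name, and overall."""
    per_pair = defaultdict(int)   # regular groups: (name, size)
    per_name = defaultdict(int)   # super-aggregate rows: (name, NULL)
    grand_total = 0               # super-aggregate row: (NULL, NULL)
    for name, size, quantity in rows:
        per_pair[(name, size)] += quantity
        per_name[name] += quantity
        grand_total += quantity

    result = []
    for name in sorted(per_name):
        for (pair_name, size) in sorted(per_pair):
            if pair_name == name:
                result.append((name, size, per_pair[(pair_name, size)]))
        result.append((name, None, per_name[name]))   # subtotal for this name
    result.append((None, None, grand_total))          # grand total
    return result

rollup_rows = rollup(rows)
for row in rollup_rows:
    print(row)
```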

GROUPING(col) returns 1 when the NULL in col belongs to a super-aggregate row produced by ROLLUP; otherwise it returns 0. This distinguishes ROLLUP's NULLs from NULLs stored in the data.

```mysql
SELECT name, size, SUM(quantity) total, GROUPING(name), GROUPING(size)
FROM t
GROUP BY name, size WITH ROLLUP;
+------+-------+-------+----------------+----------------+
| name | size  | total | GROUPING(name) | GROUPING(size) |
+------+-------+-------+----------------+----------------+
| ball | large |    38 |              0 |              0 |
| ball | small |    18 |              0 |              0 |
| ball | NULL  |    56 |              0 |              1 |
| hoop | large |    30 |              0 |              0 |
| hoop | small |    29 |              0 |              0 |
| hoop | NULL  |    59 |              0 |              1 |
| NULL | NULL  |   115 |              1 |              1 |
+------+-------+-------+----------------+----------------+
```

```mysql
SELECT IF(GROUPING(name),'all_names',name) name, IF(GROUPING(size),'all_sizes',size) size, SUM(quantity) total
FROM t
GROUP BY name, size WITH ROLLUP;
+-----------+-----------+-------+
| name      | size      | total |
+-----------+-----------+-------+
| ball      | large     |    38 |
| ball      | small     |    18 |
| ball      | all_sizes |    56 |
| hoop      | large     |    30 |
| hoop      | small     |    29 |
| hoop      | all_sizes |    59 |
| all_names | all_sizes |   115 |
+-----------+-----------+-------+
```


## ORDER BY [FIELD()]

```mysql
ORDER BY x, y;

ORDER BY 2, 7;    # positions of the columns in the SELECT list (1-based)

ORDER BY x DESC, y;

ORDER BY LEFT(x, 3);

ORDER BY RAND();   # Rearrange rows randomly

ORDER BY FIELD(col_name, val1, val2, val3);
# For a value x in the column col_name, FIELD() returns 1 if x is val1, 2 if x is val2, 3 if x is val3, and 0 otherwise,
# so rows whose value is not in the list sort first.
```
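The ordering produced by FIELD() can be imitated outside the database. The function `field` below is a hypothetical Python stand-in that follows the description above (1-based position, 0 when absent); the status values are invented.

```python
def field(x, *values):
    """Hypothetical stand-in for MySQL FIELD(): 1-based position of x in values, 0 if absent."""
    for position, value in enumerate(values, start=1):
        if x == value:
            return position
    return 0

statuses = ["done", "new", "archived", "in_progress", "new"]

# Equivalent in spirit to: ORDER BY FIELD(status, 'new', 'in_progress', 'done')
ordered = sorted(statuses, key=lambda s: field(s, "new", "in_progress", "done"))
print(ordered)  # ['archived', 'new', 'new', 'in_progress', 'done']
```

The unlisted value `'archived'` gets 0 and therefore comes first.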


## Derived table, Subquery

A derived table must have an alias. The following raises an error if the alias (`AS t`) is omitted:

```mysql
SELECT ... 
FROM (SELECT ... FROM ...) AS t;
```

A subquery that refers to columns of its outer query is a correlated subquery. Conceptually, it is evaluated once for each row of the outer query.

```mysql
SELECT x FROM t AS t1
WHERE x > (SELECT AVG(x) FROM t WHERE y = t1.y) 
```
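The correlated subquery above can be run on a toy table. This sketch uses SQLite through Python's `sqlite3` module as a stand-in for MySQL; the data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER, y TEXT)")  # hypothetical data
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "a"), (3, "a"), (5, "a"), (10, "b"), (20, "b")],
)

# For each outer row t1, the subquery averages x over the rows sharing t1.y:
# the average is 3 for y = 'a' and 15 for y = 'b'.
above_group_avg = conn.execute(
    """
    SELECT x FROM t AS t1
    WHERE x > (SELECT AVG(x) FROM t WHERE y = t1.y)
    ORDER BY x
    """
).fetchall()
print(above_group_avg)  # [(5,), (20,)]
```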

# INNER JOIN, LEFT JOIN, RIGHT JOIN

In MySQL, JOIN, CROSS JOIN, and INNER JOIN are syntactic equivalents. Used without ON or USING, each returns the Cartesian product of the two tables.


```mysql
SELECT t1.x, t2.y
FROM tbl1 AS t1 INNER JOIN tbl2 AS t2
ON t1.a = t2.b         # or USING (col_name) if the join column has the same name in both tables
WHERE ...
```

You can use other operators in ON:

```mysql
...
ON t1.col1 = t2.col2 AND t1.col3 > t2.col4 
```


Be careful about whether a condition goes in the ON clause or in the WHERE clause. The following joins two tables with a left join and then keeps only the rows whose id is 100.

```mysql
SELECT t1.id, t2.amount
FROM t1 LEFT JOIN t2 ON t1.id = t2.id 
WHERE t1.id = 100  
```

Meanwhile, the following returns every row of t1. If t1.id is not 100, the row has the form (id, NULL). Why? Such a row fails the combined join condition t1.id = t2.id AND t1.id = 100, so it matches no row of t2. Because this is a left join, the row still appears in the result, with the columns that come from t2 set to NULL.

```mysql
SELECT t1.id, t2.amount
FROM t1 LEFT JOIN t2 ON t1.id = t2.id AND t1.id = 100
```
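The difference between the two queries shows up on a two-row example. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in for MySQL; the contents of `t1` and `t2` are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER)")
conn.execute("CREATE TABLE t2 (id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(100,), (200,)])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(100, 5), (200, 7)])

# Condition in WHERE: filters the joined result, so only id 100 remains.
filtered = conn.execute(
    """
    SELECT t1.id, t2.amount
    FROM t1 LEFT JOIN t2 ON t1.id = t2.id
    WHERE t1.id = 100
    ORDER BY t1.id
    """
).fetchall()

# Condition in ON: only restricts matching, so id 200 is kept with a NULL amount.
in_on = conn.execute(
    """
    SELECT t1.id, t2.amount
    FROM t1 LEFT JOIN t2 ON t1.id = t2.id AND t1.id = 100
    ORDER BY t1.id
    """
).fetchall()

print(filtered)  # [(100, 5)]
print(in_on)     # [(100, 5), (200, None)]
```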

## Example: Use a self-join

Let col1 and col2 be two columns of tbl. We want rows of the form (val1, val2_1, val2_2), where val2_1 and val2_2 are two different values of col2 that occur together with the same value val1 of col1.

```mysql
SELECT t1.col1, t1.col2, t2.col2
FROM tbl t1 INNER JOIN tbl t2
ON t1.col1 = t2.col1 AND t1.col2 > t2.col2    # compare col2 with col2, so that each pair of col2 values is listed once
[ORDER BY t1.col1, t1.col2, t2.col2];
```
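A small run of the self-join, assuming the intent is to pair up different col2 values that share a col1 value (so the inequality compares `t1.col2` with `t2.col2`). It uses SQLite through Python's `sqlite3` module as a stand-in for MySQL, with made-up data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 TEXT, col2 INTEGER)")  # hypothetical data
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [("x", 1), ("x", 2), ("x", 3), ("y", 5)],
)

# Each unordered pair of col2 values within the same col1 appears once,
# because the strict inequality rules out (a, a) and keeps only one of (a, b), (b, a).
pairs = conn.execute(
    """
    SELECT t1.col1, t1.col2, t2.col2
    FROM tbl t1 INNER JOIN tbl t2
    ON t1.col1 = t2.col1 AND t1.col2 > t2.col2
    ORDER BY t1.col1, t1.col2, t2.col2
    """
).fetchall()
print(pairs)  # [('x', 2, 1), ('x', 3, 1), ('x', 3, 2)]
```

The value `'y'` has only one col2 value, so it produces no row. If tbl can contain duplicate rows, `SELECT DISTINCT` would be needed to keep the output rows distinct.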

## Example: Use all types of join

Consider the following data:

| table | columns |
| --- | --- |
| t1 | id, c1 |
| t2 | id, c2 |
| t3 | id1, id2, c3 |

where id1 and id2 in t3 are foreign keys referencing t1.id and t2.id, respectively.


The following may not show all combinations of (c1, c2), since we use inner joins.

```mysql
SELECT c1, c2, SUM(c3) AS total
FROM t3 
    INNER JOIN t1 ON t1.id = t3.id1
    INNER JOIN t2 ON t2.id = t3.id2
GROUP BY c1, c2;
```

Suppose we want to see all combinations of (c1, c2). If a pair (c1, c2) does not appear in the inner-joined table, its total should be shown as 0.

Step 1: Make all combinations of t1 and t2 using a cross join:

```mysql
SELECT c1, c2, t1.id, t2.id                 # or SELECT *
FROM t1 JOIN t2 # or use CROSS JOIN
```

Step 2: Build the inner-joined table from above, keeping the ids of t1 and t2.

```mysql
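SELECT c1, c2, SUM(c3) AS total, t1.id AS t1_id, t2.id AS t2_id
FROM t3
    INNER JOIN t1 ON t1.id = t3.id1
    INNER JOIN t2 ON t2.id = t3.id2
GROUP BY t1.id, t2.id, c1, c2;    # the ids must be grouped too; selecting them ungrouped fails under ONLY_FULL_GROUP_BY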
```

Step 3: LEFT JOIN the table from Step 1 to the table from Step 2 on both ids. A combination with no match gets a NULL total, which we replace with 0.

```mysql
SELECT t1.c1, t2.c2, IFNULL(t4.total, 0) AS total
FROM t1 CROSS JOIN t2 LEFT JOIN
    (SELECT c1, c2, SUM(c3) AS total, t1.id AS t1_id, t2.id AS t2_id
     FROM t3 INNER JOIN t1 ON t1.id = t3.id1 INNER JOIN t2 ON t2.id = t3.id2
     GROUP BY t1.id, t2.id, c1, c2) AS t4 ON t4.t1_id = t1.id AND t4.t2_id = t2.id
[ORDER BY t1.c1, t2.c2];
```
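The three steps can be tried end to end on toy data. The sketch below runs on SQLite through Python's `sqlite3` module as a stand-in for MySQL. As a simplifying assumption, the derived table `t4` aggregates t3 alone by (id1, id2) instead of joining t1 and t2 again; the data are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (id INTEGER PRIMARY KEY, c1 TEXT);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, c2 TEXT);
    CREATE TABLE t3 (id1 INTEGER, id2 INTEGER, c3 INTEGER);
    INSERT INTO t1 VALUES (1, 'A'), (2, 'B');
    INSERT INTO t2 VALUES (1, 'X'), (2, 'Y');
    INSERT INTO t3 VALUES (1, 1, 10), (1, 1, 5), (2, 2, 7);
    """
)

# Cross join gives every (c1, c2) pair; the left join attaches totals where
# they exist, and IFNULL turns the missing ones into 0.
all_pairs = conn.execute(
    """
    SELECT t1.c1, t2.c2, IFNULL(t4.total, 0) AS total
    FROM t1 CROSS JOIN t2
    LEFT JOIN (SELECT id1, id2, SUM(c3) AS total
               FROM t3
               GROUP BY id1, id2) AS t4
           ON t4.id1 = t1.id AND t4.id2 = t2.id
    ORDER BY t1.c1, t2.c2
    """
).fetchall()
print(all_pairs)  # [('A', 'X', 15), ('A', 'Y', 0), ('B', 'X', 0), ('B', 'Y', 7)]
```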

# UNION, INTERSECT, MINUS


## UNION

UNION [DISTINCT] removes duplicate rows, but UNION ALL keeps all rows.

```mysql
SELECT ... UNION [ALL | DISTINCT] SELECT ...;

expr_1 UNION expr_2
```
where each `expr_i` is of the form `SELECT ... FROM t` or `TABLE t`.
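A quick comparison of the two variants, run on SQLite through Python's `sqlite3` module as a stand-in for MySQL. The tables `a` and `b` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (c INTEGER);
    CREATE TABLE b (c INTEGER);
    INSERT INTO a VALUES (1), (2), (2);
    INSERT INTO b VALUES (2), (3);
    """
)

# UNION (same as UNION DISTINCT in MySQL) removes duplicates.
union_rows = conn.execute(
    "SELECT c FROM a UNION SELECT c FROM b ORDER BY 1"
).fetchall()

# UNION ALL keeps every row from both sides.
union_all_rows = conn.execute(
    "SELECT c FROM a UNION ALL SELECT c FROM b ORDER BY 1"
).fetchall()

print(union_rows)      # [(1,), (2,), (3,)]
print(union_all_rows)  # [(1,), (2,), (2,), (2,), (3,)]
```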


## INTERSECT

MySQL versions before 8.0.31 do not support INTERSECT, but we can emulate it easily.

```mysql
(SELECT c1 FROM t1) INTERSECT (SELECT c2 FROM t2);

# is equivalent (for non-NULL values) either to
SELECT DISTINCT c1 FROM t1 INNER JOIN t2 ON t1.c1 = t2.c2;

# or to
SELECT DISTINCT c1 FROM t1 WHERE c1 IN (SELECT c2 FROM t2);
```
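The equivalence can be checked on a small example. SQLite supports INTERSECT natively, so this sketch (through Python's `sqlite3` module, with made-up, NULL-free data) compares the real operator with the two emulations. SQLite does not accept parentheses around the operands, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (c1 INTEGER);
    CREATE TABLE t2 (c2 INTEGER);
    INSERT INTO t1 VALUES (1), (2), (2), (3);
    INSERT INTO t2 VALUES (2), (3), (3), (4);
    """
)

native = conn.execute(
    "SELECT c1 FROM t1 INTERSECT SELECT c2 FROM t2 ORDER BY 1"
).fetchall()

via_join = conn.execute(
    "SELECT DISTINCT c1 FROM t1 INNER JOIN t2 ON t1.c1 = t2.c2 ORDER BY c1"
).fetchall()

via_in = conn.execute(
    "SELECT DISTINCT c1 FROM t1 WHERE c1 IN (SELECT c2 FROM t2) ORDER BY c1"
).fetchall()

print(native, via_join, via_in)  # each is [(2,), (3,)]
```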

## MINUS

MySQL does not support MINUS (Oracle's name for EXCEPT), and versions before 8.0.31 do not support EXCEPT either, but we can emulate it easily.


```mysql
(SELECT c1 FROM t1) MINUS (SELECT c2 FROM t2);

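# is equivalent to (DISTINCT is needed because MINUS removes duplicate rows)
SELECT DISTINCT c1 FROM t1 LEFT JOIN t2 ON t1.c1 = t2.c2 WHERE t2.c2 IS NULL;

# or to (only if t2.c2 contains no NULL; otherwise NOT IN returns no rows)
SELECT DISTINCT c1 FROM t1 WHERE c1 NOT IN (SELECT c2 FROM t2);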
```
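The sketch below compares the set difference with the two emulations, including the NULL pitfall of NOT IN. It uses SQLite through Python's `sqlite3` module as a stand-in for MySQL; SQLite spells the operator EXCEPT and does not accept parentheses around the operands. The data are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (c1 INTEGER);
    CREATE TABLE t2 (c2 INTEGER);
    INSERT INTO t1 VALUES (1), (1), (2), (3);
    INSERT INTO t2 VALUES (2);
    """
)

native = conn.execute(
    "SELECT c1 FROM t1 EXCEPT SELECT c2 FROM t2 ORDER BY 1"
).fetchall()

# Without DISTINCT the duplicate 1 survives; with it the result matches EXCEPT.
join_plain = conn.execute(
    "SELECT c1 FROM t1 LEFT JOIN t2 ON t1.c1 = t2.c2 WHERE t2.c2 IS NULL ORDER BY c1"
).fetchall()
join_distinct = conn.execute(
    "SELECT DISTINCT c1 FROM t1 LEFT JOIN t2 ON t1.c1 = t2.c2 WHERE t2.c2 IS NULL ORDER BY c1"
).fetchall()

# A single NULL in t2.c2 makes every NOT IN comparison unknown, so no row passes.
conn.execute("INSERT INTO t2 VALUES (NULL)")
not_in_with_null = conn.execute(
    "SELECT DISTINCT c1 FROM t1 WHERE c1 NOT IN (SELECT c2 FROM t2) ORDER BY c1"
).fetchall()

print(native)            # [(1,), (3,)]
print(join_plain)        # [(1,), (1,), (3,)]
print(join_distinct)     # [(1,), (3,)]
print(not_in_with_null)  # []
```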