# 1. **SQL Language Categories**

SQL is divided into four major language categories based on functionality:



## **1. DDL – Data Definition Language**

Used to define and manage the structure of database objects such as tables, schemas, indexes, etc.

| Command      | Description                                                                  |
| ------------ | ---------------------------------------------------------------------------- |
| **CREATE**   | Creates a new database object (table, view, index, etc.).                    |
| **ALTER**    | Modifies the structure of an existing table (e.g., add/drop/rename columns). |
| **DROP**     | Deletes a database object and its data permanently.                          |
| **TRUNCATE** | Removes all rows from a table quickly without logging individual deletions.  |
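
The four DDL commands can be tried in sequence on a throwaway table. This is a minimal sketch assuming MySQL 8.0+; `demo_patients` and its columns are hypothetical names chosen for illustration.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE demo_patients (
  patient_id INT PRIMARY KEY,
  first_name VARCHAR(50) NOT NULL,
  birth_date DATE
);

-- ALTER changes the structure of the existing table.
ALTER TABLE demo_patients ADD COLUMN allergies VARCHAR(100);

-- TRUNCATE empties the table but keeps its definition.
TRUNCATE TABLE demo_patients;

-- DROP removes the table definition together with its data.
DROP TABLE demo_patients;
```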



## **2. DML – Data Manipulation Language**

Deals with data operations such as retrieval, insertion, update, and deletion.

| Command    | Description                                              |
| ---------- | -------------------------------------------------------- |
| **SELECT** | Retrieves data from one or more tables.                  |
| **INSERT** | Adds new rows of data to a table.                        |
| **UPDATE** | Modifies existing data in a table.                       |
| **DELETE** | Deletes specific rows from a table based on a condition. |
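
A short round trip through the four DML commands, as a sketch assuming MySQL 8.0+. The temporary table `demo_scores` and its rows are made up for the example.

```sql
-- Hypothetical scratch table; it disappears when the session ends.
CREATE TEMPORARY TABLE demo_scores (
  id    INT PRIMARY KEY,
  name  VARCHAR(50),
  score INT
);

INSERT INTO demo_scores (id, name, score) VALUES (1, 'Ana', 70), (2, 'Ben', 85);
UPDATE demo_scores SET score = score + 5 WHERE id = 1;  -- Ana: 70 becomes 75
DELETE FROM demo_scores WHERE id = 2;                   -- removes Ben
SELECT id, name, score FROM demo_scores;                -- one row left: (1, 'Ana', 75)
```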



## **3. DCL – Data Control Language**

Handles access control and permissions within the database.

| Command    | Description                                                  |
| ---------- | ------------------------------------------------------------ |
| **GRANT**  | Assigns specific privileges to users (e.g., SELECT, INSERT). |
| **REVOKE** | Removes previously granted privileges from users.            |



## **4. TCL – Transaction Control Language**

Manages transactions to ensure database consistency and integrity.

| Command               | Description                                                |
| --------------------- | ---------------------------------------------------------- |
| **START TRANSACTION** | Begins a new transaction block.                            |
| **COMMIT**            | Saves all changes made during the current transaction.     |
| **ROLLBACK**          | Reverts all changes made in the current transaction.       |
| **SAVEPOINT**         | Defines a checkpoint within a transaction to roll back to. |
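
The sketch below shows how `SAVEPOINT` allows a partial rollback inside one transaction. It assumes MySQL 8.0+ with the default InnoDB engine; `demo_accounts` and the savepoint name `after_debit` are hypothetical.

```sql
CREATE TEMPORARY TABLE demo_accounts (id INT PRIMARY KEY, balance INT);
INSERT INTO demo_accounts VALUES (1, 500), (2, 300);

START TRANSACTION;
UPDATE demo_accounts SET balance = balance - 100 WHERE id = 1;
SAVEPOINT after_debit;                                          -- checkpoint after the debit
UPDATE demo_accounts SET balance = balance + 999 WHERE id = 2;  -- wrong amount
ROLLBACK TO SAVEPOINT after_debit;                              -- undoes only the wrong credit
UPDATE demo_accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

SELECT id, balance FROM demo_accounts;  -- (1, 400), (2, 400)
```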


---
---

# **2. Data Types**

#### **Character Types**

| Type           | Description                          |
| -------------- | ------------------------------------ |
| **CHAR(n)**    | Fixed-length character string.       |
| **VARCHAR(n)** | Variable-length character string.    |
| **TEXT**       | Large text data (long form strings). |

#### **Numeric Types**

| Type                        | Description                                  |
| --------------------------- | -------------------------------------------- |
| **INT / SMALLINT / BIGINT** | Integer types (varying in size).             |
| **DECIMAL(p, s)**           | Fixed-point number with precision and scale. |
| **FLOAT / DOUBLE**          | Approximate floating-point numbers.          |

#### **Date & Time Types**

| Type          | Description                                   |
| ------------- | --------------------------------------------- |
| **DATE**      | Stores date (YYYY-MM-DD).                     |
| **TIME**      | Stores time (HH:MM:SS).                       |
| **DATETIME**  | Combines date and time.                       |
| **TIMESTAMP** | Date and time; in MySQL, stored as UTC and converted to the session time zone. |
| **YEAR**      | Stores a year in 4-digit format.              |

---

#### **Common Date Functions**

| Function                                    | Purpose                                     |
| ------------------------------------------- | ------------------------------------------- |
| **YEAR(date\_column)**                      | Extracts the year from a date.              |
| **MONTH(date\_column)**                     | Extracts the month number.                  |
| **DAY(date\_column)**                       | Extracts the day of the month.              |
| **DAYOFWEEK(date\_column)**                 | Returns the weekday index (1=Sunday).       |
| **DAYNAME(date\_column)**                   | Returns the name of the day (e.g., Monday). |
| **DATE\_ADD(date\_column, INTERVAL n DAY)** | Adds days to a date.                        |
| **DATE\_SUB(date\_column, INTERVAL n DAY)** | Subtracts days from a date.                 |
| **DATEDIFF(date1, date2)**                  | Returns the number of days from date2 to date1 (date1 − date2). |
| **DATE\_FORMAT(date\_column, '%Y-%m-%d')**  | Formats the date in a specific pattern.     |
| **CURDATE()**                               | Returns the current date.                   |
| **NOW()**                                   | Returns the current date and time.          |

> Example usage:
> `AND MONTH(birth_date) IN (2, 5, 12)` → matches February (2), May (5), and December (12)
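
Several of these functions can be checked on literal dates without any table. A minimal sketch assuming MySQL with the default English locale (which `DAYNAME` depends on); the dates are arbitrary.

```sql
SELECT
  YEAR('2024-02-29')                      AS yr,            -- 2024
  MONTH('2024-02-29')                     AS mon,           -- 2
  DAYNAME('2024-02-29')                   AS day_name,      -- Thursday
  DATE_ADD('2024-02-29', INTERVAL 1 DAY)  AS next_day,      -- 2024-03-01
  DATEDIFF('2024-03-10', '2024-02-29')    AS days_between;  -- 10
```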


---
---

# 3. Clauses & Core Syntax 


### **SELECT: Retrieve columns or expressions**

Common operations and functions you can apply to a single column when selecting or filtering (in `SELECT`, `WHERE`, or `HAVING`):



### **Text/String Functions**

| Function                           | Purpose                        |
| ---------------------------------- | ------------------------------ |
| `LOWER(column)`                    | convert text to lowercase      |
| `UPPER(column)`                    | convert text to uppercase      |
| `CONCAT(str1, str2)`               | combine strings                |
| `TRIM(column)`                     | remove leading/trailing spaces |
| `SUBSTRING(column, start, length)` | get part of a string           |
| `LENGTH(column)`                   | length of string (in bytes in MySQL; `CHAR_LENGTH` counts characters) |
| `REPLACE(column, 'old', 'new')`    | replace substring              |



### **Boolean Aggregation Notes**

* `COUNT(gender='M')` counts every row where `gender` is not NULL, because the comparison evaluates to 1 or 0 (both non-NULL) for those rows. It does not count only the males.
* `SUM(gender='M')` treats the boolean as 1 (true) or 0 (false), so it correctly counts how many rows have gender 'M' (MySQL syntax).
* `COUNT(CASE WHEN gender='M' THEN 1 END)` also counts correctly: the `CASE` yields NULL for non-matching rows, and `COUNT()` skips NULLs.
* Use `SUM(gender='M')` or the `CASE` form to count males.
* Avoid `COUNT(gender='M')` for this purpose, because it counts rows regardless of gender.
* The distinction: `COUNT()` counts non-NULL values, while `SUM()` adds the numeric boolean results.
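
The difference between the three forms shows up on a tiny data set. A sketch assuming MySQL 8.0+ (for the `WITH` clause); `demo_patients` is a hypothetical inline table that includes one NULL gender.

```sql
WITH demo_patients AS (
  SELECT 'M' AS gender UNION ALL
  SELECT 'F' UNION ALL
  SELECT 'M' UNION ALL
  SELECT NULL
)
SELECT
  COUNT(*)                                  AS all_rows,         -- 4
  COUNT(gender = 'M')                       AS non_null_gender,  -- 3: every row with a non-NULL gender
  SUM(gender = 'M')                         AS males_sum,        -- 2
  COUNT(CASE WHEN gender = 'M' THEN 1 END)  AS males_case        -- 2
FROM demo_patients;
```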



### **Numeric Functions**

| Function                           | Purpose                       |
| ---------------------------------- | ----------------------------- |
| `ROUND(column, n)`                 | round to n decimal places     |
| `FLOOR(column)`                    | round down to nearest integer |
| `CEIL(column)` / `CEILING(column)` | round up to nearest integer   |
| `ABS(column)`                      | absolute value                |
| `MOD(column, n)`                   | remainder of division         |

Additional summary:

* `FLOOR(value)` returns the largest integer less than or equal to the value (e.g., 10.82 → 10, but -10.82 → -11).
* `ROUND(value, d)` rounds to the nearest value with d decimal places (e.g., 10.82 → 10.8 with 1 decimal, → 11 with 0).



### **Date Functions**

| Function                                       | Purpose                       |
| ---------------------------------------------- | ----------------------------- |
| `YEAR(column)`, `MONTH(column)`, `DAY(column)` | extract parts of date         |
| `DATE_FORMAT(column, '%Y-%m-%d')`              | format date as string         |
| `DATE_ADD(column, INTERVAL n DAY)`             | add days (or other intervals) |
| `DATE_SUB(column, INTERVAL n MONTH)`           | subtract intervals            |
| `DATEDIFF(date1, date2)`                       | difference in days            |
| `CURDATE()`, `NOW()`                           | current date or datetime      |



### **Window Functions & ORDER BY Behavior**

* Even if you use `ORDER BY` at the end of your query to sort the final results, window functions like `LAG()` need their own explicit `ORDER BY` inside the `OVER()` clause to know the sequence for calculating values.
* The `ORDER BY` inside `OVER()` tells `LAG()` which row comes before which, so it can fetch the previous day's count correctly.
* The final `ORDER BY` sorts the entire result set for display.
* They serve two different purposes:

  * `ORDER BY` inside `OVER()` → defines order for window calculation.
  * `ORDER BY` at the end → defines order for the output rows.
* Without the first one, `LAG()` wouldn't know the correct previous row.
* Aggregates such as `SUM()` and `AVG()` can also be used as window functions with `OVER()`.



### **Conditional & Null Handling**

| Function                                         | Description                     |
| ------------------------------------------------ | ------------------------------- |
| `IF(condition, value_if_true, value_if_false)`   | inline if                       |
| `CASE WHEN condition THEN result ELSE other END` | complex conditions              |
| `COALESCE(column, default_value)`                | replace NULL with default       |
| `IFNULL(column, default_value)`                  | same as COALESCE for two values |

Boolean & comparison operations:

* `IS NULL / IS NOT NULL`: check for nulls
* `IN (value1, value2, ...)`: match list of values
* `LIKE 'pattern%'`: pattern matching
* `BETWEEN val1 AND val2`: range check

> A comparison can be selected directly without `CASE`, e.g. `column > 30`. In MySQL it returns 1 (`TRUE`) for rows where the comparison holds, 0 (`FALSE`) where it does not, and NULL when the value is NULL.
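
The conditional helpers above can be compared side by side. A minimal sketch assuming MySQL 8.0+; the inline table `demo` and its columns are hypothetical.

```sql
WITH demo AS (
  SELECT 'Ana' AS name, 42 AS age, NULL AS allergies UNION ALL
  SELECT 'Ben', 17, 'Penicillin'
)
SELECT
  name,
  age > 30                         AS over_30,           -- 1 for Ana, 0 for Ben
  IF(age >= 18, 'adult', 'minor')  AS age_group,         -- 'adult' for Ana, 'minor' for Ben
  COALESCE(allergies, 'None')      AS allergies_or_none  -- 'None' for Ana, 'Penicillin' for Ben
FROM demo;
```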



### **Percentage & Conversion**

Convert percentage to decimal:
`column / 100.0`

Convert decimal to percentage:
`column * 100`

Format decimal as percentage string:
`CONCAT(ROUND(column * 100, 2), '%')`



### **Example usage in SELECT**

```sql
SELECT 
  LOWER(first_name) AS first_lower, 
  ROUND(score, 1) AS score_rounded, 
  COALESCE(allergies, 'None') AS allergies_or_none, 
  CONCAT(ROUND(percentage * 100, 2), '%') AS percent_display 
FROM patients 
WHERE UPPER(status) = 'ACTIVE' 
  AND score >= 75;
```



### **Query Clauses Overview**

| Clause     | Purpose                          |
| ---------- | -------------------------------- |
| `FROM`     | Specify source table(s).         |
| `WHERE`    | Filter rows before grouping.     |
| `GROUP BY` | Group rows sharing values.       |
| `HAVING`   | Filter groups after aggregation. |
| `ORDER BY` | Sort the final result.           |

* `AND` is invalid in `ORDER BY`; use a comma, as in `ORDER BY LENGTH(first_name), first_name`, to sort by multiple columns.
* Each sort key can have its own direction: `ORDER BY allergies ASC, first_name DESC, last_name ASC`.



### **ORDER BY Methods for Numeric Columns**

* Ascending (default): `ORDER BY column_name`
* Descending: `ORDER BY column_name DESC`
* Absolute value (smallest magnitude first): `ORDER BY ABS(column_name)`
* Nulls last (if supported): `ORDER BY column_name ASC NULLS LAST`



### **ORDER BY Methods for String Columns**

* Alphabetical A–Z (default): `ORDER BY column_name`
* Reverse alphabetical Z–A: `ORDER BY column_name DESC`
* By length of string: `ORDER BY LENGTH(column_name)`
* Case-insensitive sort: `ORDER BY LOWER(column_name)`
* Custom substring sort (e.g., by last 3 characters): `ORDER BY RIGHT(column_name, 3)`
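
Several sort keys can be combined with commas. The sketch below sorts by length first and then case-insensitively by name; it assumes MySQL 8.0+, and the inline table `demo` is hypothetical.

```sql
WITH demo AS (
  SELECT 'bob' AS first_name UNION ALL
  SELECT 'Alexandra' UNION ALL
  SELECT 'amy' UNION ALL
  SELECT 'Zed'
)
SELECT first_name
FROM demo
ORDER BY LENGTH(first_name), LOWER(first_name);
-- Resulting order: amy, bob, Zed, Alexandra
```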


### **Other Clauses**

* `LIMIT` / `TOP`: Restrict row count.
* `DISTINCT`: Remove duplicate rows.
* `OVER (PARTITION BY ...)`: Define window partitions for window functions.



---
---


# 4. Operators & Expressions 


### **Comparison Operators**

| Operator     | Meaning                  |
| ------------ | ------------------------ |
| `=`          | Equal                    |
| `!=` or `<>` | Not equal                |
| `>`          | Greater than             |
| `<`          | Less than                |
| `>=`         | Greater than or equal to |
| `<=`         | Less than or equal to    |



### **Logical Operators**

| Operator | Description                                    |
| -------- | ---------------------------------------------- |
| `AND`    | Combine multiple conditions (all must be true) |
| `OR`     | At least one condition must be true            |
| `NOT`    | Negates the condition                          |

* Use `WHERE allergies IN ('Penicillin', 'Morphine')` to filter for either value; it is shorthand for `allergies = 'Penicillin' OR allergies = 'Morphine'`.



### **IN Operator**

* Checks if a value is in a list:
  `x IN (a, b, c)`

* ✅ Correct: `IN`

* ❌ Incorrect: `IS IN` (will cause a **syntax error**)

**Clarification**:

* `IS` is used for NULL tests (`IS NULL`, `IS NOT NULL`) and, in MySQL, for boolean tests such as `IS TRUE`.
* It is **not** used with `IN`.



### **Other Operators**

| Operator          | Description                                                                      |
| ----------------- | -------------------------------------------------------------------------------- |
| `BETWEEN a AND b` | Inclusive range                                                                  |
| `LIKE`            | Pattern matching using `%` (any number of characters) and `_` (single character) |
| `IS NULL`         | Tests if value is null                                                           |
| `IS NOT NULL`     | Tests if value is not null                                                       |



---
---

# 5. Pattern Matching Wildcards 


| Symbol      | Meaning                                              | Example                                                  |
| ----------- | ---------------------------------------------------- | -------------------------------------------------------- |
| `%`         | Zero or more characters                              | `'a%z'` → Starts with 'a', ends with 'z'                 |
| `_`         | Exactly one character                                | `'m___k'` → Starts with 'm', 5 characters, ends with 'k' |
| `'%cat%'`   | Contains 'cat' anywhere                              |                                                          |
| `'%ing'`    | Ends with 'ing'                                      |                                                          |
| `'pre%ed'`  | Starts with 'pre', ends with 'ed'                    |                                                          |
| `'b___'`    | Exactly 4 characters, starts with 'b'                |                                                          |
| `'x%x'`     | Starts and ends with 'x', length ≥ 2                 |                                                          |
| `'s____%s'` | Starts and ends with 's', at least 6 characters      |                                                          |
| `'t____r'`  | Exactly 6 characters, starts with 't', ends with 'r' |                                                          |
| `'%\_%'`    | Contains underscore character literally (use escape) |                                                          |

**SQL Server only:**

* `[abc]`, `[^abc]`, `[a-z]`

**Notes:**

* LIKE in SQL can only match simple patterns using `%` and `_`.
* It **cannot** validate character sets, enforce exact counts, or exclude invalid characters like `#` or multiple `@`.
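
A few of the patterns from the table can be tested against sample strings. A sketch assuming MySQL 8.0+ with default settings, where backslash is the `LIKE` escape character; the inline table `words` is hypothetical.

```sql
WITH words AS (
  SELECT 'xx' AS w UNION ALL
  SELECT 'xerox' UNION ALL
  SELECT 'tailor' UNION ALL
  SELECT 'snake_case'
)
SELECT
  w,
  w LIKE 'x%x'     AS starts_ends_x,   -- 1 for 'xx' and 'xerox' (% may match zero characters)
  w LIKE 't____r'  AS t_four_any_r,    -- 1 for 'tailor' (t + 4 characters + r)
  w LIKE '%\_%'    AS has_underscore   -- 1 for 'snake_case' only
FROM words;
```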

---

### **REGEXP (Regular Expressions in SQL)**

**Purpose:** Match complex string patterns using rules for character sets, positions, repetitions, etc.

**Example Usage:**

```sql
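-- In a MySQL string literal the backslash is doubled so the regex engine receives \. (a literal dot).
REGEXP '^[A-Za-z][A-Za-z0-9._-]*@leetcode\\.com$'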
```



### **General Regex Components and Their Meaning**

| Symbol / Construct | Meaning / Rule                        | Example                          |
| ------------------ | ------------------------------------- | -------------------------------- |
| `^`                | Start of the string                   | `^abc` matches "abc" at start    |
| `$`                | End of the string                     | `xyz$` matches "xyz" at end      |
| `.`                | Any single character (except newline) | `a.c` matches "abc", "a1c"       |
| `[abc]`            | Any one character inside brackets     | Matches 'a', 'b', or 'c'         |
| `[a-z]`            | Any one character in range            | Matches any lowercase letter     |
| `[^abc]`           | Any one character NOT in set          | `[^0-9]` matches any non-digit   |
| `*`                | Zero or more of preceding token       | `a*` → "", "a", "aa"             |
| `+`                | One or more of preceding token        | `a+` → "a", "aa" (not "")        |
| `?`                | Zero or one of preceding token        | `a?` → "" or "a"                 |
| `{n}`              | Exactly n times of preceding token    | `a{3}` → "aaa"                   |
| `{n,}`             | At least n times                      | `a{2,}` → "aa", "aaa", ...       |
| `{n,m}`            | Between n and m times                 | `a{1,3}` → "a", "aa", "aaa"      |
| `\`                | Escape special character              | `\.` ‚Üí literal "."               |
| `\d`               | Digit `[0-9]`                         | `\d+` ‚Üí one or more digits       |
| `\D`               | Non-digit                             | `\D+` ‚Üí one or more non-digits   |
| `\w`               | Word character `[a-zA-Z0-9_]`         | `\w+` ‚Üí letters, digits, \_      |
| `\W`               | Non-word character                    | `\W+` ‚Üí anything not in \w       |
| `\s`               | Whitespace character                  | `\s+` ‚Üí spaces, tabs             |
| `\S`               | Non-whitespace character              | `\S+` ‚Üí any non-space character  |
| `(abc)`            | Grouping / capture                    | Captures "abc" for backreference |
| `\|`               | Alternation (OR)                      | Matches either side              |



### **How These Build Up Rules in Regex**

* **Anchors** (`^`, `$`) force match to start/end of string.
* **Character classes** (`[ ... ]`) specify allowed or disallowed characters.
* **Quantifiers** (`*`, `+`, `{n,m}`) control how many times something appears.
* **Escape sequences** (`\.`) treat special characters literally.
* **Groups and alternations** allow complex logical patterns.



### **Example: Email Regex Breakdown**

| Component         | Meaning                                       |
| ----------------- | --------------------------------------------- |
| `^[A-Za-z]`       | Start with a letter                           |
| `[A-Za-z0-9._-]*` | Zero or more of allowed characters after that |
| `@leetcode\.com$` | Must end with "@leetcode.com" (dot escaped)   |
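
The breakdown above can be exercised in a full query. A minimal sketch assuming MySQL 8.0+; the inline table `users`, the column `mail`, and the sample addresses are hypothetical. The backslash is doubled because the pattern is written as a MySQL string literal.

```sql
WITH users AS (
  SELECT 'winston@leetcode.com' AS mail UNION ALL
  SELECT '.shapiro@leetcode.com' UNION ALL
  SELECT 'jonathan@leetcodeXcom' UNION ALL
  SELECT 'sally.come@leetcode.com'
)
SELECT mail
FROM users
WHERE mail REGEXP '^[A-Za-z][A-Za-z0-9._-]*@leetcode\\.com$';
-- Matches winston@leetcode.com and sally.come@leetcode.com.
-- '.shapiro@...' fails the leading-letter rule; '...leetcodeXcom' fails the literal dot.
```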




---
---

# 6. Window Functions 

#### **🏅 Ranking Functions**

* `ROW_NUMBER()`: Assigns a unique sequential number to each row within a partition.
* `RANK()`: Assigns rank with gaps for ties.
* `DENSE_RANK()`: Assigns rank without gaps for ties.



#### **📦 Bucketing Function**

* `NTILE(N)`: Divides rows into **N** approximately equal-sized groups or buckets.
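
The ranking and bucketing functions differ only in how they treat ties, which is easiest to see on data with a tie. A sketch assuming MySQL 8.0+; the inline table `scores` is hypothetical, and the alias `rnk` is used because `RANK` is a reserved word there.

```sql
WITH scores AS (
  SELECT 'Ana' AS name, 90 AS score UNION ALL
  SELECT 'Ben', 90 UNION ALL
  SELECT 'Cy', 80 UNION ALL
  SELECT 'Di', 70
)
SELECT
  name,
  score,
  ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,    -- 1, 2, 3, 4
  RANK()       OVER (ORDER BY score DESC)       AS rnk,        -- 1, 1, 3, 4 (gap after the tie)
  DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk,  -- 1, 1, 2, 3 (no gap)
  NTILE(2)     OVER (ORDER BY score DESC, name) AS half        -- 1, 1, 2, 2
FROM scores
ORDER BY score DESC, name;
```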



#### **📍 Navigation Functions**

* `LAG()`: Returns the value from the previous row within the same partition.
* `LEAD()`: Returns the value from the next row within the same partition.



#### **🧠 Important Clarification (ORDER BY in Window Functions)**

> Even if you use `ORDER BY` at the end of your query to sort the final results, window functions like `LAG()` need their own explicit `ORDER BY` inside the `OVER()` clause to know the sequence for calculating values.

* The `ORDER BY` inside `OVER()` tells `LAG()` which row comes before which, so it can fetch the previous day's count correctly.
* The final `ORDER BY` sorts the **entire result set** for display.

They serve two different purposes:

| Purpose                    | Description                                   |
| -------------------------- | --------------------------------------------- |
| `ORDER BY` inside `OVER()` | Defines the **order for window calculation**. |
| Final `ORDER BY`           | Defines the **order of output rows**.         |

* Without the first one, `LAG()` wouldn't know the correct previous row.
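
The two `ORDER BY` clauses can even point in opposite directions. In this sketch (MySQL 8.0+; `daily_counts`, `visit_day`, and `visits` are hypothetical names), `LAG()` walks the days in ascending order while the output is displayed newest first.

```sql
WITH daily_counts AS (
  SELECT DATE('2024-01-01') AS visit_day, 10 AS visits UNION ALL
  SELECT DATE('2024-01-02'), 15 UNION ALL
  SELECT DATE('2024-01-03'), 12
)
SELECT
  visit_day,
  visits,
  LAG(visits) OVER (ORDER BY visit_day)           AS prev_visits,    -- Jan 1: NULL, Jan 2: 10, Jan 3: 15
  visits - LAG(visits) OVER (ORDER BY visit_day)  AS change_vs_prev  -- Jan 1: NULL, Jan 2: 5,  Jan 3: -3
FROM daily_counts
ORDER BY visit_day DESC;  -- display order only; LAG still follows ascending visit_day
```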



#### **📊 Aggregates with OVER()**

* `SUM() OVER (...)`
* `AVG() OVER (...)`
* Other aggregate functions can also be used with `OVER()` to compute rolling or partition-based summaries.
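
A running total and a per-partition average, as a sketch assuming MySQL 8.0+. The inline table `sales` and its columns are hypothetical.

```sql
WITH sales AS (
  SELECT 'east' AS region, 1 AS day_no, 100 AS amount UNION ALL
  SELECT 'east', 2, 50 UNION ALL
  SELECT 'west', 1, 70 UNION ALL
  SELECT 'west', 2, 30
)
SELECT
  region,
  day_no,
  amount,
  -- ORDER BY inside OVER() makes the sum cumulative within each region.
  SUM(amount) OVER (PARTITION BY region ORDER BY day_no) AS running_total,  -- east: 100, 150; west: 70, 100
  -- Without ORDER BY the aggregate covers the whole partition.
  AVG(amount) OVER (PARTITION BY region)                 AS region_avg      -- east: 75; west: 50
FROM sales
ORDER BY region, day_no;
```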


---
---

# 7. Aggregate Functions 

- Aggregation functions can be used without GROUP BY to perform calculations over the entire result set. 

- `COUNT()`: Count rows. For example, to find first names that occur exactly once:

    Pandas:
    ```python
    # df is assumed to be a DataFrame with a 'first_name' column
    df[df['first_name'].map(df['first_name'].value_counts()) == 1]
    ```
    SQL:
    ```sql
    SELECT first_name
    FROM patients
    GROUP BY first_name
    HAVING COUNT(*) = 1;
    ```
    Or:
    ```sql
    SELECT first_name
    FROM (SELECT first_name, COUNT(first_name) AS occurrence
          FROM patients
          GROUP BY first_name) AS name_counts  -- a derived table needs an alias in MySQL
    WHERE occurrence = 1;
    ```
    All three do the same thing: they keep only the `first_name` values that occur exactly once in the table or DataFrame.

 

- SUM(): Sum values. 

- AVG(): Average. 

- MIN(), MAX(): Minimum, maximum. 

---
---

# 8. Control Flow 

- `CASE WHEN ... THEN ... ELSE ... END`: conditional logic.
    ```sql
    -- ELSE 0 makes the sum 0 rather than NULL when no row matches.
    SELECT
      SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) AS male_count,
      SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) AS female_count
    FROM patients;
    ```

- IF(expr, true, false) shorthand in MySQL. 

- IFNULL(a,b), NULLIF(a,b) null handling. 
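
The null-handling helpers can be checked on literals. A minimal sketch assuming MySQL; the column aliases are illustrative only.

```sql
SELECT
  IFNULL(NULL, 'fallback')  AS ifnull_demo,   -- 'fallback'
  NULLIF(5, 5)              AS nullif_equal,  -- NULL, because both arguments are equal
  NULLIF(5, 3)              AS nullif_diff,   -- 5
  100 / NULLIF(0, 0)        AS safe_divide;   -- NULL: the zero divisor is turned into NULL first
```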


---
---

# 9. Common Functions 

- String: `CONCAT()`
    (Some SQL dialects use `||` as the concatenation operator instead of, or in addition to, the `CONCAT` function.)

    In MySQL, you can't use `+` to concatenate strings. Use `CONCAT()` instead.

    Correct query: 
    ```sql
    SELECT CONCAT(UPPER(last_name), ',', LOWER(first_name)) AS full_name 
    FROM patients 
    ORDER BY first_name DESC; 
    ```
- REPLACE(), LENGTH(). 

- Numeric: ROUND(), CEIL(), FLOOR(). 

- Date/Time: NOW(), CURDATE(), DATEDIFF(), DATE_ADD().

---
---

# 10. Permissions & Transactions 

    GRANT SELECT, INSERT ON db.table TO 'user';
    REVOKE UPDATE ON db.table FROM 'user';
    START TRANSACTION;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    COMMIT;

 

---
---


# 11. Joins & Unions

## 🔗 JOINS

**Purpose:** Combine rows from two or more tables based on related columns.



### **Types of Joins**

| Type         | Description                                                             |
| ------------ | ----------------------------------------------------------------------- |
| `INNER JOIN` | Returns matching rows from both tables.                                 |
| `LEFT JOIN`  | Returns all rows from the **left** table + matched rows from the right. |
| `RIGHT JOIN` | Returns all rows from the **right** table + matched rows from the left. |
| `FULL JOIN`  | Returns all rows from both tables, matched or not.                      |
| `CROSS JOIN` | Produces the Cartesian product (all combinations of both tables).       |
| `SELF JOIN`  | A table is joined to itself using aliases.                              |
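
The difference between `INNER JOIN` and `LEFT JOIN` is clearest when one row has no match. A sketch assuming MySQL 8.0+; `demo_patients` and `demo_admissions` are hypothetical inline tables.

```sql
WITH demo_patients AS (
  SELECT 1 AS patient_id, 'Ana' AS first_name UNION ALL
  SELECT 2, 'Ben' UNION ALL
  SELECT 3, 'Cy'
),
demo_admissions AS (
  SELECT 1 AS patient_id, 'Flu' AS diagnosis UNION ALL
  SELECT 1, 'Asthma' UNION ALL
  SELECT 3, 'Epilepsy'
)
SELECT p.first_name, COUNT(a.patient_id) AS admissions
FROM demo_patients p
LEFT JOIN demo_admissions a ON a.patient_id = p.patient_id
GROUP BY p.patient_id, p.first_name
ORDER BY p.patient_id;
-- Ana 2, Ben 0, Cy 1. With INNER JOIN, Ben (no admissions) would not appear at all.
```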



### 💡 Performance Tip:

> JOINs are often faster than equivalent subqueries because they let the database **optimize data retrieval using indexes** and avoid repeated subquery executions.

**However**, subqueries may perform well when:

* Using **non-correlated subqueries**
* Doing **existence checks** (`EXISTS`)
* Filtering **aggregates**

🧪 Always **test and analyze query plans**, since performance depends on:

* Data size
* Indexes
* The database engine's optimizer



### 🔁 Alternative Join Syntax

```sql
SELECT first_name, last_name, COUNT(*)
FROM doctors p, admissions a
WHERE a.attending_doctor_id = p.doctor_id
GROUP BY p.doctor_id;
```



### 🧬 Multi-Table Join Strategy

> To join multiple tables:
> Use **JOIN clauses one after another** with **ON conditions** linking related keys.

For example:
If **B** is the common link between **A** and **C**, use:

```sql
SELECT ...
FROM B
JOIN A ON ...
JOIN C ON ...
```



## 📚 UNIONS

**Purpose:** Combine result sets from multiple `SELECT` statements.



### Types:

| Keyword     | Description                               |
| ----------- | ----------------------------------------- |
| `UNION`     | Merges results and **removes duplicates** |
| `UNION ALL` | Merges results **including duplicates**   |
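
The duplicate handling can be seen with literal rows alone. A minimal sketch; the values are made up.

```sql
SELECT 'Ana' AS first_name
UNION
SELECT 'Ana'
UNION
SELECT 'Ben';       -- 2 rows: Ana, Ben (duplicate removed)

SELECT 'Ana' AS first_name
UNION ALL
SELECT 'Ana'
UNION ALL
SELECT 'Ben';       -- 3 rows: Ana, Ana, Ben (duplicates kept)
```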


## 🧠 CTE – Common Table Expression

**Definition:**
A CTE is a temporary, named result set defined using `WITH`.
It helps break down complex queries into readable blocks.

### ✅ Benefits:

* Improves **readability**
* Supports **reusability**
* Avoids **deep nesting** of subqueries



### 📌 Example:

Get patients diagnosed with epilepsy whose attending doctor's first name is Lisa:

```sql
WITH epilepsy_patients AS (
  SELECT a.patient_id, p.first_name, p.last_name, a.attending_doctor_id
  FROM patients p
  JOIN admissions a ON p.patient_id = a.patient_id
  WHERE a.diagnosis = 'Epilepsy'
)
SELECT ep.patient_id, ep.first_name, ep.last_name, d.specialty
FROM epilepsy_patients ep
JOIN doctors d ON ep.attending_doctor_id = d.doctor_id
WHERE d.first_name = 'Lisa';
```


---
---

# 12. Confusion Notes


---

## 🧹 **Data Modification Commands**

| Command    | Purpose                                                               |
| ---------- | --------------------------------------------------------------------- |
| `DELETE`   | Removes selected rows (can have `WHERE`), rollbackable, logged.       |
| `TRUNCATE` | Removes all rows, **no WHERE**, faster, not rollbackable in many DBs. |
| `DROP`     | Deletes the entire table structure (schema + data).                   |

---

## 🧾 **Insert vs Update**

* `INSERT`: Adds new rows to a table.
* `UPDATE`: Modifies existing rows based on condition.

---

## 🔑 **Primary Key vs Foreign Key**

| Constraint      | Duplicates | Nulls | Use Case                                |
| --------------- | ---------- | ----- | --------------------------------------- |
| **Primary Key** | ❌ No       | ❌ No  | Uniquely identifies each row            |
| **Foreign Key** | ✅ Yes      | ✅ Yes | References primary key in another table |

🧠 *Example:*

* `Customers.customer_id`: **Primary key** → unique.
* `Orders.customer_id`: **Foreign key** → can repeat.

---

## 📊 **Group By, Aggregation, and Logical Execution Order**

🧮 **Important Execution Order (behind the scenes):**

1. `FROM`
2. `JOIN`
3. `WHERE`
4. `GROUP BY`
5. `HAVING`
6. `SELECT`
7. `ORDER BY`
8. `LIMIT`

> 🔍 **Logical evaluation differs from writing order**. For example:

```sql
SELECT FLOOR(weight / 10) * 10 AS weight_group, COUNT(*)
FROM patients
GROUP BY FLOOR(weight / 10) * 10;
```

* `FLOOR(weight / 10) * 10` is computed before `GROUP BY` despite appearing in `SELECT`.

---

## 🎛 **Where vs Having**

| Clause   | Filters...              | Can use aggregates? |
| -------- | ----------------------- | ------------------- |
| `WHERE`  | Before grouping (rows)  | ❌ No                |
| `HAVING` | After grouping (groups) | ✅ Yes               |
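
Both filters can appear in the same query, each at its own stage. A sketch assuming MySQL 8.0+; `demo_admissions` is a hypothetical inline table.

```sql
WITH demo_admissions AS (
  SELECT 1 AS patient_id, 'Flu' AS diagnosis UNION ALL
  SELECT 1, 'Flu' UNION ALL
  SELECT 2, 'Flu' UNION ALL
  SELECT 2, 'Asthma' UNION ALL
  SELECT 2, 'Asthma'
)
SELECT patient_id, diagnosis, COUNT(*) AS times
FROM demo_admissions
WHERE diagnosis <> 'Flu'         -- row filter, applied before grouping
GROUP BY patient_id, diagnosis
HAVING COUNT(*) > 1;             -- group filter, applied after aggregation
-- Returns one row: (2, 'Asthma', 2)
```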

---

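## ⚖ **Integer Division & Casting**

* `175 / 100` = `1` in engines that use integer division for integer operands (e.g., SQL Server, PostgreSQL, SQLite). In MySQL, `/` returns `1.7500`; use `DIV` for integer division.
* `175 / 100.0` = `1.75` (decimal)
* Use `CAST(height AS FLOAT)` or divide by `100.0` for BMI.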
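
In MySQL specifically, the two division operators can be compared directly. A sketch with made-up numbers (height 175 cm, weight 70 kg); other engines may return different results for the first column.

```sql
SELECT
  175 / 100                    AS slash_division,    -- MySQL: a decimal, about 1.75
  175 DIV 100                  AS integer_division,  -- 1
  70 / POWER(175 / 100.0, 2)   AS bmi;               -- about 22.86 (70 / 1.75^2)
```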

---

## 👥 **Gender Counts: Multiple Methods**

```sql
SELECT  
  (SELECT COUNT(*) FROM patients WHERE gender='M') AS male_count,  
  (SELECT COUNT(*) FROM patients WHERE gender='F') AS female_count;

SELECT  
  SUM(gender = 'M') AS male_count,  
  SUM(gender = 'F') AS female_count  
FROM patients;

SELECT  
  SUM(CASE WHEN gender = 'M' THEN 1 END) AS male_count,  
  SUM(CASE WHEN gender = 'F' THEN 1 END) AS female_count  
FROM patients;
```

---

## 📌 **Miscellaneous Important Notes**

* `COUNT()` **never returns NULL**; it returns 0 when nothing matches.

* A row filter such as `WHERE allergies IS NOT NULL` goes before `GROUP BY`; to filter on an aggregate result, use `HAVING`.

* Constant (literal) column in `SELECT`:

```sql
SELECT first_name, last_name, 'Doctor' AS role FROM doctors;
```

---

## 🔍 **Query Examples**

### ➤ **Most recent admission for patient 542:**

```sql
SELECT *
FROM admissions
WHERE patient_id = 542
ORDER BY admission_date DESC
LIMIT 1;
```

### ➤ **Duplicates by name:**

```sql
SELECT first_name, last_name, COUNT(*) AS duplicate_count
FROM patients
GROUP BY first_name, last_name
HAVING COUNT(*) > 1;
```

### ➤ **Patients not in admissions:**

```sql
SELECT p.patient_id, p.first_name, p.last_name
FROM patients p
LEFT JOIN admissions a ON p.patient_id = a.patient_id
WHERE a.patient_id IS NULL;
```

### ➤ **Ontario always first in province list:**

```sql
SELECT name
FROM provinces
ORDER BY (name != 'Ontario'), name ASC;
```

---

## 🧠 **DISTINCT Note**

* `DISTINCT` applies to the **entire row** of selected columns, not to an individual column.

---

## 🔄 **Join Alternatives**

You can **omit `JOIN` clause** and use `WHERE`:

```sql
SELECT ...
FROM table1 t1, table2 t2
WHERE t1.id = t2.id;
```

With the comma syntax there is no `ON` clause, so the tables are combined as a **CROSS JOIN** and the `WHERE` condition acts as the join condition.

---

## 📐 **Efficiency Tips**

Typical order from fastest to slowest (always confirm with the query plan):

1. Joins
2. Non-correlated subqueries
3. Correlated subqueries

---

## 🔍 **Window Functions Review**

### ➤ **ROW\_NUMBER():**

Assigns a unique sequential number to each row, e.g., for sequencing:

```sql
ROW_NUMBER() OVER (ORDER BY id) AS row_num
```

### ➤ **PARTITION BY vs GROUP BY:**

| Feature     | GROUP BY                  | PARTITION BY                    |
| ----------- | ------------------------- | ------------------------------- |
| Output Rows | Reduces (1 row per group) | Maintains original row count    |
| Aggregates  | Required                  | Optional with window functions  |
| Usage       | SUM, COUNT                | RANK, ROW\_NUMBER, FIRST\_VALUE |

---

## ⏳ **Null Rules Summary**

* NULL means "unknown"
* NULL = anything → NULL (treated as false in `WHERE`)
* Use `IS NULL` or `IS NOT NULL`
* `NULL < 1000` → NULL (not TRUE, so the row is filtered out in `WHERE`)
* In expressions: `NULL + 1` = NULL
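
These rules can be verified on literals. A minimal sketch assuming MySQL, where booleans display as 1 and 0; the aliases are illustrative.

```sql
SELECT
  NULL = NULL        AS eq_null,     -- NULL, not 1
  NULL < 1000        AS lt_null,     -- NULL
  NULL + 1           AS plus_null,   -- NULL
  NULL IS NULL       AS is_null,     -- 1
  COALESCE(NULL, 0)  AS coalesced;   -- 0
```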

---

## 📅 **Date Comparisons**

### ➤ **Compare with previous day:**

```sql
ON w1.recordDate = w2.recordDate + INTERVAL 1 DAY
```

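The earliest date has no previous-day row to match, so an inner join leaves it out automatically. With a `LEFT JOIN`, its `w2` columns would be NULL, and a `WHERE` comparison on them would exclude it.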
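
Placed in a complete query, the join condition looks like the sketch below. It assumes MySQL 8.0+; the table name `Weather`, the `temperature` column, and the sample rows are assumptions added for illustration (only `w1`, `w2`, and `recordDate` come from the snippet above).

```sql
WITH Weather AS (
  SELECT 1 AS id, DATE('2015-01-01') AS recordDate, 10 AS temperature UNION ALL
  SELECT 2, DATE('2015-01-02'), 25 UNION ALL
  SELECT 3, DATE('2015-01-03'), 20 UNION ALL
  SELECT 4, DATE('2015-01-04'), 30
)
SELECT w1.id
FROM Weather w1
JOIN Weather w2
  ON w1.recordDate = w2.recordDate + INTERVAL 1 DAY  -- w2 is the day before w1
WHERE w1.temperature > w2.temperature;
-- Returns ids 2 and 4: the days that were warmer than the previous day.
```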

---

## 🧨 **Nested Aggregates?**

* ❌ You **cannot use aggregate functions inside another aggregate** (like `SUM(MAX(...))`). Compute the inner aggregate in a subquery or CTE, then aggregate its result.
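
One common workaround is to aggregate in two steps. A sketch assuming MySQL 8.0+; `demo_admissions`, `doctor_id`, and the derived-table alias `per_doctor` are hypothetical.

```sql
WITH demo_admissions AS (
  SELECT 1 AS doctor_id UNION ALL
  SELECT 1 UNION ALL
  SELECT 1 UNION ALL
  SELECT 2
)
SELECT MAX(per_doctor.admission_count) AS busiest_doctor_count  -- 3
FROM (
  -- Step 1: the inner aggregate, one row per doctor.
  SELECT doctor_id, COUNT(*) AS admission_count
  FROM demo_admissions
  GROUP BY doctor_id
) AS per_doctor;  -- Step 2: the outer aggregate runs over the step-1 rows.
```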

