# NOTES

### general notes

this notebook is intended to be used to practice SQL (MySQL) through individual questions, as well as gather notes on common topics, best practices, and any information that is new to me. studying SQL courses does not seem to be very effective for various reasons (slow, repetitive but information doesn't seem to stick, boring); trying this approach instead. will attempt to implement what is learned in projects further down the line. 

have not currently found a way to format this notebook nicely, so mostly only using markdown cells. doesn't look great but will do for now

1. problems denoted by * proved to be quite challenging for me upon first attempt
2. problems denoted by ** are exceedingly challenging and will require a lot of revision
3. problems denoted by @ remain incomplete
4. in leetcode, problems with notes in green are first attempts/solutions or reworks after failure. those with notes in blue are review

### study notes

1. study aggregate functions (COUNT(), MAX(), AVG(), etc.) in more depth, along with how they interact with the GROUP BY and ORDER BY clauses
2. might be worth just looking into GROUP BY by itself, to really understand what it does to the data

## Easy Problems

### 175. Combine Two Tables

#### <u>description</u>

Table: Person

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | personId    | int     |
    | lastName    | varchar |
    | firstName   | varchar |
    +-------------+---------+

personId is the primary key (column with unique values) for this table.
This table contains information about the ID of some persons and their first and last names.

Table: Address

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | addressId   | int     |
    | personId    | int     |
    | city        | varchar |
    | state       | varchar |
    +-------------+---------+

addressId is the primary key (column with unique values) for this table.
Each row of this table contains information about the city and state of one person with ID = PersonId.

Write a solution to report the first name, last name, city, and state of each person in the Person table. If the address of a personId is not present in the Address table, report null instead.

Return the result table in any order.

#### <u>attempts</u>

NOTE: need a left join

    SELECT person.firstName, person.LastName, address.city, address.state
    FROM Person person
    LEFT JOIN Address address ON person.personId = address.personId 
    
status = success

#### <u>solutions</u>

from someone on leetcode, haven't seen this before

    SELECT firstname, lastname, city, state FROM person
    LEFT JOIN address USING(personid)

#### <u>notes</u>

on joins: To combine two SQL tables based on a common ID, you typically use a JOIN operation. The most common types of joins are INNER JOIN, LEFT JOIN (or LEFT OUTER JOIN), RIGHT JOIN (or RIGHT OUTER JOIN), and FULL JOIN (or FULL OUTER JOIN). Here's a brief explanation of each:

1. INNER JOIN: Combines rows from both tables only when there is a match on the specified column (common ID). Rows without matching values in either table are excluded.

2. LEFT JOIN (LEFT OUTER JOIN): Combines all rows from the left table and the matched rows from the right table. If there is no match, NULL values are returned for columns from the right table.

3. RIGHT JOIN (RIGHT OUTER JOIN): Combines all rows from the right table and the matched rows from the left table. If there is no match, NULL values are returned for columns from the left table.

4. FULL JOIN (FULL OUTER JOIN): Combines all rows from both tables. Rows are joined where there is a match; rows without a match in the other table are still included, with NULL values for the columns of the table that lacks a match.

Example of LEFT JOIN:

    SELECT a.id, a.column1, a.column2, b.column3, b.column4
    FROM table1 a
    LEFT JOIN table2 b ON a.id = b.id;
    
In this query:

* table1 and table2 are the tables you want to combine.
* a and b are aliases for table1 and table2, respectively.
* id is the common column in both tables.

Example of FULL JOIN:

MySQL does not support FULL JOIN. The usual workaround is to run a LEFT JOIN and a RIGHT JOIN over the same tables and combine the two results with a UNION:

    SELECT a.id, a.column1, a.column2, b.column3, b.column4
    FROM table1 a
    LEFT JOIN table2 b ON a.id = b.id
    UNION
    SELECT a.id, a.column1, a.column2, b.column3, b.column4
    FROM table1 a
    RIGHT JOIN table2 b ON a.id = b.id;
   
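**the above code on the FULL JOIN is from GPT and has not been tested in MySQL... take with a grain of salt. two things to watch for: rows that exist only in table2 come back with a.id as NULL (select COALESCE(a.id, b.id) if the id is needed), and UNION removes duplicate rows, so rows that were genuinely duplicated in the source tables get collapsed as well**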
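To see the join types side by side, here is a small sketch using Python's built-in sqlite3 module. It is an assumption-heavy stand-in for MySQL: the engine is SQLite, the rows are made up, and the RIGHT JOIN half of the emulation is written as a LEFT JOIN with the tables swapped, which should be equivalent but is not the exact query above.

```python
import sqlite3

# In-memory SQLite database with two tiny, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, column1 TEXT);
    CREATE TABLE table2 (id INTEGER, column3 TEXT);
    INSERT INTO table1 VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO table2 VALUES (2, 'b2'), (3, 'b3');
""")

# INNER JOIN: only id 2 exists in both tables.
inner_rows = conn.execute("""
    SELECT a.id, a.column1, b.column3
    FROM table1 a
    INNER JOIN table2 b ON a.id = b.id
""").fetchall()

# LEFT JOIN: every row of table1, NULL (None) where table2 has no match.
left_rows = conn.execute("""
    SELECT a.id, a.column1, b.column3
    FROM table1 a
    LEFT JOIN table2 b ON a.id = b.id
    ORDER BY a.id
""").fetchall()

# FULL JOIN emulation: a left join in each direction, glued with UNION.
# COALESCE keeps the id for rows that exist only in table2.
full_rows = conn.execute("""
    SELECT COALESCE(a.id, b.id) AS id, a.column1, b.column3
    FROM table1 a
    LEFT JOIN table2 b ON a.id = b.id
    UNION
    SELECT COALESCE(a.id, b.id) AS id, a.column1, b.column3
    FROM table2 b
    LEFT JOIN table1 a ON a.id = b.id
    ORDER BY 1
""").fetchall()

print(inner_rows)
print(left_rows)
print(full_rows)
```

With this data the emulated full join returns three rows: id 1 only from table1, id 2 matched in both, id 3 only from table2.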

### 181. Employees Earning More Than Their Managers

#### <u>description</u>

Table: Employee

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | id          | int     |
    | name        | varchar |
    | salary      | int     |
    | managerId   | int     |
    +-------------+---------+

id is the primary key (column with unique values) for this table.
Each row of this table indicates the ID of an employee, their name, salary, and the ID of their manager.
 

Write a solution to find the employees who earn more than their managers.

Return the result table in any order

#### <u>attempts</u>

NOTE: aliases didn't seem necessary in problem 175, but this problem seems to highlight their importance

    SELECT 
        e1.name AS Employee
    FROM 
        Employee e1
    JOIN 
        Employee e2 ON e1.managerId = e2.id
    WHERE 
        e1.salary > e2.salary;
        
status = success

#### <u>solutions</u>

from leetcode, slightly different syntax. the comma in the FROM clause is still a join (an implicit one); the join condition just moves into the WHERE clause

    SELECT e1.name AS Employee
    FROM Employee e1, Employee e2
    WHERE e1.managerId = e2.id AND e1.salary > e2.salary;

#### <u>notes</u>

explanation of the attempt: To solve this problem, you can use a self-join on the Employee table to compare each employee's salary with their manager's salary.

* Employee e1 is an alias for the employee table to represent the employees.
* Employee e2 is an alias for the employee table to represent the managers.
* The JOIN clause matches each employee (e1) with their manager (e2) by comparing e1.managerId with e2.id.
* The WHERE clause filters the results to include only those employees whose salary (e1.salary) is greater than their manager's salary (e2.salary).
* The SELECT clause returns the names of the employees who meet the condition.

This query will return the employees who earn more than their managers, as specified in the problem statement.
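The self-join is easier to follow when the intermediate employee/manager pairs are visible. The sketch below uses Python's sqlite3 module (SQLite rather than MySQL, which is an assumption of this example) and made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER, managerId INTEGER);
    INSERT INTO Employee VALUES
        (1, 'Joe',   70000, 3),
        (2, 'Henry', 80000, 4),
        (3, 'Sam',   60000, NULL),
        (4, 'Max',   90000, NULL);
""")

# Step 1: the join alone. Each row pairs an employee (e1) with their manager (e2).
# Sam and Max have no manager, so the inner join drops them as employees.
pairs = conn.execute("""
    SELECT e1.name, e1.salary, e2.name, e2.salary
    FROM Employee e1
    JOIN Employee e2 ON e1.managerId = e2.id
    ORDER BY e1.id
""").fetchall()

# Step 2: the same join, filtered down to employees who out-earn their manager.
earners = conn.execute("""
    SELECT e1.name AS Employee
    FROM Employee e1
    JOIN Employee e2 ON e1.managerId = e2.id
    WHERE e1.salary > e2.salary
""").fetchall()

print(pairs)    # (employee, employee salary, manager, manager salary)
print(earners)
```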

### 182. Duplicate Emails

#### <u>description</u>

Table: Person

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | id          | int     |
    | email       | varchar |
    +-------------+---------+

id is the primary key (column with unique values) for this table.
Each row of this table contains an email. The emails will not contain uppercase letters.
 

Write a solution to report all the duplicate emails. Note that it's guaranteed that the email field is not NULL.



#### <u>attempts</u>

NOTE: haven't used the HAVING keyword before

    SELECT email FROM Person
    GROUP BY email
    HAVING COUNT(email) > 1
    
status = success

#### <u>solutions</u>

this seemed like another popular solution, but doesn't make a whole lot of sense to me...

    SELECT DISTINCT P1.Email FROM Person P1,Person P2 
    WHERE P1.id <> P2.id AND P1.Email=P2.Email
    
here's a breakdown:
1. FROM Person P1, Person P2:

    * This creates two references to the Person table, named P1 and P2. This technique is known as a self-join, where the table is joined with itself.

2. WHERE P1.id <> P2.id:

    * This condition ensures that the join is not comparing the same row with itself by checking that the id of P1 is different from the id of P2.

3. AND P1.Email = P2.Email:

    * This condition ensures that the email addresses in the two rows being compared are the same. This means we are finding pairs of rows where the email is the same but the IDs are different.

4. SELECT DISTINCT P1.Email:

    * SELECT DISTINCT is used to ensure that the result contains unique email addresses. Even if an email appears multiple times as a duplicate in different pairs, it will only be listed once in the result.

#### <u>notes</u>

difference between HAVING and WHERE: The HAVING and WHERE clauses in SQL are both used to filter data, but they are used in different contexts and for different purposes. Here are the key differences:

1. Usage Context:

    * WHERE Clause: The WHERE clause is used to filter rows before any grouping operations are performed. It is applied to individual rows in a table.
    * HAVING Clause: The HAVING clause is used to filter groups after the GROUP BY operation has been performed. It is applied to aggregated data.
    
2. Filtering Rows vs. Filtering Groups:

    * WHERE Clause: Filters individual rows based on specified conditions. It cannot reference aggregate functions (COUNT, SUM, AVG, etc.) because it runs before the rows are grouped. A WHERE clause can still appear in the same query as a GROUP BY; it just decides which rows go into the groups.
    * HAVING Clause: Filters groups of rows created by the GROUP BY clause based on aggregate function conditions.
     
3. Execution Order:

    * WHERE Clause: Applied early in the execution process, right after the FROM clause.
    * HAVING Clause: Applied later in the execution process, after the GROUP BY clause and any aggregate functions.
    
example:

    -- Using WHERE clause to filter rows
    SELECT product, amount
    FROM Sales
    WHERE amount > 100;

    -- Using HAVING clause to filter groups
    -- (MySQL accepts the total_sales alias here; standard SQL needs HAVING SUM(amount) > 500)
    SELECT region, SUM(amount) AS total_sales
    FROM Sales
    GROUP BY region
    HAVING total_sales > 500;

* The first query uses the WHERE clause to filter individual sales records where the amount is greater than 100.
* The second query uses the HAVING clause to filter regions where the total sales (SUM(amount)) exceed 500.
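A quick way to check the WHERE-before-GROUP-BY, HAVING-after ordering is to run the filters separately and then together. This is a sketch with Python's sqlite3 module (SQLite, not MySQL) on a made-up Sales table; the column names follow the example queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales (product TEXT, region TEXT, amount INTEGER);
    INSERT INTO Sales VALUES
        ('pen',   'north',  50),
        ('desk',  'north', 300),
        ('lamp',  'north', 250),
        ('chair', 'south', 150),
        ('rug',   'south', 200);
""")

# WHERE filters individual rows: the 50 pen is dropped.
where_rows = conn.execute("""
    SELECT product, amount FROM Sales
    WHERE amount > 100
    ORDER BY product
""").fetchall()

# HAVING filters whole groups: north totals 600, south totals 350.
having_rows = conn.execute("""
    SELECT region, SUM(amount) AS total_sales FROM Sales
    GROUP BY region
    HAVING SUM(amount) > 500
""").fetchall()

# Both together: WHERE removes the pen first, so north now totals 550.
both_rows = conn.execute("""
    SELECT region, SUM(amount) AS total_sales FROM Sales
    WHERE amount > 100
    GROUP BY region
    HAVING SUM(amount) > 500
""").fetchall()

print(where_rows, having_rows, both_rows)
```

The north total changes from 600 to 550 in the last query, which shows that the WHERE filter ran before the groups were summed.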

### 183. Customers Who Never Order

#### <u>description</u>

Table: Customers

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | id          | int     |
    | name        | varchar |
    +-------------+---------+
    
id is the primary key (column with unique values) for this table.
Each row of this table indicates the ID and name of a customer.
 

Table: Orders

    +-------------+------+
    | Column Name | Type |
    +-------------+------+
    | id          | int  |
    | customerId  | int  |
    +-------------+------+
    
id is the primary key (column with unique values) for this table.
customerId is a foreign key (reference columns) of the ID from the Customers table.
Each row of this table indicates the ID of an order and the ID of the customer who ordered it.
 

Write a solution to find all customers who never order anything.

Return the result table in any order.

#### <u>attempts</u>

NOTE: did a right join initially but that didn't make sense because that causes me to lose the names, so have to do left join again

    SELECT name AS Customers
    FROM Customers table1
    LEFT JOIN Orders table2 ON table1.id = table2.customerId
    WHERE table2.id IS NULL
    
status = success

#### <u>solutions</u>

from someone on leetcode, more concise and even easier to read

    SELECT name as Customers FROM Customers
    WHERE id NOT IN (SELECT customerid FROM orders)

#### <u>notes</u>

no notes at this time - this problem is fairly straightforward
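One thing that may be worth keeping in mind about the two queries above: they agree on this problem's data, but NOT IN behaves differently once the subquery can return a NULL. The sketch below uses Python's sqlite3 module (SQLite, not MySQL) with made-up rows, and the order with a NULL customerId is a hypothetical case that the problem statement does not mention.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Orders (id INTEGER PRIMARY KEY, customerId INTEGER);
    INSERT INTO Customers VALUES (1, 'Joe'), (2, 'Henry'), (3, 'Sam'), (4, 'Max');
    INSERT INTO Orders VALUES (1, 3), (2, 1);
""")

LEFT_JOIN_SQL = """
    SELECT c.name FROM Customers c
    LEFT JOIN Orders o ON c.id = o.customerId
    WHERE o.id IS NULL
    ORDER BY c.id
"""
NOT_IN_SQL = """
    SELECT name FROM Customers
    WHERE id NOT IN (SELECT customerId FROM Orders)
    ORDER BY id
"""

# With clean data both approaches return the customers without orders.
left_join_names = [row[0] for row in conn.execute(LEFT_JOIN_SQL)]
not_in_names = [row[0] for row in conn.execute(NOT_IN_SQL)]

# Hypothetical: an order whose customerId is NULL.
conn.execute("INSERT INTO Orders VALUES (3, NULL)")

# "id NOT IN (3, 1, NULL)" can never be true, so NOT IN now returns nothing,
# while the LEFT JOIN version is unaffected.
left_join_with_null = [row[0] for row in conn.execute(LEFT_JOIN_SQL)]
not_in_with_null = [row[0] for row in conn.execute(NOT_IN_SQL)]

print(left_join_names, not_in_names)
print(left_join_with_null, not_in_with_null)
```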

### 196. Delete Duplicate Emails

#### <u>description</u>

Table: Person

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | id          | int     |
    | email       | varchar |
    +-------------+---------+
    
id is the primary key (column with unique values) for this table.
Each row of this table contains an email. The emails will not contain uppercase letters.
 

Write a solution to delete all duplicate emails, keeping only one unique email with the smallest id.

For SQL users, please note that you are supposed to write a DELETE statement and not a SELECT one.

For Pandas users, please note that you are supposed to modify Person in place.

After running your script, the answer shown is the Person table. The driver will first compile and run your piece of code and then show the Person table. The final order of the Person table does not matter.

#### <u>attempts</u>

NOTE: a bit confused about the WHERE statement; was sure that the condition was t1.id < t2.id since you want to keep the smallest id. same logic for id in the solutions... there's something about the table layout that i don't get from the join

    DELETE t1 FROM Person t1 
    INNER JOIN Person t2 
    WHERE t1.id > t2.id AND t1.email = t2.email
    
status = success

#### <u>solutions</u>

tried to write this but kept failing; need to review syntax

    DELETE p1 FROM Person p1,Person p2 
    WHERE p1.email=p2.email AND p1.id>p2.id

#### <u>notes</u>

would be useful to have a visual view of the table after the inner join so i can understand why that where clause works. explanation:


**Why t1.id > t2.id?**

The key is that DELETE t1 only removes rows matched under the t1 alias. The self-join produces every pair of rows (t1, t2) that share an email, and the condition t1.id > t2.id keeps only the pairs where t1 is the row with the higher id. So:

1. Which row survives: for each email, the row with the smallest id has no partner with a lower id, so it never appears as t1 and is never deleted. Every other row with that email is paired with the smallest id at least once, so it is deleted.

2. Why not t1.id < t2.id: that condition would still remove duplicates, but t1 would then be the row with the lower id in each pair, so the row with the largest id would be the one kept. The problem asks to keep the smallest id (the earliest inserted record, assuming id is an auto-incrementing primary key).
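The pairs produced by the self-join can be listed with a plain SELECT before deleting anything. This sketch uses Python's sqlite3 module with made-up rows. SQLite has no multi-table DELETE t1 FROM ... syntax, so the delete step is rewritten with a subquery; that rewrite is an assumption for this demo only (MySQL would reject a subquery that reads the table being deleted from).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (id INTEGER PRIMARY KEY, email TEXT);
    INSERT INTO Person VALUES
        (1, 'john@example.com'),
        (2, 'bob@example.com'),
        (3, 'john@example.com'),
        (4, 'john@example.com');
""")

# Every (t1, t2) pair with the same email where t1 has the higher id.
# The t1 column lists exactly the rows that DELETE t1 would remove.
pairs = conn.execute("""
    SELECT t1.id AS t1_id, t2.id AS t2_id, t1.email
    FROM Person t1
    INNER JOIN Person t2 ON t1.email = t2.email
    WHERE t1.id > t2.id
    ORDER BY t1.id, t2.id
""").fetchall()

ids_to_delete = sorted({t1_id for t1_id, _, _ in pairs})

# SQLite-only rewrite of the delete: remove every id that shows up as t1.
conn.execute("""
    DELETE FROM Person
    WHERE id IN (
        SELECT t1.id
        FROM Person t1
        INNER JOIN Person t2 ON t1.email = t2.email
        WHERE t1.id > t2.id
    )
""")
remaining = conn.execute("SELECT id, email FROM Person ORDER BY id").fetchall()

print(pairs)
print(ids_to_delete)
print(remaining)
```

Id 1 never appears in the t1 column, which is why the smallest id for each email is the one left in the table.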



### 197. Rising Temperature

#### <u>description</u>

Table: Weather

    +---------------+---------+
    | Column Name   | Type    |
    +---------------+---------+
    | id            | int     |
    | recordDate    | date    |
    | temperature   | int     |
    +---------------+---------+
    
id is the column with unique values for this table.
There are no different rows with the same recordDate.
This table contains information about the temperature on a certain day.
 

Write a solution to find all dates' Id with higher temperatures compared to its previous dates (yesterday).

Return the result table in any order.

#### <u>attempts</u>

NOTE: this almost works, but fails if there are gaps between dates...

    SELECT t2.id FROM (
        SELECT *,
        LAG (temperature, 1)
        OVER (ORDER BY recordDate ASC) AS previous_temp
        FROM Weather
    ) AS t2
    WHERE temperature > previous_temp
    
status = failure

NOTE: the idea here was create 2 lagged columns, one for temperature and one for recordDate. in order for an id to be selected, the current temperature had to be greater than the previous one, and the interval between the current day and the previous day had to be exactly 1 day (to ensure the previous record was indeed yesterday). also beats 99.50% of users apparently

    SELECT t2.id FROM (
        SELECT *,
        LAG (temperature, 1)
        OVER (ORDER BY recordDate ASC) AS previous_temp,
        LAG (recordDate, 1)
        OVER (ORDER BY recordDate ASC) AS previous_date
        FROM Weather
    ) AS t2
    WHERE temperature > previous_temp AND DATE_SUB(recordDate, INTERVAL 1 DAY) = previous_date

status = success

#### <u>solutions</u>

solution from someone on leetcode, a lot more straightforward, but not as fast as mine apparently

    SELECT w1.id
    FROM Weather w1, Weather w2
    WHERE DATEDIFF(w1.recordDate, w2.recordDate) = 1 AND w1.temperature > w2.temperature;

#### <u>notes</u>

using 2 new functions in my successful attempt: LAG() and DATE_SUB()

LAG():

The LAG() window function facilitates access to previous rows based on the offset argument. It can be particularly useful when a comparison of a previous value is necessary without the use of a self join. There is a similarity to the LEAD() function with the difference being the accessible rows. LEAD() accesses subsequent rows while LAG() accesses previous rows.

    LAG (expression [, offset] [, default])
    OVER ( [ partition_by ] order_by )

arguments of LAG():

1. expression - The column value which will be referenced.
2. offset - A positive number indicating how many rows back to look, relative to the current row. If not specified the default is 1.
3. default - The value that will be returned if the offset is out of range. This is an optional argument, if not specified NULL will be returned.

arguments of OVER():

1. partition_by - Allows the result set to be grouped based on a column. This is an optional argument, if not specified the result set will be treated as a single group.
2. order_by - Determines the order of the rows. If partition_by is specified, the rows are ordered within each partition instead.
    
DATE_SUB():

The DATE_SUB() function subtracts a time/date interval from a date and then returns the date.

    DATE_SUB(date, INTERVAL value interval)
     
1. date - Required. The date to be modified
2. value - Required. The value of the time/date interval to subtract. Both positive and negative values are allowed
3. interval - Required. The unit of the interval to subtract, for example DAY, WEEK, MONTH or YEAR
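The difference between the failed and the successful attempt shows up as soon as the data has a gap in the dates. This is a sketch with Python's sqlite3 module on made-up rows. It assumes a SQLite build with window-function support (3.25 or newer), and since SQLite has no DATE_SUB(), date(recordDate, '-1 day') is swapped in as a stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Weather (id INTEGER PRIMARY KEY, recordDate TEXT, temperature INTEGER);
    INSERT INTO Weather VALUES
        (1, '2015-01-01', 10),
        (2, '2015-01-02', 25),
        (3, '2015-01-04', 30),   -- note: 2015-01-03 is missing
        (4, '2015-01-05', 20);
""")

# First attempt: compare with the previous ROW, whatever its date is.
naive_ids = [row[0] for row in conn.execute("""
    SELECT id FROM (
        SELECT id, temperature,
               LAG(temperature, 1) OVER (ORDER BY recordDate) AS previous_temp
        FROM Weather
    ) AS t
    WHERE temperature > previous_temp
    ORDER BY id
""")]

# Second attempt: also require the previous row to be exactly one day earlier.
checked_ids = [row[0] for row in conn.execute("""
    SELECT id FROM (
        SELECT id, temperature, recordDate,
               LAG(temperature, 1) OVER (ORDER BY recordDate) AS previous_temp,
               LAG(recordDate, 1)  OVER (ORDER BY recordDate) AS previous_date
        FROM Weather
    ) AS t
    WHERE temperature > previous_temp
      AND date(recordDate, '-1 day') = previous_date
    ORDER BY id
""")]

print(naive_ids)    # includes id 3, whose previous row is two days earlier
print(checked_ids)
```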

### 511. Game Play Analysis I

#### <u>description</u>

Table: Activity

    +--------------+---------+
    | Column Name  | Type    |
    +--------------+---------+
    | player_id    | int     |
    | device_id    | int     |
    | event_date   | date    |
    | games_played | int     |
    +--------------+---------+

(player_id, event_date) is the primary key (combination of columns with unique values) of this table.
This table shows the activity of players of some games.
Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on someday using some device.
 

Write a solution to find the first login date for each player.

Return the result table in any order.

#### <u>attempts</u>

NOTE: not entirely sure why this works. without the GROUP BY, it only returns a single row which contains the minimum date from the event_date column. possible i fail to really understand what a group by actually does at this stage

    SELECT player_id, MIN(event_date) as first_login
    FROM Activity
    GROUP BY player_id

status = success

#### <u>solutions</u>

there doesn't seem to be much else to this problem. simple problem but worth keeping in mind

#### <u>notes</u>

this is the explanation from GPT concerning the GROUP BY: 
* GROUP BY player_id: This groups the result by player_id, ensuring that the MIN(event_date) is calculated for each player individually.
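To make the effect of GROUP BY more concrete, the sketch below runs MIN() once over the whole table and once per player. It uses Python's sqlite3 module (SQLite, not MySQL) and made-up Activity rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Activity (player_id INTEGER, device_id INTEGER, event_date TEXT, games_played INTEGER);
    INSERT INTO Activity VALUES
        (1, 2, '2016-03-01', 5),
        (1, 2, '2016-05-02', 6),
        (2, 3, '2017-06-25', 1),
        (3, 1, '2016-03-02', 0),
        (3, 4, '2018-07-03', 5);
""")

# No GROUP BY: the whole table is one group, so MIN() yields a single row.
overall = conn.execute("SELECT MIN(event_date) FROM Activity").fetchall()

# GROUP BY player_id: rows are split into one group per player,
# and MIN() is evaluated once inside each group.
per_player = conn.execute("""
    SELECT player_id, COUNT(*) AS rows_in_group, MIN(event_date) AS first_login
    FROM Activity
    GROUP BY player_id
    ORDER BY player_id
""").fetchall()

print(overall)
print(per_player)
```

The rows_in_group column is only there to show how many input rows were folded into each output row.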

### 577. Employee Bonus

#### <u>description</u>

Table: Employee

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | empId       | int     |
    | name        | varchar |
    | supervisor  | int     |
    | salary      | int     |
    +-------------+---------+

empId is the column with unique values for this table.
Each row of this table indicates the name and the ID of an employee in addition to their salary and the id of their manager.
 

Table: Bonus

    +-------------+------+
    | Column Name | Type |
    +-------------+------+
    | empId       | int  |
    | bonus       | int  |
    +-------------+------+

empId is the column of unique values for this table.
empId is a foreign key (reference column) to empId from the Employee table.
Each row of this table contains the id of an employee and their respective bonus.
 

Write a solution to report the name and bonus amount of each employee with a bonus less than 1000.

#### <u>attempts</u>

NOTE: didn't feel like i learned much from this one

    SELECT t1.name, t2.bonus
    FROM Employee t1
    LEFT JOIN Bonus t2 ON t1.empId = t2.empId
    WHERE t2.bonus < 1000 OR t2.bonus IS NULL

status = success

#### <u>solutions</u>

a solution from leetcode

    SELECT name, bonus FROM employee
    LEFT JOIN bonus USING(empid)
    WHERE bonus <1000 OR bonus IS NULL

#### <u>notes</u>

difference between ON and USING when joining tables:

1. ON Clause

The ON clause is used to specify the condition on which two tables should be joined. It is flexible and allows you to join tables on columns that have different names, or even use more complex expressions.
    
syntax:

    SELECT columns
    FROM table1
    JOIN table2
    ON table1.column = table2.column;
    
2. USING Clause

The USING clause is used when the columns that are being joined have the same name in both tables. It is a simpler syntax for this specific case.

syntax:

    SELECT columns
    FROM table1
    JOIN table2
    USING (column_name);
    
Key Differences:
1. Column Names:

    * ON can join columns with different names or use more complex conditions.
    * USING requires that the columns being joined have the same name in both tables.
2. Flexibility:

    * ON is more flexible and can handle complex join conditions, including multiple conditions.
    * USING is simpler but limited to cases where the column names are identical. 
3. Resulting Column Names:

    * When using ON with SELECT *, both joined columns will appear in the result set, and they have to be qualified with their table names to tell them apart.
    * When using USING, the resulting table will only include one instance of the joined column, eliminating redundancy.
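The point about resulting column names can be checked by looking at the columns that SELECT * returns for each form of the join. This is a sketch using Python's sqlite3 module with made-up rows; SQLite is assumed to behave like MySQL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (empId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Bonus (empId INTEGER PRIMARY KEY, bonus INTEGER);
    INSERT INTO Employee VALUES (1, 'John'), (2, 'Dan');
    INSERT INTO Bonus VALUES (2, 500);
""")

# cursor.description holds one entry per result column; entry[0] is its name.
on_cursor = conn.execute(
    "SELECT * FROM Employee LEFT JOIN Bonus ON Employee.empId = Bonus.empId"
)
on_columns = [entry[0] for entry in on_cursor.description]

using_cursor = conn.execute(
    "SELECT * FROM Employee LEFT JOIN Bonus USING (empId)"
)
using_columns = [entry[0] for entry in using_cursor.description]

print(on_columns)     # empId shows up twice, once per table
print(using_columns)  # empId shows up once
```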

### 584. Find Customer Referee

#### <u>description</u>

Table: Customer

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | id          | int     |
    | name        | varchar |
    | referee_id  | int     |
    +-------------+---------+
In SQL, id is the primary key column for this table.
Each row of this table indicates the id of a customer, their name, and the id of the customer who referred them.
 

Find the names of the customer that are not referred by the customer with id = 2.

Return the result table in any order.

#### <u>attempts</u>

    SELECT name FROM Customer
    WHERE referee_id <> 2 OR referee_id IS NULL

status = success

#### <u>solutions</u>

a solution from leetcode

    SELECT name
    FROM Customer
    WHERE COALESCE(referee_id,0) <> 2;

#### <u>notes</u>

1. Not Equal Operators

!= and <> can both act as a "not equal" operator

2. COALESCE

The COALESCE function in MySQL is used to get the first non-null value from a list of expressions.

If all the values in the list evaluate to NULL, then the COALESCE() function returns NULL. The function accepts any number of arguments (columns, literals or other expressions) and checks them from left to right

syntax:

    COALESCE(value_1, value_2, …., value_n)
    
* COALESCE() can be used to substitute NULL values in table columns with a default value or an expression.
* COALESCE() is more flexible than IFNULL() as it can handle any number of arguments, while IFNULL() only takes two arguments.
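The reason both solutions need special handling for NULL is that NULL <> 2 evaluates to NULL rather than true, so those rows are filtered out. The sketch below shows this with Python's sqlite3 module (SQLite, not MySQL) and made-up Customer rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (id INTEGER PRIMARY KEY, name TEXT, referee_id INTEGER);
    INSERT INTO Customer VALUES
        (1, 'Will', NULL),
        (2, 'Jane', NULL),
        (3, 'Alex', 2),
        (4, 'Bill', NULL),
        (5, 'Zack', 1),
        (6, 'Mark', 2);
""")

def names(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# Comparing NULL with anything gives NULL (None in Python), not true or false.
null_compare = conn.execute("SELECT NULL <> 2").fetchone()[0]

# Without NULL handling, customers with no referee are silently dropped.
plain = names("SELECT name FROM Customer WHERE referee_id <> 2 ORDER BY id")

# The two fixes from the attempt and the solution.
with_is_null = names(
    "SELECT name FROM Customer WHERE referee_id <> 2 OR referee_id IS NULL ORDER BY id"
)
with_coalesce = names(
    "SELECT name FROM Customer WHERE COALESCE(referee_id, 0) <> 2 ORDER BY id"
)

# COALESCE returns the first non-NULL argument.
first_non_null = conn.execute("SELECT COALESCE(NULL, NULL, 'fallback')").fetchone()[0]

print(null_compare, plain, with_is_null, with_coalesce, first_non_null)
```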

### 586. Customer Placing the Largest Number of Orders

#### <u>description</u>

Table: Orders

    +-----------------+----------+
    | Column Name     | Type     |
    +-----------------+----------+
    | order_number    | int      |
    | customer_number | int      |
    +-----------------+----------+
order_number is the primary key (column with unique values) for this table.
This table contains information about the order ID and the customer ID.
 

Write a solution to find the customer_number for the customer who has placed the largest number of orders.

The test cases are generated so that exactly one customer will have placed more orders than any other customer.

#### <u>attempts</u>

NOTE: this query gives the right customer_number but comes with an additional unwanted column "c", however i can visualize what's happening

    SELECT customer_number, COUNT(order_number) as c
    FROM Orders
    GROUP BY customer_number
    ORDER BY c DESC LIMIT 1

status = failure

NOTE: really not entirely sure why this works i.e hard time visualizing what the GROUP BY and ORDER BY does (also didn't know i could use COUNT with the ORDER BY clause). it's also really efficient apparently

    SELECT customer_number
    FROM Orders
    GROUP BY customer_number
    ORDER BY COUNT(customer_number) DESC LIMIT 1

status = success

#### <u>solutions</u>

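from some leetcode solution; almost identical to mine but the COUNT() is different. COUNT(*) counts rows while COUNT(customer_number) counts the non-NULL values of that column, so both give the same result as long as customer_number is never NULL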

    SELECT customer_number
    FROM orders
    GROUP BY customer_number
    ORDER BY COUNT(*) DESC
    LIMIT 1;

#### <u>notes</u>

most solutions have the same answer as mine. this is the explanation from one of them:

1. Grouping the orders by customer_number.
2. Counting the number of orders for each customer_number.
3. Ordering the results by the count of orders in descending order.
4. Limiting the output to only the first row, which will contain the customer_number with the highest count of orders.
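The four steps above can be made visible by first selecting the per-customer counts and then applying the ORDER BY and LIMIT. This is a sketch with Python's sqlite3 module (SQLite, not MySQL) and made-up Orders rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (order_number INTEGER PRIMARY KEY, customer_number INTEGER);
    INSERT INTO Orders VALUES (1, 1), (2, 2), (3, 3), (4, 3);
""")

# Steps 1-3: group, count, and sort, with the count kept as a visible column.
# customer_number is added as a tie-breaker so the order is deterministic.
counts = conn.execute("""
    SELECT customer_number, COUNT(*) AS order_count
    FROM Orders
    GROUP BY customer_number
    ORDER BY order_count DESC, customer_number
""").fetchall()

# Step 4: same grouping and sorting, but the count lives only in ORDER BY,
# so the result has just the one column the problem asks for.
top_customer = conn.execute("""
    SELECT customer_number
    FROM Orders
    GROUP BY customer_number
    ORDER BY COUNT(*) DESC
    LIMIT 1
""").fetchall()

print(counts)
print(top_customer)
```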

### 595. Big Countries

#### <u>description</u>

Table: World

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | name        | varchar |
    | continent   | varchar |
    | area        | int     |
    | population  | int     |
    | gdp         | bigint  |
    +-------------+---------+
    
name is the primary key (column with unique values) for this table.
Each row of this table gives information about the name of a country, the continent to which it belongs, its area, the population, and its GDP value.
 

A country is big if:

* it has an area of at least three million (i.e., 3000000 km2), or
* it has a population of at least twenty-five million (i.e., 25000000).

Write a solution to find the name, population, and area of the big countries.

Return the result table in any order.

#### <u>attempts</u>

    SELECT name, population, area FROM World
    WHERE area >= 3000000 OR population >=  25000000

status = success

#### <u>solutions</u>

#### <u>notes</u>

this problem is extremely straightforward

### 596. Classes More than 5 Students

#### <u>description</u>

Table: Courses

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | student     | varchar |
    | class       | varchar |
    +-------------+---------+
(student, class) is the primary key (combination of columns with unique values) for this table.
Each row of this table indicates the name of a student and the class in which they are enrolled.
 

Write a solution to find all the classes that have at least five students.

Return the result table in any order.

#### <u>attempts</u>

    SELECT class FROM Courses
    GROUP BY class
    HAVING COUNT(class) >= 5

status = success

#### <u>solutions</u>

this also works, perhaps a bit more intuitive

    SELECT class FROM Courses
    GROUP BY class
    HAVING COUNT(student) >= 5

#### <u>notes</u>

another extremely straightforward problem

### 607. Sales Person

#### <u>description</u>

Table: SalesPerson

    +-----------------+---------+
    | Column Name     | Type    |
    +-----------------+---------+
    | sales_id        | int     |
    | name            | varchar |
    | salary          | int     |
    | commission_rate | int     |
    | hire_date       | date    |
    +-----------------+---------+
sales_id is the primary key (column with unique values) for this table.
Each row of this table indicates the name and the ID of a salesperson alongside their salary, commission rate, and hire date.
 

Table: Company

    +-------------+---------+
    | Column Name | Type    |
    +-------------+---------+
    | com_id      | int     |
    | name        | varchar |
    | city        | varchar |
    +-------------+---------+
com_id is the primary key (column with unique values) for this table.
Each row of this table indicates the name and the ID of a company and the city in which the company is located.
 

Table: Orders

    +-------------+------+
    | Column Name | Type |
    +-------------+------+
    | order_id    | int  |
    | order_date  | date |
    | com_id      | int  |
    | sales_id    | int  |
    | amount      | int  |
    +-------------+------+
order_id is the primary key (column with unique values) for this table.
com_id is a foreign key (reference column) to com_id from the Company table.
sales_id is a foreign key (reference column) to sales_id from the SalesPerson table.
Each row of this table contains information about one order. This includes the ID of the company, the ID of the salesperson, the date of the order, and the amount paid.
 

Write a solution to find the names of all the salespersons who did not have any orders related to the company with the name "RED".

Return the result table in any order.

#### <u>attempts</u>

NOTE: this really feels like it should work but the output is plain wrong

    SELECT sales.name FROM SalesPerson sales
    JOIN Orders orders ON orders.sales_id = sales.sales_id
    JOIN Company company ON orders.com_id = company.com_id
    WHERE company.name <> "RED"
    
status = failure

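NOTE: above was definitely incorrect. the inner joins drop salespeople with no orders at all, and the <> "RED" filter only removes the RED order rows, so a salesperson with both RED and non-RED orders still shows up. below works by first finding everyone with a RED order and then excluding them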

    SELECT name FROM SalesPerson
    WHERE sales_id NOT IN (
        SELECT sales.sales_id FROM Orders orders
        JOIN Company company ON orders.com_id = company.com_id
        JOIN SalesPerson sales ON orders.sales_id = sales.sales_id
        WHERE company.name = "RED"
    )

status = success

#### <u>solutions</u>

#### <u>notes</u>

the solutions are all some variation of what i wrote, so nothing new to add. however i struggled a bit with this problem; i still seem to have a hard time visualizing the output tables after performing operations on them
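Running both attempts on a small data set helps with visualizing where the first one goes wrong. The sketch uses Python's sqlite3 module (SQLite, not MySQL), trimmed-down tables with made-up rows, and single quotes around 'RED' because SQLite treats double quotes as identifiers. The subquery is also slightly simplified compared to the attempt above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesPerson (sales_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Company (com_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Orders (order_id INTEGER PRIMARY KEY, com_id INTEGER, sales_id INTEGER);
    INSERT INTO SalesPerson VALUES (1, 'John'), (2, 'Amy'), (3, 'Mark'), (4, 'Pam'), (5, 'Alex');
    INSERT INTO Company VALUES (1, 'RED'), (2, 'ORANGE'), (3, 'YELLOW'), (4, 'GREEN');
    INSERT INTO Orders VALUES (1, 3, 4), (2, 4, 5), (3, 1, 1), (4, 1, 4);
""")

# Failed attempt: filters ORDER rows, not salespeople.
# Pam has one RED and one YELLOW order, so her YELLOW row survives the filter.
# Amy and Mark have no orders, so the inner joins drop them entirely.
failed = [row[0] for row in conn.execute("""
    SELECT s.name
    FROM SalesPerson s
    JOIN Orders o ON o.sales_id = s.sales_id
    JOIN Company c ON o.com_id = c.com_id
    WHERE c.name <> 'RED'
    ORDER BY s.sales_id
""")]

# Working attempt: find everyone with a RED order, then exclude them.
correct = [row[0] for row in conn.execute("""
    SELECT name FROM SalesPerson
    WHERE sales_id NOT IN (
        SELECT o.sales_id
        FROM Orders o
        JOIN Company c ON o.com_id = c.com_id
        WHERE c.name = 'RED'
    )
    ORDER BY sales_id
""")]

print(failed)
print(correct)
```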

### 610. Triangle Judgement

#### <u>description</u>

Table: Triangle

    +-------------+------+
    | Column Name | Type |
    +-------------+------+
    | x           | int  |
    | y           | int  |
    | z           | int  |
    +-------------+------+
In SQL, (x, y, z) is the primary key column for this table.
Each row of this table contains the lengths of three line segments.

Report for every three line segments whether they can form a triangle.

Return the result table in any order.

#### <u>attempts</u>

didn't really bother with this one, didn't seem so much to be about SQL but rather geometry...

#### <u>solutions</u>

from some leetcode solution

    SELECT *, IF(x+y>z AND y+z>x AND z+x>y, "Yes", "No") as triangle
    FROM Triangle

#### <u>notes</u>

new function used here, IF()

1. IF

The IF() function returns a value if a condition is TRUE, or another value if a condition is FALSE.

syntax:

    IF(condition, value_if_true, value_if_false)
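IF() is specific to MySQL; the same check can be written with a CASE expression, which most databases accept. The sketch below does that with Python's sqlite3 module on made-up rows. Swapping CASE in for IF() is an assumption made so the example runs on SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Triangle (x INTEGER, y INTEGER, z INTEGER);
    INSERT INTO Triangle VALUES (13, 15, 30), (10, 20, 15);
""")

# Three segments form a triangle when every pair sums to more than the third.
# CASE WHEN ... THEN ... ELSE ... END plays the role of IF(cond, a, b).
rows = conn.execute("""
    SELECT x, y, z,
           CASE WHEN x + y > z AND y + z > x AND z + x > y
                THEN 'Yes'
                ELSE 'No'
           END AS triangle
    FROM Triangle
    ORDER BY x
""").fetchall()

print(rows)
```

For (13, 15, 30) the check fails because 13 + 15 is less than 30.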

### 619. Biggest Single Number

#### <u>description</u>

Table: MyNumbers

    +-------------+------+
    | Column Name | Type |
    +-------------+------+
    | num         | int  |
    +-------------+------+
This table may contain duplicates (In other words, there is no primary key for this table in SQL).
Each row of this table contains an integer.
 

A single number is a number that appeared only once in the MyNumbers table.

Find the largest single number. If there is no single number, report null.

#### <u>attempts</u>

    SELECT MAX(num) AS num FROM (
        SELECT * FROM MyNumbers
        GROUP BY num
        HAVING COUNT(num) = 1
    ) AS sub

status = success

#### <u>solutions</u>

solution from a leetcode thread, uses sorting instead of subquery

    # if the number appears exactly once it is returned, otherwise null is returned
    SELECT IF(COUNT(num) = 1, num, null) AS num
    FROM MyNumbers
    # groups the rows by the 'num' column. each group contains all rows with the same 'num' value
    GROUP BY num
    # sorts results first by the count of each 'num' in asc order, then by 'num' in desc order
    # ensures that 1. numbers that appear once are at the top, 2. among those, the largest number appears first
    ORDER BY COUNT(num), num DESC
    # picks the top number
    LIMIT 1;
    
NOTE: not as fast apparently. also a bit confusing since the original column and the new alias are both named num. according to the MySQL docs, GROUP BY resolves a bare name against the table columns first, while ORDER BY resolves it against the SELECT aliases first, so the grouping is done on the original column. the ordering comes out the same either way here, because for any group with a count of 1 the alias and the original column hold the same value

explanation: The query groups the table by each number, counts the occurrences, and uses the IF function to either select the number (if it appears exactly once) or null. It then sorts the results such that numbers that appear exactly once are prioritized and, among those, the largest number comes first. Finally, it selects the top result.

#### <u>notes</u>

initially wrote 

    SELECT MAX(num) AS num FROM (
        SELECT * FROM MyNumbers
        GROUP BY num
        HAVING COUNT(num) = 1
    )

but this query failed. the subquery (derived table) in this case must have a name, hence the AS sub (simply putting sub after the closing parenthesis also works)

i also initially thought that my corrected query wouldn't work, that i had to write a condition for the case where there are no single numbers, but that is not the case. MAX() returns null when the subquery is empty, so the outer SELECT still produces one row containing null
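The empty-subquery case can be checked directly. This sketch uses Python's sqlite3 module (SQLite, not MySQL) with made-up numbers: first a table with no single numbers, then the same table after two single numbers are added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyNumbers (num INTEGER);
    INSERT INTO MyNumbers VALUES (8), (8), (3), (3);
""")

SINGLES_SQL = """
    SELECT num FROM MyNumbers
    GROUP BY num
    HAVING COUNT(num) = 1
"""
BIGGEST_SINGLE_SQL = "SELECT MAX(num) AS num FROM (" + SINGLES_SQL + ") AS sub"

# Every number appears twice, so the subquery returns no rows at all...
singles_before = conn.execute(SINGLES_SQL).fetchall()
# ...but MAX() over zero rows still yields exactly one row holding NULL (None).
biggest_before = conn.execute(BIGGEST_SINGLE_SQL).fetchall()

# Add two numbers that appear only once.
conn.execute("INSERT INTO MyNumbers VALUES (1), (4)")
biggest_after = conn.execute(BIGGEST_SINGLE_SQL).fetchall()

print(singles_before, biggest_before, biggest_after)
```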

### 620. Not Boring Movies

#### <u>description</u>

Table: Cinema

    +----------------+----------+
    | Column Name    | Type     |
    +----------------+----------+
    | id             | int      |
    | movie          | varchar  |
    | description    | varchar  |
    | rating         | float    |
    +----------------+----------+
id is the primary key (column with unique values) for this table.
Each row contains information about the name of a movie, its genre, and its rating.
rating is a 2 decimal places float in the range [0, 10]
 

Write a solution to report the movies with an odd-numbered ID and a description that is not "boring".

Return the result table ordered by rating in descending order.

#### <u>attempts</u>

    SELECT * FROM Cinema
    WHERE MOD(id,2)=1 AND description <> 'boring'
    ORDER BY rating DESC

status = success

#### <u>solutions</u>

just about every top solution is exactly the same as my query

#### <u>notes</u>

in the WHERE clause in my attempt, it is possible to use the modulus operator instead of the function as well. so writing this also works:

    WHERE id % 2 = 1 AND description <> 'boring'

## Medium Problems

## Hard Problems