# Select query for specific columns
#### SELECT column, another_column, …
#### FROM mytable;

# Select query with constraints
#### SELECT column, another_column, …
#### FROM mytable
#### WHERE condition
####     AND/OR another_condition
####     AND/OR …;


| Operator | Condition | SQL Example |
| --- | --- | --- |
| =, !=, <, <=, >, >= | Standard numerical operators | col_name != 4 |
| BETWEEN … AND … | Number is within range of two values (inclusive) | col_name BETWEEN 1.5 AND 10.5 |
| NOT BETWEEN … AND … | Number is not within range of two values (inclusive) | col_name NOT BETWEEN 1 AND 10 |
| IN (…) | Number exists in a list | col_name IN (2, 4, 6) |
| NOT IN (…) | Number does not exist in a list | col_name NOT IN (1, 3, 5) |


### EXAMPLES
#### SELECT * FROM movies WHERE id BETWEEN 0 AND 5;
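To try these numeric constraints out, here is a minimal sketch that runs them through Python's built-in sqlite3 module against an in-memory database. Only the `movies` table name and its `id` column come from the example above; the `title` and `year` columns and all rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: the rows and the title/year columns are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [(1, "A", 1995), (2, "B", 1998), (3, "C", 1999), (6, "D", 2003), (9, "E", 2008)],
)

# BETWEEN is inclusive on both ends.
between_ids = [row[0] for row in conn.execute(
    "SELECT id FROM movies WHERE id BETWEEN 0 AND 5 ORDER BY id")]

# Conditions can be chained with AND / OR; NOT IN excludes the listed values.
combined_ids = [row[0] for row in conn.execute(
    "SELECT id FROM movies WHERE year >= 1998 AND id NOT IN (3, 9) ORDER BY id")]

print(between_ids)
print(combined_ids)
```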

# SQL Lesson 3: Queries with constraints (Pt. 2)
When writing WHERE clauses with columns containing text data, SQL supports a number of useful operators to do things like case-insensitive string comparison and wildcard pattern matching. We show a few common text-data specific operators below:

- = : case-sensitive exact string comparison
- != : case-sensitive exact string inequality comparison
- LIKE : case-insensitive exact string comparison
- NOT LIKE : case-insensitive exact string inequality comparison
- % : used anywhere in a string to match a sequence of zero or more characters (only with LIKE or NOT LIKE). Example: col_name LIKE "%AT%" matches "AT", "ATTIC", "CAT" or even "BATS"
- \_ : used anywhere in a string to match a single character (only with LIKE or NOT LIKE). Example: col_name LIKE "AN\_" matches "AND", but not "AN"
- IN (…) : string exists in a list. Example: col_name IN ("A", "B", "C")
- NOT IN (…) : string does not exist in a list. Example: col_name NOT IN ("D", "E", "F")
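The wildcard behaviour described above can be checked with a short sketch using Python's built-in sqlite3 module. The `words` table and the `matches` helper are hypothetical names introduced only for this illustration, and the case-insensitivity shown is SQLite's default behaviour for ASCII text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("AT",), ("ATTIC",), ("CAT",), ("BATS",), ("AND",), ("AN",), ("dog",)],
)

def matches(pattern):
    """Return all words matching a LIKE pattern, sorted."""
    rows = conn.execute("SELECT w FROM words WHERE w LIKE ? ORDER BY w", (pattern,))
    return [r[0] for r in rows]

at_matches = matches("%AT%")  # % matches zero or more characters
an_matches = matches("AN_")   # _ matches exactly one character
ci_matches = matches("cat")   # LIKE ignores ASCII case in SQLite by default

# The = operator, by contrast, is case sensitive.
eq_count = conn.execute("SELECT COUNT(*) FROM words WHERE w = 'cat'").fetchone()[0]

print(at_matches, an_matches, ci_matches, eq_count)
```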

# SQL Lesson 4: Filtering and sorting Query results

Even though the data in a database may be unique, the results of any particular query may not be. In our Movies table, for example, many different movies can be released in the same year. In such cases, SQL provides a convenient way to discard rows that have a duplicate column value by using the **DISTINCT** keyword.

SELECT DISTINCT column, another_column, …
FROM mytable
WHERE condition(s);

## Ordering results

Unlike our neatly ordered table in the last few lessons, most data in real databases is added in no particular order. As a result, it can be difficult to read through and understand the results of a query as the size of a table increases to thousands or even millions of rows.

To help with this, SQL provides a way to sort your results by a given column in ascending or descending order using the **ORDER BY** clause.

SELECT column, another_column, …
FROM mytable
WHERE condition(s)
ORDER BY column ASC/DESC;

## Limiting results to a subset

The **LIMIT** and **OFFSET** clauses are commonly used together with the **ORDER BY** clause, and are a useful optimization to indicate to the database the subset of the results you care about.
The **LIMIT** will reduce the number of rows to return, and the optional **OFFSET** will specify where to begin counting the number of rows from.

SELECT column, another_column, …
FROM mytable
WHERE condition(s)
ORDER BY column ASC/DESC
LIMIT num_limit OFFSET num_offset;
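As a rough illustration of DISTINCT, ORDER BY and LIMIT/OFFSET working together, the sketch below runs two queries through Python's built-in sqlite3 module. The `movies` rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",  # hypothetical rows
    [("A", 2001), ("B", 1995), ("C", 2001), ("D", 1998), ("E", 2003)],
)

# DISTINCT collapses the duplicate year 2001; ORDER BY ... DESC sorts newest first.
distinct_years = [r[0] for r in conn.execute(
    "SELECT DISTINCT year FROM movies ORDER BY year DESC")]

# OFFSET 1 skips the first sorted row, LIMIT 2 keeps the next two.
page = [r[0] for r in conn.execute(
    "SELECT title FROM movies ORDER BY title ASC LIMIT 2 OFFSET 1")]

print(distinct_years)
print(page)
```

Without an ORDER BY, the rows that LIMIT and OFFSET select are not guaranteed to be stable, which is why the two are usually paired.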

# SQL Lesson 6: Multi-table queries with JOINs

## Database normalization
Database normalization is useful because it minimizes duplicate data in any single table, and allows data in the database to grow independently of each other (e.g. the types of car engines can grow independently of the types of cars). As a trade-off, queries get slightly more complex since they have to be able to find data from different parts of the database, and performance issues can arise when working with many large tables.

## Multi-table queries with JOINs
Tables that share information about a single entity need to have a primary key that identifies that entity uniquely across the database. One common primary key type is an auto-incrementing integer (because integers are space-efficient), but it can also be a string or a hashed value, so long as it is unique.


Using the **JOIN** clause in a query, we can combine row data across two separate tables using this unique key. The first of the joins that we will introduce is the **INNER JOIN**.

SELECT column, another_table_column, …
FROM mytable
INNER JOIN another_table 
    ON mytable.id = another_table.id
WHERE condition(s)
ORDER BY column, … ASC/DESC
LIMIT num_limit OFFSET num_offset;

The **INNER JOIN** is a process that matches rows from the first table and the second table which have the same key (as defined by the **ON** constraint) to create a result row with the combined columns from both tables. After the tables are joined, the other clauses we learned previously are then applied.
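A minimal sketch of this matching behaviour, run through Python's built-in sqlite3 module. The `movies` and `boxoffice` tables borrow their names from the Pixar examples in these notes, but the rows are invented, and one row on each side is deliberately left without a partner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE boxoffice (movie_id INTEGER, domestic_sales INTEGER)")
# Hypothetical rows: movie 2 has no sales row, and sales row 4 has no movie.
conn.executemany("INSERT INTO movies VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")])
conn.executemany("INSERT INTO boxoffice VALUES (?, ?)", [(1, 100), (3, 300), (4, 400)])

# Only keys present in BOTH tables survive an INNER JOIN.
joined = conn.execute("""
    SELECT movies.title, boxoffice.domestic_sales
    FROM movies
    INNER JOIN boxoffice
        ON movies.id = boxoffice.movie_id
    ORDER BY movies.title
""").fetchall()

print(joined)
```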

# SQL Lesson 7: OUTER JOINs

If the two tables have asymmetric data, which can easily happen when data is entered in different stages, then we have to use a **LEFT JOIN**, **RIGHT JOIN** or **FULL JOIN** instead to ensure that the data we need is not left out of the results.

SELECT column, another_column, …
FROM mytable
INNER/LEFT/RIGHT/FULL JOIN another_table 
    ON mytable.id = another_table.matching_id;


Like the **INNER JOIN**, these three new joins have to specify which column to join the data on.
When joining table A to table B, a **LEFT JOIN** simply includes rows from A regardless of whether a matching row is found in B. The **RIGHT JOIN** is the same, but reversed, keeping rows in B regardless of whether a match is found in A. Finally, a **FULL JOIN** simply means that rows from both tables are kept, regardless of whether a matching row exists in the other table.

When using any of these new joins, you will likely have to write additional logic to deal with **NULL**s in the result and constraints (more on this in the next lesson).

SELECT DISTINCT building_name, role FROM buildings
LEFT JOIN employees
    ON buildings.building_name=employees.building;

# SQL Lesson 8: A short note on NULLs

It's always good to reduce the possibility of **NULL** values in databases because they require special attention when constructing queries, constraints (certain functions behave differently with null values) and when processing the results.

An alternative to **NULL** values in your database is to have **data-type appropriate default values**, like 0 for numerical data, empty strings for text data, etc. But if your database needs to store incomplete data, then NULL values can be appropriate if the default values will skew later analysis (for example, when taking averages of numerical data).

Sometimes, it's also not possible to avoid **NULL** values, as we saw in the last lesson when outer-joining two tables with asymmetric data. In these cases, you can test a column for **NULL** values in a **WHERE** clause by using either the **IS NULL** or **IS NOT NULL** constraint.

WHERE column IS/IS NOT NULL
AND/OR another_condition
AND/OR …

SELECT * 
FROM buildings
LEFT JOIN employees
    ON buildings.building_name=employees.building
WHERE name IS NULL;
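The sketch below reproduces this LEFT JOIN plus IS NULL pattern with Python's built-in sqlite3 module. The table and column names follow the queries above, while the rows are made up. It sticks to LEFT JOIN because RIGHT and FULL joins are only available in newer SQLite releases (3.39 and later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buildings (building_name TEXT)")
conn.execute("CREATE TABLE employees (name TEXT, role TEXT, building TEXT)")
# Hypothetical rows: building "3n" has no employees.
conn.executemany("INSERT INTO buildings VALUES (?)", [("1e",), ("2w",), ("3n",)])
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "Engineer", "1e"), ("Bob", "Artist", "2w")],
)

join_sql = """
    FROM buildings
    LEFT JOIN employees
        ON buildings.building_name = employees.building
"""

# Every building is kept; unmatched employee columns come back as NULL (None).
left_rows = conn.execute(
    "SELECT building_name, name" + join_sql + "ORDER BY building_name").fetchall()

# IS NULL finds the buildings without employees.
empty_buildings = [r[0] for r in conn.execute(
    "SELECT building_name" + join_sql + "WHERE name IS NULL")]

# Comparing with = NULL never matches, because the comparison itself yields NULL.
equals_null_count = conn.execute(
    "SELECT COUNT(*)" + join_sql + "WHERE name = NULL").fetchone()[0]

print(left_rows, empty_buildings, equals_null_count)
```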

# SQL Lesson 9: Queries with expressions

In addition to querying and referencing raw column data with SQL, you can also use expressions to write more complex logic on column values in a query. These expressions can use mathematical and string functions along with basic arithmetic to transform values when the query is executed.

SELECT particle_speed / 2.0 AS half_particle_speed FROM physics_data;

In addition to expressions, regular columns and even tables can also have aliases to make them easier to reference in the output and as a part of simplifying more complex queries.

SELECT column AS better_column_name, …
FROM a_long_widgets_table_name AS mywidgets
INNER JOIN widget_sales
    ON mywidgets.id = widget_sales.widget_id;
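To see what an expression with an alias returns, here is a small sketch using Python's built-in sqlite3 module. The `physics_data` values are invented. It also shows one reason to divide by 2.0 rather than 2: SQLite performs integer division when both operands are integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE physics_data (particle_speed REAL)")
conn.executemany("INSERT INTO physics_data VALUES (?)", [(10.0,), (3.0,)])  # made-up values

cur = conn.execute("""
    SELECT particle_speed / 2.0 AS half_particle_speed
    FROM physics_data
    ORDER BY particle_speed
""")
column_name = cur.description[0][0]  # the alias becomes the output column name
halves = [row[0] for row in cur]

# Integer / integer truncates; mixing in a float gives a fractional result.
int_div, real_div = conn.execute("SELECT 3 / 2, 3 / 2.0").fetchone()

print(column_name, halves, int_div, real_div)
```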

# SQL Lesson 10: Queries with aggregates (Pt. 1)

SQL also supports the use of aggregate expressions (or functions) that allow you to summarize information about a group of rows of data. With the Pixar database that you've been using, aggregate functions can be used to answer questions like, "How many movies has Pixar produced?", or "What is the highest grossing Pixar film each year?".

SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression;

### Common aggregate functions:

**COUNT(\* or column)** A common function used to count the number of rows in the group if no column name is specified. Otherwise, it counts the number of rows in the group with non-NULL values in the specified column.

**MIN(column)** Finds the smallest numerical value in the specified column for all rows in the group.

**MAX(column)** Finds the largest numerical value in the specified column for all rows in the group.

**SUM(column)** Finds the sum of all numerical values in the specified column for the rows in the group.

**AVG(column)** Finds the average numerical value in the specified column for all rows in the group.

### Grouped aggregate functions
You can also apply the aggregate functions to individual groups of data within the table instead of to all rows at once (e.g. box office sales for Comedies vs Action movies).
This creates as many results as there are unique groups, as defined by the **GROUP BY** clause.

SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression
GROUP BY column;

The **GROUP BY** clause works by grouping rows that have the same value in the column specified.

### example
Find the total number of employee years worked in each building

SELECT building, SUM(years_employed)
FROM employees
GROUP BY building;
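A runnable sketch of grouped aggregates, using Python's built-in sqlite3 module and made-up `employees` rows. It selects only the grouping column next to the aggregates, and includes one NULL to show how COUNT(\*) and COUNT(column) differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, building TEXT, years_employed INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",  # hypothetical rows, one with NULL years
    [("Ann", "1e", 4), ("Bob", "1e", 6), ("Cy", "2w", 3), ("Di", "2w", None)],
)

per_building = conn.execute("""
    SELECT building,
           COUNT(*)              AS headcount,      -- counts every row in the group
           COUNT(years_employed) AS known_years,    -- skips NULL values
           SUM(years_employed)   AS total_years,
           AVG(years_employed)   AS average_years   -- NULLs are ignored here too
    FROM employees
    GROUP BY building
    ORDER BY building
""").fetchall()

for row in per_building:
    print(row)
```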

# SQL Lesson 11: Queries with aggregates (Pt. 2)
One thing that you might have noticed is that the GROUP BY clause is executed after the WHERE clause (which filters the rows that are to be grouped). So how exactly do we filter the grouped rows?

Luckily, SQL allows us to do this by adding an additional **HAVING** clause which is used specifically with the **GROUP BY** clause to allow us to filter grouped rows from the result set.

SELECT group_by_column, AGG_FUNC(column_expression) AS aggregate_result_alias, …
FROM mytable
WHERE condition
GROUP BY column
HAVING group_condition;

The **HAVING** clause constraints are written the same way as the **WHERE** clause constraints, and are applied to the grouped rows.

### example
Find the total number of years employed by all Engineers

SELECT role, SUM(years_employed) AS totalyears
FROM employees
GROUP BY role
HAVING role="Engineer";
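The difference between the two filters can be sketched as follows, using Python's built-in sqlite3 module and invented `employees` rows: WHERE removes rows before they are grouped, while HAVING can test an aggregate of each finished group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (role TEXT, years_employed INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",  # hypothetical rows
    [("Engineer", 4), ("Engineer", 6), ("Artist", 2), ("Manager", 9), ("Artist", 1)],
)

# Row-level filter: only Engineer rows reach the GROUP BY step.
where_rows = conn.execute("""
    SELECT role, SUM(years_employed) AS totalyears
    FROM employees
    WHERE role = 'Engineer'
    GROUP BY role
""").fetchall()

# Group-level filter: keep only roles whose summed years exceed 5.
having_rows = conn.execute("""
    SELECT role, SUM(years_employed) AS totalyears
    FROM employees
    GROUP BY role
    HAVING SUM(years_employed) > 5
    ORDER BY role
""").fetchall()

print(where_rows)
print(having_rows)
```

A condition on a plain column such as `role` works in either clause, but a condition on an aggregate can only go in HAVING.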

# SQL Lesson 12: Order of execution of a Query

- FROM and JOIN
- WHERE 
- GROUP BY
- HAVING
- SELECT
- DISTINCT
- ORDER BY
- LIMIT / OFFSET

## overall example

Find the total domestic and international sales that can be attributed to each director 

SELECT director, SUM(domestic_sales + international_sales) as Cumulative_sales_from_all_movies
FROM movies
    INNER JOIN boxoffice
        ON movies.id = boxoffice.movie_id
GROUP BY director;
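The overall example can be run against a tiny made-up dataset with Python's built-in sqlite3 module. The rows below are assumptions, and an ORDER BY on the alias is added only to make the output order predictable; it works because ORDER BY runs after SELECT in the execution order listed earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, director TEXT)")
conn.execute(
    "CREATE TABLE boxoffice (movie_id INTEGER, domestic_sales INTEGER, international_sales INTEGER)")
# Hypothetical rows: director X has two movies, director Y has one.
conn.executemany("INSERT INTO movies VALUES (?, ?, ?)",
                 [(1, "A", "X"), (2, "B", "Y"), (3, "C", "X")])
conn.executemany("INSERT INTO boxoffice VALUES (?, ?, ?)",
                 [(1, 100, 50), (2, 200, 20), (3, 10, 5)])

sales_by_director = conn.execute("""
    SELECT director,
           SUM(domestic_sales + international_sales) AS Cumulative_sales_from_all_movies
    FROM movies
        INNER JOIN boxoffice
            ON movies.id = boxoffice.movie_id
    GROUP BY director
    ORDER BY Cumulative_sales_from_all_movies DESC
""").fetchall()

print(sales_by_director)
```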

# Date Functions
- strftime('%Y-%m', InvoiceDate):

This function converts the InvoiceDate to the YYYY-MM format. For example, 2023-10-15 becomes 2023-10.

- julianday():

This function represents a date as a Julian day number: the number of days, including the fractional part, since noon on January 1, 4713 BC (in the proleptic Julian calendar). For example, 2023-10-15 returns 2460232.5.

To calculate the difference in months between two dates

SELECT 
    (strftime('%Y', InvoiceDate2) - strftime('%Y', InvoiceDate1)) * 12
    + (strftime('%m', InvoiceDate2) - strftime('%m', InvoiceDate1)) AS MonthDifference
FROM 
    TableName
WHERE 
    InvoiceDate1 IS NOT NULL AND InvoiceDate2 IS NOT NULL;

Here, InvoiceDate1 and InvoiceDate2 are date fields. This query will compute the difference in months for each record.
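A small sketch of month and day differences, run through Python's built-in sqlite3 module with two made-up dates. Years and months are compared separately because SQLite converts a text value such as '2023-10' to the number 2023 when it appears in arithmetic, so subtracting two '%Y-%m' strings only yields the difference in years.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
dates = {"d1": "2023-10-15", "d2": "2024-02-01"}  # hypothetical example dates

month_diff, naive_diff, day_diff = conn.execute("""
    SELECT
        (strftime('%Y', :d2) - strftime('%Y', :d1)) * 12
        + (strftime('%m', :d2) - strftime('%m', :d1)),      -- calendar months apart
        strftime('%Y-%m', :d2) - strftime('%Y-%m', :d1),    -- only compares the years
        julianday(:d2) - julianday(:d1)                     -- elapsed days
""", dates).fetchone()

julian = conn.execute("SELECT julianday('2023-10-15')").fetchone()[0]

print(month_diff, naive_diff, day_diff, julian)
```

The month figure counts calendar-month boundaries crossed, not full elapsed months; for elapsed time, the julianday difference is usually the safer starting point.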

# Subqueries / CTEs (WITH)

### Subquery Example
SELECT 
    InvoiceID, 
    InvoiceDate,
    (SELECT julianday(InvoiceDate) FROM TableName WHERE InvoiceID = t.InvoiceID) AS JulianDay
FROM 
    TableName t
WHERE 
    InvoiceDate BETWEEN '2023-01-01' AND '2023-12-31';

In this query, the Julian day for each invoice is calculated by a correlated subquery and returned as the JulianDay column.

### CTE Example
A CTE defines a temporary result set which you can reference within a SELECT, INSERT, UPDATE, or DELETE statement.

WITH DateDifferences AS (
    SELECT 
        InvoiceID,
        InvoiceDate,
        strftime('%Y-%m', InvoiceDate) AS Month
    FROM 
        TableName
)
SELECT 
    Month,
    COUNT(*) AS InvoiceCount
FROM 
    DateDifferences
GROUP BY 
    Month;

This query calculates the number of invoices per month. The DateDifferences CTE derives the Month column once, so the main query can group by it directly.
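A runnable sketch of a per-month invoice count built on a CTE, using Python's built-in sqlite3 module. `TableName`, `InvoiceID`, `InvoiceDate` and `DateDifferences` follow the names used above, the rows are invented, and the main query groups by Month only so that invoices from the same month are counted together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (InvoiceID INTEGER PRIMARY KEY, InvoiceDate TEXT)")
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?)",  # hypothetical invoices
    [(1, "2023-01-05"), (2, "2023-01-20"), (3, "2023-02-03")],
)

invoices_per_month = conn.execute("""
    WITH DateDifferences AS (
        SELECT InvoiceID,
               strftime('%Y-%m', InvoiceDate) AS Month
        FROM TableName
    )
    SELECT Month, COUNT(*) AS InvoiceCount
    FROM DateDifferences
    GROUP BY Month
    ORDER BY Month
""").fetchall()

print(invoices_per_month)
```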

### Mini cheat sheet
- Group + metric: SELECT col, SUM(x) FROM T GROUP BY col
- Rate: SUM(CASE WHEN cond THEN 1 ELSE 0 END) \* 1.0 / COUNT(\*) AS rate
- Share of total: SUM(x) / SUM(SUM(x)) OVER () (if window functions are unavailable in SQLite, cross join the total from a CTE)
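The rate and share-of-total patterns can be sketched without window functions, here via Python's built-in sqlite3 module. The `orders` table, its columns, and the `amount > 20` condition are hypothetical stand-ins for `T`, `x` and `cond`; the share uses a CTE cross-joined to every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (category TEXT, amount INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("a", 10), ("a", 30), ("b", 60), ("b", 100)])

# Rate: fraction of rows meeting a condition; * 1.0 forces non-integer division.
rate = conn.execute("""
    SELECT SUM(CASE WHEN amount > 20 THEN 1 ELSE 0 END) * 1.0 / COUNT(*) AS rate
    FROM orders
""").fetchone()[0]

# Share of total: the one-row CTE holds the grand total and is cross-joined to each row.
shares = conn.execute("""
    WITH totals AS (SELECT SUM(amount) AS total FROM orders)
    SELECT category, SUM(amount) * 1.0 / totals.total AS share
    FROM orders
    CROSS JOIN totals
    GROUP BY category, totals.total
    ORDER BY category
""").fetchall()

print(rate)
print(shares)
```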