# DQL
DQL stands for Data Query Language.

### General framework
1. Identify the table or tables that are needed to solve the question. Do a quick `select *` and `limit` to study the table if required.
2. Identify all the different columns required in the output. Mention if the columns are available or if they need to be created.
3. Identify the keywords needed to solve the given question.
4. If multiple tables need to be combined, apply the appropriate `join`. Make sure the `on` keyword is used correctly in this context.
5. For filter conditions, identify all the special operators or keywords needed to solve the given question. If there are multiple conditions, decide whether they need to be combined with `and`, `or`, etc.
6. Check for an `order by` or `limit` condition, if any.
7. Arrange all the statements in the correct order and execute the query.

### Order of execution
SQL queries adhere to a specific order when evaluating clauses, similar to BODMAS, PEMDAS or BIDMAS in arithmetic. From the user's point of view, a query begins with the first clause and ends with the last one. However, the clauses are not actually executed from top to bottom.

The order in which the clauses of a query are executed is as follows,
1. `from` or `join`.
2. `where`.
3. `group by`.
4. `having`.
5. `select`.
6. `order by`.
7. `limit` or `offset`.
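
A minimal sketch of how this order shows up in practice, using Python's built-in `sqlite3` module as a stand-in engine. The `employees` table and its values are made up for illustration, and other engines may report the error differently. Because `where` is evaluated before `group by`, it can filter individual rows but cannot see aggregates, while `having` is evaluated after grouping and can.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employees (employee_id integer, department text, salary integer)")
conn.executemany(
    "insert into employees values (?, ?, ?)",
    [(1, "sales", 100), (2, "sales", 300), (3, "it", 200), (4, "it", 50), (5, "hr", 400)],
)

# where drops the rows with salary < 100 first, then the remaining rows are
# grouped, then having keeps only the groups with at least 2 rows left.
rows = conn.execute(
    """
    select department, count(*) as headcount
    from employees
    where salary >= 100
    group by department
    having count(*) >= 2
    order by headcount desc
    limit 5
    """
).fetchall()
print(rows)

# An aggregate inside where is rejected: the groups do not exist yet at that step.
try:
    conn.execute("select department from employees where count(*) >= 2 group by department")
    aggregate_in_where_failed = False
except sqlite3.OperationalError:
    aggregate_in_where_failed = True
print(aggregate_in_where_failed)
```

Only the `sales` department survives here: the `it` department loses one row in the `where` step, so it no longer has two rows by the time `having` is checked.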

# `select`
The `select` command is used to select data from the database. The data returned is stored in a result table, called the result-set.

```sql
-- syntax
select column1, column2, ...
from table_name;

-- example
-- the query below selects the specified columns
select product_id, product_name
from `farmers_market.product`;
```

To select all the columns from the table,

```sql
-- syntax
select *
from `database_name.table_name`;

-- example
-- the query below selects all the columns
select *
from `farmers_market.product`;
```

The `*` is called the wildcard.

To select all the columns from only one specific table when two or more tables are joined,

```sql
-- example
select employees.*
from employees left join job_history on employees.employee_id = job_history.employee_id
where job_history.employee_id is null
order by employees.employee_id;

-- another example
select c.*
from `farmers_market.customer` as c left join `farmers_market.customer_purchases` as cp on c.customer_id = cp.customer_id;
```

### `select distinct`
The `select distinct` command is used to select only the distinct records or entries or values from the columns.

```sql
-- syntax
select distinct column1, column2, ...
from table_name;

-- example
-- the query below selects only the distinct records from the specified columns
select distinct product_id, product_name
from `farmers_market.product`;
```
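
With more than one column, `select distinct` removes rows where the combination of values repeats, not repeats within each column separately. A small sketch of that behaviour, assuming a made-up `purchases` table and using Python's built-in `sqlite3` module as a stand-in for the actual database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table purchases (customer_id integer, product_id integer)")
conn.executemany(
    "insert into purchases values (?, ?)",
    [(1, 10), (1, 10), (1, 20), (2, 10)],
)

# distinct on a single column: one row per customer
one_column = conn.execute(
    "select distinct customer_id from purchases order by customer_id"
).fetchall()

# distinct on two columns: one row per (customer, product) combination,
# so only the exact duplicate (1, 10) is collapsed
two_columns = conn.execute(
    "select distinct customer_id, product_id from purchases order by customer_id, product_id"
).fetchall()

print(one_column)
print(two_columns)
```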

### Inline calculations
Whenever there is a need to create a new column that is based on existing columns in the dataset, inline calculations are used. The new column is displayed only in the output, and the original data in the dataset is not affected.

Why? `select` is a query command, not a manipulation or definition command.

The inline calculations are performed in the `select` command.

```sql
-- the query below shows employee_id and total_earnings 
-- (salary + commission_pct) from the employees table and orders the output by 
-- the 1st column in the select statement
select employee_id, salary + commission_pct as total_earnings
from `hr.employees`
order by 1 desc;
```
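
The sketch below illustrates two points about inline calculations: the stored table is left untouched, and an arithmetic expression evaluates to `NULL` whenever one of its inputs is `NULL`. It uses Python's built-in `sqlite3` module as a stand-in engine, and the two employee rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employees (employee_id integer, salary real, commission_pct real)")
conn.executemany(
    "insert into employees values (?, ?, ?)",
    [(1, 1000.0, 0.5), (2, 2000.0, None)],
)

rows = conn.execute(
    """
    select
        employee_id,
        salary + commission_pct as total_earnings,
        salary + coalesce(commission_pct, 0) as total_earnings_no_null
    from employees
    order by 1 desc
    """
).fetchall()
print(rows)

# The calculated columns exist only in the result set; the table still has
# exactly the three columns it was created with.
stored_columns = [col[0] for col in conn.execute("select * from employees").description]
print(stored_columns)
```

Employee 2 has no commission, so `total_earnings` comes back as `NULL` (`None` in Python) unless the missing value is replaced first, here with `coalesce()`.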

# `from`
The `from` command is used to specify which table to select or delete the data from.

```sql
-- syntax
select column1, column2, ...
from table_name;
```

While writing the name of the table in a query, the `table_name` should follow the `database_name`.

```sql
-- syntax
select column1, column2, ...
from `database_name.table_name`;

-- example
-- the query below selects the specified columns from the 
-- vendor_booth_assignments table
select market_date, vendor_id, booth_number
from `farmers_market.vendor_booth_assignments`;
```

# `limit`
The `limit` command is used to filter the output result with a limited number of rows.

```sql
-- syntax
select column_name
from `database_name.table_name`
limit integer_number;
```

The `integer_number` must be a non-negative whole number. Floating point numbers are not allowed.

```sql
-- example
-- the query below selects the specified columns from the 
-- vendor_booth_assignments table and shows 10 rows from the top
select market_date, vendor_id, booth_number
from `farmers_market.vendor_booth_assignments`
limit 10;
```

# `order by`
The `order by` command is used to sort the result-set based on a column. The sorting is done in either ascending (default) or descending order.

```sql
-- syntax
select column_name1, column_name2, ...
from `database_name.table_name`
order by column_name;
```

When several columns are listed in the `order by` command, priority is given to the column that is placed first, and the later columns only break ties. When `order by 1` is specified in a query, the entire result is ordered by the first column listed in the `select` command.

```sql
-- example
-- the query below shows all the columns from the vendor_booth_assignments table and orders them by market_date
select *
from `farmers_market.vendor_booth_assignments`
order by market_date;
```

### `desc`
The `desc` command is used to sort the data returned in descending order (highest to lowest).

```sql
-- syntax
select column_names
from `database_name.table_name`
order by column_name desc;

-- example
-- the query below shows all the columns from customer_purchases table 
-- and orders them by market_date in descending order (most recent orders are on the top)
select *
from `farmers_market.customer_purchases`
order by market_date desc;
```

### `asc`
The `asc` command is used to sort the data returned in ascending order (lowest to highest). This is an optional syntax, because the sorting is done in ascending order by default.

```sql
-- syntax
select column_names
from `database_name.table_name`
order by column_name asc;

-- example
-- the query below selects all the columns from the vendor_booth_assignments 
-- table and sorts the output by market_date in descending order and vendor_id in ascending order
select *
from `farmers_market.vendor_booth_assignments`
order by market_date desc, vendor_id asc;
```
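
As a rough illustration of sorting on several columns, the sketch below sorts by date descending and uses the vendor id only to break ties, then repeats the query with positional references. It runs on Python's built-in `sqlite3` module as a stand-in engine, with a cut-down, made-up version of the `vendor_booth_assignments` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table vendor_booth_assignments (market_date text, vendor_id integer)")
conn.executemany(
    "insert into vendor_booth_assignments values (?, ?)",
    [("2019-04-03", 7), ("2019-04-06", 3), ("2019-04-06", 1), ("2019-04-03", 2)],
)

# market_date decides the order; vendor_id is only consulted for equal dates
by_name = conn.execute(
    """
    select market_date, vendor_id
    from vendor_booth_assignments
    order by market_date desc, vendor_id asc
    """
).fetchall()

# the same ordering written with column positions from the select list
by_position = conn.execute(
    """
    select market_date, vendor_id
    from vendor_booth_assignments
    order by 1 desc, 2 asc
    """
).fetchall()

print(by_name)
print(by_name == by_position)
```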

# `offset`
The `offset` command is used to identify the starting point from which rows are returned from a result set. Basically, the first n rows are skipped when `offset n` is used, where n is the specified integer.

`offset` can only be used with the `limit` command. It cannot be used on its own.

The `offset` value must be greater than or equal to 0. It cannot be negative and it cannot be a floating point value.

```sql
-- syntax
select column_names
from `database_name.table_name`
order by column_name
limit integer_number offset integer_number;
```

Since `offset` skips rows from the top, `limit 1 offset 2` skips the first two rows and returns only the third.

```sql
-- the query below displays all the information for the 3rd row from the 
-- customer_purchases table
select *
from `farmers_market.customer_purchases`
order by market_date desc
limit 1 offset 2;
```
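
A common use of `limit` together with `offset` is paging through a sorted result. The sketch below is one possible way to do it, not a prescribed pattern: `fetch_page` is a hypothetical helper, the seven purchase rows are invented, and Python's built-in `sqlite3` module stands in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table customer_purchases (purchase_id integer, market_date text)")
conn.executemany(
    "insert into customer_purchases values (?, ?)",
    [(i, f"2019-04-{i:02d}") for i in range(1, 8)],
)

def fetch_page(page_number, page_size=3):
    """Return the purchase ids on one page, newest market_date first (pages start at 1)."""
    offset = (page_number - 1) * page_size  # rows to skip before this page
    rows = conn.execute(
        "select purchase_id from customer_purchases "
        "order by market_date desc limit ? offset ?",
        (page_size, offset),
    ).fetchall()
    return [row[0] for row in rows]

for page in (1, 2, 3, 4):
    print(page, fetch_page(page))
```

The `order by` matters here: without it the engine is free to return rows in any order, so the pages would not be stable.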

# `as` (Aliasing)
### Aliasing
Qualifiers are used within SQL statements to reference data structures, such as databases, tables or columns.

Aliasing on tables is also possible (it is not limited to columns alone). The `as` keyword is used to perform the operation of aliasing.

### `as`
The `as` command is used to rename a column or a table with an alias. An alias only exists for the duration of the query.

```sql
-- syntax
select column_name as alias_name
from `database_name.table_name`;
```

Tables can be aliased as well, which keeps qualified column names short.

```sql
-- the query below shows the employee_id and annual_salary (salary * 12)
select e.employee_id, e.salary * 12 as annual_salary
from `hr.employees` as e;
```

# Functions
All DBMSs and data warehouses provide built-in utilities called functions. These functions make it possible to perform certain calculations. Functions are usually used with the `select` keyword in SQL.

e.g., `concat()`, `upper()`, `lower()`, etc.

### Compatibility of functions with different data types
`avg(hire_date)` does not make any sense, but `min(hire_date)` or `max(hire_date)` or `count(hire_date)` does make sense.

`min()` or `max()` can be applied to any data type that the `order by` keyword can be applied to.

### String functions
- `concat()`: The `concat()` function combines 2 or more strings together.
    ```sql
    -- syntax
    select column1, column2, concat(column1, column2) as column3
    from `database_name.table_name`;

    -- example
    -- the query below shows the first_name, last_name and full_name of the customers 
    -- from the customer table
    select 
	    customer_first_name, 
	    customer_last_name, 
	    concat(
		    customer_first_name, 
		    ' ', 
		    customer_last_name
	    ) as customer_full_name
    from `farmers_market.customer`;
    ```
- `upper()`: The `upper()` function converts a string to uppercase.
    ```sql
    -- syntax
    select upper(column_name) as alias_name
    from `database_name.table_name`;

    -- example
    -- the query below shows the first_name, last_name and full_name in upper case 
    -- of the customers from the customer table
    select 
        upper(customer_first_name), 
        upper(customer_last_name), 
        upper(
            concat(
                customer_first_name, 
                ' ', 
                customer_last_name
            )
        ) as customer_full_name_uppercase
    from `farmers_market.customer`;
    ```
- `lower()`: The `lower()` function converts a string to lowercase.
    ```sql
    -- syntax
    select lower(column_name) as alias_name
    from `database_name.table_name`;

    -- example
    -- the query below shows the first_name, last_name and full_name in lower case 
    -- of the customers from the customer table
    select 
        lower(customer_first_name), 
        lower(customer_last_name), 
        lower(
            concat(
                customer_first_name, 
                ' ', 
                customer_last_name
            )
        ) as customer_full_name_lowercase
    from `farmers_market.customer`;
    ```
- `left()`: The `left()` function extracts a number of characters from a string (starting from left).
    ```sql
    -- syntax
    left(string, number_of_chars)

    -- example
    select person_id, concat(name, '(', left(profession, 1), ')') as name
    from person
    order by person_id desc;
    ```
- `right()`: The `right()` function extracts a number of characters from a string (starting from right).
    ```sql
    -- syntax
    right(string, number_of_chars)

    -- example
    select person_id, concat(name, '(', right(profession, 1), ')') as name
    from person
    order by person_id desc;
    ```

### Numeric functions
- `count()`: The `count()` function returns the number of rows that match a specified criterion. When a `column_name` is specified, `count()` does not count the rows where that column is `NULL`. When `count(*)` is used, every row is counted, including rows that contain `NULL` values.
    ```sql
    -- syntax
    -- count() can be used in three ways
    count(column_name) 
    count(*)
    count(distinct column_name)

    -- example
    -- the query below displays the count of all the employees in the company
    select count(employee_id) as employee_count
    from `hr.employees`;
    ```
- `min()`: The `min()` function returns the minimum value in a set of values.
    ```sql
    -- syntax
    select min(column_name)
    from `database_name.table_name`;

    -- example
    -- the query below displays the minimum, maximum and average salary of the 
    -- employees
    select 
        min(salary) as min_salary, 
        max(salary) as max_salary, 
        avg(salary) as average_salary
    from `hr.employees`;
    ```
- `max()`: The `max()` function returns the maximum value in a set of values.
    ```sql
    -- syntax
    select max(column_name)
    from `database_name.table_name`;

    -- example
    -- the query below displays the minimum, maximum and average salary of the 
    -- employees
    select 
        min(salary) as min_salary, 
        max(salary) as max_salary, 
        avg(salary) as average_salary
    from `hr.employees`;
    ```
- `avg()`: The `avg()` function returns the average value of an expression. The `NULL` values are ignored.
    ```sql
    -- syntax
    select avg(column_name)
    from `database_name.table_name`;

    -- example
    -- the query below displays the minimum, maximum and average salary of the 
    -- employees
    select 
        min(salary) as min_salary, 
        max(salary) as max_salary, 
        avg(salary) as average_salary
    from `hr.employees`;
    ```
- `round()`: The `round()` function is used to round a decimal number off to n digits. The n is passed as an argument.
    ```sql
    -- syntax
    select column_name, round(column_name * expression, n)
    from `database_name.table_name`;

    -- example
    select original_title, round((vote_count/ (vote_count + 104.0) * vote_average) + (104.0/ (104.0 + vote_count) * 5.97), 2) as weighted_avg_rating
    from movies
    order by 2 desc, 1 asc
    limit 10;
    ```
- `mod()`: The `mod()` function returns the remainder of a number divided by another number.
    ```sql
    -- syntax
    mod(int1, int2)

    -- example
    select employee_id,
    case
        when mod(employee_id, 2) = 0 then 0
        when name like 'M%' then 0
        else salary
    end as bonus
    from employees;
    ```
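
The way the aggregate functions above treat `NULL` is easiest to see side by side. The sketch below is an illustration on invented data (a four-row `employees` table with a hypothetical `commission` column), run through Python's built-in `sqlite3` module as a stand-in engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employees (employee_id integer, commission integer)")
conn.executemany(
    "insert into employees values (?, ?)",
    [(1, 10), (2, None), (3, 10), (4, 40)],
)

row = conn.execute(
    """
    select
        count(*) as all_rows,                            -- every row, NULL or not
        count(commission) as with_commission,            -- NULL commission skipped
        count(distinct commission) as distinct_values,   -- 10 and 40
        avg(commission) as average_commission            -- (10 + 10 + 40) / 3
    from employees
    """
).fetchone()
print(row)
```

Note that `avg()` divides by the number of non-`NULL` values (3), not by the number of rows (4).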

### Date and time functions
- `extract()`: The `extract()` function extracts a part from a given date or time value.
    ```sql
    -- syntax
    select extract(part from column_name)
    from `database_name.table_name`;
    -- part = The part to extract. 
    -- part can be any of the following: MICROSECOND, SECOND, MINUTE, HOUR, DAY,
    -- WEEK, MONTH, QUARTER, YEAR, SECOND_MICROSECOND, MINUTE_MICROSECOND, MINUTE_SECOND, 
    -- HOUR_MICROSECOND, HOUR_SECOND, HOUR_MINUTE, DAY_MICROSECOND, DAY_SECOND, 
    -- DAY_MINUTE, DAY_HOUR, YEAR_MONTH.

    -- example
    -- the query below displays the month in which the employee was hired in
    select employee_id, hire_date, extract(month from hire_date) as month_of_hire
    from `hr.employees`;
    ```
- `year()`: The `year()` function returns the year part of a given date (a number from 1000 to 9999).
    ```sql
    -- syntax
    select year(column_name)
    from `database_name.table_name`;
    ```
- `current_date()`: The `current_date()` function returns the current date.
    ```sql
    -- syntax
    select current_date();
    select current_date() + 1;

    -- example
    -- the query below displays the current date and current time
    select current_date() as currentdate, current_time() as currenttime;
    ```
- `current_time()`: The `current_time()` function returns the current time.
    ```sql
    -- syntax
    select current_time();

    -- example
    -- the query below displays the current date and current time
    select current_date() as currentdate, current_time() as currenttime;
    ```
- `current_datetime()`: The `current_datetime()` function returns the current date and current time.
    ```sql
    -- syntax
    select current_datetime();

    -- example
    -- the query below displays the current date and current time in a single cell
    select current_datetime() as currentdatetime;
    ```
- `current_timestamp()`: The `current_timestamp()` function returns the current date and current time.
    ```sql
    -- syntax
    select current_timestamp();

    -- example
    -- the query below displays the current timestamp
    select current_timestamp() as currenttimestamp;
    ```
- `date_diff()`: The `date_diff()` function returns the difference between 2 date values, expressed in the given time unit (days, months, years, etc.).
    ```sql
    -- syntax
    select date_diff(date1, date2, time_unit);
    -- time_unit can be year, month, or day

    -- example
    -- the query below displays the tenure of each employee in years 
    -- from the date of hire
    select 
        employee_id, 
        hire_date, 
        date_diff(current_date(), hire_date, year) as tenure_in_years
    from `hr.employees`;

    -- the query below displays the time period, in years, between the company's 
    -- start date (the earliest hire_date) and the date of hiring of the employee
    select 
        employee_id, 
        hire_date, 
        date_diff(hire_date, min(hire_date) over(), year) as years_since_company_start
    from `hr.employees`
    order by years_since_company_start;
    ```
- `date_add()`: The `date_add()` function adds a date or time interval to a date and then returns the date.
    ```sql
    -- syntax
    select date_add(date, interval value addunit);
    -- where,
    -- date = date that is to be modified
    -- value = value of time/ date interval to be added
    -- both positive and negative values are allowed
    -- addunit = type of interval to add
    -- can be any of the following
    -- MICROSECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR, 
    -- SECOND_MICROSECOND, MINUTE_MICROSECOND, MINUTE_SECOND, HOUR_MICROSECOND, 
    -- HOUR_SECOND, HOUR_MINUTE, DAY_MICROSECOND, DAY_SECOND, DAY_MINUTE, DAY_HOUR, 
    -- YEAR_MONTH

    -- example
    -- the query below displays the employees who joined within 2 years of the company's inception
    select *
    from (
        select 
            employee_id, 
            hire_date, 
            min(hire_date) over() as company_start_date, 
            date_add(min(hire_date) over(), interval 2 year) as company_start_2_years
        from `hr.employees`
    )
    where hire_date < company_start_2_years;
    ```
- `date_sub()`: The `date_sub()` function subtracts a date or time interval from a date and returns the date.
    ```sql
    -- syntax
    select date_sub(date, interval value addunit);
    -- where
    -- date = date that is to be modified
    -- value = value of time/ date interval to be subtracted
    -- both positive and negative values are allowed
    -- addunit = type of interval to subtract
    -- can be any of the following
    -- MICROSECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR, 
    -- SECOND_MICROSECOND, MINUTE_MICROSECOND, MINUTE_SECOND, HOUR_MICROSECOND, 
    -- HOUR_SECOND, HOUR_MINUTE, DAY_MICROSECOND, DAY_SECOND, DAY_MINUTE, DAY_HOUR, 
    -- YEAR_MONTH
    ```

### Other functions
- `ifnull()`: The `ifnull()` function returns a specified value if the expression is `NULL`. If the expression is not `NULL`, this function returns the expression.
    ```sql
    -- syntax
    ifnull(expression, alternate_value)

    -- example
    -- the query below replaces the null values in the product_size column with 
    -- 'missing' and returns the result as a new column with the specified name
    select *, ifnull(product_size, 'missing') as product_size_new
    from `farmers_market.product`;
    ```
- `coalesce()`: The `coalesce()` function returns the first non-null value in a list.
    ```sql
    -- syntax
    coalesce(val1, val2, ..., valN)
    ```
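
To compare the two functions above, the sketch below applies `ifnull()` with a single fallback and `coalesce()` with a chain of fallbacks to the same rows. It is only an illustration: the three products are invented, the `product_qty_type` column is assumed to exist alongside `product_size`, and Python's built-in `sqlite3` module is used as a stand-in engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product (product_name text, product_size text, product_qty_type text)")
conn.executemany(
    "insert into product values (?, ?, ?)",
    [("Carrot", "small", "unit"), ("Honey", None, "lbs"), ("Eggs", None, None)],
)

rows = conn.execute(
    """
    select
        product_name,
        ifnull(product_size, 'missing') as size_or_missing,
        -- coalesce walks the list left to right and returns the first non-null value
        coalesce(product_size, product_qty_type, 'unknown') as first_known_value
    from product
    order by product_name
    """
).fetchall()
for row in rows:
    print(row)
```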

# `where`
The `where` keyword is used to filter records. It is used to extract only those records that fulfil a specific condition.

When the `where` keyword is used, the rest of the query operates only on the rows that satisfy the condition attached to the keyword.

The condition attached to the `where` keyword is evaluated for each row to either true or false (or unknown when a `NULL` is involved), and only the rows for which it is true are kept.

```sql
-- syntax
select column_name
from `database_name.table_name`
where condition;
```

The `where` keyword is not just used with `select` statement, but it is also used with `update`, `delete` and other keywords as well.

SQL requires quotes around text values. Double quotes are also allowed in some dialects, but using single quotes is the preferred practice.

Whenever an operation or a condition is applied to string values, make sure that the string is inside single quotes.

Operators used with `where` keywords are,
- Equal to: `=`.
- Greater than: `>`.
- Lesser than: `<`.
- Greater than or equal to: `>=`.
- Lesser than or equal to: `<=`.
- Not equal to: `<>`, `!=`.

Strings placed inside single quotes are case sensitive. For any condition used along with the `where` keyword, the query tries to find an exact match.

Dates in SQL should be entered within single quotes.

```sql
-- examples
-- the query below shows all the product_names from the product table where the product_category_id is 1
select product_name, product_category_id
from `farmers_market.product`
where product_category_id = 1;

-- the query below shows all the entries which fall under the product_size 'medium'
select *
from `farmers_market.product`
where lower(product_size) = 'medium';

-- the query below shows all the entries which fall under the product_size 'small' and have product_category_id as 1
select *
from `farmers_market.product`
where lower(product_size) = 'small' and product_category_id = 1;

-- the query below shows all the entries which have product_category_id as 2 or product_name as 'Carrot'
select *
from `farmers_market.product`
where product_category_id = 2 or product_name = 'Carrot';

-- the query below displays information for all the entries whose product_id is greater than 3
select *
from `farmers_market.product`
where product_id > 3;

-- the query below displays information about what booth was assigned to vendor_id 3 on or before 20th April 2019
select vendor_id, booth_number, market_date
from `farmers_market.vendor_booth_assignments`
where vendor_id = 3 and market_date <= '2019-04-20';

-- the query below displays information about what booth was assigned to vendor_id 7 between market dates 2019-04-03 and 2019-05-16
select vendor_id, booth_number, market_date
from `farmers_market.vendor_booth_assignments`
where vendor_id = 7 and market_date >= '2019-04-03' and market_date <= '2019-05-16'
order by market_date;

-- the same result can be obtained using the between statement
select *
from `farmers_market.vendor_booth_assignments`
where vendor_id = 7 and market_date between '2019-04-03' and '2019-05-16'
order by market_date;
```
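
The reason the examples above wrap the column in `lower()` is the exact-match behaviour of string comparison. The sketch below illustrates it on four invented products, using Python's built-in `sqlite3` module as a stand-in engine. SQLite compares text case-sensitively by default, but this depends on the engine and its collation settings, so treat the result as illustrative rather than universal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product (product_name text, product_size text)")
conn.executemany(
    "insert into product values (?, ?)",
    [("Carrot", "Medium"), ("Beet", "medium"), ("Kale", "MEDIUM"), ("Corn", "small")],
)

# exact match: only the row stored in exactly this casing qualifies
exact = conn.execute(
    "select product_name from product where product_size = 'medium' order by product_name"
).fetchall()

# normalising the column with lower() makes the comparison case-insensitive
normalised = conn.execute(
    "select product_name from product where lower(product_size) = 'medium' order by product_name"
).fetchall()

print(exact)
print(normalised)
```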

# `and`
The `and` keyword is used along with `where` to only include rows where both the conditions are true.

```sql
-- syntax
select column_names
from `database_name.table_name`
where condition1 and condition2;

-- example
-- the query below shows all the entries which fall under the product_size 'small' and have product_category_id as 1
select *
from `farmers_market.product`
where lower(product_size) = 'small' and product_category_id = 1;
```

# `or`
The `or` keyword is used with `where` to include rows where at least one of the conditions is true.

```sql
-- syntax
select column_name(s)
from `database_name.table_name`
where condition1 or condition2;

-- example
-- the query below shows all the entries which have product_category_id as 2 or 
-- product_name as 'Carrot'
select *
from `farmers_market.product`
where product_category_id = 2 or product_name = 'Carrot';
```

# `between`
The `between` keyword is used when working with a range which is inclusive of the limits. Do not use `between` when working with a range which is exclusive of the limits.

```sql
-- syntax
select column_name(s)
from `database_name.table_name`
where column_name between start_value and end_value;

-- example
-- the query below displays information about what booth was assigned to 
-- vendor_id 7 between market dates 2019-04-03 and 2019-05-16
select vendor_id, booth_number, market_date
from `farmers_market.vendor_booth_assignments`
where vendor_id = 7 and market_date >= '2019-04-03' and market_date <= '2019-05-16'
order by market_date;

-- the same result can be obtained using the between statement
select *
from `farmers_market.vendor_booth_assignments`
where vendor_id = 7 and market_date between '2019-04-03' and '2019-05-16'
order by market_date;
```
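
To make the inclusive limits concrete, the sketch below compares `between` with the equivalent `>=`/`<=` pair and with a strict `>`/`<` range. It is an illustration on invented rows, run through Python's built-in `sqlite3` module as a stand-in engine. The dates are stored as ISO-8601 text, which SQLite compares correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table vendor_booth_assignments (vendor_id integer, market_date text)")
conn.executemany(
    "insert into vendor_booth_assignments values (?, ?)",
    [(7, "2019-04-03"), (7, "2019-04-10"), (7, "2019-05-16"), (7, "2019-05-18"), (3, "2019-04-10")],
)

def dates(condition):
    """Return the market dates of vendor 7 that satisfy the given date condition."""
    query = (
        "select market_date from vendor_booth_assignments "
        f"where vendor_id = 7 and {condition} order by market_date"
    )
    return [row[0] for row in conn.execute(query)]

with_between = dates("market_date between '2019-04-03' and '2019-05-16'")
with_comparisons = dates("market_date >= '2019-04-03' and market_date <= '2019-05-16'")
exclusive = dates("market_date > '2019-04-03' and market_date < '2019-05-16'")

print(with_between)   # both boundary dates are included
print(exclusive)      # strict comparisons drop the boundary dates
```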

# `in`
The `in` keyword allows multiple values to be specified, and it works along with `where`. The `in` keyword is a shorthand for multiple `or` conditions.

```sql
-- syntax
select column_name(s)
from `database_name.table_name`
where column_name in(val1, val2, ..., valN);

-- example
-- the query below shows all the products whose product_size is small, medium or large
select product_name
from `farmers_market.product`
where product_size = 'small' or product_size = 'medium' or product_size = 'large';

-- the same result can be obtained using in()
select product_name
from `farmers_market.product`
where product_size in('small', 'medium', 'large');
```
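
A quick check that the two forms above really select the same rows, sketched with Python's built-in `sqlite3` module as a stand-in engine and five invented products:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product (product_name text, product_size text)")
conn.executemany(
    "insert into product values (?, ?)",
    [("Carrot", "small"), ("Beet", "medium"), ("Melon", "large"), ("Honey", None), ("Pumpkin", "huge")],
)

with_or = conn.execute(
    """
    select product_name
    from product
    where product_size = 'small' or product_size = 'medium' or product_size = 'large'
    order by product_name
    """
).fetchall()

with_in = conn.execute(
    """
    select product_name
    from product
    where product_size in ('small', 'medium', 'large')
    order by product_name
    """
).fetchall()

print(with_in)
print(with_or == with_in)
```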

# `is null`
The `is null` operator is used to test for empty values (`NULL` values).

A `NULL` or `null` value is different from a 0 or a field that contains spaces. A field with `null` value is the one that has been left blank during record creation.

```sql
-- syntax
select column_name(s)
from `database_name.table_name`
where column_name is null;

-- example
-- the query displays all the information from the product table which do not 
-- have a product_size
select *
from `farmers_market.product`
where product_size is null;
```

# `is not null`
The `is not null` operator is used to test for non-empty values (values that are not `NULL`).

```sql
-- syntax
select column_name(s)
from `database_name.table_name`
where column_name is not null;

-- example
-- the query displays all the information from the product table which have a 
-- product_size
select *
from `farmers_market.product`
where product_size is not null;
```
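
A dedicated operator is needed because a comparison with `NULL` using `=` never evaluates to true. The sketch below shows this on three invented products, using Python's built-in `sqlite3` module as a stand-in engine:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product (product_name text, product_size text)")
conn.executemany(
    "insert into product values (?, ?)",
    [("Carrot", "small"), ("Honey", None), ("Eggs", None)],
)

def names(condition):
    """Return the product names that satisfy the given where condition."""
    query = f"select product_name from product where {condition} order by product_name"
    return [row[0] for row in conn.execute(query)]

equals_null = names("product_size = null")   # the comparison is unknown for every row
is_null = names("product_size is null")
is_not_null = names("product_size is not null")

print(equals_null)
print(is_null)
print(is_not_null)
```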

# `not`
The `not` keyword is used with `where` to only include rows where a condition is not true.

```sql
-- syntax
select column_name(s)
from `database_name.table_name`
where not condition1;

-- example
-- the query below will display all the information from the product table where the product_size is not 'small', 'medium', or 'large'
select *
from `farmers_market.product`
where product_size not in('small', 'medium', 'large');
```
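
One thing to watch for with `not in`: rows where the column is `NULL` are not returned, because the comparison is unknown rather than true for them. The sketch below illustrates this and one possible way to include those rows. The three products are invented and Python's built-in `sqlite3` module is used as a stand-in engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table product (product_name text, product_size text)")
conn.executemany(
    "insert into product values (?, ?)",
    [("Carrot", "small"), ("Honey", None), ("Melon", "huge")],
)

# Honey has no product_size, so it is filtered out along with Carrot
not_in = conn.execute(
    """
    select product_name
    from product
    where product_size not in ('small', 'medium', 'large')
    order by product_name
    """
).fetchall()

# adding an explicit is null check brings the rows without a size back
not_in_or_null = conn.execute(
    """
    select product_name
    from product
    where product_size not in ('small', 'medium', 'large') or product_size is null
    order by product_name
    """
).fetchall()

print(not_in)
print(not_in_or_null)
```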

# `like`
