# SQL Methods

---

In [1]:
# Enable SQL extensions
%load_ext sql

In [2]:
# Connect to sql-methods.db (SQLite creates the file if it does not exist)
%sql sqlite:///sql-methods.db

In [3]:
%%sql
CREATE TABLE IF NOT EXISTS employees (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    department VARCHAR(50),
    salary DECIMAL(10, 2) NOT NULL
);

 * sqlite:///sql-methods.db
Done.


[]

In [4]:
%sql SELECT name FROM sqlite_master WHERE type='table';

 * sqlite:///sql-methods.db
Done.


name
employees


In [7]:
%%sql
ALTER TABLE employees
ADD hire_date DATE;

ALTER TABLE employees
ADD performance_rating INT;

INSERT INTO employees (employee_id, first_name, last_name, department, salary, hire_date, performance_rating) 
VALUES (1, 'John', 'Doe', 'Sales', 50000.00, '2024-04-23', 3);

INSERT INTO employees (employee_id, first_name, last_name, department, salary, hire_date, performance_rating) 
VALUES (2, 'Jane', 'Smith', 'Marketing', 55000.00, '2024-04-25', 4);

INSERT INTO employees (employee_id, first_name, last_name, department, salary, hire_date, performance_rating) 
VALUES (3, 'Michael', 'Johnson', 'Sales', 60000.00, '2024-04-26', 5);

INSERT INTO employees (employee_id, first_name, last_name, department, salary, hire_date, performance_rating) 
VALUES (4, 'Emily', 'Davis', 'Operations', 62000.00, '2024-04-27', 2);

INSERT INTO employees (employee_id, first_name, last_name, department, salary, hire_date, performance_rating) 
VALUES (5, 'David', 'Wilson', 'Operations', 58000.00, '2024-04-28', 1);

 * sqlite:///sql-methods.db
Done.
Done.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.


[]

## DISTINCT

The **DISTINCT** keyword removes duplicate rows from a query's result set. When several columns are selected, two rows count as duplicates only if every selected column matches.

In [8]:
%%sql

-- Retrieve unique department names from employees
SELECT DISTINCT department
FROM employees;

 * sqlite:///sql-methods.db
Done.


department
Sales
Marketing
Operations


In [10]:
%%sql

INSERT INTO employees (employee_id, first_name, last_name, department, salary, hire_date, performance_rating) 
VALUES (6, 'David', 'Blaine', 'Sales', 50000.00, '2024-03-28', 2);

INSERT INTO employees (employee_id, first_name, last_name, department, salary, hire_date, performance_rating) 
VALUES (7, 'John', 'Doe', 'Marketing', 33000.00, '2024-05-17', 3);

 * sqlite:///sql-methods.db
1 rows affected.
1 rows affected.


[]

In [12]:
%%sql

-- Retrieve unique combinations of first and last names
SELECT DISTINCT first_name, last_name
FROM employees;

 * sqlite:///sql-methods.db
Done.


first_name,last_name
John,Doe
Jane,Smith
Michael,Johnson
Emily,Davis
David,Wilson
David,Blaine


## LIMIT

The **LIMIT** clause specifies the maximum number of rows to return in a query result.

It's commonly used for:

*   Pagination: Retrieving results in smaller chunks, often used in web applications to display data in pages.
*   Performance optimization: Capping the number of rows returned reduces the data sent to the client and, when the query does not have to sort or aggregate the whole table first, lets the database stop reading early.
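
As a minimal sketch of the pagination use case, the snippet below pages through a cut-down, in-memory copy of the `employees` table using Python's built-in `sqlite3` module. The `fetch_page` helper, the page size of 3, and the reduced two-column schema are assumptions made for illustration; they are not part of the notebook.

```python
import sqlite3

# In-memory stand-in for the notebook's employees table (two columns only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INT PRIMARY KEY, first_name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, "John"), (2, "Jane"), (3, "Michael"), (4, "Emily"),
     (5, "David"), (6, "David"), (7, "John")],
)


def fetch_page(page, page_size=3):
    """Return the employee ids on one page (pages are numbered from 1)."""
    offset = (page - 1) * page_size
    rows = conn.execute(
        # ORDER BY keeps page boundaries stable between calls.
        "SELECT employee_id FROM employees ORDER BY employee_id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [row[0] for row in rows]


for page in (1, 2, 3):
    print(page, fetch_page(page))
# 1 [1, 2, 3]
# 2 [4, 5, 6]
# 3 [7]
```

Without an `ORDER BY`, the database is free to return rows in any order, so consecutive pages could overlap or skip rows.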

In [13]:
%%sql

-- Retrieve the first 5 employees from the "employees" table

SELECT * FROM employees
LIMIT 5;

 * sqlite:///sql-methods.db
Done.


employee_id,first_name,last_name,department,salary,hire_date,performance_rating
1,John,Doe,Sales,50000,2024-04-23,3
2,Jane,Smith,Marketing,55000,2024-04-25,4
3,Michael,Johnson,Sales,60000,2024-04-26,5
4,Emily,Davis,Operations,62000,2024-04-27,2
5,David,Wilson,Operations,58000,2024-04-28,1


In [17]:
%%sql

-- Skip the first 3 rows, then return up to 5 rows (same as LIMIT 5 OFFSET 3).
-- Only 4 rows remain after the offset, so 4 rows are returned.
SELECT * FROM employees
LIMIT 3, 5;

 * sqlite:///sql-methods.db
Done.


employee_id,first_name,last_name,department,salary,hire_date,performance_rating
4,Emily,Davis,Operations,62000,2024-04-27,2
5,David,Wilson,Operations,58000,2024-04-28,1
6,David,Blaine,Sales,50000,2024-03-28,2
7,John,Doe,Marketing,33000,2024-05-17,3


## COUNT

The **COUNT** function is used to count the number of rows in a table or the number of rows matching a specific condition.
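
The sketch below contrasts three forms of counting: `COUNT(*)`, `COUNT(column)`, and `COUNT(*)` combined with a `WHERE` condition. It uses Python's built-in `sqlite3` module and a small in-memory table; the row with `employee_id` 8 and a NULL department is hypothetical and exists only to show that `COUNT(column)` skips NULL values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INT PRIMARY KEY, department TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [
        (1, "Sales"),
        (2, "Marketing"),
        (3, "Sales"),
        (8, None),  # hypothetical employee with no department (stored as NULL)
    ],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


total_rows = scalar("SELECT COUNT(*) FROM employees")  # counts every row
with_department = scalar("SELECT COUNT(department) FROM employees")  # skips NULLs
sales_rows = scalar("SELECT COUNT(*) FROM employees WHERE department = 'Sales'")

print(total_rows, with_department, sales_rows)  # prints: 4 3 2
```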

In [18]:
%%sql

-- Count the total number of rows in the "employees" table.

SELECT COUNT(*) FROM employees;

 * sqlite:///sql-methods.db
Done.


COUNT(*)
7


In [23]:
%%sql

-- Count the number of distinct departments in the "employees" table.

SELECT COUNT(DISTINCT department) FROM employees;

 * sqlite:///sql-methods.db
Done.


COUNT(DISTINCT department)
3


## WHERE

The **WHERE** clause is used to filter data retrieved from a database based on specific conditions.

It allows you to narrow down your results to only include rows that meet certain criteria.

> **Common Operators Used in WHERE Clause:**

```
Comparison Operators:
  =: Equal to
  !=: Not equal to (also written <>)
  <: Less than
  >: Greater than
  <=: Less than or equal to
  >=: Greater than or equal to

Logical Operators:
  AND: Used to combine multiple conditions where both must be true.
  OR: Used to combine multiple conditions where at least one must be true.
  NOT: Used to negate a condition.

Special Operators:
  BETWEEN: Checks if a value falls within a specified range (both endpoints included).
  IN: Checks if a value belongs to a set of values.
  LIKE: Used for pattern matching with wildcards.
```
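
When `AND` and `OR` appear in the same condition, `AND` is evaluated first unless parentheses say otherwise. The sketch below shows the difference on an in-memory copy of the employee data, using Python's built-in `sqlite3` module. The `ids` helper and the reduced three-column schema are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INT PRIMARY KEY, department TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Sales", 50000), (2, "Marketing", 55000), (3, "Sales", 60000),
     (4, "Operations", 62000), (5, "Operations", 58000),
     (6, "Sales", 50000), (7, "Marketing", 33000)],
)


def ids(where_clause):
    """Return matching employee ids. Only pass hard-coded, trusted SQL text here."""
    sql = f"SELECT employee_id FROM employees WHERE {where_clause} ORDER BY employee_id"
    return [row[0] for row in conn.execute(sql)]


# Parsed as: Sales OR (Marketing AND salary > 50000)
no_parens = ids("department = 'Sales' OR department = 'Marketing' AND salary > 50000")

# Parentheses apply the salary filter to both departments.
with_parens = ids("(department = 'Sales' OR department = 'Marketing') AND salary > 50000")

print(no_parens)    # [1, 2, 3, 6]
print(with_parens)  # [2, 3]
```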

In [31]:
%%sql

-- using the equal to [ = ]
SELECT * FROM employees
WHERE department = 'Sales';

-- not equal to [ != ]
SELECT * FROM employees
WHERE department != 'Sales';

-- less than [ < ]
SELECT * FROM employees
WHERE performance_rating < 3;

-- greater than [ > ]
SELECT * FROM employees
WHERE performance_rating > 2;

-- less than or equal to [ <= ]
SELECT * FROM employees
WHERE performance_rating <= 3;

-- greater than or equal to [ >= ]
SELECT * FROM employees
WHERE performance_rating >= 2;

 * sqlite:///sql-methods.db
Done.
Done.
Done.
Done.
Done.
Done.


employee_id,first_name,last_name,department,salary,hire_date,performance_rating
1,John,Doe,Sales,50000,2024-04-23,3
2,Jane,Smith,Marketing,55000,2024-04-25,4
3,Michael,Johnson,Sales,60000,2024-04-26,5
4,Emily,Davis,Operations,62000,2024-04-27,2
6,David,Blaine,Sales,50000,2024-03-28,2
7,John,Doe,Marketing,33000,2024-05-17,3


In [34]:
%%sql

-- Logical AND
SELECT * FROM employees
WHERE department = 'Sales' AND salary > 50000;

-- Logical OR
SELECT * FROM employees
WHERE department = 'Operations' OR department = 'Marketing';

-- Logical NOT
SELECT * FROM employees
WHERE NOT department = 'Marketing';

 * sqlite:///sql-methods.db
Done.
Done.
Done.


employee_id,first_name,last_name,department,salary,hire_date,performance_rating
1,John,Doe,Sales,50000,2024-04-23,3
3,Michael,Johnson,Sales,60000,2024-04-26,5
4,Emily,Davis,Operations,62000,2024-04-27,2
5,David,Wilson,Operations,58000,2024-04-28,1
6,David,Blaine,Sales,50000,2024-03-28,2


In [42]:
%%sql

-- BETWEEN (range) - inclusive
SELECT * FROM employees
WHERE salary BETWEEN 50000 AND 60000;

-- IN
SELECT * FROM employees
WHERE department IN ('Sales', 'Marketing');

-- LIKE

-- ending substring
SELECT * FROM employees
WHERE last_name LIKE '%son';

-- starting substring
SELECT * FROM employees
WHERE first_name LIKE 'J%';

-- contains
SELECT * FROM employees
WHERE first_name LIKE '%a%';

-- character position (_ matches exactly one character, so 'o' must be the second letter)
SELECT * FROM employees
WHERE first_name LIKE '_o%';

 * sqlite:///sql-methods.db
Done.
Done.
Done.
Done.
Done.
Done.


employee_id,first_name,last_name,department,salary,hire_date,performance_rating
1,John,Doe,Sales,50000,2024-04-23,3
7,John,Doe,Marketing,33000,2024-05-17,3
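
The `LIKE` wildcards can be checked without a table by evaluating the expression directly. The sketch below uses Python's built-in `sqlite3` module; the `matches` helper is hypothetical. It also shows a SQLite-specific default: `LIKE` ignores case for ASCII letters, which is not true of every database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def matches(value, pattern):
    """Evaluate `value LIKE pattern` in SQLite and return a Python bool."""
    return bool(conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()[0])


print(matches("Johnson", "%son"))  # True: % matches any run of characters
print(matches("John", "_o%"))      # True: _ matches exactly one character
print(matches("Jo", "_o%"))        # True: % may also match nothing
print(matches("o", "_o%"))         # False: _ needs a character before the 'o'
print(matches("John", "j%"))       # True: SQLite's default LIKE ignores ASCII case
```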


## ORDER BY

The **ORDER BY** clause sorts the results of a SELECT query by one or more columns or expressions, in ascending or descending order. Sorted output is easier to analyze and interpret.

  * **ASC:** Ascending order (lowest to highest). This is the default when no direction is given.
  * **DESC:** Descending order (highest to lowest).

In [45]:
%%sql 

-- Sorting Single Column
SELECT * FROM employees
ORDER BY last_name ASC;

SELECT * FROM employees
ORDER BY last_name DESC;

-- Sorting Multiple Columns
SELECT * FROM employees
ORDER BY department ASC, salary DESC;

 * sqlite:///sql-methods.db
Done.
Done.
Done.


employee_id,first_name,last_name,department,salary,hire_date,performance_rating
2,Jane,Smith,Marketing,55000,2024-04-25,4
7,John,Doe,Marketing,33000,2024-05-17,3
4,Emily,Davis,Operations,62000,2024-04-27,2
5,David,Wilson,Operations,58000,2024-04-28,1
3,Michael,Johnson,Sales,60000,2024-04-26,5
1,John,Doe,Sales,50000,2024-04-23,3
6,David,Blaine,Sales,50000,2024-03-28,2


In [54]:
%%sql
-- Sorting by Expressions

SELECT * FROM employees
ORDER BY performance_rating > 3 DESC;

 * sqlite:///sql-methods.db
Done.


employee_id,first_name,last_name,department,salary,hire_date,performance_rating
2,Jane,Smith,Marketing,55000,2024-04-25,4
3,Michael,Johnson,Sales,60000,2024-04-26,5
1,John,Doe,Sales,50000,2024-04-23,3
4,Emily,Davis,Operations,62000,2024-04-27,2
5,David,Wilson,Operations,58000,2024-04-28,1
6,David,Blaine,Sales,50000,2024-03-28,2
7,John,Doe,Marketing,33000,2024-05-17,3
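
Sorting by `performance_rating > 3` works because SQLite evaluates a comparison to 1 (true) or 0 (false), and `DESC` puts the 1s first. The sketch below makes that value visible by selecting it as a column, using Python's built-in `sqlite3` module and a two-column, in-memory copy of the data. The `above_three` alias and the `employee_id` tie-breaker are assumptions added for illustration; the notebook query leaves the order within each group unspecified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INT PRIMARY KEY, performance_rating INT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, 3), (2, 4), (3, 5), (4, 2), (5, 1), (6, 2), (7, 3)],
)

rows = conn.execute(
    """
    SELECT employee_id,
           performance_rating,
           performance_rating > 3 AS above_three  -- 1 when true, 0 when false
    FROM employees
    ORDER BY above_three DESC, employee_id        -- tie-breaker added for a stable order
    """
).fetchall()

for row in rows:
    print(row)
# (2, 4, 1)
# (3, 5, 1)
# (1, 3, 0)
# (4, 2, 0)
# (5, 1, 0)
# (6, 2, 0)
# (7, 3, 0)
```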


In [55]:
%%sql

-- Sorting By Position
SELECT * FROM employees
ORDER BY 3;

 * sqlite:///sql-methods.db
Done.


employee_id,first_name,last_name,department,salary,hire_date,performance_rating
6,David,Blaine,Sales,50000,2024-03-28,2
4,Emily,Davis,Operations,62000,2024-04-27,2
1,John,Doe,Sales,50000,2024-04-23,3
7,John,Doe,Marketing,33000,2024-05-17,3
3,Michael,Johnson,Sales,60000,2024-04-26,5
2,Jane,Smith,Marketing,55000,2024-04-25,4
5,David,Wilson,Operations,58000,2024-04-28,1


In [60]:
%%sql

-- Sorting in Random Order
SELECT * FROM employees
ORDER BY RANDOM();

 * sqlite:///sql-methods.db
Done.


employee_id,first_name,last_name,department,salary,hire_date,performance_rating
5,David,Wilson,Operations,58000,2024-04-28,1
1,John,Doe,Sales,50000,2024-04-23,3
6,David,Blaine,Sales,50000,2024-03-28,2
2,Jane,Smith,Marketing,55000,2024-04-25,4
4,Emily,Davis,Operations,62000,2024-04-27,2
7,John,Doe,Marketing,33000,2024-05-17,3
3,Michael,Johnson,Sales,60000,2024-04-26,5


## GROUP BY

The **GROUP BY** clause is used to organize and summarize data by grouping rows with the same values in one or more columns. This helps you analyze trends, patterns, and aggregate statistics within your data.

> **Common Aggregate Functions Used with GROUP BY:**

```
  COUNT(): Counts the number of rows in each group.
  SUM(): Calculates the sum of a numeric column for each group.
  AVG(): Calculates the average of a numeric column for each group.
  MAX(): Returns the maximum value of a column for each group.
  MIN(): Returns the minimum value of a column for each group.
```

In [62]:
%%sql

SELECT * FROM employees;

 * sqlite:///sql-methods.db
Done.


employee_id,first_name,last_name,department,salary,hire_date,performance_rating
1,John,Doe,Sales,50000,2024-04-23,3
2,Jane,Smith,Marketing,55000,2024-04-25,4
3,Michael,Johnson,Sales,60000,2024-04-26,5
4,Emily,Davis,Operations,62000,2024-04-27,2
5,David,Wilson,Operations,58000,2024-04-28,1
6,David,Blaine,Sales,50000,2024-03-28,2
7,John,Doe,Marketing,33000,2024-05-17,3


In [70]:
%%sql

-- Grouping with COUNT()
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;

-- Grouping with SUM()
SELECT department, SUM(salary) AS total_salary
FROM employees
GROUP BY department;

-- Grouping with AVG()
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;

-- Grouping with MAX() and MIN()
SELECT department, MAX(salary) AS max_salary, MIN(salary) AS min_salary
FROM employees
GROUP BY department;

 * sqlite:///sql-methods.db
Done.
Done.
Done.
Done.


department,max_salary,min_salary
Marketing,55000,33000
Operations,62000,58000
Sales,60000,50000


## HAVING

The **HAVING** clause is used to filter groups of rows created by the GROUP BY clause based on conditions applied to aggregate functions. This allows you to focus on specific groups that meet certain criteria within your summarized data.
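
A common point of confusion is how `HAVING` differs from `WHERE`: `WHERE` filters individual rows before they are grouped, while `HAVING` filters the groups afterwards. The sketch below runs both on an in-memory copy of the employee data with Python's built-in `sqlite3` module. The reduced schema and the thresholds are assumptions chosen for illustration. The aggregate is repeated in `HAVING` instead of using its alias, since not every database accepts column aliases there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (employee_id INT PRIMARY KEY, department TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Sales", 50000), (2, "Marketing", 55000), (3, "Sales", 60000),
     (4, "Operations", 62000), (5, "Operations", 58000),
     (6, "Sales", 50000), (7, "Marketing", 33000)],
)

# HAVING only: every row is grouped, then groups with fewer than 2 rows are dropped.
grouped_all = conn.execute(
    """
    SELECT department, COUNT(*) AS num_employees
    FROM employees
    GROUP BY department
    HAVING COUNT(*) >= 2
    ORDER BY department
    """
).fetchall()

# WHERE then HAVING: rows under 50000 are removed first, so Marketing
# is left with one row and no longer passes the HAVING filter.
filtered_first = conn.execute(
    """
    SELECT department, COUNT(*) AS num_employees
    FROM employees
    WHERE salary >= 50000
    GROUP BY department
    HAVING COUNT(*) >= 2
    ORDER BY department
    """
).fetchall()

print(grouped_all)     # [('Marketing', 2), ('Operations', 2), ('Sales', 3)]
print(filtered_first)  # [('Operations', 2), ('Sales', 3)]
```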

In [74]:
%%sql

-- Departments with more than 2 employees
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department
HAVING employee_count > 2;

-- Departments with an average salary greater than 50000 and where the number of employees is greater than 2.
SELECT department, AVG(salary) AS avg_salary, COUNT(*) AS num_employees
FROM employees
GROUP BY department
HAVING avg_salary > 50000 AND num_employees > 2;

 * sqlite:///sql-methods.db
Done.
Done.


department,avg_salary,num_employees
Sales,53333.333333333336,3


## JOIN

The **JOIN** clause is used to combine rows from two or more tables based on a related column between them. It allows you to retrieve data from multiple tables simultaneously, creating a single result set.

> **Types of JOINs:**

```
  INNER JOIN: Returns only the rows where the join condition is met in both tables.
  LEFT JOIN: Returns all rows from the left table, and matching rows from the right table.
    If there's no match in the right table, the right table's columns are filled with NULL.
  RIGHT JOIN: Returns all rows from the right table, and matching rows from the left table.
    If there's no match in the left table, the left table's columns are filled with NULL.
  FULL JOIN: Returns all rows from both tables. Where a row has no match in the other table,
    that table's columns are filled with NULL.
  CROSS JOIN: Combines every row from one table with every row from the other, producing the
    Cartesian product of the two tables, regardless of any relationship between them.
```

---
### Department Table Setup

---

In [77]:
%%sql

CREATE TABLE departments (
    department_id INTEGER PRIMARY KEY AUTOINCREMENT
);

 * sqlite:///sql-methods.db
Done.


[]

In [78]:
%%sql

ALTER TABLE departments
ADD department_name VARCHAR(50);

 * sqlite:///sql-methods.db
Done.


[]

In [79]:
%%sql

INSERT INTO departments (department_name)
VALUES
    ('HR'),
    ('Marketing'),
    ('Sales'),
    ('IT'),
    ('Finance');

 * sqlite:///sql-methods.db
5 rows affected.


[]

In [81]:
%%sql

INSERT INTO departments (department_name)
VALUES
    ('Operations');

 * sqlite:///sql-methods.db
1 rows affected.


[]

In [82]:
%%sql

SELECT * FROM departments;

 * sqlite:///sql-methods.db
Done.


department_id,department_name
1,HR
2,Marketing
3,Sales
4,IT
5,Finance
6,Operations


In [93]:
%%sql

-- TYPES OF JOINS

-- Inner Join
SELECT employee_id, first_name, last_name, department_name
FROM employees
INNER JOIN departments ON department = department_name;

-- Left Join (keeps every row of the left table, departments; employee columns are NULL when no employee matches)
SELECT employee_id, first_name, last_name, department_name
FROM departments
LEFT JOIN employees ON department_name = department;

-- Right Join (keeps every row of the right table, departments; requires SQLite 3.39.0 or later)
SELECT employee_id, first_name, last_name, department_name
FROM employees
RIGHT JOIN departments ON department = department_name;

-- Full Join (keeps every row of both tables, matched where possible; requires SQLite 3.39.0 or later)
SELECT employee_id, first_name, last_name, department_name
FROM employees
FULL JOIN departments ON department = department_name;

-- Cross Join (pairs every employee with every department; no join condition)
SELECT employee_id, first_name, last_name, department_name
FROM employees
CROSS JOIN departments;

 * sqlite:///sql-methods.db
Done.
Done.
Done.
Done.
Done.


employee_id,first_name,last_name,department_name
1,John,Doe,HR
1,John,Doe,Marketing
1,John,Doe,Sales
1,John,Doe,IT
1,John,Doe,Finance
1,John,Doe,Operations
2,Jane,Smith,HR
2,Jane,Smith,Marketing
2,Jane,Smith,Sales
2,Jane,Smith,IT
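
Because the cell above displays only the result of its last statement, the sketch below compares the `INNER JOIN` and `LEFT JOIN` results side by side. It uses Python's built-in `sqlite3` module with in-memory copies of the two tables, reduced to the columns the join needs. The table aliases `e` and `d` and the `unmatched` list are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE departments (department_name TEXT);
    CREATE TABLE employees (employee_id INT PRIMARY KEY, department TEXT);

    INSERT INTO departments VALUES
        ('HR'), ('Marketing'), ('Sales'), ('IT'), ('Finance'), ('Operations');
    INSERT INTO employees VALUES
        (1, 'Sales'), (2, 'Marketing'), (3, 'Sales'), (4, 'Operations'),
        (5, 'Operations'), (6, 'Sales'), (7, 'Marketing');
    """
)

# INNER JOIN: one row per employee whose department exists in departments.
inner_rows = conn.execute(
    """
    SELECT e.employee_id, d.department_name
    FROM employees AS e
    INNER JOIN departments AS d ON e.department = d.department_name
    """
).fetchall()

# LEFT JOIN from departments: departments without employees still appear,
# with employee_id returned as NULL (None in Python).
left_rows = conn.execute(
    """
    SELECT d.department_name, e.employee_id
    FROM departments AS d
    LEFT JOIN employees AS e ON e.department = d.department_name
    """
).fetchall()

unmatched = sorted(name for name, employee_id in left_rows if employee_id is None)

print(len(inner_rows))  # 7
print(len(left_rows))   # 10
print(unmatched)        # ['Finance', 'HR', 'IT']
```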


## CASE...WHEN

The **CASE...WHEN** expression returns the value of the first condition that evaluates to true, falling back to the ELSE value when none match. It works like if-else logic inside a SQL query and is often used to derive a new column.

In [95]:
%%sql

SELECT first_name, last_name, salary,
    CASE WHEN salary >= 60000 THEN 'High Salary'
         WHEN salary >= 50000 THEN 'Mid Salary'
         ELSE 'Low Salary'
    END AS salary_category
FROM employees;

 * sqlite:///sql-methods.db
Done.


first_name,last_name,salary,salary_category
John,Doe,50000,Mid Salary
Jane,Smith,55000,Mid Salary
Michael,Johnson,60000,High Salary
Emily,Davis,62000,High Salary
David,Wilson,58000,Mid Salary
David,Blaine,50000,Mid Salary
John,Doe,33000,Low Salary
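
The order of the `WHEN` branches matters, because evaluation stops at the first branch whose condition is true. The sketch below evaluates the same salary bands in two orders with Python's built-in `sqlite3` module. The `categorize` helper, the `:salary` parameter, and the deliberately wrong `REVERSED` query are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Branches ordered from the highest threshold down, as in the query above.
IN_ORDER = """
    SELECT CASE WHEN :salary >= 60000 THEN 'High Salary'
                WHEN :salary >= 50000 THEN 'Mid Salary'
                ELSE 'Low Salary'
           END
"""

# Same branches with the first two swapped: the 'High Salary' branch can never
# be reached, because any salary >= 60000 already satisfies the first condition.
REVERSED = """
    SELECT CASE WHEN :salary >= 50000 THEN 'Mid Salary'
                WHEN :salary >= 60000 THEN 'High Salary'
                ELSE 'Low Salary'
           END
"""


def categorize(sql, salary):
    """Evaluate one of the CASE queries for a single salary value."""
    return conn.execute(sql, {"salary": salary}).fetchone()[0]


print(categorize(IN_ORDER, 62000))  # High Salary
print(categorize(REVERSED, 62000))  # Mid Salary
print(categorize(IN_ORDER, 33000))  # Low Salary
```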
