# SQL Basics

## Course: Programming and Data Management (EDI 3400)

### *Vegard H. Larsen (Department of Data Science and Analytics)*

# 1. The Auto Dealership Database

## Let's look at the database in DB Browser for SQLite

# 2. Running SQL-code in the Notebook

## An SQL-extension

- Download the file `isqlite3.py` from Ed 
- The file `isqlite3.py` must be stored in the same folder as the Notebook
- Developed by my colleague Jan Kudlicka
- We can now run SQL queries directly in notebook cells 

In [None]:
%load_ext isqlite3

In [None]:
%sql_open files/auto_dealership_database.db

## Employees table

In [None]:
%%sql

SELECT * FROM Employees LIMIT 10

## Cars table

In [None]:
%%sql

SELECT * 
FROM Cars 
LIMIT 10

## Sales table

In [None]:
%%sql

SELECT * 
FROM Sales 
LIMIT 10

## Customers table

In [None]:
%%sql

SELECT * 
FROM Customers
LIMIT 10

# 3. SQL-syntax

## The `SELECT` and `FROM` statements:

- SQL keywords are not case sensitive (unlike Python, which is), so `select` and `SELECT` are treated the same

`SELECT column_name` 

`FROM table_name`
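
To see the case rule in action outside the `%%sql` cells, here is a minimal sketch using Python's built-in `sqlite3` module. The in-memory `Cars` table and its two rows are made up for illustration and are not the course database. In SQLite, table and column names are case insensitive as well, which the second query relies on.

```python
import sqlite3

# In-memory database with a made-up Cars table (not the course data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cars (id INTEGER PRIMARY KEY, msrp INTEGER)")
conn.executemany("INSERT INTO Cars (id, msrp) VALUES (?, ?)",
                 [(1, 25000), (2, 41000)])

# Same query, once with upper-case keywords and once in lower case.
upper_rows = sorted(conn.execute("SELECT msrp FROM Cars").fetchall())
lower_rows = sorted(conn.execute("select MSRP from cars").fetchall())

print(upper_rows == lower_rows)  # True
conn.close()
```

Text values are a different matter: a comparison such as `WHERE name = 'ada'` is case sensitive by default in SQLite.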



In [None]:
%%sql

SELECT msrp
FROM Cars
LIMIT 5

## The `WHERE` clause:

`SELECT column_name` 

`FROM table_name` 

`WHERE condition`

In [None]:
%%sql

SELECT id, price 
FROM Sales
WHERE price > 10000
LIMIT 5

In [None]:
%%sql

Select Cars.id, Cars.msrp, Sales.price
FROM Cars, Sales
Where Cars.id = Sales.car_id
LIMIT 5

## The `ORDER BY` clause:

`SELECT column_name`

`FROM table_name`

`ORDER BY column_name [ASC or DESC]`
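
When neither `ASC` nor `DESC` is given, the sort is ascending. The sketch below checks this with Python's built-in `sqlite3` module on a small, made-up `Employees` table (the names and salaries are hypothetical, not taken from the course database).

```python
import sqlite3

# In-memory database with a made-up Employees table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)",
                 [(1, "Ada", 52000), (2, "Bo", 61000), (3, "Cy", 47000)])

# No direction given: ascending order is the default.
ascending = [row[0] for row in
             conn.execute("SELECT name FROM Employees ORDER BY salary")]

# DESC reverses the order, highest salary first.
descending = [row[0] for row in
              conn.execute("SELECT name FROM Employees ORDER BY salary DESC")]

print(ascending)   # ['Cy', 'Ada', 'Bo']
print(descending)  # ['Bo', 'Ada', 'Cy']
conn.close()
```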

In [None]:
%%sql

SELECT id, name, salary
FROM Employees
ORDER BY salary DESC
LIMIT 10

## `INNER JOIN`
`INNER JOIN` combines rows from two or more tables based on a related column between them. This type of join returns only the rows that have matching values in both tables. If there is no match, the rows will not appear in the result set.

In SQL, if you use `JOIN` without specifying the type of join, it defaults to `INNER JOIN`. The two are therefore equivalent and return the same result set: only the rows that have matching values in both tables being joined.
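
A small sketch of this behaviour, using Python's built-in `sqlite3` module and made-up tables (three hypothetical customers, one of whom has no sales). It also compares the result with the comma-separated `FROM ... WHERE` form used earlier in this notebook, which produces the same rows here.

```python
import sqlite3

# Made-up rows for illustration; the course database has more columns and data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute("CREATE TABLE Sales (id INTEGER PRIMARY KEY, customer_id INTEGER, price INTEGER)")
conn.executemany("INSERT INTO Customers VALUES (?, ?)",
                 [(1, "Kari"), (2, "Ola"), (3, "Nils")])
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?)",
                 [(1, 1, 95000), (2, 1, 30000), (3, 2, 120000)])

inner_rows = conn.execute("""
    SELECT Customers.first_name, Sales.price
    FROM Customers
    INNER JOIN Sales ON Customers.id = Sales.customer_id
    ORDER BY Sales.id
""").fetchall()

# JOIN without a type behaves like INNER JOIN.
plain_rows = conn.execute("""
    SELECT Customers.first_name, Sales.price
    FROM Customers
    JOIN Sales ON Customers.id = Sales.customer_id
    ORDER BY Sales.id
""").fetchall()

# Listing both tables in FROM and matching them in WHERE gives the same rows.
where_rows = conn.execute("""
    SELECT Customers.first_name, Sales.price
    FROM Customers, Sales
    WHERE Customers.id = Sales.customer_id
    ORDER BY Sales.id
""").fetchall()

print(inner_rows)  # [('Kari', 95000), ('Kari', 30000), ('Ola', 120000)]
print(inner_rows == plain_rows == where_rows)  # True
conn.close()
```

Nils has no row in `Sales`, so he does not appear in the result, while Kari appears once per sale.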


In [None]:
%%sql

SELECT Customers.first_name, Customers.last_name, Sales.date, Sales.price
FROM Customers
INNER JOIN Sales ON Customers.id = Sales.customer_id
WHERE Sales.price > 90000
LIMIT 10

In [None]:
%%sql

SELECT
    Customers.first_name,
    Customers.last_name,
    Cars.make,
    Cars.type,
    Cars.msrp,
    Sales.date,
    Sales.price
FROM Customers
JOIN Sales ON Customers.id = Sales.customer_id
JOIN Cars ON Sales.car_id = Cars.id
LIMIT 10


## Aggregate functions:
- `count` used to count rows 
- `sum` used to sum values
- `min` used to find the minimum value
- `max` used to find the maximum value
- `avg` used to calculate the mean
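
As a quick illustration, all five functions can be applied in a single query. This is a sketch using Python's built-in `sqlite3` module with three made-up sale prices, not the course data.

```python
import sqlite3

# In-memory database with a made-up Sales table holding three prices.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (id INTEGER PRIMARY KEY, price INTEGER)")
conn.executemany("INSERT INTO Sales (price) VALUES (?)",
                 [(10000,), (20000,), (45000,)])

# Each aggregate function collapses the whole table into one value.
summary = conn.execute("""
    SELECT count(*), sum(price), min(price), max(price), avg(price)
    FROM Sales
""").fetchone()

print(summary)  # (3, 75000, 10000, 45000, 25000.0)
conn.close()
```

Note that `avg` returns a floating-point number even when every input is an integer.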

## GROUP BY
The `GROUP BY` clause groups rows that have the same values in the specified columns, so that aggregate functions like `count()`, `sum()`, `max()`, `min()`, and `avg()` are computed once per group. When you use a `GROUP BY` clause, you're instructing the SQL database to combine those rows into summary rows, for example "find the number of customers in each country" or "calculate the total revenue from each product category."

Without `GROUP BY`, an aggregate function like `count()` would return a single value for the entire table or dataset.
With `GROUP BY`, you get the count for each group separately.
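
The difference can be sketched with Python's built-in `sqlite3` module. The `Sales` table below is made up for illustration: six sales spread over three hypothetical employee ids.

```python
import sqlite3

# In-memory database with a made-up Sales table: six sales, three employees.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (id INTEGER PRIMARY KEY, employee_id INTEGER)")
conn.executemany("INSERT INTO Sales (employee_id) VALUES (?)",
                 [(1,), (1,), (2,), (3,), (3,), (3,)])

# Without GROUP BY: one count for the whole table.
total = conn.execute("SELECT count(*) FROM Sales").fetchone()[0]

# With GROUP BY: one count per employee_id.
per_employee = conn.execute("""
    SELECT employee_id, count(*) AS number_of_sales
    FROM Sales
    GROUP BY employee_id
    ORDER BY employee_id
""").fetchall()

print(total)         # 6
print(per_employee)  # [(1, 2), (2, 1), (3, 3)]
conn.close()
```

The per-group counts add up to the single count from the query without `GROUP BY`.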

## Examples

### Count the number of sales each employee has made.

In [None]:
%%sql

SELECT employee_id, count(*) AS number_of_sales
FROM Sales
GROUP BY employee_id

### Find the minimum sale price of all sales.

In [None]:
%%sql

SELECT min(price) AS min_sale_price
FROM Sales

### Find the maximum sale price of all sales.

In [None]:
%%sql

SELECT max(price) AS max_sale_price
FROM Sales

### Calculate the average salary of all employees.

In [None]:
%%sql 

SELECT AVG(salary) AS average_salary
FROM Employees