<div>
    <img style="float:right; width:210px" src="images/snext-logo.png"/>
    <div style="float:left;"><h1>Relational Databases and Data Warehousing</h1></div>
</div>

---
# Notebook 2: Intro to relational databases
In this notebook you will learn to turn a data model into an actual database using SQLite.

> SQLite [...] implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine. SQLite is the most used database engine in the world. SQLite is built into all mobile phones and most computers and comes bundled inside countless other applications that people use every day.

Requirements:
- You should be familiar with the concept of relational databases in general.
- You should be familiar with the syntax of the structured query language (SQL).
---

First, we have to tell the Jupyter environment that we want to work with SQL in the following cells.

In [None]:
%load_ext sql

Now...
- each cell that starts with ``%%sql`` is treated entirely as SQL code.
- in each cell that starts with ``%sql``, the rest of that first line is treated as a one-line SQL statement.


### Open database to create schema

In [None]:
%sql sqlite:///data/my-database.db
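
The connection string above refers to a database stored in a single file. As a rough illustration of what "opening" such a database means, here is a minimal sketch using Python's built-in `sqlite3` module. The temporary directory and the file name `demo.db` are placeholders chosen for this example, not the notebook's actual path.

```python
import os
import sqlite3
import tempfile

# Hypothetical location for illustration only; the notebook uses data/my-database.db.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
existed_before = os.path.exists(db_path)

# SQLite creates the database file if it does not exist yet.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY)")
conn.commit()
conn.close()

existed_after = os.path.exists(db_path)
print(existed_before, existed_after)
```

This is also why deleting the file and reconnecting gives you a fresh, empty database.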

---
## <span style="color:#FF5D02;">Assignment: Create the tables from your physical data model</span>
Use the ``CREATE TABLE`` command. If you are not satisfied with a table or want to change it, you can delete it with ``DROP TABLE``, or delete the database file in the file browser and re-run the cell above to recreate the database.

Please include:
- a text comment before each command that states what you are going to do, e.g. "create table for customer data"
- appropriate field types (INT, TEXT, ...)
- primary keys
- foreign keys and referenced columns

Hints

In [None]:
%%sql 

-- this could be your customer table

DROP TABLE IF EXISTS customers;
CREATE TABLE customers (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    email TEXT,
    phone TEXT,
    address TEXT
);
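
The hint above covers field types and a primary key, but not foreign keys. The following sketch, written against Python's built-in `sqlite3` module and an in-memory database, shows one way to declare a foreign key with its referenced column. The `customer_notes` table is a hypothetical example and not part of the assignment schema. Note that SQLite only enforces foreign keys when the `foreign_keys` pragma is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- parent table, reduced to two columns for brevity
CREATE TABLE customers (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    last_name TEXT NOT NULL
);

-- hypothetical child table: each note belongs to exactly one customer
CREATE TABLE customer_notes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    customer_id INTEGER NOT NULL,
    note TEXT,
    FOREIGN KEY (customer_id) REFERENCES customers (id)
);
""")

conn.execute("INSERT INTO customers (last_name) VALUES ('Doe')")
conn.execute("INSERT INTO customer_notes (customer_id, note) VALUES (1, 'prefers email')")

# A note that points at a non-existent customer is rejected.
try:
    conn.execute("INSERT INTO customer_notes (customer_id, note) VALUES (999, 'orphan')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

note_count = conn.execute("SELECT COUNT(*) FROM customer_notes").fetchone()[0]
print(rejected, note_count)
```

In a notebook cell the same pragma could be issued as a SQL statement before the tables are created. Whether it stays active across cells depends on how the SQL extension manages its connection, so that is worth verifying.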

In [None]:
%%sql

-- create all tables from physical schema

CREATE TABLE ...

---
## <span style="color:#FF5D02;">Assignment: Populate your data model with mock data</span>
Mock data is a type of simulated data used for testing and training purposes. It is generated from a set of rules that mimic the characteristics of real data. Mock data is useful because it can provide a realistic environment for testing different strategies and methods without the risk of damaging or altering real data. It also allows students to practice problem-solving and other analytical techniques on a safe, virtual platform.

You'll find mock data for your e-commerce data structure in the following tables:
- mock_customers
- mock_products
- mock_categories
- mock_orders

To check, let's list all tables that currently exist in the database. You can ignore ``sqlite_sequence``; SQLite maintains it internally to keep track of ``AUTOINCREMENT`` counters.

In [None]:
%sql SELECT name FROM sqlite_master WHERE type='table';
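
To see where `sqlite_sequence` comes from, here is a small sketch using Python's built-in `sqlite3` module on an in-memory database. The table names `plain` and `counted` are made up for this illustration. It runs the same `sqlite_master` query before and after creating a table with an `AUTOINCREMENT` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
list_tables = "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"

# A table without AUTOINCREMENT does not need a sequence counter.
conn.execute("CREATE TABLE plain (id INTEGER PRIMARY KEY, label TEXT)")
tables_before = [row[0] for row in conn.execute(list_tables)]

# The first AUTOINCREMENT column causes SQLite to create sqlite_sequence.
conn.execute("CREATE TABLE counted (id INTEGER PRIMARY KEY AUTOINCREMENT, label TEXT)")
tables_after = [row[0] for row in conn.execute(list_tables)]

# sqlite_sequence stores the highest id handed out so far, per table.
conn.execute("INSERT INTO counted (label) VALUES ('a'), ('b')")
sequence_rows = conn.execute("SELECT name, seq FROM sqlite_sequence").fetchall()

print(tables_before, tables_after, sequence_rows)
```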

#### Step 1: Explore the mock data structure and compare with your tables
Prepare queries on the mock data that output a table matching your data structure; in particular, the number and order of columns should match.

In [None]:
%sql SELECT * FROM mock_customers LIMIT 1

In [None]:
%sql SELECT * FROM mock_categories LIMIT 1

In [None]:
%sql SELECT * FROM mock_products LIMIT 1

In [None]:
%sql SELECT * FROM mock_orders LIMIT 1
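
Besides looking at a sample row, you can compare column lists directly. The sketch below uses Python's built-in `sqlite3` module and two invented tables, `people` and `mock_people`, whose names and columns are assumptions for illustration only. It reads the column names with `PRAGMA table_info` and then builds a query that reorders and renames the source columns to match the target.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE mock_people (person_id INTEGER, surname TEXT, given_name TEXT, shoe_size REAL);
INSERT INTO mock_people VALUES (1, 'Doe', 'Jane', 38.0);
""")

def column_names(table):
    # PRAGMA table_info yields one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

target_cols = column_names("people")
source_cols = column_names("mock_people")

# Select only the needed columns, in the target's order, renamed with AS.
cursor = conn.execute("""
    SELECT person_id AS id, given_name AS first_name, surname AS last_name
    FROM mock_people
    LIMIT 1
""")
matched_cols = [col[0] for col in cursor.description]
sample_row = cursor.fetchone()

print(target_cols, source_cols, matched_cols, sample_row)
```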

#### Step 2: Populate your tables
Copy rows from the mock_* tables to your data structure. Use the INSERT INTO statement.

Documentation can be found [here](https://www.sqlite.org/lang_insert.html).

In [None]:
%%sql

INSERT INTO customers (<list of columns of your table>)
SELECT <list of columns of mock-table> FROM mock_...
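
As a worked illustration of the `INSERT INTO ... SELECT` pattern, here is a sketch with Python's built-in `sqlite3` module. The tables `people` and `mock_people` and all their columns are invented for this example; your own column lists will differ. The key point is that the columns in the `SELECT` are matched to the target columns by position, not by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE mock_people (person_id INTEGER, surname TEXT, given_name TEXT, shoe_size REAL);
""")
conn.executemany(
    "INSERT INTO mock_people VALUES (?, ?, ?, ?)",
    [(1, "Doe", "Jane", 38.0), (2, "Roe", "Rick", 43.5)],
)

# Copy rows: the n-th selected column goes into the n-th listed target column.
conn.execute("""
    INSERT INTO people (id, first_name, last_name)
    SELECT person_id, given_name, surname
    FROM mock_people
""")
conn.commit()

rows = conn.execute("SELECT id, first_name, last_name FROM people ORDER BY id").fetchall()
print(rows)
```

Columns of the source table that are not selected, such as `shoe_size` here, are simply left out.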

#### Step 3: Check the data in your tables
Check whether the mock data was imported into your schema successfully. Use JOINs to check that the IDs match.

In [None]:
%%sql 

SELECT * 
FROM ...
INNER JOIN ...  ON ...
...
LIMIT 10;
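
One possible way to check IDs is to combine two joins: an `INNER JOIN` shows the rows that match, and a `LEFT JOIN` filtered on `NULL` shows the rows that do not. The sketch below uses Python's built-in `sqlite3` module with two invented tables, `shelves` and `books`, which are not part of the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shelves (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, shelf_id INTEGER);
INSERT INTO shelves VALUES (1, 'Fiction'), (2, 'Science');
-- the last book refers to a shelf id that does not exist
INSERT INTO books VALUES (10, 'Dune', 1), (11, 'Cosmos', 2), (12, 'Lost Book', 7);
""")

# Rows whose foreign id finds a partner in the referenced table.
matched = conn.execute("""
    SELECT b.title, s.label
    FROM books AS b
    INNER JOIN shelves AS s ON s.id = b.shelf_id
    ORDER BY b.id
""").fetchall()

# Rows without a partner: the joined columns are NULL after a LEFT JOIN.
orphans = conn.execute("""
    SELECT b.id, b.title
    FROM books AS b
    LEFT JOIN shelves AS s ON s.id = b.shelf_id
    WHERE s.id IS NULL
""").fetchall()

print(matched, orphans)
```

An empty result for the second query suggests that every referenced ID exists.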

---
## <span style="color:#FF5D02;">Assignment: Create a view for an order log</span>
Use the ``CREATE VIEW`` syntax to create a view named ``order_log`` that lists all recent orders together with the relevant customer, product and category data, so that you could display this list in your warehouse with all the information needed to package the goods.

Be sure to include the order ID, the customer address, product category, description and price!

In [None]:
%%sql 

DROP VIEW IF EXISTS order_log;

CREATE VIEW order_log AS
    SELECT ...
    

Now let's check your view by selecting some data. Limit the output to about 10 records by appending ``LIMIT 10`` to your query, so that your notebook remains readable.

In [None]:
%%sql 

SELECT * FROM order_log LIMIT 10;
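
To illustrate how a view behaves, here is a sketch using Python's built-in `sqlite3` module. The tables `shelves` and `books` and the view `book_locations` are invented for this example and deliberately unrelated to the order log. The view is a stored query rather than a copy of the data, so rows inserted later show up in it automatically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE shelves (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, shelf_id INTEGER);
INSERT INTO shelves VALUES (1, 'Fiction'), (2, 'Science');
INSERT INTO books VALUES (10, 'Dune', 1), (11, 'Cosmos', 2);

DROP VIEW IF EXISTS book_locations;

-- the view joins both tables and gives the columns readable names
CREATE VIEW book_locations AS
    SELECT b.id AS book_id, b.title, s.label AS shelf_label
    FROM books AS b
    INNER JOIN shelves AS s ON s.id = b.shelf_id;
""")

view_rows = conn.execute(
    "SELECT * FROM book_locations ORDER BY book_id LIMIT 10"
).fetchall()

# A row added to a base table afterwards is visible through the view.
conn.execute("INSERT INTO books VALUES (12, 'Emma', 1)")
count_after = conn.execute("SELECT COUNT(*) FROM book_locations").fetchone()[0]

print(view_rows, count_after)
```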

---
## <span style="color:#FF5D02;">Assignment: Create statistics from the newly created view</span>
Use the ``SELECT`` statement to compute statistics on the orders in your view:
- list the number of orders and their average price
- for each product category
- for New Jersey residents only.

Set the column headings appropriately and round the average price to two decimals using the ``ROUND`` function. Experiment or look up the specification of the ``ROUND`` function online.

> Hint: You can filter for New Jersey residents by matching the address string against " NJ ".
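
The building blocks mentioned above (counting, averaging, rounding, grouping and string matching) can be tried out on toy data first. The sketch below uses Python's built-in `sqlite3` module with an invented `sales` table and made-up addresses, and filters for " IL " rather than " NJ ". It is meant as an illustration of the functions, not as the solution to the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (category TEXT, price REAL, address TEXT);
INSERT INTO sales VALUES
    ('Books', 10.00, '1 Main St, Springfield, IL 62701'),
    ('Books', 10.00, '2 Oak Ave, Chicago, IL 60601'),
    ('Books', 11.00, '5 Lake Dr, Evanston, IL 60201'),
    ('Toys',   7.25, '3 Elm Rd, Peoria, IL 61602'),
    ('Toys',   9.99, '4 Pine Ln, Austin, TX 73301');
""")

cursor = conn.execute("""
    SELECT category             AS "Category",
           COUNT(*)             AS "Number of sales",
           ROUND(AVG(price), 2) AS "Average price"
    FROM sales
    WHERE address LIKE '% IL %'   -- % matches any sequence of characters
    GROUP BY category
    ORDER BY category
""")
headings = [col[0] for col in cursor.description]
stats = cursor.fetchall()

print(headings)
print(stats)
```

`ROUND(x, 2)` keeps two digits after the decimal point, so the average of 10, 10 and 11 is reported as 10.33.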

In [None]:
%%sql


> ## NOTE
>If you want to continue your work later, 
>- please download this notebook with the completed exercises and 
>- download the my-database.db file to your computer.
>
>Do this by right-clicking the files in the file browser on the left-hand side and selecting ``download``.