# Introduction to SQL

## Overview of the SQL query language

The **Structured Query Language (SQL)** has several parts:

1. **Data-definition language (DDL)**; the SQL DDL provides commands for defining relation schemas (e.g. ``student (age, name, sex, ...)``), deleting relations, and modifying relation schemas. It also includes commands for:
    * **Integrity**; specifying integrity constraints that the data stored in the database must satisfy. Updates that violate integrity constraints are disallowed.
    * **View definition**; defining views.
2. **Data-manipulation language (DML)**; the SQL DML provides the ability to query information from the database and to insert tuples into, delete tuples from, and modify tuples in a database. SQL also provides:
    * **Transaction control**; a transaction is a unit of work performed against a database, such as an update or a delete. SQL contains commands that control whether a transaction is made permanent or undone, namely ``COMMIT`` and ``ROLLBACK``, which helps guarantee the integrity of the database.
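
The effect of ``COMMIT`` and ``ROLLBACK`` can be sketched with a small experiment. This is only an illustration: it uses Python's built-in `sqlite3` module as a stand-in for SQL Server, and the `account` table and its values are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the explicit
# BEGIN / COMMIT / ROLLBACK statements below are the only transaction control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")  # hypothetical table
conn.execute("INSERT INTO account VALUES (1, 100)")

# A transaction that is abandoned: ROLLBACK undoes the update.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 150 WHERE id = 1")
conn.execute("ROLLBACK")
balance_after_rollback = conn.execute("SELECT balance FROM account").fetchone()[0]

# A transaction that is concluded: COMMIT makes the update permanent.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.execute("COMMIT")
balance_after_commit = conn.execute("SELECT balance FROM account").fetchone()[0]

print(balance_after_rollback, balance_after_commit)  # 100 70
```

In SQL Server the same idea would be written with `BEGIN TRANSACTION`, `COMMIT` and `ROLLBACK`.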

## Basic data types

Main built-in data types supported:

* ``char(n)``; a fixed-length character string with user-specified length *n*. If the string is shorter than *n*, SQL pads it with spaces, e.g. with ``char(3)``, "hi" becomes "hi ".
* ``varchar(n)``; a variable-length character string with user-specified maximum length *n*.
* ``boolean``; the logical values TRUE, FALSE and UNKNOWN.
* ``int``; an integer.
* ``smallint``; a small integer (a machine-dependent subset of the integer type).
* ``numeric(p,d)``; a fixed-point number with user-specified precision: *p* is the total number of digits and *d* is how many of those digits lie to the right of the decimal point.
* ``real, double precision``; floating-point and double-precision floating-point numbers with machine-dependent precision.
* ``float(n)``; a floating-point number with a precision of at least *n* digits.
* ``date``; a calendar date, by default in the format YYYY-MM-DD. For other formats, including time and datetime, see the [SQL Server date documentation](https://docs.microsoft.com/en-us/sql/t-sql/data-types/date-transact-sql?view=sql-server-ver15).
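
To make the ``char(n)`` padding rule and the meaning of *p* and *d* in ``numeric(p,d)`` concrete, here is a rough sketch in plain Python. It does not talk to a database; the helper names `pad_char` and `fits_numeric` are made up for this illustration, and a real system may round excess fractional digits instead of rejecting the value.

```python
from decimal import Decimal


def pad_char(value: str, n: int) -> str:
    """Mimic char(n): right-pad the string with spaces up to n characters."""
    if len(value) > n:
        raise ValueError(f"{value!r} does not fit in char({n})")
    return value.ljust(n)


def fits_numeric(value: str, p: int, d: int) -> bool:
    """True if value can be stored exactly in numeric(p, d):
    at most d digits after the decimal point and p - d digits before it."""
    _, digits, exponent = Decimal(value).as_tuple()
    fractional_digits = max(-exponent, 0)
    integer_digits = max(len(digits) + exponent, 0)
    return fractional_digits <= d and integer_digits <= p - d


print(repr(pad_char("hi", 3)))        # 'hi '
print(fits_numeric("44.5", 3, 1))     # True
print(fits_numeric("444.5", 3, 1))    # False: too many digits before the point
print(fits_numeric("0.32", 3, 1))     # False: too many digits after the point
```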

## Basic Schema Definition (creating databases and tables)

Before actually writing any code, let's create a temporary database. If it already exists, we delete it first.



In [72]:
-- conditional creation of a temporary database
USE master;
GO
IF EXISTS (SELECT * FROM sys.databases WHERE name = 'uni')
    -- delete the db if it is already there
    DROP DATABASE uni;
GO
-- create it from scratch
CREATE DATABASE uni;
GO

-- double check
SELECT name, database_id AS id FROM sys.databases WHERE name = 'uni';
GO

name,id
uni,7


We define a SQL relation via the <code>CREATE TABLE</code> command. The command below defines the relation department. Notice that integrity constraints are added at the bottom of the column list. In this case we specify that dept_name is the primary key of the department relation.

In [73]:
-- select the relevant db
USE uni;
GO

-- create the dept table
CREATE TABLE department (
    dept_name VARCHAR(20),
    building VARCHAR(15),
    budget NUMERIC(12,2),
    PRIMARY KEY (dept_name)
);
GO

-- double-check
SELECT * FROM department;
GO

dept_name,building,budget


SQL <strong>supports a number of different integrity constraints</strong>, namely:

- `FOREIGN KEY (A_1, ..., A_n) REFERENCES s`; a foreign-key constraint states that the values of attributes `(A_1, ..., A_n)` for any tuple in the relation must correspond to the values of the primary-key attributes of some tuple in relation `s`.<br>

<br>

In [74]:
--create another table
CREATE TABLE professors (
    prof_id int,
    prof_name varchar(250),
    salary NUMERIC(8,2),
    dept_name varchar(20),
    PRIMARY KEY (prof_id),
    FOREIGN KEY (dept_name) REFERENCES department
);
GO

SELECT * FROM professors;
GO

prof_id,prof_name,salary,dept_name
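
What a foreign-key constraint buys us can be seen by trying to insert a row that violates it. The following is a minimal sketch that uses Python's `sqlite3` module in place of SQL Server; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which is a SQLite-specific assumption of this example. The inserted rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is off by default
conn.execute("""CREATE TABLE department (
    dept_name VARCHAR(20) PRIMARY KEY,
    building VARCHAR(15),
    budget NUMERIC(12,2))""")
conn.execute("""CREATE TABLE professors (
    prof_id INT PRIMARY KEY,
    prof_name VARCHAR(250),
    salary NUMERIC(8,2),
    dept_name VARCHAR(20),
    FOREIGN KEY (dept_name) REFERENCES department (dept_name))""")

conn.execute("INSERT INTO department VALUES ('Finance', 'Painter', 120000)")
# Accepted: 'Finance' exists in department.
conn.execute("INSERT INTO professors VALUES (12121, 'Wu', 90000, 'Finance')")

# Rejected: there is no 'Music' row in department.
try:
    conn.execute("INSERT INTO professors VALUES (15151, 'Mozart', 40000, 'Music')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT COUNT(*) FROM professors").fetchone()[0]
print(rejected, stored)  # True 1
```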


- `PRIMARY KEY (A_1, ..., A_n)`; the primary key attributes are required to be **non-null** and **unique**. A primary key specification is optional but strongly recommended.

In [75]:
-- another example
CREATE TABLE courses (
    course_id int,
    title varchar(15),
    dept_name varchar(20),
    PRIMARY KEY (course_id),
    FOREIGN KEY (dept_name) REFERENCES department
);
GO

-- a primary key can be composed of several attributes
CREATE TABLE teaches(
    prof_id int,
    course_id int,
    semester varchar(6),
    year numeric(4,0),
    PRIMARY KEY (prof_id, course_id, semester, year),
    FOREIGN KEY (course_id) REFERENCES courses,
    FOREIGN KEY (prof_id) REFERENCES professors
);
GO

SELECT * FROM courses;
GO

SELECT * FROM teaches;
GO

course_id,title,dept_name


prof_id,course_id,semester,year
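
The uniqueness requirement of a composite primary key applies to the combination of attributes, not to each attribute on its own. Here is a small sketch of that behaviour, again using Python's `sqlite3` module as a stand-in for SQL Server and made-up rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE teaches (
    prof_id INT,
    course_id INT,
    semester VARCHAR(6),
    year NUMERIC(4,0),
    PRIMARY KEY (prof_id, course_id, semester, year))""")

conn.execute("INSERT INTO teaches VALUES (10101, 1, 'Fall', 2020)")
# Same professor and course, different semester and year: a different key, so accepted.
conn.execute("INSERT INTO teaches VALUES (10101, 1, 'Spring', 2021)")

# Exactly the same combination again: violates uniqueness, so rejected.
try:
    conn.execute("INSERT INTO teaches VALUES (10101, 1, 'Fall', 2020)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM teaches").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```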


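## Dropping databases and tables

Both operations are straightforward: ``DROP DATABASE`` removes a whole database and ``DROP TABLE`` removes one or more tables. SQL Server will not drop the database that is currently in use, which is why the code below switches to another database first.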

In [76]:
-- Switch database being used
USE master;
GO
-- dropping a database as a whole
DROP DATABASE uni;
GO
-- rebuilding it but without foreign key constraints
CREATE DATABASE uni;
GO
USE uni;
GO
CREATE TABLE courses (
    course_id int,
    title varchar(15),
    dept_name varchar(20),
    PRIMARY KEY (course_id)
);
GO
CREATE TABLE teaches(
    prof_id int,
    course_id int,
    semester varchar(6),
    year numeric(4,0),
    PRIMARY KEY (prof_id, course_id, semester, year)
);
GO
CREATE TABLE professors (
    prof_id int,
    prof_name varchar(250),
    salary NUMERIC(8,2),
    dept_name varchar(20),
    PRIMARY KEY (prof_id)
);
GO

SELECT name, object_id AS id FROM sys.tables;
GO 
-- Drop a table
DROP TABLE courses;
GO
-- check
SELECT name, object_id AS id FROM sys.tables;
GO 
-- Drop two tables
DROP TABLE teaches, professors;
GO
-- check
SELECT name, object_id AS id FROM sys.tables;
GO
-- Switch database being used
USE tempdb;
GO
-- Drop the entire database
DROP DATABASE uni;
GO
-- check
SELECT name FROM sys.databases;
GO  

name,id
courses,581577110
teaches,613577224
professors,645577338


name,id
teaches,613577224
professors,645577338


name,id


name
master
tempdb
model
msdb
udemy
dbm_project
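
The "drop, then check the catalog" pattern used above can be reproduced in a few lines. This sketch uses Python's `sqlite3` module rather than SQL Server, so the catalog is `sqlite_master` instead of `sys.tables`; the helper `table_names` is made up for this example. It also shows `DROP TABLE IF EXISTS`, which avoids an error when the table is already gone.

```python
import sqlite3


def table_names(conn):
    """Return the names of all tables in the database, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (course_id INT PRIMARY KEY, title VARCHAR(15))")
conn.execute("CREATE TABLE teaches (prof_id INT, course_id INT)")

before = table_names(conn)
conn.execute("DROP TABLE courses")
after = table_names(conn)

# Dropping again would normally fail; IF EXISTS turns it into a no-op.
conn.execute("DROP TABLE IF EXISTS courses")

print(before)  # ['courses', 'teaches']
print(after)   # ['teaches']
```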


As for dropping columns, we have to precede the ``DROP COLUMN`` clause with an ``ALTER TABLE`` statement that names the table.

In [77]:
-- rebuilding it but without foreign key constraints
CREATE DATABASE uni;
GO
USE uni;
GO
CREATE TABLE courses (
    course_id int,
    title varchar(15),
    dept_name varchar(20),
    fooh varchar(100)
);
GO
-- Drop the dept_name column
ALTER TABLE courses
DROP COLUMN dept_name;
GO
SELECT * FROM courses;
GO

-- Drop two columns
ALTER TABLE courses
DROP COLUMN title, course_id;
GO
SELECT * FROM courses;
GO

course_id,title,fooh


fooh


We can also drop constraints. This is particularly useful before removing a column that has a constraint attached, since the constraint has to be dropped first.

In [78]:
-- Create a table with constraints
CREATE TABLE doc_exc (
    column_a int NOT NULL CONSTRAINT my_constraint UNIQUE,
    column_b int NOT NULL CONSTRAINT my_pk_constraint PRIMARY KEY
);
GO
-- Remove the constraints
ALTER TABLE doc_exc
DROP CONSTRAINT my_constraint, my_pk_constraint;
GO

## Basic structure of SQL queries

The basic structure of SQL queries consists of three clauses:

- `SELECT` -> **column list**;
- `FROM` -> **table list**;
- `WHERE` -> **predicate/logical condition**

A query takes as its input the relations listed in the from clause, operates on them as specified in the where and select clauses, and then produces a relation as the result.

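For example, consider the query “Find the names of all instructors.” The snippets in this section assume a textbook-style `instructor` relation with `name` and `dept_name` attributes (the counterpart of the `professors` table defined earlier). Instructor names are found in the instructor relation, so we put that relation in the `FROM` clause. The instructor’s name appears in the `name` attribute, so we put that in the `SELECT` clause.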

```
SELECT name 
FROM instructor;
```
Now consider another query, “Find the department names of all instructors,” which can be written as:

```
SELECT dept_name
FROM instructor;
```
Given the question at hand, we might only be interested in distinct department names, since the duplicates are not informative. We can remove them by adding the `DISTINCT` keyword after `SELECT`.

```
SELECT DISTINCT dept_name
FROM instructor;
```
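
The difference between the last two queries only shows up when a department has more than one instructor. The sketch below runs both on a tiny `instructor` table using Python's `sqlite3` module as a stand-in for SQL Server; the three rows are illustrative, and the `ORDER BY` is there only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (id INT PRIMARY KEY, name VARCHAR(250), dept_name VARCHAR(20))"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [
        (10101, "Srinivasan", "Comp. Sci."),
        (12121, "Wu", "Finance"),
        (45565, "Katz", "Comp. Sci."),  # second instructor in the same department
    ],
)

all_depts = [d for (d,) in conn.execute(
    "SELECT dept_name FROM instructor ORDER BY dept_name")]
distinct_depts = [d for (d,) in conn.execute(
    "SELECT DISTINCT dept_name FROM instructor ORDER BY dept_name")]

print(all_depts)       # ['Comp. Sci.', 'Comp. Sci.', 'Finance']
print(distinct_depts)  # ['Comp. Sci.', 'Finance']
```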

Now we will focus on slightly more complex queries. To do so, let's re-create our database, add two tables and fill them with some fake data. The code below does that.


In [89]:
-- (re)create the database from scratch, whether or not it already exists
USE master;
GO
IF EXISTS (SELECT * FROM sys.databases WHERE name = 'uni')
    -- delete the db
    DROP DATABASE uni;
GO
-- create it
CREATE DATABASE uni;
GO
USE uni;
GO
-- create the dept table
CREATE TABLE department (
    dept_name VARCHAR(20),
    building VARCHAR(15),
    budget NUMERIC(12,2),
    PRIMARY KEY (dept_name)
);
-- create the prof table
CREATE TABLE professors (
    prof_id INT,
    prof_name VARCHAR(250),
    dept_name VARCHAR(20),
    salary NUMERIC(8,2),
    PRIMARY KEY (prof_id)
);
GO
-- insert some data
INSERT INTO professors VALUES (10101, 'Srinivasan', 'Comp. Sci.', 65000);
INSERT INTO professors VALUES (12121, 'Wu', 'Finance', 90000);
INSERT INTO professors VALUES (15151, 'Mozart', 'Music', 40000);
INSERT INTO department VALUES ('Biology', 'Watson', 90000);
INSERT INTO department VALUES ('Comp. Sci.', 'Taylor', 100000);
INSERT INTO department VALUES ('Elec. Eng.', 'Taylor', 85000);
INSERT INTO department VALUES ('Finance', 'Painter', 120000);
GO

In [91]:
USE uni;
GO
SELECT * FROM professors;
GO
SELECT * FROM department;
GO

prof_id,prof_name,dept_name,salary
10101,Srinivasan,Comp. Sci.,65000.0
12121,Wu,Finance,90000.0
15151,Mozart,Music,40000.0


dept_name,building,budget
Biology,Watson,90000.0
Comp. Sci.,Taylor,100000.0
Elec. Eng.,Taylor,85000.0
Finance,Painter,120000.0


As stated before, the <code>WHERE</code> clause keeps only the rows that satisfy a specified predicate. For example, <em>“Find the names of all instructors in the Computer Science department who have a salary greater than $40,000.”</em>

In [95]:
SELECT prof_name AS [Relevant Professor] FROM professors
WHERE (
    dept_name = 'Comp. Sci.' AND salary > 40000.00
);
GO

Relevant Professor
Srinivasan


SQL allows the use of the logical connectives and, or, and not in the where clause. The operands of the logical connectives can be expressions involving the comparison operators \<, \<=, \>, \>=, =, and \<\> (<strong>NOT EQUAL TO</strong>). SQL allows us to use the comparison operators to compare strings and arithmetic expressions, as well as special types, such as date types.

Another example: *find all departments that are not hosted in the Taylor building*.

In [96]:
SELECT dept_name AS department FROM department
WHERE building <> 'Taylor';
GO

department
Biology
Finance
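
The connectives `OR` and `NOT` are not used in the cells above, so here is a short sketch of both on the same department data. It runs on Python's `sqlite3` module as a stand-in for SQL Server; the two predicates are arbitrary choices made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department (dept_name VARCHAR(20) PRIMARY KEY, "
    "building VARCHAR(15), budget NUMERIC(12,2))"
)
conn.executemany("INSERT INTO department VALUES (?, ?, ?)", [
    ("Biology", "Watson", 90000),
    ("Comp. Sci.", "Taylor", 100000),
    ("Elec. Eng.", "Taylor", 85000),
    ("Finance", "Painter", 120000),
])

# OR: a row qualifies if at least one of the two conditions holds.
either = [d for (d,) in conn.execute(
    "SELECT dept_name FROM department "
    "WHERE building = 'Watson' OR budget > 100000 ORDER BY dept_name")]

# NOT: negates the parenthesised condition as a whole.
negated = [d for (d,) in conn.execute(
    "SELECT dept_name FROM department "
    "WHERE NOT (building = 'Taylor' AND budget < 90000) ORDER BY dept_name")]

print(either)   # ['Biology', 'Finance']
print(negated)  # ['Biology', 'Comp. Sci.', 'Finance']
```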


Another crucial predicate is <code>LIKE</code>, which allows partial string matching with wildcards, as in <code>LIKE '%string%'</code>. Keep in mind that <code>%</code> matches any sequence of zero or more characters, while <code>_</code> matches exactly one character.

In [107]:
-- buildings that do not start with 'Ta' and end with 'r'
SELECT dept_name AS department FROM department
WHERE building NOT LIKE 'Ta%r';
GO

department
Biology
Finance


In [104]:
-- not a regex: each _ stands for exactly one character
SELECT dept_name AS department FROM department
WHERE building NOT LIKE 'Tay__r';
GO

department
Biology
Finance
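
The difference between the two wildcards is easiest to see by matching the same value against several patterns. The sketch below does that with Python's `sqlite3` module as a stand-in for SQL Server; the helper `buildings_like` is made up for this example. One caveat: SQLite's `LIKE` ignores case for ASCII letters, whereas in SQL Server this depends on the collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_name VARCHAR(20) PRIMARY KEY, building VARCHAR(15))")
conn.executemany("INSERT INTO department VALUES (?, ?)", [
    ("Biology", "Watson"),
    ("Comp. Sci.", "Taylor"),
    ("Elec. Eng.", "Taylor"),
    ("Finance", "Painter"),
])


def buildings_like(pattern):
    """Return the distinct building names that match a LIKE pattern."""
    rows = conn.execute(
        "SELECT DISTINCT building FROM department WHERE building LIKE ? ORDER BY building",
        (pattern,),
    )
    return [b for (b,) in rows]


print(buildings_like("Ta%r"))    # ['Taylor']  % spans any number of characters
print(buildings_like("Tay_r"))   # []          one _ is one character too few
print(buildings_like("Tay__r"))  # ['Taylor']  two _ match 'l' and 'o'
print(buildings_like("%er"))     # ['Painter']
```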


Similarly, the ``IN`` predicate keeps the rows whose value appears in a given list.

In [106]:
SELECT dept_name AS department FROM department
WHERE building IN ('FOOH', ' Baah', 'Watson', 'Painter');
GO

department
Biology
Finance
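
An `IN` list is shorthand for a chain of equality tests joined by `OR`, and `NOT IN` selects the complement. A quick sketch of both, using Python's `sqlite3` module as a stand-in for SQL Server and the department data from above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_name VARCHAR(20) PRIMARY KEY, building VARCHAR(15))")
conn.executemany("INSERT INTO department VALUES (?, ?)", [
    ("Biology", "Watson"),
    ("Comp. Sci.", "Taylor"),
    ("Elec. Eng.", "Taylor"),
    ("Finance", "Painter"),
])


def departments_where(condition):
    """Return department names for a WHERE condition (fixed strings only, no user input)."""
    query = f"SELECT dept_name FROM department WHERE {condition} ORDER BY dept_name"
    return [d for (d,) in conn.execute(query)]


with_in = departments_where("building IN ('Watson', 'Painter')")
with_or = departments_where("building = 'Watson' OR building = 'Painter'")
with_not_in = departments_where("building NOT IN ('Watson', 'Painter')")

print(with_in)      # ['Biology', 'Finance']
print(with_or)      # ['Biology', 'Finance']
print(with_not_in)  # ['Comp. Sci.', 'Elec. Eng.']
```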


The ``BETWEEN`` predicate checks whether a value lies within a given interval; both endpoints are included.

In [138]:
SELECT budget FROM department
WHERE budget BETWEEN 95000.00 AND 140000.00;
GO

budget
100000.0
120000.0
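
Because `BETWEEN` includes both endpoints, it is equivalent to a pair of `>=` and `<=` comparisons. The sketch below checks this on the department budgets, using Python's `sqlite3` module as a stand-in for SQL Server; the interval is chosen so that both endpoints coincide with actual budgets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_name VARCHAR(20) PRIMARY KEY, budget NUMERIC(12,2))")
conn.executemany("INSERT INTO department VALUES (?, ?)", [
    ("Biology", 90000),
    ("Comp. Sci.", 100000),
    ("Elec. Eng.", 85000),
    ("Finance", 120000),
])

with_between = [b for (b,) in conn.execute(
    "SELECT budget FROM department "
    "WHERE budget BETWEEN 100000 AND 120000 ORDER BY budget")]
with_comparisons = [b for (b,) in conn.execute(
    "SELECT budget FROM department "
    "WHERE budget >= 100000 AND budget <= 120000 ORDER BY budget")]

print(with_between)      # [100000, 120000]
print(with_comparisons)  # [100000, 120000]
```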


In [140]:
-- refining the result with cast; note that the filter compares the unscaled budget, so every row passes
SELECT CAST(d.budget / 1000 AS int) AS [budget in thousands]
FROM department AS d
WHERE  d.budget > 88;
GO

budget in thousands
90
100
85
120


We also have predicates for keeping or discarding NULL values: ``IS NULL`` and ``IS NOT NULL``.

In [141]:
SELECT dept_name AS name, building
FROM department
WHERE building IS NOT NULL;

SELECT dept_name AS name, building
FROM department
WHERE building IS NULL;

name,building
Biology,Watson
Comp. Sci.,Taylor
Elec. Eng.,Taylor
Finance,Painter


name,building
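
Since none of the departments above has a missing building, the second query returns nothing. The sketch below adds a hypothetical `History` department without a building to show why the dedicated predicates are needed: comparing with `= NULL` evaluates to UNKNOWN and never selects a row. It runs on Python's `sqlite3` module as a stand-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_name VARCHAR(20) PRIMARY KEY, building VARCHAR(15))")
conn.executemany("INSERT INTO department VALUES (?, ?)", [
    ("Biology", "Watson"),
    ("Finance", "Painter"),
    ("History", None),  # hypothetical row: Python None is stored as SQL NULL
])


def departments_where(condition):
    """Return department names for a WHERE condition (fixed strings only, no user input)."""
    query = f"SELECT dept_name FROM department WHERE {condition} ORDER BY dept_name"
    return [d for (d,) in conn.execute(query)]


equals_null = departments_where("building = NULL")   # comparison is UNKNOWN for every row
is_null = departments_where("building IS NULL")
is_not_null = departments_where("building IS NOT NULL")

print(equals_null)  # []
print(is_null)      # ['History']
print(is_not_null)  # ['Biology', 'Finance']
```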


### Queries on Multiple Relations

<br>

<br>