# Introduction to SQL

## Overview of the SQL query language

The **Structured Query Language (SQL)** has several parts:

1. **Data-definition language (DDL)**; the SQL DDL provides commands for defining relation schemas (e.g. ``student (age, name, sex, ...)``), deleting relations, and modifying relation schemas. It also includes commands for:
    * **Integrity**; specifying integrity constraints that the data stored in the database must satisfy. Updates that violate integrity constraints are disallowed.
    * **View definition**; defining views.
2. **Data-manipulation language (DML)**; the SQL DML provides the ability to query information from the database and to insert tuples into, delete tuples from, and modify tuples in a database. Alongside the DML, SQL also provides:
    * **Transaction control**; a transaction is a unit of work performed against a database, such as an update or a delete. SQL includes commands that control whether a transaction's changes are made permanent or undone, namely ``COMMIT`` and ``ROLLBACK``, which helps guarantee the integrity of the database.
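
As a minimal sketch of transaction control in T-SQL (the temporary table `#account` and its values are hypothetical, introduced only for this illustration), the block below starts a transaction, applies an update, and undoes it when the result would break a business rule:

```sql
-- Hypothetical scratch table, used only for this illustration
CREATE TABLE #account (
    id int PRIMARY KEY,
    balance numeric(12,2)
);
INSERT INTO #account VALUES (1, 100.00);

BEGIN TRANSACTION;

UPDATE #account SET balance = balance - 500.00 WHERE id = 1;

-- If the update left a negative balance, undo it; otherwise make it permanent
IF EXISTS (SELECT 1 FROM #account WHERE balance < 0)
BEGIN
    ROLLBACK TRANSACTION;
END
ELSE
BEGIN
    COMMIT TRANSACTION;
END;

-- The balance is still 100.00 because the update was rolled back
SELECT id, balance FROM #account;

DROP TABLE #account;
```

Withdrawing 50.00 instead of 500.00 would take the `COMMIT` branch and the new balance would persist.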

## Basic data types

Main built-in data types supported:

* ``char(n)``; a fixed-length character string with user-specified length *n*. If the string is shorter than *n*, SQL pads it with spaces, e.g. with ``char(3)``, "hi" becomes "hi ";
* ``varchar(n)``; a variable-length character string with user-specified maximum length *n*;
* ``boolean``; the logical values TRUE, FALSE, and UNKNOWN;
* ``int``; an integer;
* ``smallint``; a small integer (a machine-dependent subset of the integer type);
* ``numeric(p,d)``; a fixed-point number with user-specified precision: *p* is the total number of digits and *d* is how many of those digits are to the right of the decimal point. For example, ``numeric(3,1)`` can store 44.5 exactly, but not 444.5 or 0.32;
* ``real, double precision``; floating-point and double-precision floating-point numbers with machine-dependent precision;
* ``float(n)``; a floating-point number with precision of at least *n* digits;
* ``date``; a calendar date, by default in the format YYYY-MM-DD. For other formats and related types, including time and datetime, see the [SQL Server date documentation](https://docs.microsoft.com/en-us/sql/t-sql/data-types/date-transact-sql?view=sql-server-ver15).
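
A short sketch of how padding and fixed-point precision behave in practice, assuming SQL Server (the variable names are made up for this illustration):

```sql
-- char(n) pads to the declared length, varchar(n) does not
DECLARE @fixed char(3) = 'hi';
DECLARE @variable varchar(3) = 'hi';

SELECT
    DATALENGTH(@fixed) AS fixed_bytes,        -- 3: 'hi' plus one padding space
    DATALENGTH(@variable) AS variable_bytes,  -- 2: no padding
    @fixed + '|' AS fixed_shown;              -- 'hi |'

-- numeric(5,2) keeps 5 digits in total, 2 of them after the decimal point
SELECT CAST(123.456 AS numeric(5,2)) AS rounded;  -- 123.46
```

`DATALENGTH` reports the number of bytes stored, which makes the trailing padding visible.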

## Basic Schema Definition (creating databases and tables)

Before actually writing the code, let's create a temporary database. If it already exists, we delete it first.



In [1]:
-- conditional creation of a temporary database
USE master;
GO
IF EXISTS (SELECT * FROM sys.databases WHERE name = 'uni')
    BEGIN
        -- delete the db
        DROP DATABASE uni
        -- create it
        CREATE DATABASE uni
    END
ELSE 
    BEGIN
        -- create it
        CREATE DATABASE uni
    END;
GO

-- double check
SELECT name, database_id AS id FROM sys.databases WHERE name = 'uni';
GO

name,id
uni,7


We define a SQL relation via the <code>CREATE TABLE</code> command. The command below defines the relation department. Notice that integrity constraints are listed at the bottom, after the attribute definitions. In this case we specify that dept_name is the primary key of the department relation.

In [2]:
-- select the relevant db
USE uni;
GO

-- create the dept table
CREATE TABLE department (
    dept_name VARCHAR(20),
    building VARCHAR(15),
    budget numeric (12,2),
    PRIMARY KEY (dept_name)
);

-- double-check
SELECT * FROM department;
GO

dept_name,building,budget


SQL <strong>supports a number of different integrity constraints</strong>, namely:

- `FOREIGN KEY (A_1, ..., A_n) REFERENCES s`; recall that a foreign key constraint states that the values of attributes `(A_1, ..., A_n)` for any tuple in the relation must correspond to the values of the primary key attributes of some tuple in relation s.

<br>

In [3]:
--create another table
CREATE TABLE professors (
    prof_id int,
    prof_name varchar(250),
    salary NUMERIC(8,2),
    dept_name varchar(20),
    PRIMARY KEY (prof_id),
    FOREIGN KEY (dept_name) REFERENCES department
);
GO

SELECT * FROM professors;
GO

prof_id,prof_name,salary,dept_name


- `PRIMARY KEY (A_1, ..., A_n)`; the primary key attributes are required to be **non-null** and **unique**. The primary key specification is optional but strongly recommended.

In [4]:
-- another example
CREATE TABLE courses (
    course_id int,
    title varchar(15),
    dept_name varchar(20),
    PRIMARY KEY (course_id),
    FOREIGN KEY (dept_name) REFERENCES department
);
GO

-- a primary key can be composed of several attributes
CREATE TABLE teaches(
    prof_id int,
    course_id int,
    semester varchar(6),
    year numeric(4,0),
    PRIMARY KEY (prof_id, course_id, semester, year),
    FOREIGN KEY (course_id) REFERENCES courses,
    FOREIGN KEY (prof_id) REFERENCES professors,
);
GO

SELECT * FROM courses;
GO

SELECT * FROM teaches;
GO

course_id,title,dept_name


prof_id,course_id,semester,year
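
To see these constraints at work, here is a minimal sketch, assuming SQL Server. The tables `demo_dept` and `demo_prof` are hypothetical and exist only for this illustration. Both violating inserts are rejected and the error text is captured with `TRY...CATCH`:

```sql
CREATE TABLE demo_dept (
    dept_name varchar(20) PRIMARY KEY
);
CREATE TABLE demo_prof (
    prof_id int PRIMARY KEY,
    dept_name varchar(20),
    FOREIGN KEY (dept_name) REFERENCES demo_dept
);
GO

INSERT INTO demo_dept VALUES ('Physics');
INSERT INTO demo_prof VALUES (1, 'Physics');  -- accepted: 'Physics' exists in demo_dept

-- Foreign key violation: no 'Alchemy' row in demo_dept
BEGIN TRY
    INSERT INTO demo_prof VALUES (2, 'Alchemy');
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS why_rejected;
END CATCH;

-- Primary key violation: prof_id 1 is already taken
BEGIN TRY
    INSERT INTO demo_prof VALUES (1, 'Physics');
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS why_rejected;
END CATCH;
GO

-- Clean up: the referencing table has to go first
DROP TABLE demo_prof;
DROP TABLE demo_dept;
GO
```

After the two rejected inserts, `demo_prof` still holds only the first row.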


## Dropping databases and tables

Databases and tables are removed with the ``DROP DATABASE`` and ``DROP TABLE`` commands.

In [5]:
-- Switch database being used
USE master;
GO
-- dropping a database as a whole
DROP DATABASE uni;
GO
-- rebuilding it but without foreign key constraints
CREATE DATABASE uni;
GO
USE uni;
GO
CREATE TABLE courses (
    course_id int,
    title varchar(15),
    dept_name varchar(20),
    PRIMARY KEY (course_id)
);
GO
CREATE TABLE teaches(
    prof_id int,
    course_id int,
    semester varchar(6),
    year numeric(4,0),
    PRIMARY KEY (prof_id, course_id, semester, year)
);
GO
CREATE TABLE professors (
    prof_id int,
    prof_name varchar(250),
    salary NUMERIC(8,2),
    dept_name varchar(20),
    PRIMARY KEY (prof_id),
);
GO

SELECT name, object_id AS id FROM sys.tables;
GO 
-- Drop a table
DROP TABLE courses;
GO
-- check
SELECT name, object_id AS id FROM sys.tables;
GO 
-- Drop two tables
DROP TABLE teaches, professors;
GO
-- check
SELECT name, object_id AS id FROM sys.tables;
GO
-- Switch database being used
USE tempdb;
GO
-- Drop the entire database
DROP DATABASE uni;
GO
-- check
SELECT name FROM sys.databases;
GO  

name,id
courses,581577110
teaches,613577224
professors,645577338


name,id
teaches,613577224
professors,645577338


name,id


name
master
tempdb
model
msdb
udemy
dbm_project


As for dropping columns, we have to precede the ``DROP COLUMN`` with an ``ALTER TABLE``.

In [6]:
-- rebuilding it but without foreign key constraints
CREATE DATABASE uni;
GO
USE uni;
GO
CREATE TABLE courses (
    course_id int,
    title varchar(15),
    dept_name varchar(20),
    fooh varchar(100)
);
GO
-- Drop the dept_name column
ALTER TABLE courses
DROP COLUMN dept_name;
GO
SELECT * FROM courses;
GO

-- Drop two columns
ALTER TABLE courses
DROP COLUMN title, course_id;
GO
SELECT * FROM courses;
GO

course_id,title,fooh


fooh


We can also drop constraints. This is particularly useful before removing a column that has a constraint, since SQL Server refuses to drop a column while a constraint still depends on it.

In [7]:
-- Create a table with constraints
CREATE TABLE doc_exc (
    column_a int NOT NULL CONSTRAINT my_constraint UNIQUE,
    column_b int NOT NULL CONSTRAINT my_pk_constraint PRIMARY KEY
);
GO
-- Remove the constraints
ALTER TABLE doc_exc
DROP CONSTRAINT my_constraint, my_pk_constraint;
GO
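
A minimal sketch of that order of operations, assuming SQL Server (the table `demo_items` and the constraint name `uq_demo_code` are hypothetical): the constraint is removed first, and only then the column it depends on.

```sql
-- Hypothetical table with a named UNIQUE constraint on one column
CREATE TABLE demo_items (
    item_id int NOT NULL,
    code int NOT NULL CONSTRAINT uq_demo_code UNIQUE
);
GO

-- Step 1: drop the constraint that depends on the column
ALTER TABLE demo_items
DROP CONSTRAINT uq_demo_code;
GO

-- Step 2: the column can now be dropped
ALTER TABLE demo_items
DROP COLUMN code;
GO

-- Clean up
DROP TABLE demo_items;
GO
```

Naming constraints explicitly, as in the cell above, is what makes step 1 straightforward; otherwise the system-generated name has to be looked up first.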

## Basic structure of SQL queries

The basic structure of SQL queries consists of three clauses:

- `SELECT` -\> **column list**;
- `FROM` -\> **table list**;
- `WHERE` -\> **predicate/logical condition**

For example, take the query “Find the names of all instructors” against an `instructor` relation (the counterpart of the `professors` table used in this notebook, with `name` in place of `prof_name`). Instructor names are found in the instructor relation, so we put that relation in the from clause. The instructor’s name appears in the name attribute, so we put that in the select clause.

```
SELECT name 
FROM instructor;
```

Now consider another query, “Find the department names of all instructors,” which can be written as:

```
SELECT dept_name
FROM instructor;
```

Given the question at hand, we might just be interested in distinct department names since the duplicates are not informative. We can get them by adding the `DISTINCT` keyword after `SELECT`.

```
SELECT DISTINCT dept_name
FROM instructor;
```

Now we will focus on slightly more complex queries. To do so, let's re-create our database, add two tables, and fill them with some fake data. The code below does that.

In [8]:
-- conditional creation of a temporary database
USE master;
GO
IF EXISTS (SELECT * FROM sys.databases WHERE name = 'uni')
    BEGIN
        -- delete the db
        DROP DATABASE uni
        -- create it
        CREATE DATABASE uni
        USE uni
        -- create the dept table
        CREATE TABLE department (
            dept_name VARCHAR(20),
            building VARCHAR(15),
            budget numeric (12,2),
            PRIMARY KEY (dept_name)
        )
        -- create the prof table
        CREATE TABLE professors (
            prof_id int,
            prof_name varchar(250),
            dept_name varchar(20),
            salary numeric(8,2),
            PRIMARY KEY (prof_id),
        )
        -- insert some data
        insert into professors values ('10101', 'Srinivasan', 'Comp. Sci.', '65000');
        insert into professors values ('12121', 'Wu', 'Finance', '90000');
        insert into professors values ('15151', 'Mozart', 'Music', '40000');
        insert into department values ('Biology', 'Watson', '90000');
        insert into department values ('Comp. Sci.', 'Taylor', '100000');
        insert into department values ('Elec. Eng.', 'Taylor', '85000');
        insert into department values ('Finance', 'Painter', '120000');
    END;
    GO

In [9]:
USE uni;
GO
SELECT * FROM professors;
GO
SELECT * FROM department;
GO

prof_id,prof_name,dept_name,salary
10101,Srinivasan,Comp. Sci.,65000.0
12121,Wu,Finance,90000.0
15151,Mozart,Music,40000.0


dept_name,building,budget
Biology,Watson,90000.0
Comp. Sci.,Taylor,100000.0
Elec. Eng.,Taylor,85000.0
Finance,Painter,120000.0


As stated before, the <code>where</code> clause keeps only the rows of the result relation that satisfy a specified predicate. For example, <em>“Find the names of all instructors in the Computer Science department who have salary greater than $40,000.”</em>

In [10]:
SELECT prof_name AS [Relevant Professor] FROM professors
WHERE (
    dept_name = 'Comp. Sci.' AND salary > 40000.00
);
GO

Relevant Professor
Srinivasan


SQL allows the use of the logical connectives and, or, and not in the where clause. The operands of the logical connectives can be expressions involving the comparison operators \<, \<=, \>, \>=, =, and \<\> (<strong>NOT EQUAL TO</strong>). SQL allows us to use the comparison operators to compare strings and arithmetic expressions, as well as special types, such as date types.

Another example: *find all departments which are not hosted in the Taylor building.*

In [11]:
SELECT dept_name AS department FROM department
WHERE building <> 'Taylor';
GO

department
Biology
Finance


Another crucial predicate is <code>LIKE</code>, which allows partial string matching with wildcards, e.g. <code>LIKE '%string%'</code>:

* Percent (`%`): the `%` character matches any substring.
* Underscore (`_`): the `_` character matches any single character.

In [12]:
-- exclude buildings whose name starts with 'Ta' and ends with 'r'
SELECT dept_name AS department FROM department
WHERE building NOT LIKE 'Ta%r';
GO

department
Biology
Finance


In [13]:
-- each underscore matches exactly one character (LIKE patterns are not regular expressions)
SELECT dept_name AS department FROM department
WHERE building NOT LIKE 'Tay__r';
GO

department
Biology
Finance


Similarly, we can keep only the rows whose value appears in a given list, using the ``IN`` predicate.

In [14]:
SELECT dept_name AS department FROM department
WHERE building IN ('FOOH', ' Baah', 'Watson', 'Painter');
GO

department
Biology
Finance


The ``BETWEEN`` predicate evaluates whether a value lies within a given interval, with both endpoints included.

In [15]:
SELECT budget FROM department
WHERE budget BETWEEN 95000.00 AND 140000.00;
GO

budget
100000.0
120000.0


In [16]:
-- refining the result with cast
SELECT d.dept_name, CAST(d.budget / 1000 AS int) AS [budget in thousands]
FROM department AS d
ORDER BY [budget in thousands] desc;
GO

dept_name,budget in thousands
Finance,120
Comp. Sci.,100
Biology,90
Elec. Eng.,85


We also have the predicates ``IS NULL`` and ``IS NOT NULL`` for keeping or filtering out rows with NULL values.

In [17]:
SELECT dept_name AS name, building
FROM department
WHERE building IS NOT NULL;

SELECT dept_name AS name, building
FROM department
WHERE building IS NULL;

name,building
Biology,Watson
Comp. Sci.,Taylor
Elec. Eng.,Taylor
Finance,Painter


name,building


### Queries on Multiple Relations

As an example, suppose we want to answer the query *“Retrieve the names of all instructors, along with their department names and department building name.”* Notice that the professor from the Music department is dropped from the result. That is because the Music department is missing from the `department` table.

In [18]:
SELECT p.prof_name AS names, p.dept_name, d.building
FROM professors AS p, department AS d
WHERE p.dept_name = d.dept_name;
GO

names,dept_name,building
Srinivasan,Comp. Sci.,Taylor
Wu,Finance,Painter


Although the clauses must be written in the order above, the easiest way to understand the operations specified by the query is to consider the clauses in operational order: first <strong>from</strong>, then <strong>where</strong>, and then <strong>select</strong>.

  

The from clause by itself defines a cartesian product of the relations listed in the clause. The resulting relation has all attributes from all relations present in the clause.

In [19]:
-- cartesian product of two relations in SQL
SELECT prof.*, dept.* 
FROM professors AS prof, department AS dept;
GO

prof_id,prof_name,dept_name,salary,dept_name.1,building,budget
10101,Srinivasan,Comp. Sci.,65000.0,Biology,Watson,90000.0
10101,Srinivasan,Comp. Sci.,65000.0,Comp. Sci.,Taylor,100000.0
10101,Srinivasan,Comp. Sci.,65000.0,Elec. Eng.,Taylor,85000.0
10101,Srinivasan,Comp. Sci.,65000.0,Finance,Painter,120000.0
12121,Wu,Finance,90000.0,Biology,Watson,90000.0
12121,Wu,Finance,90000.0,Comp. Sci.,Taylor,100000.0
12121,Wu,Finance,90000.0,Elec. Eng.,Taylor,85000.0
12121,Wu,Finance,90000.0,Finance,Painter,120000.0
15151,Mozart,Music,40000.0,Biology,Watson,90000.0
15151,Mozart,Music,40000.0,Comp. Sci.,Taylor,100000.0


This leads to a very large relation in which most rows pair a professor with a department they do not belong to. The predicate in the where clause is therefore used to restrict the combinations created by the Cartesian product to those that are meaningful for the question at hand.
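
The same restriction can also be written with explicit join syntax. The sketch below is self-contained: it uses inline copies of the sample rows (the CTE names `prof` and `dept` are made up for this illustration) rather than the tables created earlier.

```sql
-- Inline copies of the sample rows, so the sketch does not depend on existing tables
WITH prof (prof_name, dept_name) AS (
    SELECT * FROM (VALUES
        ('Srinivasan', 'Comp. Sci.'),
        ('Wu', 'Finance'),
        ('Mozart', 'Music')
    ) AS v (prof_name, dept_name)
),
dept (dept_name, building) AS (
    SELECT * FROM (VALUES
        ('Biology', 'Watson'),
        ('Comp. Sci.', 'Taylor'),
        ('Elec. Eng.', 'Taylor'),
        ('Finance', 'Painter')
    ) AS v (dept_name, building)
)
-- Equivalent to listing both relations in FROM and filtering in WHERE
SELECT p.prof_name, p.dept_name, d.building
FROM prof AS p
INNER JOIN dept AS d
    ON p.dept_name = d.dept_name;
```

As with the `WHERE` formulation, only Srinivasan and Wu are returned, because no `dept` row matches 'Music'.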

### Set operations

The SQL operations <strong>UNION</strong>, <strong>INTERSECT</strong>, and <strong>EXCEPT</strong> operate on relations and correspond to the mathematical set operations ∪, ∩, and −.

We can find the set of all departments in the Taylor building with a budget above 89,000:

In [20]:
SELECT dept_name AS name 
FROM department
WHERE building = 'Taylor' and budget > 89000.00;
GO

name
Comp. Sci.


Similarly, we can find the set of all departments in the Watson building with a budget above 89,000:

In [21]:
SELECT dept_name AS name 
FROM department
WHERE building = 'Watson' and budget > 89000.00;
GO


name
Biology


To find the set of all departments in Taylor or Watson with a budget above 89,000, we can resort to the union operator.

In [22]:
(SELECT dept_name AS name 
FROM department
WHERE building = 'Taylor' and budget > 89000.00)
UNION
(SELECT dept_name AS name 
FROM department
WHERE building = 'Watson' and budget > 89000.00);
GO

-- alternatively, we could have written
SELECT dept_name AS name 
FROM department
WHERE building IN ('Taylor', 'Watson') and budget > 89000.00;
GO

name
Biology
Comp. Sci.


name
Biology
Comp. Sci.


Now turning to the intersection, we can identify the set of department names which are mentioned in both tables.

In [23]:
SELECT dept_name FROM professors
INTERSECT
SELECT dept_name FROM department;
GO

dept_name
Comp. Sci.
Finance


The except operator outputs all tuples from the first input relation which do not occur in the second input (i.e. it behaves like an anti-join). This confirms what we saw before: the only department in the professors table which is not in the department table is the Music department.

In [24]:
SELECT dept_name FROM professors
EXCEPT
SELECT dept_name FROM department;
GO

dept_name
Music
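
One detail worth keeping in mind is that `UNION` eliminates duplicate rows, while `UNION ALL` retains them. A small self-contained sketch with inline values (the aliases `p` and `d` are only illustrative stand-ins for the two tables):

```sql
-- UNION: duplicates removed, 4 rows (Biology, Comp. Sci., Finance, Music)
SELECT dept_name
FROM (VALUES ('Comp. Sci.'), ('Finance'), ('Music')) AS p (dept_name)
UNION
SELECT dept_name
FROM (VALUES ('Comp. Sci.'), ('Finance'), ('Biology')) AS d (dept_name);

-- UNION ALL: duplicates kept, 6 rows
SELECT dept_name
FROM (VALUES ('Comp. Sci.'), ('Finance'), ('Music')) AS p (dept_name)
UNION ALL
SELECT dept_name
FROM (VALUES ('Comp. Sci.'), ('Finance'), ('Biology')) AS d (dept_name);
```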


### Null Values

Null values present special problems in relational operations, including arithmetic, comparison, and set operations. In comparison operations, SQL treats the result of comparing with a null value as unknown.

In [25]:
-- add a row with a NULL budget to department
insert into department values ('Law.', 'Sottomayor', NULL);

For example, the row with the NULL budget appears in neither of the two queries below: for that row both `budget < 10000` and `budget > 10000` evaluate to UNKNOWN, and the where clause keeps only rows for which the predicate is TRUE. This can be surprising, since one might expect every row to satisfy one of two complementary conditions.

In [26]:
SELECT budget FROM department
WHERE budget < 10000.00;
GO

SELECT budget FROM department
WHERE budget > 10000.00;
GO

budget


budget
90000.0
100000.0
85000.0
120000.0


SQL treats the result of any comparison involving NULL as UNKNOWN, a third truth value alongside TRUE and FALSE. Because even `budget = NULL` evaluates to UNKNOWN, the ordinary comparison operators cannot be used to test for nulls. Instead, SQL provides the special predicates ``IS NULL`` and ``IS NOT NULL``, which we can use in the where clause to work with null values.

In [27]:
SELECT budget FROM department
WHERE budget IS NOT NULL AND budget > 10000.00;
GO

-- we can also identify the tuple with the missing data
SELECT * FROM department 
WHERE budget IS NULL;
GO

budget
90000.0
100000.0
85000.0
120000.0


dept_name,building,budget
Law.,Sottomayor,
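
The three-valued logic can be checked directly. The sketch below assumes SQL Server with its default `ANSI_NULLS ON` setting, and the variable `@budget` is made up for this illustration. Each `CASE` reports whether its predicate evaluated to TRUE:

```sql
DECLARE @budget numeric(12,2) = NULL;

SELECT
    -- comparison with NULL is UNKNOWN, so the ELSE branch is taken
    CASE WHEN @budget > 10000 THEN 'TRUE' ELSE 'not TRUE' END AS greater_than,
    -- NOT UNKNOWN is still UNKNOWN
    CASE WHEN NOT (@budget > 10000) THEN 'TRUE' ELSE 'not TRUE' END AS negated,
    -- even NULL = NULL is UNKNOWN
    CASE WHEN @budget = @budget THEN 'TRUE' ELSE 'not TRUE' END AS equals_itself,
    -- IS NULL is the one test that returns TRUE
    CASE WHEN @budget IS NULL THEN 'TRUE' ELSE 'not TRUE' END AS is_null;
```

Only the last column reads 'TRUE', which is why the row with the NULL budget dropped out of both comparison queries above.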


### Aggregate functions

Aggregate functions are functions that take a collection of values as input and return a single value. SQL offers five standard built-in aggregate functions:

- Average: `avg`
- Minimum: `min`
- Maximum: `max`
- Total: `sum`
- Count: `count`

For example, in the query below we find the average salary of the professors.

In [28]:
SELECT AVG(salary) as [Avg. budget]
FROM professors;
GO

-- nulls: AVG already ignores NULL values, so the filter below is redundant but makes the intent explicit
SELECT AVG(budget) as [Avg. budget]
FROM department
WHERE budget IS NOT NULL;
GO

Avg. budget
65000.0


Avg. budget
98750.0


`DISTINCT` can also be applied inside an aggregate. In this case we remove duplicates before counting, since we want to know how many distinct buildings are in our university database.

In [29]:
SELECT COUNT(
    DISTINCT building
) AS [number of buildings]
FROM department;
GO
-- check
SELECT DISTINCT building
FROM department;
GO

number of buildings
4


building
Painter
Sottomayor
Taylor
Watson


It is useful to know that SQL does not allow `DISTINCT` to be combined with `COUNT(*)`, so `COUNT(DISTINCT *)` is illegal. We can, however, count all rows as a whole.

In [30]:

-- this is fine
SELECT COUNT (*) as [Row count]
FROM department;
GO


Row count
5
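
The different flavours of `COUNT` treat nulls and duplicates differently. A self-contained sketch with four inline rows (the values and column names are made up for this illustration):

```sql
SELECT
    COUNT(*) AS all_rows,                        -- 4: every row, NULLs included
    COUNT(budget) AS non_null_budgets,           -- 3: the NULL budget is skipped
    COUNT(DISTINCT budget) AS distinct_budgets,  -- 2: 100.00 and 200.00
    AVG(budget) AS avg_budget                    -- mean of the three non-NULL values
FROM (VALUES
    ('A', CAST(100.00 AS numeric(12,2))),
    ('B', 100.00),
    ('C', 200.00),
    ('D', NULL)
) AS d (dept_name, budget);
```

SQL Server may also print a warning that a null value was eliminated by an aggregate, which is expected here.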


### Aggregation by groups

There are circumstances where we would like to apply an aggregate function not to a single set of tuples but to each of several groups of tuples. For this it is useful to have a larger database, so I am re-creating the uni database [using the following scripts](https://www.db-book.com/db7/university-lab-dir/sample_tables-dir/index.html).
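
Before loading the larger data set, here is a minimal sketch of the idea using `GROUP BY` on three inline rows (the rows and the alias `i` are hypothetical): the aggregate is computed once per department rather than once for the whole relation.

```sql
SELECT
    dept_name,
    AVG(salary) AS avg_salary,   -- one average per department
    COUNT(*) AS n_instructors    -- one row count per department
FROM (VALUES
    ('Comp. Sci.', CAST(65000.00 AS numeric(8,2))),
    ('Comp. Sci.', 75000.00),
    ('Finance', 90000.00)
) AS i (dept_name, salary)
GROUP BY dept_name;
```

This returns one row per department: an average of 70000 over two instructors for Comp. Sci., and 90000 over one instructor for Finance.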

In [40]:
USE master;
GO

-- dump the database
IF EXISTS (SELECT name FROM sys.databases WHERE name = 'uni')
    BEGIN
        DROP DATABASE uni
    END;
    GO

-- check
SELECT name FROM sys.databases;
GO

-- re-create the database
CREATE DATABASE uni;
GO
USE uni;
GO
-- create tables
create table classroom
	(building		varchar(15),
	 room_number		varchar(7),
	 capacity		numeric(4,0),
	 primary key (building, room_number)
	);

create table department
	(dept_name		varchar(20), 
	 building		varchar(15), 
	 budget		        numeric(12,2) check (budget > 0),
	 primary key (dept_name)
	);

create table course
	(course_id		varchar(8), 
	 title			varchar(50), 
	 dept_name		varchar(20),
	 credits		numeric(2,0) check (credits > 0),
	 primary key (course_id),
	 foreign key (dept_name) references department (dept_name)
		on delete set null
	);

create table instructor
	(ID			varchar(5), 
	 name			varchar(20) not null, 
	 dept_name		varchar(20), 
	 salary			numeric(8,2) check (salary > 29000),
	 primary key (ID),
	 foreign key (dept_name) references department (dept_name)
		on delete set null
	);

create table section
	(course_id		varchar(8), 
         sec_id			varchar(8),
	 semester		varchar(6)
		check (semester in ('Fall', 'Winter', 'Spring', 'Summer')), 
	 year			numeric(4,0) check (year > 1701 and year < 2100), 
	 building		varchar(15),
	 room_number		varchar(7),
	 time_slot_id		varchar(4),
	 primary key (course_id, sec_id, semester, year),
	 foreign key (course_id) references course (course_id)
		on delete cascade,
	 foreign key (building, room_number) references classroom (building, room_number)
		on delete set null
	);

create table teaches
	(ID			varchar(5), 
	 course_id		varchar(8),
	 sec_id			varchar(8), 
	 semester		varchar(6),
	 year			numeric(4,0),
	 primary key (ID, course_id, sec_id, semester, year),
	 foreign key (course_id, sec_id, semester, year) references section (course_id, sec_id, semester, year)
		on delete cascade,
	 foreign key (ID) references instructor (ID)
		on delete cascade
	);

create table student
	(ID			varchar(5), 
	 name			varchar(20) not null, 
	 dept_name		varchar(20), 
	 tot_cred		numeric(3,0) check (tot_cred >= 0),
	 primary key (ID),
	 foreign key (dept_name) references department (dept_name)
		on delete set null
	);

create table takes
	(ID			varchar(5), 
	 course_id		varchar(8),
	 sec_id			varchar(8), 
	 semester		varchar(6),
	 year			numeric(4,0),
	 grade		        varchar(2),
	 primary key (ID, course_id, sec_id, semester, year),
	 foreign key (course_id, sec_id, semester, year) references section (course_id, sec_id, semester, year)
		on delete cascade,
	 foreign key (ID) references student (ID)
		on delete cascade
	);

create table advisor
	(s_ID			varchar(5),
	 i_ID			varchar(5),
	 primary key (s_ID),
	 foreign key (i_ID) references instructor (ID)
		on delete set null,
	 foreign key (s_ID) references student (ID)
		on delete cascade
	);

create table time_slot
	(time_slot_id		varchar(4),
	 day			varchar(1),
	 start_hr		numeric(2) check (start_hr >= 0 and start_hr < 24),
	 start_min		numeric(2) check (start_min >= 0 and start_min < 60),
	 end_hr			numeric(2) check (end_hr >= 0 and end_hr < 24),
	 end_min		numeric(2) check (end_min >= 0 and end_min < 60),
	 primary key (time_slot_id, day, start_hr, start_min)
	);

create table prereq
	(course_id		varchar(8), 
	 prereq_id		varchar(8),
	 primary key (course_id, prereq_id),
	 foreign key (course_id) references course (course_id)
		on delete cascade,
	 foreign key (prereq_id) references course (course_id)
	);

-- add data
insert into time_slot values ( 'A', 'M', 8, 0, 8, 50);
insert into time_slot values ( 'A', 'W', 8, 0, 8, 50);
insert into time_slot values ( 'A', 'F', 8, 0, 8, 50);
insert into time_slot values ( 'B', 'M', 9, 0, 9, 50);
insert into time_slot values ( 'B', 'W', 9, 0, 9, 50);
insert into time_slot values ( 'B', 'F', 9, 0, 9, 50);
insert into time_slot values ( 'C', 'M', 11, 0, 11, 50);
insert into time_slot values ( 'C', 'W', 11, 0, 11, 50);
insert into time_slot values ( 'C', 'F', 11, 0, 11, 50);
insert into time_slot values ( 'D', 'M', 13, 0, 13, 50);
insert into time_slot values ( 'D', 'W', 13, 0, 13, 50);
insert into time_slot values ( 'D', 'F', 13, 0, 13, 50);
insert into time_slot values ( 'E', 'T', 10, 30, 11, 45);
insert into time_slot values ( 'E', 'R', 10, 30, 11, 45);
insert into time_slot values ( 'F', 'T', 14, 30, 15, 45);
insert into time_slot values ( 'F', 'R', 14, 30, 15, 45);
insert into time_slot values ( 'G', 'M', 16, 0, 16, 50);
insert into time_slot values ( 'G', 'W', 16, 0, 16, 50);
insert into time_slot values ( 'G', 'F', 16, 0, 16, 50);
insert into time_slot values ( 'H', 'W', 10, 0, 12, 30);
insert into classroom values('Lamberton', 134, 10);
insert into classroom values('Chandler', 375, 10);
insert into classroom values('Fairchild', 145, 27);
insert into classroom values('Nassau', 45, 92);
insert into classroom values('Grace', 40, 34);
insert into classroom values('Whitman', 134, 120);
insert into classroom values('Lamberton', 143, 10);
insert into classroom values('Taylor', 812, 115);
insert into classroom values('Saucon', 113, 109);
insert into classroom values('Painter', 86, 97);
insert into classroom values('Alumni', 547, 26);
insert into classroom values('Alumni', 143, 47);
insert into classroom values('Drown', 757, 18);
insert into classroom values('Saucon', 180, 15);
insert into classroom values('Whitman', 434, 32);
insert into classroom values('Saucon', 844, 24);
insert into classroom values('Bronfman', 700, 12);
insert into classroom values('Polya', 808, 28);
insert into classroom values('Gates', 707, 65);
insert into classroom values('Gates', 314, 10);
insert into classroom values('Main', 45, 30);
insert into classroom values('Taylor', 183, 71);
insert into classroom values('Power', 972, 10);
insert into classroom values('Garfield', 119, 59);
insert into classroom values('Rathbone', 261, 60);
insert into classroom values('Stabler', 105, 113);
insert into classroom values('Power', 717, 12);
insert into classroom values('Main', 425, 22);
insert into classroom values('Lambeau', 348, 51);
insert into classroom values('Chandler', 804, 11);
insert into department values('Civil Eng.', 'Chandler', 255041.46);
insert into department values('Biology', 'Candlestick', 647610.55);
insert into department values('History', 'Taylor', 699140.86);
insert into department values('Physics', 'Wrigley', 942162.76);
insert into department values('Marketing', 'Lambeau', 210627.58);
insert into department values('Pol. Sci.', 'Whitman', 573745.09);
insert into department values('English', 'Palmer', 611042.66);
insert into department values('Accounting', 'Saucon', 441840.92);
insert into department values('Comp. Sci.', 'Lamberton', 106378.69);
insert into department values('Languages', 'Linderman', 601283.60);
insert into department values('Finance', 'Candlestick', 866831.75);
insert into department values('Geology', 'Palmer', 406557.93);
insert into department values('Cybernetics', 'Mercer', 794541.46);
insert into department values('Astronomy', 'Taylor', 617253.94);
insert into department values('Athletics', 'Bronfman', 734550.70);
insert into department values('Statistics', 'Taylor', 395051.74);
insert into department values('Psychology', 'Thompson', 848175.04);
insert into department values('Math', 'Brodhead', 777605.11);
insert into department values('Elec. Eng.', 'Main', 276527.61);
insert into department values('Mech. Eng.', 'Rauch', 520350.65);
insert into course values('787', 'C  Programming', 'Mech. Eng.', 4);
insert into course values('238', 'The Music of Donovan', 'Mech. Eng.', 3);
insert into course values('608', 'Electron Microscopy', 'Mech. Eng.', 3);
insert into course values('539', 'International Finance', 'Comp. Sci.', 3);
insert into course values('278', 'Greek Tragedy', 'Statistics', 4);
insert into course values('972', 'Greek Tragedy', 'Psychology', 4);
insert into course values('391', 'Virology', 'Biology', 3);
insert into course values('814', 'Compiler Design', 'Elec. Eng.', 3);
insert into course values('272', 'Geology', 'Mech. Eng.', 3);
insert into course values('612', 'Mobile Computing', 'Physics', 3);
insert into course values('237', 'Surfing', 'Cybernetics', 3);
insert into course values('313', 'International Trade', 'Marketing', 3);
insert into course values('887', 'Latin', 'Mech. Eng.', 3);
insert into course values('328', 'Composition and Literature', 'Cybernetics', 3);
insert into course values('984', 'Music of the 50s', 'History', 3);
insert into course values('241', 'Biostatistics', 'Geology', 3);
insert into course values('338', 'Graph Theory', 'Psychology', 3);
insert into course values('400', 'Visual BASIC', 'Psychology', 4);
insert into course values('760', 'How to Groom your Cat', 'Accounting', 3);
insert into course values('629', 'Finite Element Analysis', 'Cybernetics', 3);
insert into course values('762', 'The Monkeys', 'History', 4);
insert into course values('242', 'Rock and Roll', 'Marketing', 3);
insert into course values('482', 'FOCAL Programming', 'Psychology', 4);
insert into course values('581', 'Calculus', 'Pol. Sci.', 4);
insert into course values('843', 'Environmental Law', 'Math', 4);
insert into course values('679', 'The Beatles', 'Math', 3);
insert into course values('704', 'Marine Mammals', 'Geology', 4);
insert into course values('774', 'Game Programming', 'Cybernetics', 4);
insert into course values('591', 'Shakespeare', 'Pol. Sci.', 4);
insert into course values('319', 'World History', 'Finance', 4);
insert into course values('960', 'Tort Law', 'Civil Eng.', 3);
insert into course values('274', 'Corporate Law', 'Comp. Sci.', 4);
insert into course values('426', 'Video Gaming', 'Finance', 3);
insert into course values('852', 'World History', 'Athletics', 4);
insert into course values('408', 'Bankruptcy', 'Accounting', 3);
insert into course values('808', 'Organic Chemistry', 'English', 4);
insert into course values('902', 'Existentialism', 'Finance', 3);
insert into course values('730', 'Quantum Mechanics', 'Elec. Eng.', 4);
insert into course values('362', 'Embedded Systems', 'Finance', 4);
insert into course values('341', 'Quantum Mechanics', 'Cybernetics', 3);
insert into course values('582', 'Marine Mammals', 'Cybernetics', 3);
insert into course values('867', 'The IBM 360 Architecture', 'History', 3);
insert into course values('169', 'Marine Mammals', 'Elec. Eng.', 3);
insert into course values('680', 'Electricity and Magnetism', 'Civil Eng.', 3);
insert into course values('227', 'Elastic Structures', 'Languages', 4);
insert into course values('991', 'Transaction Processing', 'Psychology', 3);
insert into course values('366', 'Computational Biology', 'English', 3);
insert into course values('376', 'Cost Accounting', 'Physics', 4);
insert into course values('489', 'Journalism', 'Astronomy', 4);
insert into course values('663', 'Geology', 'Psychology', 3);
insert into course values('461', 'Physical Chemistry', 'Math', 3);
insert into course values('105', 'Image Processing', 'Astronomy', 3);
insert into course values('407', 'Industrial Organization', 'Languages', 4);
insert into course values('254', 'Security', 'Cybernetics', 3);
insert into course values('998', 'Immunology', 'Civil Eng.', 4);
insert into course values('457', 'Systems Software', 'History', 3);
insert into course values('401', 'Sanitary Engineering', 'Athletics', 4);
insert into course values('127', 'Thermodynamics', 'Geology', 3);
insert into course values('399', 'RPG Programming', 'Pol. Sci.', 4);
insert into course values('949', 'Japanese', 'Comp. Sci.', 3);
insert into course values('496', 'Aquatic Chemistry', 'Cybernetics', 3);
insert into course values('334', 'International Trade', 'Athletics', 3);
insert into course values('544', 'Differential Geometry', 'Statistics', 3);
insert into course values('451', 'Database System Concepts', 'Pol. Sci.', 4);
insert into course values('190', 'Romantic Literature', 'Civil Eng.', 3);
insert into course values('630', 'Religion', 'English', 3);
insert into course values('761', 'Existentialism', 'Athletics', 3);
insert into course values('804', 'Introduction to Burglary', 'Cybernetics', 4);
insert into course values('781', 'Compiler Design', 'Finance', 4);
insert into course values('805', 'Composition and Literature', 'Statistics', 4);
insert into course values('318', 'Geology', 'Cybernetics', 3);
insert into course values('353', 'Operating Systems', 'Psychology', 3);
insert into course values('394', 'C  Programming', 'Athletics', 3);
insert into course values('137', 'Manufacturing', 'Finance', 3);
insert into course values('192', 'Drama', 'Languages', 4);
insert into course values('681', 'Medieval Civilization or Lack Thereof', 'English', 3);
insert into course values('377', 'Differential Geometry', 'Astronomy', 4);
insert into course values('959', 'Bacteriology', 'Physics', 4);
insert into course values('235', 'International Trade', 'Math', 3);
insert into course values('421', 'Aquatic Chemistry', 'Athletics', 4);
insert into course values('647', 'Service-Oriented Architectures', 'Comp. Sci.', 4);
insert into course values('598', 'Number Theory', 'Accounting', 4);
insert into course values('858', 'Sailing', 'Math', 4);
insert into course values('487', 'Physical Chemistry', 'History', 3);
insert into course values('133', 'Antidisestablishmentarianism in Modern America', 'Biology', 4);
insert into course values('267', 'Hydraulics', 'Physics', 4);
insert into course values('200', 'The Music of the Ramones', 'Accounting', 4);
insert into course values('664', 'Elastic Structures', 'English', 3);
insert into course values('599', 'Mechanics', 'Psychology', 4);
insert into course values('456', 'Hebrew', 'Civil Eng.', 3);
insert into course values('558', 'Environmental Law', 'Psychology', 3);
insert into course values('919', 'Computability Theory', 'Math', 3);
insert into course values('546', 'Creative Writing', 'Mech. Eng.', 4);
insert into course values('969', 'The Monkeys', 'Astronomy', 4);
insert into course values('877', 'Composition and Literature', 'Biology', 4);
insert into course values('337', 'Differential Geometry', 'Statistics', 3);
insert into course values('983', 'Virology', 'Languages', 4);
insert into course values('603', 'Care and Feeding of Cats', 'Statistics', 3);
insert into course values('747', 'International Practicum', 'Comp. Sci.', 4);
insert into course values('659', 'Geology', 'Math', 4);
insert into course values('559', 'Martian History', 'Biology', 3);
insert into course values('403', 'Immunology', 'Biology', 3);
insert into course values('436', 'Stream Processing', 'Physics', 4);
insert into course values('656', 'Groups and Rings', 'Civil Eng.', 4);
insert into course values('731', 'The Music of Donovan', 'Physics', 4);
insert into course values('820', 'Assembly Language Programming', 'Cybernetics', 3);
insert into course values('898', 'Petroleum Engineering', 'Marketing', 4);
insert into course values('545', 'International Practicum', 'History', 3);
insert into course values('893', 'Systems Software', 'Cybernetics', 3);
insert into course values('818', 'Environmental Law', 'Astronomy', 4);
insert into course values('618', 'Thermodynamics', 'English', 4);
insert into course values('416', 'Data Mining', 'Accounting', 3);
insert into course values('716', 'Medieval Civilization or Lack Thereof', 'Languages', 4);
insert into course values('130', 'Differential Geometry', 'Physics', 3);
insert into course values('476', 'International Communication', 'Astronomy', 4);
insert into course values('101', 'Diffusion and Phase Transformation', 'Mech. Eng.', 3);
insert into course values('123', 'Differential Equations', 'Mech. Eng.', 3);
insert into course values('209', 'International Trade', 'Cybernetics', 4);
insert into course values('352', 'Compiler Design', 'Psychology', 4);
insert into course values('393', 'Aerodynamics', 'Languages', 3);
insert into course values('795', 'Death and Taxes', 'Marketing', 3);
insert into course values('577', 'The Music of Dave Edmunds', 'Elec. Eng.', 3);
insert into course values('584', 'Computability Theory', 'Comp. Sci.', 3);
insert into course values('864', 'Heat Transfer', 'Geology', 3);
insert into course values('594', 'Cognitive Psychology', 'Finance', 3);
insert into course values('802', 'African History', 'Cybernetics', 3);
insert into course values('692', 'Cat Herding', 'Athletics', 3);
insert into course values('258', 'Colloid and Surface Chemistry', 'Math', 3);
insert into course values('748', 'Tort Law', 'Cybernetics', 4);
insert into course values('770', 'European History', 'Pol. Sci.', 3);
insert into course values('340', 'Corporate Law', 'History', 3);
insert into course values('158', 'Elastic Structures', 'Cybernetics', 3);
insert into course values('276', 'Game Design', 'Comp. Sci.', 4);
insert into course values('626', 'Multimedia Design', 'History', 4);
insert into course values('696', 'Heat Transfer', 'Marketing', 4);
insert into course values('239', 'The Music of the Ramones', 'Physics', 4);
insert into course values('962', 'Animal Behavior', 'Psychology', 3);
insert into course values('527', 'Graphics', 'Finance', 3);
insert into course values('275', 'Romantic Literature', 'Languages', 3);
insert into course values('549', 'Banking and Finance', 'Astronomy', 3);
insert into course values('974', 'Astronautics', 'Accounting', 3);
insert into course values('897', 'How to Succeed in Business Without Really Trying', 'Languages', 4);
insert into course values('359', 'Game Programming', 'Comp. Sci.', 4);
insert into course values('345', 'Race Car Driving', 'Accounting', 4);
insert into course values('371', 'Milton', 'Finance', 3);
insert into course values('284', 'Topology', 'Comp. Sci.', 4);
insert into course values('642', 'Video Gaming', 'Psychology', 3);
insert into course values('769', 'Logic', 'Elec. Eng.', 4);
insert into course values('947', 'Real-Time Database Systems', 'Accounting', 3);
insert into course values('265', 'Thermal Physics', 'Cybernetics', 4);
insert into course values('927', 'Differential Geometry', 'Cybernetics', 4);
insert into course values('694', 'Optics', 'Math', 3);
insert into course values('580', 'The Music of Dave Edmunds', 'Physics', 4);
insert into course values('324', 'Ponzi Schemes', 'Civil Eng.', 3);
insert into course values('349', 'Networking', 'Finance', 4);
insert into course values('392', 'Recursive Function Theory', 'Astronomy', 4);
insert into course values('735', 'Greek Tragedy', 'Geology', 3);
insert into course values('702', 'Arabic', 'Biology', 3);
insert into course values('458', 'The Renaissance', 'Civil Eng.', 4);
insert into course values('348', 'Compiler Design', 'Elec. Eng.', 3);
insert into course values('500', 'Networking', 'Astronomy', 3);
insert into course values('494', 'Automobile Mechanics', 'Pol. Sci.', 4);
insert into course values('411', 'Music of the 80s', 'Mech. Eng.', 4);
insert into course values('493', 'Music of the 50s', 'Geology', 3);
insert into course values('396', 'C  Programming', 'Languages', 3);
insert into course values('810', 'Mobile Computing', 'Geology', 3);
insert into course values('631', 'Plasma Physics', 'Elec. Eng.', 4);
insert into course values('486', 'Accounting', 'Geology', 3);
insert into course values('963', 'Groups and Rings', 'Languages', 4);
insert into course values('445', 'Biostatistics', 'Finance', 3);
insert into course values('292', 'Electron Microscopy', 'English', 4);
insert into course values('830', 'Sensor Networks', 'Astronomy', 4);
insert into course values('604', 'UNIX System Programmming', 'Statistics', 4);
insert into course values('857', 'UNIX System Programmming', 'Geology', 4);
insert into course values('304', 'Music 2 New for your Instructor', 'Finance', 4);
insert into course values('922', 'Microeconomics', 'Finance', 4);
insert into course values('571', 'Plastics', 'Comp. Sci.', 4);
insert into course values('628', 'Existentialism', 'Accounting', 3);
insert into course values('841', 'Fractal Geometry', 'Mech. Eng.', 4);
insert into course values('586', 'Image Processing', 'Finance', 4);
insert into course values('139', 'Number Theory', 'English', 4);
insert into course values('666', 'Multivariable Calculus', 'Accounting', 3);
insert into course values('443', 'Journalism', 'Physics', 4);
insert into course values('195', 'Numerical Methods', 'Geology', 4);
insert into course values('634', 'Astronomy', 'Cybernetics', 4);
insert into course values('224', 'International Finance', 'Athletics', 3);
insert into course values('791', 'Operating Systems', 'Marketing', 3);
insert into course values('875', 'Bioinformatics', 'Cybernetics', 3);
insert into course values('958', 'Fiction Writing', 'Mech. Eng.', 3);
insert into course values('415', 'Numerical Methods', 'Biology', 3);
insert into course values('442', 'Strength of Materials', 'Athletics', 3);
insert into course values('468', 'Fractal Geometry', 'Civil Eng.', 4);
insert into course values('270', 'Music of the 90s', 'Math', 4);
insert into course values('966', 'Sanitary Engineering', 'History', 3);
insert into course values('793', 'Decison Support Systems', 'Civil Eng.', 3);
insert into course values('236', 'Design and Analysis of Algorithms', 'Mech. Eng.', 3);
insert into course values('792', 'Image Processing', 'Accounting', 3);
insert into course values('561', 'The Music of Donovan', 'Elec. Eng.', 4);
insert into course values('344', 'Quantum Mechanics', 'Accounting', 4);
insert into course values('780', 'Geology', 'Psychology', 3);
insert into instructor values('63395', 'McKinnon', 'Cybernetics', 94333.99);
insert into instructor values('78699', 'Pingr', 'Statistics', 59303.62);
insert into instructor values('96895', 'Mird', 'Marketing', 119921.41);
insert into instructor values('4233', 'Luo', 'English', 88791.45);
insert into instructor values('4034', 'Murata', 'Athletics', 61387.56);
insert into instructor values('50885', 'Konstantinides', 'Languages', 32570.50);
insert into instructor values('79653', 'Levine', 'Elec. Eng.', 89805.83);
insert into instructor values('50330', 'Shuming', 'Physics', 108011.81);
insert into instructor values('80759', 'Queiroz', 'Biology', 45538.32);
insert into instructor values('73623', 'Sullivan', 'Elec. Eng.', 90038.09);
insert into instructor values('97302', 'Bertolino', 'Mech. Eng.', 51647.57);
insert into instructor values('57180', 'Hau', 'Accounting', 43966.29);
insert into instructor values('74420', 'Voronina', 'Physics', 121141.99);
insert into instructor values('35579', 'Soisalon-Soininen', 'Psychology', 62579.61);
insert into instructor values('31955', 'Moreira', 'Accounting', 71351.42);
insert into instructor values('37687', 'Arias', 'Statistics', 104563.38);
insert into instructor values('6569', 'Mingoz', 'Finance', 105311.38);
insert into instructor values('16807', 'Yazdi', 'Athletics', 98333.65);
insert into instructor values('14365', 'Lembr', 'Accounting', 32241.56);
insert into instructor values('90643', 'Choll', 'Statistics', 57807.09);
insert into instructor values('81991', 'Valtchev', 'Biology', 77036.18);
insert into instructor values('95030', 'Arinb', 'Statistics', 54805.11);
insert into instructor values('15347', 'Bawa', 'Athletics', 72140.88);
insert into instructor values('74426', 'Kenje', 'Marketing', 106554.73);
insert into instructor values('42782', 'Vicentino', 'Elec. Eng.', 34272.67);
insert into instructor values('58558', 'Dusserre', 'Marketing', 66143.25);
insert into instructor values('63287', 'Jaekel', 'Athletics', 103146.87);
insert into instructor values('59795', 'Desyl', 'Languages', 48803.38);
insert into instructor values('22591', 'DAgostino', 'Psychology', 59706.49);
insert into instructor values('48570', 'Sarkar', 'Pol. Sci.', 87549.80);
insert into instructor values('79081', 'Ullman ', 'Accounting', 47307.10);
insert into instructor values('52647', 'Bancilhon', 'Pol. Sci.', 87958.01);
insert into instructor values('25946', 'Liley', 'Languages', 90891.69);
insert into instructor values('36897', 'Morris', 'Marketing', 43770.36);
insert into instructor values('72553', 'Yin', 'English', 46397.59);
insert into instructor values('3199', 'Gustafsson', 'Elec. Eng.', 82534.37);
insert into instructor values('34175', 'Bondi', 'Comp. Sci.', 115469.11);
insert into instructor values('48507', 'Lent', 'Mech. Eng.', 107978.47);
insert into instructor values('65931', 'Pimenta', 'Cybernetics', 79866.95);
insert into instructor values('3335', 'Bourrier', 'Comp. Sci.', 80797.83);
insert into instructor values('64871', 'Gutierrez', 'Statistics', 45310.53);
insert into instructor values('95709', 'Sakurai', 'English', 118143.98);
insert into instructor values('43779', 'Romero', 'Astronomy', 79070.08);
insert into instructor values('77346', 'Mahmoud', 'Geology', 99382.59);
insert into instructor values('28097', 'Kean', 'English', 35023.18);
insert into instructor values('90376', 'Bietzk', 'Cybernetics', 117836.50);
insert into instructor values('28400', 'Atanassov', 'Statistics', 84982.92);
insert into instructor values('41930', 'Tung', 'Athletics', 50482.03);
insert into instructor values('19368', 'Wieland', 'Pol. Sci.', 124651.41);
insert into instructor values('99052', 'Dale', 'Cybernetics', 93348.83);

-- check
SELECT TOP 10 * FROM course;
GO


name
master
tempdb
model
msdb
udemy
dbm_project


course_id,title,dept_name,credits
101,Diffusion and Phase Transformation,Mech. Eng.,3
105,Image Processing,Astronomy,3
123,Differential Equations,Mech. Eng.,3
127,Thermodynamics,Geology,3
130,Differential Geometry,Physics,3
133,Antidisestablishmentarianism in Modern America,Biology,4
137,Manufacturing,Finance,3
139,Number Theory,English,4
158,Elastic Structures,Cybernetics,3
169,Marine Mammals,Elec. Eng.,3


We can aggregate observations by group using the ``GROUP BY`` clause. As an illustration, we can find the number of courses offered by each department.

In [42]:
SELECT dept_name, COUNT(course_id) as [course number]
FROM course
GROUP BY dept_name;
GO

dept_name,course number
Accounting,12
Astronomy,10
Athletics,9
Biology,7
Civil Eng.,10
Comp. Sci.,10
Cybernetics,20
Elec. Eng.,8
English,8
Finance,14
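
To see what the grouping does on a handful of rows, the following is a minimal sketch that copies five of the ``course`` rows inserted above into a table variable and runs the same aggregation. The name ``@course_sample`` and the column types are assumptions made for illustration; they are not part of the university schema.

```sql
-- Hypothetical table variable; its name and column types are assumptions.
DECLARE @course_sample TABLE (
    course_id VARCHAR(8),
    title     VARCHAR(50),
    dept_name VARCHAR(20),
    credits   NUMERIC(2,0)
);

-- Five rows taken from the insert statements above.
INSERT INTO @course_sample VALUES
    ('457', 'Systems Software',     'History',   3),
    ('401', 'Sanitary Engineering', 'Athletics', 4),
    ('334', 'International Trade',  'Athletics', 3),
    ('761', 'Existentialism',       'Athletics', 3),
    ('487', 'Physical Chemistry',   'History',   3);

-- One output row per distinct dept_name:
-- Athletics is counted 3 times and History 2 times.
SELECT dept_name, COUNT(course_id) AS [course number]
FROM @course_sample
GROUP BY dept_name
ORDER BY dept_name;
GO
```

The ``ORDER BY`` is added only to make the row order deterministic; without it, the order of the groups is not guaranteed.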


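Similarly, we can find the number of departments by building. The first query below lists the available tables; the second performs the aggregation on ``department``.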

In [49]:
-- find the available table
SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES ;
GO

-- make the query
SELECT building, COUNT(dept_name) as [department count]
FROM department
GROUP BY building;
GO

TABLE_NAME
classroom
department
course
instructor
section
teaches
student
takes
advisor
time_slot


building,department count
Brodhead,1
Bronfman,1
Candlestick,2
Chandler,1
Lambeau,1
Lamberton,1
Linderman,1
Main,1
Mercer,1
Palmer,2


At times it is useful to state a condition that applies to groups rather than to tuples, that is, a ``WHERE`` for groups. The ``HAVING`` clause serves this purpose. For example, we may be interested in the average budget by building, keeping only the buildings whose average budget exceeds 500,000.

In [55]:
-- all
SELECT building, AVG(budget) as [average budget]
FROM department
GROUP BY building;
GO

-- filtered
SELECT building, AVG(budget) as [average budget]
FROM department
GROUP BY building
HAVING AVG(budget) > 500000.00;
GO

building,average budget
Brodhead,777605.11
Bronfman,734550.7
Candlestick,757221.15
Chandler,255041.46
Lambeau,210627.58
Lamberton,106378.69
Linderman,601283.6
Main,276527.61
Mercer,794541.46
Palmer,508800.295


building,average budget
Brodhead,777605.11
Bronfman,734550.7
Candlestick,757221.15
Linderman,601283.6
Mercer,794541.46
Palmer,508800.295
Rauch,520350.65
Taylor,570482.18
Thompson,848175.04
Whitman,573745.09
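
The difference between ``WHERE`` and ``HAVING`` is easier to see on a few rows. The sketch below uses a table variable named ``@department_sample``; the table name, the department names and the budgets are all made up for illustration, and only the building names are borrowed from the output above.

```sql
-- Hypothetical sample data: department names and budgets are invented.
DECLARE @department_sample TABLE (
    dept_name VARCHAR(20),
    building  VARCHAR(15),
    budget    NUMERIC(12,2)
);

INSERT INTO @department_sample VALUES
    ('Dept A', 'Candlestick', 800000.00),
    ('Dept B', 'Candlestick', 300000.00),
    ('Dept C', 'Main',        250000.00),
    ('Dept D', 'Palmer',      600000.00);

-- HAVING: groups are built from all rows, then whole groups are filtered.
-- Candlestick (average 550,000) and Palmer (average 600,000) remain;
-- Main (average 250,000) is dropped.
SELECT building, AVG(budget) AS [average budget]
FROM @department_sample
GROUP BY building
HAVING AVG(budget) > 500000.00
ORDER BY building;

-- WHERE: individual rows are filtered first, then the survivors are grouped.
-- 'Dept B' and 'Dept C' are removed before averaging,
-- so Candlestick now averages 800,000 instead of 550,000.
SELECT building, AVG(budget) AS [average budget]
FROM @department_sample
WHERE budget > 500000.00
GROUP BY building
ORDER BY building;
GO
```

Both queries return Candlestick and Palmer here, but with different averages for Candlestick, because ``WHERE`` changes which tuples enter each group while ``HAVING`` only decides which finished groups are kept.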


### Nested subqueries

SQL provides a mechanism for nesting subqueries. A subquery is a **select-from-where expression that is nested within another query**.
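
As a first illustration, the following sketch nests one query inside the ``WHERE`` clause of another. It assumes the ``course`` and ``department`` tables loaded above and uses the building name ``Palmer`` taken from the earlier output; it is one possible example, not the only way to write this query.

```sql
-- Titles of courses offered by departments housed in the Palmer building.
-- The inner query returns the matching department names;
-- the outer query keeps the courses whose dept_name is in that set.
SELECT title, dept_name
FROM course
WHERE dept_name IN (SELECT dept_name
                    FROM department
                    WHERE building = 'Palmer');
GO
```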