A relational database:

- real-life entities become tables
- redundancy is reduced
- data integrity is ensured through relationships

Three concepts to learn:

- constraints
- keys
- referential integrity

In [None]:
-- Query the right table in information_schema
SELECT table_name 
FROM information_schema.tables -- meta-database with MULTIPLE tables
-- Specify the correct table_schema value
WHERE table_schema = 'public';

Now have a look at the columns in university_professors by selecting all entries in information_schema.columns that correspond to that table.

In [None]:
-- Query the right table in information_schema to get columns
SELECT column_name, data_type 
FROM information_schema.columns 
WHERE table_name = 'university_professors' AND table_schema = 'public';

Find the number of columns in the table university_professors.

In [None]:
SELECT COUNT(*)
FROM information_schema.columns
WHERE table_name = 'university_professors' AND table_schema = 'public';

CREATE TABLE

In [None]:
-- Create a table for the professors entity type
CREATE TABLE professors (
 firstname text,
 lastname text
); -- Note the semicolon after this

-- Print the contents of this table
SELECT * 
FROM professors;

ALTER TABLE, RENAME COLUMN

In [None]:
-- Add the university_shortname column
ALTER TABLE professors
ADD COLUMN university_shortname TEXT;

-- Print the contents of this table
SELECT * 
FROM professors;

In [None]:
-- Rename the organisation column
ALTER TABLE affiliations
RENAME COLUMN organisation TO organization;

-- Delete the university_shortname column
ALTER TABLE affiliations
DROP COLUMN university_shortname;

INSERT INTO, SELECT DISTINCT

In [None]:
-- Insert unique professors into the new table
INSERT INTO professors 
SELECT DISTINCT firstname, lastname, university_shortname 
FROM university_professors;

In [None]:
-- Delete the university_professors table
DROP TABLE university_professors;

CASTING

In [None]:
CREATE TABLE weather (
temperature integer,
wind_speed text);

-- WRONG: wind_speed is text, so it cannot be multiplied by an integer
SELECT temperature * wind_speed AS wind_chill
FROM weather;

-- RIGHT: cast wind_speed to integer first
SELECT temperature * CAST(wind_speed AS integer) AS wind_chill
FROM weather;

In [None]:
-- Calculate the net amount as amount + fee
SELECT transaction_date, amount + CAST(fee AS integer) AS net_amount 
FROM transactions;

In [None]:
-- Specify the correct fixed-length character type
ALTER TABLE professors
ALTER COLUMN university_shortname
TYPE char(3);

In [None]:
-- Change the type of firstname
ALTER TABLE professors
ALTER COLUMN firstname
TYPE varchar(64);

SUBSTRING

Convert types USING a function
If you don't want to reserve too much space for a certain varchar column, you can truncate the values before converting its type.

For this, you can use the following syntax:

ALTER TABLE table_name
ALTER COLUMN column_name
TYPE varchar(x)
USING SUBSTRING(column_name FROM 1 FOR x)


You should read it like this: Because you want to reserve only x characters for column_name, you have to retain a SUBSTRING of every value, i.e. the first x characters of it, and throw away the rest. This way, the values will fit the varchar(x) requirement.
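The truncation step on its own can be tried out in a small sketch. It assumes SQLite through Python's built-in sqlite3 module instead of PostgreSQL, so `substr(value, 1, 16)` stands in for `SUBSTRING(value FROM 1 FOR 16)`, and an UPDATE replaces the ALTER COLUMN ... TYPE ... USING statement, which SQLite does not offer. The sample names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE professors (firstname TEXT)")
conn.executemany(
    "INSERT INTO professors VALUES (?)",
    [("Karl",), ("Maximilian-Alexander-Friedrich",)],  # made-up sample values
)

# substr(value, 1, 16) keeps the first 16 characters and throws away the rest,
# which is what SUBSTRING(value FROM 1 FOR 16) does in PostgreSQL.
conn.execute("UPDATE professors SET firstname = substr(firstname, 1, 16)")

names = [row[0] for row in conn.execute("SELECT firstname FROM professors ORDER BY rowid")]
print(names)
```

Short values are left untouched; only values longer than 16 characters lose their tail.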

Now use SUBSTRING() to reduce firstname to 16 characters so its type can be altered to varchar(16).

In [None]:
-- Convert the values in firstname to a max. of 16 characters
ALTER TABLE professors 
ALTER COLUMN firstname
TYPE varchar(16)
USING SUBSTRING(firstname FROM 1 FOR 16);

not-null and unique constraints

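e.g. ssn integer NOT NULL,

NULL != NULL: comparing NULL with NULL never evaluates to true (the result is NULL, i.e. unknown), so a UNIQUE column may still contain several NULLs.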
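A minimal sketch of how NULL comparisons interact with a unique constraint. It assumes SQLite through Python's built-in sqlite3 module rather than PostgreSQL (both treat NULLs as distinct by default), and the students table here is a hypothetical one-column example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL evaluates to NULL (unknown); Python receives it as None.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

conn.execute("CREATE TABLE students (ssn INTEGER UNIQUE)")  # hypothetical table
conn.execute("INSERT INTO students VALUES (NULL)")
conn.execute("INSERT INTO students VALUES (NULL)")  # accepted: two NULLs are not equal
conn.execute("INSERT INTO students VALUES (123)")

try:
    conn.execute("INSERT INTO students VALUES (123)")  # duplicate non-NULL value
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(null_equals_null, duplicate_rejected, row_count)
```

This is why a column that should serve as a key usually needs NOT NULL in addition to UNIQUE.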

ADD CONSTRAINT 

In [None]:
-- Disallow NULL values in firstname
ALTER TABLE professors 
ALTER COLUMN firstname SET NOT NULL;

In [None]:
-- Make universities.university_shortname unique
ALTER TABLE universities
ADD CONSTRAINT university_shortname_unq UNIQUE(university_shortname); -- university_shortname_unq is name of constraint

keys and superkeys, key constraints

key = a minimal set of attributes that identifies a record uniquely

superkey = any set of attributes that identifies a record uniquely; some of its attributes may still be removable without losing uniqueness

minimal superkey (= key, candidate key) = a superkey from which no attribute can be removed without losing uniqueness
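The definitions above can be sketched as a brute-force search in plain Python. The rows and column names are hypothetical, and the check only looks at this sample data: a real key is a statement about all records the table may ever hold, not only the ones present.

```python
from itertools import combinations

# Hypothetical sample rows; column names are for illustration only.
rows = [
    {"license_no": "A1", "make": "Ford", "model": "Mustang"},
    {"license_no": "B2", "make": "Ford", "model": "Focus"},
    {"license_no": "C3", "make": "VW", "model": "Golf"},
]
columns = list(rows[0])


def is_superkey(cols):
    """Return True if the value combinations in cols are distinct across all rows."""
    seen = {tuple(row[c] for c in cols) for row in rows}
    return len(seen) == len(rows)


# Every column combination that identifies each sample row uniquely.
superkeys = [
    set(combo)
    for size in range(1, len(columns) + 1)
    for combo in combinations(columns, size)
    if is_superkey(combo)
]

# A key (minimal superkey) has no proper subset that is itself a superkey.
keys = [sk for sk in superkeys if not any(other < sk for other in superkeys)]
print(superkeys)
print(keys)
```

In this sample, the combination of make and model is a superkey but not a key, because model alone already identifies each row.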

primary keys - most important concept, chosen by you from the candidate keys

one primary key per database table

- multiple columns can make up a primary key, but ideally it is a single column

multiple ways to specify a primary key
- ALTER TABLE table_name ADD CONSTRAINT constraint_name PRIMARY KEY (col1)
- PRIMARY KEY (col1, col2) at the end of CREATE TABLE
- product_no integer PRIMARY KEY in a column definition

In [None]:
--setting up a primary key
-- Rename the organization column to id
ALTER TABLE organizations
RENAME COLUMN organization TO id;

-- Make id a primary key
ALTER TABLE organizations
ADD CONSTRAINT organization_pk PRIMARY KEY (id);

In [None]:
-- Rename the university_shortname column to id
ALTER TABLE universities
RENAME COLUMN university_shortname TO id;

-- Make id a primary key
ALTER TABLE universities
ADD CONSTRAINT university_pk PRIMARY KEY (id);

surrogate key with serial data type

ADD COLUMN id serial PRIMARY KEY;

In [None]:
-- Add the new column to the table
ALTER TABLE professors 
ADD COLUMN id serial;

-- Make id a primary key
ALTER TABLE professors 
ADD CONSTRAINT professors_pkey PRIMARY KEY (id);

-- Have a look at the first 10 rows of professors
SELECT * FROM professors LIMIT 10;

UPDATE and SET

In [None]:
-- Count the number of distinct rows with columns make, model
SELECT COUNT(DISTINCT(make, model)) 
FROM cars;

-- Add the id column
ALTER TABLE cars
ADD COLUMN id varchar(128);

-- Update id with make + model
UPDATE cars
SET id = CONCAT(make, model);

-- Make id a primary key
ALTER TABLE cars
ADD CONSTRAINT id_pk PRIMARY KEY(id);

-- Have a look at the table
SELECT * FROM cars;

Exercise:
Let's think of an entity type "student". A student has:

- a last name consisting of up to 128 characters (required),
- a unique social security number, consisting only of integers, that should serve as a key,
- a phone number of fixed length 12, consisting of numbers and characters (but some students don't have one).

In [None]:
-- Create the table
CREATE TABLE students (
  last_name varchar(128) NOT NULL,
  ssn integer PRIMARY KEY,
  phone_no char(12)
);

## Implementing relationships with foreign keys

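- a foreign key (FK) points to the primary key (PK) of another table
- the domain of the FK must be equal to the domain of the PK
- each value of the FK must exist in the PK of the other table (FK constraint or 'referential integrity')
- FKs are not actual keys, because duplicates and NULL values are allowed

Foreign keys are created with the REFERENCES keyword, followed by the referenced table and its primary key column in parentheses, e.g. REFERENCES universities (id).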

Add a foreign key on university_id column in professors that references the id column in universities.
Name this foreign key professors_fkey.

In [None]:
-- Rename the university_shortname column
ALTER TABLE professors
RENAME COLUMN university_shortname TO university_id;

-- Add a foreign key on professors referencing universities
ALTER TABLE professors 
ADD CONSTRAINT professors_fkey FOREIGN KEY (university_id) REFERENCES universities (id);

JOIN tables linked by a foreign key

JOIN professors with universities on professors.university_id = universities.id, i.e., retain all records where the foreign key of professors is equal to the primary key of universities.

Filter for university_city = 'Zurich'.

In [None]:
-- Select all professors working for universities in the city of Zurich
SELECT professors.lastname, universities.id, universities.university_city
FROM professors
JOIN universities
ON professors.university_id = universities.id
WHERE universities.university_city = 'Zurich';

In [None]:
CREATE TABLE manufacturers (
name varchar(255) PRIMARY KEY);

INSERT INTO manufacturers
VALUES ('Ford'), ('VW');

CREATE TABLE cars (
model varchar(255) PRIMARY KEY,
manufacturer_name varchar(255) REFERENCES manufacturers (name));
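With the two tables above in place, the foreign key rejects cars whose manufacturer is not listed. A minimal sketch, assuming SQLite through Python's built-in sqlite3 module instead of PostgreSQL (SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`); the inserted car rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE manufacturers (name varchar(255) PRIMARY KEY)")
conn.execute("INSERT INTO manufacturers VALUES ('Ford'), ('VW')")
conn.execute(
    "CREATE TABLE cars ("
    " model varchar(255) PRIMARY KEY,"
    " manufacturer_name varchar(255) REFERENCES manufacturers (name))"
)

# 'Ford' exists in manufacturers, so this row is accepted.
conn.execute("INSERT INTO cars VALUES ('Ranger', 'Ford')")

# 'Toyota' does not exist in manufacturers, so the foreign key rejects the row.
try:
    conn.execute("INSERT INTO cars VALUES ('Tundra', 'Toyota')")
    violation = None
except sqlite3.IntegrityError as exc:
    violation = str(exc)

car_models = [row[0] for row in conn.execute("SELECT model FROM cars")]
print(violation, car_models)
```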

Add a professor_id column with integer data type to affiliations, and declare it to be a foreign key that references the id column in professors.

In [None]:
-- Add a professor_id column
ALTER TABLE affiliations
ADD COLUMN professor_id integer REFERENCES professors (id);

Add a foreign key constraint on organization_id so that it references the id column in organizations.

Syntax:

ADD CONSTRAINT constraint_name FOREIGN KEY (column_name) REFERENCES other_table_name (other_column_name)

In [None]:
-- Rename the organization column to organization_id
ALTER TABLE affiliations
RENAME COLUMN organization TO organization_id;

-- Add a foreign key on organization_id
ALTER TABLE affiliations
ADD CONSTRAINT affiliations_organization_fkey FOREIGN KEY (organization_id) REFERENCES organizations (id);

Syntax:

UPDATE table_a
SET column_to_update = table_b.column_to_update_from
FROM table_b
WHERE condition1 AND condition2 AND ...;

Update the professor_id column with the corresponding value of the id column in professors.

"Corresponding" means rows in professors where the firstname and lastname are identical to the ones in affiliations.

In [None]:
-- Set professor_id to professors.id where firstname, lastname correspond to rows in professors
UPDATE affiliations
SET professor_id = professors.id
FROM professors
WHERE affiliations.firstname = professors.firstname AND affiliations.lastname = professors.lastname;

In [None]:
-- DROPPING

-- Drop the firstname column
ALTER TABLE affiliations
DROP COLUMN firstname;

-- Drop the lastname column
ALTER TABLE affiliations
DROP COLUMN lastname;

REFERENTIAL INTEGRITY
- a record referencing another table must refer to an existing record in that table

in other words, a record in table a cannot point to a record in table b that does not exist

- specified between two tables
- enforced through foreign keys, i.e. an error is thrown when a referenced record would be deleted or when a referencing record would point to a record that does not exist

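DEALING WITH VIOLATIONS
- option 1 (the default): throw an error, e.g.

b_id integer REFERENCES b(id) ON DELETE NO ACTION

- option 2:

b_id integer REFERENCES b(id) ON DELETE CASCADE

- deleting a record in table b automatically deletes the records in table a that reference it

ON DELETE...

- NO ACTION: throw an error (the default)
- CASCADE: delete all referencing records
- RESTRICT: throw an error (like NO ACTION, except that the check cannot be deferred)
- SET NULL: set the referencing column to NULL
- SET DEFAULT: set the referencing column to its default value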
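The ON DELETE options can be compared side by side in a small sketch. It assumes SQLite through Python's built-in sqlite3 module instead of PostgreSQL, and uses the hypothetical tables a and b from the notes above; the helper name `remaining_after_delete` is made up for this example.

```python
import sqlite3


def remaining_after_delete(on_delete):
    """Delete the referenced row in b; return the rows left in a, or "error"."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
    conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE a (id INTEGER PRIMARY KEY,"
        f" b_id INTEGER REFERENCES b (id) ON DELETE {on_delete})"
    )
    conn.execute("INSERT INTO b VALUES (1)")
    conn.execute("INSERT INTO a VALUES (10, 1)")  # row in a references b.id = 1
    try:
        conn.execute("DELETE FROM b WHERE id = 1")
    except sqlite3.IntegrityError:
        return "error"
    return conn.execute("SELECT id, b_id FROM a").fetchall()


results = {
    option: remaining_after_delete(option)
    for option in ("NO ACTION", "RESTRICT", "CASCADE", "SET NULL")
}
print(results)
```

NO ACTION and RESTRICT both refuse the deletion, CASCADE removes the referencing row, and SET NULL keeps the row but clears its reference.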

There is a foreign key on professors.university_id that references universities.id, so <b>referential integrity is said to hold from professors to universities.</b>

Have a look at the existing foreign key constraints by querying table_constraints in information_schema.

Delete the affiliations_organization_id_fkey foreign key constraint in affiliations.

Add a new foreign key to affiliations that CASCADEs deletion if a referenced record is deleted from organizations. Name it affiliations_organization_id_fkey.

Run the DELETE and SELECT queries to double check that the deletion cascade actually works.

In [None]:
-- Identify the correct constraint name
SELECT constraint_name, table_name, constraint_type
FROM information_schema.table_constraints
WHERE constraint_type = 'FOREIGN KEY';

-- Drop the right foreign key constraint
ALTER TABLE affiliations
DROP CONSTRAINT affiliations_organization_id_fkey;

-- Add a new foreign key constraint from affiliations to organizations which cascades deletion
ALTER TABLE affiliations
ADD CONSTRAINT affiliations_organization_id_fkey FOREIGN KEY (organization_id) REFERENCES organizations (id) ON DELETE CASCADE;

-- Below will cascade deletion where id='CUREM'
-- Delete an organization 
DELETE FROM organizations 
WHERE id = 'CUREM';

-- Check that no more affiliations with this organization exist -- Gives empty query
SELECT * FROM affiliations
WHERE organization_id = 'CUREM';

Count the number of total affiliations by university.
Sort the result by that count, in descending order.

In [None]:
SELECT COUNT(*), professors.university_id 
FROM affiliations
JOIN professors
ON affiliations.professor_id = professors.id
-- Group by the university ids of professors
GROUP BY professors.university_id 
ORDER BY count DESC;

Join all the tables in the database, starting with affiliations.

In [None]:
-- Join all tables
SELECT *
FROM affiliations
JOIN professors
ON affiliations.professor_id = professors.id
JOIN organizations
ON affiliations.organization_id = organizations.id
JOIN universities
ON professors.university_id = universities.id;

Now group the result by organization sector, professor, and university city.

Count the resulting number of rows.

In [None]:
-- Group the table by organization sector, professor ID and university city
SELECT COUNT(*), organizations.organization_sector, 
professors.id, universities.university_city
FROM affiliations
JOIN professors
ON affiliations.professor_id = professors.id
JOIN organizations
ON affiliations.organization_id = organizations.id
JOIN universities
ON professors.university_id = universities.id
GROUP BY organizations.organization_sector, 
professors.id, universities.university_city;

Only retain rows with "Media & communication" as organization sector, and sort the table by count, in descending order.

In [None]:
-- Filter the table and sort it
SELECT COUNT(*), organizations.organization_sector, 
professors.id, universities.university_city
FROM affiliations
JOIN professors
ON affiliations.professor_id = professors.id
JOIN organizations
ON affiliations.organization_id = organizations.id
JOIN universities
ON professors.university_id = universities.id
WHERE organizations.organization_sector = 'Media & communication'
GROUP BY organizations.organization_sector, 
professors.id, universities.university_city
ORDER BY count DESC;