In [None]:
# exercise 01

"""
Get to know SELECT COUNT DISTINCT

Your database doesn't have any defined keys so far, and you don't know which columns or combinations of columns are suited as keys.

There's a simple way of finding out whether a certain column (or a combination) contains only unique values – and thus identifies the records in the table.

You already know the SELECT DISTINCT query from the first chapter. Now you just have to wrap everything within the COUNT() function and PostgreSQL will return the number of unique rows for the given columns:

SELECT COUNT(DISTINCT(column_a, column_b, ...))
FROM table;

"""

# Instructions

"""
First, find out the number of rows in universities.
---
Then, find out how many unique values there are in the university_city column.
"""

# solution

-- Count the number of rows in universities
SELECT COUNT(*)
FROM universities;

#----------------------------------#

-- Count the number of distinct values in the university_city column
SELECT COUNT(DISTINCT(university_city))
FROM universities;

#----------------------------------#

# Conclusion

"""
Great! So, obviously, the university_city column wouldn't lend itself as a key. Why? Because there are only 9 distinct values, but the table has 11 rows.
"""
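
The comparison behind this conclusion (total rows versus distinct values) can be sketched as a small helper. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, the sample rows are made up, and `is_unique` is a hypothetical helper name, not part of the course.

```python
import sqlite3

# Hypothetical miniature of the universities table; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE universities (university_shortname TEXT, university_city TEXT)"
)
conn.executemany(
    "INSERT INTO universities VALUES (?, ?)",
    [("ETH", "Zurich"), ("UZH", "Zurich"), ("EPF", "Lausanne"), ("UBE", "Bern")],
)


def is_unique(conn, table, column):
    """Return True if `column` holds a different value in every row of `table`."""
    # Identifiers cannot be bound as query parameters, so they are
    # interpolated here; only pass trusted table and column names.
    total, distinct = conn.execute(
        f"SELECT COUNT(*), COUNT(DISTINCT {column}) FROM {table}"
    ).fetchone()
    return total == distinct


print(is_unique(conn, "universities", "university_city"))       # False: Zurich appears twice
print(is_unique(conn, "universities", "university_shortname"))  # True
```

A column can only serve as a key when both counts match, which is the same check the two queries above perform by hand.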


In [1]:
# exercise 02

"""
Identify keys with SELECT COUNT DISTINCT

There's a very basic way of finding out what qualifies for a key in an existing, populated table:

    Count the distinct records for all possible combinations of columns. If the resulting number x equals the number of all rows in the table for a combination, you have discovered a superkey.

    Then remove one column after another until you can no longer remove columns without seeing the number x decrease. If that is the case, you have discovered a (candidate) key.

The table professors has 551 rows. It has only one possible candidate key, which is a combination of two attributes. You might want to try different combinations using the "Run code" button. Once you have found the solution, you can submit your answer.
"""

# Instructions

"""
Using the above steps, identify the candidate key by trying out different combinations of columns.
"""

# solution

-- Try out different combinations
SELECT COUNT(DISTINCT(firstname, lastname)) 
FROM professors;

#----------------------------------#

# Conclusion

"""
Indeed, the only combination that uniquely identifies professors is {firstname, lastname}. {firstname, lastname, university_shortname} is a superkey, and all other combinations give duplicate values. Hopefully, the concept of superkeys and keys is now a bit more clear. Let's move on to primary keys!
"""
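
The trial-and-error search from this exercise can also be automated. The sketch below is one possible approach, not the course's method: it uses sqlite3 instead of PostgreSQL, invented sample rows, and hypothetical helper names (`count_distinct`, `candidate_keys`). It tries combinations from smallest to largest and skips any combination that already contains a smaller key, which yields the same minimal keys as removing columns from a superkey one at a time.

```python
import sqlite3
from itertools import combinations

# Hypothetical sample data; the real professors table has 551 rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE professors (firstname TEXT, lastname TEXT, university_shortname TEXT)"
)
conn.executemany(
    "INSERT INTO professors VALUES (?, ?, ?)",
    [
        ("Anna", "Meier", "ETH"),
        ("Anna", "Keller", "ETH"),
        ("Beat", "Meier", "UZH"),
        ("Beat", "Keller", "ETH"),
    ],
)


def count_distinct(conn, table, columns):
    """Count distinct value combinations of `columns` (trusted names only)."""
    cols = ", ".join(columns)
    # SQLite has no COUNT(DISTINCT (a, b)), so count over a DISTINCT subquery.
    query = f"SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM {table})"
    return conn.execute(query).fetchone()[0]


def candidate_keys(conn, table, columns):
    """Return all minimal column combinations that identify every row."""
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    keys = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            # A combination containing a known key is a superkey, not minimal.
            if any(set(key) <= set(combo) for key in keys):
                continue
            if count_distinct(conn, table, combo) == total:
                keys.append(combo)
    return keys


columns = ["firstname", "lastname", "university_shortname"]
print(candidate_keys(conn, "professors", columns))
```

Note that a key found this way only holds for the rows currently in the table; whether it stays unique for future data is a modelling decision.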


# Identify the primary key

Have a look at the example table from the previous video. As the database designer, you have to make a wise choice as to which column should be the primary key.

## Vehicle Information Table

| license_no | serial_no | make | model | year |
|---|---|---|---|---|
| Texas ABC-739 | A69352 | Ford | Mustang | 2 |
| Florida TVP-347 | B43696 | Oldsmobile | Cutlass | 5 |
| New York MPO-22 | X83554 | Oldsmobile | Delta | 1 |
| California 432-TFY | C43742 | Mercedes | 190-D | 99 |
| California RSK-629 | Y82935 | Toyota | Camry | 4 |
| Texas RSK-629 | U028365 | Jaguar | XJS | 4 |


Which of the following column or column combinations could best serve as primary key?

### Possible Answers


- PK = {make}
- PK = {model, year}
- PK = {license_no} (correct answer)
- PK = {year, make}

**A primary key consisting solely of "license_no" is probably the wisest choice, as license numbers are certainly unique across all registered cars in a country.**

In [2]:
# exercise 03

"""
ADD key CONSTRAINTs to the tables

Two of the tables in your database already have well-suited candidate keys consisting of one column each: organizations and universities with the organization and university_shortname columns, respectively.

In this exercise, you'll rename these columns to id using the RENAME COLUMN command and then specify primary key constraints for them. This is as straightforward as adding unique constraints (see the last exercise of Chapter 2):

ALTER TABLE table_name
ADD CONSTRAINT some_name PRIMARY KEY (column_name);

Note that you can also specify more than one column in the brackets.
"""

# Instructions

"""
    Rename the organization column to id in organizations.
    Make id a primary key and name it organization_pk.
---
    Rename the university_shortname column to id in universities.
    Make id a primary key and name it university_pk.

"""

# solution

-- Rename the organization column to id
ALTER TABLE organizations
RENAME COLUMN organization TO id;

-- Make id a primary key
ALTER TABLE organizations
ADD CONSTRAINT organization_pk PRIMARY KEY (id);

#----------------------------------#

-- Rename the university_shortname column to id
ALTER TABLE universities
RENAME COLUMN university_shortname TO id;

-- Make id a primary key
ALTER TABLE universities
ADD CONSTRAINT university_pk PRIMARY KEY (id);

#----------------------------------#

# Conclusion

"""
Good job! That was easy, wasn't it? Let's tackle the last table that needs a primary key right now: professors. However, things are going to be different this time, because you'll add a so-called surrogate key.
"""
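
To see what a primary key constraint actually enforces, here is a minimal sketch using sqlite3 as a stand-in for PostgreSQL. Two assumptions to flag: SQLite cannot add a primary key with ALTER TABLE, so the constraint is declared when the table is created, and the `sector` column, the sample rows, and the `try_insert` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Unlike PostgreSQL, SQLite cannot add a primary key to an existing table,
# so the constraint is part of CREATE TABLE here. NOT NULL is spelled out
# because SQLite does not imply it for non-integer primary keys.
conn.execute(
    """
    CREATE TABLE organizations (
        id TEXT NOT NULL,
        sector TEXT,
        CONSTRAINT organization_pk PRIMARY KEY (id)
    )
    """
)
conn.execute("INSERT INTO organizations VALUES ('ACME', 'Industry')")


def try_insert(conn, org_id, sector):
    """Insert a row; return False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO organizations VALUES (?, ?)", (org_id, sector))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(conn, "Globex", "Industry"))  # True: new id
print(try_insert(conn, "ACME", "Education"))   # False: duplicate id
print(try_insert(conn, None, "Education"))     # False: id must not be NULL
```

In PostgreSQL, a primary key combines the unique and not-null constraints in the same way, so both rejected inserts would fail there too.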


In [3]:
# exercise 04

"""
Add a SERIAL surrogate key

Since there's no single column candidate key in professors (only a composite key candidate consisting of firstname, lastname), you'll add a new column id to that table.

This column has a special data type serial, which turns the column into an auto-incrementing number. This means that, whenever you add a new professor to the table, it will automatically get an id that does not exist yet in the table: a perfect primary key!
"""

# Instructions

"""
Add a new column id with data type serial to the professors table.
---
Make id a primary key and name it professors_pkey.
---
Write a query that returns all the columns and 10 rows from professors.
"""

# solution

-- Add the new column to the table
ALTER TABLE professors 
ADD COLUMN id serial;

-- Make id a primary key
ALTER TABLE professors 
ADD CONSTRAINT professors_pkey PRIMARY KEY (id);

-- Have a look at the first 10 rows of professors
SELECT * FROM professors LIMIT 10;

#----------------------------------#

# Conclusion

"""
Well done. As you can see, PostgreSQL has automatically numbered the rows with the id column, which now functions as a (surrogate) primary key - it uniquely identifies professors.
"""
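
The auto-numbering behaviour can be reproduced in a small, self-contained sketch. It assumes sqlite3 as a stand-in, where `INTEGER PRIMARY KEY AUTOINCREMENT` plays roughly the role of PostgreSQL's serial type; the sample names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY AUTOINCREMENT is SQLite's closest analogue to a
# PostgreSQL serial column that is also the primary key.
conn.execute(
    """
    CREATE TABLE professors (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        firstname TEXT,
        lastname TEXT
    )
    """
)

# No id is supplied; the database assigns one per row.
conn.executemany(
    "INSERT INTO professors (firstname, lastname) VALUES (?, ?)",
    [("Anna", "Meier"), ("Beat", "Keller")],
)

# A second professor with the same name still gets a fresh id.
cursor = conn.execute(
    "INSERT INTO professors (firstname, lastname) VALUES (?, ?)",
    ("Anna", "Meier"),
)
print(cursor.lastrowid)  # 3

rows = conn.execute(
    "SELECT id, firstname, lastname FROM professors ORDER BY id"
).fetchall()
print(rows)
```

The two rows named Anna Meier show why a surrogate key is useful: the id distinguishes them even though no combination of the natural columns does.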


In [4]:
# exercise 05

"""
CONCATenate columns to a surrogate key

Another strategy to add a surrogate key to an existing table is to concatenate existing columns with the CONCAT() function.

Let's think of the following example table:

CREATE TABLE cars (
 make varchar(64) NOT NULL,
 model varchar(64) NOT NULL,
 mpg integer NOT NULL
)

The table is populated with 10 rows of completely fictional data.

Unfortunately, the table doesn't have a primary key yet. None of the columns consists of only unique values, so you have to combine several columns to form a key.

In the course of the following exercises, you will combine make and model into such a surrogate key.
"""

# Instructions

"""
Count the number of distinct rows with a combination of the make and model columns.
---
Add a new column id with the data type varchar(128).
---
Concatenate make and model into id using an UPDATE table_name SET column_name = ... query and the CONCAT() function.
---
Make id a primary key and name it id_pk.
"""

# solution

-- Count the number of distinct rows with columns make, model
SELECT COUNT(DISTINCT(make, model)) 
FROM cars;

-- Add the id column
ALTER TABLE cars
ADD COLUMN id varchar(128);

-- Update id with make + model
UPDATE cars
SET id = CONCAT(make, model);

-- Make id a primary key
ALTER TABLE cars
ADD CONSTRAINT id_pk PRIMARY KEY(id);

-- Have a look at the table
SELECT * FROM cars;

#----------------------------------#

# Conclusion

"""
Good job! These were quite some steps, but you've managed! Let's look into another method of adding a surrogate key now.
"""
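
The same steps can be run end to end in a minimal sketch. Assumptions: sqlite3 stands in for PostgreSQL, the `||` operator replaces CONCAT() (which older SQLite versions lack), a unique index stands in for the primary key constraint because SQLite cannot add one to an existing table, and the four rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cars (make TEXT NOT NULL, model TEXT NOT NULL, mpg INTEGER NOT NULL)"
)
# Invented rows; the course table holds 10 rows of fictional data.
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?)",
    [
        ("Subaru", "Forester", 24),
        ("Opel", "Astra", 45),
        ("Opel", "Vectra", 40),
        ("Ford", "Avenger", 30),
    ],
)

# Add the id column and fill it with make followed by model.
conn.execute("ALTER TABLE cars ADD COLUMN id TEXT")
conn.execute("UPDATE cars SET id = make || model")

# Check that the new column really identifies every row.
total, distinct_ids = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT id) FROM cars"
).fetchone()
print(total == distinct_ids)  # True

# Stand-in for ADD CONSTRAINT id_pk PRIMARY KEY (id).
conn.execute("CREATE UNIQUE INDEX id_pk ON cars (id)")

ids = [row[0] for row in conn.execute("SELECT id FROM cars ORDER BY id")]
print(ids)
```

Because the values are joined without a separator, different pairs could in principle produce the same string, so comparing the counts before adding the constraint is a cheap safeguard.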


In [5]:
# exercise 06

"""
Test your knowledge before advancing

Before you move on to the next chapter, let's quickly review what you've learned so far about attributes and key constraints. If you're unsure about the answer, please quickly review chapters 2 and 3, respectively.

Let's think of an entity type "student". A student has:

    a last name consisting of up to 128 characters (required),
    a unique social security number, consisting only of digits, that should serve as a key,
    a phone number of fixed length 12, consisting of numbers and characters (but some students don't have one).

"""

# Instructions

"""

    Given the above description of a student entity, create a table students with the correct column types.
    Add a PRIMARY KEY for the social security number ssn.

Note that there is no formal length requirement for the integer column. The application would have to make sure it's a correct SSN!
"""

# solution

-- Create the table
CREATE TABLE students (
  last_name VARCHAR(128) NOT NULL,
  ssn INTEGER PRIMARY KEY,
  phone_no CHAR(12)
);

#----------------------------------#

# Conclusion

"""
Great! Looks like you are ready for the last chapter of this course, where you'll connect tables in your database.
"""
