# SQL Part 5

<img src="https://media3.giphy.com/media/UqeH2KKx0U65oETdDR/source.gif" width="300" height="300" />

**What is a record in a table?**

*A row of data*

**What is a primary key?**

*The column in a table that has a unique value for every record/row*

**What is a foreign key?**

*A column in one table whose values come from the primary key column of another table, so that each value identifies a row in that other table*

**What is a composite key?**

*A primary key that is made up of the combination of 2 or more columns*

# Schemas

A schema is a collection of database objects (e.g. tables, indexes, views, stored procedures, functions, etc.). A few things to remember about schemas:
* A database may have multiple schemas
* To reference an object, you will need to use schema_name.object_name (e.g. schema_name.table_name)
* Tables from different schemas may have the same name (e.g. sales.employees and hr.employees)
* The default schema for a newly created database is 'dbo'
* You can think of schemas like separate folders or namespaces

Schemas are used to organize tables into logical groups within the database. They also let database administrators give people access to only the schemas they need (e.g. if you are doing analysis on sales, you only need access to the 'Sales' schema), which follows the best practice of 'least privilege'.
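As a rough illustration of the points above, the sketch below creates two schemas, puts a table with the same name in each, and grants read access on only one of them. The schema names, table columns, and the `sales_analyst` role are all hypothetical, and `GO` is the batch separator used by tools such as SSMS and sqlcmd (CREATE SCHEMA has to be the only statement in its batch).

```sql
-- Hypothetical schema, table, and role names throughout.
CREATE SCHEMA sales;
GO
CREATE SCHEMA hr;
GO

-- Two tables with the same name can coexist because they live in different schemas.
CREATE TABLE sales.employees (
    employee_id INT PRIMARY KEY,
    region VARCHAR(50)
);

CREATE TABLE hr.employees (
    employee_id INT PRIMARY KEY,
    hire_date DATE
);

-- Least privilege: the role can read everything in 'sales' and nothing in 'hr'.
CREATE ROLE sales_analyst;
GRANT SELECT ON SCHEMA::sales TO sales_analyst;
```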

Organizations often use schema diagrams to visualize the relationships between the different tables in a database:

<img src="Schema Visualization.png" width="700" height="700" />

*visual from https://www.sqlservertutorial.net/sql-server-sample-database/*

# Creating a New Table

### Syntax

In [None]:
CREATE TABLE database_name.schema_name.table_name (
    column_name1 data_type PRIMARY KEY column_constraint,
    column_name2 data_type column_constraint,
    ...
    column_nameN data_type column_constraint,
    table_constraints
)

If no database name is specified, database_name defaults to the current database.

### Primary Key 

**When the primary key is made up of one column** : You just have to add 'PRIMARY KEY' keywords after the datatype when specifying that column
* To have the table automatically generate a unique primary key id whenever a new record is added, use 'IDENTITY(first_value, increment_value)' in the column_constraint section of this column. If you don't specify a first_value and increment_value, they will both default to 1.
* **<span style="color:orange">NOTE</span>** : If someone tries to insert a record into the table and the operation fails for some reason, that value generated by IDENTITY will still be 'used up' and will not be generated again

**When the primary key is made up of more than one column** : In the table_constraints use 'PRIMARY KEY (columnA, columnB)'. In this case neither columnA nor columnB has to contain unique values on its own, but each combination of columnA + columnB must be unique. *Remember that this is also called a **composite primary key**.*

**<span style="color:orange">NOTE</span>** : All columns that participate in the primary key are automatically defined to have a NOT NULL column constraint.
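To make the two cases concrete, here is a minimal sketch with one auto-generated single-column key and one composite key. The table and column names are hypothetical and not part of any sample database.

```sql
-- Single-column primary key, generated automatically: starts at 1, increases by 1.
CREATE TABLE dbo.visits (
    visit_id INT IDENTITY(1, 1) PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    visited_at DATETIME2
);

-- Composite primary key: order_id and item_id may each repeat,
-- but every (order_id, item_id) pair must be unique.
CREATE TABLE dbo.order_items (
    order_id INT,
    item_id INT,
    quantity INT NOT NULL,
    PRIMARY KEY (order_id, item_id)
);
```

Neither `order_id` nor `item_id` is declared NOT NULL here; they become NOT NULL because they participate in the primary key.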

A table doesn't *have* to have a primary key defined when it is created. One can be added later using ALTER TABLE, as long as the column(s) involved are already defined as NOT NULL:

In [None]:
ALTER TABLE database_name.schema_name.table_name
ADD PRIMARY KEY (column_name)
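A small sketch of adding the key after the fact, using a hypothetical `dbo.events` table. Note that `event_id` is declared NOT NULL up front, since SQL Server will not add a primary key on a nullable column.

```sql
-- Hypothetical table created without a primary key.
CREATE TABLE dbo.events (
    event_id INT NOT NULL,
    event_name VARCHAR(100)
);

-- Add the primary key later; naming the constraint is optional.
ALTER TABLE dbo.events
ADD CONSTRAINT PK_events PRIMARY KEY (event_id);
```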

### Data Types

* **INT** : stores an integer value (e.g. 50)
* **DECIMAL/DEC/NUMERIC(p,s)** : stores an exact, fixed-precision number (e.g. 50.5). p is the maximum total number of digits (1 to 38, default 18) and s is the number of digits after the decimal point (0 to p, default 0)
* **VARCHAR(n)** : stores variable-length strings (e.g. 'Abraham Lincoln'), where n is the maximum string length (up to 8,000, default is 1)
* **DATE** : stores a date value. The default format is YYYY-MM-DD
* **TIME** : stores a time value. The default format is hh:mm:ss
* **DATETIME2** : stores a specific time and date. The default format is YYYY-MM-DD hh:mm:ss
    - **<span style="color:orange">NOTE</span>** : DATETIME2 is recommended over DATETIME (one reason is that DATETIME only allows you to go back as far as 1753)

WARNING : if someone tries to add a record with values that do not match the data type or data type requirements (e.g. a string with length longer than the 'n' specified in VARCHAR) they will get an error

*These are the most common data types. For a complete list of possible data types see https://www.sqlservertutorial.net/sql-server-basics/sql-server-data-types/*
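The sketch below uses each of the data types listed above in one hypothetical table and inserts a single row whose values fit the declared types.

```sql
-- Hypothetical table that uses each data type from the list above.
CREATE TABLE dbo.type_demo (
    quantity INT,
    price DECIMAL(10, 2),        -- up to 10 digits in total, 2 after the decimal point
    customer_name VARCHAR(50),   -- strings of up to 50 characters
    order_date DATE,
    order_time TIME,
    created_at DATETIME2
);

INSERT INTO dbo.type_demo (quantity, price, customer_name, order_date, order_time, created_at)
VALUES (50, 50.50, 'Abraham Lincoln', '2024-01-15', '13:45:00', '2024-01-15 13:45:00');
```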

### Column Constraints

* **NOT NULL** : requires that the values for this column are not NULL
    - If someone attempts to add a record with a value for this column that is NULL they will get an error
    - **<span style="color:orange">NOTE</span>** : Always written as column constraints and not as table constraints

* **UNIQUE** : requires that each value for this column is unique
    - If someone attempts to add a record with a value for this column that already exists in another record they will get an error
    - Unlike the PRIMARY KEY constraint, UNIQUE allows a NULL value (but only once...)

* **CHECK** : allows you to specify a boolean expression that the values for the column must satisfy with the syntax 'CHECK(boolean expression)'
    - A 'boolean expression' just means that the expression must evaluate to either True or False
    - With a CHECK constraint, if someone tries to insert a record with a value for this column that would cause the boolean expression to evaluate to False they will get an error
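    - **<span style="color:orange">NOTE</span>** : adding a record with a NULL value for a column with a CHECK constraint will not cause an error, because the expression then evaluates to UNKNOWN rather than False. Add NOT NULL as well if NULLs should be rejected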
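As a sketch of how these column constraints look together, the hypothetical table below combines NOT NULL, UNIQUE, and CHECK. The inserts that would be rejected are left commented out so the script runs cleanly.

```sql
-- Hypothetical table combining the three column constraints.
CREATE TABLE dbo.products_demo (
    product_id INT IDENTITY(1, 1) PRIMARY KEY,
    product_name VARCHAR(100) NOT NULL,
    sku VARCHAR(20) UNIQUE,
    list_price DECIMAL(10, 2) CHECK (list_price >= 0)
);

-- Accepted: every constraint is satisfied.
INSERT INTO dbo.products_demo (product_name, sku, list_price)
VALUES ('Widget', 'W-1', 9.99);

-- Accepted: a NULL list_price makes the CHECK expression UNKNOWN, not False.
INSERT INTO dbo.products_demo (product_name, sku, list_price)
VALUES ('Gadget', 'G-1', NULL);

-- Each of these would raise an error if uncommented:
-- INSERT INTO dbo.products_demo (product_name, sku, list_price) VALUES (NULL, 'X-1', 1.00);      -- NOT NULL
-- INSERT INTO dbo.products_demo (product_name, sku, list_price) VALUES ('Widget 2', 'W-1', 1.00); -- UNIQUE
-- INSERT INTO dbo.products_demo (product_name, sku, list_price) VALUES ('Gizmo', 'Z-1', -5.00);   -- CHECK
```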

### Table Constraints

* **FOREIGN KEY** : when you want to specify that a specific column is a foreign key, you use 'FOREIGN KEY (column_nameA) REFERENCES schema.parent_table_name (column_nameB)' 
    - When you specify a foreign key, you will get an error if you try to add a record with a value for the foreign key column that does not exist in the column that you linked it to
    - Foreign key links also allow you to specify, using the ON DELETE and ON UPDATE clauses, what should happen to rows in this table when the primary key they point to in the parent table is deleted or updated (to maintain data integrity across tables)
    - **<span style="color:orange">NOTE</span>** : Even though column_nameA and column_nameB should refer to columns with the same kinds of value, they do not have to be named the same thing

* **PRIMARY KEY** : see creating a composite primary key in the Primary Key section above

* **UNIQUE** : allows you to specify that a combination of columns must be unique (similar to the idea of a composite key) with 'UNIQUE (column_nameA, column_nameB)'

* **CHECK** : allows you to specify a boolean expression that applies to more than one column (e.g. 'CHECK(column_nameA > column_nameB)')
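The table constraints above might be combined as in the following sketch. Both tables are hypothetical, and `ON DELETE CASCADE` is just one of the available actions, chosen here for illustration.

```sql
-- Hypothetical parent table.
CREATE TABLE dbo.customers_demo (
    customer_id INT PRIMARY KEY,
    email VARCHAR(255)
);

-- Hypothetical child table with table-level constraints.
CREATE TABLE dbo.orders_demo (
    order_id INT PRIMARY KEY,
    cust_id INT,                 -- named differently from the column it references
    order_date DATE,
    shipped_date DATE,
    -- Deleting a customer also deletes that customer's orders.
    FOREIGN KEY (cust_id) REFERENCES dbo.customers_demo (customer_id) ON DELETE CASCADE,
    -- A customer can have at most one order per day.
    UNIQUE (cust_id, order_date),
    -- An order cannot ship before it was placed.
    CHECK (shipped_date >= order_date)
);
```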

### Assigning Names to Constraints

When a constraint is added to a column or table, it automatically gets assigned a 'constraint name'. This is the name that appears in the error message if someone tries to add a record that violates the constraint (e.g. *'Violation of UNIQUE KEY constraint 'UQ_table_name_AB234958BSDGW'. Cannot insert a duplicate key in object 'schema_name.table_name' The duplicate value is (duplicate_value)'*). If you want a more helpful constraint name, you can specify 'CONSTRAINT constraint_name' right before any column or table constraint. This makes error messages easier to understand and lets you refer to the constraint by name if you modify it later.

In [None]:
CREATE TABLE database_name.schema_name.table_name (
    column_name1 data_type PRIMARY KEY column_constraint,
    column_name2 data_type CONSTRAINT constraint_name column_constraint,
    ...
    column_nameN data_type CONSTRAINT constraint_name column_constraint,
    CONSTRAINT constraint_name table_constraint
)
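For example, a hypothetical table with named constraints could look like the sketch below. The `PK_`, `UQ_`, and `CK_` prefixes are a common naming habit, not a requirement.

```sql
-- Hypothetical table in which every constraint has a readable name.
CREATE TABLE dbo.staff_demo (
    staff_id INT CONSTRAINT PK_staff_demo PRIMARY KEY,
    email VARCHAR(255) CONSTRAINT UQ_staff_demo_email UNIQUE,
    salary DECIMAL(10, 2) CONSTRAINT CK_staff_demo_salary CHECK (salary > 0)
);

-- The name makes it easy to refer to the constraint later, e.g. to remove it.
ALTER TABLE dbo.staff_demo
DROP CONSTRAINT CK_staff_demo_salary;
```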

# Creating A Table Based On Another Table

### Syntax

In [None]:
SELECT 
    other_table_name.column_name(s) AS new_column_name(s)
INTO
    schema_name.new_table_name
FROM 
    other_table(s)
WHERE 
    condition(s)

In the SELECT clause you can either specify column names that you want to move over, or use '\*'.

If you want to change the name of the columns, use aliases ('AS')

**<span style="color:orange">NOTE</span>** : When you create a table from another table it will be populated with the records from the existing table. To avoid this, use 'WHERE 1=2' in the WHERE clause of the SELECT statement:

In [None]:
SELECT 
    other_table_name.column_name(s)
INTO
    new_table_name
FROM 
    other_table_name(s)
WHERE 
    1=2 -- Never evaluates to true, so no row meets the filter and the new table is created empty
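A concrete sketch of both variants, assuming a small hypothetical source table that is created first so the example is self-contained:

```sql
-- Hypothetical source table with a few rows.
CREATE TABLE dbo.source_customers (
    customer_id INT PRIMARY KEY,
    full_name VARCHAR(100),
    state_code VARCHAR(2)
);

INSERT INTO dbo.source_customers (customer_id, full_name, state_code)
VALUES (1, 'Ada Lovelace', 'CA'),
       (2, 'Grace Hopper', 'NY'),
       (3, 'Alan Turing', 'CA');

-- New table with two of the columns (one renamed) and only the matching rows.
SELECT
    customer_id,
    full_name AS customer_name
INTO
    dbo.ca_customers
FROM
    dbo.source_customers
WHERE
    state_code = 'CA';

-- New table with the same columns as the source but no rows.
SELECT
    *
INTO
    dbo.customers_empty
FROM
    dbo.source_customers
WHERE
    1 = 2;
```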

# Adding Records to a Table

### Syntax

In [None]:
INSERT INTO schema_name.table_name (column_name1, column_name2, ...)
VALUES (value1, value2, ...)

-- or

INSERT INTO schema_name.table_name 
VALUES (value1, value2, ...) -- Must have a value for every column, in the same order as the columns are defined in the table

**<span style="color:orange">NOTE</span>** : you won't specify a value for a primary key field that is auto-generated
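The sketch below shows both forms against a hypothetical table whose primary key is generated by IDENTITY, so `guest_id` is never supplied. The last statement uses a multi-row VALUES list, which is an addition to the syntax shown above.

```sql
-- Hypothetical table with an auto-generated primary key.
CREATE TABLE dbo.guests (
    guest_id INT IDENTITY(1, 1) PRIMARY KEY,
    guest_name VARCHAR(50) NOT NULL,
    party_size INT
);

-- Form 1: name the columns explicitly.
INSERT INTO dbo.guests (guest_name, party_size)
VALUES ('Ada', 2);

-- Form 2: no column list; values follow the table's column order,
-- skipping the IDENTITY column.
INSERT INTO dbo.guests
VALUES ('Grace', 4);

-- Several rows can be inserted in one statement.
INSERT INTO dbo.guests (guest_name, party_size)
VALUES ('Alan', 1),
       ('Edsger', 3);
```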

# Editing an Existing Record

### Syntax

In [None]:
UPDATE schema_name.table_name
SET column_name1 = value1,
    column_name2 = value2,
    ...
WHERE
    condition -- specifies which rows to update; if it matches a single primary key value, at most one row changes. Otherwise multiple rows could change

**<span style="color:red">Warning</span>** : if you leave out the WHERE clause, an UPDATE statement will modify all of the records/rows in a table
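A minimal sketch of a single-row and a multi-row update, using a hypothetical inventory table created just for the example:

```sql
-- Hypothetical table and rows.
CREATE TABLE dbo.inventory_demo (
    item_id INT PRIMARY KEY,
    item_name VARCHAR(50),
    quantity INT,
    price DECIMAL(10, 2)
);

INSERT INTO dbo.inventory_demo (item_id, item_name, quantity, price)
VALUES (1, 'Pen', 100, 1.50),
       (2, 'Notebook', 40, 3.25),
       (3, 'Stapler', 0, 8.00);

-- Filtering on the primary key: only the row with item_id = 1 changes.
UPDATE dbo.inventory_demo
SET quantity = 90,
    price = 1.75
WHERE
    item_id = 1;

-- Filtering on a non-key column: every row with fewer than 50 in stock changes.
UPDATE dbo.inventory_demo
SET price = price * 0.9
WHERE
    quantity < 50;
```

Running the WHERE condition in a SELECT first is a cheap way to confirm which rows an UPDATE will touch.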

# Deleting Records From a Table

### Syntax

In [None]:
DELETE FROM schema_name.table_name
WHERE 
    condition -- determines which rows to remove; if it matches a single primary key value, at most one row is deleted. Otherwise multiple rows could be deleted

**<span style="color:red">Warning</span>** : if you leave out the WHERE clause, a DELETE statement will delete all of the records/rows in a table
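The same idea applies to DELETE. The sketch below uses a hypothetical table and removes first one row by primary key, then every row matching a broader condition.

```sql
-- Hypothetical table and rows.
CREATE TABLE dbo.tasks_demo (
    task_id INT PRIMARY KEY,
    task_name VARCHAR(50),
    is_done INT
);

INSERT INTO dbo.tasks_demo (task_id, task_name, is_done)
VALUES (1, 'Write report', 1),
       (2, 'Review budget', 0),
       (3, 'Book travel', 1),
       (4, 'Send invoice', 1);

-- Filtering on the primary key: removes only the row with task_id = 2.
DELETE FROM dbo.tasks_demo
WHERE
    task_id = 2;

-- Filtering on a non-key column: removes every completed task.
DELETE FROM dbo.tasks_demo
WHERE
    is_done = 1;
```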

# Deleting a Table

### Syntax

In [None]:
DROP TABLE schema_name.table_name -- deletes the whole table

-- or

TRUNCATE TABLE schema_name.table_name -- deletes all of the data in a table (all of the records/rows) but not the table itself

**<span style="color:orange">NOTE</span>** : using TRUNCATE is faster than using DELETE without a WHERE clause to delete all of the records in a table
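To see the difference between the two statements, the following sketch empties a hypothetical table with TRUNCATE, confirms it still exists, and then removes it with DROP.

```sql
-- Hypothetical table and rows.
CREATE TABLE dbo.scratch_log (
    log_id INT PRIMARY KEY,
    note_text VARCHAR(200)
);

INSERT INTO dbo.scratch_log (log_id, note_text)
VALUES (1, 'first entry'),
       (2, 'second entry');

-- Remove every row; the table and its column definitions remain.
TRUNCATE TABLE dbo.scratch_log;

-- Still a valid query, because the table exists (it is just empty).
SELECT COUNT(*) AS remaining_rows
FROM dbo.scratch_log;

-- Remove the table itself; querying it after this raises an error.
DROP TABLE dbo.scratch_log;
```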

# Temporary Tables

Temporary tables exist temporarily on the server. They are useful for storing information that you are going to access multiple times (instead of re-querying for that set of data each time).

### Syntax

In [None]:
SELECT
    column_name(s)
INTO
    #temporary_table_name
FROM 
    existing_table_name
WHERE
    condition(s)

-- or

CREATE TABLE #temporary_table_name (
    column_name1 data_type PRIMARY KEY column_constraint,
    column_name2 data_type column_constraint,
    ...
    column_nameN data_type column_constraint,
    table_constraints
)

Once you have created a temporary table you can find it under **System Databases > tempdb > Temporary Tables**. You can then perform operations on this table like any other table (e.g. SELECT, INSERT INTO, etc.)

**<span style="color:orange">NOTE</span>** : A temporary table can only be accessed from the connection that created it. So if you open up a new notebook or query window you won't be able to access the same temporary table and would have to create a new one. A temporary table is also deleted automatically when you close the connection that created it
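As a sketch of the typical workflow, the example below builds a temporary table from a hypothetical sales table and then queries it twice without going back to the source. All table and column names are assumptions made for illustration.

```sql
-- Hypothetical source table and rows.
CREATE TABLE dbo.sales_demo (
    sale_id INT PRIMARY KEY,
    region VARCHAR(20),
    amount DECIMAL(10, 2)
);

INSERT INTO dbo.sales_demo (sale_id, region, amount)
VALUES (1, 'West', 120.00),
       (2, 'East', 75.50),
       (3, 'West', 310.25);

-- Store the subset of interest once; the # prefix makes the table temporary.
SELECT
    region,
    amount
INTO
    #west_sales
FROM
    dbo.sales_demo
WHERE
    region = 'West';

-- Reuse the stored subset as many times as needed.
SELECT COUNT(*) AS sale_count FROM #west_sales;
SELECT SUM(amount) AS total_amount FROM #west_sales;

-- Optional: drop it explicitly instead of waiting for the connection to close.
DROP TABLE #west_sales;
```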