This is a repository for the Advanced SQL course
- SQL stands for Structured Query Language
- SQL is used to communicate with a database
- SQL is an ANSI (American National Standards Institute) standard
- SQL is the most popular database language today
- SQL is a standard language for accessing and manipulating databases
- SQL can execute queries against a database
- SQL can retrieve data from a database
- SQL can insert records in a database
- SQL can update records in a database
- SQL can delete records from a database
- SQL can create new databases
- SQL can create new tables in a database
- SQL can create stored procedures in a database
- SQL can create views in a database
- SQL can set permissions on tables, procedures, and views
- SQL can create indexes on tables
- SQL can create triggers on tables
- SQL can create user-defined functions in a database
- SQL can create user-defined aggregate functions in a database
- SQL can create user-defined data types in a database
- Database mining is the process of discovering patterns in large databases.
- Database mining is also known as knowledge discovery in databases (KDD).
- Database mining is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cuts costs, or both.
- Database mining is used by companies that want to market new products to their existing customers or by companies that want to attract new customers.
- Database mining is also used to determine which customers are likely to respond to a particular marketing campaign.
- Tables: A table is a collection of related data entries and it consists of columns and rows.
- Views: A view is a virtual table based on the result-set of an SQL statement.
- Indexes: An index is a way to quickly locate data in a database.
- Triggers: A trigger is a set of SQL statements that are automatically executed when a specified event occurs in a database.
- Procedures: A procedure is a set of SQL statements that are stored in the database and can be executed by using a procedure name.
- Functions: A function is a set of SQL statements that are stored in the database and can be executed by using a function name.
- Cursors: A cursor is a database object that enables you to traverse the rows of a result-set one row at a time.
- Constraints: A constraint is a rule that is enforced on a column or a table.
- Rules: Specify acceptable values that can be entered into a table.
- Synonyms: A synonym is an alternative name for a database object.
- Packages: A package is a collection of database objects that can be stored in the database and reused in other packages or applications.
- Database Links: A database link is a connection to another database.
- Sequences: A sequence is a database object that generates a series of numbers.
- Types: A type is a collection of attributes that define the structure of a table.
- Domains: A domain is a collection of values that define the structure of a column.
- Operators: An operator is a symbol that tells the database how to perform a calculation.
- Expressions: An expression is a combination of one or more values, operators, and SQL functions that results in a value.
- Clusters: A cluster is a group of database objects that are stored together on disk.
- Windows Authentication: Windows Authentication is a method of authentication that uses the Windows operating system to verify the identity of a user.
-
Identify the entities
-
Identify the relationships
-
Identify the attributes of each entity
-
Identify the primary key of each entity
-
Identify the foreign key of each entity
-
some concepts: 1 - Entity: An entity is a person, place, object, or event about which data is stored. 2 - Attribute: An attribute is a characteristic of an entity. 3 - Relationship: A relationship is a connection between two or more entities. 4 - Primary Key: A primary key is a column or a combination of columns that uniquely identifies each row in a table. 5 - Foreign Key: A foreign key is a column or a combination of columns that uniquely identifies a row/record in another table.
-
some rules: 1 - Each entity must have a primary key. 2 - Each entity must have at least one attribute.
-
Attributes can be: 1 - Single-valued: A single-valued attribute can have only one value for each entity. 2 - Multi-valued: A multi-valued attribute can have more than one value for each entity. 3 - Derived: A derived attribute is a value that is calculated from other values. 4 - Composite: A composite attribute is an attribute that is made up of other attributes. 5 - Derived and Composite: A derived and composite attribute is an attribute that is calculated from other attributes. 6 - Identifiers: An identifier is a unique value that identifies an entity.
- Rows contain data attributes of the relation
- Cells of the table hold a single value (no arrays)
- Column are attributes of the relation
- All entries in a column are of the same domain (physical and logical)
- Each attribute has a unique name
- No two tuples may be identical (must have a valid primary key)
- No repeating groups - more than one attribute from same physical and logical domain (ex. author1 author2, etc.)
- Crow's Foot Notation is a notation used to represent the relationships between tables in a database.
- Crow's Foot Notation is also known as Chen's Notation.
- Crow's Foot Notation is used to represent the relationships between tables in a database.
- The database should be normalized to the highest possible level.
- The database should be designed to be as flexible as possible.
- The database should be designed to be as easy to use as possible.
- The ALTER TABLE statement is also used to add and drop various constraints on an existing table.
- The ALTER TABLE statement is also used to add and drop various objects on an existing table.
- 2NF: stands for Second Normal Form
- 2NF: is a normal form used to normalize a database table.
- Normalization is a database design technique that reduces data redundancy and improves data integrity.
- Normalization is a process used to organize a database into tables and columns.
- Redundancy: infromation repeated in multiple locations
- Update: failure to change all instances of a specific value.
- Delete: delete data ane lose other values as side effect
- Insert [
- need to add data about multiple themes into a relation
- adding rows may force users to add infromation about another theme ]
- Functional Dependencies: is a relationship between two sets of attributes where changing the value of one set of attributes determines the value of the other set of attributes.
- A set of attributes can form a functional dependency: A,B,C → D, A,B,C → D does not mean A → D, BD, and C → D
- All three attributes are needed to determine the value of D
- A,B,C → D is a functional dependency
-
A relation is in second normal form
- if it is in first normal form and all of its non-key attributes are dependent on the entire primary key example: book(ISBN,name,date_publish) ISBN->name,date_publish)
-
Non-key attribute: attribute that is not part of the primary key
-
A relation is in second normal form if it is in first normal form and all of its non-key attributes are dependent on the entire primary key.
-
Dependent: Attributes appearing on the right side of a functional dependency
-
Independent: Attributes appearing on the left side of a functional dependency
-
Violation: partial dependency - when you have a non-key attribute that is dependent on only part of a composite pk (primary key)
-- book(ISBN, name,date_publish,authorname, address)
-- ISBN, authorname>name,date_publish,address)
-- authorname>address
CREATE TABLE books (
ISBN VARCHAR(20) NOT NULL,
name VARCHAR(50) NOT NULL,
date_publish DATE NOT NULL,
authorname VARCHAR(50) NOT NULL,
PRIMARY KEY (ISBN)
);
CREATE TABLE authors (
authorname VARCHAR(50) NOT NULL,
address VARCHAR(50) NOT NULL,
PRIMARY KEY (authorname)
);
INSERT INTO books VALUES ('1234567890', 'The C Programming Language', '1978-02-22', 'Dennis Ritchie');
INSERT INTO books VALUES ('1234567891', 'The C++ Programming Language', '1985-10-15', 'Bjarne Stroustrup');
INSERT INTO books VALUES ('1234567892', 'The Java Programming Language', '1995-05-23', 'James Gosling');
INSERT INTO books VALUES ('1234567893', 'The Python Programming Language', '1991-02-20', 'Guido van Rossum');
INSERT INTO authors VALUES ('Dennis Ritchie', 'USA');
INSERT INTO authors VALUES ('Bjarne Stroustrup', 'Denmark');
INSERT INTO authors VALUES ('James Gosling', 'Canada');
INSERT INTO authors VALUES ('Guido van Rossum', 'Netherlands');
- A relation is in third normal form
- if it in 2NF and is free from transitive dependencies
- Transitive Dependency Format:
- AttributeA → AttributeB
- Attribute B → AttributeC
- AttributeA → AttributeB
- Violation: The 2nd half of a Transitive Dependency: a non-key attribute is functionally dependent on another non-key attribute
- AttributeB → AttributeC
-- employee(id, name, age,companyname, address,city)
-- id, comapnyname->name,age,address, city
-- companyname->address,city
CREATE TABLE company (
companyname VARCHAR(50) NOT NULL,
address VARCHAR(50) NOT NULL,
city VARCHAR(50) NOT NULL,
PRIMARY KEY (companyname)
);
CREATE TABLE employee (
id INT NOT NULL,
name VARCHAR(50) NOT NULL,
age INT NOT NULL,
companyname VARCHAR(50) NOT NULL,
PRIMARY KEY (id)
);
INSERT INTO company VALUES ('Google', '1600 Amphitheatre Parkway', 'Mountain View');
INSERT INTO company VALUES ('Microsoft', 'One Microsoft Way', 'Redmond');
INSERT INTO company VALUES ('Apple', '1 Infinite Loop', 'Cupertino');
INSERT INTO employee VALUES (1, 'John', 30, 'Google');
INSERT INTO employee VALUES (2, 'Mary', 25, 'Microsoft');
INSERT INTO employee VALUES (3, 'Steve', 40, 'Apple');
INSERT INTO employee VALUES (4, 'Bill', 35, 'Microsoft');
INSERT INTO employee VALUES (5, 'Ram', 45, 'Google');
INSERT INTO employee VALUES (6, 'Ron', 50, 'Apple');
- AS: is a SQL clause that allows you to rename a column or table using an alias.
- LIMIT: is a clause that lets you specify the maximum number of rows the query will return.
- TOP: is a Transact-SQL clause that restricts the number of rows returned by a SELECT statement.
SELECT name AS 'Employee Name', age AS 'Employee Age' FROM employee;
SELECT name AS 'Employee Name', age AS 'Employee Age' FROM employee LIMIT 3;
SELECT TOP 3 name AS 'Employee Name', age AS 'Employee Age' FROM employee;
SELECT * FROM employee LIMIT 3;
SELECT TOP 3 * FROM employee;
- AND: is a logical operator that displays a record if all the conditions separated by AND are TRUE.
- OR: is a logical operator that displays a record if any of the conditions separated by OR is TRUE.
- NOT: is a logical operator that displays a record if the condition(s) is NOT TRUE.
SELECT * FROM employee WHERE age > 30 AND age < 50;
SELECT * FROM employee WHERE age > 30 OR age < 50;
SELECT * FROM employee WHERE NOT age > 30;
- JOIN: is a SQL clause that is used to combine rows from two or more tables, based on a related column between them.
- INNER JOIN: is a SQL clause that returns all rows from multiple tables where the join condition is met.
- LEFT JOIN: is a SQL clause that returns all rows from the left table (table1), even if there are no matches in the right table (table2).
- RIGHT JOIN: is a SQL clause that returns all rows from the right table (table2), even if there are no matches in the left table (table1).
- FULL JOIN: is a SQL clause that returns all rows when there is a match in either left or right table.
create table employees(
id int not null auto_increment,
name varchar(30) not null,
phone varchar(20) not null,
salary int not null check(salary>100),
primary key(id)
);
create table customers(
id int not null auto_increment,
name varchar(30) not null,
phone varchar(20) not null,
state varchar(30) not null,
primary key(id)
);
create table orders(
id int not null auto_increment,
customer_id int not null,
employee_id int not null,
order_date date not null,
primary key(id),
foreign key(customer_id) references customers(id),
foreign key(employee_id) references employees(id)
);
insert into employees(name,phone,salary) values
('John','1234567890',1000, 'TX'),
('Mary','1234567891',2000, 'NY'),
('Steve','1234567892',3000, 'NW'),
('Bill','1234567893',4000, 'CA'),
('Ram','1234567894',5000, 'TX'),
('Ron','1234567895',6000, 'TX');
insert into customers(name,phone,state) values('bob','1234567890','CA'),
('alice','1234567891','NY'),
('jim','1234567892','TX'),
('jane','1234567893','CA'),
('joe','1234567894','NY');
insert into orders(customer_id,employee_id,order_date) values(1,1,'2019-01-01'),
(2,2,'2019-01-02'),
(3,3,'2019-01-03'),
(4,4,'2019-01-04'),
(5,5,'2019-01-05'),
(1,2,'2019-01-06'),
(2,3,'2019-01-07');
select * from employees inner join orders on employees.id=orders.employee_id;
-- preview
+----+-------+------------+--------+----+-------------+-------------+------------+
| id | name | phone | salary | id | customer_id | employee_id | order_date |
+----+-------+------------+--------+----+-------------+-------------+------------+
| 1 | John | 1234567890 | 1000 | 1 | 1 | 1 | 2019-01-01 |
| 2 | Mary | 1234567891 | 2000 | 2 | 2 | 2 | 2019-01-02 |
| 2 | Mary | 1234567891 | 2000 | 6 | 1 | 2 | 2019-01-06 |
| 3 | Steve | 1234567892 | 3000 | 3 | 3 | 3 | 2019-01-03 |
| 3 | Steve | 1234567892 | 3000 | 7 | 2 | 3 | 2019-01-07 |
| 4 | Bill | 1234567893 | 4000 | 4 | 4 | 4 | 2019-01-04 |
| 5 | Ram | 1234567894 | 5000 | 5 | 5 | 5 | 2019-01-05 |
+----+-------+------------+--------+----+-------------+-------------+------------+
select * from employees left join orders on employees.id=orders.employee_id;
-- run the code
+----+-------+------------+--------+------+-------------+-------------+------------+
| id | name | phone | salary | id | customer_id | employee_id | order_date |
+----+-------+------------+--------+------+-------------+-------------+------------+
| 1 | John | 1234567890 | 1000 | 1 | 1 | 1 | 2019-01-01 |
| 2 | Mary | 1234567891 | 2000 | 2 | 2 | 2 | 2019-01-02 |
| 2 | Mary | 1234567891 | 2000 | 6 | 1 | 2 | 2019-01-06 |
| 3 | Steve | 1234567892 | 3000 | 3 | 3 | 3 | 2019-01-03 |
| 3 | Steve | 1234567892 | 3000 | 7 | 2 | 3 | 2019-01-07 |
| 4 | Bill | 1234567893 | 4000 | 4 | 4 | 4 | 2019-01-04 |
| 5 | Ram | 1234567894 | 5000 | 5 | 5 | 5 | 2019-01-05 |
| 6 | Ron | 1234567895 | 6000 | NULL | NULL | NULL | NULL |
+----+-------+------------+--------+------+-------------+-------------+------------+
select name,orders.id as order_id,orders.order_date from employees right join orders on employees.id = orders.employee_id
-- run the code
+-------+----------+------------+
| name | order_id | order_date |
+-------+----------+------------+
| John | 1 | 2019-01-01 |
| Mary | 2 | 2019-01-02 |
| Steve | 3 | 2019-01-03 |
| Bill | 4 | 2019-01-04 |
| Ram | 5 | 2019-01-05 |
| Mary | 6 | 2019-01-06 |
| Steve | 7 | 2019-01-07 |
+-------+----------+------------+
-- using union
select name,phone,state from customers
union
select name,phone,state from employees;
-- run the code
+-------+------------+-------+
| name | phone | state |
+-------+------------+-------+
| bob | 1234567890 | CA |
| alice | 1234567891 | NY |
| jim | 1234567892 | TX |
| jane | 1234567893 | CA |
| joe | 1234567894 | NY |
| john | 1234567890 | CA |
| mary | 1234567891 | NY |
| peter | 1234567892 | TX |
| paul | 1234567893 | CA |
| lisa | 1234567894 | NY |
+-------+------------+-------+
- BETWEEN: is a SQL operator that selects values within a given range.
create table employees(
id int not null auto_increment,
name varchar(30) not null,
age int not null,
primary key(id)
);
insert into employees(name,age) values('John',30),
('alice', 33),
('bob', 25),
('jim', 40),
('jane', 35),
('joe', 50),
('mary', 45),
('peter', 55),
('paul', 60),
('lisa', 65);
SELECT * FROM employee WHERE age BETWEEN 30 AND 50;
-- run the code
+----+-------+-----+
| id | name | age |
+----+-------+-----+
| 1 | John | 30 |
| 2 | alice | 33 |
| 5 | jane | 35 |
| 6 | joe | 50 |
| 7 | mary | 45 |
+----+-------+-----+
- LIKE: is a SQL operator that is used to search for a specific pattern in a column.
SELECT * FROM employee WHERE name LIKE 'J%';
-- run the code
+----+-------+-----+
| id | name | age |
+----+-------+-----+
| 1 | John | 30 |
| 3 | jim | 40 |
| 4 | jane | 35 |
| 5 | joe | 50 |
+----+-------+-----+
- IN: is a SQL operator that allows you to specify multiple values in a WHERE clause.
SELECT * FROM employee WHERE name IN ('John', 'Mary');
-- run the code
+----+-------+-----+
| id | name | age |
+----+-------+-----+
| 1 | John | 30 |
| 7 | mary | 45 |
+----+-------+-----+
SELECT * FROM employee WHERE age BETWEEN 30 AND 50;
SELECT * FROM employee WHERE name LIKE 'J%';
SELECT * FROM employee WHERE name IN ('John', 'Mary');
- HAVING: is a SQL clause that was added to SQL because the WHERE keyword could not be used with aggregate functions.
- WHERE: is a SQL clause that is used to filter records.
- GROUP BY: is a clause in SQL that is only used with aggregate functions to group the result-set by one or more columns.
SELECT companyname, COUNT(*) AS 'Number of Employees' FROM employee GROUP BY companyname HAVING COUNT(*) > 1;
SELECT companyname, COUNT(*) AS 'Number of Employees' FROM employee GROUP BY companyname WHERE COUNT(*) > 1;
- ORDER BY: is a clause in SQL that is used to sort the data in ascending or descending order in the result-set.
- DISTINCT: is a clause in SQL that is used to return only distinct (different) values.
- UNION: is a SQL clause that is used to combine the result-set of two or more SELECT statements.
SELECT DISTINCT city FROM company ORDER BY city;
SELECT DISTINCT city FROM company ORDER BY city DESC;
SELECT name FROM employee UNION SELECT companyname FROM company;
- INTERSECT: is a SQL clause that is used to return only the common records that exist in two or more SELECT statements.
- EXCEPT: is a SQL clause that is used to return only the records that are not common in two or more SELECT statements.
SELECT name FROM employee INTERSECT SELECT companyname FROM company;
SELECT name FROM employee EXCEPT SELECT companyname FROM company;
- INSERT INTO SELECT: is a SQL statement that is used to insert data into a table from another table.
- INSERT INTO VALUES: is a SQL statement that is used to insert data into a table.
INSERT INTO employee (name, age, companyname) SELECT name, age, companyname FROM company;
INSERT INTO employee (name, age, companyname) VALUES ('John', 30, 'Apple');
- UPDATE: is a SQL statement that is used to update existing records in a table.
- DELETE: is a SQL statement that is used to delete records from a database table.
UPDATE employee SET age = 40 WHERE name = 'John';
DELETE FROM employee WHERE name = 'John';
- CREATE TABLE: is a SQL statement that is used to create a new table in a database.
- ALTER TABLE: is a SQL statement that is used to add, delete, or modify columns in an existing table.
- DROP TABLE: is a SQL statement that is used to delete an entire table from the database.
- CREATE INDEX: is a SQL statement that is used to create an index (search key) on a table.
- DROP INDEX: is a SQL statement that is used to delete an index.
- CREATE VIEW: is a SQL statement that is used to create a virtual table based on the result-set of an SQL statement.
- DROP VIEW: is a SQL statement that is used to delete a view.
- CREATE DATABASE: is a SQL statement that is used to create a new database in SQL Server.
- ALTER DATABASE: is a SQL statement that is used to add or remove files from a database.
- DROP DATABASE: is a SQL statement that is used to delete an entire database.
- CREATE PROCEDURE: is a SQL statement that is used to create a stored procedure in SQL Server.
- ALTER PROCEDURE: is a SQL statement that is used to modify an existing stored procedure.
- DROP PROCEDURE: is a SQL statement that is used to delete a stored procedure.
- CREATE FUNCTION: is a SQL statement that is used to create a user-defined function in SQL Server.
- ALTER FUNCTION: is a SQL statement that is used to modify an existing user-defined function.
- DROP FUNCTION: is a SQL statement that is used to delete a user-defined function.
- CREATE TRIGGER: is a SQL statement that is used to create a trigger in SQL Server, what is trigger? A trigger is a special type of stored procedure that is automatically executed when an event occurs in the database.
- ALTER TRIGGER: is a SQL statement that is used to modify an existing trigger.
- DROP TRIGGER: is a SQL statement that is used to delete a trigger.
- CREATE SCHEMA: is a SQL statement that is used to create a new schema in SQL Server, What is Schema? A schema is a logical container for objects in a database.
- ALTER SCHEMA: is a SQL statement that is used to modify an existing schema.
- DROP SCHEMA: is a SQL statement that is used to delete a schema.
- CREATE ROLE: is a SQL statement that is used to create a new role in SQL Server, What is Role? A role is a collection of users and permissions.
- ALTER ROLE: is a SQL statement that is used to modify an existing role.
- DROP ROLE: is a SQL statement that is used to delete a role.
- CREATE USER: is a SQL statement that is used to create a new user in SQL Server, What is User? A user is a person who uses the database.
- ALTER USER: is a SQL statement that is used to modify an existing user.
- DROP USER: is a SQL statement that is used to delete a user.
- Count(): is a SQL aggregate function that is used to return the number of rows in a specified column.
- AVG(): is a SQL aggregate function that is used to return the average value of a numeric column.
- SUM(): is a SQL aggregate function that is used to return the sum of all the values in a numeric column.
- MIN(): is a SQL aggregate function that is used to return the smallest value of the selected column.
- MAX(): is a SQL aggregate function that is used to return the largest value of the selected column.
- ROUND(): is a SQL function that is used to round a numeric value to a specified number of decimal places.
- LOWER(): is a SQL function that is used to convert a string to lowercase.
- UPPER(): is a SQL function that is used to convert a string to uppercase.
- LEN(): is a SQL function that is used to return the length of a string.
- LEFT(): is a SQL function that is used to return a specified number of characters from the left side of a string.
- RIGHT(): is a SQL function that is used to return a specified number of characters from the right side of a string.
- SUBSTRING(): is a SQL function that is used to return a specified number of characters from a string.
- CASCADE: is a SQL constraint that specifies that if a record in a table is updated or deleted, then all the records in other tables that are related to the record in the first table should be updated or deleted as well.
-
Single row functions are used to perform operations on a single row of a table. The following table lists the single row functions in SQL Server.
-
Character Functions
- Lower(), Upper()
- Left(), Right(), Substring()
- Length()
- Concat()
- Charindex()
- Replace()
- Ltrim(), Rtrim(), Trim()
- Reverse()
- Soundex()
-
Numeric Functions
- Abs(), Sign()
- Ceiling(), Floor()
- Round(), Truncate()
- Power(), Square(), Sqrt()
- Rand(), Randbetween()
- Exp(), Log(), Log10()
- Pi(), Radians(), Degrees()
-
Date Functions
- Getdate(), Getutcdate()
- Dateadd(), Datediff()
- Datepart()
- Datename()
- Year(), Month(), Day()
- Hour(), Minute(), Second()
- Datefromparts(), Datetimefromparts()
- Datetime2fromparts(), Datetimeoffsetfromparts()
- Smalldatetimefromparts(), Timefromparts()
- Timestamfromparts()
-
Conversion Functions
- Cast(), Convert()
- Try_cast(), Try_convert()
- Parse(), Try_parse()
-
Aggregate Functions
- Avg(), Count(), Max(), Min(), Sum()
- Stdev(), Stdevp(), Var(), Varp()
- Corr(), Covar_pop(), Covar_samp()
- Regr_avgx(), Regr_avgy(), Regr_count(), Regr_intercept(), Regr_r2(), Regr_slope(), Regr_sxx(), Regr_sxy(), Regr_syy()
- Revert(), Revertbinary(), Revertvarbinary()
-
Other Functions
-
Multi row functions are used to perform operations on multiple rows of a table. The following table lists the multi row functions in SQL Server.
-
Analytic Functions
- Row_number(), Rank(), Dense_rank(), Percent_rank(), Cume_dist(), Ntile(), Lag(), Lead()
- First_value(), Last_value(), Nth_value()
- Avg(), Count(), Max(), Min(), Sum()
- Stdev(), Stdevp(), Var(), Varp()
- Corr(), Covar_pop(), Covar_samp()
- Regr_avgx(), Regr_avgy(), Regr_count(), Regr_intercept(), Regr_r2(), Regr_slope(), Regr_sxx(), Regr_sxy(), Regr_syy()
- Revert(), Revertbinary(), Revertvarbinary()
-
Table Functions
- Openrowset(), Openxml(), Openquery(), Opencursor()
- Openjson(), Openjsonb()
- Openquery(), Opencursor()
- Openrowset(), Openxml()
- Openjson(), Openjsonb()
-
XML Functions
- Xml_data_type()
- Xml_namespaces()
- Xml_parse()
- Xml_query()
- Xml_serialize()
- Xml_table()
- Xml_validate()
-
JSON Functions
- Json_data_type()
- Json_query()
- Json_value()
-
XML Table Functions
- Xml_data_type()
- Xml_namespaces()
- Xml_parse()
- Xml_query()
- Xml_serialize()
- Xml_table()
- Xml_validate()