# Database and SQL Preliminaries

**Dr. Pengfei Zhao**

Finance Mathematics Program, 

BNU-HKBU United International College

## 1. Database Management System (DBMS)

### What is a DBMS?

* A `DBMS` is a **system** for managing `databases`, in which sets of interrelated data are stored.
* Generally, databases can be categorized into two types, `relational databases` and `non-relational databases` (see the detailed difference [here](https://www.pluralsight.com/blog/software-development/relational-non-relational-databases)). The system that manages a relational database is called a `relational database management system`, abbreviated as `RDBMS`. We focus **only** on relational databases in this class.
* A relational database normally contains several `relations` (tables made up of rows and columns), each of which holds one specific aspect of the data/information about a particular enterprise.
* `RDBMS` = Collection of interrelated data (storage of relations) + Set of programs to manage the data.
* An RDBMS provides an environment that is both `convenient` and `efficient` to use.
* RDBMS Applications:
    * Banking: all transactions
    * Airlines: reservations, schedules
    * Universities:  registration, grades
    * Sales: customers, products, purchases
    * Manufacturing: production, inventory, orders, supply chain
    * Human resources:  employee records, salaries, tax deductions

### Common RDBMS Products

* Commercial (not free): Oracle, IBM (DB2, Universal Server), Microsoft (SQL Server, Access)...
* Free: MySQL, PostgreSQL, SQLite...

### DBMS vs File Systems

* In the early days, database applications were built on top of file systems.
* **Drawbacks** of using file systems to store data:
    * Data redundancy and inconsistency
    * Difficulty in accessing data 
    * Integrity problems
    * Difficulty supporting concurrent access by multiple users
    * Security problems
* A DBMS offers built-in solutions to the problems above.
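
As a rough illustration of the integrity point above, the sketch below lets the DBMS itself reject invalid rows, something a plain file would leave to every application that writes to it. It uses Python's built-in `sqlite3` module with an in-memory database; the `account` table, its columns, and the helper `try_insert` are hypothetical names chosen for this example only.

```python
import sqlite3

# In-memory SQLite database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        owner      TEXT NOT NULL,
        balance    REAL NOT NULL CHECK (balance >= 0)
    )
    """
)
conn.execute("INSERT INTO account VALUES (1, 'Alice', 100.0)")


def try_insert(row):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted_valid = try_insert((2, "Bob", 50.0))        # satisfies all constraints
accepted_negative = try_insert((3, "Carol", -10.0))  # violates the CHECK constraint
accepted_duplicate = try_insert((1, "Dave", 20.0))   # violates the PRIMARY KEY

row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(accepted_valid, accepted_negative, accepted_duplicate, row_count)
```

Only the two valid rows end up in the table. The rules are declared once in the schema, so they apply to every program that writes to the database.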

### Data Independence

* A major challenge in application development is keeping applications separate from the way their data is stored.
* Do I have to change my program when I...
    * replace my hard drive?
    * partition the data into two physical files (or merge two physical files into one)?
* Solution: introduce levels of abstraction.


 <img src="../Figures/DB/logic.png" width = "480" height = "300" alt="Levels of data abstraction" align=center />

* `Physical level:` describes how a record is stored on disk.
* `Logical level:` describes the data stored in the database and the relationships among the data.
* `View level:` defines a subset of the database for a particular application. Views can also hide information (e.g. salary) for security purposes.
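
The view level can be sketched with a small example. The code below is a minimal sketch, not a prescribed design: it uses Python's built-in `sqlite3` module, and the `employee` table and `employee_public` view are hypothetical names. The view exposes every column except `salary`.

```python
import sqlite3

# In-memory SQLite database; all table, view, and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Alice", "Risk", 9000.0), (2, "Bob", "Trading", 12000.0)],
)

# The view is a named query over the logical table that leaves out salary.
conn.execute(
    "CREATE VIEW employee_public AS SELECT emp_id, name, dept FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_public ORDER BY emp_id")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)  # ['emp_id', 'name', 'dept']
print(view_rows)     # [(1, 'Alice', 'Risk'), (2, 'Bob', 'Trading')]
```

SQLite has no user accounts, so this sketch only shows what the view exposes. In a server-based RDBMS, access would typically be granted on the view rather than on the underlying table.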


### An Example of Data Independence

 <img src="../Figures/DB/data_independence.png" width = "480" height = "300" alt="An example of data independence" align=center />
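
One way to see physical data independence in code (this is a separate sketch and not necessarily the scenario drawn in the figure) is to change how the data is accessed on disk while leaving the application's query untouched. The example assumes Python's built-in `sqlite3` module; the `trade` table, the index name, and the helper `total_quantity` are hypothetical.

```python
import sqlite3

# In-memory SQLite database; table, column, and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trade (trade_id INTEGER PRIMARY KEY, ticker TEXT, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO trade (ticker, quantity) VALUES (?, ?)",
    [("AAA", 100), ("BBB", 50), ("AAA", 25)],
)

# The application only knows this logical query.
QUERY = "SELECT SUM(quantity) FROM trade WHERE ticker = ?"


def total_quantity(ticker):
    """Run the unchanged application query for one ticker."""
    return conn.execute(QUERY, (ticker,)).fetchone()[0]


before = total_quantity("AAA")

# Physical-level change: add an index, which alters how rows are located.
conn.execute("CREATE INDEX idx_trade_ticker ON trade (ticker)")

after = total_quantity("AAA")
print(before, after)  # 125 125
```

The query text and its result are the same before and after the index is created. Only the access path chosen by the DBMS may differ.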

### Example of a `Relation`

 <img src="../Figures/DB/relation-term.png" width = "580" height = "350" alt="Example of a relation" align=center />
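
The standard relational vocabulary (a relation is a table, an attribute is a column, a tuple is a row) can also be inspected programmatically. The sketch below assumes Python's built-in `sqlite3` module and a hypothetical `student` relation. The figure above may label additional terms that are not shown here.

```python
import sqlite3

# In-memory SQLite database; the student relation is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Alice", "Finance"), (2, "Bob", "Statistics")],
)

cursor = conn.execute("SELECT * FROM student ORDER BY student_id")
attributes = [col[0] for col in cursor.description]  # column names
tuples = cursor.fetchall()                           # one Python tuple per row

print(attributes)  # ['student_id', 'name', 'major']
for row in tuples:
    print(row)
```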

## 2. Structured Query Language (SQL)

### What is SQL?

* SQL stands for Structured Query Language.
* SQL lets you access and manipulate databases.
* SQL is an ANSI (American National Standards Institute) standard.
* An RDBMS is a platform/system, and SQL is the language that runs on it: users and programs write SQL statements to tell the RDBMS how to define and manipulate data.

### What Can SQL Do?

* SQL can execute queries against a database
* SQL can retrieve data from a database
* SQL can insert records in a database
* SQL can update records in a database
* SQL can delete records from a database
* SQL can create new databases
* SQL can create new tables in a database
* SQL can create stored procedures in a database
* SQL can create views in a database
* SQL can set permissions on tables, procedures, and views
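
Several of the capabilities listed above (create a table, insert, update, delete, and retrieve records) can be sketched in a few statements. The example assumes Python's built-in `sqlite3` module as a convenient way to send SQL to an RDBMS; the `product` table and its contents are hypothetical.

```python
import sqlite3

# In-memory SQLite database; the product table is a hypothetical example.
conn = sqlite3.connect(":memory:")

# Create a new table.
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL)"
)

# Insert records.
conn.executemany(
    "INSERT INTO product (name, price) VALUES (?, ?)",
    [("pen", 1.5), ("notebook", 3.0), ("ruler", 2.0)],
)

# Update a record.
conn.execute("UPDATE product SET price = 3.5 WHERE name = 'notebook'")

# Delete a record.
conn.execute("DELETE FROM product WHERE name = 'ruler'")

# Retrieve the remaining records.
remaining = conn.execute(
    "SELECT name, price FROM product ORDER BY name"
).fetchall()
print(remaining)  # [('notebook', 3.5), ('pen', 1.5)]
```

Stored procedures and permissions are left out because SQLite does not support them. They are features of server-based systems such as MySQL, PostgreSQL, or Oracle.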

### SQL is a Standard - BUT....

* Although SQL is an ANSI (American National Standards Institute) standard, there are different versions of the SQL language (e.g. the SQL dialects used by Oracle and MySQL differ slightly).

* However, to be compliant with the ANSI standard, they all support at least the major commands and clauses (such as *SELECT*, *UPDATE*, *DELETE*, *INSERT*, *WHERE*) in a similar manner.


### DDL and DML

* `SQL` includes `DDL` and `DML`.
* Data Definition Language (DDL) consists of statements that specify and modify database schemas.
* Data Manipulation Language (DML) consists of statements that manipulate database content.
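
The distinction can be illustrated with a short sketch, assuming Python's built-in `sqlite3` module and a hypothetical `course` table: the first two statements are DDL because they change the schema, and the next two are DML because they change the rows stored under that schema.

```python
import sqlite3

# In-memory SQLite database; the course table is a hypothetical example.
conn = sqlite3.connect(":memory:")

# DDL: define the schema, then modify it.
conn.execute("CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")

# DML: change the content stored under that schema.
conn.execute("INSERT INTO course (title, credits) VALUES ('Databases', 3)")
conn.execute("UPDATE course SET credits = 4 WHERE title = 'Databases'")

# Inspect the schema (SQLite-specific PRAGMA) and the content.
schema_columns = [row[1] for row in conn.execute("PRAGMA table_info(course)")]
rows = conn.execute("SELECT course_id, title, credits FROM course").fetchall()

print(schema_columns)  # ['course_id', 'title', 'credits']
print(rows)            # [(1, 'Databases', 4)]
```

`PRAGMA table_info` is specific to SQLite. Other systems expose schema information through their own catalogs.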