## Relational Databases

**SQL (Structured Query Language)** is the most widely used language for working with data in relational databases. It lets you access, organize, and analyze large amounts of data using commands known as queries.

### Databases


Data is **stored** in **databases**. For example, managing data for a library would require a database with tables for checkouts, books, and patrons. Compared with spreadsheets, databases can handle much larger volumes of data and offer stronger security features, such as encryption.

#### A Closer Look at Tables
A table is a component of a database, such as a patrons table. This table stores data about library patrons, like their card number, name, the year they joined, and fines owed.

#### Rows and Columns  
Tables organize data into rows and columns. Each row represents an individual data entry, while each column describes one attribute of that entry, such as the year a patron joined.

#### Relational Databases
A database typically includes multiple tables that are linked through relationships. For example, the checkouts table connects to the patrons and books tables through shared data, such as card number and book ID.

#### Database Advantages
Many users can query a database at the same time, and queries that only read data gather insights without altering what is stored. The database's information is accessed and presented according to the query instructions.

### Tables

#### Table Naming
A table name should clearly reflect the data it contains, such as inventory, products, or books. Table names should be in lowercase and use underscores instead of spaces.

#### Records and Fields
In databases, **rows** are called records, and **columns** are called fields.


#### Records
Each record holds data for an individual observation. For example, the patrons table contains one record for each patron. A record for Jasmin might show that she became a member in 2022 and owes $2.05 in fines.

#### Fields
Fields hold one piece of information for every record in a table. For example, the **name** field in the patrons table contains all patrons' names.

#### Field Naming
By convention, field names are lowercase, use underscores instead of spaces, and are singular (e.g., "card_num" instead of "card_nums"). A field name must be unique within its table and should not duplicate the table name. Following these rules keeps it clear which name refers to a field and which to a table when writing queries.

#### Unique Identifiers
Most tables include a special field containing a unique identifier for each record, known as a **key**. In the patrons table, **card_num** is the unique identifier, because two patrons may share a name but never a card number.

#### Multiple Tables
It’s often better to use several related tables than a single large one. For example, combining patrons and checkouts into one table would repeat a patron's details on every checkout, and that duplicate information invites confusion and inconsistency. Organizing data into separate, related tables makes analysis and querying with SQL more efficient.
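The library example can be sketched as three related tables. This is only an illustration written for a scratch database: apart from `card_num`, the field names (`book_id`, `title`, `checkout_id`) and the lengths are assumptions, not the actual library schema.

```sql
-- Field names other than card_num are assumptions made for illustration.
CREATE TABLE patrons (
    card_num INT PRIMARY KEY,   -- unique identifier for each patron
    name     VARCHAR(50)
);

CREATE TABLE books (
    book_id INT PRIMARY KEY,    -- unique identifier for each book
    title   VARCHAR(200)
);

-- Each checkout stores only the identifiers. The patron's name and the
-- book's title live once in their own tables instead of being repeated here.
CREATE TABLE checkouts (
    checkout_id INT PRIMARY KEY,
    card_num    INT REFERENCES patrons (card_num),
    book_id     INT REFERENCES books (book_id)
);
```

The `REFERENCES` clauses record the relationships: every `card_num` and `book_id` in checkouts must match an existing patron and book.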

### Data

#### Data Storage
Data in a database is stored on a **server's hard disk**. Servers are powerful computers that store and provide access to data over a network. They can handle many data requests simultaneously, making them ideal for collaborative environments.


#### SQL Data Types
When creating a table, each field must be assigned a data type based on the kind of information it will store (e.g., numbers, text, or dates). The data type also influences the operations that can be performed on the data.


#### Strings
A **string** is a sequence of characters (letters, numbers, or punctuation). For example, the **name** field in the patrons table stores strings like "Maham" and "James". The **VARCHAR** data type is commonly used to store strings, allowing flexibility for varying string lengths.


#### Integers
**Integers** store whole numbers, such as the values in the **card_num** field in the checkouts table. The **INT** data type typically stores values from -2,147,483,648 to 2,147,483,647, or roughly negative two billion to positive two billion.


#### Floats
**Floats** are numbers that include a decimal point, such as the $2.05 Jasmin owes in fines. The **NUMERIC** data type stores such values exactly, which suits amounts like fines; in some systems, such as SQL Server, it holds up to 38 digits in total, counting both sides of the decimal point.
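As a rough sketch of how these data types are assigned, the patrons table might be defined as follows in a scratch database. The field name `member_year`, the lengths, the precision, and the card number are assumptions chosen for illustration.

```sql
-- Hypothetical definition of the patrons table; lengths and precision are assumptions.
CREATE TABLE patrons (
    card_num    INT PRIMARY KEY,  -- whole numbers; also the table's key
    name        VARCHAR(50),      -- strings of up to 50 characters
    member_year INT,              -- year the patron joined
    total_fine  NUMERIC(6, 2)     -- up to 6 digits in total, 2 after the decimal point
);

-- The card number is a made-up value.
INSERT INTO patrons (card_num, name, member_year, total_fine)
VALUES (54378, 'Jasmin', 2022, 2.05);

-- The data type determines which operations make sense:
-- arithmetic works on NUMERIC, so this returns 4.10 for Jasmin.
SELECT name, total_fine * 2 AS doubled_fine FROM patrons;
```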

#### Schemas
A **schema** is the design or blueprint of a database. It outlines the structure, including the tables, relationships between tables, and the data types for each field. For example, the schema for the library database shows the **VARCHAR** data type for fields like book title, author, and genre, and that the patrons table is linked to the checkouts table, but not to the books table.
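Many systems, including PostgreSQL and SQL Server, expose part of the schema through the standard `information_schema` views. A minimal sketch, assuming a table named `books` already exists; the `WHERE` clause, not covered in this section, restricts the output to that one table.

```sql
-- Lists each field of the books table together with its data type.
-- PostgreSQL reports VARCHAR fields as 'character varying'.
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'books';
```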

## Querying

### Introducing Queries

#### What is SQL Useful For?
SQL is used to answer questions within and across relational database tables. For example, in a library database, we can use SQL to find which books James checked out in 2022. In an HR database, SQL can be used to query salaries for employees in different departments to compare pay across departments.

#### Best for Large Datasets
SQL is often used alongside other tools, such as spreadsheets, to analyze large datasets. It can uncover trends in website traffic, customer reviews, and product sales. For example, SQL can help identify the best-selling products last week or track changes in website traffic after a feature release.

#### Keywords
SQL uses **keywords** to perform specific operations. The most common keywords are:
- **SELECT**: Used to choose which fields to retrieve.
- **FROM**: Indicates which table the fields should be selected from.

For example, to get a list of every patron's name, we use `SELECT name FROM patrons;`.

#### Our First Query
A basic query starts with **SELECT**, followed by **FROM**, and ends with a semicolon. By convention, keywords are capitalized while table and field names are lowercase; SQL keywords are not case-sensitive, but the convention makes queries easier to read. The result set displays the data the query requested.

#### Selecting Multiple Fields
To select multiple fields, list them after the **SELECT** keyword, separated by commas. For example, to select both **card_num** and **name**, we would write:

```sql
SELECT card_num, name FROM patrons;
```

#### Selecting More Fields
You can select more than two fields, such as **name**, **card_num**, and **total_fine**, by listing them all after the **SELECT** keyword and separating them with commas.
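For instance, the three fields named above can be requested in one query. This sketch assumes the patrons table has a `total_fine` field alongside `name` and `card_num`; placing one field per line is only a readability choice.

```sql
-- Fields appear in the result set in the order they are listed.
SELECT
    name,
    card_num,
    total_fine
FROM patrons;
```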

#### Selecting All Fields
To select all fields from a table, use the asterisk (`*`) symbol, also known as a **wildcard** character, instead of listing each field individually. For example:

```sql
SELECT * FROM patrons;
```

This will select every field in the **patrons** table.


### Writing Queries


#### Leveling Up SQL Queries
Let's explore additional SQL keywords to enhance our queries.

#### Aliasing
Aliasing helps rename columns in a result set for clarity or brevity using the **AS** keyword.  
Example:

```sql
SELECT name AS first_name, hire_date FROM employees;
```

This changes the column name **name** to **first_name** in the result set, while the actual field name in the table remains unchanged.

#### Selecting Distinct Records
To retrieve unique values from a column, use the **DISTINCT** keyword.  
Example:

```sql
SELECT DISTINCT year_hired FROM employees;
```

This removes duplicate years from the result set.

#### DISTINCT with Multiple Fields
To get unique combinations of multiple fields, apply **DISTINCT** before the field names.  
Example:

```sql
SELECT DISTINCT dept_id, year_hired FROM employees;
```

This ensures that each department-year hiring combination appears only once.
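To see the effect on concrete rows, here is a small sketch for a scratch database. The table definition and the four employees are made-up sample data, not part of any real dataset.

```sql
-- Hypothetical sample data for illustration.
CREATE TABLE employees (
    id         INT PRIMARY KEY,
    name       VARCHAR(50),
    dept_id    INT,
    year_hired INT
);

INSERT INTO employees (id, name, dept_id, year_hired) VALUES
    (1, 'Ana', 10, 2020),
    (2, 'Ben', 10, 2020),   -- same department and year as Ana
    (3, 'Cy',  10, 2021),
    (4, 'Dee', 20, 2020);

-- Returns two rows: 2020 and 2021.
SELECT DISTINCT year_hired FROM employees;

-- Returns three rows: (10, 2020), (10, 2021), and (20, 2020).
-- Ana and Ben collapse into a single (10, 2020) row.
SELECT DISTINCT dept_id, year_hired FROM employees;
```

The order of the returned rows is not guaranteed, since the queries do not sort their results.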

#### Views
A **view** is a saved SQL query that acts like a virtual table. Views don’t store data; they store queries, so the results always reflect the current state of the database.

To create a view:
```sql
CREATE VIEW employee_hire_years AS
SELECT name, year_hired FROM employees;
```

This statement doesn’t return a result set; it saves the query for future use.

#### Using Views
Once created, a view can be queried like a regular table:

```sql
SELECT * FROM employee_hire_years;
```

This retrieves every record returned by the saved query.
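Because a view stores a query rather than data, rows added to the underlying table later show up the next time the view is queried. A minimal sketch, written with PostgreSQL in mind and meant for a scratch database; the table definition and the two employees are made-up sample data.

```sql
-- Hypothetical sample table for illustration.
CREATE TABLE employees (
    id         INT PRIMARY KEY,
    name       VARCHAR(50),
    year_hired INT
);

INSERT INTO employees (id, name, year_hired) VALUES (1, 'Ana', 2020);

CREATE VIEW employee_hire_years AS
SELECT name, year_hired FROM employees;

-- Returns one row: Ana, 2020.
SELECT * FROM employee_hire_years;

-- Add a row to the underlying table after the view was created.
INSERT INTO employees (id, name, year_hired) VALUES (2, 'Ben', 2023);

-- The same view now returns two rows, without being recreated.
SELECT * FROM employee_hire_years;
```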


### SQL Flavors


#### Understanding SQL Flavors
SQL comes in multiple versions, or "flavors," used by database systems that range from free, open-source products to enterprise-level ones like Microsoft SQL Server and Oracle Database. All flavors largely follow the standards set by ISO and ANSI and differ mainly in the additional features and syntax they provide.

#### Two Popular SQL Flavors
1. **PostgreSQL** – A free, open-source relational database system that originated at the University of California, Berkeley, with sponsorship from DARPA, and is now maintained by a global community of developers.
2. **SQL Server** – A relational database system created by Microsoft, available in both free and enterprise versions. It pairs well with Microsoft products and uses **T-SQL**, Microsoft’s proprietary SQL flavor.

#### Comparing PostgreSQL and SQL Server
Both systems are similar but have minor differences in syntax.  
For example, to limit query results:
- **PostgreSQL** uses `LIMIT`:

```pgsql
SELECT name, id FROM employees LIMIT 2;
```

- **SQL Server** uses `TOP`:

```sql
SELECT TOP 2 name, id FROM employees;
```
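Beyond `LIMIT` and `TOP`, both systems also accept a form based on the SQL standard's `FETCH` clause, though the details differ. A hedged sketch, assuming the same `employees` table with `name` and `id` fields:

```sql
-- PostgreSQL: standard-style row limiting, equivalent to LIMIT 2.
SELECT name, id FROM employees
FETCH FIRST 2 ROWS ONLY;

-- SQL Server (2012 and later): FETCH must follow ORDER BY and OFFSET.
SELECT name, id FROM employees
ORDER BY id
OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY;
```

Without an `ORDER BY`, which two rows come back is up to the database, so sorting first is the safer habit in any flavor.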

#### Choosing a SQL Flavor
If an employer requires a specific system, use that flavor. Otherwise, start with either PostgreSQL or SQL Server, as the core principles are universal. Mastering the fundamentals makes it easy to switch between flavors.


