# 1. Introduction & Basics of DBMS

## 1.1 Introduction to Database Management Systems (DBMS)

### Definitions
- **Data**: Raw, unprocessed facts.
    - Ex: 25, Suresh, True
- **Information**: Processed data.
    - Ex: The age of Suresh is 25
- **Database (db)**: Collection of *related* data.
    - Ex: Online banking system, library management system
- **Meta-data**: The database definition (description).
    - Ex: Type of data stored in the db

### Database Management System

Collection of programs that enables users to create and maintain the database.

#### Functionalities
- **Define**: Specifying the **data types**, **structures** and **constraints** for the data to be stored.
- **Construct**: Process of **storing data** on some storage medium.
- **Manipulate**: Querying the database to **retrieve** specific data, **updating** the database and **generating reports**.
- **Share**: Allows multiple users and programs to **access** the database **concurrently**.
- **Security**: Protecting the database from unauthorized access or from hardware and software failures.
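
The define, construct and manipulate steps can be sketched with Python's built-in `sqlite3` module. This is only a minimal illustration: the `student` table, its columns and the sample rows are assumptions made up for this example, not part of the notes above.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")

# Define: declare the structure, data types and constraints.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (age > 0))"
)

# Construct: store data in the database.
conn.executemany(
    "INSERT INTO student (roll_no, name, age) VALUES (?, ?, ?)",
    [(1, "Suresh", 25), (2, "Anita", 22)],
)

# Manipulate: update a record, then query to retrieve specific data.
conn.execute("UPDATE student SET age = age + 1 WHERE roll_no = 1")
rows = conn.execute(
    "SELECT name, age FROM student ORDER BY roll_no"
).fetchall()
print(rows)
```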


#### Properties of Database
- A database **represents** some aspects of the **real world** (miniworld).
- A database is a **logically coherent** collection of data with some inherent meaning.
- A database is **designed**, **built** and **populated** with data for a **specific purpose**.

### Database System Environment

![ALT TEXT](./img/dbse.png)

#### Example

![ALT TEXT](./img/db_ex.png)

## 1.2 DBMS Characteristics

### Approaches

![ALT TEXT](./img/approches.png)

#### Comparison Table:
<table border="1" cellpadding="10">
  <thead>
    <tr>
      <th>Feature</th>
      <th>File System Approach</th>
      <th>DBMS Approach</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Data Redundancy</td>
      <td>High (data duplication across files)</td>
      <td>Low (data normalization reduces redundancy)</td>
    </tr>
    <tr>
      <td>Data Consistency</td>
      <td>Difficult to maintain</td>
      <td>Ensured through constraints and integrity rules</td>
    </tr>
    <tr>
      <td>Data Integrity</td>
      <td>No support for integrity constraints</td>
      <td>Supports integrity constraints (e.g., primary key, foreign key)</td>
    </tr>
    <tr>
      <td>Concurrent Access</td>
      <td>Limited support, prone to data corruption</td>
      <td>Multi-user concurrency with transaction management (locks)</td>
    </tr>
    <tr>
      <td>Data Security</td>
      <td>OS-based, minimal security features</td>
      <td>Advanced security (authentication, authorization, encryption)</td>
    </tr>
    <tr>
      <td>Backup and Recovery</td>
      <td>Manual, difficult to manage</td>
      <td>Automated backups and recovery features</td>
    </tr>
    <tr>
      <td>Data Independence</td>
      <td>Tight coupling between data and programs</td>
      <td>Data independence (logical and physical layers)</td>
    </tr>
    <tr>
      <td>Querying & Manipulation</td>
      <td>Manual programming required</td>
      <td>High-level querying with SQL or other query languages</td>
    </tr>
    <tr>
      <td>Data Access Speed</td>
      <td>Slower for large datasets</td>
      <td>Optimized for efficient access and retrieval</td>
    </tr>
    <tr>
      <td>Data Sharing</td>
      <td>Limited support for sharing and collaboration</td>
      <td>Supports multi-user environments with controlled access</td>
    </tr>
  </tbody>
</table>

<h3>Which to Use?</h3>
<ul>
  <li><strong>File System Approach:</strong> Suitable for small-scale projects or simple applications where data management complexity is minimal, such as local logs, temporary data storage, or single-user desktop applications.</li>
  <li><strong>DBMS Approach:</strong> Essential for large-scale, multi-user applications, or when working with complex data structures and relationships, such as in enterprise applications, web applications, and data science projects. A DBMS is a more robust, scalable, and secure option when managing significant amounts of data.</li>
</ul>


### Characteristics of DBMS Approach

### Self-Describing
- **Database system** = Database + Meta-data (DB Definition)  
- **Stored in:** DBMS catalog  
- **Used by:** DBMS Software & Database Users  
- DBMS software can work with various database applications (e.g., University, Library, Game Store).  
- In traditional file processing, data definition is part of the application programs, limiting it to a specific database.

##### Ex: Database Catalog 

![ALT TEXT](./img/catalog.png)
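
As a rough illustration of a self-describing database, SQLite keeps its definitions in a built-in catalog table called `sqlite_master`, which can be queried like any other table. The `book` table below is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT)")

# The catalog stores the definition (meta-data) of every table.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, default_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(book)")]

print(catalog)
print(columns)
```

Because the definition lives in the catalog rather than in the program, the same code could inspect any other database without being rewritten.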

### Data Independence
Insulation between Programs and Data  
- In traditional file processing, the structure of data files is embedded in application programs.  
- In the database approach, the structure of data files is stored in the DBMS catalog, separate from access programs (**program-data independence**).

### Data Abstraction
- The characteristic that enables **program-data independence** is called **data abstraction**.  
- DBMS provides users with a **conceptual representation** of data.  
- A **data model** (type of data abstraction) provides this conceptual representation.

### Multiple Views
- DBMS supports multiple **views** of the data for different users.  
- A **view** is a subset of the database containing **virtual data** derived from the database but not explicitly stored.
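
A small sketch of a view, using `sqlite3`. The `employee` table and the `employee_directory` view are hypothetical names chosen for this example. The view hides the salary column and stores no rows of its own, so it always reflects the current base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 50000.0), (2, "Ravi", 42000.0)],
)

# A view for users who should see names but not salaries.
conn.execute("CREATE VIEW employee_directory AS SELECT id, name FROM employee")
directory = conn.execute(
    "SELECT * FROM employee_directory ORDER BY id"
).fetchall()

# The view holds virtual data: a new base row shows up without updating the view.
conn.execute("INSERT INTO employee VALUES (3, 'Meena', 61000.0)")
directory_after = conn.execute(
    "SELECT name FROM employee_directory ORDER BY id"
).fetchall()
```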

### Sharing of Data
Sharing of data and multiuser transaction processing.  
- A **multiuser DBMS** allows multiple users to access the database simultaneously.  
- DBMS includes **concurrency control** to manage simultaneous transactions and ensure data integrity.  
- **OLTP** (Online Transaction Processing) is a major part of database applications.  
- DBMS must enforce several **transaction properties**:
    - **Isolation**: Ensures transactions run independently, maintaining data consistency.
    - **Atomicity**: Guarantees that a transaction is either fully completed or entirely rolled back.
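
Atomicity can be illustrated with a money transfer in `sqlite3`. This is a minimal sketch: the `account` table and the `transfer` helper are assumptions for the example. The `with conn:` block commits if every statement succeeds and rolls back if any of them raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commit on success, roll back on any exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30)       # both updates are applied
failed = transfer(conn, 1, 2, 500)  # debit violates the CHECK, credit is undone
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

In the failed transfer the credit to account 2 has already run when the debit is rejected, yet the rollback leaves no trace of it.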

## 1.3 Database Users

Database users can be classified into two main categories based on their interaction with the database:

### **Actors on the Scene**
These are users who directly interact with the database and its applications:

#### Database Administrators
- In a database environment, **primary resource** --> database, **secondary resource** --> DBMS & related software.
- Database Administrator (DBA) **responsibilities**:
    - 1) **Administering** primary/secondary resources.
    - 2) **Authorizing** access to the database.
    - 3) **Co-ordinating & monitoring** use of database.
    - 4) Acquiring hardware & software resources as needed.
    - 5) Troubleshooting errors and problems.

#### Database Designers
Responsible for:
- 1) **Identifying** the **data** to be stored in the database.
- 2) **Choosing** appropriate **structures** to represent and store data.
- 3) **Communicating** with database users --> understand their requirements --> designs database. 

#### End Users
End users --> people whose jobs require access to the database --> for querying, updating, generating reports.<br>
Several categories of end users:

- **Casual end users**: Access the database occasionally
    - typically middle or high-level managers or other occasional browsers.
- **Naive or parametric end users**: constantly querying and updating database using **canned transactions**.
- **Sophisticated end users**: Engineers, scientists, business analysts.
- **Stand-alone users**: Maintains **personal databases**
    - using ready-made program packages.

#### System Analysts & Application Programmers (Software Engineers)
- **System Analysts**: determine the requirements of end users --> **develop specifications** for canned transactions.
<br>
- **Application Programmers**: implement these specifications as programs, then test, debug, document and maintain these canned transactions.

### **Workers Behind the Scene**
These users are involved in the backend processes, ensuring the database operates smoothly:

#### System designers and implementers
Design and implement DBMS modules & interfaces as a package.

#### Tool Developers
Persons who design and implement tools (**Software packages**).

#### Operators and maintenance personnel
Responsible for the actual running and maintenance of hardware & software.

## 1.4 Advantages & Disadvantages of DBMS

### Advantages

#### Controlling Redundancy
- In **traditional file system**:
    - Each user group maintains its **own files**. This leads to **wastage of storage space** and **inconsistency**.
- In the **database approach**:
    - **Views** of different users --> **integrated**.
    - Each data item is stored in only one place in the database.
    - This ensures **consistency** & **saves storage space**.

#### Restricting Unauthorized Access
- When multiple users share a large database, the type of access operation must be controlled. 
- DBMS must provide a **security and authorization subsystem**.
- **DBA** --> creates accounts & specifies account restrictions.
- **Parametric users** --> allowed to access database only through canned transactions.

#### Providing Persistent Storage for Program Objects

<table>
    <tr>
        <th>Feature</th>
        <th>Traditional File Systems</th>
        <th>DBMS</th>
    </tr>
    <tr>
        <td>Storage of Program Objects</td>
        <td>Must be explicitly stored in separate permanent files; involves conversion.</td>
        <td>Recognizes data structures of programming languages like Java, C++ and automatically converts.</td>
    </tr>
    <tr>
        <td>Object Persistence</td>
        <td>Once the program terminates, values of program variables are discarded.</td>
        <td>Once the program terminates, values of program variables are not discarded; stores objects permanently, making them persistent.</td>
    </tr>
</table>


#### Providing Storage Structures for Efficient Query Processing
- Database systems must provide capabilities for **efficiently executing** queries and updates.
- Since database is stored on disk, DBMS must provide specialized data structures (**indexes**) to speed up disk search.
- **Query processing and optimization** module --> responsible for efficient query execution.
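
A rough sketch of how an index changes query execution, using SQLite's `EXPLAIN QUERY PLAN`. The table, the index name `idx_student_name` and the generated data are assumptions for this example, and the exact plan wording differs between SQLite versions, so only the index name is inspected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(i, f"name{i}", f"city{i % 10}") for i in range(1000)],
)

query = "SELECT roll_no FROM student WHERE name = 'name42'"


def plan(conn, sql):
    """Return the optimizer's plan; the last column of each row is the description."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(conn, query)  # no index on name yet: full table scan
conn.execute("CREATE INDEX idx_student_name ON student (name)")
plan_after = plan(conn, query)   # the optimizer can now search the index

result = conn.execute(query).fetchall()
print(plan_before)
print(plan_after)
```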

#### Providing Backup and Recovery
- The **backup and recovery subsystem** of DBMS --> responsible for recovery --> in case of hardware or software failures.
- **Ex:** if the computer crashes during a complex transaction, the **recovery subsystem** --> responsible for ensuring that the **transaction resumes** from where it was interrupted, or at least that the database is **restored** to the state it was in before the transaction started executing.

#### Providing Multiple User Interfaces
- Multiple users --> different levels of technical knowledge --> so DBMS should provide a variety of user interfaces.
- **Ex:** 
    - Query languages --> **casual users**
    - Programming languages interfaces --> **application programmers**
    - forms --> **parametric users**
    - menu-driven interfaces --> **stand-alone users**
- Form-style interfaces and menu-driven interfaces --> **Graphical User Interfaces (GUIs)**

#### Representing Complex Relationships among Data
- A database may have a variety of **data** --> **interrelated** in many ways.
- DBMS must be capable of:
    - **Representing complex relationships** among data.
    - **Retrieving** and **updating** related data **easily** and **efficiently**.

#### Enforcing Integrity Constraints
- Simplest type of integrity constraint --> **specifying data type** for each data item.
- Another type of constraint --> **uniqueness** of data item values.
- Responsibility of **Database designers** --> **identifying integrity constraints** during database design.
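
The two constraint types above can be sketched in `sqlite3`. The table and the `try_insert` helper are hypothetical. Since SQLite is lenient about column types by default, this sketch enforces the data type with an explicit `CHECK` on `typeof(age)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " email TEXT UNIQUE,"                           # uniqueness constraint
    " age INTEGER CHECK (typeof(age) = 'integer'))" # data type constraint
)
with conn:
    conn.execute("INSERT INTO student VALUES (1, 'a@example.com', 20)")


def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert((2, "b@example.com", 21)),     # valid row
    try_insert((3, "a@example.com", 22)),     # duplicate email
    try_insert((4, "c@example.com", "old")),  # age is not an integer
]
print(results)
```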

#### Permitting Inferencing and Actions Using Rules
- Database systems must provide capabilities for defining **deduction rules** for inferencing new information.
- Such systems --> **deductive database systems**.
- **Active database systems** --> provides active rules that automatically **initiate actions** when certain **events and conditions occur**.
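
An active rule follows an event, condition, action pattern. SQL triggers are one common way to express it. Below is a minimal sketch in `sqlite3`; the `item` and `reorder_log` tables, the threshold of 5 and the trigger name are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, stock INTEGER);
CREATE TABLE reorder_log (item_id INTEGER, stock INTEGER);

-- Event: stock is updated. Condition: new stock below 5. Action: log a reorder.
CREATE TRIGGER low_stock AFTER UPDATE OF stock ON item
WHEN NEW.stock < 5
BEGIN
    INSERT INTO reorder_log VALUES (NEW.id, NEW.stock);
END;

INSERT INTO item VALUES (1, 'pen', 20), (2, 'ink', 8);
""")

conn.execute("UPDATE item SET stock = 3 WHERE id = 1")  # condition holds, rule fires
conn.execute("UPDATE item SET stock = 6 WHERE id = 2")  # condition fails, no action
log = conn.execute("SELECT item_id, stock FROM reorder_log").fetchall()
print(log)
```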

### Disadvantages

- **Overhead costs** (the additional costs of acquiring and running the DBMS):
    - High initial investment.
    - Overhead for providing security, concurrency control and recovery.
- The file approach may be preferred when:
    - The **database** & applications are **simple**, well defined and **no changes** are expected.
    - **Multiple-user access** is **not required**.

## 1.5 History of Database Applications

#### Early Database Applications Using Hierarchical & Network Systems
Early systems were based on 3 **database models**:
- Hierarchical systems.
- Network model based systems.
- Inverted file systems.

![ALT TEXT](./img/heir.png)

![ALT TEXT](./img/network.png)

- **Main problem** of early database systems --> not flexible enough to support new queries, and reorganizing data was difficult.
- Another **drawback** was that they only provided programming language interfaces.
- Implemented from the mid-1960s through the 1970s and 1980s on large & expensive mainframe computers.

![](./img/history.png)

#### Providing Application Flexibility with Relational Databases
- Relational databases (RDBMS) --> proposed by E.F. Codd --> organizes data into **tables** which can be linked or related.
- **High-level query language** --> introduced in relational data model.
- **Provides flexibility** to quickly develop new queries & reorganize the database as requirements change.
- Early experimental relational systems (developed in the 1970s) & the commercial RDBMSs (developed in the 1980s) were **quite slow**.
- Development of new storage & indexing techniques, better query processing --> **performance** of RDBMS **improved**.
- Relational databases became the **dominant** type of database systems.

#### Object-Oriented Applications and the Need for More Complex Databases
- Emergence of object-oriented programming languages --> the development of object-oriented databases.
- Incorporated many of the useful **object-oriented features** --> data abstraction, encapsulation, inheritance.
- Mainly used in **specialised applications** --> engineering design, manufacturing systems, etc.

#### Interchanging Data on the Web for E-Commerce
- WWW (**World Wide Web**) is a large network of interconnected computers.
- Users --> create documents using HTML and store them on Web servers.
- Documents can be linked together through **hyperlinks**.

#### Extending Database Capabilities for New Applications
- Scientific applications to store large amounts of data from scientific experiments.
- Storage & retrieval of images.
- Storage & retrieval of videos.
- Data mining applications --> analyzing large amounts of data.
- Spatial applications --> weather information.
- Time series applications --> e.g., daily sales information.

- Basic relational model --> not suitable.
- Reasons:
    - More **complex data structures** were needed.
    - **New data types** were required.
    - **New query language**.
    - New **storage & indexing techniques** were needed.

## 1.6 Fundamentals of Database Systems

#### Data Models
- Used to describe the structure of the database --> helps to achieve data abstraction.
- Includes **a set of basic operations** for specifying retrievals or updates on the database.
- Also includes concepts to specify the behaviour of a database application.

#### Categories of Data Models

##### **High-level or conceptual Data Model**:
- Provides concepts that are close to the way many users perceive data.
- User concepts such as entities, attributes and relationships.
- **Entities** --> represent real world object or concept.
- **Attributes** --> further describe an entity.
- **Relationships** --> association among 2 or more entities.

##### **Low-level or Physical Data Model**:
- Describes **how data is stored** in the computer.
- **Access path** --> structure for efficient searching of database records.

##### **Representational (or implementation) Data Model**:
- Represent data using **record structures** --> record-based data models.

#### Terminologies
- **Database Schema**: Description of a database.
- **Schema Diagram**: A displayed schema.
- **Schema Construct**: Each object within the schema.
    - Ex: STUDENT, COURSE, etc.
- **Database State** (or **instance** or **snapshot**): The data in the database at a particular moment.

# 2. DBMS Architecture & Design

## 2.1 Three-Schema Architecture

<h4>Goal:</h4>to separate the user applications and the physical database.

![](./img/diag.png)

### 3 Levels:

#### Internal Level:
- Describes the physical storage structure of the database.
- Describes complete details of data storage and access

#### Conceptual Level:
- **Hides** the **details of the physical storage structure** and concentrates on describing entities, data types, relationships, constraints, etc.

#### External Level:
- Describes the part of the database that a user is interested in and hides the rest of the database from the user group.

### Data Independence
Capacity to change the schema at one level of a database system without having to change the schema at the next higher level.

#### Logical Data Independence:
Ability to **modify** the **conceptual schema** without changing the external schemas or application programs.

#### Physical Data Independence:
- Ability to **modify** the internal schema without changing the conceptual schema.
- Changes may be needed to **improve performance**.
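
Both kinds of data independence can be sketched with `sqlite3`, under the simplifying assumption that a view plays the role of the external schema and the base table plays the role of the conceptual schema. All table, view and index names are made up. The application query stays the same while the lower levels change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO course VALUES (?, ?)",
    [("CS101", "Databases"), ("CS102", "Networks")],
)

# External schema: the view that the application reads from.
conn.execute("CREATE VIEW course_list AS SELECT code, title FROM course")
app_query = "SELECT code, title FROM course_list ORDER BY code"
before = conn.execute(app_query).fetchall()

# Logical data independence: the conceptual schema gains an attribute.
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")
after_logical = conn.execute(app_query).fetchall()

# Physical data independence: the internal schema gains an index.
conn.execute("CREATE INDEX idx_course_code ON course (code)")
after_physical = conn.execute(app_query).fetchall()
```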

## 2.2 DBMS Languages, Interfaces, and Classification

### DBMS Languages

- In a DBMS where no strict separation of levels is maintained, the **Data Definition Language** (DDL) is used to define both the internal and conceptual schemas.

- In DBMS, where a clear separation is maintained between conceptual and internal levels:
    - **DDL** → used to specify the **conceptual schema** only.
    - **Storage Definition Language** (SDL) → used to specify the **internal schema**.

- **View Definition Language** (VDL) → to specify **user views** and their mappings to the conceptual schema.

- **Data Manipulation Language** (DML) → for **manipulation** of data in the database.

#### Type of DML
<table border="1" cellpadding="10" cellspacing="0">
  <thead>
    <tr>
      <th>Type of DML</th>
      <th>Description</th>
      <th>Other Name</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>High-level (Non-procedural DML)</td>
      <td>Used to specify complex database operations in a concise manner.</td>
      <td>Set-at-a-time DMLs</td>
    </tr>
    <tr>
      <td>Low-level (Procedural DML)</td>
      <td>Embedded in a general-purpose programming language.</td>
      <td>Record-at-a-time DMLs</td>
    </tr>
  </tbody>
</table>
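
The difference between the two DML styles can be sketched in Python with `sqlite3`. This is an illustration only: SQLite itself offers just the high-level style, so the record-at-a-time version is imitated by looping over rows in the host language. The `employee` data is made up.

```python
import sqlite3


def make_db():
    """Build a small hypothetical employee table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "R&D", 100), (2, "R&D", 200), (3, "HR", 150)],
    )
    return conn


def snapshot(conn):
    return conn.execute("SELECT id, salary FROM employee ORDER BY id").fetchall()


# Set-at-a-time: one statement says WHAT to change for the whole set of rows.
set_db = make_db()
set_db.execute("UPDATE employee SET salary = salary + 10 WHERE dept = 'R&D'")

# Record-at-a-time: the program fetches rows and says HOW to process each one.
rec_db = make_db()
all_rows = rec_db.execute("SELECT id, dept, salary FROM employee").fetchall()
for emp_id, dept, salary in all_rows:
    if dept == "R&D":
        rec_db.execute(
            "UPDATE employee SET salary = ? WHERE id = ?", (salary + 10, emp_id)
        )

set_result = snapshot(set_db)
rec_result = snapshot(rec_db)
```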


### DBMS Interfaces

#### Menu-Based Interfaces:
- These interfaces present the users with a **list of options** (menus).
- Most popular → **Pull-down menus**.

![ALT TEXT](./img/menu.png)

#### Forms-Based Interfaces:
- Displays a form to each user.
- Designed and programmed for the **naive users**.

![ALT TEXT](./img/form.png)

#### Graphical User Interfaces:
- Displays a schema to the user in **diagrammatic form**.
- Utilizes both menus and forms.
- Uses a **pointing device** to pick certain parts of the displayed diagram.

#### Natural Language Interfaces:
- Has its **own schema** and **dictionary**.
- Refers to them while interpreting a request.
- If interpretation is successful → high-level query generated → **Query processing**.

- Otherwise, a dialogue is started with the user for further clarification.

#### Interfaces for Parametric Users:
- Parametric users → have small set of operations that they must perform repeatedly.
- System analysts and programmers → design & implement a special interface for these users.

#### Interfaces for the DBA:
- Privileged commands (like for creating accounts, granting access) that can be used only by the DBA's staff.

### DBMS Classification

Several criteria used to classify DBMS:
- Data model
- Number of users
- Cost of DBMS
- Number of sites
- Types of access path
- Generality

#### Data Models
- **Relational data model:**<br>
![](./img/rel_db_model.jpeg)
- **Object-Oriented database model:**<br>
![](./img/oop_db_model.jpeg)

- **Hierarchical model:**<br>
![](./img/hier_db_model.jpeg)

- **Network model:**<br>
![](./img/net_db_model.jpeg)


#### Number of Users
- **Single-user systems**: supports only one user at a time.

- **Multiuser systems**: supports multiple users concurrently.

#### Cost of DBMS
- **Low cost**: Cost of these systems → between $100 & $3000.
- **Medium cost**: Cost varies between $10,000 & $100,000.
- **High cost**: Cost of these systems more than $100,000.

#### Number of Sites:
- **Centralized database system**: DBMS & the database → reside at a single computer site.
- **Distributed database system**: DBMS & the database → distributed over many sites, connected by a computer network.

## 2.3 Database System Environment 

### DBMS Component Modules

![](./img/modules.png)

#### DDL compiler
**processes** schema definition and **stores** it in the catalog

#### Query compiler
**handles** high-level queries.

#### Pre-compiler
**extracts** DML commands from an application program.

#### DML compiler
**compilation** of DML commands into object code.

#### Runtime database processor
**handles** database accesses at runtime.

#### Stored data manager
**transfers** data between disk & main memory.

### Database System Utilities
Common utilities have the following types of functions:

#### Loading
- Used to **load existing data files** into the database.
- The source file format and the target data file structure are mentioned to the utility, it **automatically reformats** the data and **stores** it in the database.

#### Backup
- Creates a **backup copy** of the database.
- **Incremental backups** are often used. Though more complex, they save space.

#### File reorganization
- Used to reorganize a database file into a different file organization.

#### Performance monitoring
- Monitors database usage and provides statistics to the DBA.

### Other Tools

- **CASE tools:** used during the design of the database.
- **Data Dictionary system:** (or data repository), storing catalog information.
- **Application Development Environments:** provide an environment for developing database applications. *Ex:* JBuilder, PowerBuilder system, ...
- **Communications software:** remote access to the database.

## 2.4 DBMS Architecture

### Centralized DBMS Architecture
- Earlier, **mainframe computers** were used **to process** all system functions.
- Users accessed systems via **computer terminals** which provided **only display capabilities** but no processing capabilities.

![A:](./img/cen.png)

- **Processing** was performed remotely on the **computer system**.
- Only **display information** was sent to the **terminals**.
- As hardware prices declined, terminals were replaced by **PCs & workstations**.

### Basic Client/Server Architectures
- **Goal** is to define specialized servers with specific functionalities.
- **Client** is user machine that **provides user interface** capabilities & local processing.
- **Server provides services** to client machines.  

![](./img/clie_serv.png)

### Two-Tier Client/Server Architectures
- In RDBMS, **user interfaces** & **application programs** moved to **client side**.
- **Query & transaction functionality** on **server side** (query server/transaction server).
- When DBMS access is required the application program establishes a connection with the DBMS(server side). 
- **Open Database Connectivity** (ODBC) provides an API that allows client-side programs to call the DBMS.
- **Advantages** of two-tier architecture:
    - Simplicity
    - Compatibility
- Emergence of World Wide Web led to three-tier architecture.

### Three-Tier Client/Server Architectures
- Adds an intermediate layer between the client & the database server, called the **Application server** or **Web server**:
    - **Stores** rules used to access data.
    - **Accepts** requests from client.
    - **Processes** the requests.
    - **Sends** commands to database server.

![](./img/three_tier.png)

# 3. Entity-Relationship Model & Design Process

## 3.1 Basic Concepts of Entity-Relationship Model

### Terminologies

#### **Entity**:
- A "thing" in the real world with an **independent existence**.
- May be an object with physical existence(*ex:* house, person) or with conceptual existence(*ex:* course, job).

##### Entity Type
- A collection of entities that have the same attributes.
- *Ex*: STUDENT<br>

![](./img/STUDENT.png)

##### Weak Entity Types
- Entity types that **do not have key attributes** of their own.
- Identified by relating to another entity type called the identifying or the **owner entity type**.
- Relationship between weak entity type to its owner: **identifying relationship**. 

##### Entity Set
- Collection of entities of a particular entity type at a point in time.
- *Ex*: STUDENTS with AGE between 17 and 20.

#### **Attributes**:
Properties that **describe the entities**.<br>
![](./img/attrs.png)

##### Composite Vs Simple Attributes
- **Composite**:
    - **Can be divided** into further parts.
    - *Ex*: Name: First Name, Middle Name, Last Name.
- **Simple**:
    - **Cannot be divided** further.
    - *Ex*: Weight: cannot be further divided.

##### Single Vs Multi Valued Attributes
- **Single-Valued**:
    - Have a **single value** for a particular entity.
    - *Ex*: Age, Weight.
- **Multivalued**:
    - Can have **set of values** for a particular entity.
    - *Ex*: College degrees, languages known.

##### Derived Vs Stored Attributes
- **Derived**:
    - Can be derived from other attributes.
    - *Ex*: Age: can be derived from date of birth.
- **Stored**:
    - Stored directly in the database; the values of derived attributes are computed from them.
    - *Ex*: BirthDate of a person.

##### Complex Attributes
- Has multivalued & composite components in it.
- **Multivalued** attributes represented within **{ }**.
- **Composite** attributes represented within **( )**.
- **Ex:** {CollegeDegrees(College, Year, Degree, Field)}
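
The attribute kinds above can be mirrored with Python dataclasses. This is only a modelling sketch, not how a DBMS stores them. The `Person` class, its fields and the sample values are assumptions for illustration; `CollegeDegree` follows the `{CollegeDegrees(College, Year, Degree, Field)}` example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass(frozen=True)
class Name:
    """Composite attribute: divisible into parts."""
    first: str
    middle: str
    last: str


@dataclass(frozen=True)
class CollegeDegree:
    """One value of the complex attribute {CollegeDegrees(...)}."""
    college: str
    year: int
    degree: str
    field_of_study: str


@dataclass
class Person:
    name: Name                     # composite
    birth_date: date               # stored
    weight_kg: float               # simple, single-valued
    college_degrees: List[CollegeDegree] = field(default_factory=list)  # multivalued

    def age(self, today: date) -> int:
        """Derived attribute: computed from birth_date, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month,
            self.birth_date.day,
        )
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


person = Person(
    name=Name("Suresh", "K", "Rao"),
    birth_date=date(2000, 6, 15),
    weight_kg=68.0,
    college_degrees=[CollegeDegree("ABC College", 2021, "BSc", "Physics")],
)
age_before_birthday = person.age(date(2025, 6, 14))
age_on_birthday = person.age(date(2025, 6, 15))
```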

##### Null Values
Null is something which is **not applicable** or **unknown**.<br>
![](./img/null.png)

##### Key Attribute
- An attribute that is capable of **identifying** each entity **uniquely**.
- *Ex*: Roll number(Id) of a student.

##### Value Set of Attributes:
- The set of values that can be assigned to an attribute.

## 3.2 Database Design Process

### Intro

![](./img/design.png)

#### Requirements Collection & Analysis
- Database designers understand & **document** the data **requirements** of the database users.

#### Functional Requirements
- Consists of user-defined operations.

#### Conceptual Design
- Creating conceptual schema.
- **Conceptual Schema**: Concise description of data requirements & detailed description of the entity types, relationships & constraints.

#### Logical Design
- **Actual implementation** of the database, using commercial DBMS.

#### Physical Design
- The internal storage structures, indexes, access paths: specified.

### Symbols used in ER Diagram

#### Entity
![](./img/-4.png)

#### Weak Entity
![](./img/-3.png)

#### Attribute
![](./img/-2.png)

#### Key Attribute
![](./img/-1.png)

#### Multivalued Attribute
![](./img/-8.png)

#### Composite Attribute
![](./img/-7.png)

#### Derived Attribute
![](./img/-6.png)

#### Identifying Relationship
![](./img/-5.png)

### Example
Let us see an example database application, called COMPANY.

#### Requirements gathered
- The company is organized into **departments**. Each department has a unique name, a unique number & a particular employee who manages the department. We also keep track of the start date of the manager. A department may have several locations.
- A department controls number of **projects**, each of which has a unique name, unique number and a single location.
- **Employee details**: name, SSN, sex, salary. We keep track of number of hours per week on each project.
- Keep track of each employee's **dependents** (first name, sex, relationship to the employee).

#### Initial Conceptual Design of COMPANY Database
We can identify 4 entity types based on the requirements:
- **DEPARTMENT**: Name, Number, {Locations}, Manager, ManagerStartDate<br>
![](./img/t1.png)
- **PROJECT**: Name, Number, Location, ControllingDepartment<br>
![](./img/t2.png)
- **EMPLOYEE**: Name(FName, MName, LName), SSN, Sex, Salary, BirthDate, Department, {WorksOn (Project, Hours)}<br>
![](./img/t3.png)
- **DEPENDENT**: Employee, DependentName, Sex, BirthDate, Relationship<br>
![](./img/t3.png)

## 3.3 Concept of Relationships in ER Diagram

### Relationships
- Association among 2 or more entities.
- *Ex*: teacher **teaches** student.

### Degree of Relationship
Denotes the number of entity types that participate in a relationship.

#### 1. Unary relationship
Exists when there is an **association with only one entity**.<br>
![](./img/77.png)

#### 2. Binary relationship
Exists when there is an **association between two entities**.<br>
![](./img/88.png)

#### 3. Ternary relationship
Exists when there is an **association among three entities**.<br>
![](./img/99.png)

### Relationship Constraints

#### Cardinality Ratio
- Maximum number of relationship instances that an entity can participate in.
- Possible cardinality ratios for binary relationship : **1:1**, **1:N**, **N:1**, **M:N**.<br>
![](./img/card.png)

#### Participation Constraints
- Specifies whether existence of an entity depends on its being related to another entity.
- 2 Types: **Total** participation & **Partial** participation.<br>
![](./img/part.png)

### ER Diagram for COMPANY Database

![](./img/er%20diag.png)

### Attributes of Relationship Types
- Attributes of 1:1 or 1:N relationship types can be migrated to one of the participating entity types.
    - In **1:1 relationship** type, attributes can be migrated to either of the entity types.
    - In **1:N** or **N:1 relationship** type, attributes are migrated only to the entity type on the N-side of the relationship
    - In **M:N relationship** type, some attributes can be determined by a **combination of participating entities**.

![](./img/atts.png)
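
For the M:N case, a common relational rendering gives the relationship its own table, keyed by the combination of participating entities. The sketch below uses `sqlite3` with hypothetical `employee`, `project` and `works_on` tables loosely based on the COMPANY requirements. `hours` belongs to neither entity alone, only to the (employee, project) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT);
CREATE TABLE project (number INTEGER PRIMARY KEY, name TEXT);

-- M:N relationship: hours is determined by the (employee, project) combination.
CREATE TABLE works_on (
    essn TEXT REFERENCES employee(ssn),
    pno INTEGER REFERENCES project(number),
    hours REAL,
    PRIMARY KEY (essn, pno)
);

INSERT INTO employee VALUES ('111', 'Asha'), ('222', 'Ravi');
INSERT INTO project VALUES (1, 'ProductX'), (2, 'ProductY');
INSERT INTO works_on VALUES ('111', 1, 10.0), ('111', 2, 20.0), ('222', 1, 15.0);
""")

hours = conn.execute("""
    SELECT e.name, p.name, w.hours
    FROM works_on AS w
    JOIN employee AS e ON e.ssn = w.essn
    JOIN project AS p ON p.number = w.pno
    ORDER BY e.name, p.name
""").fetchall()
print(hours)
```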

### Role Names
- **Signifies the role** that a participating entity plays in each relationship instance.
### Recursive Relationships
- Same entity type **participates more than once** in a relationship type in different roles.<br>
![](./img/recurs.png) 

### Alternative Notations for ER Diagrams
Associates a pair of integer numbers **(min,max)** with each participation of an entity type in a relationship type, where **0 <= min <= max** and **max >=1**.
![](./img/alt.png)
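
A (min,max) constraint says that each entity must appear in at least `min` and at most `max` relationship instances. The helper below, `check_min_max`, is a hypothetical name, and the WORKS_FOR data and the chosen (min,max) pairs are assumptions for illustration. It lists the entities that break the rule.

```python
from collections import Counter


def check_min_max(entities, instances, position, min_count, max_count):
    """Return the entities whose participation count falls outside [min, max].

    `instances` is a list of relationship instances (tuples); `position`
    selects which component of each tuple belongs to this entity type.
    """
    counts = Counter(instance[position] for instance in instances)
    return sorted(
        e for e in entities if not (min_count <= counts[e] <= max_count)
    )


employees = ["e1", "e2", "e3"]
departments = ["d1", "d2"]
works_for = [("e1", "d1"), ("e2", "d1")]  # e3 has no department, d2 has no employee

# EMPLOYEE participates with (1,1): exactly one department each.
bad_employees = check_min_max(employees, works_for, 0, 1, 1)

# DEPARTMENT participates with (1,N): at least one employee, no upper bound.
bad_departments = check_min_max(departments, works_for, 1, 1, float("inf"))
```

A `min` of 0 corresponds to partial participation and a `min` of 1 or more to total participation.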

### Enhanced ER Model

#### Generalization
**Bottom-up approach** where two or more lower-level entities combine to form a higher-level entity.<br>
![](./img/gen.png)

#### Specialization
**Top-down approach** where a set of subclasses of an entity type is defined.<br>
![](./img/spe.png)

# 4. Relational Data Model & Constraints

## 4.1 Introduction

### History

- First introduced by Ted Codd (in 1970).
- Uses the concept of a mathematical relation.
- **First commercial implementations of the relational model**: Oracle DBMS, SQL/DS system (IBM).
- **Current popular RDBMSs**: SQL Server & Access (Microsoft), DB2 & Informix (IBM), etc.
- **Standard for commercial RDBMS**: SQL query language.

### Terminologies

- Relational model represents data as a **collection of tables**
- A table is also called a **relation**.
- Each row is called **tuple**.
- Column headers are called **attributes**.

![](./img/term.png)

- **Domain**
    - A set of atomic values allowed for an attribute.
    - *Ex:*
        - **Name**: string of characters that represent name of persons.
        - **Employee_ages**: Possible ages of employees of a company (values between 20 & 70 years old).

- **Relation schema**
    - Describes a relation.
    - Made up of a relation name **R** and a list of attributes **A1,A2,A3,...,An**.

![](./img/schema.png)

- **Degree (or arity) of a relation**
    - Number of attributes in a relation schema.
    - *Ex*:
        - In the last picture, the degree is 6.

- **Cardinality**
    - Total number of tuples present in a relation.

![](./img/cardin.png)

- **Relational database schema**
    - A set of relation schemas together with a set of integrity constraints.

- **Relation state (or relation instance)**
    - Set of tuples at a given time.
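
The terms above can be made concrete with a small Python sketch. The STUDENT schema and its tuples are hypothetical sample data, not taken from the figures.

```python
# Hypothetical relation schema STUDENT(Roll_no, Name, Age):
# an ordered list of attribute names.
STUDENT_SCHEMA = ("Roll_no", "Name", "Age")

# One relation state (instance): the set of tuples present at a given time.
student_state = {
    (1, "Asha", 20),
    (2, "Ravi", 21),
    (3, "Meena", 20),
}

def degree(schema):
    """Degree (arity): number of attributes in the relation schema."""
    return len(schema)

def cardinality(state):
    """Cardinality: number of tuples in the current relation state."""
    return len(state)

print(degree(STUDENT_SCHEMA))      # 3
print(cardinality(student_state))  # 3
```

The degree changes only when the schema changes, while the cardinality changes with every insertion or deletion.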


## 4.2 Characteristics of Relations

### Ordering of Tuples within a Relation
- A relation is a set of tuples.
- Tuples in a relation need not have any particular order.

![](./img/ord_tup.png)

### Ordering of Values within a Tuple
- An **n-tuple** is an ordered list of n values, so the ordering of values in a tuple is important.
<br><br>
- Under an alternative definition of a relation, the ordering of values in a tuple is unnecessary.
- If a tuple is defined as a set of **(\<attribute>, \<value>)** pairs, the ordering of attributes is not important.

![](./img/ord_vals.png)
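
Both views of a tuple, and the unordered nature of a relation, can be illustrated with plain Python values. This is only a sketch with made-up data: a Python `tuple` stands in for the ordered definition, a `dict` for the (attribute, value) definition, and a `set` for the relation.

```python
# Ordered definition: position carries the meaning, so order matters.
t1 = ("Asha", 20)
t2 = (20, "Asha")
ordered_equal = (t1 == t2)

# (attribute, value) definition: each value is labelled, so order is irrelevant.
u1 = {"Name": "Asha", "Age": 20}
u2 = {"Age": 20, "Name": "Asha"}
labelled_equal = (u1 == u2)

# A relation is a set of tuples, so the order of tuples does not matter either.
r1 = {("Asha", 20), ("Ravi", 21)}
r2 = {("Ravi", 21), ("Asha", 20)}
relations_equal = (r1 == r2)

print(ordered_equal)    # False
print(labelled_equal)   # True
print(relations_equal)  # True
```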

### Values & Nulls in a Tuple
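- Each value in a tuple is an **atomic value**.
- A special **NULL** value represents an attribute value that is unknown or does not apply to the tuple.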

![](./img/atom_nul.png)

### Interpretation of a Relation
- The relation schema can be interpreted as a declaration or assertion.
- Each tuple can be interpreted as a fact.

## 4.3 Relational Model Constraints

### Constraints on Databases
- **Inherent model-based:** Inherent in the data model.
- **Schema-based:** Defined directly in the schemas of the data model.
- **Application-based (semantic):** Must be expressed and enforced by the application programs.

#### Schema-based Constraints

- **Domain Constraints:**
    - Each attribute value must be **atomic** and belong to the domain of its attribute.
    - Enforced by a **data type check**.

![](./img/d_cons.png)

- **Key Constraints**:
    - An attribute, or set of attributes, that can uniquely identify each tuple in a relation is called a **key**.

    ![](./img/key_cons.png)

    - A **superkey** is a set of attributes for which no two distinct tuples can have the same combination of values.
    - Every relation has at least one superkey: the set of all its attributes.

    ![](./img/suup.png)

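    - A key satisfies 2 constraints:
        - Two distinct tuples cannot have identical values for all the attributes in the key.
        - It is a **minimal superkey**: removing any attribute leaves a set that is no longer a superkey.
    - **Candidate Keys**:
        - The minimal sets of attributes that uniquely identify the tuples in a relation. When a relation has several, one of them is chosen as the **primary key**.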


- **Constraints on Null Values**:
    - Specify whether null values are permitted for an attribute or not (**NOT NULL**).

- **Entity Integrity Constraint**:
    - States that **no primary key value can be null**.

- **Referential Integrity Constraint**:
    - Specified between 2 relations.
    - States that a tuple in one relation that refers to another relation must refer to an **existing tuple** in that relation.
    
    ![](./img/intj.png)

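    - A **Foreign Key** (FK) of relation R1 that references relation R2 must satisfy the following:
        - The FK attributes have the same domain as the primary key (PK) attributes of R2.
        - The value of FK in a tuple t1 of R1 either occurs as the value of PK in some tuple t2 of R2, i.e.
            - **t1[FK] = t2[PK]**, or is null.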
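
The superkey and key definitions above can be tested against a relation state. The sketch below is illustrative only: the CAR data and the helpers `is_superkey` and `is_key` are hypothetical. It also checks a single state, whereas a real key constraint is a property of the schema and must hold in every valid state.

```python
# Hypothetical CAR relation state; each dict is one tuple.
CAR = [
    {"License": "KA-01", "EngineNo": "E100", "Make": "Ford", "Year": 2020},
    {"License": "KA-02", "EngineNo": "E200", "Make": "Ford", "Year": 2020},
    {"License": "KA-03", "EngineNo": "E300", "Make": "Audi", "Year": 2021},
]

def is_superkey(relation, attrs):
    """True if no two tuples agree on all attributes in attrs."""
    seen = set()
    for t in relation:
        value = tuple(t[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(relation, attrs):
    """A key is a minimal superkey: dropping any one attribute breaks uniqueness."""
    if not is_superkey(relation, attrs):
        return False
    return all(
        not is_superkey(relation, [a for a in attrs if a != removed])
        for removed in attrs
    )

print(is_superkey(CAR, ["License", "Make"]))  # True: unique, but not minimal
print(is_key(CAR, ["License", "Make"]))       # False: License alone is enough
print(is_key(CAR, ["License"]))               # True: a candidate key
print(is_key(CAR, ["EngineNo"]))              # True: another candidate key
print(is_superkey(CAR, ["Make", "Year"]))     # False: two Ford/2020 tuples
```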

## 4.4 Manipulate Operations

### The INSERT operation

- **Domain constraints**: Violated if the attribute value doesn't match the domain.
  - **Handling**: Validate the data type, format, and range before insertion.

- **Key constraint**: Violated if the primary key is duplicated.
  - **Handling**: Check for duplicates or use unique constraints/sequences.

- **Entity Integrity**: Violated if the primary key is null.
  - **Handling**: Ensure all primary key values are non-null before insertion.

- **Referential Integrity**: Violated if the foreign key doesn't match any tuples in the referenced relation.
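  - **Handling**: Validate the foreign key before insertion, insert the referenced tuple first, or set the foreign key to null if nulls are allowed. SQL cascading options apply only to `ON DELETE` and `ON UPDATE`, not to inserts.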
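
The four INSERT violations can be reproduced with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the `department` and `employee` tables are hypothetical, the domain is modelled with a `CHECK` clause, and SQLite needs `PRAGMA foreign_keys = ON` as well as an explicit `NOT NULL` on a non-integer primary key to enforce the last two constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        dno   INTEGER PRIMARY KEY,
        dname TEXT NOT NULL
    );
    CREATE TABLE employee (
        ssn TEXT PRIMARY KEY NOT NULL,                -- key + entity integrity
        age INTEGER CHECK (age BETWEEN 20 AND 70),    -- domain constraint
        dno INTEGER REFERENCES department(dno)        -- referential integrity
    );
    INSERT INTO department VALUES (1, 'Research');
    INSERT INTO employee VALUES ('111', 30, 1);
""")

def try_insert(row):
    """Return None on success, or the DBMS error message if the row is rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO employee (ssn, age, dno) VALUES (?, ?, ?)", row)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

results = {
    "domain":      try_insert(("222", 15, 1)),   # age outside the domain
    "key":         try_insert(("111", 40, 1)),   # duplicate primary key
    "entity":      try_insert((None, 40, 1)),    # null primary key
    "referential": try_insert(("333", 40, 9)),   # department 9 does not exist
    "valid":       try_insert(("444", 40, 1)),   # satisfies every constraint
}
for name, error in results.items():
    print(name, "->", error or "inserted")
```

Only the last row is stored; the DBMS rejects the other four, which is the default handling for an INSERT violation.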

### The DELETE operation

- **Referential Integrity**: Violated if dependent tuples exist in other tables.
  - **Handling**: Use `ON DELETE CASCADE`, `SET NULL`, or `RESTRICT` to manage dependent records.
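
The three delete options can be compared side by side using `sqlite3`. The schema below is hypothetical: each referencing table uses a different `ON DELETE` action so the effects are visible in one run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys
conn.executescript("""
    CREATE TABLE department (dno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE employee (
        ssn TEXT PRIMARY KEY NOT NULL,
        dno INTEGER REFERENCES department(dno) ON DELETE SET NULL
    );
    CREATE TABLE project (
        pno INTEGER PRIMARY KEY,
        dno INTEGER REFERENCES department(dno) ON DELETE CASCADE
    );
    CREATE TABLE location (
        lname TEXT PRIMARY KEY NOT NULL,
        dno   INTEGER REFERENCES department(dno) ON DELETE RESTRICT
    );
    INSERT INTO department VALUES (1, 'Research');
    INSERT INTO department VALUES (2, 'Admin');
    INSERT INTO employee VALUES ('111', 1);
    INSERT INTO project  VALUES (10, 1);
    INSERT INTO location VALUES ('Houston', 2);
""")

# Deleting department 1 nulls the employee's FK and removes the dependent project.
with conn:
    conn.execute("DELETE FROM department WHERE dno = 1")
employee_dno = conn.execute("SELECT dno FROM employee WHERE ssn = '111'").fetchone()[0]
project_count = conn.execute("SELECT COUNT(*) FROM project").fetchone()[0]

# Deleting department 2 is refused because a location still references it.
try:
    with conn:
        conn.execute("DELETE FROM department WHERE dno = 2")
    restricted = False
except sqlite3.IntegrityError:
    restricted = True

print(employee_dno)   # None (SET NULL)
print(project_count)  # 0 (CASCADE)
print(restricted)     # True (RESTRICT)
```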

### The UPDATE operation

- **Domain Constraint**: Violated if the updated value doesn't match the domain.
  - **Handling**: Validate data type, format, and range before updating.

- **Key Constraint**: Violated if the primary key is updated to a duplicate value.
  - **Handling**: Check for duplicates and enforce unique constraints.

- **Referential Integrity**: Violated if the update changes a foreign key or referenced primary key incorrectly.
  - **Handling**: Use `ON UPDATE CASCADE` or validate the new key value, or restrict updates with `ON UPDATE RESTRICT`.

- **Entity Integrity**: Violated if the update sets a primary key or `NOT NULL` attribute to null.
  - **Handling**: Ensure non-null values for key and `NOT NULL` fields.
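
A short `sqlite3` sketch of the UPDATE cases, again with a hypothetical `department`/`employee` schema. `ON UPDATE CASCADE` propagates a changed primary key to the referencing tuples, while the other three updates are rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce foreign keys
conn.executescript("""
    CREATE TABLE department (dno INTEGER PRIMARY KEY, dname TEXT NOT NULL);
    CREATE TABLE employee (
        ssn TEXT PRIMARY KEY NOT NULL,
        dno INTEGER REFERENCES department(dno) ON UPDATE CASCADE
    );
    INSERT INTO department VALUES (1, 'Research');
    INSERT INTO department VALUES (2, 'Admin');
    INSERT INTO employee VALUES ('111', 1);
    INSERT INTO employee VALUES ('222', 2);
""")

def try_update(sql):
    """Return None on success, or the error message if a constraint rejects the update."""
    try:
        with conn:
            conn.execute(sql)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

# Changing a referenced primary key cascades to the employee tuple.
cascade_error = try_update("UPDATE department SET dno = 5 WHERE dno = 1")
cascaded_dno = conn.execute("SELECT dno FROM employee WHERE ssn = '111'").fetchone()[0]

# Referential integrity: department 9 does not exist.
fk_error = try_update("UPDATE employee SET dno = 9 WHERE ssn = '111'")
# Key constraint: department 2 already exists.
key_error = try_update("UPDATE department SET dno = 2 WHERE dno = 5")
# Entity integrity: a primary key cannot become null.
null_error = try_update("UPDATE employee SET ssn = NULL WHERE ssn = '222'")

print(cascaded_dno)  # 5
print(fk_error, key_error, null_error, sep="\n")
```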

# 5. Relational Algebra Operations

## Unary
Unary operations take only **one relation** (table) as input and produce another relation as output.

### The SELECT operation
Filters rows based on a specified condition.<br>
**Syntax:** σ<small>\<selection_condition></small> (R) <br>
- **R** is the relation name.

#### Ex: 
Select the EMPLOYEE tuples whose Department Number is 2.<br>
![](./img/ex1.png)
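
The same selection can be sketched in Python, modelling a relation as a list of dicts. The EMPLOYEE rows and the `select` helper are hypothetical and do not reproduce the data in the figure.

```python
# Hypothetical EMPLOYEE relation; each dict is one tuple.
EMPLOYEE = [
    {"FName": "Asha",  "LName": "Rao",  "DNo": 2, "Salary": 30000},
    {"FName": "Ravi",  "LName": "Nair", "DNo": 1, "Salary": 40000},
    {"FName": "Meena", "LName": "Iyer", "DNo": 2, "Salary": 25000},
]

def select(relation, condition):
    """σ_condition(R): keep only the tuples for which condition is true."""
    return [t for t in relation if condition(t)]

# σ DNo=2 (EMPLOYEE)
dept2 = select(EMPLOYEE, lambda t: t["DNo"] == 2)
print([t["FName"] for t in dept2])  # ['Asha', 'Meena']
```

SELECT never changes the attributes of the relation, only the number of tuples.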

### The PROJECTION operation
Selects specific columns from a relation, eliminating duplicates.<br>
**Syntax:** π<small>\<attribute_list></small> (R)
- **R** is the relation name.

#### Ex:
To list each employee's first name, last name and salary, we can use the PROJECTION operation.<br>
![](./img/ex2.png)
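
A minimal Python sketch of projection, including the duplicate elimination mentioned above. The EMPLOYEE rows and the `project` helper are hypothetical.

```python
# Hypothetical EMPLOYEE relation; each dict is one tuple.
EMPLOYEE = [
    {"FName": "Asha",  "LName": "Rao",  "DNo": 2, "Salary": 30000},
    {"FName": "Ravi",  "LName": "Nair", "DNo": 1, "Salary": 40000},
    {"FName": "Meena", "LName": "Iyer", "DNo": 2, "Salary": 25000},
]

def project(relation, attributes):
    """π_attributes(R): keep the listed attributes and drop duplicate tuples."""
    result = []
    for t in relation:
        projected = {a: t[a] for a in attributes}
        if projected not in result:  # the result is a set, so no duplicates
            result.append(projected)
    return result

names_and_salaries = project(EMPLOYEE, ["FName", "LName", "Salary"])
departments = project(EMPLOYEE, ["DNo"])

print(len(names_and_salaries))  # 3
print(departments)              # [{'DNo': 2}, {'DNo': 1}]
```

Projecting on `DNo` returns two tuples from three because the repeated value 2 is kept only once.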

### The RENAME operation
Renames the attributes, the relation itself, or both.<br>
**Syntax:** ρ<small>\<S(B1,B2,...)></small> (R)
- **R** is the 'old' relation name.
- **S** is the 'new' relation name.
- **B1, B2, ...** are the new attribute names.

#### Ex: 
Rename the attributes 'FName' and 'LName' to 'FirstName' and 'LastName' for tuples with DNo equal to 3.<br>
![](./img/ex3.png)
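
The rename example can be sketched in Python as a selection followed by a renaming step. The EMPLOYEE rows and the `rename` helper are hypothetical.

```python
# Hypothetical EMPLOYEE relation; each dict is one tuple.
EMPLOYEE = [
    {"FName": "Asha", "LName": "Rao",  "DNo": 3},
    {"FName": "Ravi", "LName": "Nair", "DNo": 1},
]

def rename(relation, mapping):
    """ρ: return a copy of the relation whose attributes are renamed per mapping.
    Attributes missing from mapping keep their old names."""
    return [
        {mapping.get(attr, attr): value for attr, value in t.items()}
        for t in relation
    ]

dept3 = [t for t in EMPLOYEE if t["DNo"] == 3]  # σ DNo=3 (EMPLOYEE)
renamed = rename(dept3, {"FName": "FirstName", "LName": "LastName"})
print(renamed)  # [{'FirstName': 'Asha', 'LastName': 'Rao', 'DNo': 3}]
```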

## Set Theory
Set theory operations treat relations (tables) as sets of tuples.

#### Union Compatibility
Union compatibility is the requirement that two relations must meet before UNION, INTERSECTION or MINUS can be applied to them. The Cartesian product does not require it. For two relations to be union-compatible, they must satisfy the following conditions:

#### Conditions for Union Compatibility:
- **Same number of attributes**: Both relations must have the same number of columns (attributes).
- **Corresponding attributes must have the same domain**: The attributes in corresponding columns must have the same data type or domain. For example, if the first column of one relation is of type integer, the first column of the other relation must also be of type integer.

### The UNION Operation (R ∪ S)
Includes all tuples that are either in **R** or in **S** or in both **R and S**.

![](./img/union.png)

### The INTERSECTION Operation (R ∩ S)
Includes all tuples that are in both **R and S**.

![](./img/inter.png)

### The MINUS Operation (R - S)
Includes all tuples that are in **R** and **not in S**.

![](./img/minus.png)
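
Because relations are sets of tuples, Python's `set` operators map directly onto UNION, INTERSECTION and MINUS. The sketch below uses hypothetical STUDENT and INSTRUCTOR data and a hypothetical `union_compatible` helper that checks the two conditions listed earlier.

```python
# Hypothetical schemas as (attribute name, domain) pairs, plus relation states.
STUDENT_SCHEMA = [("FName", str), ("LName", str)]
INSTRUCTOR_SCHEMA = [("FirstName", str), ("LastName", str)]
STUDENT = {("Asha", "Rao"), ("Ravi", "Nair"), ("Meena", "Iyer")}
INSTRUCTOR = {("Ravi", "Nair"), ("John", "Smith")}

def union_compatible(schema_r, schema_s):
    """Same number of attributes, and matching domains position by position.
    Attribute names are allowed to differ."""
    return len(schema_r) == len(schema_s) and all(
        dom_r == dom_s for (_, dom_r), (_, dom_s) in zip(schema_r, schema_s)
    )

if not union_compatible(STUDENT_SCHEMA, INSTRUCTOR_SCHEMA):
    raise ValueError("relations are not union compatible")

union = STUDENT | INSTRUCTOR          # R ∪ S: in R, in S, or in both
intersection = STUDENT & INSTRUCTOR   # R ∩ S: in both R and S
difference = STUDENT - INSTRUCTOR     # R - S: in R but not in S

print(len(union))     # 4
print(intersection)   # {('Ravi', 'Nair')}
print(len(difference))  # 2
```

MINUS is not commutative: `INSTRUCTOR - STUDENT` would contain only the tuple for John Smith.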

### The CARTESIAN PRODUCT Operation (R × S)
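Combines every tuple of **R** with every tuple of **S**. The result has the attributes of both relations, and union compatibility is not required.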

![](./img/cart_pr.png)
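
A small Python sketch of the Cartesian product using `itertools.product`. The sample relations are hypothetical, and the helper assumes the two relations have no attribute names in common.

```python
from itertools import product

# Hypothetical relations with disjoint attribute names.
EMPLOYEE = [{"Name": "Asha"}, {"Name": "Ravi"}]
PROJECT = [{"PName": "Alpha"}, {"PName": "Beta"}, {"PName": "Gamma"}]

def cartesian_product(r, s):
    """R × S: pair every tuple of R with every tuple of S.
    Assumes R and S share no attribute names."""
    return [{**tr, **ts} for tr, ts in product(r, s)]

pairs = cartesian_product(EMPLOYEE, PROJECT)
print(len(pairs))  # 6, i.e. |R| * |S|
print(pairs[0])    # {'Name': 'Asha', 'PName': 'Alpha'}
```

If R has n attributes and S has m, every result tuple has n + m attributes.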

## Binary

### The JOIN Operation
Combines related tuples from 2 relations into a single relation.<br>
- **Syntax:** R ⋈<small>\<join condition></small> S
- **R** is the first relation.
- **S** is the second relation.

![](./img/join.png)

Same as:<br>
![](./img/smae.png)

### The THETA Join
A join that combines two relations based on a condition involving a comparison operator (such as =, >, <).<br>
- **Syntax:**<br> 
<small>\<join condition></small> has the form A<small>i</small> θ B<small>j</small>, so <br>
R ⋈ <small>A<small>i</small> θ  B<small>j</small></small> S
- **A<small>i</small>** is an attribute of **R**.
- **B<small>j</small>** is an attribute of **S**.
- **θ** is one of {=, <, <=, >, >=, !=}.

### The EQUIJOIN Operation
A theta join in which the only comparison operator used is **'='**.

### The NATURAL JOIN Operation ( * )
Can be performed only if the two relations have a common attribute. It is an equijoin on that attribute which keeps a single copy of it in the result.
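
The three join variants differ only in the condition and in how the shared attribute is handled. The following Python sketch uses hypothetical relations and helpers (`theta_join`, `natural_join`); comparison operators come from the standard `operator` module.

```python
import operator

# Hypothetical relations; DEPT is DEPARTMENT with DNumber renamed to DNo.
EMPLOYEE = [{"Name": "Asha", "DNo": 1}, {"Name": "Ravi", "DNo": 2}, {"Name": "Meena", "DNo": 3}]
DEPARTMENT = [{"DNumber": 1, "DName": "Research"}, {"DNumber": 2, "DName": "Admin"}]
DEPT = [{"DNo": 1, "DName": "Research"}, {"DNo": 2, "DName": "Admin"}]

def theta_join(r, s, attr_r, theta, attr_s):
    """R ⋈ (Ai θ Bj) S: keep the combined tuples where theta(Ai, Bj) holds."""
    return [{**tr, **ts} for tr in r for ts in s if theta(tr[attr_r], ts[attr_s])]

def natural_join(r, s):
    """R * S: equality on all common attributes, kept once in the result."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]
    return [{**tr, **ts} for tr in r for ts in s if all(tr[a] == ts[a] for a in common)]

# Theta join with '<': employees whose DNo is smaller than a department number.
less_than = theta_join(EMPLOYEE, DEPARTMENT, "DNo", operator.lt, "DNumber")
# Equijoin: theta is '='; both DNo and DNumber appear in the result.
equi = theta_join(EMPLOYEE, DEPARTMENT, "DNo", operator.eq, "DNumber")
# Natural join: the common attribute DNo appears only once.
natural = natural_join(EMPLOYEE, DEPT)

print(less_than)         # [{'Name': 'Asha', 'DNo': 1, 'DNumber': 2, 'DName': 'Admin'}]
print(sorted(equi[0]))   # ['DName', 'DNo', 'DNumber', 'Name']
print(natural)
```

Meena's tuple disappears from the equijoin and the natural join because no department has number 3.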

### The DIVISION Operation ( ÷ )
Returns the tuples from one relation (A) that are paired with every tuple in another relation (B). It is typically used to find entities in A that are related to **all** entities in B.

![](./img/rel_alg.png)
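
Division can be sketched in Python by grouping A on its non-shared attributes and keeping the groups that cover all of B. The WORKS_ON and REQUIRED relations and the `divide` helper are hypothetical.

```python
# Hypothetical relations: who works on which project, and the projects of interest.
WORKS_ON = [
    {"Emp": "Asha",  "Project": "P1"},
    {"Emp": "Asha",  "Project": "P2"},
    {"Emp": "Ravi",  "Project": "P1"},
    {"Emp": "Meena", "Project": "P2"},
    {"Emp": "Meena", "Project": "P1"},
]
REQUIRED = [{"Project": "P1"}, {"Project": "P2"}]

def divide(a, b, shared):
    """A ÷ B: values of A's remaining attributes that occur with every tuple of B.
    shared lists the attributes of B, which must also be attributes of A."""
    rest = [attr for attr in a[0] if attr not in shared] if a else []
    required = {tuple(t[attr] for attr in shared) for t in b}
    seen = {}
    for t in a:
        key = tuple(t[attr] for attr in rest)
        seen.setdefault(key, set()).add(tuple(t[attr] for attr in shared))
    # Keep a group only if its set of shared values covers everything in B.
    return [dict(zip(rest, key)) for key, values in seen.items() if required <= values]

on_all_projects = divide(WORKS_ON, REQUIRED, ["Project"])
print(on_all_projects)  # [{'Emp': 'Asha'}, {'Emp': 'Meena'}]
```

Ravi is excluded because he is not paired with project P2.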

## Additional Operations

### Aggregate Functions (Ｇ)
Used to perform calculations on a set of values and return a single value for each group of tuples.
- **Syntax:** <small>\<grouping_attributes></small>Ｇ<small>\<function_list></small> (R)
- **\<grouping_attributes>:** list of attributes in **R** used to group the tuples.
- **\<function_list>:** list of (\<function> \<attribute>) pairs.

![](./img/aggr.png)
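
A Python sketch of grouping followed by aggregation. The EMPLOYEE rows, the `aggregate` helper and the result column names are hypothetical; each entry of the function list is written as (result name, function, attribute).

```python
# Hypothetical EMPLOYEE relation; each dict is one tuple.
EMPLOYEE = [
    {"Ssn": "111", "DNo": 1, "Salary": 30000},
    {"Ssn": "222", "DNo": 1, "Salary": 40000},
    {"Ssn": "333", "DNo": 2, "Salary": 25000},
]

def average(values):
    return sum(values) / len(values)

def aggregate(relation, grouping, functions):
    """<grouping> G <functions> (R): one result tuple per group of tuples
    that agree on the grouping attributes."""
    groups = {}
    for t in relation:
        key = tuple(t[a] for a in grouping)
        groups.setdefault(key, []).append(t)
    result = []
    for key, members in groups.items():
        row = dict(zip(grouping, key))
        for label, func, attr in functions:
            row[label] = func([m[attr] for m in members])
        result.append(row)
    return result

# DNo G COUNT Ssn, AVERAGE Salary (EMPLOYEE)
summary = aggregate(
    EMPLOYEE,
    ["DNo"],
    [("Count_ssn", len, "Ssn"), ("Average_salary", average, "Salary")],
)
for row in summary:
    print(row)
```

With an empty grouping list the whole relation forms a single group, so the result is one tuple of totals.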

### Recursive Closure
Recursive closure operations compute transitive relationships within a relation. They are often used on hierarchical or graph-like data to find all direct and indirect connections between related entities, for example every employee supervised by a given employee at any level.

![](./img/rec.png)
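
One way to compute a recursive closure is to keep joining the relation with itself until no new pairs appear. The sketch below assumes a hypothetical SUPERVISION relation of (employee, supervisor) pairs.

```python
# Hypothetical SUPERVISION relation as (employee, supervisor) pairs.
SUPERVISION = {("Ravi", "Asha"), ("Meena", "Ravi"), ("Kiran", "Meena")}

def transitive_closure(pairs):
    """Repeatedly join the relation with itself until no new pairs are produced."""
    closure = set(pairs)
    while True:
        # (a, b) and (b, d) together imply the indirect pair (a, d).
        new_pairs = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

closure = transitive_closure(SUPERVISION)
# Everyone supervised by Asha, directly or indirectly.
under_asha = {emp for (emp, sup) in closure if sup == "Asha"}

print(len(closure))        # 6
print(sorted(under_asha))  # ['Kiran', 'Meena', 'Ravi']
```

The number of iterations depends on the depth of the hierarchy, which is why this operation cannot be written as a fixed sequence of joins.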

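### OUTER JOIN
Keeps the tuples that have no matching tuple in the other relation and pads their missing attributes with nulls. A left outer join keeps every tuple of **R**, a right outer join keeps every tuple of **S**, and a full outer join keeps every tuple of both.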

#### Left Outer Join
- **Syntax:** R ⟕ <small>\<condition></small> S

![](./img/left_j.png)

#### Right Outer Join
- **Syntax:** R ⟖ <small>\<condition></small> S

![](./img/right_j.png)

#### Full Outer Join
- **Syntax:** R ⟗ <small>\<condition></small> S

![](./img/full_j.png)
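
The three outer joins can be sketched in Python with `None` standing in for null. The relations and helper functions are hypothetical; the attribute lists `r_attrs` and `s_attrs` tell each helper which attributes to pad.

```python
# Hypothetical relations: Ravi has no department, Admin has no employee.
EMPLOYEE = [{"Name": "Asha", "DNo": 1}, {"Name": "Ravi", "DNo": None}]
DEPARTMENT = [{"DNumber": 1, "DName": "Research"}, {"DNumber": 2, "DName": "Admin"}]

def left_outer_join(r, s, condition, s_attrs):
    """Keep every tuple of r; pad with None where no tuple of s matches."""
    result = []
    for tr in r:
        matches = [ts for ts in s if condition(tr, ts)]
        if matches:
            result.extend({**tr, **ts} for ts in matches)
        else:
            result.append({**tr, **{a: None for a in s_attrs}})
    return result

def right_outer_join(r, s, condition, r_attrs):
    """Keep every tuple of s: a left outer join with the roles swapped."""
    return left_outer_join(s, r, lambda ts, tr: condition(tr, ts), r_attrs)

def full_outer_join(r, s, condition, r_attrs, s_attrs):
    """Keep every tuple of both relations."""
    result = left_outer_join(r, s, condition, s_attrs)
    for ts in s:
        if not any(condition(tr, ts) for tr in r):
            result.append({**{a: None for a in r_attrs}, **ts})
    return result

same_dept = lambda e, d: e["DNo"] == d["DNumber"]
left = left_outer_join(EMPLOYEE, DEPARTMENT, same_dept, ["DNumber", "DName"])
right = right_outer_join(EMPLOYEE, DEPARTMENT, same_dept, ["Name", "DNo"])
full = full_outer_join(EMPLOYEE, DEPARTMENT, same_dept, ["Name", "DNo"], ["DNumber", "DName"])

print(len(left), len(right), len(full))  # 2 2 3
print(left[1])  # {'Name': 'Ravi', 'DNo': None, 'DNumber': None, 'DName': None}
```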

#### OUTER UNION
Takes the union of two relations that are not union compatible or are only partially compatible. Attributes that a tuple does not have are filled with nulls.

![](./img/union_j.png)

# Sources
- <a href = "https://www.youtube.com/playlist?list=PLBlnK6fEyqRi_CUQ-FXxgzKQ1dwr_ZJWZ">Database Management Systems by Neso Academy</a>
- <a href="https://www.cs.dartmouth.edu/~cs61/Resources/Examples/RelationalAlgebra/RAsymbols/rasymbols.html#:~:text=Aggregate,function%3A%20%EF%BC%A7">Relational Algebra special characters</a>