# **Indexing and Hashing/5 – Index Design**

## **Learning Outcomes**

1. **Discuss** how indexes can be created in SQL.
2. **Deliberate** on good index designs using guidelines for indexing.

---

## **1. Introduction & Context**

* This is the concluding part of the **Indexing and Hashing** topic.
* Previously discussed:

  * **Hashing** (Static & Dynamic).
  * **Ordered Indexing vs Hashing** comparison.
  * **Bitmap Index** for columns with **limited distinct values**.
* In this module:

  * Learn how to **create indexes** in SQL.
  * Guidelines on:

    * **What to index**
    * **How to index**
    * **What not to index**
    * Balancing performance trade-offs.

---

## **2. Creating Indexes in SQL**

### **2.1 Basic Syntax**

```sql
CREATE INDEX index_name
ON table_name (column_list);
```

* **index\_name**: User-defined name for the index.
* **table\_name**: The relation on which the index is created.
* **column\_list**: One or more attributes.

  * **Single attribute** → Simple Index.
  * **Multiple attributes** → Composite Index.

#### **Example**

```sql
CREATE INDEX b_index
ON branch (branch_name);
```

* Creates `b_index` on the `branch_name` column of the `branch` table.
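
To see the effect of such an index, the statement can be tried in an in-memory SQLite database from Python. This is only a sketch: SQLite stands in for whatever DBMS the course uses, and the `branch_city` and `assets` columns and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only branch_name comes from the notes above.
conn.execute("CREATE TABLE branch (branch_name TEXT, branch_city TEXT, assets REAL)")
conn.executemany(
    "INSERT INTO branch VALUES (?, ?, ?)",
    [
        ("Downtown", "Brooklyn", 9000000.0),
        ("Perryridge", "Horseneck", 1700000.0),
        ("Brighton", "Brooklyn", 7100000.0),
    ],
)
conn.execute("CREATE INDEX b_index ON branch (branch_name)")

# Ask SQLite how it would evaluate an equality lookup on the indexed column.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM branch WHERE branch_name = ?",
    ("Perryridge",),
).fetchall()
plan_text = " ".join(row[3] for row in plan_rows)  # column 3 holds the description
print(plan_text)
```

The printed plan should name `b_index`, which indicates that the lookup goes through the index rather than scanning the whole table.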

---

### **2.2 Creating Unique Index**

* Declares the search key to be a **candidate key**: the DBMS rejects any insert or update that would duplicate an existing key value.
* **Syntax**:

```sql
CREATE UNIQUE INDEX index_name
ON table_name (column_name);
```

* If DBMS already supports **UNIQUE constraints**, prefer using them in table design.
* Use unique index only if needed explicitly.
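
A minimal sketch of the uniqueness check, again using SQLite from Python as a stand-in DBMS. The `emp_id` column, the index name `emp_id_uq`, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, ename TEXT)")
conn.execute("CREATE UNIQUE INDEX emp_id_uq ON employee (emp_id)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha')")

try:
    # Same emp_id as the existing row, so the unique index must reject it.
    conn.execute("INSERT INTO employee VALUES (1, 'Ravi')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(duplicate_rejected, row_count)
```

The second insert fails with an integrity error, so the table still holds a single row.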

---

### **2.3 Dropping Index**

```sql
DROP INDEX index_name;
```

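* Deletes the index; the table and its data are unaffected.
* Similar to `DROP TABLE` for tables.
* Some systems (e.g. MySQL, SQL Server) also require the table name: `DROP INDEX index_name ON table_name;`.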

---

### **2.4 Optional Parameters in CREATE INDEX**

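`CREATE INDEX` is not part of the SQL standard, so the available options are vendor-specific; the ones in this section follow Oracle. A DBMS may let you specify:

1. **Type of Index** (e.g., clustered, bitmap).
2. **Clustering options**.
3. **Composite Index** columns (Oracle allows at most 32 columns in a B-tree index; indexes that wide are rare in practice).

   * Composite key size must be ≤ roughly **half of a data block** (slightly less due to overhead).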

---

### **2.5 Examples**

```sql
CREATE INDEX emp_ename
ON employee (ename);
```

* Creates a basic index on `ename`.

---

### **2.6 Advanced Options**

#### **a) TABLESPACE**

* Specifies which **tablespace** the index will be stored in.
* If not specified → goes to **default tablespace**.
* **Tablespace** = Physical storage partition for schema objects.

#### **b) STORAGE Options**

Control how storage is allocated:

* **INITIAL** → Size of the first extent.
* **NEXT** → Size of the second extent, allocated when the first is full.
* **PCTINCREASE** → Percentage by which the third and each later extent grows over the preceding one.
* Example:

  * `INITIAL 20K` → First extent = 20 KB.
  * `NEXT 20K` → Second extent = 20 KB (total allocation doubles to 40 KB).
  * `PCTINCREASE 75` → Third extent = 20 KB × 1.75 = 35 KB, i.e. 75% larger than the second.
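
The extent arithmetic can be checked with a few lines of Python. This is a simplified model, not Oracle's allocator: the helper `extent_sizes_kb` is hypothetical and ignores the rounding to whole blocks that a real system applies.

```python
def extent_sizes_kb(initial_kb, next_kb, pct_increase, count):
    """Sizes of the first `count` extents in KB, ignoring block rounding."""
    sizes = []
    for i in range(count):
        if i == 0:
            sizes.append(float(initial_kb))   # INITIAL
        elif i == 1:
            sizes.append(float(next_kb))      # NEXT
        else:
            # Each later extent grows by PCTINCREASE over the previous one.
            sizes.append(sizes[-1] * (1 + pct_increase / 100))
    return sizes


sizes = extent_sizes_kb(initial_kb=20, next_kb=20, pct_increase=75, count=4)
print(sizes)  # [20.0, 20.0, 35.0, 61.25]
```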

#### **c) PCTFREE**

* Percentage of space in each index block left free when the index is built, reserved for later insertions and updates.
* If `PCTFREE 0` → no free space reserved.

#### **d) COMPUTE STATISTICS**

* Collects statistics about the index (such as the number of distinct keys and leaf blocks) while it is built, for use by the query optimizer.

---

### **2.7 Index on Multiple Columns**

```sql
CREATE INDEX dept_sal_index
ON employee (department_name, salary);
```

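* Order of columns is **important** in composite indexes: entries are sorted by `department_name` first and by `salary` within each department, so the index helps conditions on `department_name` (alone or together with `salary`) but generally not conditions on `salary` alone.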
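
The effect of column order can be observed in SQLite from Python. This is a sketch under assumptions: the `ename` column and the sample rows are invented, and other systems (or SQLite after `ANALYZE`, which can enable skip-scan) may plan the second query differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ename TEXT, department_name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Asha", "Sales", 50000), ("Ravi", "Sales", 62000),
     ("Meena", "HR", 50000), ("Kiran", "IT", 71000)],
)
conn.execute("CREATE INDEX dept_sal_index ON employee (department_name, salary)")


def plan(sql):
    """Return SQLite's query-plan description for `sql` as one string."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Condition on the leading column: the composite index is usable.
by_dept = plan("SELECT * FROM employee WHERE department_name = 'Sales'")
# Condition on the second column only: no usable index prefix.
by_salary = plan("SELECT * FROM employee WHERE salary = 50000")

print(by_dept)
print(by_salary)
```

In this setup the first plan mentions `dept_sal_index`, while the second falls back to scanning the table.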

---

### **2.8 Function-Based Index**

* When a query applies a function to a column (e.g. `WHERE UPPER(ename) = 'SMITH'`), an ordinary index on `ename` is **not helpful**, because it stores the raw values rather than the function results.
* Need to index the **function result**:

```sql
CREATE INDEX upper_name_index
ON employee (UPPER(ename));
```

* Speeds up searches that use `UPPER(ename)`.
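
SQLite supports the same idea under the name "index on an expression", so the statement can be tried from Python. A sketch only: the `emp_id` and `salary` columns and the rows are hypothetical, and the query must use exactly the indexed expression for the index to apply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, ename TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 50000), (2, "ravi", 58000), (3, "ASHA", 62000)],
)
# Index the result of UPPER(ename), not the raw column value.
conn.execute("CREATE INDEX upper_name_index ON employee (UPPER(ename))")

query = "SELECT * FROM employee WHERE UPPER(ename) = 'ASHA'"
plan_text = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
matches = conn.execute(query).fetchall()

print(plan_text)
print(matches)
```

Both spellings of the name are found, and the plan should refer to `upper_name_index`.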

---

### **2.9 Bitmap Index in SQL**

* Used when a column has a **small number of distinct values**.
* Example:

```sql
CREATE BITMAP INDEX gender_bitmap
ON student (gender);
```

* **Working**:

  * For each distinct value → bit array of length = number of rows.
  * 1 → record has that value.
  * 0 → record doesn’t have that value.
* Query like:

```sql
SELECT *
FROM student
WHERE gender='F' AND semester=4;
```

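* Answered by a bitwise **AND** of the `gender='F'` bitmap and the `semester=4` bitmap (assuming `semester` also has a bitmap index); the 1 bits in the result mark the matching rows.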
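
The bitmap mechanism itself is easy to simulate in plain Python, using one integer per distinct value as the bit array. This is a toy model with made-up student rows, not how any particular DBMS stores its bitmaps.

```python
# (name, gender, semester); the row position is the bit position.
students = [
    ("Asha", "F", 4),
    ("Ravi", "M", 4),
    ("Meena", "F", 2),
    ("Kiran", "M", 2),
    ("Divya", "F", 4),
]


def build_bitmaps(rows, column):
    """Map each distinct value to an int whose bit i is 1 iff row i has it."""
    bitmaps = {}
    for i, row in enumerate(rows):
        bitmaps[row[column]] = bitmaps.get(row[column], 0) | (1 << i)
    return bitmaps


gender_bm = build_bitmaps(students, 1)
semester_bm = build_bitmaps(students, 2)

# WHERE gender = 'F' AND semester = 4  ->  bitwise AND of the two bitmaps.
hits = gender_bm["F"] & semester_bm[4]
matching = [students[i][0] for i in range(len(students)) if (hits >> i) & 1]
print(bin(hits), matching)
```

Only the rows whose bit is set in both bitmaps survive the AND, here rows 0 and 4.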

---

### **2.10 Multi-Key Access**

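When a query places conditions on multiple attributes (e.g. `A = a AND B = b`), there are several ways to use indexes:

* **Strategies**:

  1. Use the **index on one attribute** to fetch candidate records, then test the other condition on each.
  2. Do the same starting from the index on the other attribute.
  3. Use **both separate indexes** → intersect the two sets of record pointers, then fetch only the records in the intersection.
  4. Use a **composite index** on both attributes.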

**Ordering in composite index is critical** (lexicographic order applied).
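
The strategies can be compared on a toy example in Python, with dictionaries standing in for indexes and list positions standing in for record pointers. All names and rows here are illustrative assumptions.

```python
from collections import defaultdict

# Each record: (department_name, salary); its list position is its record id.
records = [("Sales", 50000), ("HR", 50000), ("Sales", 62000), ("Sales", 50000)]


def build_index(key_of):
    """Map each key to the set of record ids that carry it."""
    index = defaultdict(set)
    for rid, rec in enumerate(records):
        index[key_of(rec)].add(rid)
    return index


dept_idx = build_index(lambda r: r[0])             # index on department_name
sal_idx = build_index(lambda r: r[1])              # index on salary
comp_idx = build_index(lambda r: (r[0], r[1]))     # composite index

# Strategy 1: use one index, then test the other condition on each record.
s1 = {rid for rid in dept_idx["Sales"] if records[rid][1] == 50000}
# Strategy 3: intersect the pointer sets from the two separate indexes.
s3 = dept_idx["Sales"] & sal_idx[50000]
# Strategy 4: a single lookup in the composite index.
s4 = comp_idx[("Sales", 50000)]

print(s1, s3, s4)
```

All three return the same record ids; they differ in how many records or pointers must be touched along the way.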

---

## **3. Privileges for Creating Indexes**

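* Not every user can create an index (the privileges below are Oracle's).
* **Required Privileges**:

  * Table in your own schema → no separate privilege is needed; table owned by another user → the `INDEX` object privilege on that table.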
  * `CREATE ANY INDEX` → for other schemas.
  * Adequate **TABLESPACE quota** or `UNLIMITED TABLESPACE` privilege.
  * **Function-based index** additionally requires (in older Oracle releases; later ones dropped the requirement):

    * `QUERY REWRITE` privilege.
    * `QUERY_REWRITE_ENABLED = TRUE`.

---

## **4. Guidelines for Good Index Design**

### **Rule 0: Indexes → Access/Update Trade-off**

* More indexes:

  * **Faster query access**.
  * **Slower updates** (insert/delete/update).
  * More **disk space**.
* Don’t index everything — balance is essential.

---

### **Rule 1: Index the Correct Table**

* Create an index if queries frequently retrieve **fewer than about 15% of the rows** of a **large table**.
* This percentage is approximate and depends on:

  * Table scan speed.
  * Data clustering.
* Indexes are **more useful on columns with many distinct values**, since each lookup then returns few rows.
* **Primary & unique keys** → auto-indexed by DBMS.
* Foreign keys → good candidates for indexing (especially for joins).
* **Small tables** → often no index needed.

---

### **Rule 2: Index the Correct Column**

* Good for:

  * **High distinct values** → B-Tree index.
  * **Low distinct values** → Bitmap index.
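* If a column has many NULLs and queries select only the **non-NULL** rows:

  * Replace `col IS NOT NULL` with a comparison against a very small constant (for a numeric column, e.g. `col > -9.99 * POWER(10, 125)`), which lets the optimizer use the index.
* `LONG`, `LONG RAW`, and LOB (e.g. `BLOB`) columns → cannot be given an ordinary index.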

---

### **Rule 3: Limit the Number of Indexes per Table**

* Every insert or delete, and every update to an indexed column, must also update the affected indexes.
* More indexes → more update overhead.
* **Read-only** tables → more indexes acceptable.
* **Frequent updates** → fewer indexes.

---

### **Rule 4: Choose Order of Columns in Composite Index**

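* Put the **more selective** (more distinct values) column first, provided queries actually filter on it; a composite index is generally usable only when its leading column appears in the search condition.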
* Example:

  * Vendor-parts table:

    * Vendor ID → few distinct values.
    * Part Number → many distinct values.
  * Use `(part_number, vendor_id)` not `(vendor_id, part_number)`.
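
One rough way to apply this rule is to count distinct values per column and order the columns from most to least distinct. The sketch below does this in Python on invented vendor-parts rows; it deliberately ignores query frequency, which also matters when picking the leading column.

```python
# Hypothetical sample of the vendor-parts table: (vendor_id, part_number).
vendor_parts = [
    (1, "P-100"), (1, "P-101"), (2, "P-102"), (2, "P-103"),
    (3, "P-104"), (3, "P-105"), (1, "P-106"), (2, "P-107"),
]
columns = ["vendor_id", "part_number"]

# Number of distinct values seen in each column.
distinct_counts = {
    name: len({row[i] for row in vendor_parts}) for i, name in enumerate(columns)
}

# Most distinct (most selective) column first.
index_order = sorted(columns, key=lambda name: distinct_counts[name], reverse=True)
print(distinct_counts, index_order)
```

With 3 distinct vendors against 8 distinct parts, the suggested order is `(part_number, vendor_id)`, matching the guideline above.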

---

### **Rule 5: Gather Statistics**

* Add `COMPUTE STATISTICS` when creating or rebuilding an index so the optimizer has current figures for it.
* Periodically refresh statistics as data distribution changes.
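
`COMPUTE STATISTICS` is Oracle syntax. As a rough analogue that can be run locally, SQLite gathers index statistics with its `ANALYZE` command and stores them in the `sqlite_stat1` table. The sketch below uses SQLite from Python with made-up rows and is not a substitute for the Oracle clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ename TEXT, department_name TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Asha", "Sales"), ("Ravi", "Sales"), ("Meena", "HR"), ("Kiran", "IT")],
)
conn.execute("CREATE INDEX emp_ename ON employee (ename)")

# Gather optimizer statistics for all tables and indexes.
conn.execute("ANALYZE")
stats = conn.execute(
    "SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'employee'"
).fetchall()
print(stats)
```

Re-running `ANALYZE` after large data changes refreshes these figures, which is the periodic refresh the rule asks for.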

---

### **Rule 6: Drop Unused Indexes**

* Unused indexes:

  * Waste storage.
  * Slow down updates.
* Dropping a table → drops its indexes automatically.
* Drop specific indexes using `DROP INDEX`.
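
Both ways of getting rid of an index can be checked in SQLite from Python by listing the indexes recorded in its catalog table `sqlite_master`. The tables, index names, and the helper `index_names` are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ename TEXT)")
conn.execute("CREATE TABLE branch (branch_name TEXT)")
conn.execute("CREATE INDEX emp_ename ON employee (ename)")
conn.execute("CREATE INDEX b_index ON branch (branch_name)")


def index_names():
    """Names of all indexes currently recorded in the SQLite catalog."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    return {row[0] for row in rows}


before = index_names()

conn.execute("DROP INDEX emp_ename")       # drop one specific index
after_drop_index = index_names()

conn.execute("DROP TABLE branch")          # dropping a table drops its indexes
after_drop_table = index_names()

print(before, after_drop_index, after_drop_table)
```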

---

## **5. Summary**

* Learned how to:

  * Create different types of indexes in SQL (simple, composite, unique, bitmap, function-based).
  * Use options like **TABLESPACE**, **STORAGE**, **PCTFREE**, and **COMPUTE STATISTICS**.
* Index design guidelines emphasize:

  * Trade-off between read & write performance.
  * Selecting correct table & columns.
  * Limiting unnecessary indexes.
  * Considering column order in composite indexes.
  * Using statistics for optimization.
  * Dropping unused indexes.