## Index Objects
- Index objects are the fundamental components of a DataFrame or Series that hold the labels for its rows and columns.

### **Key Features of Index Objects**
1. **Immutable**:
   - Index objects are immutable, meaning you cannot change their values after they are created. However, you can create a new index and reassign it.

2. **Holds Metadata**:
   - Each Index stores the labels of one axis (rows or columns), together with metadata such as its name and dtype.

3. **Supports Various Data Types**:
   - An Index can contain different types of data like strings, integers, or timestamps, depending on the DataFrame or Series.

4. **Automatically Assigned**:
   - When creating a DataFrame or Series without explicitly specifying an index, Pandas assigns default integer-based indexing (0, 1, 2, …).
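
The immutability and default-index behaviour described above can be checked with a minimal sketch. The variable names (`idx`, `s`, `mutated`, `default_index`) are illustrative and not part of the original notebook.

```python
import pandas as pd

idx = pd.Index(["a", "b", "c"])

# In-place item assignment is rejected because an Index is immutable.
try:
    idx[0] = "z"
    mutated = True
except TypeError:
    mutated = False

# To "change" labels, build a new Index and reassign it to the axis.
s = pd.Series([10, 20, 30], index=idx)
s.index = pd.Index(["x", "y", "z"])

# Without an explicit index, pandas falls back to a RangeIndex (0, 1, 2, ...).
default_index = pd.Series([10, 20, 30]).index

print(mutated, list(s.index), type(default_index).__name__)
```

Reassigning `s.index` replaces the whole Index object; the original `idx` is left untouched.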

### **Advantages of Index Objects**
1. **Efficient Label-Based Operations**:
   - Provides fast lookup and alignment for labeled data.

2. **Supports Advanced Slicing**:
   - Allows complex slicing operations using labels.

3. **Enhanced Data Alignment**:
   - Automatic alignment of data in arithmetic operations.

4. **Metadata Storage**:
   - Can store descriptive information (e.g., `.name`) about data.
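
As a rough illustration of label-based alignment and slicing, the sketch below adds two Series whose labels only partly overlap. The Series `s1` and `s2` are made-up sample data.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Addition matches values by label, not by position.
# Labels present in only one operand produce NaN.
total = s1 + s2

# Label-based slicing includes both endpoints.
first_two = s1.loc["a":"b"]

print(total)
print(first_two)
```

Only `b` and `c` exist in both Series, so only those two labels receive a numeric sum.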


In [4]:
import numpy as np 
import pandas as pd
from pandas import Series, DataFrame

from IPython.core.interactiveshell import InteractiveShell
# Display the result of every expression in a cell, not only the last one
InteractiveShell.ast_node_interactivity = "all"

### Types of Index Objects

In [6]:
# Default Index (RangeIndex):
# Automatically assigned when you don’t specify an index.
# Lightweight and efficient for numerical indexing.
df = pd.DataFrame({'A': [10,20,30]})
df

    A
0  10
1  20
2  30


In [7]:
# Index:
# General-purpose index for a one-dimensional array of labels.
# Can hold mixed data types (stored with object dtype).
idx = pd.Index([1,2,3])
idx

Index([1, 2, 3], dtype='int64')

In [8]:
# MultiIndex:
# Used for hierarchical or multi-level indexing.
# Allows working with higher-dimensional data in a lower-dimensional DataFrame.
multi_idx = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1)])
multi_idx

MultiIndex([('A', 1),
            ('A', 2),
            ('B', 1)],
           )
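
A short sketch of how a MultiIndex is used for selection, assuming a Series built on the same tuples. The level names `group` and `item` and the values are hypothetical, chosen only for illustration.

```python
import pandas as pd

multi_idx = pd.MultiIndex.from_tuples(
    [("A", 1), ("A", 2), ("B", 1)],
    names=["group", "item"],  # hypothetical level names
)
s = pd.Series([100, 200, 300], index=multi_idx)

# Selecting on the outer level returns every row under that label,
# indexed by the remaining inner level.
group_a = s.loc["A"]

# A full tuple addresses exactly one row.
single = s.loc[("B", 1)]

print(group_a)
print(single)
```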

In [9]:
# DatetimeIndex:
# For date and time data.
# Supports date-related operations like filtering, slicing, and frequency conversion.
date_idx = pd.date_range(start='2023-01-01', periods=5)
date_idx

DatetimeIndex(['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04',
               '2023-01-05'],
              dtype='datetime64[ns]', freq='D')
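
The date-related operations mentioned in the comments can be sketched as follows, assuming a Series of sample values placed on the same date range.

```python
import pandas as pd

date_idx = pd.date_range(start="2023-01-01", periods=5)
s = pd.Series(range(5), index=date_idx)

# Slice by date strings; both endpoints are included.
window = s.loc["2023-01-02":"2023-01-04"]

# Frequency conversion: group the daily values into 2-day bins and sum them.
two_day = s.resample("2D").sum()

print(window)
print(two_day)
```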

In [10]:
# CategoricalIndex:
# For categorical data, providing memory-efficient operations for repeated labels
cat_idx = pd.CategoricalIndex(['a', 'b', 'a', 'c'])
cat_idx

CategoricalIndex(['a', 'b', 'a', 'c'], categories=['a', 'b', 'c'], ordered=False, dtype='category')

In [11]:
# IntervalIndex:
# Represents a set of intervals, used in range-based data.
interval_idx = pd.interval_range(start=0, end=5)
interval_idx

IntervalIndex([(0, 1], (1, 2], (2, 3], (3, 4], (4, 5]], dtype='interval[int64, right]')
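
Since the intervals above are closed on the right, a value on a boundary belongs to the interval that ends there. A minimal sketch of looking up which interval holds a value:

```python
import pandas as pd

interval_idx = pd.interval_range(start=0, end=5)

# Position of the interval that contains 2.5, i.e. (2, 3].
pos = interval_idx.get_loc(2.5)

# 1 is the closed right edge of (0, 1], so it maps to the first interval.
edge = interval_idx.get_loc(1)

# 0 is the open left edge of (0, 1], so no interval contains it.
mask = interval_idx.contains(0)

print(pos, edge, mask.any())
```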

### Common Operations with Index

In [12]:
# Accessing and Setting Index
df = pd.DataFrame({'A': [10, 20, 30]}, index=['a', 'b', 'c'])
df

    A
a  10
b  20
c  30


In [13]:
# Reindexing: aligns data to a new index
df_reindexed = df.reindex(['a', 'b', 'd'])
df_reindexed

      A
a  10.0
b  20.0
d   NaN
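
Reindexing to a label that does not exist introduces a missing value, which is why column `A` is shown as float. A small sketch of that behaviour and of the `fill_value` parameter of `reindex`, which supplies a replacement instead:

```python
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30]}, index=["a", "b", "c"])

# 'd' is not in the original index, so its row is filled with NaN
# and the integer column is upcast to float.
plain = df.reindex(["a", "b", "d"])

# fill_value replaces the missing entries, so no NaN is introduced.
filled = df.reindex(["a", "b", "d"], fill_value=0)

print(plain)
print(filled)
```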


In [14]:
# Indexing & Slicing
df.loc['a']   # label-based: the row labeled 'a'
df.iloc[0]    # position-based: the first row

A    10
Name: a, dtype: int64

A    10
Name: a, dtype: int64

In [15]:
# Resetting Index: converts the index into a column and assigns a default index
df_reset = df.reset_index()
df_reset

  index   A
0     a  10
1     b  20
2     c  30


In [16]:
# Setting a New Index
df_new_index = df.set_index(pd.Index(['x', 'y', 'z']))
df_new_index

    A
x  10
y  20
z  30


In [17]:
# Index Uniqueness: check if index labels are unique
df.index.is_unique

True
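
An Index is allowed to contain repeated labels, and uniqueness affects what a label lookup returns. A minimal sketch, using a made-up frame `dup` with a repeated label:

```python
import pandas as pd

dup = pd.DataFrame({"A": [1, 2, 3]}, index=["a", "a", "b"])

unique = dup.index.is_unique          # False: 'a' appears twice
repeats = dup.index.duplicated()      # marks every occurrence after the first

# A repeated label selects all matching rows (a DataFrame),
# while a unique label selects a single row (a Series).
rows_a = dup.loc["a"]
row_b = dup.loc["b"]

print(unique, repeats.tolist(), type(rows_a).__name__, type(row_b).__name__)
```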

In [18]:
# Sorting by Index
sorted_df = df.sort_index()
sorted_df

    A
a  10
b  20
c  30


### Properties of Index

In [19]:
# .name: sets or gets the name of the index
df.index.name = 'Label'
df

        A
Label
a      10
b      20
c      30


In [20]:
# .dtype: data type of the index
df.index.dtype

dtype('O')

In [21]:
# .nunique(): returns the number of unique labels
df.index.nunique()

3

In [24]:
# .is_monotonic_increasing: checks if each label is greater than or equal to the previous one
df.index.is_monotonic_increasing

True
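
Monotonic increasing does not mean strictly increasing: repeated labels are allowed. A short sketch of the difference, combining the check with `is_unique` to test for a strict ordering:

```python
import pandas as pd

idx = pd.Index([1, 2, 2, 3])

# Equal neighbours are allowed, so this index is monotonic increasing.
non_strict = idx.is_monotonic_increasing

# Strictly increasing additionally requires every label to be unique.
strict = idx.is_monotonic_increasing and idx.is_unique

# An unsorted index fails the check.
unsorted_ok = pd.Index([3, 1, 2]).is_monotonic_increasing

print(non_strict, strict, unsorted_ok)
```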

Index objects expose many more methods and properties; the table below summarizes the most common ones.


| **Category**            | **Method/Property**          | **Description**                                                   |
|-------------------------|------------------------------|-------------------------------------------------------------------|
| **General Info**        | `copy()`                    | Creates a copy of the Index.                                      |
|                         | `is_unique`                | Checks if all labels are unique.                                  |
|                         | `has_duplicates`           | Checks if there are duplicate labels.                             |
|                         | `empty`                    | Checks if the Index is empty.                                     |
| **Transformation**      | `astype(dtype)`            | Converts Index to a specified type.                               |
|                         | `to_list()`                | Converts Index to a Python list.                                  |
|                         | `to_numpy()`               | Converts Index to a NumPy array.                                  |
|                         | `map(func)`                | Applies a function to each label.                                 |
| **Reshaping**           | `append(other)`            | Appends another Index to the current one.                         |
|                         | `union(other)`             | Returns the union of two indices.                                 |
|                         | `intersection(other)`      | Returns the intersection of two indices.                          |
|                         | `difference(other)`        | Returns the difference between two indices.                       |
|                         | `drop(labels)`             | Drops specified labels from the Index.                            |
| **Sorting**             | `sort_values()`            | Returns a sorted copy of the Index.                               |
|                         | `argsort()`                | Returns indices that would sort the Index.                        |
| **Validation**          | `is_monotonic`             | Deprecated alias of `is_monotonic_increasing`; removed in pandas 2.0. |
|                         | `is_monotonic_increasing`  | Checks if each label is greater than or equal to the previous one. |
|                         | `is_monotonic_decreasing`  | Checks if each label is less than or equal to the previous one.   |
|                         | `isna()`                   | Checks for missing values in the Index.                           |
|                         | `notna()`                  | Checks for non-missing values in the Index.                       |
| **Properties**          | `name`                     | Gets or sets the name of the Index.                               |
|                         | `names`                    | For MultiIndex, returns names of levels.                          |
|                         | `nlevels`                  | Number of levels in MultiIndex.                                   |
|                         | `size`                     | Total number of elements in the Index.                            |
|                         | `shape`                    | Shape of the Index.                                               |
|                         | `dtype`                    | Data type of the Index.                                           |
|                         | `values`                   | Returns the underlying data as an array; `to_numpy()` is the recommended alternative. |
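
A few of the reshaping and transformation methods from the table, sketched on two small sample indexes (`left` and `right` are illustrative names). Each call returns a new Index and leaves the originals unchanged.

```python
import pandas as pd

left = pd.Index(["a", "b", "c"])
right = pd.Index(["b", "c", "d"])

combined = left.union(right)          # labels found in either index
common = left.intersection(right)     # labels found in both
only_left = left.difference(right)    # labels in left but not in right
appended = left.append(right)         # concatenation, duplicates kept
dropped = left.drop(["a"])            # remove specific labels
upper = left.map(str.upper)           # apply a function to every label

print(list(combined), list(common), list(only_left))
print(list(appended), list(dropped), list(upper))
```

Note that `append` keeps duplicates while `union` does not, so the two are not interchangeable.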

