# pandas.Index.get_level_values </br>
doc: https://pandas.pydata.org/docs/reference/api/pandas.Index.get_level_values.html

In [13]:
import pandas as pd
print(f"Pandas version: {pd.__version__}")

Pandas version: 2.2.2


In [2]:
idx = pd.Index(list('abc'))

In [3]:
print(idx)

Index(['a', 'b', 'c'], dtype='object')


In [4]:
idx.get_level_values(level=0)

Index(['a', 'b', 'c'], dtype='object')

A plain `Index` has only one level, so `level` should be 0. </br>
The following call passes `level=1` and fails.

In [5]:
try:
    print(idx.get_level_values(level=1))
except Exception as e:
    print(f"Error: {e}")

Error: Too many levels: Index has only 1 level, not 2


In [6]:
del idx

# MultiIndex.get_level_values </br>
doc: https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.get_level_values.html

In [7]:
tmp = pd.MultiIndex.from_arrays((list('abc'), list('def')))
tmp.names = ['level_a', 'level_b']
print(tmp)

MultiIndex([('a', 'd'),
            ('b', 'e'),
            ('c', 'f')],
           names=['level_a', 'level_b'])


A level can be selected in two ways:
- by its integer position in the `MultiIndex`
- by its name

In [8]:
print(tmp.get_level_values(level=0))
print(tmp.get_level_values(level='level_a'))

Index(['a', 'b', 'c'], dtype='object', name='level_a')
Index(['a', 'b', 'c'], dtype='object', name='level_a')


In [9]:
print(tmp.get_level_values(level=1))
print(tmp.get_level_values(level='level_b'))

Index(['d', 'e', 'f'], dtype='object', name='level_b')
Index(['d', 'e', 'f'], dtype='object', name='level_b')
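
The example above has no repeated labels, so it does not show that `get_level_values` returns one label per row rather than the unique labels of the level. A minimal sketch with a made-up index (the names `outer` and `inner` are only for illustration) contrasts it with the `levels` attribute:

```python
import pandas as pd

# Hypothetical index with a repeated label in the first level.
mi = pd.MultiIndex.from_arrays(
    [["a", "a", "b"], ["x", "y", "x"]], names=["outer", "inner"]
)

values = mi.get_level_values("outer")  # one label per row, repeats kept
unique_labels = mi.levels[0]           # unique labels of the level only

print(list(values))
print(list(unique_labels))
```

The result of `get_level_values` always has the same length as the index itself, which is what makes it suitable for row-wise operations.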


In [10]:
try:
    tmp.get_level_values(level=2)
except Exception as e:
    print(f"Error: {e}")

Error: Too many levels: Index has only 2 levels, not 3
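
The cell above catches a generic `Exception`. As a hedged sketch, the snippet below checks which exception types are raised. In the pandas versions I am aware of, an out-of-range integer raises `IndexError` and an unknown level name raises `KeyError`, but treat that as an assumption to verify against your installed version. The helper `error_type` and the name `level_c` are hypothetical. The sketch also shows that negative integers count from the last level.

```python
import pandas as pd

mi = pd.MultiIndex.from_arrays(
    [list("abc"), list("def")], names=["level_a", "level_b"]
)

# Negative integers count from the last level, so -1 is 'level_b' here.
last = mi.get_level_values(-1)


def error_type(level):
    """Return the name of the exception raised for `level`, or None."""
    try:
        mi.get_level_values(level)
    except (IndexError, KeyError) as exc:
        return type(exc).__name__
    return None


out_of_range = error_type(2)          # integer beyond the last level
unknown_name = error_type("level_c")  # name that is not in mi.names

print(list(last))
print(out_of_range, unknown_name)
```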


In [11]:
del tmp

Library versions:
- **pandas==1.4.3**
- **numpy==1.26.4** 

Here is the actual code snippet that introduced me to the function:

In [15]:
"""
rows, index_rows = np.unique(
    X.index.get_level_values(level=f.cust_id).values, return_inverse=True
)

cols, index_cols = np.unique(
    X.index.get_level_values(level="agg_col").values, return_inverse=True
)

- f.cust_id : str = "cust_id" 
- X : pd.DataFrame = >9*10^6 rows and 1 column 
    - column name = tx_net [transaction ratio of a customer for an item category (normalized)] 
    - index = MultiIndex(names=['cust_id', 'agg_col'])
        - cust_id = unique customer ID
        - agg_col = ID of the product category

X.index returns the following:
    MultiIndex([(       10,  4),
                (       18, 15),
                (       18, 21),
                (       19,  8),
                (       19, 12),
                (       21,  1),
                (       21, 13),
                (       22,  4),
                (       23,  4),
                (       35, -1),
                ...
                (115885741, 15),
                (115890981, 14),
                (115898751, -1),
                (115899116, -1),
                (115899248, -1),
                (115899591, 16),
                (115900226,  8),
                (115900681,  4),
                (115900681, 15),
                (115900758, 21)],
            names=['cust_id', 'agg_col'], length=9206259)
"""


`rows` contains the sorted unique customer IDs found in the index of DataFrame `X`.
`index_rows` holds, for each entry of the original `cust_id` level, the position of its customer ID in `rows`, so `rows[index_rows]` reconstructs the original level values.

`cols` contains the sorted unique values of the `"agg_col"` level of the index of `X`.
`index_cols` holds, for each entry of the original `"agg_col"` level, the position of its value in `cols`.
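
To make the mapping concrete, here is a small runnable sketch of the same pattern. The four-row `X` is a toy stand-in with made-up values, not the original data, and the level name is passed as a plain string instead of `f.cust_id`.

```python
import numpy as np
import pandas as pd

# Toy stand-in for X: the index tuples and tx_net values are made up.
index = pd.MultiIndex.from_tuples(
    [(18, 15), (10, 4), (18, 21), (19, 8)], names=["cust_id", "agg_col"]
)
X = pd.DataFrame({"tx_net": [0.5, 1.0, 0.5, 1.0]}, index=index)

cust_ids = X.index.get_level_values(level="cust_id").values

# rows: sorted unique IDs; index_rows: position in rows for each original row.
rows, index_rows = np.unique(cust_ids, return_inverse=True)

print(rows)              # [10 18 19]
print(index_rows)        # [1 0 1 2]
print(rows[index_rows])  # [18 10 18 19], the original level values
```

Because `index_rows` and `index_cols` are dense integer codes starting at 0, they can be used directly as row and column positions when the data is reshaped into a matrix.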