In [11]:
from sklearn.preprocessing import OrdinalEncoder
import numpy as np

X = np.array([
    ['low', 'red'],
    ['medium', 'blue'],
    ['high', 'green'],
    ['medium', 'red'],
])

enc = OrdinalEncoder()
enc.fit(X)               # learn categories for each column
X_encoded = enc.transform(X)
print("Encoded:\n", X_encoded)
print("Categories:", enc.categories_)
# inverse transform back
print("Inverse:\n", enc.inverse_transform(X_encoded))

print('-' * 30)
"""
OrdinalEncoder — quick example for 1‑D data
When your data is one dimensional you must pass a 2‑D array (n_samples, 1) to sklearn.preprocessing.OrdinalEncoder — reshape your 1‑D array 
before fitting or transforming.

"""
# example 1-D categorical data
# (named X_1d so the 2-D X defined above stays available to later cells)
X_1d = np.array(['low', 'medium', 'high', 'medium', 'low'])

# reshape to (n_samples, 1)
X_2d = X_1d.reshape(-1, 1)

encoder = OrdinalEncoder()
encoded = encoder.fit_transform(X_2d)   # result is shape (n_samples, 1)

print("Encoded (2D):\n", encoded.ravel())  # .ravel() to print as 1-D
print("Categories:", encoder.categories_)
# inverse transform back to original labels
decoded = encoder.inverse_transform(encoded)
print("Decoded:\n", decoded.ravel())

Encoded:
 [[1. 2.]
 [2. 0.]
 [0. 1.]
 [2. 2.]]
Categories: [array(['high', 'low', 'medium'], dtype='<U6'), array(['blue', 'green', 'red'], dtype='<U6')]
Inverse:
 [['low' 'red']
 ['medium' 'blue']
 ['high' 'green']
 ['medium' 'red']]
------------------------------
Encoded (2D):
 [1. 2. 0. 2. 1.]
Categories: [array(['high', 'low', 'medium'], dtype='<U6')]
Decoded:
 ['low' 'medium' 'high' 'medium' 'low']


In [3]:
# Specify category order explicitly
# First column has meaningful order; second is nominal but we still fix categories
enc2 = OrdinalEncoder(categories=[['low', 'medium', 'high'], ['red', 'green', 'blue']])
enc2.fit(X)
print("Encoded with fixed categories:\n", enc2.transform(X))

Encoded with fixed categories:
 [[0. 0.]
 [1. 2.]
 [2. 1.]
 [1. 0.]]


In [5]:
# Handle unknown categories at transform time
enc3 = OrdinalEncoder(handle_unknown='use_encoded_value', unknown_value=-1)
enc3.fit(X)

X_new = np.array([['very_high', 'red'], ['low', 'purple']])
print("Transform with unknowns:\n", enc3.transform(X_new))

Transform with unknowns:
 [[-1.  2.]
 [ 1. -1.]]
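
For contrast, the default handle_unknown='error' rejects unseen categories at transform time instead of mapping them to a sentinel. The sketch below is self-contained rather than a continuation of the cells above; the names X_train, X_unseen and strict_enc are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Illustrative data for this sketch only (names are assumptions)
X_train = np.array([['low', 'red'], ['medium', 'blue'], ['high', 'green']])
X_unseen = np.array([['very_high', 'red']])  # 'very_high' was never seen in fit

strict_enc = OrdinalEncoder()  # handle_unknown='error' is the default
strict_enc.fit(X_train)

try:
    strict_enc.transform(X_unseen)
    raised = False
except ValueError as exc:
    # scikit-learn reports which category and column were unknown
    raised = True
    print("Rejected unknown category:", exc)
```

Failing loudly like this is often preferable during development, since a silently assigned -1 can hide upstream data problems.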


Common parameters and behavior (OrdinalEncoder)
- categories: 'auto' (default) or a list of category lists, one per column; 'auto' infers the categories from the training data and sorts them (alphabetically for strings), while an explicit list fixes the integer order.
- dtype: numeric dtype of the output (default numpy.float64).
- handle_unknown: 'error' (default) or 'use_encoded_value'. Controls what happens when transform meets a category not seen during fit; with 'use_encoded_value' you must also provide unknown_value.
- encoded_missing_value: value used to encode missing entries (numpy.nan by default).
These options let you control ordering, output data type, and how unknown and missing values are handled.

Practical notes (short)

- Use OrdinalEncoder when categories have a real order (e.g., small < medium < large).
- For nominal categories, prefer OneHotEncoder to avoid introducing an artificial order.
- Fix the categories via the categories parameter, or persist the fitted encoder, so that the category-to-integer mapping is identical in training and production.
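
One way to persist a fitted encoder is a pickle round trip, sketched below in memory with the standard library. This is a minimal sketch, not the only option (joblib is a common alternative); the names X_train, X_prod and restored are assumptions for illustration.

```python
import pickle

import numpy as np
from sklearn.preprocessing import OrdinalEncoder

# Illustrative training data with an explicit, meaningful category order
X_train = np.array([['low'], ['medium'], ['high'], ['medium']])
enc = OrdinalEncoder(categories=[['low', 'medium', 'high']])
enc.fit(X_train)

# Serialize the fitted encoder and restore it, as a production service would on load
payload = pickle.dumps(enc)
restored = pickle.loads(payload)

# Both encoders should produce the same codes for new data
X_prod = np.array([['high'], ['low']])
same = np.array_equal(enc.transform(X_prod), restored.transform(X_prod))
print("Restored encoder matches:", same)
print("Codes:", restored.transform(X_prod).ravel())
```

Only unpickle files from trusted sources, and load them with the same scikit-learn version that produced them where possible.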