In [1]:
from sklearn.datasets import load_wine, fetch_openml
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler, OneHotEncoder, KBinsDiscretizer, PolynomialFeatures
from sklearn.feature_selection import VarianceThreshold
from sklearn.decomposition import PCA

# Load the datasets into pandas DataFrames
wine = load_wine(as_frame=True)
df_wine = wine.frame

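# load_boston was removed from scikit-learn in version 1.2, so fetch the data from OpenML (needs network access on first call)
boston = fetch_openml(name="boston", version=1, as_frame=True)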
df_boston = boston.frame

We have loaded some sample data to experiment with. The one-liner below standardizes the numerical features so that each has zero mean and unit variance.

In [2]:
df_wine_std = pd.DataFrame(StandardScaler().fit_transform(df_wine.drop('target', axis=1)), columns=df_wine.columns[:-1])
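
As a quick sanity check, the sketch below standardizes the wine features and confirms that every column ends up with a mean close to 0 and a standard deviation close to 1. `StandardScaler` uses the population standard deviation (`ddof=0`), whereas pandas' `.std()` defaults to `ddof=1`, so the check passes `ddof=0` explicitly. The names `X`, `X_std`, `means`, and `stds` are illustrative only.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler

# Feature matrix without the target column
X = load_wine(as_frame=True).frame.drop(columns="target")

# Standardize and keep the original column names and index
X_std = pd.DataFrame(StandardScaler().fit_transform(X), columns=X.columns, index=X.index)

means = X_std.mean()
stds = X_std.std(ddof=0)  # population std, matching StandardScaler
print(means.abs().max(), stds.min(), stds.max())
```

Taking the column names from the frame that was actually scaled (`X.columns`) avoids relying on the target being the last column.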

Here's another one-liner, this time for min-max scaling, which rescales each feature to the [0, 1] range:

In [3]:
df_boston_scaled = pd.DataFrame(MinMaxScaler().fit_transform(df_boston.drop('MEDV', axis=1)), columns=df_boston.columns[:-1])
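
Min-max scaling maps each value x to (x - min) / (max - min), computed per column. The sketch below checks that formula against `MinMaxScaler` on a small made-up frame; the column names and values are hypothetical and not taken from the Boston data.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data, used only to illustrate the formula
df = pd.DataFrame({"rooms": [4.0, 6.0, 8.0], "tax": [200.0, 300.0, 700.0]})

# Manual min-max scaling, column by column
manual = (df - df.min()) / (df.max() - df.min())

# The same transformation through scikit-learn
scaled = pd.DataFrame(MinMaxScaler().fit_transform(df), columns=df.columns, index=df.index)

print(manual)
print(np.allclose(manual, scaled))
```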

How about adding polynomial features to the dataset? With `degree=2`, this generates the squares of the selected features and their pairwise product, and it can be done in a single line as well.

In [4]:
df_interactions = pd.DataFrame(PolynomialFeatures(degree=2, include_bias=False).fit_transform(df_wine[['alcohol', 'malic_acid']]))

How about one-hot encoding a categorical feature in the Boston dataset? Here we encode CHAS, the binary Charles River indicator.

In [5]:
df_boston_ohe = pd.get_dummies(df_boston.astype({'CHAS': 'category'}), columns=['CHAS'])

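How about discretizing a continuous feature into quantile-based bins? With `q=4` and `labels=False`, each value is replaced by its quartile index (0 to 3).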

In [6]:
df_wine['alcohol_bin'] = pd.qcut(df_wine['alcohol'], q=4, labels=False)
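
scikit-learn's `KBinsDiscretizer` offers a comparable quantile binning that can be fitted on one dataset and reused on another. The sketch below applies it to the same column and compares it with `pd.qcut`. The two may disagree for values that fall exactly on a bin edge, because they treat interval boundaries differently, so the agreement rate is printed rather than assumed to be 100%.

```python
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.preprocessing import KBinsDiscretizer

X = load_wine(as_frame=True).frame[["alcohol"]]

# Four quantile bins, returned as ordinal codes (one column in, one column out)
kbd = KBinsDiscretizer(n_bins=4, encode="ordinal", strategy="quantile")
bins = kbd.fit_transform(X).ravel().astype(int)

# Compare with the pandas one-liner
qcut_bins = pd.qcut(X["alcohol"], q=4, labels=False)
agreement = (bins == qcut_bins.to_numpy()).mean()

print(kbd.bin_edges_[0])
print(f"share of rows with identical bin: {agreement:.2f}")
```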

If one of your numerical features is right-skewed (positively skewed), that is, it has a long right tail caused by a few values much larger than the rest, a logarithmic transformation compresses those large values and makes the distribution more symmetric for further analysis. The cell below uses `np.log1p`, which computes log(1 + x) and therefore also handles zeros.

In [7]:
df_wine['log_malic'] = np.log1p(df_wine['malic_acid'])
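
To illustrate the effect, the sketch below draws a right-skewed sample from a log-normal distribution (synthetic data with a fixed seed, not the wine dataset) and compares the sample skewness before and after `np.log1p`. It also shows that `np.expm1` inverts the transformation.

```python
import numpy as np
import pandas as pd

# Synthetic right-skewed data; the seed and parameters are arbitrary choices
rng = np.random.default_rng(0)
x = pd.Series(rng.lognormal(mean=0.0, sigma=1.0, size=1_000))

x_log = np.log1p(x)  # log(1 + x)

skew_before = x.skew()
skew_after = x_log.skew()
print(f"skewness before: {skew_before:.2f}, after: {skew_after:.2f}")

# expm1 undoes log1p, which is useful when mapping results back to the original scale
x_back = np.expm1(x_log)
```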

Creating a ratio between two features

In [8]:
df_wine['alcohol_malic_ratio'] = df_wine['alcohol'] / df_wine['malic_acid']
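
A ratio feature becomes infinite wherever the denominator is zero. That is not a concern for a column with strictly positive values, but as a defensive sketch for other data, zeros in the denominator can be turned into missing values first. The frame below is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a zero in the denominator
df = pd.DataFrame({"num": [10.0, 5.0, 3.0], "den": [2.0, 0.0, 4.0]})

# Replace zeros with NaN so the ratio is missing instead of infinite
df["ratio"] = df["num"] / df["den"].replace(0, np.nan)
print(df)
```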

Removing low-variance features

In [9]:
df_boston_high_var = pd.DataFrame(VarianceThreshold(threshold=0.1).fit_transform(df_boston.drop('MEDV', axis=1)))
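
Wrapping the result in a bare `pd.DataFrame` discards the column names, so it is no longer obvious which features survived. One way to keep them, sketched here on a small made-up frame, is to fit the selector first and index the original columns with `get_support()`.

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Hypothetical data: one constant, one nearly constant, one varying column
df = pd.DataFrame({
    "constant": [1.0, 1.0, 1.0, 1.0],
    "tiny": [0.0, 0.1, 0.0, 0.1],
    "wide": [1.0, 2.0, 3.0, 4.0],
})

selector = VarianceThreshold(threshold=0.1).fit(df)

# Boolean mask of retained features, used to select the named columns
kept = df.columns[selector.get_support()]
df_high_var = df[kept]
print(list(kept))
```

Variance depends on the scale of each feature, so a single threshold is easiest to interpret when the features share comparable units.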

Multiplicative interaction between two features

In [10]:
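# Note: despite its name, 'wine_quality' is just the alcohol x color_intensity interaction term, not a measured quality score
df_wine['wine_quality'] = df_wine['alcohol'] * df_wine['color_intensity']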

Flagging outliers with the 1.5 × IQR rule (values below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR)

In [11]:
df_boston['tax_outlier'] = ((df_boston['TAX'] < df_boston['TAX'].quantile(0.25) - 1.5 * (df_boston['TAX'].quantile(0.75) - df_boston['TAX'].quantile(0.25))) | (df_boston['TAX'] > df_boston['TAX'].quantile(0.75) + 1.5 * (df_boston['TAX'].quantile(0.75) - df_boston['TAX'].quantile(0.25)))).astype(int)
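
The one-liner recomputes the same quantiles several times. For readability, the same rule can be sketched as a small helper; the function name `iqr_outlier_flag` and its parameter `k` are hypothetical, and the example series is made up.

```python
import pandas as pd

def iqr_outlier_flag(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Return 1 where a value lies outside [Q1 - k*IQR, Q3 + k*IQR], else 0."""
    q1, q3 = s.quantile([0.25, 0.75])
    iqr = q3 - q1
    return ((s < q1 - k * iqr) | (s > q3 + k * iqr)).astype(int)

# Toy example: Q1 = 2, Q3 = 4, IQR = 2, so the fences are -1 and 7
flags = iqr_outlier_flag(pd.Series([1, 2, 3, 4, 100]))
print(flags.tolist())
```

Applied to a DataFrame column, the call would look like `df['col_outlier'] = iqr_outlier_flag(df['col'])`.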