
<span style="text-decoration:underline">***Feature Engineering***</span>:-
1. **Prepare the dataset**
2. **Feature Transformation**
    1. *One Hot Encoder for Categorical Attributes*
    2. *Standardizer for Numerical Attributes*
3. **Feature Expansion**
    1. *Polynomial Expansion*

# Import Libraries

In [1]:
import pandas as pd
from sklearn.preprocessing import OneHotEncoder,StandardScaler,PolynomialFeatures

# Prepare the dataset

In [2]:
df = pd.DataFrame(data=[["yes",130],["no",230],["yes",80],["no",400]],columns=["is_new_user","logon_count"])
df

Unnamed: 0,is_new_user,logon_count
0,yes,130
1,no,230
2,yes,80
3,no,400


# Feature Transformation

### One Hot Encoder for Categorical Attributes

In [3]:
one = OneHotEncoder()
trans = one.fit_transform(df["is_new_user"].as_matrix().reshape(-1, 1)).toarray()
df["T_1"] = trans[:,0]
df["T_2"] = trans[:,1]
df

Unnamed: 0,is_new_user,logon_count,T_1,T_2
0,yes,130,0.0,1.0
1,no,230,1.0,0.0
2,yes,80,0.0,1.0
3,no,400,1.0,0.0


### Standardizer for Numerical Attributes

In [4]:
standard = StandardScaler()
res = standard.fit_transform(df["logon_count"].as_matrix().reshape(-1, 1).astype(float))
df["T_3"] = res[:,0]

# Feature Expansion

### Polynomial Expansion

In [5]:
poly = PolynomialFeatures(interaction_only=True)
res = poly.fit_transform(df[["T_1","T_2","T_3"]].as_matrix())
for i in range(4,res.shape[1]):
    df["E_"+str(i-3)] = res[:,i]
poly.get_feature_names()

['1', 'x0', 'x1', 'x2', 'x0 x1', 'x0 x2', 'x1 x2']

In [6]:
df

Unnamed: 0,is_new_user,logon_count,T_1,T_2,T_3,E_1,E_2,E_3
0,yes,130,0.0,1.0,-0.654289,0.0,-0.0,-0.654289
1,no,230,1.0,0.0,0.163572,0.0,0.163572,0.0
2,yes,80,0.0,1.0,-1.063219,0.0,-0.0,-1.063219
3,no,400,1.0,0.0,1.553936,0.0,1.553936,0.0
