# Encoding variables

In this class we will use a dataset on real-estate sales.

Dataset link: https://www.kaggle.com/datasets/mohammedaltet/egypt-houses-price

Columns in the dataset:
- Type: the type of property
- Price: the price of the property
- Bedrooms: number of bedrooms
- Bathrooms: number of bathrooms
- Area: the area of the property in m^2
- Furnished: whether the property is furnished
- Level: the floor the property is on
- Compound: the compound the property is in
- Payment_Option
- Delivery_Date
- Delivery_Term
- City

In [1]:
import pandas as pd
import os
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, LabelEncoder, TargetEncoder
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

In [2]:
# run this cell
# if you are executing the notebook from the solutions folder
# while the data frame is stored in the data folder
import os
os.chdir('../')

In [3]:
# Load the data
df = pd.read_csv("data/egypt_houses_price_cleaned.csv")
del df['index']
del df['Unnamed: 0']

In [3]:
# First rows
df.head()

,Type,Price,Bedrooms,Bathrooms,Area,Furnished,Level,Compound,Payment_Option,Delivery_Date,Delivery_Term,City
0,Duplex,4000000.0,3.0,3.0,400.0,No,7,Unknown,Cash,Ready to move,Finished,Nasr City
1,Apartment,4000000.0,3.0,3.0,160.0,No,10+,Unknown,Cash,Ready to move,Finished,Camp Caesar
2,Apartment,2250000.0,3.0,2.0,165.0,No,1,Unknown,Cash,Ready to move,Finished,Smoha
3,Apartment,1900000.0,3.0,2.0,230.0,No,10,Unknown,Cash,Ready to move,Finished,Nasr City
4,Apartment,5800000.0,2.0,3.0,160.0,No,Ground,Eastown,Cash,Ready to move,Semi Finished,New Cairo - El Tagamoa


In [4]:
# info
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 27321 entries, 0 to 27320
Data columns (total 12 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Type            27321 non-null  object 
 1   Price           27321 non-null  float64
 2   Bedrooms        27321 non-null  float64
 3   Bathrooms       27321 non-null  float64
 4   Area            27321 non-null  float64
 5   Furnished       27321 non-null  object 
 6   Level           27321 non-null  object 
 7   Compound        27321 non-null  object 
 8   Payment_Option  27321 non-null  object 
 9   Delivery_Date   27321 non-null  object 
 10  Delivery_Term   27321 non-null  object 
 11  City            27321 non-null  object 
dtypes: float64(4), object(8)
memory usage: 2.5+ MB


In [6]:
# Unique values per column
df.nunique()

Type                10
Price             4181
Bedrooms            10
Bathrooms           10
Area               785
Furnished            3
Level               14
Compound           560
Payment_Option       4
Delivery_Date       10
Delivery_Term        5
City               179
dtype: int64

In [7]:
# shape of the data frame
df.shape

(27321, 12)

In [10]:
# variables to encode
cols_to_encode = ['Type','Furnished','Level','Payment_Option','Delivery_Date','Delivery_Term']

## One hot encoding

It can be applied with the pandas function "get_dummies" or with OneHotEncoder from scikit-learn.
Today we will focus on the second approach.
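To make the difference between the two approaches concrete, here is a minimal sketch on a small made-up frame (the `toy_train` and `toy_new` data are illustrative assumptions, not part of the dataset). `pd.get_dummies` encodes whatever categories happen to be present in the frame it receives, while `OneHotEncoder` remembers the categories learned in `fit` and reuses them on new data.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy data (hypothetical values, used only for illustration)
toy_train = pd.DataFrame({"Type": ["Apartment", "Chalet", "Duplex", "Apartment"]})
toy_new = pd.DataFrame({"Type": ["Chalet", "Chalet"]})

# pandas: the output columns depend on the categories present in each frame
dummies_train = pd.get_dummies(toy_train, columns=["Type"])
dummies_new = pd.get_dummies(toy_new, columns=["Type"])

# scikit-learn: categories are learned once in fit and reused in transform
ohe = OneHotEncoder(dtype=int).fit(toy_train)
encoded_new = ohe.transform(toy_new).toarray()

print(list(dummies_train.columns))  # three columns, one per training category
print(list(dummies_new.columns))    # a single column, because only "Chalet" is present
print(encoded_new)                  # still three columns, matching the training layout
```

The fixed column layout is the main reason to prefer the scikit-learn encoder when a model is trained on one frame and applied to another.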

In [8]:
# Create the encoder object
oh_type = OneHotEncoder(drop='first', dtype=int)

In [11]:
# fit the encoder
oh_type.fit(df[cols_to_encode])

In [12]:
# Results (a sparse matrix)
results = oh_type.transform(df[cols_to_encode])

In [13]:
# print the results as a dense array
results.toarray()

array([[0, 1, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 1, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])

In [14]:
# Add the results to df (join aligns on the index; both frames use the default RangeIndex)
df_ohe = pd.DataFrame(data=results.toarray(), columns=oh_type.get_feature_names_out())
df_ohe = df.join(df_ohe)
df_ohe.head()


,Type,Price,Bedrooms,Bathrooms,Area,Furnished,Level,Compound,Payment_Option,Delivery_Date,...,Delivery_Date_2026,Delivery_Date_2027,Delivery_Date_Ready to move,Delivery_Date_Unknown,Delivery_Date_soon,Delivery_Date_within 6 months,Delivery_Term_Finished,Delivery_Term_Not Finished,Delivery_Term_Semi Finished,Delivery_Term_Unknown
0,Duplex,4000000.0,3.0,3.0,400.0,No,7,Unknown,Cash,Ready to move,...,0,0,1,0,0,0,1,0,0,0
1,Apartment,4000000.0,3.0,3.0,160.0,No,10+,Unknown,Cash,Ready to move,...,0,0,1,0,0,0,1,0,0,0
2,Apartment,2250000.0,3.0,2.0,165.0,No,1,Unknown,Cash,Ready to move,...,0,0,1,0,0,0,1,0,0,0
3,Apartment,1900000.0,3.0,2.0,230.0,No,10,Unknown,Cash,Ready to move,...,0,0,1,0,0,0,1,0,0,0
4,Apartment,5800000.0,2.0,3.0,160.0,No,Ground,Eastown,Cash,Ready to move,...,0,0,1,0,0,0,0,0,1,0


<b> Other parameters of the object </b>

1. drop - "first" drops the first category of every variable; "if_binary" drops one category only for variables with exactly two values.
2. dtype - type of the output values.
3. handle_unknown - what to do when an unknown value appears during transform.
4. min_frequency - minimum number of occurrences a category needs to get its own column; rarer categories are grouped together.
5. max_categories - maximum number of output categories per variable.

## Ordinal / label encoding

Besides the parameters shared with the one-hot encoder (such as dtype, handle_unknown, min_frequency and max_categories), there is also:
1. encoded_missing_value - the code assigned to missing values, an int or np.nan

In [None]:
# ordinal encoder object
ordinal_enc = OrdinalEncoder(encoded_missing_value=-1).fit(df[cols_to_encode])

In [19]:
# transform
res_ord = ordinal_enc.transform(df[cols_to_encode])

In [20]:
res_ord

array([[ 2.,  0.,  8.,  0.,  6.,  1.],
       [ 0.,  0.,  2.,  0.,  6.,  1.],
       [ 0.,  0.,  0.,  0.,  6.,  1.],
       ...,
       [ 8.,  0., 13.,  0.,  6.,  3.],
       [ 4.,  1., 13.,  3.,  7.,  1.],
       [ 4.,  0., 13.,  0.,  7.,  1.]])

In [18]:
# variable names
ord_names = ['oe_' + i for i in cols_to_encode]
ord_names

['oe_Type',
 'oe_Furnished',
 'oe_Level',
 'oe_Payment_Option',
 'oe_Delivery_Date',
 'oe_Delivery_Term']

In [21]:
# Prepare the data
df_ordinal = pd.DataFrame(data=res_ord, columns=ord_names)
df_ordinal = df.join(df_ordinal)

In [22]:
df_ordinal

,Type,Price,Bedrooms,Bathrooms,Area,Furnished,Level,Compound,Payment_Option,Delivery_Date,Delivery_Term,City,oe_Type,oe_Furnished,oe_Level,oe_Payment_Option,oe_Delivery_Date,oe_Delivery_Term
0,Duplex,4000000.0,3.0,3.0,400.0,No,7,Unknown,Cash,Ready to move,Finished,Nasr City,2.0,0.0,8.0,0.0,6.0,1.0
1,Apartment,4000000.0,3.0,3.0,160.0,No,10+,Unknown,Cash,Ready to move,Finished,Camp Caesar,0.0,0.0,2.0,0.0,6.0,1.0
2,Apartment,2250000.0,3.0,2.0,165.0,No,1,Unknown,Cash,Ready to move,Finished,Smoha,0.0,0.0,0.0,0.0,6.0,1.0
3,Apartment,1900000.0,3.0,2.0,230.0,No,10,Unknown,Cash,Ready to move,Finished,Nasr City,0.0,0.0,1.0,0.0,6.0,1.0
4,Apartment,5800000.0,2.0,3.0,160.0,No,Ground,Eastown,Cash,Ready to move,Semi Finished,New Cairo - El Tagamoa,0.0,0.0,11.0,0.0,6.0,3.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
27316,Town House,890000.0,3.0,2.0,240.0,Unknown,Unknown,Unknown,Unknown Payment,Unknown,Unknown,North Coast,7.0,1.0,13.0,3.0,7.0,4.0
27317,Town House,4000000.0,4.0,3.0,218.0,Unknown,Unknown,Unknown,Cash or Installment,Unknown,Finished,New Cairo - El Tagamoa,7.0,1.0,13.0,1.0,7.0,1.0
27318,Twin House,13800000.0,3.0,4.0,308.0,No,Unknown,Cairo Festival City,Cash,Ready to move,Semi Finished,New Cairo - El Tagamoa,8.0,0.0,13.0,0.0,6.0,3.0
27319,Stand Alone Villa,35000000.0,4.0,4.0,478.0,Unknown,Unknown,Unknown,Unknown Payment,Unknown,Finished,Mokattam,4.0,1.0,13.0,3.0,7.0,1.0


In [23]:
# LabelEncoder - for a single variable only; it is meant for encoding the target y
y_encoded = LabelEncoder().fit_transform(df['Type'])
y_encoded

array([2, 0, 0, ..., 8, 4, 4])
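As a small illustration of what `LabelEncoder` stores, the sketch below fits it on a short made-up list of property types and maps the integer codes back to the original labels with `inverse_transform`.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical target values
types = ["Duplex", "Apartment", "Apartment", "Twin House"]

le = LabelEncoder()
codes = le.fit_transform(types)

# classes_ holds the sorted labels; the position of a label is its code
print(le.classes_)
print(codes)
# inverse_transform recovers the original labels from the codes
decoded = le.inverse_transform(codes)
print(decoded)
```

The round trip is useful when a classifier predicts integer codes and the results have to be reported with the original names.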

## TargetEncoder

Parameters
1. cv - number of folds in the cross-validation used by fit_transform.
2. categories - "auto" or a list with one entry per variable, giving the categories of each variable. "auto" determines the categories from the data.
3. target_type - type of the target y {"auto", "continuous", "binary", "multiclass"}
4. smooth - "auto" or a float. Sets the weight with which the category mean is blended with the global mean of the target; larger values pull the encoding closer to the global mean.
5. shuffle - whether to shuffle the data randomly before splitting it into folds.

In [24]:
# train test split

train_x, test_x, train_y, test_y = train_test_split(df.drop('Price', axis=1),df['Price'], test_size=0.2, random_state=123)

In [25]:
pd.set_option('display.float_format', lambda x: '%.3f' % x)

In [26]:
# join on the index, then compute the mean price per type
manual_mean = train_x.join(train_y).loc[:,['Type','Price']].groupby('Type').mean().reset_index()
manual_mean

,Type,Price
0,Apartment,1980497.77
1,Chalet,2316160.456
2,Duplex,3273082.253
3,Penthouse,3526969.541
4,Stand Alone Villa,11932340.408
5,Standalone Villa,12558367.434
6,Studio,1325582.507
7,Town House,4890879.055
8,Twin House,7213952.379
9,Twin house,5645244.731


In [27]:
# Manual encoding (how='left' keeps test rows whose Type was not seen in training)
manual_mean = manual_mean.rename({'Price': 'type_mean_enc'}, axis=1)
df_manual_mean = test_x.merge(manual_mean, on='Type', how='left')

In [28]:
df_manual_mean

,Type,Bedrooms,Bathrooms,Area,Furnished,Level,Compound,Payment_Option,Delivery_Date,Delivery_Term,City,type_mean_enc
0,Apartment,3.000,2.000,160.000,No,5,Madinaty,Cash,Ready to move,Finished,Madinaty,1980497.770
1,Chalet,2.000,2.000,88.000,Yes,7,Porto Golf Marina,Cash,Ready to move,Finished,North Coast,2316160.456
2,Stand Alone Villa,5.000,4.000,364.000,No,Unknown,Unknown,Cash or Installment,Unknown,Semi Finished,Sheikh Zayed,11932340.408
3,Stand Alone Villa,3.000,3.000,378.000,Unknown,Unknown,Unknown,Unknown Payment,Unknown,Unknown,Sheikh Zayed,11932340.408
4,Chalet,2.000,2.000,105.000,No,2,Sea View,Cash or Installment,Ready to move,Finished,North Coast,2316160.456
...,...,...,...,...,...,...,...,...,...,...,...,...
5460,Stand Alone Villa,4.000,4.000,429.000,No,Unknown,Hyde Park New Cairo,Cash,Ready to move,Semi Finished,New Cairo - El Tagamoa,11932340.408
5461,Chalet,3.000,2.000,120.000,Yes,1,Stella Heights,Cash or Installment,Ready to move,Finished,North Coast,2316160.456
5462,Apartment,3.000,2.000,153.000,Unknown,Ground,Unknown,Cash,Ready to move,Finished,Sheikh Zayed,1980497.770
5463,Duplex,2.000,3.000,164.000,No,2,Unknown,Installment,Ready to move,Unknown,6th of October,3273082.253


In [29]:
# Target encoder object
te = TargetEncoder(target_type='continuous').fit(train_x[['Type']],train_y)

In [30]:
# transform
te.transform([['Apartment']])



array([[1980524.23535362]])

In [31]:
# TargetEncoder - several variables
te2 = TargetEncoder(target_type='continuous').fit(train_x[cols_to_encode], train_y)

In [34]:
results_te = te2.transform(df[cols_to_encode])
results_te

array([[ 3273357.25454368,  4780499.74610472,  1941235.98192534,
         6955427.85316283,  6222496.45277066,  4582164.1871527 ],
       [ 1980524.23535362,  4780499.74610472,  2084692.26522721,
         6955427.85316283,  6222496.45277066,  4582164.1871527 ],
       [ 1980524.23535362,  4780499.74610472,  2113922.28781191,
         6955427.85316283,  6222496.45277066,  4582164.1871527 ],
       ...,
       [ 7213447.15801359,  4780499.74610472,  8500167.17188027,
         6955427.85316283,  6222496.45277066,  5156123.92538596],
       [11926627.01624243,  4190448.72391961,  8500167.17188027,
         4700584.5308465 ,  3537862.69215056,  4582164.1871527 ],
       [11926627.01624243,  4780499.74610472,  8500167.17188027,
         6955427.85316283,  3537862.69215056,  4582164.1871527 ]])

In [35]:
# Variable names
target_names = ['te_' + i for i in cols_to_encode]
target_names

['te_Type',
 'te_Furnished',
 'te_Level',
 'te_Payment_Option',
 'te_Delivery_Date',
 'te_Delivery_Term']

In [36]:
# Prepare the data
df_target_en = pd.DataFrame(data=results_te, columns=target_names)
df_target_en = df.join(df_target_en)
df_target_en.head()

,Type,Price,Bedrooms,Bathrooms,Area,Furnished,Level,Compound,Payment_Option,Delivery_Date,Delivery_Term,City,te_Type,te_Furnished,te_Level,te_Payment_Option,te_Delivery_Date,te_Delivery_Term
0,Duplex,4000000.0,3.0,3.0,400.0,No,7,Unknown,Cash,Ready to move,Finished,Nasr City,3273357.255,4780499.746,1941235.982,6955427.853,6222496.453,4582164.187
1,Apartment,4000000.0,3.0,3.0,160.0,No,10+,Unknown,Cash,Ready to move,Finished,Camp Caesar,1980524.235,4780499.746,2084692.265,6955427.853,6222496.453,4582164.187
2,Apartment,2250000.0,3.0,2.0,165.0,No,1,Unknown,Cash,Ready to move,Finished,Smoha,1980524.235,4780499.746,2113922.288,6955427.853,6222496.453,4582164.187
3,Apartment,1900000.0,3.0,2.0,230.0,No,10,Unknown,Cash,Ready to move,Finished,Nasr City,1980524.235,4780499.746,2488184.368,6955427.853,6222496.453,4582164.187
4,Apartment,5800000.0,2.0,3.0,160.0,No,Ground,Eastown,Cash,Ready to move,Semi Finished,New Cairo - El Tagamoa,1980524.235,4780499.746,3356198.187,6955427.853,6222496.453,5156123.925


## Sorted ordinal encoder
We can assign integer codes as in the ordinal encoder, but ordered by the mean value of the target in each category.
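One possible way to get the same result with scikit-learn is to pass the sorted category list to `OrdinalEncoder`, so that the fitted object can be reused on new data. This is a sketch on made-up prices, not the notebook's own approach, which builds the mapping manually.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical training data, prices in millions
train = pd.DataFrame(
    {
        "Type": ["Studio", "Apartment", "Chalet", "Studio", "Chalet", "Apartment"],
        "Price": [1.2, 2.1, 2.4, 1.4, 2.2, 1.9],
    }
)

# Order the categories by their mean target value, using training data only
ordered_types = train.groupby("Type")["Price"].mean().sort_values().index.tolist()

# The position in ordered_types becomes the integer code
enc = OrdinalEncoder(categories=[ordered_types]).fit(train[["Type"]])
train["ordinal_sorted_encoding"] = enc.transform(train[["Type"]]).ravel()

print(ordered_types)
print(train)
```

With this ordering the codes increase with the mean price, which is closer to what a linear model can exploit than alphabetical codes.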

In [37]:
# mean value of y
manual_mean

,Type,type_mean_enc
0,Apartment,1980497.77
1,Chalet,2316160.456
2,Duplex,3273082.253
3,Penthouse,3526969.541
4,Stand Alone Villa,11932340.408
5,Standalone Villa,12558367.434
6,Studio,1325582.507
7,Town House,4890879.055
8,Twin House,7213952.379
9,Twin house,5645244.731


In [38]:
# Ordinal encoder - sorted by the mean of y
manual_mean = manual_mean.sort_values(by='type_mean_enc')
manual_mean['ordinal_sorted_encoding'] = np.arange(0, manual_mean.shape[0])
manual_mean

,Type,type_mean_enc,ordinal_sorted_encoding
6,Studio,1325582.507,0
0,Apartment,1980497.77,1
1,Chalet,2316160.456,2
2,Duplex,3273082.253,3
3,Penthouse,3526969.541,4
7,Town House,4890879.055,5
9,Twin house,5645244.731,6
8,Twin House,7213952.379,7
4,Stand Alone Villa,11932340.408,8
5,Standalone Villa,12558367.434,9


## Case study - impact of encoding on the model

In [39]:
# one-hot encoding
df_ohe.head()


,Type,Price,Bedrooms,Bathrooms,Area,Furnished,Level,Compound,Payment_Option,Delivery_Date,...,Delivery_Date_2026,Delivery_Date_2027,Delivery_Date_Ready to move,Delivery_Date_Unknown,Delivery_Date_soon,Delivery_Date_within 6 months,Delivery_Term_Finished,Delivery_Term_Not Finished,Delivery_Term_Semi Finished,Delivery_Term_Unknown
0,Duplex,4000000.0,3.0,3.0,400.0,No,7,Unknown,Cash,Ready to move,...,0,0,1,0,0,0,1,0,0,0
1,Apartment,4000000.0,3.0,3.0,160.0,No,10+,Unknown,Cash,Ready to move,...,0,0,1,0,0,0,1,0,0,0
2,Apartment,2250000.0,3.0,2.0,165.0,No,1,Unknown,Cash,Ready to move,...,0,0,1,0,0,0,1,0,0,0
3,Apartment,1900000.0,3.0,2.0,230.0,No,10,Unknown,Cash,Ready to move,...,0,0,1,0,0,0,1,0,0,0
4,Apartment,5800000.0,2.0,3.0,160.0,No,Ground,Eastown,Cash,Ready to move,...,0,0,1,0,0,0,0,0,1,0


In [40]:
# Ordinal encoder
df_ordinal.head()

,Type,Price,Bedrooms,Bathrooms,Area,Furnished,Level,Compound,Payment_Option,Delivery_Date,Delivery_Term,City,oe_Type,oe_Furnished,oe_Level,oe_Payment_Option,oe_Delivery_Date,oe_Delivery_Term
0,Duplex,4000000.0,3.0,3.0,400.0,No,7,Unknown,Cash,Ready to move,Finished,Nasr City,2.0,0.0,8.0,0.0,6.0,1.0
1,Apartment,4000000.0,3.0,3.0,160.0,No,10+,Unknown,Cash,Ready to move,Finished,Camp Caesar,0.0,0.0,2.0,0.0,6.0,1.0
2,Apartment,2250000.0,3.0,2.0,165.0,No,1,Unknown,Cash,Ready to move,Finished,Smoha,0.0,0.0,0.0,0.0,6.0,1.0
3,Apartment,1900000.0,3.0,2.0,230.0,No,10,Unknown,Cash,Ready to move,Finished,Nasr City,0.0,0.0,1.0,0.0,6.0,1.0
4,Apartment,5800000.0,2.0,3.0,160.0,No,Ground,Eastown,Cash,Ready to move,Semi Finished,New Cairo - El Tagamoa,0.0,0.0,11.0,0.0,6.0,3.0


In [41]:
# Target based encoding 
df_target_en.head()

,Type,Price,Bedrooms,Bathrooms,Area,Furnished,Level,Compound,Payment_Option,Delivery_Date,Delivery_Term,City,te_Type,te_Furnished,te_Level,te_Payment_Option,te_Delivery_Date,te_Delivery_Term
0,Duplex,4000000.0,3.0,3.0,400.0,No,7,Unknown,Cash,Ready to move,Finished,Nasr City,3273357.255,4780499.746,1941235.982,6955427.853,6222496.453,4582164.187
1,Apartment,4000000.0,3.0,3.0,160.0,No,10+,Unknown,Cash,Ready to move,Finished,Camp Caesar,1980524.235,4780499.746,2084692.265,6955427.853,6222496.453,4582164.187
2,Apartment,2250000.0,3.0,2.0,165.0,No,1,Unknown,Cash,Ready to move,Finished,Smoha,1980524.235,4780499.746,2113922.288,6955427.853,6222496.453,4582164.187
3,Apartment,1900000.0,3.0,2.0,230.0,No,10,Unknown,Cash,Ready to move,Finished,Nasr City,1980524.235,4780499.746,2488184.368,6955427.853,6222496.453,4582164.187
4,Apartment,5800000.0,2.0,3.0,160.0,No,Ground,Eastown,Cash,Ready to move,Semi Finished,New Cairo - El Tagamoa,1980524.235,4780499.746,3356198.187,6955427.853,6222496.453,5156123.925


In [42]:
# List of frames used for training
dfs = [df_ohe, df_ordinal, df_target_en]

In [43]:
# numerical variables
numerical_cols = list(df.select_dtypes(exclude='object').drop('Price', axis=1).columns)
numerical_cols

['Bedrooms', 'Bathrooms', 'Area']

In [44]:
# lists of variables
lists_of_features = [list(oh_type.get_feature_names_out()), ord_names, target_names]
lists_of_features


[['Type_Chalet',
  'Type_Duplex',
  'Type_Penthouse',
  'Type_Stand Alone Villa',
  'Type_Standalone Villa',
  'Type_Studio',
  'Type_Town House',
  'Type_Twin House',
  'Type_Twin house',
  'Furnished_Unknown',
  'Furnished_Yes',
  'Level_10',
  'Level_10+',
  'Level_2',
  'Level_3',
  'Level_4',
  'Level_5',
  'Level_6',
  'Level_7',
  'Level_8',
  'Level_9',
  'Level_Ground',
  'Level_Highest',
  'Level_Unknown',
  'Payment_Option_Cash or Installment',
  'Payment_Option_Installment',
  'Payment_Option_Unknown Payment',
  'Delivery_Date_2023',
  'Delivery_Date_2024',
  'Delivery_Date_2025',
  'Delivery_Date_2026',
  'Delivery_Date_2027',
  'Delivery_Date_Ready to move',
  'Delivery_Date_Unknown',
  'Delivery_Date_soon',
  'Delivery_Date_within 6 months',
  'Delivery_Term_Finished',
  'Delivery_Term_Not Finished',
  'Delivery_Term_Semi Finished',
  'Delivery_Term_Unknown '],
 ['oe_Type',
  'oe_Furnished',
  'oe_Level',
  'oe_Payment_Option',
  'oe_Delivery_Date',
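  'oe_Delivery_Term'],
 ['te_Type',
  'te_Furnished',
  'te_Level',
  'te_Payment_Option',
  'te_Delivery_Date',
  'te_Delivery_Term']]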

In [49]:
# Model comparison
for i in range(3):
    data_frame = dfs[i]
    cols = lists_of_features[i] + numerical_cols
    # the same random_state reproduces the split that was used to fit the target encoder
    train_x, test_x, train_y, test_y = train_test_split(data_frame[cols], data_frame['Price'], random_state=123, test_size=0.2)
    model = LinearRegression().fit(train_x, train_y)
    pred_train = model.predict(train_x)
    pred_test = model.predict(test_x)
    mae_train = mean_absolute_error(train_y, pred_train)
    mae_test = mean_absolute_error(test_y, pred_test)
    print(f'Model {i}: mae train = {mae_train}, mae test = {mae_test}')


Model 0: mae train = 2605093.2029061667, mae test = 2586360.2044577477
Model 1: mae train = 2743595.6226335536, mae test = 2712083.1569494703
Model 2: mae train = 2657199.2163312766, mae test = 2644092.570161683


We expected the highest error for the ordinal encoder, because linear regression assumes a linear relationship between each feature and the target, and arbitrary integer codes do not provide one. Target mean encoding helped, but in this case the model with one-hot encoding (model 0) performed best.
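The reasoning about the ordinal encoder can be checked on a deliberately extreme toy example (hypothetical data, not the housing dataset): the middle category is cheap and the two outer ones are expensive, so no straight line through the codes 0, 1, 2 can fit the target, while one-hot columns can.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical data: category "b" is cheap, "a" and "c" are expensive
X = pd.DataFrame({"cat": ["a", "b", "c"] * 20})
y = X["cat"].map({"a": 10.0, "b": 0.0, "c": 10.0}).to_numpy()

# Ordinal codes: a=0, b=1, c=2 (alphabetical, unrelated to the target)
X_ord = OrdinalEncoder().fit_transform(X)
# One-hot columns: one free coefficient per category
X_ohe = OneHotEncoder(drop="first").fit_transform(X).toarray()

# In-sample errors of a linear model for both encodings
mae_ord = mean_absolute_error(y, LinearRegression().fit(X_ord, y).predict(X_ord))
mae_ohe = mean_absolute_error(y, LinearRegression().fit(X_ohe, y).predict(X_ohe))

print(mae_ord, mae_ohe)
```

On real data the gap is smaller, as the results above show, but the mechanism is the same.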