# **Data Pre-Processing**

**Data Description**

1. CustomerID : Unique customer ID
2. ProdTaken : Whether the product is taken (1) or not (0)
3. Age : Age of the customer
4. TypeofContact : How the customer was contacted (Company Invited or Self Enquiry)
5. CityTier : City tier depends on the development of a city, population, facilities, and living standards. The categories are ordered i.e. Tier 1 > Tier 2 > Tier 3
6. DurationOfPitch : Duration of the pitch by a salesperson to the customer
7. Occupation : Occupation of customer
8. Gender : Gender of customer
9. NumberOfPersonVisiting : Total number of persons planning to take the trip with the customer
10. NumberOfFollowups : Total number of follow-ups done by the salesperson after the sales pitch
11. ProductPitched : Product pitched by the salesperson
12. PreferredPropertyStar : Preferred hotel property rating by customer
13. MaritalStatus : Marital status of customer
14. NumberOfTrips : Average number of trips in a year by customer
15. Passport : Whether the customer has a passport (0: No, 1: Yes)
16. PitchSatisfactionScore : Sales pitch satisfaction score
17. OwnCar : Whether the customer owns a car (0: No, 1: Yes)
18. NumberOfChildrenVisiting : Total number of children under the age of 5 planning to take the trip with the customer
19. Designation : Designation of the customer in the current organization
20. MonthlyIncome : Gross monthly income of the customer

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
data = pd.read_csv('/content/drive/MyDrive/Google Colab/Travel.csv')
data.head()

Unnamed: 0,CustomerID,ProdTaken,Age,TypeofContact,CityTier,DurationOfPitch,Occupation,Gender,NumberOfPersonVisiting,NumberOfFollowups,ProductPitched,PreferredPropertyStar,MaritalStatus,NumberOfTrips,Passport,PitchSatisfactionScore,OwnCar,NumberOfChildrenVisiting,Designation,MonthlyIncome
0,200000,1,41.0,Self Enquiry,3,6.0,Salaried,Female,3,3.0,Deluxe,3.0,Single,1.0,1,2,1,0.0,Manager,20993.0
1,200001,0,49.0,Company Invited,1,14.0,Salaried,Male,3,4.0,Deluxe,4.0,Divorced,2.0,0,3,1,2.0,Manager,20130.0
2,200002,1,37.0,Self Enquiry,1,8.0,Free Lancer,Male,3,4.0,Basic,3.0,Single,7.0,1,3,0,0.0,Executive,17090.0
3,200003,0,33.0,Company Invited,1,9.0,Salaried,Female,2,3.0,Basic,3.0,Divorced,2.0,1,5,1,1.0,Executive,17909.0
4,200004,0,,Self Enquiry,1,8.0,Small Business,Male,2,3.0,Basic,4.0,Divorced,1.0,0,5,1,0.0,Executive,18468.0


## **Splitting Data**

In [None]:
from sklearn.model_selection import train_test_split
X = data.drop('ProdTaken', axis=1)
y = data['ProdTaken']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, stratify=y, random_state=42)

In [None]:
train =  pd.concat([X_train, pd.DataFrame(y_train)], axis=1)
test =  pd.concat([X_test, pd.DataFrame(y_test)], axis=1)

train.shape, test.shape

((3910, 20), (978, 20))
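
The effect of `stratify=y` can be checked on a small synthetic target. This is only a sketch: `X_toy` and `y_toy` are made-up arrays, not the travel data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced target: 80 negatives and 20 positives (hypothetical data).
y_toy = np.array([0] * 80 + [1] * 20)
X_toy = np.arange(100).reshape(-1, 1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, stratify=y_toy, random_state=42
)

# With stratification, both partitions keep the same share of positives.
print(y_tr.mean(), y_te.mean())
```

Without `stratify`, the positive rate in a small test split can drift away from the rate in the full data.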

## **Replacing Invalid Values & Dropping CustomerID**

In [None]:
# replace invalid category values
train.loc[train['Gender'] == 'Fe Male', 'Gender'] = 'Female'
train.loc[train['MaritalStatus'] == 'Unmarried', 'MaritalStatus'] = 'Single'
test.loc[test['Gender'] == 'Fe Male', 'Gender'] = 'Female'
test.loc[test['MaritalStatus'] == 'Unmarried', 'MaritalStatus'] = 'Single'
# drop the CustomerID column
train.drop('CustomerID', axis=1, inplace=True)
test.drop('CustomerID', axis=1, inplace=True)
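
The same label clean-up could also be expressed as one mapping applied with `DataFrame.replace`. A minimal sketch on a made-up frame; `toy` and `fixes` are hypothetical names.

```python
import pandas as pd

# Hypothetical rows containing the two inconsistent labels.
toy = pd.DataFrame({
    'Gender': ['Male', 'Fe Male', 'Female'],
    'MaritalStatus': ['Unmarried', 'Married', 'Single'],
})

# Per-column mapping from the inconsistent label to its canonical form.
fixes = {
    'Gender': {'Fe Male': 'Female'},
    'MaritalStatus': {'Unmarried': 'Single'},
}
toy = toy.replace(fixes)

print(toy['Gender'].unique(), toy['MaritalStatus'].unique())
```

Keeping the mapping in one dictionary makes it easy to apply identical fixes to both `train` and `test`.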

## **Handle Missing Values**

In [None]:
# Percentage of missing values for each feature
for column in train.columns:
    missing_value = (train[column].isnull().sum()/train.shape[0])*100
    if missing_value > 0:
        print(f'Missing values in feature {column}: {missing_value:.2f}%')

Missing values in feature Age: 4.68%
Missing values in feature TypeofContact: 0.43%
Missing values in feature DurationOfPitch: 4.96%
Missing values in feature NumberOfFollowups: 1.00%
Missing values in feature PreferredPropertyStar: 0.49%
Missing values in feature NumberOfTrips: 2.74%
Missing values in feature NumberOfChildrenVisiting: 1.36%
Missing values in feature MonthlyIncome: 4.68%


In [None]:
# calculate percentage of total missing values
total_missing_values = train.isnull().any(axis = 1).sum()*100/train.shape[0]
print(f'{total_missing_values:.2f}%')

15.35%


Although no single column has more than 5% missing values, the rows containing at least one missing value add up to 15.35% of the training data. The missing entries are therefore imputed, using the median for numeric features and the mode (most frequent value) for categorical features.

In [None]:
from sklearn.impute import SimpleImputer

# Numeric columns: fit the median on train only, then apply it to both sets
for col in train.select_dtypes(include='number').columns:
    imputer = SimpleImputer(strategy='median').fit(train[[col]])
    train[col] = imputer.transform(train[[col]]).ravel()
    test[col] = imputer.transform(test[[col]]).ravel()

# Categorical columns: fit the most frequent value on train only
for col in train.select_dtypes(include=['object', 'category']).columns:
    imputer = SimpleImputer(strategy='most_frequent').fit(train[[col]])
    train[col] = imputer.transform(train[[col]]).ravel()
    test[col] = imputer.transform(test[[col]]).ravel()
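
To illustrate why the imputer is fitted on the training split only, here is a small sketch with made-up values. `toy_train` and `toy_test` are hypothetical frames.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical splits, each with one missing Age value.
toy_train = pd.DataFrame({'Age': [20.0, 30.0, 40.0, np.nan]})
toy_test = pd.DataFrame({'Age': [np.nan, 100.0]})

# The median is learned from the training rows only.
imputer = SimpleImputer(strategy='median').fit(toy_train[['Age']])
toy_test['Age'] = imputer.transform(toy_test[['Age']]).ravel()

# The gap in the test split is filled with the training median (30.0),
# not with a statistic computed from the test rows.
print(imputer.statistics_, toy_test['Age'].tolist())
```

Reusing the training statistic keeps information from the test split out of the preprocessing step.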

## **Feature Extraction**

The `Age` feature is binned according to the [Saudi Arabia age structure reference](https://www.indexmundi.com/saudi_arabia/age_structure.html), namely:
  * 15-24 years (early working age)
  * 25-54 years (prime working age)
  * 55-64 years (mature working age)



In [None]:
age_labels = ['Early Working Age', 'Prime Working Age', 'Mature Working Age']
train['AgeStructure'] = pd.cut(train['Age'], [15,24,54,64], labels=age_labels)
test['AgeStructure'] = pd.cut(test['Age'], [15,24,54,64], labels=age_labels)
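
How `pd.cut` treats the bin edges can be seen on a few boundary ages. This is a sketch with made-up ages; by default the intervals are open on the left and closed on the right.

```python
import pandas as pd

# Boundary ages chosen to probe each bin edge (hypothetical values).
ages = pd.Series([15, 24, 25, 54, 55, 64, 70])
labels = ['Early Working Age', 'Prime Working Age', 'Mature Working Age']

# Bins are (15, 24], (24, 54], (54, 64]; anything outside becomes NaN.
binned = pd.cut(ages, [15, 24, 54, 64], labels=labels)
print(binned.tolist())
```

Here the ages 15 and 70 fall outside every interval and come back as missing, so the bin edges need to cover the full age range of the data.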

A `MarketingCost` feature is created from the `DurationOfPitch` and `NumberOfFollowups` features.

In [None]:
PhoneRate = 0.5
train['MarketingCost'] = train['DurationOfPitch'] * train['NumberOfFollowups'] * PhoneRate
test['MarketingCost'] = test['DurationOfPitch'] * test['NumberOfFollowups'] * PhoneRate
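
As a worked example of the formula, take the first row shown by `data.head()` (pitch duration 6.0, 3 follow-ups). The helper `marketing_cost` is a hypothetical name, and the unit of the rate is not stated in this notebook.

```python
phone_rate = 0.5  # same constant as PhoneRate in the cell above; unit not stated in the notebook

def marketing_cost(duration_of_pitch, number_of_followups, rate=phone_rate):
    """Pitch duration times number of follow-ups times the assumed rate."""
    return duration_of_pitch * number_of_followups * rate

print(marketing_cost(6.0, 3.0))  # 9.0
```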

In [None]:
train.shape, test.shape

((3910, 21), (978, 21))

## **Handle Duplicated Data**

The training data contains 99 duplicated rows. They are removed so that repeated records do not distort the model.

In [None]:
# number of duplicated rows
train.duplicated().sum(), test.duplicated().sum()

(99, 3)

In [None]:
# number of rows before removing duplicates
train.shape, test.shape

((3910, 21), (978, 21))

In [None]:
# remove duplicated rows
train.drop_duplicates(keep='first', inplace=True)
test.drop_duplicates(keep='first', inplace=True)

In [None]:
# number of rows after removing duplicates
train.shape, test.shape

((3811, 21), (975, 21))
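
The behaviour of `duplicated()` and `drop_duplicates(keep='first')` can be sketched on a tiny made-up frame (`toy` is hypothetical):

```python
import pandas as pd

# Rows 1 and 3 repeat row 0 exactly.
toy = pd.DataFrame({
    'Age': [30, 30, 41, 30],
    'Gender': ['Male', 'Male', 'Female', 'Male'],
})

# duplicated() flags every repeat after the first occurrence.
n_duplicates = toy.duplicated().sum()

# keep='first' retains the first occurrence and drops the later copies.
deduped = toy.drop_duplicates(keep='first')

print(n_duplicates, deduped.index.tolist())  # 2 [0, 2]
```

The original index labels are kept after the drop, so a `reset_index(drop=True)` may be useful if later steps rely on positional alignment.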

## **Handle Outliers**

Based on the EDA, 5 features contain outliers:
1. `MonthlyIncome`. There are 283 outliers in the MonthlyIncome column, with values between SAR 32,856 and SAR 98,678. When grouped by Designation, 2 values exceed the expected monthly income range, both under the Executive designation. According to a survey by [salaryexplorer.com](http://www.salaryexplorer.com/salary-survey.php?loc=2150&loctype=3&job=24&jobtype=1#:~:text=A%20person%20working%20in%20Executive%20and%20Management%20in%20Riyadh%20typically,%2C%20transport%2C%20and%20other%20benefits.), the highest average monthly income for an Executive in Riyadh is SAR 39,400. These 2 extreme values are replaced with the largest non-extreme value, SAR 38,677.
2. `NumberOfFollowups`. There are 106 outliers with the value 6 and 146 outliers with the value 1 in the NumberOfFollowups column. We chose not to remove them because they are natural variation: a salesperson may well follow up with a customer only once, or as many as 6 times.
3. `NumberOfTrips`. There are 87 outliers with values between 8 and 22 in the NumberOfTrips column. We chose not to remove them because they are natural variation: customers with an average monthly income of SAR 24,369.8 may well take between 8 and 22 trips a year.
4. `NumberOfPersonVisiting`. There are 2 outliers with the value 5 in the NumberOfPersonVisiting column. We chose not to remove them because they are natural variation: a group of 5 people planning a trip together is entirely plausible.
5. `DurationOfPitch`. There are 2 very extreme DurationOfPitch values, 127 and 126. We believe these are data entry errors. They are replaced with the largest non-extreme value, 36.

In [None]:
# handle outliers
train.loc[train['MonthlyIncome'] > 38677.0, 'MonthlyIncome'] = 38677.0
train.loc[train['DurationOfPitch'] > 36.0, 'DurationOfPitch'] = 36.0
train.loc[train['NumberOfPersonVisiting'] > 5, 'NumberOfPersonVisiting'] = 5
train.loc[train['NumberOfFollowups'] > 6, 'NumberOfFollowups'] = 6
train.loc[train['NumberOfTrips'] > 22.0 , 'NumberOfTrips'] = 22.0

test.loc[test['DurationOfPitch'] > 36.0, 'DurationOfPitch'] = 36.0
test.loc[test['NumberOfPersonVisiting'] > 5, 'NumberOfPersonVisiting'] = 5
test.loc[test['NumberOfFollowups'] > 6, 'NumberOfFollowups'] = 6
test.loc[test['NumberOfTrips'] > 22.0 , 'NumberOfTrips'] = 22.0
test.loc[test['MonthlyIncome'] > 38677.0, 'MonthlyIncome'] = 38677.0
# lower bound: raise extremely low incomes to 16009
test.loc[test['MonthlyIncome'] < 16009.0, 'MonthlyIncome'] = 16009.0

In [None]:
train.shape, test.shape

((3811, 21), (975, 21))
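
The notebook does not state how the outliers were counted. A common choice is the 1.5 × IQR rule, so the sketch below assumes that rule and uses made-up durations. It also shows `Series.clip` as a compact way to cap values at a chosen maximum.

```python
import pandas as pd

# Hypothetical pitch durations with two implausible entries at the end.
durations = pd.Series([6.0, 8.0, 9.0, 10.0, 12.0, 14.0, 15.0,
                       16.0, 20.0, 30.0, 36.0, 126.0, 127.0])

# Assumed detection rule: anything above Q3 + 1.5 * IQR is flagged.
q1, q3 = durations.quantile(0.25), durations.quantile(0.75)
upper_fence = q3 + 1.5 * (q3 - q1)
flagged = durations[durations > upper_fence]

# Capping at the largest non-extreme value, equivalent to the .loc assignments above.
capped = durations.clip(upper=36.0)

print(upper_fence, flagged.tolist(), capped.max())
```

`clip` leaves values at or below the cap unchanged, which matches the behaviour of the `.loc[... > cap] = cap` pattern.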

## **Feature Transformation**

Some numeric features have values that are far larger than others. For example, the smallest `MonthlyIncome` value is 1000, while the largest `NumberOfTrips` value is 22. Feature scaling is therefore applied to bring all features onto a comparable scale. The scaling method used is Standardization, given that several features are positively skewed.


In [None]:
from sklearn.preprocessing import StandardScaler
features = ['Age','DurationOfPitch','NumberOfTrips','NumberOfPersonVisiting', 'NumberOfFollowups', 'NumberOfChildrenVisiting','MonthlyIncome','MarketingCost']
for i in features:
  scaler = StandardScaler().fit(train[[i]])
  train[i]= scaler.transform(train[[i]])
  test[i]= scaler.transform(test[[i]])
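
What `StandardScaler` computes can be reproduced by hand. The sketch below uses made-up incomes and checks that the test values are standardized with the training mean and standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical incomes for the two splits.
train_income = np.array([[17000.0], [20000.0], [23000.0]])
test_income = np.array([[20000.0], [26000.0]])

scaler = StandardScaler().fit(train_income)
scaled = scaler.transform(test_income)

# Same computation by hand: z = (x - mean_train) / std_train.
# np.std uses the population formula (ddof=0), as StandardScaler does.
manual = (test_income - train_income.mean()) / train_income.std()

print(np.allclose(scaled, manual))  # True
```

Because the statistics come from the training split, a test value equal to the training mean maps to 0 regardless of the other test values.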

## **Feature Encoding**
During EDA, the `Gender` feature was found to have three unique values, Male, Female, and Fe Male, so Fe Male had to be changed to Female. In the `MaritalStatus` feature, Unmarried had to be changed to Single, because according to the [reference](https://www.un.org/en/development/desa/population/publications/dataset/marriage/marital-status.asp) there is no Unmarried category. The change reduces ambiguity.

Categorical features fall into two types, nominal and ordinal. Nominal features with two unique values are converted to binary numeric labels, 0 or 1. Nominal features with more than two unique values are expanded into separate indicator features (one-hot encoding).

In [None]:
# Label encoding
from sklearn import preprocessing
label_encoder = preprocessing.LabelEncoder()

features = ['TypeofContact','Gender']
for var in features:
  train[var]= label_encoder.fit_transform(train[var])
  test[var]= label_encoder.transform(test[var])

In [None]:
# Encoding
categorical_cols = ['Occupation', 'ProductPitched', 'MaritalStatus','Designation', 'AgeStructure']

encoding_train = pd.get_dummies(train[categorical_cols], prefix_sep = ':', drop_first=True)
train = pd.concat([train, encoding_train], axis=1)
train.drop(categorical_cols, axis=1, inplace=True)

encoding_test = pd.get_dummies(test[categorical_cols], prefix_sep = ':')
test = pd.concat([test, encoding_test], axis=1)
test.drop(categorical_cols, axis=1, inplace=True)
zeros = [col for col in encoding_train.columns.tolist() if col not in encoding_test.columns.tolist()]
for i in zeros:
  test[i] = 0
drop_first = [col for col in encoding_test.columns.tolist() if col not in encoding_train.columns.tolist()]
for i in drop_first:
  test.drop(i, axis=1, inplace=True)

In [None]:
train.shape, test.shape

((3811, 31), (975, 31))

In [None]:
# check that train and test have identical columns
train.columns.tolist() == test.columns.tolist()

True
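
Aligning the one-hot columns of the test set with those of the training set can also be done with `DataFrame.reindex`. A minimal sketch on made-up frames (`toy_train` and `toy_test` are hypothetical):

```python
import pandas as pd

# The test split is missing one of the categories seen in train.
toy_train = pd.DataFrame({'Occupation': ['Salaried', 'Free Lancer', 'Small Business']})
toy_test = pd.DataFrame({'Occupation': ['Salaried', 'Salaried']})

enc_train = pd.get_dummies(toy_train, prefix_sep=':', drop_first=True)

# reindex adds training columns absent from test (filled with 0)
# and drops test columns that train does not have.
enc_test = pd.get_dummies(toy_test, prefix_sep=':').reindex(
    columns=enc_train.columns, fill_value=0
)

print(enc_test.columns.tolist() == enc_train.columns.tolist())  # True
```

This also fixes the column order, so no separate reordering step is needed.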

## **Handle Class Imbalance**

The target in this case is `ProdTaken`. EDA showed that the target is imbalanced: 80% of the rows have label 0 and the remaining 20% have label 1. This class imbalance is handled with SMOTE.

In [None]:
y = train['ProdTaken'].values
column_names = train.drop(['ProdTaken'], axis=1).columns.tolist()
X = train.drop(['ProdTaken'], axis=1).values

X.shape, y.shape

((3811, 30), (3811,))

In [None]:
pd.DataFrame(y).value_counts()

0.0    3093
1.0     718
dtype: int64

In [None]:
# Oversampling SMOTE
from imblearn.over_sampling import SMOTE
smote = SMOTE(sampling_strategy=1,random_state = 42)
X_over_SMOTE, y_over_SMOTE = smote.fit_resample(X,  y)

print(pd.Series(y_over_SMOTE).value_counts())

0.0    3093
1.0    3093
dtype: int64
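
The core idea behind SMOTE is to create new minority samples by interpolating between an existing minority point and one of its nearest minority neighbours. The sketch below is a simplified illustration with a single neighbour and made-up points. It is not the `imblearn` implementation, and `synthesize` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(42)

# Three minority-class points in a 2-D feature space (hypothetical values).
minority = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 3.0]])

def synthesize(points, rng):
    """Create one synthetic point between a random sample and its nearest neighbour."""
    i = rng.integers(len(points))
    dists = np.linalg.norm(points - points[i], axis=1)
    dists[i] = np.inf                 # exclude the point itself
    j = np.argmin(dists)              # nearest minority neighbour (k=1 in this sketch)
    lam = rng.random()                # interpolation weight in [0, 1)
    return points[i] + lam * (points[j] - points[i])

new_point = synthesize(minority, rng)
print(new_point)
```

Every synthetic point lies on the line segment between two real minority points, which is why the oversampled features stay within the range of the original minority class.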


In [None]:
X_df = pd.DataFrame(X_over_SMOTE, columns = column_names)
pd.set_option('display.max_columns', None)
X_df.head()

Unnamed: 0,Age,TypeofContact,CityTier,DurationOfPitch,Gender,NumberOfPersonVisiting,NumberOfFollowups,PreferredPropertyStar,NumberOfTrips,Passport,PitchSatisfactionScore,OwnCar,NumberOfChildrenVisiting,MonthlyIncome,MarketingCost,Occupation:Large Business,Occupation:Salaried,Occupation:Small Business,ProductPitched:Deluxe,ProductPitched:King,ProductPitched:Standard,ProductPitched:Super Deluxe,MaritalStatus:Married,MaritalStatus:Single,Designation:Executive,Designation:Manager,Designation:Senior Manager,Designation:VP,AgeStructure:Prime Working Age,AgeStructure:Mature Working Age
0,-0.171019,1.0,3.0,-1.162194,1.0,-1.266742,-2.693587,5.0,-0.673531,0.0,4.0,0.0,-1.379883,-0.243095,-1.415815,0.0,0.0,1.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0
1,-0.171019,1.0,1.0,-0.914522,1.0,0.119291,-0.703231,3.0,-0.673531,0.0,5.0,0.0,0.955775,-1.000276,-0.919053,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0
2,-0.280484,0.0,1.0,-0.66685,1.0,0.119291,0.291947,3.0,-0.673531,0.0,2.0,1.0,-0.212054,0.577745,-0.477486,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0
3,0.047911,1.0,1.0,1.314528,1.0,1.505325,0.291947,4.0,1.520339,0.0,4.0,0.0,0.955775,-0.361095,1.288782,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0
4,0.047911,1.0,1.0,-1.162194,0.0,-1.266742,-0.703231,5.0,0.423404,0.0,2.0,1.0,-0.212054,-1.170433,-1.08464,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,1.0,0.0


In [None]:
y_df = pd.DataFrame(y_over_SMOTE, columns=['ProdTaken'])
y_df.head()

Unnamed: 0,ProdTaken
0,0.0
1,0.0
2,0.0
3,0.0
4,0.0


In [None]:
X_df.shape, y_df.shape

((6186, 30), (6186, 1))

## **Feature Selection**

An ANOVA F-test is used to determine which features to keep for modelling.


In [None]:
from sklearn.feature_selection import f_classif
# f_classif expects a 1-D target, so pass the ProdTaken column rather than the one-column DataFrame
F_statistic, p_value = f_classif(X_df, y_df['ProdTaken'])
anova_table = pd.DataFrame(data = {'Feature': X_df.columns, 
                                   'F-score' : F_statistic,
                                   'p-value' : p_value.round(decimals=3)})
anova_table['significance'] = anova_table.apply(lambda x: 'Not Significant' if x['p-value'] >= 0.05 else 'Significant', axis=1)
anova_table = anova_table.merge(X_df[X_df.columns].describe().T.reset_index(), left_on='Feature', right_on='index').sort_values(['F-score','count'], ascending=False)
anova_table


Unnamed: 0,Feature,F-score,p-value,significance,index,count,mean,std,min,25%,50%,75%,max
9,Passport,727.367885,0.0,Significant,Passport,6186.0,0.385784,0.473658,0.0,0.0,0.0,1.0,1.0
24,Designation:Executive,601.583565,0.0,Significant,Designation:Executive,6186.0,0.472156,0.494507,0.0,0.0,0.0,1.0,1.0
23,MaritalStatus:Single,369.455128,0.0,Significant,MaritalStatus:Single,6186.0,0.392677,0.477377,0.0,0.0,0.0,1.0,1.0
18,ProductPitched:Deluxe,262.211206,0.0,Significant,ProductPitched:Deluxe,6186.0,0.293548,0.450583,0.0,0.0,0.0,1.0,1.0
25,Designation:Manager,262.211206,0.0,Significant,Designation:Manager,6186.0,0.293548,0.450583,0.0,0.0,0.0,1.0,1.0
0,Age,254.587966,0.0,Significant,Age,6186.0,-0.135495,1.038282,-2.141386,-0.853512,-0.280484,0.48577,2.565601
13,MonthlyIncome,196.175382,0.0,Significant,MonthlyIncome,6186.0,-0.104794,0.957282,-3.736873,-0.716083,-0.325992,0.27283,3.005759
6,NumberOfFollowups,179.875323,0.0,Significant,NumberOfFollowups,6186.0,0.103564,0.963689,-2.693587,-0.703231,0.291947,0.291947,2.282302
28,AgeStructure:Prime Working Age,168.689722,0.0,Significant,AgeStructure:Prime Working Age,6186.0,0.850524,0.34789,0.0,1.0,1.0,1.0,1.0
22,MaritalStatus:Married,150.03382,0.0,Significant,MaritalStatus:Married,6186.0,0.434636,0.480411,0.0,0.0,0.0,1.0,1.0
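
The F-score in the table is the one-way ANOVA statistic: the variance between the class means divided by the variance within the classes. A hand-rolled sketch for a single feature, using made-up values (`anova_f` is a hypothetical helper, not part of scikit-learn):

```python
import numpy as np

def anova_f(x, y):
    """One-way ANOVA F statistic of feature x grouped by class labels y."""
    x, y = np.asarray(x, dtype=float), np.asarray(y)
    classes = np.unique(y)
    groups = [x[y == c] for c in classes]
    grand_mean = x.mean()
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(classes) - 1
    df_within = len(x) - len(classes)
    return (ss_between / df_between) / (ss_within / df_within)

f_value = anova_f([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
print(f_value)  # 13.5
```

A large F value means the feature's mean differs strongly between the two `ProdTaken` classes relative to its spread inside each class.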


In [None]:
significant_features = anova_table[anova_table['significance']=='Significant']['Feature'].tolist()
significant_features

['Passport',
 'Designation:Executive',
 'MaritalStatus:Single',
 'ProductPitched:Deluxe',
 'Designation:Manager',
 'Age',
 'MonthlyIncome',
 'NumberOfFollowups',
 'AgeStructure:Prime Working Age',
 'MaritalStatus:Married',
 'ProductPitched:Super Deluxe',
 'MarketingCost',
 'PreferredPropertyStar',
 'CityTier',
 'ProductPitched:King',
 'Designation:VP',
 'DurationOfPitch',
 'Occupation:Large Business',
 'PitchSatisfactionScore',
 'Occupation:Salaried',
 'Gender',
 'NumberOfTrips',
 'TypeofContact']

In [None]:
X_df.drop(columns=([col for col in X_df.columns.tolist() if col not in significant_features]), axis=1, inplace=True)
train = pd.concat([y_df, X_df], axis=1)

train.shape

(6186, 24)

In [None]:
y = test['ProdTaken'].values
column_names = test.drop(['ProdTaken'], axis=1).columns.tolist()
X = test.drop(['ProdTaken'], axis=1).values

# Oversampling SMOTE
from imblearn.over_sampling import SMOTE
smote = SMOTE(sampling_strategy=1,random_state = 42)
X_over_SMOTE, y_over_SMOTE = smote.fit_resample(X,  y)

X_df = pd.DataFrame(X_over_SMOTE, columns = column_names)
y_df = pd.DataFrame(y_over_SMOTE, columns=['ProdTaken'])

In [None]:
X_df.drop(columns=([col for col in X_df.columns.tolist() if col not in significant_features]), axis=1, inplace=True)
test = pd.concat([y_df, X_df], axis=1)
test = test[train.columns.tolist()]
test.shape

(1584, 24)

In [None]:
# check that train and test have identical columns
train.columns.tolist() == test.columns.tolist()

True

In [None]:
train.to_csv('preprocessed_train.csv', index=False)
test.to_csv('preprocessed_test.csv', index=False)

## **Additional features that could improve model performance**

1. `SocioEconomicStatus`, a combined economic and sociological measure of a customer's work experience, economic access to resources, and social position.
2. `GenerationalSegment`, grouping customers by generation: Gen Z, Millennials, Generation X, and Baby Boomers. These generations are believed to have distinct preferences, behaviours, personality traits, and beliefs.
3. `SalesPitching`, the code of the salesperson who pitched to the customer.
4. `PreviousProdTaken`, the number of product packages the customer has taken before.
5. `GeographicSegment`, dividing customers by location. A customer's location can help in understanding their needs and the characteristics of the area where they live, such as climate and population density.
6. `PackagePrice`, an extension of the `ProductPitched` feature:
  *   Basic : SAR 199.4
  *   Standard : SAR 219.4
  *   Deluxe : SAR 284.4
  *   Super Deluxe : SAR 359
  *   King : SAR 683.2
  
  The assumed prices are taken from this [reference](https://id.hotels.com/en/ho396179/the-ritz-carlton-riyadh-riyadh-saudi-arabia/?chkin=2022-09-01&chkout=2022-09-03&x_pwa=1&rfrr=HSR&pwa_ts=1659394608325&referrerUrl=aHR0cHM6Ly9pZC5ob3RlbHMuY29tL0hvdGVsLVNlYXJjaA%3D%3D&useRewards=false&rm1=a2&regionId=3051&destination=Riyadh%2C+Riyadh%2C+Saudi+Arabia&destType=MARKET&selected=4718588&sort=RECOMMENDED&top_dp=3945675&top_cur=IDR&semdtl=&userIntent=&selectedRoomType=200127700&selectedRatePlan=381495702&expediaPropertyId=4718588)
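
If the proposed `PackagePrice` feature were added, it could be derived with a simple lookup on `ProductPitched`. This is a sketch on a made-up frame using the assumed prices listed above, and the mapping would have to run before `ProductPitched` is one-hot encoded.

```python
import pandas as pd

# Assumed prices in SAR, copied from the list above.
package_price = {
    'Basic': 199.4,
    'Standard': 219.4,
    'Deluxe': 284.4,
    'Super Deluxe': 359.0,
    'King': 683.2,
}

# Hypothetical rows standing in for the raw ProductPitched column.
toy = pd.DataFrame({'ProductPitched': ['Deluxe', 'Basic', 'King']})
toy['PackagePrice'] = toy['ProductPitched'].map(package_price)

print(toy['PackagePrice'].tolist())  # [284.4, 199.4, 683.2]
```

Any product name missing from the dictionary would map to NaN, which makes unexpected categories easy to spot.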