**Task 3: Feature Engineering**

Task list:

1. Extract additional features from the existing columns, such as the length of the restaurant name or address.

2. Create new features like "Has Table Booking" or "Has Online Delivery" by encoding categorical variables.

In [35]:
import pandas as pd
from sklearn.preprocessing import LabelEncoder

In [36]:
new_data = pd.read_csv("new_data.csv")

**1. Feature extraction**

In [37]:
# Extracting length of the restaurant name
new_data['Restaurant Name Length'] = new_data['Restaurant Name'].apply(len)    
restaurantName_length = new_data[['Restaurant Name', 'Restaurant Name Length']]
restaurantName_length

Unnamed: 0,Restaurant Name,Restaurant Name Length
0,Le Petit Souffle,16
1,Izakaya Kikufuji,16
2,Heat - Edsa Shangri-La,22
3,Ooma,4
4,Sambo Kojin,11
...,...,...
9546,Naml۱ Gurme,11
9547,Ceviz Aac۱,10
9548,Huqqa,5
9549,Ak Kahve,8


In [38]:
# Extracting length of the address
new_data['Address Length'] = new_data['Address'].apply(len)
address_length = new_data[['Address', 'Address Length']]
address_length

Unnamed: 0,Address,Address Length
0,"Third Floor, Century City Mall, Kalayaan Avenu...",71
1,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...",67
2,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...",56
3,"Third Floor, Mega Fashion Hall, SM Megamall, O...",70
4,"Third Floor, Mega Atrium, SM Megamall, Ortigas...",64
...,...,...
9546,"Kemanke Karamustafa Paa Mahallesi, R۱ht۱m Cadd...",95
9547,"Kouyolu Mahallesi, Muhittin st_nda Caddesi, No...",67
9548,"Kuru_eme Mahallesi, Muallim Naci Caddesi, No 5...",64
9549,"Kuru_eme Mahallesi, Muallim Naci Caddesi, No 6...",66
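
The two length features above use `.apply(len)`, which works as long as every value is a string. As a hedged alternative, the sketch below uses the vectorized `.str.len()` accessor on a small made-up frame (the `toy` data is an assumption for illustration, not part of `new_data`). It leaves missing values as NaN, where `.apply(len)` would raise a `TypeError`.

```python
import pandas as pd

# Toy frame: column names mirror the notebook, values are made up for illustration.
toy = pd.DataFrame({
    "Restaurant Name": ["Ooma", "Sambo Kojin", None],
    "Address": ["1 Garden Way", None, "SM Megamall"],
})

# .str.len() is vectorized and propagates missing values instead of failing,
# whereas .apply(len) raises TypeError on None/NaN entries.
toy["Restaurant Name Length"] = toy["Restaurant Name"].str.len()
toy["Address Length"] = toy["Address"].str.len()

print(toy)
```

If the real columns contain no missing values, both approaches should give the same lengths.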


In [39]:
# Flag restaurants that offer both table booking and online delivery

# Encode the 'Yes'/'No' columns for table booking and online delivery as 1/0
label_encoder = LabelEncoder()

new_data['Has Table Booking Encoded'] = label_encoder.fit_transform(new_data['Has Table booking'])

new_data['Has Online Delivery Encoded'] = label_encoder.fit_transform(new_data['Has Online delivery'])

# Define a function to determine if both conditions are met
def both_tb_od(row):
    if row['Has Table Booking Encoded'] == 1 and row['Has Online Delivery Encoded'] == 1:
        return 'Yes'
    else:
        return 'No'

# Apply the function to each row and create a new column
new_data['Both TB and OD'] = new_data.apply(both_tb_od, axis=1)

# Display the updated DataFrame
new_data.head()

Unnamed: 0,Restaurant ID,Restaurant Name,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,...,Price range,Aggregate rating,Rating color,Rating text,Votes,Restaurant Name Length,Address Length,Has Table Booking Encoded,Has Online Delivery Encoded,Both TB and OD
0,6317637,Le Petit Souffle,162,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City","Century City Mall, Poblacion, Makati City, Mak...",121.027535,14.565443,"French, Japanese, Desserts",...,3,4.8,Dark Green,Excellent,314,16,71,1,0,No
1,6304287,Izakaya Kikufuji,162,Makati City,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...","Little Tokyo, Legaspi Village, Makati City","Little Tokyo, Legaspi Village, Makati City, Ma...",121.014101,14.553708,Japanese,...,3,4.5,Dark Green,Excellent,591,16,67,1,0,No
2,6300002,Heat - Edsa Shangri-La,162,Mandaluyong City,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...","Edsa Shangri-La, Ortigas, Mandaluyong City","Edsa Shangri-La, Ortigas, Mandaluyong City, Ma...",121.056831,14.581404,"Seafood, Asian, Filipino, Indian",...,4,4.4,Green,Very Good,270,22,56,1,0,No
3,6318506,Ooma,162,Mandaluyong City,"Third Floor, Mega Fashion Hall, SM Megamall, O...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.056475,14.585318,"Japanese, Sushi",...,4,4.9,Dark Green,Excellent,365,4,70,0,0,No
4,6314302,Sambo Kojin,162,Mandaluyong City,"Third Floor, Mega Atrium, SM Megamall, Ortigas...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.057508,14.58445,"Japanese, Korean",...,4,4.8,Dark Green,Excellent,229,11,64,1,0,No
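
The row-wise `apply` above can also be expressed as a vectorized boolean mask. This is a minimal sketch on a made-up frame (`toy` and its values are assumptions; only the column names come from the notebook). It combines the two encoded flags with `&` and converts the result to 'Yes'/'No' with `numpy.where`.

```python
import numpy as np
import pandas as pd

# Toy frame covering all four combinations of the two encoded flags.
toy = pd.DataFrame({
    "Has Table Booking Encoded": [1, 1, 0, 0],
    "Has Online Delivery Encoded": [1, 0, 1, 0],
})

# Parenthesize each comparison: & binds more tightly than ==.
both = (toy["Has Table Booking Encoded"] == 1) & (toy["Has Online Delivery Encoded"] == 1)

# Map the boolean mask to the same 'Yes'/'No' labels used in the notebook.
toy["Both TB and OD"] = np.where(both, "Yes", "No")

print(toy)
```

On a frame the size of `new_data` this avoids a Python-level function call per row, which is usually faster, though the result should be the same.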


In [40]:
# Flag restaurants that offer table booking or online delivery (at least one of the two)

# Return 'Yes' when at least one of the encoded flags is 1.
# `or` is used rather than `|`: `|` binds more tightly than `==`, so an
# unparenthesized `a == 1 | b == 0` is parsed as `a == (1 | b) == 0`.
def tb_or_od(row):
    if row['Has Table Booking Encoded'] == 1 or row['Has Online Delivery Encoded'] == 1:
        return 'Yes'
    else:
        return 'No'

new_data['TB or OD'] = new_data.apply(tb_or_od, axis=1)
new_data.head()

Unnamed: 0,Restaurant ID,Restaurant Name,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,...,Aggregate rating,Rating color,Rating text,Votes,Restaurant Name Length,Address Length,Has Table Booking Encoded,Has Online Delivery Encoded,Both TB and OD,TB or OD
0,6317637,Le Petit Souffle,162,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City","Century City Mall, Poblacion, Makati City, Mak...",121.027535,14.565443,"French, Japanese, Desserts",...,4.8,Dark Green,Excellent,314,16,71,1,0,No,Yes
1,6304287,Izakaya Kikufuji,162,Makati City,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...","Little Tokyo, Legaspi Village, Makati City","Little Tokyo, Legaspi Village, Makati City, Ma...",121.014101,14.553708,Japanese,...,4.5,Dark Green,Excellent,591,16,67,1,0,No,Yes
2,6300002,Heat - Edsa Shangri-La,162,Mandaluyong City,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...","Edsa Shangri-La, Ortigas, Mandaluyong City","Edsa Shangri-La, Ortigas, Mandaluyong City, Ma...",121.056831,14.581404,"Seafood, Asian, Filipino, Indian",...,4.4,Green,Very Good,270,22,56,1,0,No,Yes
3,6318506,Ooma,162,Mandaluyong City,"Third Floor, Mega Fashion Hall, SM Megamall, O...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.056475,14.585318,"Japanese, Sushi",...,4.9,Dark Green,Excellent,365,4,70,0,0,No,No
4,6314302,Sambo Kojin,162,Mandaluyong City,"Third Floor, Mega Atrium, SM Megamall, Ortigas...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.057508,14.58445,"Japanese, Korean",...,4.8,Dark Green,Excellent,229,11,64,1,0,No,Yes
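
When combining comparisons for an "either" flag, operator precedence matters: in Python, `|` binds more tightly than `==`, so `a == 1 | b == 0` is read as the chained comparison `a == (1 | b) == 0`. The sketch below illustrates this on plain integers and then shows a vectorized, parenthesized form on a made-up frame. It assumes "either" means at least one of the two services is offered; `toy` and its values are illustrative only.

```python
import pandas as pd

# Scalar illustration: table booking offered (1), online delivery not offered (0).
tb, od = 1, 0
unparenthesized = tb == 1 | od == 0   # parsed as tb == (1 | od) == 0
explicit = (tb == 1) or (od == 1)     # at least one service offered

print(unparenthesized, explicit)

# Vectorized form on a toy frame covering all four flag combinations.
toy = pd.DataFrame({
    "Has Table Booking Encoded": [1, 1, 0, 0],
    "Has Online Delivery Encoded": [1, 0, 1, 0],
})

# On Series, | is the element-wise OR, so each comparison needs its own parentheses.
either = (toy["Has Table Booking Encoded"] == 1) | (toy["Has Online Delivery Encoded"] == 1)
toy["TB or OD"] = either.map({True: "Yes", False: "No"})

print(toy)
```

If the intended meaning is "exactly one of the two", replacing `|` with `^` in the vectorized expression would give the exclusive version instead.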


**2. Creating new features by encoding categorical variables such as "Has Table Booking" and "Has Online Delivery"**

The "Has Table booking" and "Has Online delivery" columns, which contain 'Yes' or 'No', were already encoded as 1 and 0 in the cells above. The cell below shows the original and encoded columns side by side.

In [42]:
new_data_truncated = new_data[['Has Table booking', 'Has Online delivery', 'Has Table Booking Encoded', 'Has Online Delivery Encoded']]
new_data_truncated

Unnamed: 0,Has Table booking,Has Online delivery,Has Table Booking Encoded,Has Online Delivery Encoded
0,Yes,No,1,0
1,Yes,No,1,0
2,Yes,No,1,0
3,No,No,0,0
4,Yes,No,1,0
...,...,...,...,...
9546,No,No,0,0
9547,No,No,0,0
9548,No,No,0,0
9549,No,No,0,0
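
`LabelEncoder` assigns integer codes in sorted order of the class labels, which is why 'No' becomes 0 and 'Yes' becomes 1 in the table above. As a hedged illustration, the sketch below compares it with an explicit dictionary mapping on a made-up series (`toy` is an assumption, not data from `new_data`), which makes the 'Yes'/'No' to 1/0 correspondence visible in the code itself.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy series: values are made up, the name mirrors the notebook column.
toy = pd.Series(["Yes", "No", "No", "Yes"], name="Has Table booking")

# LabelEncoder sorts the distinct labels and numbers them from 0.
encoder = LabelEncoder()
encoded = encoder.fit_transform(toy)

# An explicit mapping states the encoding directly.
mapped = toy.map({"No": 0, "Yes": 1})

print(list(encoder.classes_))
print(encoded.tolist())
print(mapped.tolist())
```

One design difference: with `map`, any value not present in the dictionary becomes NaN, so unexpected labels show up as missing values rather than receiving an arbitrary code.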
