# Zomato Data Analysis 

Zomato, an Indian company founded in 2008, is a platform for discovering restaurants and ordering food online. It began as a restaurant review site and has since grown into a global food discovery and delivery platform.

This first phase focuses on cleaning the dataset to improve its quality and reliability. Later phases will identify outliers and carry out Exploratory Data Analysis (EDA) to draw further insights.

In [1]:
# Importing Major Libraries

import pandas as pd
import numpy as np 
from matplotlib import pyplot as plt 
import seaborn as sns
import plotly.express as px
import warnings
warnings.filterwarnings('ignore')

In [2]:
# Importing dataset 

df = pd.read_csv('zomato_ds1.csv')

In [3]:
# Viewing Dataset

df

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,...,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
0,3400299,Bikanervala,['Quick Bites'],https://www.zomato.com/agra/bikanervala-khanda...,"Kalyani Point, Near Tulsi Cinema, Bypass Road,...",Agra,34,Khandari,27.211450,78.002381,...,2,Rs.,"['Lunch', 'Takeaway Available', 'Credit Card',...",4.4,Very Good,814,154,0.0,-1,-1
1,3400005,Mama Chicken Mama Franky House,['Quick Bites'],https://www.zomato.com/agra/mama-chicken-mama-...,"Main Market, Sadar Bazaar, Agra Cantt, Agra",Agra,34,Agra Cantt,27.160569,78.011583,...,2,Rs.,"['Delivery', 'No Alcohol Available', 'Dinner',...",4.4,Very Good,1203,161,0.0,-1,-1
2,3401013,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-2-sh...,"62/1, Near Easy Day, West Shivaji Nagar, Goalp...",Agra,34,Shahganj,27.182938,77.979684,...,1,Rs.,"['No Alcohol Available', 'Dinner', 'Takeaway A...",4.2,Very Good,801,107,0.0,1,-1
3,3400290,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-civi...,"Near Anjana Cinema, Nehru Nagar, Civil Lines, ...",Agra,34,Civil Lines,27.205668,78.004799,...,1,Rs.,"['Takeaway Available', 'Credit Card', 'Lunch',...",4.3,Very Good,693,157,0.0,1,-1
4,3401744,The Salt Cafe Kitchen & Bar,['Casual Dining'],https://www.zomato.com/agra/the-salt-cafe-kitc...,"1C,3rd Floor, Fatehabad Road, Tajganj, Agra",Agra,34,Tajganj,27.157709,78.052421,...,3,Rs.,"['Lunch', 'Serves Alcohol', 'Cash', 'Credit Ca...",4.9,Excellent,470,291,0.0,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
211939,3202251,Kali Mirch Cafe And Restaurant,['Casual Dining'],https://www.zomato.com/vadodara/kali-mirch-caf...,"Manu Smriti Complex, Near Navrachna School, GI...",Vadodara,32,Fatehgunj,22.336931,73.192356,...,2,Rs.,"['Dinner', 'Cash', 'Lunch', 'Delivery', 'Indoo...",4.1,Very Good,243,40,0.0,-1,-1
211940,3200996,Raju Omlet,['Quick Bites'],https://www.zomato.com/vadodara/raju-omlet-kar...,"Mahalaxmi Apartment, Opposite B O B, Karoli Ba...",Vadodara,32,Karelibaug,22.322455,73.197203,...,1,Rs.,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,187,40,0.0,1,-1
211941,18984164,The Grand Thakar,['Casual Dining'],https://www.zomato.com/vadodara/the-grand-thak...,"3rd Floor, Shreem Shalini Mall, Opposite Conqu...",Vadodara,32,Alkapuri,22.310563,73.171163,...,2,Rs.,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,111,38,0.0,-1,-1
211942,3201138,Subway,['Quick Bites'],https://www.zomato.com/vadodara/subway-1-akota...,"G-2, Vedant Platina, Near Cosmos, Akota, Vadodara",Vadodara,32,Akota,22.270027,73.143068,...,2,Rs.,"['Dinner', 'Delivery', 'Credit Card', 'Lunch',...",3.7,Good,128,34,0.0,1,-1


- The `...` column in the output indicates that some columns are hidden from the default display.

In [4]:
# Displaying all columns

pd.set_option('display.max_columns',None)

In [5]:
# Rechecking entire dataset with all columns.

df

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,country_id,locality_verbose,cuisines,timings,average_cost_for_two,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
0,3400299,Bikanervala,['Quick Bites'],https://www.zomato.com/agra/bikanervala-khanda...,"Kalyani Point, Near Tulsi Cinema, Bypass Road,...",Agra,34,Khandari,27.211450,78.002381,,1,"Khandari, Agra","North Indian, South Indian, Mithai, Street Foo...",8:30am – 10:30pm (Mon-Sun),700,2,Rs.,"['Lunch', 'Takeaway Available', 'Credit Card',...",4.4,Very Good,814,154,0.0,-1,-1
1,3400005,Mama Chicken Mama Franky House,['Quick Bites'],https://www.zomato.com/agra/mama-chicken-mama-...,"Main Market, Sadar Bazaar, Agra Cantt, Agra",Agra,34,Agra Cantt,27.160569,78.011583,282001.0,1,"Agra Cantt, Agra","North Indian, Mughlai, Rolls, Chinese, Fast Fo...","12:30PM to 12Midnight (Mon, Wed, Thu, Fri, Sat...",600,2,Rs.,"['Delivery', 'No Alcohol Available', 'Dinner',...",4.4,Very Good,1203,161,0.0,-1,-1
2,3401013,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-2-sh...,"62/1, Near Easy Day, West Shivaji Nagar, Goalp...",Agra,34,Shahganj,27.182938,77.979684,282010.0,1,"Shahganj, Agra","Fast Food, Mithai",9:30 AM to 11 PM,300,1,Rs.,"['No Alcohol Available', 'Dinner', 'Takeaway A...",4.2,Very Good,801,107,0.0,1,-1
3,3400290,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-civi...,"Near Anjana Cinema, Nehru Nagar, Civil Lines, ...",Agra,34,Civil Lines,27.205668,78.004799,282002.0,1,"Civil Lines, Agra","Desserts, Bakery, Fast Food, South Indian",8am – 11pm (Mon-Sun),300,1,Rs.,"['Takeaway Available', 'Credit Card', 'Lunch',...",4.3,Very Good,693,157,0.0,1,-1
4,3401744,The Salt Cafe Kitchen & Bar,['Casual Dining'],https://www.zomato.com/agra/the-salt-cafe-kitc...,"1C,3rd Floor, Fatehabad Road, Tajganj, Agra",Agra,34,Tajganj,27.157709,78.052421,,1,"Tajganj, Agra","North Indian, Continental, Italian",11:30 AM to 11:30 PM,1000,3,Rs.,"['Lunch', 'Serves Alcohol', 'Cash', 'Credit Ca...",4.9,Excellent,470,291,0.0,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
211939,3202251,Kali Mirch Cafe And Restaurant,['Casual Dining'],https://www.zomato.com/vadodara/kali-mirch-caf...,"Manu Smriti Complex, Near Navrachna School, GI...",Vadodara,32,Fatehgunj,22.336931,73.192356,390024.0,1,"Fatehgunj, Vadodara",North Indian,"11am – 3pm, 7pm – 11:30pm (Mon-Sun)",600,2,Rs.,"['Dinner', 'Cash', 'Lunch', 'Delivery', 'Indoo...",4.1,Very Good,243,40,0.0,-1,-1
211940,3200996,Raju Omlet,['Quick Bites'],https://www.zomato.com/vadodara/raju-omlet-kar...,"Mahalaxmi Apartment, Opposite B O B, Karoli Ba...",Vadodara,32,Karelibaug,22.322455,73.197203,,1,"Karelibaug, Vadodara",Fast Food,"4pm – 11pm (Mon, Tue, Wed, Fri, Sat, Sun), 4pm...",300,1,Rs.,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,187,40,0.0,1,-1
211941,18984164,The Grand Thakar,['Casual Dining'],https://www.zomato.com/vadodara/the-grand-thak...,"3rd Floor, Shreem Shalini Mall, Opposite Conqu...",Vadodara,32,Alkapuri,22.310563,73.171163,,1,"Alkapuri, Vadodara","Gujarati, North Indian, Chinese","11:30 AM to 3:30 PM, 7:30 PM to 11 PM",700,2,Rs.,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,111,38,0.0,-1,-1
211942,3201138,Subway,['Quick Bites'],https://www.zomato.com/vadodara/subway-1-akota...,"G-2, Vedant Platina, Near Cosmos, Akota, Vadodara",Vadodara,32,Akota,22.270027,73.143068,,1,"Akota, Vadodara","Fast Food, Sandwich, Salad",8 AM to 1 AM,500,2,Rs.,"['Dinner', 'Delivery', 'Credit Card', 'Lunch',...",3.7,Good,128,34,0.0,1,-1


In [6]:
# Checking Information of Dataset

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 211944 entries, 0 to 211943
Data columns (total 26 columns):
 #   Column                Non-Null Count   Dtype  
---  ------                --------------   -----  
 0   res_id                211944 non-null  int64  
 1   name                  211944 non-null  object 
 2   establishment         211944 non-null  object 
 3   url                   211944 non-null  object 
 4   address               211810 non-null  object 
 5   city                  211944 non-null  object 
 6   city_id               211944 non-null  int64  
 7   locality              211944 non-null  object 
 8   latitude              211944 non-null  float64
 9   longitude             211944 non-null  float64
 10  zipcode               48757 non-null   object 
 11  country_id            211944 non-null  int64  
 12  locality_verbose      211944 non-null  object 
 13  cuisines              210553 non-null  object 
 14  timings               208070 non-null  object 
 15  

- There are **13 numerical columns**: 4 of float type (decimal values) and 9 of integer type.
- There are **13 object-type columns**, likely holding text or categories. Further analysis will clarify their content.
- Some columns contain missing values (for example `address`, `zipcode`, `cuisines` and `timings`).
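The column counts per type can also be checked programmatically rather than read off the `info()` listing. The sketch below is only illustrative: `sample` is a small stand-in frame built from two rows of the preview, not the full dataset.

```python
import pandas as pd

# Small stand-in frame (two rows copied from the preview above).
sample = pd.DataFrame({
    "res_id": [3400299, 3401013],
    "name": ["Bikanervala", "Bhagat Halwai"],
    "latitude": [27.211450, 27.182938],
    "votes": [814, 801],
})

# Number of columns per dtype
dtype_counts = sample.dtypes.value_counts()

# Column names split into numeric and non-numeric groups
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
non_numeric_cols = sample.select_dtypes(exclude="number").columns.tolist()

print(dtype_counts)
print("Numeric:", numeric_cols)
print("Non-numeric:", non_numeric_cols)
```

Applied to `df`, the same calls should reproduce the 13/13 split described above, assuming the dtypes match the `info()` output.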

In [7]:
# Checking description of dataset

df.describe()

Unnamed: 0,res_id,city_id,latitude,longitude,country_id,average_cost_for_two,price_range,aggregate_rating,votes,photo_count,opentable_support,delivery,takeaway
count,211944.0,211944.0,211944.0,211944.0,211944.0,211944.0,211944.0,211944.0,211944.0,211944.0,211896.0,211944.0,211944.0
mean,13494110.0,4746.785434,21.499758,77.615276,1.0,595.812229,1.882535,3.395937,378.001864,256.971224,0.0,-0.255907,-1.0
std,7883722.0,5568.766386,22.781331,7.500104,0.0,606.239363,0.892989,1.283642,925.33337,867.66894,0.0,0.964172,0.0
min,50.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,-18.0,0.0,0.0,-1.0,-1.0
25%,3301027.0,11.0,15.496071,74.877961,1.0,250.0,1.0,3.3,16.0,3.0,0.0,-1.0,-1.0
50%,18695730.0,34.0,22.514494,77.425971,1.0,400.0,2.0,3.8,100.0,18.0,0.0,-1.0,-1.0
75%,18812970.0,11306.0,26.841667,80.219323,1.0,700.0,2.0,4.1,362.0,128.0,0.0,1.0,-1.0
max,19159790.0,11354.0,10000.0,91.832769,1.0,30000.0,4.0,4.9,42539.0,17702.0,0.0,1.0,-1.0


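- The dataset comprises a total of 211,944 rows.
- `describe()` reports the mean and standard deviation of each numeric column. Some extremes look implausible, such as a maximum `latitude` of 10000.0 and a minimum `votes` of -18, and will need attention when outliers are examined.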
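The summary table above shows a maximum latitude of 10000.0, minimum coordinates of 0.0 and a minimum vote count of -18. One possible way to flag such rows for later review is sketched here. The `sample` frame is hypothetical and only mimics those extremes, and treating zero coordinates as placeholders is an assumption rather than something the dataset documents.

```python
import pandas as pd

# Hypothetical rows that mimic the extremes reported by describe()
sample = pd.DataFrame({
    "latitude": [27.211450, 10000.0, 0.0],
    "longitude": [78.002381, 77.425971, 0.0],
    "votes": [814, 100, -18],
})

# Latitude must lie in [-90, 90] and longitude in [-180, 180].
# Zero coordinates are treated as placeholders (an assumption).
bad_lat = ~sample["latitude"].between(-90, 90) | (sample["latitude"] == 0)
bad_lon = ~sample["longitude"].between(-180, 180) | (sample["longitude"] == 0)

# A vote count cannot be negative.
bad_votes = sample["votes"] < 0

# Rows that fail at least one check
suspect = sample[bad_lat | bad_lon | bad_votes]
print(suspect)
```

Flagging rather than dropping keeps the decision about these rows open until the outlier analysis.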

### Checking for null values in the dataset.

In [8]:
df.isnull().sum()

res_id                       0
name                         0
establishment                0
url                          0
address                    134
city                         0
city_id                      0
locality                     0
latitude                     0
longitude                    0
zipcode                 163187
country_id                   0
locality_verbose             0
cuisines                  1391
timings                   3874
average_cost_for_two         0
price_range                  0
currency                     0
highlights                   0
aggregate_rating             0
rating_text                  0
votes                        0
photo_count                  0
opentable_support           48
delivery                     0
takeaway                     0
dtype: int64

- The `address` column has 134 missing values.
- The `zipcode` column has 163,187 missing values.
- The `cuisines` column has 1,391 missing values.
- The `timings` column has 3,874 missing values.
- The `opentable_support` column has 48 missing values.
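Raw counts are easier to judge as a share of the total row count. The sketch below reuses the null counts and the 211,944 row count reported above; the variable names are illustrative.

```python
import pandas as pd

# Row count and null counts as reported by the outputs above
total_rows = 211_944
null_counts = pd.Series({
    "address": 134,
    "zipcode": 163_187,
    "cuisines": 1_391,
    "timings": 3_874,
    "opentable_support": 48,
})

# Share of missing values per column, in percent, largest first.
# On the full frame, df.isnull().mean() * 100 gives the same figures.
null_pct = (null_counts / total_rows * 100).round(2).sort_values(ascending=False)
print(null_pct)
```

Roughly 77% of `zipcode` entries are missing, while each of the other columns is missing less than 2% of its values.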

In [9]:
df.isnull().sum().sum()

168634

- There are 168,634 missing values in total. They need to be handled before proceeding with further analysis.

### Dealing with Null Values 

In [10]:
# Inspecting rows where the address is missing.

df[df.address.isnull()]

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,country_id,locality_verbose,cuisines,timings,average_cost_for_two,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
25475,2601516,CakeFizz,['Bakery'],https://www.zomato.com/bhopal/cakefizz-jhangir...,,Bhopal,26,Jhangirabad,23.252214,77.435611,,1,"Jhangirabad, Bhopal",Bakery,24 Hours,400,2,Rs.,"['Cash', 'Home Baker', 'Desserts and Bakes', '...",3.2,Average,9,0,0.0,-1,-1
78175,18313028,Pabrai's Fresh and Naturelle Ice Creams,['Dessert Parlour'],https://www.zomato.com/hyderabad/pabrais-fresh...,,Hyderabad,6,Madhapur,17.440883,78.393356,500081.0,1,"Madhapur, Hyderabad","Ice Cream, Desserts",11AM to 1AM (Mon-Sun),250,1,Rs.,"['No Alcohol Available', 'Cash', 'Delivery', '...",4.3,Very Good,393,328,0.0,1,-1
103427,18725005,test Moti Palace Township,[],https://www.zomato.com/junagadh/junagadh/-moti...,,Junagadh,11322,Moti Palace Township,21.498610,70.416507,,1,"Moti Palace Township, Junagadh",,,0,1,Rs.,[],0.0,Not rated,0,0,0.0,-1,-1
103433,18725012,test Joshipura,[],https://www.zomato.com/junagadh/junagadh/-josh...,,Junagadh,11322,Joshipura,21.530117,70.453044,,1,"Joshipura, Junagadh",,,0,1,Rs.,[],0.0,Not rated,0,0,0.0,-1,-1
103575,18725005,test Moti Palace Township,[],https://www.zomato.com/junagadh/junagadh/-moti...,,Junagadh,11322,Moti Palace Township,21.498610,70.416507,,1,"Moti Palace Township, Junagadh",,,0,1,Rs.,[],0.0,Not rated,0,0,0.0,-1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
197718,18041435,Lal Bagh,['Casual Dining'],https://www.zomato.com/udaipur/lal-bagh-panchw...,,Udaipur,11054,Panchwati,24.603519,73.687260,,1,"Panchwati, Udaipur","North Indian, Chinese, Continental",,1600,4,Rs.,"['Cash', 'Takeaway Available', 'Serves Alcohol...",2.4,Poor,121,25,0.0,-1,-1
197978,18041435,Lal Bagh,['Casual Dining'],https://www.zomato.com/udaipur/lal-bagh-panchw...,,Udaipur,11054,Panchwati,24.603519,73.687260,,1,"Panchwati, Udaipur","North Indian, Chinese, Continental",,1600,4,Rs.,"['Cash', 'Takeaway Available', 'Serves Alcohol...",2.4,Poor,121,25,0.0,-1,-1
198518,18041435,Lal Bagh,['Casual Dining'],https://www.zomato.com/udaipur/lal-bagh-panchw...,,Udaipur,11054,Panchwati,24.603519,73.687260,,1,"Panchwati, Udaipur","North Indian, Chinese, Continental",,1600,4,Rs.,"['Cash', 'Takeaway Available', 'Serves Alcohol...",2.4,Poor,121,25,0.0,-1,-1
198722,18041435,Lal Bagh,['Casual Dining'],https://www.zomato.com/udaipur/lal-bagh-panchw...,,Udaipur,11054,Panchwati,24.603519,73.687260,,1,"Panchwati, Udaipur","North Indian, Chinese, Continental",,1600,4,Rs.,"['Cash', 'Takeaway Available', 'Serves Alcohol...",2.4,Poor,121,25,0.0,-1,-1


In [11]:
# Comparing it with not null values 

df

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,country_id,locality_verbose,cuisines,timings,average_cost_for_two,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
0,3400299,Bikanervala,['Quick Bites'],https://www.zomato.com/agra/bikanervala-khanda...,"Kalyani Point, Near Tulsi Cinema, Bypass Road,...",Agra,34,Khandari,27.211450,78.002381,,1,"Khandari, Agra","North Indian, South Indian, Mithai, Street Foo...",8:30am – 10:30pm (Mon-Sun),700,2,Rs.,"['Lunch', 'Takeaway Available', 'Credit Card',...",4.4,Very Good,814,154,0.0,-1,-1
1,3400005,Mama Chicken Mama Franky House,['Quick Bites'],https://www.zomato.com/agra/mama-chicken-mama-...,"Main Market, Sadar Bazaar, Agra Cantt, Agra",Agra,34,Agra Cantt,27.160569,78.011583,282001.0,1,"Agra Cantt, Agra","North Indian, Mughlai, Rolls, Chinese, Fast Fo...","12:30PM to 12Midnight (Mon, Wed, Thu, Fri, Sat...",600,2,Rs.,"['Delivery', 'No Alcohol Available', 'Dinner',...",4.4,Very Good,1203,161,0.0,-1,-1
2,3401013,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-2-sh...,"62/1, Near Easy Day, West Shivaji Nagar, Goalp...",Agra,34,Shahganj,27.182938,77.979684,282010.0,1,"Shahganj, Agra","Fast Food, Mithai",9:30 AM to 11 PM,300,1,Rs.,"['No Alcohol Available', 'Dinner', 'Takeaway A...",4.2,Very Good,801,107,0.0,1,-1
3,3400290,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-civi...,"Near Anjana Cinema, Nehru Nagar, Civil Lines, ...",Agra,34,Civil Lines,27.205668,78.004799,282002.0,1,"Civil Lines, Agra","Desserts, Bakery, Fast Food, South Indian",8am – 11pm (Mon-Sun),300,1,Rs.,"['Takeaway Available', 'Credit Card', 'Lunch',...",4.3,Very Good,693,157,0.0,1,-1
4,3401744,The Salt Cafe Kitchen & Bar,['Casual Dining'],https://www.zomato.com/agra/the-salt-cafe-kitc...,"1C,3rd Floor, Fatehabad Road, Tajganj, Agra",Agra,34,Tajganj,27.157709,78.052421,,1,"Tajganj, Agra","North Indian, Continental, Italian",11:30 AM to 11:30 PM,1000,3,Rs.,"['Lunch', 'Serves Alcohol', 'Cash', 'Credit Ca...",4.9,Excellent,470,291,0.0,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
211939,3202251,Kali Mirch Cafe And Restaurant,['Casual Dining'],https://www.zomato.com/vadodara/kali-mirch-caf...,"Manu Smriti Complex, Near Navrachna School, GI...",Vadodara,32,Fatehgunj,22.336931,73.192356,390024.0,1,"Fatehgunj, Vadodara",North Indian,"11am – 3pm, 7pm – 11:30pm (Mon-Sun)",600,2,Rs.,"['Dinner', 'Cash', 'Lunch', 'Delivery', 'Indoo...",4.1,Very Good,243,40,0.0,-1,-1
211940,3200996,Raju Omlet,['Quick Bites'],https://www.zomato.com/vadodara/raju-omlet-kar...,"Mahalaxmi Apartment, Opposite B O B, Karoli Ba...",Vadodara,32,Karelibaug,22.322455,73.197203,,1,"Karelibaug, Vadodara",Fast Food,"4pm – 11pm (Mon, Tue, Wed, Fri, Sat, Sun), 4pm...",300,1,Rs.,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,187,40,0.0,1,-1
211941,18984164,The Grand Thakar,['Casual Dining'],https://www.zomato.com/vadodara/the-grand-thak...,"3rd Floor, Shreem Shalini Mall, Opposite Conqu...",Vadodara,32,Alkapuri,22.310563,73.171163,,1,"Alkapuri, Vadodara","Gujarati, North Indian, Chinese","11:30 AM to 3:30 PM, 7:30 PM to 11 PM",700,2,Rs.,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,111,38,0.0,-1,-1
211942,3201138,Subway,['Quick Bites'],https://www.zomato.com/vadodara/subway-1-akota...,"G-2, Vedant Platina, Near Cosmos, Akota, Vadodara",Vadodara,32,Akota,22.270027,73.143068,,1,"Akota, Vadodara","Fast Food, Sandwich, Salad",8 AM to 1 AM,500,2,Rs.,"['Dinner', 'Delivery', 'Credit Card', 'Lunch',...",3.7,Good,128,34,0.0,1,-1


- The rows with a missing address show no pattern that would support a meaningful replacement. Because the column is meant to hold the actual restaurant address, substituting the city or locality would not be suitable.

**Using "Unknown" as a placeholder for the missing addresses** is a pragmatic approach. This temporary solution lets the analysis continue and can be revisited as needed.

In [12]:
# Filling null values in address with the placeholder "Unknown"

df.address = df.address.fillna('Unknown')

In [13]:
df.address.isnull().sum()

0

- The **address column** no longer contains null entries.

In [14]:
# Inspecting rows where the zipcode is missing.

df[df.zipcode.isnull()]

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,country_id,locality_verbose,cuisines,timings,average_cost_for_two,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
0,3400299,Bikanervala,['Quick Bites'],https://www.zomato.com/agra/bikanervala-khanda...,"Kalyani Point, Near Tulsi Cinema, Bypass Road,...",Agra,34,Khandari,27.211450,78.002381,,1,"Khandari, Agra","North Indian, South Indian, Mithai, Street Foo...",8:30am – 10:30pm (Mon-Sun),700,2,Rs.,"['Lunch', 'Takeaway Available', 'Credit Card',...",4.4,Very Good,814,154,0.0,-1,-1
4,3401744,The Salt Cafe Kitchen & Bar,['Casual Dining'],https://www.zomato.com/agra/the-salt-cafe-kitc...,"1C,3rd Floor, Fatehabad Road, Tajganj, Agra",Agra,34,Tajganj,27.157709,78.052421,,1,"Tajganj, Agra","North Indian, Continental, Italian",11:30 AM to 11:30 PM,1000,3,Rs.,"['Lunch', 'Serves Alcohol', 'Cash', 'Credit Ca...",4.9,Excellent,470,291,0.0,1,-1
5,3400275,Domino's Pizza,['Quick Bites'],https://www.zomato.com/agra/dominos-pizza-civi...,"114/23 G, Deep Shikha Complex, Sanjay Place, C...",Agra,34,Civil Lines,27.201516,78.007556,,1,"Civil Lines, Agra","Pizza, Fast Food",10:57 AM to 11 PM,400,2,Rs.,"['Credit Card', 'Lunch', 'Delivery', 'Dinner',...",4.0,Very Good,707,62,0.0,-1,-1
7,3400368,Domino's Pizza,['Quick Bites'],https://www.zomato.com/agra/dominos-pizza-sika...,"Plot C-1/6, Sector 13, Sikandra, Agra",Agra,34,Sikandra,27.203930,77.954260,,1,"Sikandra, Agra","Pizza, Fast Food",10:57 AM to 11 PM,400,2,Rs.,"['Lunch', 'Delivery', 'Credit Card', 'No Alcoh...",3.8,Good,617,18,0.0,-1,-1
8,3401284,Cake House,['Bakery'],https://www.zomato.com/agra/cake-house-2-civil...,"23/301, Wazirpura Rd, Judge Compound Chowraha,...",Agra,34,Civil Lines,27.204148,78.009025,,1,"Civil Lines, Agra","Bakery, Fast Food",9 AM to 9 PM,500,2,Rs.,"['Takeaway Available', 'Cash', 'Indoor Seating...",3.4,Average,322,14,0.0,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
211937,18855810,Biryani aur Baatein,['Casual Dining'],https://www.zomato.com/vadodara/biryani-aur-ba...,"Shop 14, Atlantis K-10, A Wing, Genda Circle R...",Vadodara,32,Alkapuri,22.317746,73.168043,,1,"Alkapuri, Vadodara","Biryani, Mughlai, Chinese","11:30am – 3:30pm, 7:30pm – 1am (Mon-Sun)",500,2,Rs.,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,154,96,0.0,-1,-1
211938,18662583,Wok On Fire,['Casual Dining'],https://www.zomato.com/vadodara/wok-on-fire-fa...,"Ground Floor 1, Rossette Building, Opposite Se...",Vadodara,32,Fatehgunj,22.323357,73.187461,,1,"Fatehgunj, Vadodara","Chinese, Asian",11 AM to 11 PM,1000,3,Rs.,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,301,126,0.0,1,-1
211940,3200996,Raju Omlet,['Quick Bites'],https://www.zomato.com/vadodara/raju-omlet-kar...,"Mahalaxmi Apartment, Opposite B O B, Karoli Ba...",Vadodara,32,Karelibaug,22.322455,73.197203,,1,"Karelibaug, Vadodara",Fast Food,"4pm – 11pm (Mon, Tue, Wed, Fri, Sat, Sun), 4pm...",300,1,Rs.,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,187,40,0.0,1,-1
211941,18984164,The Grand Thakar,['Casual Dining'],https://www.zomato.com/vadodara/the-grand-thak...,"3rd Floor, Shreem Shalini Mall, Opposite Conqu...",Vadodara,32,Alkapuri,22.310563,73.171163,,1,"Alkapuri, Vadodara","Gujarati, North Indian, Chinese","11:30 AM to 3:30 PM, 7:30 PM to 11 PM",700,2,Rs.,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,111,38,0.0,-1,-1


In [15]:
# Comparing it with the dataset 

df

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,country_id,locality_verbose,cuisines,timings,average_cost_for_two,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
0,3400299,Bikanervala,['Quick Bites'],https://www.zomato.com/agra/bikanervala-khanda...,"Kalyani Point, Near Tulsi Cinema, Bypass Road,...",Agra,34,Khandari,27.211450,78.002381,,1,"Khandari, Agra","North Indian, South Indian, Mithai, Street Foo...",8:30am – 10:30pm (Mon-Sun),700,2,Rs.,"['Lunch', 'Takeaway Available', 'Credit Card',...",4.4,Very Good,814,154,0.0,-1,-1
1,3400005,Mama Chicken Mama Franky House,['Quick Bites'],https://www.zomato.com/agra/mama-chicken-mama-...,"Main Market, Sadar Bazaar, Agra Cantt, Agra",Agra,34,Agra Cantt,27.160569,78.011583,282001.0,1,"Agra Cantt, Agra","North Indian, Mughlai, Rolls, Chinese, Fast Fo...","12:30PM to 12Midnight (Mon, Wed, Thu, Fri, Sat...",600,2,Rs.,"['Delivery', 'No Alcohol Available', 'Dinner',...",4.4,Very Good,1203,161,0.0,-1,-1
2,3401013,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-2-sh...,"62/1, Near Easy Day, West Shivaji Nagar, Goalp...",Agra,34,Shahganj,27.182938,77.979684,282010.0,1,"Shahganj, Agra","Fast Food, Mithai",9:30 AM to 11 PM,300,1,Rs.,"['No Alcohol Available', 'Dinner', 'Takeaway A...",4.2,Very Good,801,107,0.0,1,-1
3,3400290,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-civi...,"Near Anjana Cinema, Nehru Nagar, Civil Lines, ...",Agra,34,Civil Lines,27.205668,78.004799,282002.0,1,"Civil Lines, Agra","Desserts, Bakery, Fast Food, South Indian",8am – 11pm (Mon-Sun),300,1,Rs.,"['Takeaway Available', 'Credit Card', 'Lunch',...",4.3,Very Good,693,157,0.0,1,-1
4,3401744,The Salt Cafe Kitchen & Bar,['Casual Dining'],https://www.zomato.com/agra/the-salt-cafe-kitc...,"1C,3rd Floor, Fatehabad Road, Tajganj, Agra",Agra,34,Tajganj,27.157709,78.052421,,1,"Tajganj, Agra","North Indian, Continental, Italian",11:30 AM to 11:30 PM,1000,3,Rs.,"['Lunch', 'Serves Alcohol', 'Cash', 'Credit Ca...",4.9,Excellent,470,291,0.0,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
211939,3202251,Kali Mirch Cafe And Restaurant,['Casual Dining'],https://www.zomato.com/vadodara/kali-mirch-caf...,"Manu Smriti Complex, Near Navrachna School, GI...",Vadodara,32,Fatehgunj,22.336931,73.192356,390024.0,1,"Fatehgunj, Vadodara",North Indian,"11am – 3pm, 7pm – 11:30pm (Mon-Sun)",600,2,Rs.,"['Dinner', 'Cash', 'Lunch', 'Delivery', 'Indoo...",4.1,Very Good,243,40,0.0,-1,-1
211940,3200996,Raju Omlet,['Quick Bites'],https://www.zomato.com/vadodara/raju-omlet-kar...,"Mahalaxmi Apartment, Opposite B O B, Karoli Ba...",Vadodara,32,Karelibaug,22.322455,73.197203,,1,"Karelibaug, Vadodara",Fast Food,"4pm – 11pm (Mon, Tue, Wed, Fri, Sat, Sun), 4pm...",300,1,Rs.,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,187,40,0.0,1,-1
211941,18984164,The Grand Thakar,['Casual Dining'],https://www.zomato.com/vadodara/the-grand-thak...,"3rd Floor, Shreem Shalini Mall, Opposite Conqu...",Vadodara,32,Alkapuri,22.310563,73.171163,,1,"Alkapuri, Vadodara","Gujarati, North Indian, Chinese","11:30 AM to 3:30 PM, 7:30 PM to 11 PM",700,2,Rs.,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,111,38,0.0,-1,-1
211942,3201138,Subway,['Quick Bites'],https://www.zomato.com/vadodara/subway-1-akota...,"G-2, Vedant Platina, Near Cosmos, Akota, Vadodara",Vadodara,32,Akota,22.270027,73.143068,,1,"Akota, Vadodara","Fast Food, Sandwich, Salad",8 AM to 1 AM,500,2,Rs.,"['Dinner', 'Delivery', 'Credit Card', 'Lunch',...",3.7,Good,128,34,0.0,1,-1


In [16]:
# Looking for a relationship between zipcode and locality

selected_columns = ['zipcode', 'locality']
df_filtered = df[selected_columns]

In [17]:
df_filtered

Unnamed: 0,zipcode,locality
0,,Khandari
1,282001.0,Agra Cantt
2,282010.0,Shahganj
3,282002.0,Civil Lines
4,,Tajganj
...,...,...
211939,390024.0,Fatehgunj
211940,,Karelibaug
211941,,Alkapuri
211942,,Akota


In [18]:
df_filtered[df_filtered.locality=="Khandari"]

Unnamed: 0,zipcode,locality
0,,Khandari
33,,Khandari
65,282002.0,Khandari
68,,Khandari
72,282003.0,Khandari
...,...,...
2556,,Khandari
2569,282003.0,Khandari
2575,282002.0,Khandari
2579,,Khandari


Some rows list a zipcode for their locality, while other rows for the same locality leave the zipcode empty.
- For instance, the "Khandari" locality has both null and non-null entries in the zipcode column, and its non-null entries include more than one zipcode (282002.0 and 282003.0).
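The Khandari observation can be checked across all localities at once. The following is a minimal sketch on a hypothetical `sample` frame; the string zipcodes and the summary column names (`missing`, `present`, `distinct`) are chosen for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: one locality partly filled, two entirely empty
sample = pd.DataFrame({
    "locality": ["Khandari", "Khandari", "Khandari", "Khandari", "Tajganj", "Akota"],
    "zipcode": [np.nan, "282002", np.nan, "282003", np.nan, np.nan],
})

# Per locality: rows without a zipcode, rows with one, and distinct zipcodes
summary = sample.groupby("locality")["zipcode"].agg(
    missing=lambda s: s.isna().sum(),
    present="count",
    distinct="nunique",
)

# Localities whose gaps could be filled from their own recorded zipcodes
partly_filled = summary[(summary["missing"] > 0) & (summary["present"] > 0)]
print(partly_filled)
```

Localities with `present` equal to 0 have nothing to borrow from, so they would still need a fallback value.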

The goal is to fill each null zipcode with a zipcode already recorded for the same locality, ideally the most frequent one (the mode). The function below takes a simpler route: it keeps the last non-null zipcode seen for each locality and uses 0 where a locality has no recorded zipcode.

In [19]:
# Creating a function to fill null values in the "zipcode" column.

def fill_missing_zipcodes(dataframe):
    # Create a dictionary to store locality-zipcode mappings
    locality_zipcode_mapping = {}

    # Iterate through the rows to populate the dictionary
    for index, row in dataframe.iterrows():
        locality = row['locality']
        zipcode = row['zipcode']

        # Update the mapping only if the zipcode is not null, so each
        # locality ends up with the last non-null zipcode seen for it
        if not pd.isnull(zipcode):
            locality_zipcode_mapping[locality] = zipcode

    # Iterate again to fill missing zipcodes
    for index, row in dataframe.iterrows():
        locality = row['locality']
        zipcode = row['zipcode']

        # Check if the zipcode is null and if the locality has a mapping
        if pd.isnull(zipcode) and locality in locality_zipcode_mapping:
            dataframe.at[index, 'zipcode'] = locality_zipcode_mapping[locality]

    # Replace missing zipcodes with 0 where there is no mapping.
    # Assigning the result back avoids relying on an in-place fill of a selected column.
    dataframe['zipcode'] = dataframe['zipcode'].fillna(0)

In [20]:
# Running the function to fill null zipcodes in the dataset

fill_missing_zipcodes(df)

In [21]:
# Checking if any null values are left in zipcode

df.zipcode.isnull().sum()

0

- No null values remain in the "zipcode" column.
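For comparison, a fill based on the most frequent zipcode per locality could be sketched as follows. This is not the function used above: `fill_zipcodes_with_mode`, its `fallback` parameter and the numeric `sample` frame are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def fill_zipcodes_with_mode(dataframe, fallback=0):
    """Return a copy whose null zipcodes use the locality's most frequent zipcode."""
    result = dataframe.copy()

    # Most frequent non-null zipcode per locality. Series.mode() returns
    # sorted values, so ties resolve to the smallest zipcode.
    known = result.dropna(subset=["zipcode"])
    modal = known.groupby("locality")["zipcode"].agg(lambda s: s.mode().iloc[0])

    # Fill nulls from the per-locality mode, then fall back where the
    # locality has no recorded zipcode at all.
    filled = result["zipcode"].fillna(result["locality"].map(modal))
    result["zipcode"] = filled.fillna(fallback)
    return result

# Hypothetical sample with numeric zipcodes
sample = pd.DataFrame({
    "locality": ["Khandari", "Khandari", "Khandari", "Khandari", "Khandari", "Tajganj"],
    "zipcode": [np.nan, 282002.0, 282003.0, 282002.0, np.nan, np.nan],
})

filled_sample = fill_zipcodes_with_mode(sample)
print(filled_sample)
```

This version avoids the two `iterrows()` passes and does not depend on row order, at the cost of one `groupby` over the non-null rows. It returns a new frame instead of modifying its input.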

In [22]:
# Inspecting rows with missing cuisine entries

df[df.cuisines.isnull()].head()

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,country_id,locality_verbose,cuisines,timings,average_cost_for_two,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
1140,3401673,Sri Dauji Mishthan Bhandar,['Sweet Shop'],https://www.zomato.com/agra/sri-dauji-mishthan...,"F11, Near Agrawal Hospital, Langre Ki Chowki, ...",Agra,34,Kamla Nagar,27.207933,78.025556,0,1,"Kamla Nagar, Agra",,7 AM to 11 PM,200,1,Rs.,"['Breakfast', 'Cash', 'Takeaway Available', 'I...",3.6,Good,23,0,0.0,-1,-1
7770,18689668,Vedanta Restaurant,['Casual Dining'],https://www.zomato.com/ajmer/vedanta-restauran...,"Hotel Metro Inn, Near Saint Francis Hospital, ...",Ajmer,11303,Railway Quarters,26.448262,74.638991,Apna baza,1,"Railway Quarters , Ajmer",,10 AM to 11 PM,0,1,Rs.,"['Cash', 'Debit Card', 'Dinner', 'Takeaway Ava...",0.0,Not rated,3,3,0.0,-1,-1
8000,18689668,Vedanta Restaurant,['Casual Dining'],https://www.zomato.com/ajmer/vedanta-restauran...,"Hotel Metro Inn, Near Saint Francis Hospital, ...",Ajmer,11303,Railway Quarters,26.448262,74.638991,Apna baza,1,"Railway Quarters , Ajmer",,10 AM to 11 PM,0,1,Rs.,"['Cash', 'Debit Card', 'Dinner', 'Takeaway Ava...",0.0,Not rated,3,3,0.0,-1,-1
11389,19008079,Shree sweets,[],https://www.zomato.com/allahabad/shree-sweets-...,"6, Rambagh Hanuman Mandir, Allahabad",Allahabad,24,Rambagh,25.439095,81.847112,211003.0,1,"Rambagh, Allahabad",,7 AM to 10 PM,100,1,Rs.,[],0.0,Not rated,1,0,0.0,-1,-1
11390,19008131,Aditya Mishthan Bhandar,[],https://www.zomato.com/allahabad/aditya-mishth...,"153, South Malaka, Allahabad, Rambagh, Allahabad",Allahabad,24,Rambagh,25.440297,81.844704,211003.0,1,"Rambagh, Allahabad",,7 AM to 11 PM,100,1,Rs.,[],0.0,Not rated,1,0,0.0,-1,-1


In [23]:
# Comparing rows with null values to rows where cuisines are not null.

df

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,country_id,locality_verbose,cuisines,timings,average_cost_for_two,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
0,3400299,Bikanervala,['Quick Bites'],https://www.zomato.com/agra/bikanervala-khanda...,"Kalyani Point, Near Tulsi Cinema, Bypass Road,...",Agra,34,Khandari,27.211450,78.002381,282002.0,1,"Khandari, Agra","North Indian, South Indian, Mithai, Street Foo...",8:30am – 10:30pm (Mon-Sun),700,2,Rs.,"['Lunch', 'Takeaway Available', 'Credit Card',...",4.4,Very Good,814,154,0.0,-1,-1
1,3400005,Mama Chicken Mama Franky House,['Quick Bites'],https://www.zomato.com/agra/mama-chicken-mama-...,"Main Market, Sadar Bazaar, Agra Cantt, Agra",Agra,34,Agra Cantt,27.160569,78.011583,282001.0,1,"Agra Cantt, Agra","North Indian, Mughlai, Rolls, Chinese, Fast Fo...","12:30PM to 12Midnight (Mon, Wed, Thu, Fri, Sat...",600,2,Rs.,"['Delivery', 'No Alcohol Available', 'Dinner',...",4.4,Very Good,1203,161,0.0,-1,-1
2,3401013,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-2-sh...,"62/1, Near Easy Day, West Shivaji Nagar, Goalp...",Agra,34,Shahganj,27.182938,77.979684,282010.0,1,"Shahganj, Agra","Fast Food, Mithai",9:30 AM to 11 PM,300,1,Rs.,"['No Alcohol Available', 'Dinner', 'Takeaway A...",4.2,Very Good,801,107,0.0,1,-1
3,3400290,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-civi...,"Near Anjana Cinema, Nehru Nagar, Civil Lines, ...",Agra,34,Civil Lines,27.205668,78.004799,282002.0,1,"Civil Lines, Agra","Desserts, Bakery, Fast Food, South Indian",8am – 11pm (Mon-Sun),300,1,Rs.,"['Takeaway Available', 'Credit Card', 'Lunch',...",4.3,Very Good,693,157,0.0,1,-1
4,3401744,The Salt Cafe Kitchen & Bar,['Casual Dining'],https://www.zomato.com/agra/the-salt-cafe-kitc...,"1C,3rd Floor, Fatehabad Road, Tajganj, Agra",Agra,34,Tajganj,27.157709,78.052421,282001.0,1,"Tajganj, Agra","North Indian, Continental, Italian",11:30 AM to 11:30 PM,1000,3,Rs.,"['Lunch', 'Serves Alcohol', 'Cash', 'Credit Ca...",4.9,Excellent,470,291,0.0,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
211939,3202251,Kali Mirch Cafe And Restaurant,['Casual Dining'],https://www.zomato.com/vadodara/kali-mirch-caf...,"Manu Smriti Complex, Near Navrachna School, GI...",Vadodara,32,Fatehgunj,22.336931,73.192356,390024.0,1,"Fatehgunj, Vadodara",North Indian,"11am – 3pm, 7pm – 11:30pm (Mon-Sun)",600,2,Rs.,"['Dinner', 'Cash', 'Lunch', 'Delivery', 'Indoo...",4.1,Very Good,243,40,0.0,-1,-1
211940,3200996,Raju Omlet,['Quick Bites'],https://www.zomato.com/vadodara/raju-omlet-kar...,"Mahalaxmi Apartment, Opposite B O B, Karoli Ba...",Vadodara,32,Karelibaug,22.322455,73.197203,390018.0,1,"Karelibaug, Vadodara",Fast Food,"4pm – 11pm (Mon, Tue, Wed, Fri, Sat, Sun), 4pm...",300,1,Rs.,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,187,40,0.0,1,-1
211941,18984164,The Grand Thakar,['Casual Dining'],https://www.zomato.com/vadodara/the-grand-thak...,"3rd Floor, Shreem Shalini Mall, Opposite Conqu...",Vadodara,32,Alkapuri,22.310563,73.171163,390007.0,1,"Alkapuri, Vadodara","Gujarati, North Indian, Chinese","11:30 AM to 3:30 PM, 7:30 PM to 11 PM",700,2,Rs.,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,111,38,0.0,-1,-1
211942,3201138,Subway,['Quick Bites'],https://www.zomato.com/vadodara/subway-1-akota...,"G-2, Vedant Platina, Near Cosmos, Akota, Vadodara",Vadodara,32,Akota,22.270027,73.143068,390020.0,1,"Akota, Vadodara","Fast Food, Sandwich, Salad",8 AM to 1 AM,500,2,Rs.,"['Dinner', 'Delivery', 'Credit Card', 'Lunch',...",3.7,Good,128,34,0.0,1,-1


In the absence of a discernible pattern, null values are replaced with a "No Description" placeholder.

In [24]:
df.cuisines = df.cuisines.fillna('No Description')

In [25]:
# Checking for Null values in cuisines 

df.cuisines.isnull().sum()

0

- No null values remain in the "cuisines" column.
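
As a minimal sketch of the placeholder approach, assuming a small hypothetical frame named `toy` rather than the Zomato data, the fill and the follow-up check could look like this:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real dataset
toy = pd.DataFrame({"cuisines": ["Fast Food, Mithai", np.nan, "Bakery", None]})

# Count nulls before filling
nulls_before = int(toy["cuisines"].isnull().sum())

# Replace nulls with the same placeholder used in the notebook
toy["cuisines"] = toy["cuisines"].fillna("No Description")

# Verify the fill and count how many rows now carry the placeholder
nulls_after = int(toy["cuisines"].isnull().sum())
placeholder_rows = int((toy["cuisines"] == "No Description").sum())
print(nulls_before, nulls_after, placeholder_rows)
```

Counting the placeholder rows makes it easy to exclude them later, for example when ranking cuisines.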

In [26]:
# Handling missing values in timing entries.

df[df.timings.isnull()]

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,country_id,locality_verbose,cuisines,timings,average_cost_for_two,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
126,3400957,Anise,['Fine Dining'],https://www.zomato.com/agra/anise-tajganj?utm_...,"Courtyard by Marriott Agra, Taj Nagri , Phase ...",Agra,34,Tajganj,27.156386,78.059525,282001.0,1,"Tajganj, Agra",North Indian,,2000,4,Rs.,"['Cash', 'Debit Card', 'Dinner', 'Credit Card'...",3.8,Good,23,40,0.0,-1,-1
214,19032611,Om Sai Lassi And Juice Corner,['Beverage Shop'],https://www.zomato.com/agra/om-sai-lassi-and-j...,"Bodla Chauraya, Shahganj, Agra",Agra,34,Shahganj,27.188775,77.962173,282010.0,1,"Shahganj, Agra",Juices,,100,1,Rs.,"['Cash', 'Takeaway Available', 'Indoor Seating...",3.9,Good,46,3,0.0,1,-1
220,19050396,Agarwal Lassi,['Shack'],https://www.zomato.com/agra/agarwal-lassi-civi...,22/156chandi wali gali kothi near paliwal park...,Agra,34,Civil Lines,27.199784,78.017700,492006,1,"Civil Lines, Agra",Street Food,,100,1,Rs.,"['Cash', 'Takeaway Available']",3.6,Good,19,0,0.0,1,-1
297,19031666,chai ki tapri,['Café'],https://www.zomato.com/agra/chai-ki-tapri-sika...,shop no 2 hno 216 sec- 6a avas vikas colony si...,Agra,34,Sikandra,27.199757,77.958770,282007.0,1,"Sikandra, Agra",Cafe,,100,1,Rs.,['Outdoor Seating'],0.0,Not rated,1,0,0.0,-1,-1
383,3400322,Filmy Zayka - Hotel Alleviate,['Casual Dining'],https://www.zomato.com/agra/filmy-zayka-hotel-...,"6/27, Yamuna Kinara Road, Belanganj, Civil Lin...",Agra,34,Civil Lines,27.189864,78.024778,492006,1,"Civil Lines, Agra","Chinese, North Indian",,700,2,Rs.,"['Takeaway Available', 'Lunch', 'Cash', 'Credi...",4.1,Very Good,90,95,0.0,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
207778,19065540,Bake N Break Bakery,['Bakery'],https://www.zomato.com/visakhapatnam/bake-n-br...,"Door 21-2-13/14, Sector 9, MVP Colony, Visakha...",Vizag,28,MVP Colony,17.716327,83.311777,530017.0,1,"MVP Colony, Vizag",Bakery,,300,1,Rs.,"['Cash', 'Indoor Seating', 'Desserts and Bakes']",0.0,Not rated,1,0,0.0,-1,-1
208022,19065540,Bake N Break Bakery,['Bakery'],https://www.zomato.com/visakhapatnam/bake-n-br...,"Door 21-2-13/14, Sector 9, MVP Colony, Visakha...",Vizag,28,MVP Colony,17.716327,83.311777,530017.0,1,"MVP Colony, Vizag",Bakery,,300,1,Rs.,"['Cash', 'Indoor Seating', 'Desserts and Bakes']",0.0,Not rated,1,0,0.0,-1,-1
208465,19065540,Bake N Break Bakery,['Bakery'],https://www.zomato.com/visakhapatnam/bake-n-br...,"Door 21-2-13/14, Sector 9, MVP Colony, Visakha...",Vizag,28,MVP Colony,17.716327,83.311777,530017.0,1,"MVP Colony, Vizag",Bakery,,300,1,Rs.,"['Cash', 'Indoor Seating', 'Desserts and Bakes']",0.0,Not rated,1,0,0.0,-1,-1
209415,3200082,Salsa Habanero Mexican Taqueria,['Casual Dining'],https://www.zomato.com/vadodara/salsa-habanero...,"144/145 Villa D Este Next to Madhav Tower, Sam...",Vadodara,32,Subhanpura,22.318358,73.151159,390023.0,1,"Subhanpura, Vadodara",Mexican,,1000,3,Rs.,"['Cash', 'No Alcohol Available', 'Dinner', 'In...",4.1,Very Good,72,14,0.0,-1,-1


In [27]:
# Comparing with rows where timings are not null

df

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,country_id,locality_verbose,cuisines,timings,average_cost_for_two,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
0,3400299,Bikanervala,['Quick Bites'],https://www.zomato.com/agra/bikanervala-khanda...,"Kalyani Point, Near Tulsi Cinema, Bypass Road,...",Agra,34,Khandari,27.211450,78.002381,282002.0,1,"Khandari, Agra","North Indian, South Indian, Mithai, Street Foo...",8:30am – 10:30pm (Mon-Sun),700,2,Rs.,"['Lunch', 'Takeaway Available', 'Credit Card',...",4.4,Very Good,814,154,0.0,-1,-1
1,3400005,Mama Chicken Mama Franky House,['Quick Bites'],https://www.zomato.com/agra/mama-chicken-mama-...,"Main Market, Sadar Bazaar, Agra Cantt, Agra",Agra,34,Agra Cantt,27.160569,78.011583,282001.0,1,"Agra Cantt, Agra","North Indian, Mughlai, Rolls, Chinese, Fast Fo...","12:30PM to 12Midnight (Mon, Wed, Thu, Fri, Sat...",600,2,Rs.,"['Delivery', 'No Alcohol Available', 'Dinner',...",4.4,Very Good,1203,161,0.0,-1,-1
2,3401013,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-2-sh...,"62/1, Near Easy Day, West Shivaji Nagar, Goalp...",Agra,34,Shahganj,27.182938,77.979684,282010.0,1,"Shahganj, Agra","Fast Food, Mithai",9:30 AM to 11 PM,300,1,Rs.,"['No Alcohol Available', 'Dinner', 'Takeaway A...",4.2,Very Good,801,107,0.0,1,-1
3,3400290,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-civi...,"Near Anjana Cinema, Nehru Nagar, Civil Lines, ...",Agra,34,Civil Lines,27.205668,78.004799,282002.0,1,"Civil Lines, Agra","Desserts, Bakery, Fast Food, South Indian",8am – 11pm (Mon-Sun),300,1,Rs.,"['Takeaway Available', 'Credit Card', 'Lunch',...",4.3,Very Good,693,157,0.0,1,-1
4,3401744,The Salt Cafe Kitchen & Bar,['Casual Dining'],https://www.zomato.com/agra/the-salt-cafe-kitc...,"1C,3rd Floor, Fatehabad Road, Tajganj, Agra",Agra,34,Tajganj,27.157709,78.052421,282001.0,1,"Tajganj, Agra","North Indian, Continental, Italian",11:30 AM to 11:30 PM,1000,3,Rs.,"['Lunch', 'Serves Alcohol', 'Cash', 'Credit Ca...",4.9,Excellent,470,291,0.0,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
211939,3202251,Kali Mirch Cafe And Restaurant,['Casual Dining'],https://www.zomato.com/vadodara/kali-mirch-caf...,"Manu Smriti Complex, Near Navrachna School, GI...",Vadodara,32,Fatehgunj,22.336931,73.192356,390024.0,1,"Fatehgunj, Vadodara",North Indian,"11am – 3pm, 7pm – 11:30pm (Mon-Sun)",600,2,Rs.,"['Dinner', 'Cash', 'Lunch', 'Delivery', 'Indoo...",4.1,Very Good,243,40,0.0,-1,-1
211940,3200996,Raju Omlet,['Quick Bites'],https://www.zomato.com/vadodara/raju-omlet-kar...,"Mahalaxmi Apartment, Opposite B O B, Karoli Ba...",Vadodara,32,Karelibaug,22.322455,73.197203,390018.0,1,"Karelibaug, Vadodara",Fast Food,"4pm – 11pm (Mon, Tue, Wed, Fri, Sat, Sun), 4pm...",300,1,Rs.,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,187,40,0.0,1,-1
211941,18984164,The Grand Thakar,['Casual Dining'],https://www.zomato.com/vadodara/the-grand-thak...,"3rd Floor, Shreem Shalini Mall, Opposite Conqu...",Vadodara,32,Alkapuri,22.310563,73.171163,390007.0,1,"Alkapuri, Vadodara","Gujarati, North Indian, Chinese","11:30 AM to 3:30 PM, 7:30 PM to 11 PM",700,2,Rs.,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,111,38,0.0,-1,-1
211942,3201138,Subway,['Quick Bites'],https://www.zomato.com/vadodara/subway-1-akota...,"G-2, Vedant Platina, Near Cosmos, Akota, Vadodara",Vadodara,32,Akota,22.270027,73.143068,390020.0,1,"Akota, Vadodara","Fast Food, Sandwich, Salad",8 AM to 1 AM,500,2,Rs.,"['Dinner', 'Delivery', 'Credit Card', 'Lunch',...",3.7,Good,128,34,0.0,1,-1


- The 'timings' column is of object type: its string values hold both the opening hours and the days of the week on which the establishment is open.
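
One way to make the null versus non-null comparison concrete is to group another column by the missing-value mask. The sketch below assumes a hypothetical four-row frame; the names `toy`, `missing`, and `votes_by_missing` are illustrative only:

```python
import numpy as np
import pandas as pd

# Hypothetical rows: two with timings, two without
toy = pd.DataFrame({
    "timings": ["8 AM to 1 AM", np.nan, "9:30 AM to 11 PM", np.nan],
    "votes": [128, 1, 801, 19],
})

# Boolean mask: True where timings is missing
missing = toy["timings"].isnull()

# Mean votes for rows with timings (False) and without timings (True)
votes_by_missing = toy.groupby(missing)["votes"].mean()
print(votes_by_missing)
```

If the two groups differ sharply on the real data, the missing timings may not be random, which would be worth noting before filling them with a placeholder.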

In [28]:
df.timings = df.timings.fillna("No Info")

In [29]:
df.timings.isnull().sum()

0

The **timings column** is now free of any null values.

In [30]:
# Handling null values in opentable_support entries

df.opentable_support

0         0.0
1         0.0
2         0.0
3         0.0
4         0.0
         ... 
211939    0.0
211940    0.0
211941    0.0
211942    0.0
211943    0.0
Name: opentable_support, Length: 211944, dtype: float64

In [31]:
df.opentable_support.mean()

0.0

The mean of 'opentable_support' is zero and every displayed value is zero, which suggests the column holds no other value, so null values are filled with zero for consistency.
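
A zero mean alone does not prove that every value is zero, because positive and negative values can cancel out. The following sketch, using two hypothetical series and a hypothetical helper `is_constant_zero`, shows a stricter check:

```python
import numpy as np
import pandas as pd

# Hypothetical columns: both have a mean of zero, only one is all zeros
all_zero = pd.Series([0.0, 0.0, np.nan, 0.0])
mixed = pd.Series([-1.0, 1.0, np.nan, 0.0])

def is_constant_zero(s: pd.Series) -> bool:
    """Return True when every non-null value in the series equals zero."""
    return bool(s.dropna().eq(0).all())

print(all_zero.mean(), is_constant_zero(all_zero))
print(mixed.mean(), is_constant_zero(mixed))
```

Running a check like this on the real column before filling would confirm that zero is the only value present.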

In [32]:
df.opentable_support = df.opentable_support.fillna(0)

In [33]:
df.opentable_support.isnull().sum()

0

- No null values remain in the "opentable_support" column.

### Verifying the entire dataset for any remaining inconsistencies.

In [34]:
df.isnull().sum()

res_id                  0
name                    0
establishment           0
url                     0
address                 0
city                    0
city_id                 0
locality                0
latitude                0
longitude               0
zipcode                 0
country_id              0
locality_verbose        0
cuisines                0
timings                 0
average_cost_for_two    0
price_range             0
currency                0
highlights              0
aggregate_rating        0
rating_text             0
votes                   0
photo_count             0
opentable_support       0
delivery                0
takeaway                0
dtype: int64

The entire dataset is now free of null values.

#### Removing unnecessary columns from the dataset for improved clarity.

In [35]:
df

Unnamed: 0,res_id,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,country_id,locality_verbose,cuisines,timings,average_cost_for_two,price_range,currency,highlights,aggregate_rating,rating_text,votes,photo_count,opentable_support,delivery,takeaway
0,3400299,Bikanervala,['Quick Bites'],https://www.zomato.com/agra/bikanervala-khanda...,"Kalyani Point, Near Tulsi Cinema, Bypass Road,...",Agra,34,Khandari,27.211450,78.002381,282002.0,1,"Khandari, Agra","North Indian, South Indian, Mithai, Street Foo...",8:30am – 10:30pm (Mon-Sun),700,2,Rs.,"['Lunch', 'Takeaway Available', 'Credit Card',...",4.4,Very Good,814,154,0.0,-1,-1
1,3400005,Mama Chicken Mama Franky House,['Quick Bites'],https://www.zomato.com/agra/mama-chicken-mama-...,"Main Market, Sadar Bazaar, Agra Cantt, Agra",Agra,34,Agra Cantt,27.160569,78.011583,282001.0,1,"Agra Cantt, Agra","North Indian, Mughlai, Rolls, Chinese, Fast Fo...","12:30PM to 12Midnight (Mon, Wed, Thu, Fri, Sat...",600,2,Rs.,"['Delivery', 'No Alcohol Available', 'Dinner',...",4.4,Very Good,1203,161,0.0,-1,-1
2,3401013,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-2-sh...,"62/1, Near Easy Day, West Shivaji Nagar, Goalp...",Agra,34,Shahganj,27.182938,77.979684,282010.0,1,"Shahganj, Agra","Fast Food, Mithai",9:30 AM to 11 PM,300,1,Rs.,"['No Alcohol Available', 'Dinner', 'Takeaway A...",4.2,Very Good,801,107,0.0,1,-1
3,3400290,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-civi...,"Near Anjana Cinema, Nehru Nagar, Civil Lines, ...",Agra,34,Civil Lines,27.205668,78.004799,282002.0,1,"Civil Lines, Agra","Desserts, Bakery, Fast Food, South Indian",8am – 11pm (Mon-Sun),300,1,Rs.,"['Takeaway Available', 'Credit Card', 'Lunch',...",4.3,Very Good,693,157,0.0,1,-1
4,3401744,The Salt Cafe Kitchen & Bar,['Casual Dining'],https://www.zomato.com/agra/the-salt-cafe-kitc...,"1C,3rd Floor, Fatehabad Road, Tajganj, Agra",Agra,34,Tajganj,27.157709,78.052421,282001.0,1,"Tajganj, Agra","North Indian, Continental, Italian",11:30 AM to 11:30 PM,1000,3,Rs.,"['Lunch', 'Serves Alcohol', 'Cash', 'Credit Ca...",4.9,Excellent,470,291,0.0,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
211939,3202251,Kali Mirch Cafe And Restaurant,['Casual Dining'],https://www.zomato.com/vadodara/kali-mirch-caf...,"Manu Smriti Complex, Near Navrachna School, GI...",Vadodara,32,Fatehgunj,22.336931,73.192356,390024.0,1,"Fatehgunj, Vadodara",North Indian,"11am – 3pm, 7pm – 11:30pm (Mon-Sun)",600,2,Rs.,"['Dinner', 'Cash', 'Lunch', 'Delivery', 'Indoo...",4.1,Very Good,243,40,0.0,-1,-1
211940,3200996,Raju Omlet,['Quick Bites'],https://www.zomato.com/vadodara/raju-omlet-kar...,"Mahalaxmi Apartment, Opposite B O B, Karoli Ba...",Vadodara,32,Karelibaug,22.322455,73.197203,390018.0,1,"Karelibaug, Vadodara",Fast Food,"4pm – 11pm (Mon, Tue, Wed, Fri, Sat, Sun), 4pm...",300,1,Rs.,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,187,40,0.0,1,-1
211941,18984164,The Grand Thakar,['Casual Dining'],https://www.zomato.com/vadodara/the-grand-thak...,"3rd Floor, Shreem Shalini Mall, Opposite Conqu...",Vadodara,32,Alkapuri,22.310563,73.171163,390007.0,1,"Alkapuri, Vadodara","Gujarati, North Indian, Chinese","11:30 AM to 3:30 PM, 7:30 PM to 11 PM",700,2,Rs.,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,111,38,0.0,-1,-1
211942,3201138,Subway,['Quick Bites'],https://www.zomato.com/vadodara/subway-1-akota...,"G-2, Vedant Platina, Near Cosmos, Akota, Vadodara",Vadodara,32,Akota,22.270027,73.143068,390020.0,1,"Akota, Vadodara","Fast Food, Sandwich, Salad",8 AM to 1 AM,500,2,Rs.,"['Dinner', 'Delivery', 'Credit Card', 'Lunch',...",3.7,Good,128,34,0.0,1,-1


In [36]:
df.country_id.nunique()

1

- Since all the data is from the same country, the "country_id" column is unnecessary and will be dropped for simplification.

In [37]:
df.res_id.nunique()

55568

- Only 55,568 of the 211,944 "res_id" values are unique, so the column contains heavy duplication, seemingly clustered by region. The "res_id" column will therefore be dropped, and the index will identify rows instead.
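
To see how repeated identifiers can be quantified, here is a small sketch on hypothetical ids (the frame `toy` and its values are made up for illustration):

```python
import pandas as pd

# Hypothetical ids: 3 unique values spread over 6 rows
toy = pd.DataFrame({"res_id": [101, 102, 101, 103, 101, 102]})

# Number of distinct ids
unique_ids = toy["res_id"].nunique()

# Rows whose id already appeared earlier in the frame
repeat_rows = int(toy["res_id"].duplicated().sum())

# How many times each id occurs, most frequent first
occurrences = toy["res_id"].value_counts()
print(unique_ids, repeat_rows)
print(occurrences)
```

On the real data, `value_counts()` would show whether a few restaurants account for most of the repeats or whether the duplication is spread evenly.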

In [38]:
df.currency.nunique()

1

- Given that the country is consistent for all rows and the restaurants only accept Rs./INR, the "currency" column is redundant and will be dropped for simplicity.

In [39]:
df.opentable_support.nunique()

1

- As the "opentable_support" column consistently holds zero values across all rows, it does not provide valuable insights. Consequently, this column will be dropped for efficiency.
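
The single-value checks above can also be done in one pass. This is only a sketch on a hypothetical frame, not the notebook's own method:

```python
import pandas as pd

# Hypothetical frame with two constant columns
toy = pd.DataFrame({
    "name": ["A", "B", "C"],
    "country_id": [1, 1, 1],
    "currency": ["Rs.", "Rs.", "Rs."],
    "votes": [10, 20, 10],
})

# Columns with a single distinct value carry no information for analysis
constant_cols = [col for col in toy.columns if toy[col].nunique(dropna=False) == 1]

# Drop them by name
trimmed = toy.drop(columns=constant_cols)
print(constant_cols, list(trimmed.columns))
```

Note that this only finds constant columns; "res_id" is dropped for a different reason (duplication) and would still need to be listed explicitly.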

In [40]:
# Dropping unnecessary columns from the dataset

df = df.drop(["res_id","country_id","currency","opentable_support"],axis=1)

In [41]:
df

Unnamed: 0,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,locality_verbose,cuisines,timings,average_cost_for_two,price_range,highlights,aggregate_rating,rating_text,votes,photo_count,delivery,takeaway
0,Bikanervala,['Quick Bites'],https://www.zomato.com/agra/bikanervala-khanda...,"Kalyani Point, Near Tulsi Cinema, Bypass Road,...",Agra,34,Khandari,27.211450,78.002381,282002.0,"Khandari, Agra","North Indian, South Indian, Mithai, Street Foo...",8:30am – 10:30pm (Mon-Sun),700,2,"['Lunch', 'Takeaway Available', 'Credit Card',...",4.4,Very Good,814,154,-1,-1
1,Mama Chicken Mama Franky House,['Quick Bites'],https://www.zomato.com/agra/mama-chicken-mama-...,"Main Market, Sadar Bazaar, Agra Cantt, Agra",Agra,34,Agra Cantt,27.160569,78.011583,282001.0,"Agra Cantt, Agra","North Indian, Mughlai, Rolls, Chinese, Fast Fo...","12:30PM to 12Midnight (Mon, Wed, Thu, Fri, Sat...",600,2,"['Delivery', 'No Alcohol Available', 'Dinner',...",4.4,Very Good,1203,161,-1,-1
2,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-2-sh...,"62/1, Near Easy Day, West Shivaji Nagar, Goalp...",Agra,34,Shahganj,27.182938,77.979684,282010.0,"Shahganj, Agra","Fast Food, Mithai",9:30 AM to 11 PM,300,1,"['No Alcohol Available', 'Dinner', 'Takeaway A...",4.2,Very Good,801,107,1,-1
3,Bhagat Halwai,['Quick Bites'],https://www.zomato.com/agra/bhagat-halwai-civi...,"Near Anjana Cinema, Nehru Nagar, Civil Lines, ...",Agra,34,Civil Lines,27.205668,78.004799,282002.0,"Civil Lines, Agra","Desserts, Bakery, Fast Food, South Indian",8am – 11pm (Mon-Sun),300,1,"['Takeaway Available', 'Credit Card', 'Lunch',...",4.3,Very Good,693,157,1,-1
4,The Salt Cafe Kitchen & Bar,['Casual Dining'],https://www.zomato.com/agra/the-salt-cafe-kitc...,"1C,3rd Floor, Fatehabad Road, Tajganj, Agra",Agra,34,Tajganj,27.157709,78.052421,282001.0,"Tajganj, Agra","North Indian, Continental, Italian",11:30 AM to 11:30 PM,1000,3,"['Lunch', 'Serves Alcohol', 'Cash', 'Credit Ca...",4.9,Excellent,470,291,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
211939,Kali Mirch Cafe And Restaurant,['Casual Dining'],https://www.zomato.com/vadodara/kali-mirch-caf...,"Manu Smriti Complex, Near Navrachna School, GI...",Vadodara,32,Fatehgunj,22.336931,73.192356,390024.0,"Fatehgunj, Vadodara",North Indian,"11am – 3pm, 7pm – 11:30pm (Mon-Sun)",600,2,"['Dinner', 'Cash', 'Lunch', 'Delivery', 'Indoo...",4.1,Very Good,243,40,-1,-1
211940,Raju Omlet,['Quick Bites'],https://www.zomato.com/vadodara/raju-omlet-kar...,"Mahalaxmi Apartment, Opposite B O B, Karoli Ba...",Vadodara,32,Karelibaug,22.322455,73.197203,390018.0,"Karelibaug, Vadodara",Fast Food,"4pm – 11pm (Mon, Tue, Wed, Fri, Sat, Sun), 4pm...",300,1,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,187,40,1,-1
211941,The Grand Thakar,['Casual Dining'],https://www.zomato.com/vadodara/the-grand-thak...,"3rd Floor, Shreem Shalini Mall, Opposite Conqu...",Vadodara,32,Alkapuri,22.310563,73.171163,390007.0,"Alkapuri, Vadodara","Gujarati, North Indian, Chinese","11:30 AM to 3:30 PM, 7:30 PM to 11 PM",700,2,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,111,38,-1,-1
211942,Subway,['Quick Bites'],https://www.zomato.com/vadodara/subway-1-akota...,"G-2, Vedant Platina, Near Cosmos, Akota, Vadodara",Vadodara,32,Akota,22.270027,73.143068,390020.0,"Akota, Vadodara","Fast Food, Sandwich, Salad",8 AM to 1 AM,500,2,"['Dinner', 'Delivery', 'Credit Card', 'Lunch',...",3.7,Good,128,34,1,-1


In [42]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 211944 entries, 0 to 211943
Data columns (total 22 columns):
 #   Column                Non-Null Count   Dtype  
---  ------                --------------   -----  
 0   name                  211944 non-null  object 
 1   establishment         211944 non-null  object 
 2   url                   211944 non-null  object 
 3   address               211944 non-null  object 
 4   city                  211944 non-null  object 
 5   city_id               211944 non-null  int64  
 6   locality              211944 non-null  object 
 7   latitude              211944 non-null  float64
 8   longitude             211944 non-null  float64
 9   zipcode               211944 non-null  object 
 10  locality_verbose      211944 non-null  object 
 11  cuisines              211944 non-null  object 
 12  timings               211944 non-null  object 
 13  average_cost_for_two  211944 non-null  int64  
 14  price_range           211944 non-null  int64  
 15  

Currently, the dataset comprises a total of 22 columns, including 10 numerical columns and 12 object-type columns.

- The "zipcode" column became object type after the **fill_missing_zipcodes() function** was applied, so it will be converted to an integer type for consistency.

In [43]:
df['zipcode'] = df['zipcode'].astype(int)

ValueError: invalid literal for int() with base 10: '282002.0'

- The error shows that the column holds strings such as '282002.0': `int()` cannot parse a string containing a decimal point, and other non-numeric strings may be present as well.
- The problem may also originate in an earlier code block, so the column's values need closer inspection.
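
The failure can be reproduced in isolation with plain Python, using the literal from the traceback. The variable names below are illustrative:

```python
raw = "282002.0"  # the literal reported in the traceback

try:
    int(raw)
    direct_ok = True
except ValueError:
    direct_ok = False  # int() rejects strings containing a decimal point

# Parsing as float first, then converting to int, succeeds
via_float = int(float(raw))
print(direct_ok, via_float)
```

This is why a float (or `pd.to_numeric`) step is needed before the final integer conversion.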

In [None]:
# Trying to identify the problematic values by attempting to convert each value individually using a try-except block.

In [None]:
for value in df['zipcode']:
    try:
        int(value)
    except ValueError:
        print(f"Cannot convert {value} to int")
        
# This cell was left unexecuted to keep the notebook readable.

- "Apna  baza", "absoluteba", "Bangalore", and "Shop no. S" are a few of the non-numeric entries found by manual inspection.
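
As an alternative to looping and printing, the unparseable entries can be collected in one vectorized step. The sketch below assumes a hypothetical sample series `zips` that mimics the kinds of values reported here:

```python
import pandas as pd

# Hypothetical sample mixing valid and invalid entries
zips = pd.Series(["282002.0", "380 006", "Bangalore", "absoluteba", "492006"])

# Values that cannot be parsed as numbers become NaN
parsed = pd.to_numeric(zips, errors="coerce")

# Distinct original strings that failed to parse, in order of appearance
bad_values = zips[parsed.isna()].unique().tolist()
print(bad_values)
```

Listing only the distinct failures keeps the output short even when the same bad value repeats across thousands of rows.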

In [44]:
# Convert the "zipcode" column to strings so that we can use isalpha()

df['zipcode'] = df['zipcode'].astype(str)

In [45]:
# Helper that returns True when the value consists only of alphabetic characters
def contains_only_alphabets(value):
    return value.isalpha()

# Apply the function to filter rows
alphabetic_rows = df[df['zipcode'].apply(contains_only_alphabets)]

# Display the rows where the "zipcode" column contains only alphabets or words
alphabetic_rows

Unnamed: 0,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,locality_verbose,cuisines,timings,average_cost_for_two,price_range,highlights,aggregate_rating,rating_text,votes,photo_count,delivery,takeaway
19951,AB's - Absolute Barbecues,['Casual Dining'],https://www.zomato.com/bangalore/abs-absolute-...,"1st Floor, GRS Towers, Above Spencers Hyper Ma...",Bangalore,4,Sarjapur Road,12.918717,77.669925,absoluteba,"Sarjapur Road, Bangalore","European, Mediterranean, North Indian, BBQ","12 Noon to 4 PM, 6:30 PM to 12 Midnight",1600,3,"['Lunch', 'Credit Card', 'Debit Card', 'Dinner...",4.8,Excellent,2012,827,-1,-1
20497,Cake Box,['Bakery'],https://www.zomato.com/bangalore/cake-box-1-ko...,"74/2A, K.R. Garden, Koramangala 8th Block, Ban...",Bangalore,4,Koramangala 8th Block,12.943477,77.620535,Bangalore,"Koramangala 8th Block, Bangalore","Bakery, Desserts",10:30 AM to 2 AM,400,1,"['Delivery', 'Cash', 'Takeaway Available', 'Pr...",3.2,Average,122,25,1,-1
20614,Lakeview Milkbar,['Dessert Parlour'],https://www.zomato.com/bangalore/lakeview-milk...,"8th Block, Opposite Bethany School, 80 Feet Ro...",Bangalore,4,Koramangala 8th Block,12.940799,77.620468,Bangalore,"Koramangala 8th Block, Bangalore","Desserts, Cafe",9 AM to 1 AM (Mon-Sun),500,2,"['Dinner', 'Delivery', 'Breakfast', 'Lunch', '...",3.8,Good,309,357,-1,-1
21490,Tiffin Box,['Mess'],https://www.zomato.com/bangalore/tiffin-box-ko...,"214, 3rd Main Road, Koramangala 8th Block, Ban...",Bangalore,4,Koramangala 8th Block,12.941822,77.616169,Bangalore,"Koramangala 8th Block, Bangalore","South Indian, Beverages",8 AM to 9 PM,400,1,"['Takeaway Available', 'Cash', 'Debit Card', '...",0.0,Not rated,0,0,-1,-1
21515,Sri Ganesh Juice Junction,['Beverage Shop'],https://www.zomato.com/bangalore/sri-ganesh-ju...,"537, 1st Main, Near Bethani School, 8th Block,...",Bangalore,4,Koramangala 8th Block,12.940879,77.619751,Bangalore,"Koramangala 8th Block, Bangalore","Beverages, Juices, Fast Food",7am – 11pm (Mon-Sun),150,1,"['Takeaway Available', 'Credit Card', 'No Seat...",4.0,Very Good,250,36,1,-1
21549,Sri Ganesh Juice Junction,['Beverage Shop'],https://www.zomato.com/bangalore/sri-ganesh-ju...,"513, Opposite Ganesh Temple, Koramangala 8th B...",Bangalore,4,Koramangala 8th Block,12.941632,77.621547,Bangalore,"Koramangala 8th Block, Bangalore","Juices, Fast Food",9 AM to 10:30 PM,150,1,"['No Seating Available', 'Delivery', 'Cash', '...",3.7,Good,150,18,1,-1
23155,AB's - Absolute Barbecues,['Casual Dining'],https://www.zomato.com/bangalore/abs-absolute-...,"1st Floor, GRS Towers, Above Spencers Hyper Ma...",Bangalore,4,Sarjapur Road,12.918717,77.669925,absoluteba,"Sarjapur Road, Bangalore","European, Mediterranean, North Indian, BBQ","12 Noon to 4 PM, 6:30 PM to 12 Midnight",1600,3,"['Lunch', 'Credit Card', 'Debit Card', 'Dinner...",4.8,Excellent,2012,827,-1,-1
23209,AB's - Absolute Barbecues,['Casual Dining'],https://www.zomato.com/bangalore/abs-absolute-...,"1st Floor, GRS Towers, Above Spencers Hyper Ma...",Bangalore,4,Sarjapur Road,12.918717,77.669925,absoluteba,"Sarjapur Road, Bangalore","European, Mediterranean, North Indian, BBQ","12 Noon to 4 PM, 6:30 PM to 12 Midnight",1600,3,"['Lunch', 'Credit Card', 'Debit Card', 'Dinner...",4.8,Excellent,2012,827,-1,-1
24583,Blue Tokai Coffee Roasters,['Café'],https://www.zomato.com/bangalore/blue-tokai-co...,"583, 80 Feet Road, Opposite Bethany High Schoo...",Bangalore,4,Koramangala 8th Block,12.940603,77.620291,Bangalore,"Koramangala 8th Block, Bangalore","Coffee, Cafe",8am – 11pm (Mon-Sun),800,2,"['Dinner', 'Delivery', 'Credit Card', 'Debit C...",4.3,Very Good,179,251,1,-1


- The string entries that are corrupting the column are now visible.

In [46]:
# Replacing purely alphabetic zipcode values with 0

# Boolean mask of rows where the "zipcode" column contains only alphabetic characters
non_numeric_rows = df['zipcode'].apply(contains_only_alphabets)

# Replace non-numeric values with zero
df.loc[non_numeric_rows, 'zipcode'] = '0'

In [47]:
# Checking whether any purely alphabetic values are left in the column

alphabetic_rows = df[df['zipcode'].apply(contains_only_alphabets)]

alphabetic_rows

Unnamed: 0,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,locality_verbose,cuisines,timings,average_cost_for_two,price_range,highlights,aggregate_rating,rating_text,votes,photo_count,delivery,takeaway


- No purely alphabetic values remain in the "zipcode" column.

In [48]:
# changing dtype from str to int for better clarity 

df['zipcode'] = df['zipcode'].astype(int)

ValueError: invalid literal for int() with base 10: '282002.0'

- All purely alphabetic values have been replaced, but the conversion still fails because `int()` cannot parse strings such as '282002.0' that contain a decimal point.
- Converting to float first should handle these values.

In [49]:
# Convert the "zipcode" column to float
df['zipcode'] = df['zipcode'].astype(float)

ValueError: could not convert string to float: '380 006'

- Some entries, such as '380 006', contain spaces, so they cannot be parsed as numbers.

In [50]:
# Replace spaces in the "zipcode" column
df['zipcode'] = df['zipcode'].str.replace(' ', '')

In [51]:
# Convert the "zipcode" column to float
df['zipcode'] = df['zipcode'].astype(float)

ValueError: could not convert string to float: 'Apnabaza'

- The string 'Apnabaza' was missed earlier because it contained spaces and so failed the `isalpha()` check; it must be replaced with zero as well.
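
The behaviour of `str.isalpha()` on the values seen in this notebook can be checked directly. The list `samples` below simply collects strings quoted earlier:

```python
samples = ["Bangalore", "absoluteba", "Apna  baza", "Shop no. S", "380 006"]

# str.isalpha() is True only when every character is a letter,
# so spaces, digits, and punctuation all make it return False
isalpha_flags = {s: s.isalpha() for s in samples}

# Removing spaces turns "Apna  baza" into a purely alphabetic string
after_strip = "Apna  baza".replace(" ", "")
print(isalpha_flags)
print(after_strip, after_strip.isalpha())
```

Because the alphabetic filter ran before the spaces were removed, entries like this one slipped through, which is why a coercing numeric conversion is the more robust catch-all.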

In [52]:
# Convert the "zipcode" column to numeric, coercing errors to NaN
df['zipcode'] = pd.to_numeric(df['zipcode'], errors='coerce')

In [53]:
# Replace NaN values with 0
df['zipcode'] = df['zipcode'].fillna(0)

In [54]:
# Convert the "zipcode" column to integers
df['zipcode'] = df['zipcode'].astype(int)

In [55]:
df.zipcode.dtype

dtype('int32')

**The "zipcode" column's data type has been converted from object to integer.**
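
The trial-and-error steps above could be condensed into a single helper. This is a sketch under the same assumptions the notebook makes (spaces are noise, anything unparseable becomes 0); the function name `clean_zipcode` is hypothetical:

```python
import pandas as pd

def clean_zipcode(col: pd.Series) -> pd.Series:
    """Hypothetical helper: normalise a messy zipcode column to integers."""
    # Work on strings and remove embedded spaces such as '380 006'
    cleaned = col.astype(str).str.replace(" ", "", regex=False)
    # Non-numeric entries become NaN instead of raising an error
    numeric = pd.to_numeric(cleaned, errors="coerce")
    # 0 marks an unusable zipcode, as in the notebook
    return numeric.fillna(0).astype(int)

raw = pd.Series(["282002.0", "380 006", "Apna  baza", "Bangalore", 492006])
result = clean_zipcode(raw).tolist()
print(result)
```

A possible alternative is to skip the zero fill and cast to pandas' nullable `Int64` dtype, which keeps missing zipcodes as missing rather than as a sentinel value.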

In [56]:
# Verifying the count of rows converted to zero during the process.

turned_zero = df.zipcode==0

In [57]:
turned_zero.value_counts()

zipcode
False    166806
True      45138
Name: count, dtype: int64

### NOTE:

A total of 45,138 rows (about 21% of the dataset) now hold a zipcode of zero after the process. However, **it's noteworthy that 166,806 rows retain a usable zipcode.** How to handle the zero-valued rows will be considered in future work.
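
To decide how to treat the zero-valued rows, it may help to see their share and where they concentrate. The sketch below uses a hypothetical five-row frame; only the column names match the dataset:

```python
import pandas as pd

# Hypothetical cleaned data where 0 marks an unusable zipcode
toy = pd.DataFrame({
    "city": ["Agra", "Agra", "Vadodara", "Vadodara", "Vadodara"],
    "zipcode": [282002, 0, 390007, 0, 0],
})

is_zero = toy["zipcode"].eq(0)

zero_count = int(is_zero.sum())      # number of rows set to zero
zero_share = float(is_zero.mean())   # fraction of rows set to zero
zero_by_city = is_zero.groupby(toy["city"]).sum()  # zero rows per city
print(zero_count, zero_share)
print(zero_by_city)
```

If the zeros cluster in a few cities, a locality-based lookup might recover many of them; if they are spread evenly, leaving them flagged may be the safer choice.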

### Identifying and addressing duplicates in the dataset.

In [58]:
df[df.duplicated()]

Unnamed: 0,name,establishment,url,address,city,city_id,locality,latitude,longitude,zipcode,locality_verbose,cuisines,timings,average_cost_for_two,price_range,highlights,aggregate_rating,rating_text,votes,photo_count,delivery,takeaway
101,Peshawri - ITC Mughal,['Fine Dining'],https://www.zomato.com/agra/peshawri-itc-mugha...,"ITC Mughal, Fatehabad Road, Tajganj, Agra",Agra,34,"ITC Mughal, Tajganj",27.161150,78.043993,0,"ITC Mughal, Tajganj, Agra","North Indian, Mughlai","12:30 PM to 2:45 PM, 7 PM to 11:45 PM",6000,4,"['Lunch', 'Cash', 'Credit Card', 'Dinner', 'De...",4.4,Very Good,353,154,-1,-1
116,Taj Bano - ITC Mughal,['Fine Dining'],https://www.zomato.com/agra/taj-bano-itc-mugha...,"ITC Mughal, Fatehabad Road, Tajganj, Agra",Agra,34,"ITC Mughal, Tajganj",27.161132,78.044022,0,"ITC Mughal, Tajganj, Agra","Mughlai, North Indian","6 AM to 10:30 AM, 12:30 PM to 2:30 PM, 7 PM to...",2500,4,"['Credit Card', 'Lunch', 'Cash', 'Debit Card',...",4.3,Very Good,96,205,-1,-1
140,Pinch Of Spice,['Casual Dining'],https://www.zomato.com/agra/pinch-of-spice-civ...,"23/453, Opposite Sanjay Cinema, Wazipura Road,...",Agra,34,Civil Lines,27.201735,78.007625,492006,"Civil Lines, Agra","North Indian, Chinese, Mughlai","12 Noon to 4 PM, 7 PM to 11:30 PM",1600,4,"['Lunch', 'Delivery', 'Credit Card', 'Dinner',...",4.6,Excellent,915,105,1,-1
141,Pinch Of Spice,['Casual Dining'],https://www.zomato.com/agra/pinch-of-spice-taj...,"1076/2, Fatehabad Road, Tajganj, Agra",Agra,34,Tajganj,27.159649,78.043304,282010,"Tajganj, Agra","North Indian, Chinese, Continental",12 Noon to 11:30 PM (Mon-Sun),1600,4,"['Delivery', 'Dinner', 'Cash', 'Credit Card', ...",4.6,Excellent,965,690,1,-1
142,Urban Deck,['Casual Dining'],https://www.zomato.com/agra/urban-deck-2-civil...,"5th Floor, The P L Palace Hotel, MG Road, Sanj...",Agra,34,Civil Lines,27.199573,78.003699,282005,"Civil Lines, Agra","Mexican, Italian, North Indian",1PM to 12Midnight (Mon-Sun),1600,4,"['Dinner', 'Cash', 'Debit Card', 'Takeaway Ava...",4.3,Very Good,672,192,1,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
211937,Biryani aur Baatein,['Casual Dining'],https://www.zomato.com/vadodara/biryani-aur-ba...,"Shop 14, Atlantis K-10, A Wing, Genda Circle R...",Vadodara,32,Alkapuri,22.317746,73.168043,390007,"Alkapuri, Vadodara","Biryani, Mughlai, Chinese","11:30am – 3:30pm, 7:30pm – 1am (Mon-Sun)",500,2,"['Dinner', 'Cash', 'Takeaway Available', 'Debi...",4.1,Very Good,154,96,-1,-1
211938,Wok On Fire,['Casual Dining'],https://www.zomato.com/vadodara/wok-on-fire-fa...,"Ground Floor 1, Rossette Building, Opposite Se...",Vadodara,32,Fatehgunj,22.323357,73.187461,390024,"Fatehgunj, Vadodara","Chinese, Asian",11 AM to 11 PM,1000,3,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,301,126,1,-1
211939,Kali Mirch Cafe And Restaurant,['Casual Dining'],https://www.zomato.com/vadodara/kali-mirch-caf...,"Manu Smriti Complex, Near Navrachna School, GI...",Vadodara,32,Fatehgunj,22.336931,73.192356,390024,"Fatehgunj, Vadodara",North Indian,"11am – 3pm, 7pm – 11:30pm (Mon-Sun)",600,2,"['Dinner', 'Cash', 'Lunch', 'Delivery', 'Indoo...",4.1,Very Good,243,40,-1,-1
211941,The Grand Thakar,['Casual Dining'],https://www.zomato.com/vadodara/the-grand-thak...,"3rd Floor, Shreem Shalini Mall, Opposite Conqu...",Vadodara,32,Alkapuri,22.310563,73.171163,390007,"Alkapuri, Vadodara","Gujarati, North Indian, Chinese","11:30 AM to 3:30 PM, 7:30 PM to 11 PM",700,2,"['Dinner', 'Cash', 'Debit Card', 'Lunch', 'Tak...",4.0,Very Good,111,38,-1,-1


- The duplicate check returns 151,533 rows, yet much of this data appears to be legitimate. While acknowledging that exactly identical rows may be present, we have decided not to remove duplicates: eliminating such a large share of the data could undermine the dataset's integrity and compromise future analyses.
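
Before settling on a duplicate policy, it can be useful to compare what the different `keep` options of `duplicated()` report. The following sketch uses a hypothetical four-row frame:

```python
import pandas as pd

# Hypothetical frame: three identical rows plus one distinct row
toy = pd.DataFrame({
    "name": ["Subway", "Subway", "Subway", "Raju Omlet"],
    "city": ["Vadodara", "Vadodara", "Vadodara", "Vadodara"],
})

# keep="first" (the default) flags every repeat after the first occurrence
repeats = int(toy.duplicated().sum())

# keep=False flags every row that belongs to a duplicated group
in_dup_group = int(toy.duplicated(keep=False).sum())

# Rows that would remain if exact duplicates were dropped
remaining = len(toy.drop_duplicates())
print(repeats, in_dup_group, remaining)
```

Comparing `remaining` with the number of unique restaurants would indicate how much genuinely distinct information the duplicated rows hold.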

#### The data has been successfully cleaned, devoid of inconsistencies. We will proceed to save it as a new CSV file.

In [59]:
# Saving the refined dataset as a new .csv file for subsequent analysis and error identification.

df.to_csv('Zomato_CleanData.csv')
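
When the file is read back, a saved index reappears as an extra unnamed column. If the index is not needed as data, one option is to pass `index=False` when saving. This sketch writes to an in-memory buffer, and the frame `toy` is hypothetical:

```python
import io
import pandas as pd

toy = pd.DataFrame({"name": ["Subway", "Raju Omlet"], "zipcode": [390020, 0]})

# Write to an in-memory buffer instead of disk for this illustration
buffer = io.StringIO()
toy.to_csv(buffer, index=False)  # index=False leaves the index out of the file
buffer.seek(0)

# Read the data back and confirm that columns and values survived
reloaded = pd.read_csv(buffer)
print(list(reloaded.columns), reloaded["zipcode"].tolist())
```

A quick reload like this also confirms that the integer "zipcode" dtype survives the round trip.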

# SUMMARY:

Throughout the data cleaning process, several key steps were undertaken to enhance the quality and integrity of the dataset. The initial assessment involved checking for missing values, followed by the identification and replacement of null values.

Additionally, a function was crafted to fill missing zipcodes by leveraging a locality-zipcode mapping. The dtype of the "zipcode" column was then successfully transformed from object to integer, with meticulous handling of various data anomalies, including phrases and alphanumeric entries. The process revealed that a substantial number of rows were converted to zero, prompting considerations for future handling.

Concerning duplicates, it was acknowledged that exact duplicates existed in the dataset, but a decision was made not to remove them, recognizing the potential loss of valuable information within the 151,533 rows. The refined and cleaned dataset, free from inconsistencies, was ultimately saved as a new CSV file for future analyses and error findings.

In summary, the data cleaning process involved addressing missing values, handling diverse data anomalies, converting data types, and making informed decisions regarding duplicates. The resulting dataset is now prepared for further analysis with improved quality and reliability.