## Analysis of Zomato Restaurants in Bangalore

#### This analysis tries to identify the best-rated and most popular restaurants in Bangalore, and to offer insights into the best places to open a new restaurant.

##### Data Description: The dataset consists of a single file, zomato.csv, which is the file read below. It includes:
#### Name, Location, Dishes and various other aspects of each restaurant.

##### Import Pandas and Numpy

In [None]:
import pandas as pd
import numpy as np

##### Import Matplotlib and Seaborn

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

#### Read the file as dataframe df

In [None]:
df= pd.read_csv("../input/zomato-bangalore-restaurants/zomato.csv",encoding='latin1')

In [None]:
df.info()

In [None]:
df.head()

### Preprocessing
#### We will remove the columns that are irrelevant to this analysis from the dataset. We will also rename the remaining columns.


In [None]:
df.drop(['url','address','phone'],axis=1,inplace=True)
df

In [None]:
df.drop(['reviews_list','menu_item'],axis=1,inplace=True)
df

In [None]:
df.drop(['location'],axis=1,inplace=True)

In [None]:
df.columns=['Name','Online_Order','Book_table','Rate','Votes','Rest_type','Dish_liked','Cuisines','Avg_cost','Meal_type','City']
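
Assigning a full list to `df.columns` only works while the column order is exactly as expected. A minimal sketch of a name-based alternative is shown here on a tiny stand-in frame. The raw column names used (for example `approx_cost(for two people)` and `listed_in(type)`) are assumptions about the source file and should be checked against `df.columns` before use; `raw`, `rename_map` and `renamed` are illustrative names.

```python
import pandas as pd

# Tiny stand-in frame; the raw column names are assumed, not confirmed by this notebook.
raw = pd.DataFrame({
    "name": ["A"],
    "online_order": ["Yes"],
    "approx_cost(for two people)": ["300"],
    "listed_in(type)": ["Delivery"],
    "listed_in(city)": ["BTM"],
    "phone": ["000"],
})

# Map old names to new ones explicitly, so the column order does not matter.
rename_map = {
    "name": "Name",
    "online_order": "Online_Order",
    "approx_cost(for two people)": "Avg_cost",
    "listed_in(type)": "Meal_type",
    "listed_in(city)": "City",
}

# errors="ignore" skips any listed column that is not present in the frame.
renamed = (
    raw.drop(columns=["url", "address", "phone"], errors="ignore")
       .rename(columns=rename_map)
)
print(list(renamed.columns))
```

Because both steps return a new frame, the original data stays available if a column turns out to be needed later.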

In [None]:
df

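#### We will remove the /5 from the Rate column to make it easier to analyse. Because str(x) is applied first, missing values become the string 'nan'; these are turned back into missing values in the Rate analysis below.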

In [None]:
df['Rate']=df['Rate'].apply(lambda x:str(x).split('/')[0])

In [None]:
df.isnull().sum()

In [None]:
sns.set_context("paper", font_scale = 2, rc = {"font.size": 10,"axes.titlesize": 15,"axes.labelsize": 15})   
sns.countplot(x='Online_Order',data=df,palette='magma')
plt.title('Restaurants that Take Online Order')
plt.show()

In [None]:
sns.set_context("paper", font_scale = 2, rc = {"font.size": 10,"axes.titlesize": 15,"axes.labelsize": 15})   
sns.countplot(x='Book_table',data=df,palette='magma')
plt.title('Restaurants that have Table Bookings')
plt.show()

##### Analysis: Most restaurants offer online ordering and table booking.

In [None]:
sns.set_context('paper',font_scale=2,rc={'font.size':10,'axes.labelsize':15,'axes.titlesize':15})
b=sns.countplot(x='Meal_type',data=df,palette='magma')
plt.title('Restaurants according to Meal Type')
b.set_xticklabels(b.get_xticklabels(),rotation=90)
plt.show()

In [None]:
plt.figure(figsize=(10,6))
b=sns.countplot(x='City',data=df,palette='magma')
b.set_xticklabels(b.get_xticklabels(),rotation=90)
plt.show()

#### Analysis: BTM and Koramangala have the highest number of Restaurants in the city.

### Analysing the Rate Column

In [None]:
a=list(df['Rate'])
for i in range(len(a)):
    if a[i]=='-':
        a[i]=None
    elif a[i]=='NEW':
        a[i]=None
    elif a[i]=='nan':
        a[i]=None
    elif a[i]=='unrated':
        a[i]=None
    else:
        a[i]=float(a[i])

In [None]:
df['Rate']=a
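
The loop above can also be written as a single vectorized conversion. The sketch below runs on a small stand-in series (`rate_demo` is a hypothetical name, and its values are made up to mirror the placeholders handled in the loop). It relies on `pd.to_numeric` with `errors="coerce"`, which turns anything that cannot be parsed as a number into NaN.

```python
import pandas as pd

# Stand-in for the 'Rate' strings after the '/5' suffix has been removed.
rate_demo = pd.Series(["4.1", "3.8 ", "NEW", "-", "nan", "unrated"])

# Strip stray whitespace, then coerce every non-numeric placeholder to NaN.
rate_numeric = pd.to_numeric(rate_demo.str.strip(), errors="coerce")

print(rate_numeric.tolist())
print("missing ratings:", int(rate_numeric.isna().sum()))
```

One trade-off of this approach is that any unexpected placeholder is silently treated as missing, whereas the explicit loop would raise an error on it.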

In [None]:
df['Rate'].value_counts().head(10)

In [None]:
sns.set_context('paper',font_scale=2,rc={'font.size':15,'axes.titlesize':15,'axes.labelsize':15})
plt.figure(figsize=(10,6))
b=sns.countplot(x='Rate',data=df,palette='magma',order=df['Rate'].value_counts().index)
plt.title('Ratings of Restaurants in Bangalore')
b.set_xticklabels(b.get_xticklabels(),rotation=90)
plt.show()

#### Analysis: We can see that most of the restaurants have a rating between 3.5 and 4.0. Very few restaurants have an extremely bad rating or an extremely high rating.
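
A claim like this can be backed with a number as well as a plot. The following is a minimal sketch on made-up ratings (`rates` is a hypothetical stand-in for `df['Rate']` after conversion to floats) that computes the share of rated restaurants inside the 3.5-4.0 band.

```python
import pandas as pd

# Made-up ratings; None stands for an unrated restaurant.
rates = pd.Series([3.7, 3.9, 4.0, 2.5, 4.6, None, 3.5])

# between() includes both endpoints by default; missing ratings evaluate to False.
in_band = rates.between(3.5, 4.0)

# Share among rated restaurants only, so unrated rows do not dilute the figure.
share = in_band.sum() / rates.notna().sum()
print(f"{share:.0%} of rated restaurants fall in the 3.5-4.0 band")
```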

In [None]:
plt.figure(figsize=(7,5))
sns.set_context('paper',font_scale=2,rc={'font.size':10 ,'axes.titlesize':15,'axes.labelsize':15})
plt.title('Top 10 Types of Restaurants')
b=sns.countplot(x='Rest_type',data=df,palette='magma',order=df['Rest_type'].value_counts().head(10).index)
b.set_xticklabels(b.get_xticklabels(),rotation=90)
plt.show()


In [None]:
plt.figure(figsize=(7,5))
sns.set_context('paper',font_scale=2,rc={'font.size':10 ,'axes.titlesize':15,'axes.labelsize':15})
plt.title('10 Least Common Types of Restaurants')
b=sns.countplot(x='Rest_type',data=df,palette='magma',order=df['Rest_type'].value_counts().tail(10).index)
b.set_xticklabels(b.get_xticklabels(),rotation=90)
plt.show()


#### Analysis: We see that Quick Bites and Cafes are the most common types of restaurants.


In [None]:
plt.figure(figsize=(17,7))
sns.set_context('paper',font_scale=2,rc={'font.size':10 ,'axes.titlesize':15,'axes.labelsize':15})
b=sns.countplot(x='Avg_cost',data=df,palette='magma',order=df['Avg_cost'].value_counts().head(20).index)
plt.title('Average Cost of Restaurants')
plt.show()

#### The average cost at most restaurants ranges between 300 and 500, which suggests that Bangalore has many restaurants affordable for a middle-class family.
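
To quantify how the costs are distributed, they can be grouped into bands. The sketch below assumes the costs are already numeric (in this notebook `Avg_cost` is converted to numbers in a later cell) and uses made-up values. The band edges and the names `costs`, `cost_band` and `band_share` are illustrative choices, not part of the original analysis.

```python
import pandas as pd

# Made-up costs for two people, already numeric.
costs = pd.Series([200, 300, 450, 500, 800, 1500])

# pd.cut uses right-inclusive intervals: (0, 299], (299, 500], (500, 1000], (1000, inf].
bins = [0, 299, 500, 1000, float("inf")]
labels = ["under 300", "300-500", "501-1000", "over 1000"]
cost_band = pd.cut(costs, bins=bins, labels=labels)

# Fraction of restaurants in each band, listed in band order.
band_share = cost_band.value_counts(normalize=True).sort_index()
print(band_share)
```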

In [None]:
plt.figure(figsize=(10,5))
sns.set_context('paper',font_scale=2,rc={'font.size':20,'axes.titlesize':20,'axes.labelsize':20})
plt.title('Top 10 Cuisines')
b=sns.countplot(x='Cuisines',data=df,palette='magma',order=df['Cuisines'].value_counts().head(10).index)
b.set_xticklabels(b.get_xticklabels(),rotation=90)
plt.show()

In [None]:
plt.figure(figsize=(10,5))
b=sns.countplot(x='Cuisines',data=df,palette='magma',order=df['Cuisines'].value_counts().tail(10).index)
sns.set_context('paper',font_scale=2,rc={'font.size':20,'axes.titlesize':20,'axes.labelsize':20})
plt.title('Least Found Cuisines')
b.set_xticklabels(b.get_xticklabels(),rotation=90)
plt.show()

#### The graphs above show that food of Indian origin is the most popular type of food in Bangalore.

In [None]:
plt.figure(figsize=(20,5))
b=sns.barplot(x="Name",y='Rate',data=df,palette='magma',order=df['Name'].value_counts().head(10).index)
sns.set_context('paper',font_scale=2,rc={'font.size':20,'axes.titlesize':20,'axes.labelsize':20})
plt.title("Rate vs Names")
b.set_xticklabels(b.get_xticklabels(),rotation=90)
plt.show()

In [None]:
plt.figure(figsize=(20,8))
sns.barplot(x='Name',y='Votes',data=df,
            order=df[['Votes', 'Name']].groupby(['Name']).mean().sort_values("Votes", ascending = False).head(50).index)
plt.title('Bar plot of Votes vs names for top 50 restaurants')
plt.xticks(rotation=90)
plt.show()

In [None]:
df.loc[df['Name'] == 'Byg Brewski Brewing Company']

In [None]:
plt.figure(figsize=(20,8))
sns.barplot(x='Name',y='Rate',data=df,
            order=df[['Rate', 'Name']].groupby(['Name']).mean().sort_values("Rate", ascending = False).head(50).index)
plt.title('Bar plot of rate vs names for top 50 restaurants')
plt.xticks(rotation=90)
plt.show()

#### Analysis: Byg Brewski Brewing Company, Toit and Black Pearl have the highest number of votes, and their high ratings are consistent with that popularity.
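
Votes and ratings can also be compared side by side in one table rather than across two plots. Here is a minimal sketch on made-up rows (the restaurant names and numbers are invented for illustration) using a grouped aggregation.

```python
import pandas as pd

# Made-up listings; a restaurant can appear more than once in the data.
demo = pd.DataFrame({
    "Name": ["Cafe A", "Cafe A", "Pub B", "Diner C"],
    "Votes": [100, 300, 5000, 20],
    "Rate": [4.0, 4.2, 4.8, 3.1],
})

# One row per restaurant: mean votes, mean rating and number of listings.
summary = (
    demo.groupby("Name")
        .agg(mean_votes=("Votes", "mean"),
             mean_rate=("Rate", "mean"),
             listings=("Votes", "size"))
        .sort_values("mean_votes", ascending=False)
)
print(summary)
```

Keeping the number of listings next to the means helps to spot restaurants whose average rests on a single row.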

In [None]:
a = pd.DataFrame(df['Rate'])
a['Name'] = df['Name']
a = a.dropna(axis = 0, how ='any')
plt.figure(figsize=(20,8))
sns.barplot(x='Name',y='Rate',data=a,palette='magma',
            order=a[['Rate','Name']].groupby(['Name']).mean().sort_values('Rate',ascending=False).tail(50).index)
plt.title('Bar Plot of Rate vs Worst 50 restaurants')
plt.xticks(rotation=90)
plt.show()

In [None]:
df.loc[df['Name'] == 'Alibi - Maya International Hotel']

In [None]:
df['Avg_cost']=df['Avg_cost'].apply(lambda x:str(x).replace(',',''))
a = list(df['Avg_cost'])
for i in range(0, len(a)):
    if a[i] != 'nan':
        a[i] = int(a[i])
    else:
        a[i] = None
df['Avg_cost'] = a
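
The same cleaning can be sketched without an explicit loop. The example below uses a small made-up series (`cost_demo` is a hypothetical stand-in for the raw cost column) and yields floats rather than integers, because a numeric column that contains missing values is stored as float by default.

```python
import numpy as np
import pandas as pd

# Made-up raw costs with a thousands separator and one missing value.
cost_demo = pd.Series(["300", "1,200", np.nan, "800"])

# Remove the separator as a literal string, then convert; bad values become NaN.
cost_numeric = pd.to_numeric(
    cost_demo.str.replace(",", "", regex=False),
    errors="coerce",
)
print(cost_numeric.tolist())
```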

In [None]:
plt.figure(figsize=(20,10))
sns.barplot(x='Name',y='Avg_cost',data=df,palette='magma',
            order=df[['Avg_cost','Name']].groupby(['Name']).mean().sort_values(['Avg_cost'],ascending=False).head(50).index)
sns.set_context('paper',font_scale=2,rc={'font.size':15,'axes.labelsize':15,'axes.titlesize':15})
plt.title('Avg Cost of Top 50 Restaurants')
plt.xticks(rotation=90)
plt.show()

#### Analysis: Restaurants with branches across the country tend to have higher costs than other restaurants.

In [None]:
plt.figure(figsize=(18,8))
sns.countplot(x='Rate',data=df,hue='Online_Order',palette='magma')
plt.title('Count Plot of Rate')
plt.show()

In [None]:
plt.figure(figsize=(18,8))
sns.countplot(x='Rate',data=df,hue='Book_table',palette='magma')
plt.show()

#### Analysis: Restaurants that offer online ordering tend to have better ratings than those that do not. Restaurants with a rating of 3.5-4.0 generally allow online booking.
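
A grouped summary is one way to check this comparison numerically. The sketch below uses made-up rows (the values are invented and `by_order` is an illustrative name) and reports the mean, median and count of ratings per group.

```python
import pandas as pd

# Made-up rows; None stands for an unrated restaurant.
demo = pd.DataFrame({
    "Online_Order": ["Yes", "Yes", "No", "No", "Yes"],
    "Rate": [4.0, 3.8, 3.5, None, 4.2],
})

# Missing ratings are skipped by mean, median and count.
by_order = demo.groupby("Online_Order")["Rate"].agg(["mean", "median", "count"])
print(by_order)
```

The count column matters here: a difference in means is less convincing when one group has far fewer rated restaurants than the other.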

In [None]:
plt.figure(figsize=(16,6))
plt.title('Countplot of Restaurants in every City')
sns.countplot(x='City',data=df,palette='magma',hue='Online_Order')
sns.set_context('paper',font_scale=2,rc={'font.size':15,'axes.labelsize':15,'axes.titlesize':15})
plt.xticks(rotation=90)
plt.show()

In [None]:
plt.figure(figsize=(16,6))
sns.countplot(x='City',data=df,hue='Book_table',palette='magma')
sns.set_context('paper',font_scale=2,rc={'font.size':15,'axes.titlesize':15,'axes.labelsize':15})
plt.title('Number of Restaurants that Allow Table Bookings')
plt.xticks(rotation=90)
plt.show()

In [None]:
plt.figure(figsize=(16,5))
sns.barplot(x='City',y='Avg_cost',data=df,palette='magma',
            order=df[['Avg_cost','City']].groupby(['City']).mean().sort_values('Avg_cost',ascending=False).head(50).index)
plt.xticks(rotation=90)
plt.title('Bar Plot of city vs Average Cost')
plt.show()

#### Analysis: Areas like Church Street and Brigade Road have many high-end restaurants, while Koramangala and Bellandur have restaurants suited to a middle-class family. Basavanagudi and Banashankari have restaurants with a relatively low cost.

In [None]:
plt.figure(figsize=(16,5))
sns.barplot(x='City',y='Rate',data=df,palette='magma',
           order=df[['Rate','City']].groupby(['City']).mean().sort_values('Rate',ascending=False).head(50).index)
sns.set_context('paper',font_scale=2,rc={'font.size':15,'axes.titlesize':15,'axes.labelsize':15})
plt.title('Bar plot of City vs Rate')
plt.xticks(rotation=45,horizontalalignment='right',
        fontweight='light',
        fontsize='x-small')
plt.show()

#### Analysis: High-end restaurants tend to have a higher rating than other restaurants.

In [None]:
# Work on a copy so the original dataframe stays untouched
a = df.copy()
# These entries describe service or ambience rather than dishes, so blank them out
not_dishes = ['Friendly Staff', 'Rooftop Ambience']
a.loc[a['Dish_liked'].isin(not_dishes), 'Dish_liked'] = None

In [None]:
# plotting the top 10 dishes liked by people 
plt.figure(figsize=(16,8))
sns.countplot(x='Dish_liked',data=a,palette='magma',
           order=a['Dish_liked'].value_counts().head(10).index)
sns.set_context('paper',font_scale=2,rc={'fontsize':15,'axes.titlesize':15,'axes.labelsize':15})
plt.title('Most Famous Dishes')
plt.xticks(rotation=45,horizontalalignment='right',
        fontweight='light',
        fontsize='x-small')
plt.show()

#### Analysis: Biryani is a clear winner in terms of favourite dishes. Waffles and Parathas are other likeable food options.
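
Counting whole `Dish_liked` values treats each cell as one item. If a cell holds several comma-separated dishes (an assumption about the data that should be checked first), each dish can be counted separately. A minimal sketch on made-up values follows; `dishes_demo` and `dish_counts` are illustrative names.

```python
import pandas as pd

# Made-up cells; each may list several dishes separated by commas.
dishes_demo = pd.Series(["Biryani, Waffles", "Biryani", None, "Paratha, Biryani"])

dish_counts = (
    dishes_demo.dropna()      # ignore restaurants with no liked dishes recorded
        .str.split(",")       # one list of dishes per restaurant
        .explode()            # one row per individual dish
        .str.strip()          # remove the space left after each comma
        .value_counts()
)
print(dish_counts)
```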

In [None]:
plt.figure(figsize=(16,8))
sns.scatterplot(x='Rate',y='Votes',data=df)
plt.show()

#### Analysis: The rating of a restaurant tends to increase with the number of votes it receives from customers.
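
A correlation coefficient gives a rough measure of how strong this relationship is. The sketch below uses made-up values and computes both the Pearson correlation and a rank-based variant (Pearson on ranks), which is less sensitive to a few restaurants with very large vote counts. The names `paired`, `pearson` and `rank_corr` are illustrative.

```python
import pandas as pd

# Made-up ratings and vote counts; None stands for an unrated restaurant.
demo = pd.DataFrame({
    "Rate": [3.0, 3.5, 4.0, 4.5, None],
    "Votes": [10, 50, 400, 3000, 7],
})

# Keep only rows where both values are present.
paired = demo.dropna(subset=["Rate", "Votes"])

# Linear (Pearson) correlation.
pearson = paired["Rate"].corr(paired["Votes"])

# Rank-based correlation: Pearson applied to the ranks of each column.
rank_corr = paired["Rate"].rank().corr(paired["Votes"].rank())

print(f"Pearson: {pearson:.2f}, rank-based: {rank_corr:.2f}")
```

A positive coefficient only indicates that the two quantities move together; it does not show that more votes cause higher ratings.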

In [None]:
plt.figure(figsize=(16,8))
sns.scatterplot(x='Rate',y='Avg_cost',data=df)
plt.show()

#### Analysis: Expensive restaurants tend to get a higher rating, while mid-range restaurants are also capable of earning decent ratings.

In [None]:
plt.figure(figsize=(10,6))
a=df['Online_Order'].value_counts()
plt.pie(a,explode=(0.0,0.1),labels=a.index,autopct='%1.1f%%',shadow=True)
plt.title('Online Order by Restaurants')
plt.show()

## Conclusion

### Business Insights
1. Frazer Town, Koramangala and Malleshwaram are good places to start a restaurant.
2. Biryani and waffles are among the most liked dishes.
3. Offering online ordering can boost a restaurant's popularity and ratings.
4. A price of around 1000 for two can be a viable starting point for a new restaurant.
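
Insights about locations such as the first point can be cross-checked with a per-area summary table. The following is a minimal sketch on made-up rows (the numbers are invented, not results from the dataset) that combines the number of restaurants, the mean rating and the mean cost for each area.

```python
import pandas as pd

# Made-up rows; the values are for illustration only.
demo = pd.DataFrame({
    "City": ["BTM", "BTM", "Frazer Town", "Frazer Town", "Malleshwaram"],
    "Rate": [3.6, 3.8, 4.1, 4.3, 4.0],
    "Avg_cost": [400, 500, 600, 800, 450],
})

# One row per area, sorted so the best-rated areas come first.
city_summary = (
    demo.groupby("City")
        .agg(restaurants=("Rate", "size"),
             mean_rate=("Rate", "mean"),
             mean_cost=("Avg_cost", "mean"))
        .sort_values("mean_rate", ascending=False)
)
print(city_summary)
```

Reading the three columns together shows whether a well-rated area is also crowded or expensive, which a single bar plot cannot show.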