# Top Five Restaurants of Each Cuisine

I created a map to visualize the locations of the top five restaurants for each cuisine. I wanted to see whether highly rated restaurants that serve the same cuisine are located near each other. To qualify for the top five, a restaurant needed more than 50 reviews; among those, the restaurants with the highest ratings were chosen. If a tie-breaker was needed, the restaurant with more votes was chosen.

In [180]:
import pandas as pd

df = pd.read_csv('2020-XTern-DS.csv')

To find the top five restaurants, I needed the rating, number of votes, and number of reviews. I dropped any rows that were missing a value in any of these three columns, then converted the columns to float so that I could compare them. To make sure that each rating was backed by enough feedback, I limited the dataset to restaurants that had more than 50 reviews.

In [176]:
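# Treat Rating as text so that placeholder values can be filtered out
df['Rating'] = df['Rating'].astype(str)

# Drop rows whose rating is a placeholder rather than a number
df = df[~df['Rating'].str.contains('-')]
df = df[~df['Rating'].str.contains('NEW')]
df = df[~df['Rating'].str.contains('Opening Soon')]
df['Rating'] = df['Rating'].astype(float)

# Drop rows with no vote count, then convert the column to float
df = df[~df['Votes'].str.contains('-')]
df['Votes'] = df['Votes'].astype(float)

# Drop rows with no review count, convert to float,
# and keep only restaurants with more than 50 reviews
df = df[~df['Reviews'].str.contains('-')]
df['Reviews'] = df['Reviews'].astype(float)
df = df[df['Reviews'] > 50]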
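
The same cleaning step could also be sketched with `pd.to_numeric`, which turns any value that is not a number into `NaN` in one pass. This is only an illustration, not the method used in this notebook: the helper name `clean_numeric` and the toy rows are made up, and the behavior differs slightly from the filters above (for example, a negative number would be kept here, while a string containing `-` is dropped above).

```python
import pandas as pd


def clean_numeric(df, columns, min_reviews=50):
    """Coerce columns to numbers, drop rows that fail, keep rows above min_reviews."""
    out = df.copy()
    for col in columns:
        # Placeholders such as '-', 'NEW' or 'Opening Soon' become NaN
        out[col] = pd.to_numeric(out[col], errors='coerce')
    out = out.dropna(subset=columns)
    return out[out['Reviews'] > min_reviews]


# Toy rows standing in for the real file (all values are invented)
toy = pd.DataFrame({
    'Rating': ['4.2', 'NEW', '3.9', '-', 'Opening Soon', '4.5'],
    'Votes': ['120', '-', '80', '10', '-', '300'],
    'Reviews': ['60', '-', '40', '5', '-', '51'],
})
cleaned = clean_numeric(toy, ['Rating', 'Votes', 'Reviews'])
print(cleaned)
```

Only the first and last toy rows survive: the others either hold a placeholder or have 50 or fewer reviews.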

I looped through the dataset to build a dictionary with each cuisine as the key and the number of restaurants that serve that cuisine as the value. I appended all of the cuisines with 20 or fewer restaurants to a list so that I could remove them from the dictionary. I was looking for the top five, and I didn't want any restaurant to be placed there by default. I felt that a cuisine needed more than 20 restaurants for its top five to be meaningful.

In [177]:
cuisine_count = {}

# Count how many restaurants serve each cuisine
for row in df.itertuples():
    cuisines = row.Cuisines
    for cuisine in cuisines.split(", "):
        if cuisine in cuisine_count:
            cuisine_count[cuisine] += 1
        else:
            cuisine_count[cuisine] = 1

# Collect the cuisines served by 20 or fewer restaurants
cuisine_low_count = []
for cuisine in cuisine_count:
    if cuisine_count[cuisine] <= 20:
        cuisine_low_count.append(cuisine)

# Remove those cuisines so that only well-represented ones remain
for cuisine in cuisine_low_count:
    cuisine_count.pop(cuisine)
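
For comparison, the counting and filtering above can be written more compactly with `collections.Counter`. This is a minimal sketch on invented data; the function name `count_cuisines` and the `min_count` parameter are hypothetical and only stand in for the threshold of 20 used in this notebook.

```python
from collections import Counter


def count_cuisines(cuisine_lists, min_count=20):
    """Count restaurants per cuisine and keep cuisines with more than min_count."""
    counts = Counter()
    for cuisines in cuisine_lists:
        # Each entry is a comma-separated string such as "Pizza, Italian"
        counts.update(cuisines.split(", "))
    return {cuisine: n for cuisine, n in counts.items() if n > min_count}


# Toy data (invented): Pizza appears 3 times, Italian twice, Thai once
toy = ["Pizza, Italian"] * 2 + ["Pizza"] + ["Thai"]
kept = count_cuisines(toy, min_count=2)
print(kept)
```

With a threshold of 2, only the cuisine that appears more than twice is kept.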


I created the map and then added markers to it. This is also where I chose the top five restaurants of each cuisine by rating. When the restaurants tied at a rating would push the total past five, the tie-breaker kept the ones with more votes. Each cuisine has a different marker color so that its top five can be easily seen.

In [178]:
import folium

m = folium.Map(location=[39.553436, -85.589212], zoom_start=10)

colors = ['lightgray', 'blue', 'green', 'purple', 'orange', 'darkred', 'lightred', 'gray', 'darkblue', 'darkgreen', 'cadetblue', 'darkpurple', 'white', 'pink', 'lightblue', 'lightgreen', 'beige', 'black', 'red']
top_frames = []  # selected rows, one DataFrame per rating tier
for color_index, cuisine in enumerate(cuisine_count):
    # Literal substring match on the comma-separated cuisine list
    df_cuisine = df[df['Cuisines'].str.contains(cuisine, regex=False)]
    count = 0
    # Take restaurants from the highest rating downward until five are chosen
    while count < 5 and not df_cuisine.empty:
        max_rating = df_cuisine['Rating'].max()
        df_max = df_cuisine[df_cuisine['Rating'] == max_rating]
        # Tie-breaker: if this rating tier would overfill the top five,
        # keep only the restaurants with the most votes
        df_max = df_max.nlargest(5 - count, 'Votes')
        count += len(df_max)
        top_frames.append(df_max)
        for row in df_max.itertuples():
            popup = f'<i>Restaurant: {row.Restaurant} \n Rating: {row.Rating} \n Cuisine: {cuisine}</i>'
            folium.Marker(
                location=[row.Latitude, row.Longitude],
                # Cycle through the palette if there are more cuisines than colors
                icon=folium.Icon(color=colors[color_index % len(colors)]),
                popup=popup,
            ).add_to(m)
        # Move on to the next-highest rating
        df_cuisine = df_cuisine[df_cuisine['Rating'] != max_rating]

# Combine the selected rows into a single DataFrame
result_df = pd.concat(top_frames, ignore_index=True)
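
The selection rule described above (highest rating first, more votes breaking ties) can be checked on a small table. The sketch below is an assumption-laden illustration rather than the code that produced the map: `top_n` is a hypothetical helper, the restaurants are invented, and it sorts once instead of looping over rating tiers.

```python
import pandas as pd


def top_n(df_cuisine, n=5):
    """Return the n best rows: highest Rating first, more Votes winning ties."""
    ordered = df_cuisine.sort_values(['Rating', 'Votes'], ascending=[False, False])
    return ordered.head(n)


# Toy cuisine (invented): three restaurants at 4.9 and four at 4.8
toy = pd.DataFrame({
    'Restaurant': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],
    'Rating': [4.9, 4.9, 4.9, 4.8, 4.8, 4.8, 4.8],
    'Votes': [100, 300, 200, 50, 400, 75, 60],
})
top = top_n(toy)
print(top)
```

All three restaurants rated 4.9 are selected, and the remaining two places go to the 4.8-rated restaurants with the most votes.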


In [179]:
m

### Conclusion

I expected restaurants of the same cuisine to be spread apart from each other. Since the restaurants shown have the highest ratings for their cuisine, I predicted that they wouldn't be close to each other, because otherwise they would compete directly. From the map, I found this to be mostly true. I did find that Restaurants 18 and 7185, which both serve pizza, appear to be on the same block and have the same rating. Apart from these two, the top five restaurants of each cuisine appear to be fairly spaced out from each other.
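
The visual impression of spacing could be backed by numbers by computing the distance between every pair of top-five restaurants within a cuisine. The sketch below uses the haversine formula on `Latitude` and `Longitude` values. It was not run on the dataset: the helper names `haversine_km` and `pairwise_km` are hypothetical, and the coordinates are invented points near the map center.

```python
import math
from itertools import combinations


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def pairwise_km(points):
    """Distances for every pair of (name, lat, lon) tuples, closest pair first."""
    pairs = [
        (haversine_km(a[1], a[2], b[1], b[2]), a[0], b[0])
        for a, b in combinations(points, 2)
    ]
    return sorted(pairs)


# Hypothetical coordinates, not taken from the dataset
toy = [
    ('X', 39.5534, -85.5892),
    ('Y', 39.5534, -85.5892),
    ('Z', 39.6534, -85.5892),
]
distances = pairwise_km(toy)
for km, first, second in distances:
    print(f'{first}-{second}: {km:.2f} km')
```

Pairs at the top of the sorted list are the closest neighbors, which is where two restaurants on the same block would show up.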