# Location-Based Analysis of Restaurant Data

## Step 1: Import Necessary Libraries

In [8]:
import pandas as pd

Description: This block imports pandas, the library used throughout the notebook to load, group, and summarize the data.

## Step 2: Load the Dataset

In [9]:
# Load the dataset
file_path = 'dataset.csv'
dataset = pd.read_csv(file_path)

# Display the first few rows of the dataset to understand its structure
dataset.head()


Unnamed: 0,Restaurant ID,Restaurant Name,Country Code,City,Address,Locality,Locality Verbose,Longitude,Latitude,Cuisines,...,Currency,Has Table booking,Has Online delivery,Is delivering now,Switch to order menu,Price range,Aggregate rating,Rating color,Rating text,Votes
0,6317637,Le Petit Souffle,162,Makati City,"Third Floor, Century City Mall, Kalayaan Avenu...","Century City Mall, Poblacion, Makati City","Century City Mall, Poblacion, Makati City, Mak...",121.027535,14.565443,"French, Japanese, Desserts",...,Botswana Pula(P),Yes,No,No,No,3,4.8,Dark Green,Excellent,314
1,6304287,Izakaya Kikufuji,162,Makati City,"Little Tokyo, 2277 Chino Roces Avenue, Legaspi...","Little Tokyo, Legaspi Village, Makati City","Little Tokyo, Legaspi Village, Makati City, Ma...",121.014101,14.553708,Japanese,...,Botswana Pula(P),Yes,No,No,No,3,4.5,Dark Green,Excellent,591
2,6300002,Heat - Edsa Shangri-La,162,Mandaluyong City,"Edsa Shangri-La, 1 Garden Way, Ortigas, Mandal...","Edsa Shangri-La, Ortigas, Mandaluyong City","Edsa Shangri-La, Ortigas, Mandaluyong City, Ma...",121.056831,14.581404,"Seafood, Asian, Filipino, Indian",...,Botswana Pula(P),Yes,No,No,No,4,4.4,Green,Very Good,270
3,6318506,Ooma,162,Mandaluyong City,"Third Floor, Mega Fashion Hall, SM Megamall, O...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.056475,14.585318,"Japanese, Sushi",...,Botswana Pula(P),No,No,No,No,4,4.9,Dark Green,Excellent,365
4,6314302,Sambo Kojin,162,Mandaluyong City,"Third Floor, Mega Atrium, SM Megamall, Ortigas...","SM Megamall, Ortigas, Mandaluyong City","SM Megamall, Ortigas, Mandaluyong City, Mandal...",121.057508,14.58445,"Japanese, Korean",...,Botswana Pula(P),Yes,No,No,No,4,4.8,Dark Green,Excellent,229


Description: This block reads dataset.csv into a DataFrame and displays the first five rows to show the available columns, including the location fields (City, Locality, Longitude, Latitude) used in the later steps.

## Step 3: Explore the Latitude and Longitude Coordinates

In [10]:
# Describe the latitude and longitude columns
lat_long_desc = dataset[['Latitude', 'Longitude']].describe()
lat_long_desc


         Latitude    Longitude
count      9551.0       9551.0
mean    25.854381    64.126574
std     11.007935    41.467058
min    -41.330428  -157.948486
25%     28.478713    77.081343
50%     28.570469    77.191964
75%     28.642758    77.282006
max      55.97698   174.832089


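Description: This block summarizes the latitude and longitude columns for all 9,551 restaurants. The quartiles fall in a narrow band (latitude 28.48 to 28.64, longitude 77.08 to 77.28), which corresponds to the Delhi region, while the minimum and maximum values show that the dataset also contains locations far outside that region. Those distant locations pull the mean longitude (64.13) well below the median (77.19).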
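Summary statistics alone can hide unusable coordinates, so a quick validity check may be worth running before any mapping. The sketch below uses a small made-up DataFrame rather than the real dataset. The helper name flag_suspect_coordinates is hypothetical, and treating a (0, 0) pair as a placeholder for a missing location is an assumption that the source does not confirm.

```python
import pandas as pd

# Made-up sample rows; in the notebook, `dataset` would be used instead.
sample = pd.DataFrame({
    "Restaurant Name": ["A", "B", "C", "D"],
    "Latitude": [28.57, 0.0, 95.0, -41.33],
    "Longitude": [77.19, 0.0, 77.2, 174.83],
})


def flag_suspect_coordinates(df):
    """Return a boolean Series marking rows whose coordinates look unusable."""
    # Valid latitudes lie in [-90, 90] and valid longitudes in [-180, 180].
    out_of_range = (
        ~df["Latitude"].between(-90, 90) | ~df["Longitude"].between(-180, 180)
    )
    # Assumption: a (0, 0) pair is a placeholder for a missing location.
    zero_pair = (df["Latitude"] == 0) & (df["Longitude"] == 0)
    return out_of_range | zero_pair


suspect = flag_suspect_coordinates(sample)
print(sample[suspect])
```

Rows flagged this way could be excluded or inspected by hand before the coordinates are used for anything location-specific.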

## Step 4: Group Restaurants by City and Analyze Concentration

In [11]:
# Group by city and count the number of restaurants in each city
city_group = dataset.groupby('City').size().reset_index(name='Number of Restaurants')
city_group = city_group.sort_values(by='Number of Restaurants', ascending=False)
city_group


            City  Number of Restaurants
88     New Delhi                   5473
50       Gurgaon                   1118
89         Noida                   1080
43     Faridabad                    251
48     Ghaziabad                     25
...          ...                    ...
37   Dicky Beach                      1
68          Lorn                      1
107  Quezon City                      1
66       Lincoln                      1


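Description: This block groups the restaurants by city, counts the restaurants in each city, and sorts the result in descending order. New Delhi, Gurgaon, Noida, and Faridabad together account for 7,922 of the 9,551 restaurants (about 83%), so the dataset is heavily concentrated in the Delhi region.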
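Raw counts are easier to compare when expressed as shares of the total. The following is a minimal sketch on made-up rows, not the notebook's own method; it relies on value_counts with normalize=True, and the output column name is chosen here for illustration.

```python
import pandas as pd

# Made-up sample; in the notebook, dataset["City"] would be used instead.
sample = pd.DataFrame(
    {"City": ["New Delhi", "New Delhi", "New Delhi", "Gurgaon", "Noida"]}
)

# normalize=True returns fractions of the total instead of raw counts.
city_share = (
    sample["City"]
    .value_counts(normalize=True)
    .mul(100)
    .round(1)
    .rename("Share of Restaurants (%)")
)
print(city_share)
```

Applied to the full dataset, the same call would show directly how much of the data each city contributes.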


## Step 5: Calculate Statistics by City

In [12]:
# Calculate average ratings by city
avg_ratings = dataset.groupby('City')['Aggregate rating'].mean().reset_index(name='Average Rating')

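# Count the distinct values of the 'Cuisines' column per city. Each value is a
# comma-separated combination (e.g. "French, Japanese, Desserts"), so this counts
# distinct combinations, not individual cuisines.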
unique_cuisines = dataset.groupby('City')['Cuisines'].nunique().reset_index(name='Number of Unique Cuisines')

# Merge the statistics into a single DataFrame
city_stats = pd.merge(avg_ratings, unique_cuisines, on='City')

# Sort by average rating
city_stats = city_stats.sort_values(by='Average Rating', ascending=False)
city_stats


                 City  Average Rating  Number of Unique Cuisines
56         Inner City        4.900000                          2
107       Quezon City        4.800000                          1
73        Makati City        4.650000                          2
95         Pasig City        4.633333                          3
75   Mandaluyong City        4.625000                          4
...               ...             ...                        ...
88          New Delhi        2.438845                        892
83          Montville        2.400000                          1
78          Mc Millan        2.400000                          1
89              Noida        2.036204                        248


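Description: This block calculates the average rating and the number of distinct Cuisines entries for each city, then sorts the cities by average rating. Because each Cuisines entry is a comma-separated combination (for example, "French, Japanese, Desserts"), the second figure counts distinct combinations rather than individual cuisines.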
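To count individual cuisines instead of combinations, the Cuisines strings can be split and exploded before grouping. This is a sketch of one possible approach, not the notebook's original method; the sample rows are taken from the dataset preview in Step 2, and the Cuisine column name is introduced here for illustration.

```python
import pandas as pd

# Three rows taken from the dataset preview in Step 2.
sample = pd.DataFrame({
    "City": ["Makati City", "Makati City", "Mandaluyong City"],
    "Cuisines": ["French, Japanese, Desserts", "Japanese", "Japanese, Sushi"],
})

# nunique on the raw column counts distinct strings, i.e. combinations.
combos_per_city = sample.groupby("City")["Cuisines"].nunique()

# Split each string on commas, put one cuisine per row, and strip whitespace.
individual = (
    sample.assign(Cuisine=sample["Cuisines"].str.split(","))
    .explode("Cuisine")
    .assign(Cuisine=lambda df: df["Cuisine"].str.strip())
)
cuisines_per_city = individual.groupby("City")["Cuisine"].nunique()

print(combos_per_city)
print(cuisines_per_city)
```

The two counts can differ in either direction: many combinations may be built from a handful of cuisines, and a single combination may list several.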



## Step 6: Identify Interesting Insights or Patterns

In [13]:
# Insights from the data
top_cities_by_restaurants = city_group.head(5)
top_cities_by_ratings = city_stats.head(5)
top_cities_by_cuisines = city_stats.sort_values(by='Number of Unique Cuisines', ascending=False).head(5)

insights = {
    "Top 5 Cities by Number of Restaurants": top_cities_by_restaurants,
    "Top 5 Cities by Average Rating": top_cities_by_ratings,
    "Top 5 Cities by Number of Unique Cuisines": top_cities_by_cuisines
}

for insight, data in insights.items():
    print(f"\n{insight}\n{'-'*len(insight)}\n{data.to_string(index=False)}")



Top 5 Cities by Number of Restaurants
-------------------------------------
     City  Number of Restaurants
New Delhi                   5473
  Gurgaon                   1118
    Noida                   1080
Faridabad                    251
Ghaziabad                     25

Top 5 Cities by Average Rating
------------------------------
            City  Average Rating  Number of Unique Cuisines
      Inner City        4.900000                          2
     Quezon City        4.800000                          1
     Makati City        4.650000                          2
      Pasig City        4.633333                          3
Mandaluyong City        4.625000                          4

Top 5 Cities by Number of Unique Cuisines
-----------------------------------------
     City  Average Rating  Number of Unique Cuisines
New Delhi        2.438845                        892
  Gurgaon        2.651431                        362
    Noida        2.036204                        248
Farid

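Description: This block prints three rankings: the top 5 cities by number of restaurants, by average rating, and by number of distinct cuisine combinations. New Delhi, Gurgaon, and Noida lead both the restaurant and the cuisine rankings, whereas the rating ranking is led by cities with very few listings; Quezon City, for example, has a single restaurant.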
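A ranking by average rating is more robust when cities with only a handful of restaurants are filtered out first. The sketch below illustrates the idea on made-up ratings; the MIN_RESTAURANTS threshold of 3 is a hypothetical value chosen for the example, not one taken from the source.

```python
import pandas as pd

# Made-up ratings; in the notebook, `dataset` would be used instead.
sample = pd.DataFrame({
    "City": ["Quezon City", "Lucknow", "Lucknow", "Lucknow",
             "Noida", "Noida", "Noida"],
    "Aggregate rating": [4.8, 4.5, 4.0, 4.1, 2.0, 3.0, 1.0],
})

MIN_RESTAURANTS = 3  # hypothetical threshold, chosen for illustration

# Compute the mean rating and the number of restaurants per city.
stats = sample.groupby("City")["Aggregate rating"].agg(["mean", "count"])

# Keep only cities with enough restaurants, then rank by mean rating.
ranked = stats[stats["count"] >= MIN_RESTAURANTS].sort_values(
    "mean", ascending=False
)
print(ranked)
```

With the threshold applied, the single-restaurant city drops out and the ranking reflects only cities with several rated listings.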

## Detailed City-Level Restaurant Statistics


In [15]:
import pandas as pd

# Load the dataset
file_path = 'dataset.csv'
dataset = pd.read_csv(file_path)

# Calculate detailed statistics by city
city_stats_detailed = dataset.groupby('City').agg(
    Number_of_Restaurants=pd.NamedAgg(column='Restaurant ID', aggfunc='count'),
    Average_Rating=pd.NamedAgg(column='Aggregate rating', aggfunc='mean'),
    Median_Rating=pd.NamedAgg(column='Aggregate rating', aggfunc='median'),
    Number_of_Cuisines=pd.NamedAgg(column='Cuisines', aggfunc='nunique'),
    Average_Price_Range=pd.NamedAgg(column='Price range', aggfunc='mean'),
    Total_Votes=pd.NamedAgg(column='Votes', aggfunc='sum')
).reset_index()

# Sort by number of restaurants
city_stats_detailed = city_stats_detailed.sort_values(by='Number_of_Restaurants', ascending=False)

# Display the detailed city-level restaurant statistics
print(city_stats_detailed.to_string(index=False))


                  City  Number_of_Restaurants  Average_Rating  Median_Rating  Number_of_Cuisines  Average_Price_Range  Total_Votes
             New Delhi                   5473        2.438845           3.10                 892             1.621597       628340
               Gurgaon                   1118        2.651431           3.20                 362             1.855993       132160
                 Noida                   1080        2.036204           2.80                 248             1.601852        73488
             Faridabad                    251        1.866932           2.80                  87             1.454183         6486
             Ghaziabad                     25        2.852000           3.20                  18             1.800000         2366
          Bhubaneshwar                     21        3.980952           4.00                  18             1.857143         4243
               Lucknow                     21        4.195238           4.20       

## Analyzing Ratings by Price Range

In [17]:
import pandas as pd

# Load the dataset
file_path = 'dataset.csv'
dataset = pd.read_csv(file_path)

# Calculate average and median ratings by price range
ratings_by_price = dataset.groupby('Price range').agg(
    Average_Rating=pd.NamedAgg(column='Aggregate rating', aggfunc='mean'),
    Median_Rating=pd.NamedAgg(column='Aggregate rating', aggfunc='median'),
    Number_of_Restaurants=pd.NamedAgg(column='Restaurant ID', aggfunc='count')
).reset_index()

# Display the ratings by price range
print(ratings_by_price.to_string(index=False))


 Price range  Average_Rating  Median_Rating  Number_of_Restaurants
           1        1.999887            2.9                   4444
           2        2.941054            3.3                   3113
           3        3.683381            3.8                   1408
           4        3.817918            3.9                    586
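In the table above, the mean rating sits well below the median in the lower price ranges (1.999887 against 2.9 for price range 1). One possible explanation, which the source does not confirm, is that unrated restaurants are stored with a rating of 0 and drag the mean down. Under that assumption, the sketch below compares the mean over all rows with the mean over rated rows only, using made-up data.

```python
import pandas as pd

# Made-up sample; in the notebook, `dataset` would be used instead.
sample = pd.DataFrame({
    "Price range": [1, 1, 1, 1, 2, 2],
    "Aggregate rating": [0.0, 0.0, 3.0, 4.0, 4.0, 0.0],
})

# Assumption: a rating of 0 marks an unrated restaurant.
rated = sample[sample["Aggregate rating"] > 0]

# Put both means side by side, indexed by price range.
comparison = pd.DataFrame({
    "Mean (all rows)": sample.groupby("Price range")["Aggregate rating"].mean(),
    "Mean (rated only)": rated.groupby("Price range")["Aggregate rating"].mean(),
})
print(comparison)
```

If the two columns diverge strongly on the real data, the zero ratings deserve a closer look before any conclusion about price and quality is drawn.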


## Exporting the Results to an HTML Report

In [19]:
import pandas as pd

# Load the dataset
file_path = 'dataset.csv'
dataset = pd.read_csv(file_path)

# Calculate detailed statistics by city
city_stats_detailed = dataset.groupby('City').agg(
    Number_of_Restaurants=pd.NamedAgg(column='Restaurant ID', aggfunc='count'),
    Average_Rating=pd.NamedAgg(column='Aggregate rating', aggfunc='mean'),
    Median_Rating=pd.NamedAgg(column='Aggregate rating', aggfunc='median'),
    Number_of_Cuisines=pd.NamedAgg(column='Cuisines', aggfunc='nunique'),
    Average_Price_Range=pd.NamedAgg(column='Price range', aggfunc='mean'),
    Total_Votes=pd.NamedAgg(column='Votes', aggfunc='sum')
).reset_index()

# Sort by number of restaurants
city_stats_detailed = city_stats_detailed.sort_values(by='Number_of_Restaurants', ascending=False)

# Calculate average and median ratings by price range
ratings_by_price = dataset.groupby('Price range').agg(
    Average_Rating=pd.NamedAgg(column='Aggregate rating', aggfunc='mean'),
    Median_Rating=pd.NamedAgg(column='Aggregate rating', aggfunc='median'),
    Number_of_Restaurants=pd.NamedAgg(column='Restaurant ID', aggfunc='count')
).reset_index()

# Create an HTML file and save the data
html_content = f"""
<!DOCTYPE html>
<html>
<head>
    <title>Restaurant Analysis Report</title>
    <style>
        body {{
            font-family: Arial, sans-serif;
        }}
        h1, h2 {{
            color: #2E4053;
        }}
        table {{
            width: 100%;
            border-collapse: collapse;
        }}
        table, th, td {{
            border: 1px solid black;
        }}
        th, td {{
            padding: 8px;
            text-align: left;
        }}
        th {{
            background-color: #f2f2f2;
        }}
    </style>
</head>
<body>
    <h1>Restaurant Analysis Report</h1>

    <h2>Detailed City-Level Restaurant Statistics</h2>
    {city_stats_detailed.to_html(index=False, classes='table table-striped')}

    <h2>Ratings by Price Range</h2>
    {ratings_by_price.to_html(index=False, classes='table table-striped')}
</body>
</html>
"""

# Save the HTML content to a file with UTF-8 encoding
with open('restaurant_analysis_report.html', 'w', encoding='utf-8') as file:
    file.write(html_content)

print("HTML report has been generated and saved successfully.")


HTML report has been generated and saved successfully.
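The report prints ratings with six decimal places, which is more precision than the data supports. As an optional refinement, to_html accepts a float_format function that controls how float cells are rendered. The sketch below uses two rows from the price-range table; rounding to two decimals is a choice made here, not part of the original report.

```python
import pandas as pd

# Two rows from the price-range table, standing in for ratings_by_price.
sample = pd.DataFrame({
    "Price range": [1, 2],
    "Average_Rating": [1.999887, 2.941054],
})

# float_format takes a one-argument function applied to every float cell.
table_html = sample.to_html(index=False, float_format="{:.2f}".format)
print(table_html)
```

Integer columns such as Price range are left unchanged, since the formatter only applies to floats.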
