# Exploratory Data Analysis

In this notebook, we perform exploratory data analysis (EDA) on restaurant review and popularity data collected from Google Maps for Bandar Lampung. The goal is to understand the structure of the data, identify trends, and visualize key insights.

In [1]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set visualization style (set_theme is the current name for sns.set)
sns.set_theme(style='whitegrid')

In [2]:
# Load the processed data
data_path = '../data/processed/processed_data.csv'
df = pd.read_csv(data_path)

# Display the first few rows of the dataframe
df.head()

In [3]:
# Summary statistics
df.describe()
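
By default, `df.describe()` reports numeric columns only and leaves missing values out of its statistics, so a per-column completeness check is a useful companion. The following is a minimal sketch, assuming the `rating` and `number_of_reviews` columns used elsewhere in this notebook. The `summarize_quality` helper and the toy rows are hypothetical and only stand in for the real `df`.

```python
import numpy as np
import pandas as pd


def summarize_quality(frame: pd.DataFrame) -> pd.DataFrame:
    """Return dtype, missing count, and missing share for each column."""
    return pd.DataFrame({
        'dtype': frame.dtypes.astype(str),
        'n_missing': frame.isna().sum(),
        'pct_missing': (frame.isna().mean() * 100).round(1),
    })


# Toy rows for illustration only (not real data); pass the real `df` instead.
toy = pd.DataFrame({
    'rating': [4.5, 3.8, np.nan, 4.1],
    'number_of_reviews': [120, 15, 40, 980],
})
quality = summarize_quality(toy)
print(quality)
```

In the notebook itself, calling `summarize_quality(df)` would show whether any column needs cleaning before the plots are interpreted.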

In [4]:
# Visualize the distribution of ratings
plt.figure(figsize=(10, 6))
sns.histplot(df['rating'], bins=10, kde=True)
plt.title('Distribution of Restaurant Ratings')
plt.xlabel('Rating')
plt.ylabel('Frequency')
plt.show()
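
A histogram's shape depends on the bin choice, so it can help to tabulate the same distribution with explicit bin edges. The sketch below is one possible approach, assuming ratings lie on the 1 to 5 scale that Google Maps uses. The `rating_band_counts` helper, the 0.5 band width, and the toy ratings are hypothetical choices for illustration.

```python
import numpy as np
import pandas as pd


def rating_band_counts(ratings: pd.Series, width: float = 0.5) -> pd.Series:
    """Count ratings per fixed-width band on an assumed 1-5 scale."""
    # Edges run from 1.0 to 5.0 inclusive in steps of `width`.
    edges = np.arange(1.0, 5.0 + width, width)
    # include_lowest=True keeps a rating of exactly 1.0 in the first band.
    bands = pd.cut(ratings, bins=edges, include_lowest=True)
    # sort=False preserves band order instead of sorting by count.
    return bands.value_counts(sort=False)


# Toy ratings for illustration only (not real data); pass df['rating'] instead.
toy_ratings = pd.Series([4.5, 3.8, 4.1, 4.9, 4.5, 2.0])
counts = rating_band_counts(toy_ratings)
print(counts)
```

Missing ratings fall into no band and are excluded from the counts.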

In [5]:
# Analyze the relationship between ratings and number of reviews
plt.figure(figsize=(10, 6))
sns.scatterplot(x='number_of_reviews', y='rating', data=df)
plt.title('Ratings vs Number of Reviews')
plt.xlabel('Number of Reviews')
plt.ylabel('Rating')
plt.show()
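
A scatter plot shows the relationship visually, and a rank correlation can put a number on it. Review counts are often heavily skewed, so the sketch below uses Spearman's coefficient, computed as the Pearson correlation of the ranks. It assumes the `number_of_reviews` and `rating` columns from the plot above. The `spearman_corr` helper and the toy rows are hypothetical.

```python
import pandas as pd


def spearman_corr(frame: pd.DataFrame,
                  x: str = 'number_of_reviews',
                  y: str = 'rating') -> float:
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    # Drop rows where either value is missing so the ranks stay aligned.
    clean = frame[[x, y]].dropna()
    return clean[x].rank().corr(clean[y].rank())


# Toy rows for illustration only (not real data); pass the real `df` instead.
toy = pd.DataFrame({
    'number_of_reviews': [10, 100, 1000, 5000, 50],
    'rating': [3.5, 4.0, 4.2, 4.8, 3.9],
})
rho = spearman_corr(toy)
print(f'Spearman rho: {rho:.2f}')
```

A value near 0 would suggest little monotonic association between review volume and rating, while values near 1 or -1 indicate a strong one. The toy rows are perfectly monotonic by construction and say nothing about the real data.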

## Conclusion

In this exploratory analysis, we visualized the distribution of restaurant ratings and examined how ratings relate to the number of reviews. Further analysis, such as segmenting restaurants by review volume or checking the data for missing values, could yield additional insights.