This project, titled "Data Cleaning and Visualization of Zomato Restaurant Data", analyzes and visualizes a dataset of restaurant information from the Zomato platform. The analysis involves several key steps to prepare the data for visualization and extract meaningful insights.
- Data Loading and Initial Exploration: The notebook begins by loading the `zomato.csv` file into a pandas DataFrame and then performs initial checks on its shape, columns, and data types.
- Data Cleaning and Preprocessing: A significant portion of the notebook is dedicated to cleaning the data. This includes:
  - Dropping irrelevant columns such as `url`, `menu_item`, `reviews_list`, `dish_liked`, `address`, and `phone` to streamline the dataset.
  - Removing duplicate entries to ensure data integrity.
  - Cleaning and transforming the `rate` column by converting it to a numerical format and treating 'NEW' and '-' values as missing data, which are then filled with the mean rate of the dataset.
  - Standardizing the `approx_cost(for two people)` column by removing commas and converting the values to a float data type.
  - Grouping restaurant types with low counts into an "others" category to simplify the `rest_type` column for better visualization.
- Data Visualization: The cleaned data is then used to create various visualizations (the plots themselves are not included in the provided output, but the code indicates their creation).
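The cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `clean_zomato`, the `min_count` threshold, the tiny in-memory `raw` frame standing in for `zomato.csv`, and the assumption that ratings are stored as strings such as `"4.1/5"` are all hypothetical.

```python
import numpy as np
import pandas as pd

# Tiny made-up stand-in for zomato.csv so the sketch runs without the real file.
# With the real data this would be something like: raw = pd.read_csv("zomato.csv")
raw = pd.DataFrame({
    "name": ["A", "A", "B", "C", "D", "E"],
    "url": ["u1", "u1", "u2", "u3", "u4", "u5"],
    "phone": ["1", "1", "2", "3", "4", "5"],
    "rate": ["4.1/5", "4.1/5", "NEW", "-", "3.5 /5", None],
    "approx_cost(for two people)": ["800", "800", "1,200", "300", None, "2,500"],
    "rest_type": ["Casual Dining", "Casual Dining", "Cafe",
                  "Casual Dining", "Kiosk", "Cafe"],
})

# Initial exploration: shape, column names, and data types.
print(raw.shape)
print(list(raw.columns))
print(raw.dtypes)

DROP_COLS = ["url", "menu_item", "reviews_list", "dish_liked", "address", "phone"]
COST_COL = "approx_cost(for two people)"


def parse_rate(value):
    """Turn a rating string into a float; 'NEW', '-' and missing become NaN."""
    if pd.isna(value):
        return np.nan
    text = str(value).strip()
    if text in {"NEW", "-"}:
        return np.nan
    # Assumption: ratings look like "4.1/5"; keep the part before the slash.
    return float(text.split("/")[0])


def clean_zomato(frame, min_count=2):
    """Apply the cleaning steps described above (min_count is hypothetical)."""
    # Drop columns not needed for analysis; ignore any that are absent.
    df = frame.drop(columns=DROP_COLS, errors="ignore")
    # Remove exact duplicate rows.
    df = df.drop_duplicates().reset_index(drop=True)
    # Convert ratings to numbers, then fill missing values with the mean.
    df["rate"] = df["rate"].apply(parse_rate)
    df["rate"] = df["rate"].fillna(df["rate"].mean())
    # Strip thousands separators and convert the cost column to float.
    cost_text = df[COST_COL].str.replace(",", "", regex=False)
    df[COST_COL] = pd.to_numeric(cost_text).astype(float)
    # Fold restaurant types with fewer than min_count rows into "others".
    counts = df["rest_type"].value_counts()
    rare = counts[counts < min_count].index
    df["rest_type"] = df["rest_type"].where(~df["rest_type"].isin(rare), "others")
    return df


cleaned = clean_zomato(raw)
print(cleaned)
```

Filling missing ratings with the mean keeps every row available for plotting, at the cost of pulling the rating distribution toward its center; the threshold for what counts as a "low count" restaurant type would need to be chosen against the real data.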
The project uses the following Python libraries for data manipulation and visualization:
- pandas for data loading and manipulation
- numpy for numerical operations
- matplotlib.pyplot for plotting and visualization
- seaborn for creating statistical graphics
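Because the notebook's plots are not shown in the provided output, the following is only a hypothetical sketch of how these libraries might be combined on the cleaned data. The sample values, the choice of a count plot, and the axis labels are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows shaped like the cleaned dataset; not real Zomato data.
cleaned = pd.DataFrame({
    "rest_type": ["Casual Dining", "Cafe", "others", "Casual Dining", "Cafe"],
    "rate": [4.1, 3.8, 3.5, 3.9, 4.3],
})

# One bar per restaurant type, showing how many rows fall into each category.
fig, ax = plt.subplots(figsize=(6, 4))
sns.countplot(data=cleaned, x="rest_type", ax=ax)
ax.set_title("Restaurants per type")
ax.set_xlabel("Restaurant type")
ax.set_ylabel("Count")
fig.tight_layout()
```

In a notebook, the figure would typically be rendered inline or with `plt.show()`; the `Agg` backend line can be dropped there.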
Taken together, the notebook works through a complete data cleaning and visualization workflow, showing how a raw dataset can be prepared for analysis before any insights are drawn from it.