Raviteja0710/Data-Analysis-using-Python-Libraries

Data Cleaning and Visualization of Zomato Restaurant Data

This project cleans, analyzes, and visualizes a dataset of restaurant information from Zomato.

Project Overview

The notebook works on a dataset of restaurant information from the Zomato platform in three stages: loading and initial exploration, cleaning and preprocessing, and visualization.

Key Features of the Notebook

  • Data Loading and Initial Exploration: The notebook begins by loading the zomato.csv file into a pandas DataFrame and then performs initial checks on its shape, columns, and data types.
  • Data Cleaning and Preprocessing: A significant portion of the notebook is dedicated to cleaning the data. This includes:
    • Dropping irrelevant columns such as url, menu_item, reviews_list, dish_liked, address, and phone to streamline the dataset.
    • Removing duplicate entries to ensure data integrity.
    • Cleaning and transforming the rate column by converting it to a numerical format and handling 'NEW' and '-' values as missing data, which are then filled with the mean rate of the dataset.
    • Standardizing the approx_cost(for two people) column by removing commas and converting the values to a float data type.
    • Grouping restaurant types with low counts into an "others" category to simplify the rest_type column for better visualization.
  • Data Visualization: The cleaned data is then used to create various visualizations. The rendered plots are not included in the saved notebook output, but the notebook contains the code that creates them.
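
The cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name clean_zomato, the min_rest_type_count threshold and its default, and the assumption that raw ratings look like "4.1/5" are all hypothetical. The column names are the ones given in the list. A small in-memory sample stands in for zomato.csv so the snippet runs on its own.

```python
import pandas as pd

# Columns the README lists as irrelevant.
DROP_COLUMNS = ["url", "menu_item", "reviews_list", "dish_liked", "address", "phone"]
COST_COLUMN = "approx_cost(for two people)"


def clean_zomato(df: pd.DataFrame, min_rest_type_count: int = 1000) -> pd.DataFrame:
    """Hypothetical helper mirroring the cleaning steps described above."""
    # Drop irrelevant columns (silently skipping any that are absent),
    # then remove rows that are exact duplicates.
    df = df.drop(columns=DROP_COLUMNS, errors="ignore")
    df = df.drop_duplicates().reset_index(drop=True)

    # rate: treat 'NEW' and '-' as missing, keep the part before '/'
    # (assumed format "4.1/5"), convert to float, fill gaps with the mean.
    rate = df["rate"].astype(str).str.strip()
    rate = rate.where(~rate.isin(["NEW", "-"]))
    rate = rate.str.split("/").str[0].str.strip()
    df["rate"] = pd.to_numeric(rate, errors="coerce")
    df["rate"] = df["rate"].fillna(df["rate"].mean())

    # Cost: remove thousands separators and convert to float.
    cost = df[COST_COLUMN].astype(str).str.replace(",", "", regex=False)
    df[COST_COLUMN] = pd.to_numeric(cost, errors="coerce").astype(float)

    # rest_type: fold categories below the (assumed) threshold into "others".
    counts = df["rest_type"].value_counts()
    rare = counts[counts < min_rest_type_count].index
    df["rest_type"] = df["rest_type"].where(~df["rest_type"].isin(rare), "others")
    return df


# Tiny made-up sample; with the real data one would start from
# pd.read_csv("zomato.csv") and inspect df.shape, df.columns, df.dtypes.
sample = pd.DataFrame({
    "name": ["A", "A", "B", "C", "D"],
    "url": ["u1", "u1", "u2", "u3", "u4"],
    "rate": ["4.0/5", "4.0/5", "NEW", "3.0/5", "-"],
    "approx_cost(for two people)": ["1,200", "1,200", "300", "800", "500"],
    "rest_type": ["Casual Dining", "Casual Dining", "Cafe", "Casual Dining", "Pub"],
})
cleaned = clean_zomato(sample, min_rest_type_count=2)
print(cleaned)
```

The threshold for "low counts" is left as a parameter because the description above does not state the cutoff the notebook uses.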

Libraries Used

The project uses the following Python libraries for data manipulation and visualization:

  • pandas for data loading and manipulation
  • numpy for numerical operations
  • matplotlib.pyplot for plotting and visualization
  • seaborn for creating statistical graphics

Conclusion

This notebook is a worked example of a data cleaning and visualization workflow. It takes the raw zomato.csv dataset through column pruning, deduplication, type conversion, and category grouping so that it is ready for plotting and analysis.
