SQL project demonstrating data cleaning techniques to transform raw datasets into clean, structured, and analysis-ready data.

Shivangkus/Data-Cleaning-in-SQL

Data Cleaning in SQL

Welcome to the Data Cleaning in SQL repository! This project demonstrates the application of SQL techniques to clean and prepare raw datasets for analysis. It serves as a practical example of how to transform messy data into structured, reliable information using SQL.

🧹 Project Overview

In this project, I focused on cleaning a raw dataset by addressing common data quality issues such as:

  • Removing duplicates
  • Handling missing or NULL values
  • Standardizing data formats
  • Correcting data inconsistencies

The goal was to prepare the dataset for further analysis, ensuring its integrity and reliability.
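
Before fixing anything, it helps to measure how common each issue is. The queries below are a minimal sketch of that profiling step, not the queries from this repository: the table `raw_sample` and its columns are hypothetical stand-ins, since the dataset's real schema is not listed in this README.

```sql
-- Hypothetical sample table (assumption: the real dataset has different columns).
DROP TABLE IF EXISTS raw_sample;
CREATE TABLE raw_sample (
    customer_name VARCHAR(100),
    country       VARCHAR(50)
);

INSERT INTO raw_sample (customer_name, country) VALUES
    ('Alice', 'United States'),
    ('Alice', 'United States'),   -- exact duplicate
    ('Bob',   'USA'),             -- same country, different spelling
    ('Carol', NULL);              -- missing value

-- Duplicates: any combination of values that appears more than once.
SELECT customer_name, country, COUNT(*) AS copies
FROM raw_sample
GROUP BY customer_name, country
HAVING COUNT(*) > 1;

-- Missing values: in MySQL a comparison evaluates to 1 or 0, so SUM counts the NULLs.
SELECT SUM(customer_name IS NULL) AS missing_names,
       SUM(country IS NULL)       AS missing_countries
FROM raw_sample;

-- Inconsistent formats: list every spelling in a column with its frequency.
SELECT country, COUNT(*) AS rows_with_value
FROM raw_sample
GROUP BY country
ORDER BY country;
```

Running checks like these before and after cleaning gives a simple way to confirm that each issue was actually resolved.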

🛠️ Tools & Technologies

  • SQL: Utilized SQL queries for data manipulation and cleaning.
  • MySQL Workbench: Executed SQL scripts and managed the database.
  • CSV Files: Worked with CSV files for importing and exporting data.

📁 Repository Structure

  • Dataset/: Contains the raw and cleaned datasets in CSV format.
  • Data Cleaning SQL Project Queries.sql: SQL script file with all the queries used for data cleaning tasks.

🔍 Key SQL Techniques Used

  • SELECT DISTINCT – To identify duplicate records and return a de-duplicated result set
  • IS NULL / IS NOT NULL – For detecting and handling missing values
  • UPDATE – To correct data inconsistencies
  • ALTER TABLE – For modifying table structures when necessary
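
The techniques above can be combined roughly as follows. This is a hedged sketch rather than the contents of the project's script: the tables `customers_staging` and `customers_clean`, their columns, and the sample rows are all hypothetical, and the date format `%m/%d/%Y` is an assumption about how the raw text dates look.

```sql
-- MySQL Workbench enables safe-update mode by default, which rejects
-- UPDATE/DELETE statements that lack a key-based WHERE clause.
SET SQL_SAFE_UPDATES = 0;

-- Hypothetical staging table (assumption: the real schema differs).
DROP TABLE IF EXISTS customers_staging;
CREATE TABLE customers_staging (
    customer_name VARCHAR(100),
    country       VARCHAR(50),
    signup_date   VARCHAR(20)    -- dates arrive as text in the raw file
);

INSERT INTO customers_staging (customer_name, country, signup_date) VALUES
    ('  Alice ', 'United States',  '03/15/2023'),
    ('  Alice ', 'United States',  '03/15/2023'),   -- exact duplicate
    ('Bob',      'United States.', '04/02/2023'),   -- stray trailing period
    ('Carol',    NULL,             '05/20/2023'),   -- missing country
    ('Dave',     '',               NULL);           -- blank country, missing date

-- 1. SELECT DISTINCT: copy only the unique rows into a working table,
--    leaving the raw data untouched.
DROP TABLE IF EXISTS customers_clean;
CREATE TABLE customers_clean AS
SELECT DISTINCT customer_name, country, signup_date
FROM customers_staging;

-- 2. UPDATE: standardize text by trimming whitespace and the trailing period.
UPDATE customers_clean
SET customer_name = TRIM(customer_name),
    country       = TRIM(TRAILING '.' FROM TRIM(country));

-- 3. IS NULL: treat blank strings as missing, then drop rows with
--    no usable country and no usable date.
UPDATE customers_clean
SET country = NULL
WHERE country = '';

DELETE FROM customers_clean
WHERE country IS NULL AND signup_date IS NULL;

-- 4. UPDATE + ALTER TABLE: convert the text dates to ISO format,
--    then change the column to a real DATE type.
UPDATE customers_clean
SET signup_date = STR_TO_DATE(signup_date, '%m/%d/%Y');

ALTER TABLE customers_clean
MODIFY COLUMN signup_date DATE;

-- Restore the Workbench default.
SET SQL_SAFE_UPDATES = 1;
```

Working on a copy (`customers_clean`) instead of the staging table is a design choice in this sketch: if a step goes wrong, the raw rows are still available to start over.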

🚀 Getting Started

To replicate this project:

  1. Clone the repository:
    git clone https://github.com/Shivangkus/Data-Cleaning-in-SQL.git
  2. Open the Data Cleaning SQL Project Queries.sql file in MySQL Workbench.
  3. Execute the SQL queries step by step to clean the dataset.
  4. Import the cleaned dataset into your preferred analysis tool.
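
If you prefer to move the CSV files in and out of MySQL with SQL statements, the sketch below shows one possible approach; it is not necessarily how the data was loaded for this project. The table `raw_import`, its columns, and both file paths are placeholders to replace with your own. `LOAD DATA LOCAL INFILE` needs the `local_infile` option enabled on client and server, and `INTO OUTFILE` writes on the server host, only to a directory allowed by `secure_file_priv`.

```sql
-- Hypothetical table; match the columns to the header row of your CSV file.
CREATE TABLE IF NOT EXISTS raw_import (
    customer_name VARCHAR(100),
    country       VARCHAR(50),
    signup_date   VARCHAR(20)
);

-- Import: load the raw CSV, skipping its header line.
-- Placeholder path. Use '\r\n' as the line terminator for files saved on Windows.
LOAD DATA LOCAL INFILE '/path/to/raw_dataset.csv'
INTO TABLE raw_import
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;

-- Export: write the rows back out as CSV once cleaning is done.
-- Placeholder path. The target file must not already exist.
SELECT customer_name, country, signup_date
INTO OUTFILE '/path/to/cleaned_dataset.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM raw_import;
```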

📈 Next Steps

After cleaning the data, you can proceed with:

  • Exploratory Data Analysis (EDA) – To uncover patterns and insights
  • Data Visualization – Using tools like Tableau or Power BI
  • Statistical Analysis – For deeper understanding and modeling

📄 License

This project is licensed under the MIT License.
