Koder-khalil/sql_clean_data

sql_clean_data

In this project I try to clean data and practice SQL commands.

Housing Data Cleaning

This project demonstrates how to clean and transform raw housing data using SQL.

Overview

The dataset used is housing_data. The cleaning process involves:

  1. Handling missing values in propertyaddress.
  2. Splitting propertyaddress and owneraddress into separate columns (address, city, state).
  3. Standardizing values in the soldasvacant column (converting Y/N to Yes/No).
  4. Checking that the saleprice column contains no NULL values.
  5. Removing duplicate records based on multiple attributes.
  6. Creating a clean view of the dataset (clean_data).
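
The steps above can be sketched end to end as follows. This is a minimal illustration, not the contents of data_cleaning.sql: it runs SQLite through Python's standard `sqlite3` module on four made-up rows. The `uniqueid` and `parcelid` columns, the choice of duplicate-detection attributes, and the output column names (`property_street`, `owner_city`, and so on) are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample rows; the real housing_data table has more columns.
CREATE TABLE housing_data (
    uniqueid INTEGER, parcelid TEXT, propertyaddress TEXT,
    owneraddress TEXT, soldasvacant TEXT, saleprice INTEGER
);
INSERT INTO housing_data VALUES
    (1, 'P1', '10 Oak St, Nashville', '10 Oak St, Nashville, TN', 'Y',  100000),
    (2, 'P1', NULL,                   '10 Oak St, Nashville, TN', 'No', 120000),
    (3, 'P2', '5 Elm Ave, Franklin',  '9 Pine Rd, Franklin, TN',  'N',  200000),
    (4, 'P2', '5 Elm Ave, Franklin',  '9 Pine Rd, Franklin, TN',  'N',  200000);

-- Step 1: fill a missing propertyaddress from another row with the same
-- parcelid (assumes a parcel always maps to one address).
UPDATE housing_data
SET propertyaddress = (
    SELECT b.propertyaddress FROM housing_data AS b
    WHERE b.parcelid = housing_data.parcelid
      AND b.propertyaddress IS NOT NULL
    LIMIT 1)
WHERE propertyaddress IS NULL;

-- Step 3: standardize Y/N to Yes/No, leaving other values unchanged.
UPDATE housing_data
SET soldasvacant = CASE soldasvacant
                       WHEN 'Y' THEN 'Yes'
                       WHEN 'N' THEN 'No'
                       ELSE soldasvacant
                   END;

-- Step 5: keep the lowest uniqueid in each group of identical rows
-- (the grouping attributes are an assumption).
DELETE FROM housing_data
WHERE uniqueid NOT IN (
    SELECT MIN(uniqueid) FROM housing_data
    GROUP BY parcelid, propertyaddress, owneraddress, soldasvacant, saleprice);

-- Steps 2 and 6: split the comma-separated addresses inside a view.
-- Assumes every address contains the expected number of commas.
CREATE VIEW clean_data AS
SELECT uniqueid, parcelid,
       trim(substr(propertyaddress, 1, instr(propertyaddress, ',') - 1)) AS property_street,
       trim(substr(propertyaddress, instr(propertyaddress, ',') + 1))    AS property_city,
       trim(substr(owneraddress, 1, instr(owneraddress, ',') - 1))       AS owner_street,
       trim(substr(owner_rest, 1, instr(owner_rest, ',') - 1))           AS owner_city,
       trim(substr(owner_rest, instr(owner_rest, ',') + 1))              AS owner_state,
       soldasvacant, saleprice
FROM (SELECT *, substr(owneraddress, instr(owneraddress, ',') + 1) AS owner_rest
      FROM housing_data);
""")

# Step 4: count rows whose saleprice is NULL (expected to be zero).
null_prices = conn.execute(
    "SELECT COUNT(*) FROM housing_data WHERE saleprice IS NULL"
).fetchone()[0]

rows = conn.execute("SELECT * FROM clean_data ORDER BY uniqueid").fetchall()
for row in rows:
    print(row)
```

PostgreSQL and MySQL offer other string functions for the split (for example `split_part` and `SUBSTRING_INDEX` respectively), so the view definition would likely look different in those dialects.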

Output

The final cleaned dataset can be exported as:

housing_clean.csv

Export Example

PostgreSQL

\copy (SELECT * FROM clean_data) TO 'housing_clean.csv' CSV HEADER;

MySQL

SELECT *
INTO OUTFILE '/path/to/housing_clean.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM clean_data;

SQLite

.headers on
.mode csv
.output housing_clean.csv
SELECT * FROM clean_data;
.output stdout
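
As an alternative to the client-specific commands above, the export can also be scripted. The sketch below uses Python's standard `sqlite3` and `csv` modules; `export_query_to_csv` is a hypothetical helper written for this example, and the three-column `clean_data` table is a stand-in for the real view.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, out):
    """Write the result of `query` to the file-like object `out` as CSV."""
    cur = conn.execute(query)
    writer = csv.writer(out)
    # cursor.description holds one tuple per column; item 0 is the name.
    writer.writerow([col[0] for col in cur.description])
    writer.writerows(cur)


# Stand-in for the real clean_data view (hypothetical columns and rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clean_data (uniqueid INTEGER, property_city TEXT, saleprice INTEGER);
INSERT INTO clean_data VALUES (1, 'Nashville', 100000), (2, 'Franklin', 200000);
""")

buffer = io.StringIO()
export_query_to_csv(conn, "SELECT * FROM clean_data ORDER BY uniqueid", buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write housing_clean.csv directly, pass a file opened with `open("housing_clean.csv", "w", newline="")` instead of the in-memory buffer.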

Files

  • data_cleaning.sql : SQL script for cleaning the Nashville housing dataset.
  • housing_clean.csv : Final cleaned dataset (to be generated after running the export).

Author

Data Cleaning SQL Project
