kamibrenda / data_cleaning_sql Public

For all my sql projects and tutorials as i advance my sql skills

Repository contents (75 commits):

Data-exp-proj
hackerrank_ques
scripts
README.md

Data Cleaning Project with SQL

Objectives

Removing Duplicates
Standardising the data
Evaluating Null or blank values
Removing any unnecessary columns

Results

Display of duplicate records in the table

Redundant data is identified by the row number, which shows when the same record occurs more than once. These repeated records are then eliminated.
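A minimal sketch of this row-number approach follows. It assumes a hypothetical `staging` table with invented columns and rows, and it uses Python's built-in `sqlite3` module as a stand-in database so the example runs on its own. `ROW_NUMBER() OVER (PARTITION BY ...)` is standard SQL, while `rowid` is SQLite-specific, so another engine would need its own row identifier.

```python
import sqlite3

# Hypothetical rows: the second is an exact repeat of the first.
SAMPLE_ROWS = [
    ("Acme", "Retail", 100),
    ("Acme", "Retail", 100),
    ("Globex", "Finance", 50),
]

def deduplicate(rows):
    """Flag repeated records with ROW_NUMBER, delete them, and return both views."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staging (company TEXT, industry TEXT, total_laid_off INTEGER)"
    )
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?)", rows)

    # Number each record inside its group of identical values.
    # A row_num above 1 means the record has already been seen.
    flagged = conn.execute("""
        SELECT company, industry, total_laid_off,
               ROW_NUMBER() OVER (
                   PARTITION BY company, industry, total_laid_off
                   ORDER BY rowid
               ) AS row_num
        FROM staging
        ORDER BY company, row_num
    """).fetchall()

    # Delete every record whose row_num is above 1 (rowid is SQLite-specific).
    conn.execute("""
        DELETE FROM staging
        WHERE rowid IN (
            SELECT rid FROM (
                SELECT rowid AS rid,
                       ROW_NUMBER() OVER (
                           PARTITION BY company, industry, total_laid_off
                           ORDER BY rowid
                       ) AS row_num
                FROM staging
            )
            WHERE row_num > 1
        )
    """)
    remaining = conn.execute(
        "SELECT company, industry, total_laid_off FROM staging ORDER BY company"
    ).fetchall()
    conn.close()
    return flagged, remaining

flagged, remaining = deduplicate(SAMPLE_ROWS)
print(flagged)
print(remaining)
```

Partitioning on every column means only exact repeats share a partition, so the first copy of each record is always kept.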
Standardising the data

Finding issues in the data and fixing them, such as changing the date format to 'YYYY-MM-DD'
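As an illustration of the date fix, the sketch below rewrites text dates into 'YYYY-MM-DD'. The source format 'MM/DD/YYYY', the `staging` table and the `event_date` column are all assumptions, since the original input format is not stated here. SQLite (via Python's `sqlite3`) stands in for the project's database, so the conversion is done with `substr`; other engines offer dedicated conversion functions instead.

```python
import sqlite3

def standardise_dates(raw_dates):
    """Rewrite 'MM/DD/YYYY' text dates as 'YYYY-MM-DD'; leave other values alone."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (event_date TEXT)")  # hypothetical table
    conn.executemany("INSERT INTO staging VALUES (?)", [(d,) for d in raw_dates])

    # Rebuild the string as year-month-day. The LIKE pattern limits the update
    # to values shaped like the assumed input format.
    conn.execute("""
        UPDATE staging
        SET event_date = substr(event_date, 7, 4) || '-' ||
                         substr(event_date, 1, 2) || '-' ||
                         substr(event_date, 4, 2)
        WHERE event_date LIKE '__/__/____'
    """)
    result = [
        row[0]
        for row in conn.execute("SELECT event_date FROM staging ORDER BY rowid")
    ]
    conn.close()
    return result

print(standardise_dates(["03/15/2023", "2023-01-02"]))
```

Guarding the update with a pattern keeps values that are already in the target format from being mangled.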
Evaluating Null or blank values

Removing null and blank values in the staged data, as demonstrated by the industry column below.
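One possible way to handle this is sketched below: blanks in `industry` are first converted to NULL so both cases can be treated alike, then the affected rows are counted and removed. The `staging` table, the `company` column and the sample rows are hypothetical, and deleting the rows (rather than filling them in from other records) is an assumption about the intended clean-up. Python's `sqlite3` stands in for the project's database.

```python
import sqlite3

def clean_industry(rows):
    """Treat blank industry values as NULL, count them, then remove those rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (company TEXT, industry TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", rows)

    # Empty or whitespace-only strings become NULL, so one check covers both cases.
    conn.execute("UPDATE staging SET industry = NULLIF(TRIM(industry), '')")

    missing = conn.execute(
        "SELECT COUNT(*) FROM staging WHERE industry IS NULL"
    ).fetchone()[0]

    # Assumption: rows without an industry are dropped rather than back-filled.
    conn.execute("DELETE FROM staging WHERE industry IS NULL")
    remaining = conn.execute(
        "SELECT company, industry FROM staging ORDER BY company"
    ).fetchall()
    conn.close()
    return missing, remaining

sample = [("Acme", "Retail"), ("Globex", ""), ("Initech", None), ("Umbrella", "  ")]
print(clean_industry(sample))
```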

Results Display

Partition By results
Substrings + Use cases

with Fuzzymatch
Window Functions vs Group By

with rolling total

A rolling total starts at a specific value and adds on the values from subsequent rows within each partition. In this case the starting point is Pam's salary, which is added to Angela's salary to get 83k, and so forth up to the final value of 124k. The data is partitioned by the unique values of gender (female vs male), so the total restarts at Jim's 45k and the same rule applies up to the final value of 313k.
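The sketch below reproduces the idea with `SUM(...) OVER (PARTITION BY ... ORDER BY ...)` and contrasts it with a plain `GROUP BY`. The `employees` table, its columns and the individual salaries are invented for illustration; they are only chosen so that the female partition reaches the 83k and 124k quoted above and Jim starts at 45k. The male partition is shortened and does not reproduce the 313k total. Python's `sqlite3` stands in for the project's database.

```python
import sqlite3

# Hypothetical rows: (employee_id, first_name, gender, salary).
EMPLOYEES = [
    (1, "Pam", "Female", 36000),
    (2, "Angela", "Female", 47000),
    (3, "Meredith", "Female", 41000),
    (4, "Jim", "Male", 45000),
    (5, "Dwight", "Male", 63000),
]

def rolling_and_grouped(rows):
    """Return per-row rolling totals (window function) and per-gender totals (GROUP BY)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees "
        "(employee_id INTEGER, first_name TEXT, gender TEXT, salary INTEGER)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", rows)

    # Window function: every row is kept, and the sum restarts for each gender.
    rolling = conn.execute("""
        SELECT first_name, gender, salary,
               SUM(salary) OVER (
                   PARTITION BY gender
                   ORDER BY employee_id
               ) AS rolling_total
        FROM employees
        ORDER BY gender, employee_id
    """).fetchall()

    # GROUP BY: the rows collapse to one total per gender.
    grouped = conn.execute("""
        SELECT gender, SUM(salary) AS total_salary
        FROM employees
        GROUP BY gender
        ORDER BY gender
    """).fetchall()
    conn.close()
    return rolling, grouped

rolling, grouped = rolling_and_grouped(EMPLOYEES)
for row in rolling:
    print(row)
print(grouped)
```

The last rolling total in each partition equals the `GROUP BY` total for that gender, which is a quick way to check the window definition.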

with row_num, rank and dense_rank
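To show how the three numbering functions differ, the sketch below ranks a few salaries that contain a tie. The `employees` table and its rows are hypothetical, and Python's `sqlite3` stands in for the project's database; `ROW_NUMBER`, `RANK` and `DENSE_RANK` are standard SQL window functions.

```python
import sqlite3

# Hypothetical rows with a tie at the top salary.
SALARIES = [
    ("Michael", 65000),
    ("Toby", 65000),
    ("Dwight", 63000),
    ("Jim", 45000),
]

def rank_salaries(rows):
    """Compare ROW_NUMBER, RANK and DENSE_RANK over salaries in descending order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (first_name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO employees VALUES (?, ?)", rows)

    # ROW_NUMBER is always unique (first_name breaks the tie deterministically).
    # RANK gives tied rows the same value and then skips ahead.
    # DENSE_RANK gives tied rows the same value without leaving a gap.
    result = conn.execute("""
        SELECT first_name, salary,
               ROW_NUMBER() OVER (ORDER BY salary DESC, first_name) AS row_num,
               RANK()       OVER (ORDER BY salary DESC)             AS rank_num,
               DENSE_RANK() OVER (ORDER BY salary DESC)             AS dense_rank_num
        FROM employees
        ORDER BY salary DESC, first_name
    """).fetchall()
    conn.close()
    return result

for row in rank_salaries(SALARIES):
    print(row)
```

With the tie on 65000, the third row gets row number 3, rank 3 and dense rank 2, which is the whole difference between the three functions.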

Languages

TSQL 100.0%