For all my SQL projects and tutorials as I advance my SQL skills

kamibrenda/data_cleaning_sql

Data Cleaning Project with SQL

Objectives

  1. Removing Duplicates
  2. Standardising the data
  3. Evaluating Null or blank values
  4. Removing any unnecessary columns

Results

  1. Display of duplicate records in the table

    Redundant rows are identified with a row number: a row number above 1 indicates that the same record occurs more than once. These duplicate rows are then eliminated.

    image
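
    The row-number check described above could be sketched as follows. This is a minimal illustration, not the project's actual query: the table `company_staging` and its columns `company`, `industry` and `event_date` are hypothetical names, and the syntax assumes a database with window functions (for example MySQL 8 or PostgreSQL).

    ```sql
    -- Hypothetical table and column names, used for illustration only.
    WITH numbered AS (
        SELECT
            s.*,
            ROW_NUMBER() OVER (
                PARTITION BY s.company, s.industry, s.event_date
                ORDER BY s.company
            ) AS row_num
        FROM company_staging AS s
    )
    SELECT *
    FROM numbered
    WHERE row_num > 1;  -- rows numbered 2 or higher repeat an earlier row
    ```

    Listing every column that defines a duplicate in `PARTITION BY` keeps rows that differ in any of those columns. How the flagged rows are then deleted varies by database, so that step is left out of this sketch.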

  2. Standardising the data

    Finding issues in the data and fixing them, such as converting dates to the 'YYYY-MM-DD' format.

    image
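
    A date conversion of this kind might look like the sketch below. It assumes MySQL, a hypothetical table `company_staging`, and a text column `event_date` that currently holds values in MM/DD/YYYY form. None of these details come from the project itself.

    ```sql
    -- Assumptions: MySQL; event_date is a text column holding MM/DD/YYYY values.

    -- Preview the conversion before changing anything.
    SELECT
        event_date,
        STR_TO_DATE(event_date, '%m/%d/%Y') AS event_date_iso
    FROM company_staging;

    -- Rewrite the text values in YYYY-MM-DD form.
    UPDATE company_staging
    SET event_date = STR_TO_DATE(event_date, '%m/%d/%Y');

    -- Change the column type so the values are stored as real dates.
    ALTER TABLE company_staging
    MODIFY COLUMN event_date DATE;
    ```

    Running the `SELECT` first makes it easy to spot values that do not match the assumed input format before the `UPDATE` is applied.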

  3. Evaluating Null or blank values

    Removing null and blank values in the staged data, as demonstrated by the industry column below.

    image
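
    One possible way to inspect and normalise missing values in the industry column is sketched here. The table name `company_staging` is hypothetical, and treating blank strings as NULL is an assumption about the desired outcome.

    ```sql
    -- Hypothetical table name; only the industry column is mentioned in the text.

    -- Find rows where industry is missing, either as NULL or as a blank string.
    SELECT *
    FROM company_staging
    WHERE industry IS NULL
       OR TRIM(industry) = '';

    -- Convert blanks to NULL so both cases can be handled the same way.
    UPDATE company_staging
    SET industry = NULL
    WHERE TRIM(industry) = '';
    ```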

Results Display

  1. Partition By results

    image

  2. Substrings + Use cases

    image
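
    As a rough illustration of substring use, the sketch below pulls fixed-position pieces out of text values. The `employees` table and the `birth_date` column (assumed to be stored as 'YYYY-MM-DD') are hypothetical.

    ```sql
    -- Hypothetical table and columns; birth_date is assumed to be 'YYYY-MM-DD'.
    SELECT
        first_name,
        SUBSTRING(first_name, 1, 3) AS name_prefix,  -- first three characters
        SUBSTRING(birth_date, 6, 2) AS birth_month   -- characters 6-7, the 'MM' part
    FROM employees;
    ```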

    with fuzzy matching

    image

  3. Window Functions vs Group By

    image
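
    The difference between the two approaches can be sketched with an average salary per gender. The `employees` table and its `first_name`, `gender` and `salary` columns are assumed names based on the example discussed in this section.

    ```sql
    -- Hypothetical employees table.

    -- GROUP BY collapses the rows: one output row per gender.
    SELECT
        gender,
        AVG(salary) AS avg_salary
    FROM employees
    GROUP BY gender;

    -- A window function keeps every employee row and repeats
    -- the per-gender average alongside it.
    SELECT
        first_name,
        gender,
        salary,
        AVG(salary) OVER (PARTITION BY gender) AS avg_salary
    FROM employees;
    ```

    The window version is useful when individual rows need to be compared against their group's aggregate.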

    with rolling total

    A rolling total starts at a specific value and adds the values from subsequent rows within each partition. In this case the starting point is Pam's salary, which is added to Angela's salary to get 83k, and so forth until the final value of 124k. The data is partitioned by the unique values of gender (male vs female), so the total restarts at Jim's 45k and the same rule applies to reach the final value of 313k.

    image
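
    A rolling total of this shape could be written as below. The `employees` table and the `employee_id` ordering column are assumptions, and the actual query may order the rows differently.

    ```sql
    -- Hypothetical employees table; employee_id is an assumed ordering column.
    SELECT
        first_name,
        gender,
        salary,
        SUM(salary) OVER (
            PARTITION BY gender          -- the total restarts for each gender
            ORDER BY employee_id         -- rows are accumulated in this order
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS rolling_total
    FROM employees;
    ```

    The `ORDER BY` inside `OVER` is what turns the sum into a running total. Without it, every row would show the full total for its partition.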

    with ROW_NUMBER, RANK and DENSE_RANK

    image
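
    The three ranking functions can be compared side by side, as in this sketch over a hypothetical `employees` table. Ordering by salary is an assumption chosen so that ties can occur.

    ```sql
    -- Hypothetical employees table; ranking by salary within each gender.
    SELECT
        first_name,
        gender,
        salary,
        ROW_NUMBER() OVER (PARTITION BY gender ORDER BY salary DESC) AS row_num,
        RANK()       OVER (PARTITION BY gender ORDER BY salary DESC) AS rank_num,
        DENSE_RANK() OVER (PARTITION BY gender ORDER BY salary DESC) AS dense_rank_num
    FROM employees;
    ```

    When two rows tie on salary, `ROW_NUMBER` still assigns distinct numbers, `RANK` gives both the same number and skips the next one, and `DENSE_RANK` gives both the same number without leaving a gap.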
