
This project utilizes R to preprocess Spotify's "Unpopular Songs" and "Genre of Artists" datasets from Kaggle. Following tidy data principles, it handles duplicates, transforms variables, scans for outliers, and normalizes data. The resulting clean dataset is ready for statistical analysis, ensuring accurate and ethical data practices.


yongpuitung/Spotify-Data-Preprocessing


Data Wrangling using R

In this project, I use R to preprocess Spotify's "Unpopular Songs" and "Genre of Artists" datasets from Kaggle. Following Hadley Wickham’s “Tidy Data” principles, I clean up the messy data so that the resulting dataset is ready for statistical analysis and supports accurate, ethical data practices.

Check out the detailed data preprocessing in the code.

You can download the datasets from Kaggle.

Link 1

Screenshot of unpopular songs csv

Link 2

Screenshot of z_genre_of_artists csv

In the project, I applied various data cleaning techniques, including:

  • Removing Duplicates
  • Verifying Data Structures
  • Ensuring the data is tidy:
    • Every column is a variable.
    • Every row is an observation.
    • Every cell is a single value.
  • Scanning for Missing Values
  • Scanning for Special Values
  • Scanning for Errors
  • Scanning for Outliers
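
The checks listed above can be sketched in base R roughly as follows. This is a minimal illustration, not the project's actual code: the `songs` data frame, its column names, and its values are hypothetical stand-ins for the Kaggle files. The valid range assumed for `danceability` (0 to 1) and the 1.5 × IQR outlier rule are also assumptions made for the example.

```r
# Toy data standing in for the songs dataset. Column names and values are
# hypothetical and are NOT taken from the Kaggle files.
songs <- data.frame(
  track_id     = c("a1", "a2", "a2", "a3", "a4", "a5", "a6"),
  danceability = c(0.50, 0.55, 0.55, NA, 0.60, 0.45, 0.52),
  duration_ms  = c(180000, 200000, 200000, 210000, 190000, 195000, 3600000),
  stringsAsFactors = FALSE
)

# 1. Remove exact duplicate rows.
songs_unique <- songs[!duplicated(songs), ]

# 2. Verify the data structure (column types and dimensions).
str(songs_unique)

# 3. Scan for missing values: count NAs per column.
missing_counts <- colSums(is.na(songs_unique))

# 4. Scan for special values (NaN, Inf, -Inf) in numeric columns.
has_special <- function(x) is.numeric(x) && any(is.nan(x) | is.infinite(x))
special_cols <- names(songs_unique)[sapply(songs_unique, has_special)]

# 5. Scan for errors: flag values outside an assumed valid range of [0, 1].
#    which() skips NA comparisons, so missing values are not flagged here.
range_errors <- which(songs_unique$danceability < 0 |
                        songs_unique$danceability > 1)

# 6. Scan for outliers using the 1.5 * IQR rule (an assumed choice of rule).
q <- unname(quantile(songs_unique$duration_ms, c(0.25, 0.75), na.rm = TRUE))
fence <- 1.5 * (q[2] - q[1])
outlier_rows <- which(songs_unique$duration_ms < q[1] - fence |
                        songs_unique$duration_ms > q[2] + fence)
```

Each step keeps its result in a separate object (`missing_counts`, `special_cols`, `range_errors`, `outlier_rows`), so flagged rows can be inspected before deciding whether to drop, impute, or cap them.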

This project is derived from my own RMIT Master of Analytics assignment in the “Data Wrangling” course (2022). It has been slightly modified and refined to showcase my data preprocessing techniques using R.

Feel free to read the step-by-step explanation in my blog.

Screenshot of medium blog
