Data preprocessing & Data cleaning project
Updated May 13, 2024 - HTML
A data science project that applies various techniques.
A case study of a bike-share company that aims to understand the differences between its users.
Explore global drug seizures from 1980 to 2020 through our data analysis project. We delve into drug confiscation data, revealing trends, hotspots, and the evolution of seized drug types. Using visualization, analysis, and dashboards, we provide insights into the fight against drug trafficking.
A heart disease prediction web application built with Django. Users can check whether they suffer from heart disease.
This project utilizes R to preprocess Spotify's "Unpopular Songs" and "Genre of Artists" datasets from Kaggle. Following tidy data principles, it handles duplicates, transforms variables, scans for outliers, and normalizes data. The resulting clean dataset is ready for statistical analysis, ensuring accurate and ethical data practices.
Fundamental R skills. From importing and transforming data to mastering relational data and data visualization, these challenges offer hands-on experience for a robust foundation in R.
This project aims to demonstrate the basic principles of data analysis, transformation, cleaning, and visualization.
Sprocket Central Pty Ltd is a long-standing KPMG client that specializes in high-quality bikes and accessible cycling accessories for riders. Their marketing team is looking to boost business by analyzing their existing customer dataset to determine customer trends and behavior.
An analysis of Seattle rain data collected at SeaTac International Airport from 1948 to 2017. *PLEASE SEE README.MD BELOW FOR THE FULL REPORT*
A case study I completed as part of the final course in the Google Data Analytics Professional Certificate program on Coursera. *PLEASE SEE README.MD BELOW FOR THE FULL REPORT*
A collection of 'dirty' datasets that I cleaned and analysed
The task is to keep the address content from any given HTML file and remove as much of the other content from the input HTML as possible.
This is a GitHub repository with a project on loan portfolio analysis and communication for a financial institution using Python, Pandas, and Matplotlib. It provides insights into loan distribution and customer demographics, as well as communication effectiveness.
For the Kaggle project Loan Interest Rate Prediction, the repository contains the deployment details, project documentation file, and project link.
For the Kaggle project Credit Card Lead Prediction, the repository contains the deployment details, project documentation file, and project link.
Scraped data about top data analytics firms from the GoodFirms website.
Using R to perform multiple linear regression on a used car prices dataset.
This repo explores correlations between NBA player salary and on-court performance data, using stats from the 2000 to 2022 regular seasons. It also illustrates how you can conduct web scraping and simple data analysis in Python.
This repo consists of projects on data wrangling, data visualization, and machine learning.