We are tasked with analyzing a website that publishes blog posts about data analytics tools and working mindset. The website's data currently suffers from quality issues: some values are misplaced and the formatting is not clean. Our objective is to clean this data and produce a CSV file that can be analyzed further to gain insights.
- Gather raw data by crawling the website.
- Inspect the data to identify issues such as missing values, misplaced data, and inconsistencies in formatting.
- Remove or Fill Missing Values: Identify and handle missing values appropriately.
- Correct Misplaced Data: Adjust any data that is not in the correct format or column.
- Standardize Formats: Ensure consistent data formats (e.g., date formats, text casing).
- Save the cleaned and structured data into a CSV file for further analysis.
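
The inspection, cleaning, and export steps can be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual pipeline: the column names (`title`, `category`, `published`), the sample rows, the accepted date formats, the `Uncategorized` placeholder, and the output file name `cleaned_posts.csv` are all assumptions made for the example, and the crawling step is left out.

```python
import io
from datetime import datetime

import pandas as pd

# Hypothetical sample standing in for the crawled data (columns are assumed).
RAW_CSV = """title,category,published
  Intro to Power BI ,data analytics tools,2023-01-05
Growth Mindset at Work,05/02/2023,
,working mindset,2023-03-01
SQL Window Functions,,"March 9, 2023"
"""

# Assumed input date formats; day-first is an assumption for slash dates.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")


def parse_date(value):
    """Return an ISO date string, or None if the value is missing or not a date."""
    if pd.isna(value):
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(str(value).strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def inspect_posts(df):
    """Inspection: count missing values per column."""
    return df.isna().sum()


def clean_posts(df):
    out = df.copy()
    # Standardize text: trim whitespace and treat empty strings as missing.
    for col in ["title", "category", "published"]:
        stripped = out[col].str.strip()
        out[col] = stripped.where(stripped.ne(""))
    # Correct misplaced data: a date sitting in "category" while "published" is empty.
    category_as_date = out["category"].map(parse_date)
    misplaced = out["published"].isna() & category_as_date.notna()
    out.loc[misplaced, "published"] = out.loc[misplaced, "category"]
    out.loc[misplaced, "category"] = pd.NA
    # Standardize formats: ISO dates and title-cased categories.
    out["published"] = out["published"].map(parse_date)
    out["category"] = out["category"].str.title().fillna("Uncategorized")
    # Handle missing values: a post without a title is dropped.
    out = out.dropna(subset=["title"]).drop_duplicates()
    return out.reset_index(drop=True)


def save_clean(df, path="cleaned_posts.csv"):
    """Write the cleaned table to CSV without the index column."""
    df.to_csv(path, index=False)


raw = pd.read_csv(io.StringIO(RAW_CSV), dtype=str)
cleaned = clean_posts(raw)

if __name__ == "__main__":
    print(inspect_posts(raw))
    print(cleaned)
    save_clean(cleaned)
```

Whether a missing value should be dropped or filled depends on the column. Here a row without a title is removed, while a missing category is filled with a placeholder so the row stays available for analysis.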