This page highlights three Python-based data analytics projects, each focusing on a different part of the data workflow — cleaning, visualization, and web scraping. These projects demonstrate my ability to use Python to transform messy raw data into analysis-ready datasets, generate meaningful visualizations, and collect data programmatically from the web.
This project showcases the complete data cleaning lifecycle using the Pandas library. Starting from a raw CSV/Excel file, I inspected, cleaned, formatted, and standardized fields to produce an organized dataset ready for analysis or dashboard building.
- Loaded a local CSV/Excel file into Pandas for inspection.
- Identified missing values, errors, and inconsistencies using .info(), .describe(), and .isnull().
- Removed duplicate records + unnecessary columns.
- Cleaned & standardized name fields.
- Re-formatted phone numbers into a consistent structure.
- Split address into multiple columns (Street, State, Zip).
- Replaced ambiguous abbreviations with meaningful strings.
- Real-world problem solving in Python.
- Effective choice of Pandas functions for cleaning.
- Transforming messy data into structured, analysis-ready tables.
- Faster data manipulation than Excel-based cleanup.
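To make the cleaning steps listed above concrete, here is a minimal sketch of how they might look in Pandas. The sample records, the column names (Name, Phone, Address, Active, Notes), and the Y/N abbreviation mapping are hypothetical stand-ins for illustration; they are not the project's actual dataset or code.

```python
import pandas as pd

# Hypothetical raw records standing in for the CSV/Excel file.
raw = pd.DataFrame({
    "Name": ["  alice SMITH ", "Bob jones", "Bob jones", "carol  white"],
    "Phone": ["(555) 123-4567", "555.987.6543", "555.987.6543", "5551112222"],
    "Address": [
        "12 Oak St, TX, 75001",
        "9 Elm Ave, CA, 90210",
        "9 Elm Ave, CA, 90210",
        "3 Pine Rd, NY, 10001",
    ],
    "Active": ["Y", "N", "N", "Y"],
    "Notes": [None, None, None, None],
})

# Inspect: count missing values per column
# (raw.info() and raw.describe() give printed summaries in the same spirit).
missing = raw.isnull().sum()

# Remove exact duplicate rows and a column assumed to be unnecessary.
df = raw.drop_duplicates().drop(columns=["Notes"]).reset_index(drop=True)

# Standardize names: trim, collapse repeated spaces, apply title case.
df["Name"] = (
    df["Name"].str.strip().str.replace(r"\s+", " ", regex=True).str.title()
)

# Phone numbers: keep digits only, then format as 555-123-4567.
digits = df["Phone"].str.replace(r"\D", "", regex=True)
df["Phone"] = digits.str.replace(
    r"(\d{3})(\d{3})(\d{4})", r"\1-\2-\3", regex=True
)

# Split the address into Street, State, and Zip columns.
parts = df["Address"].str.split(",", expand=True)
df["Street"] = parts[0].str.strip()
df["State"] = parts[1].str.strip()
df["Zip"] = parts[2].str.strip()
df = df.drop(columns=["Address"])

# Replace ambiguous abbreviations with meaningful strings (assumed mapping).
df["Active"] = df["Active"].replace({"Y": "Yes", "N": "No"})

print(missing)
print(df)
```

The phone step assumes ten-digit numbers; values with country codes or extensions would need an extra rule before reformatting.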
This project focuses on converting data into clear visual insights using Python visualization libraries. Charts and graphs were generated to communicate trends and comparisons in a visually appealing way.
Project Status: Completed locally; code uploaded to a GitHub repository (not yet hosted live as HTML).
- Created bar charts, line charts and comparative visuals.
- Applied data grouping & aggregations before plotting.
- Focused on visual storytelling using charts (not just code).
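The group-then-plot approach described above can be sketched roughly as follows. The sales figures, column names, and output file name are invented for illustration, and the sketch uses Matplotlib only; it is not the project's actual code or data.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sales records (illustrative values only).
sales = pd.DataFrame({
    "month": ["Jan", "Jan", "Feb", "Feb", "Mar", "Mar"],
    "region": ["East", "West", "East", "West", "East", "West"],
    "revenue": [120, 90, 150, 110, 170, 130],
})

# Group and aggregate before plotting.
by_region = sales.groupby("region")["revenue"].sum()
by_month = sales.pivot_table(
    index="month", columns="region", values="revenue", aggfunc="sum"
).reindex(["Jan", "Feb", "Mar"])  # restore calendar order after pivoting

fig, (ax_bar, ax_line) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: total revenue per region.
ax_bar.bar(list(by_region.index), by_region.values)
ax_bar.set_title("Total revenue by region")
ax_bar.set_ylabel("Revenue")

# Line chart: one line per region, compared month by month.
for region in by_month.columns:
    ax_line.plot(list(by_month.index), by_month[region], marker="o", label=region)
ax_line.set_title("Monthly revenue by region")
ax_line.legend()

fig.tight_layout()

# Hypothetical output location; any writable path would do.
out_path = Path(tempfile.gettempdir()) / "revenue_overview.png"
fig.savefig(out_path)
```

Aggregating first keeps the plotting calls short and makes the numbers behind each chart easy to check before they are drawn.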
This project demonstrates the ability to collect live data directly from web pages using Python.
Project Status: Completed locally; code uploaded to a repository, not yet hosted live.
- Used the requests library + parsing logic to extract data from websites.
- Converted scraped lists into structured Pandas DataFrames.
- Exported cleaned scraped data for analysis or storage.
- Python
- Pandas
- Data Cleaning / Data Wrangling
- String Processing & Regex
- Column Formatting & Structuring
- Matplotlib
- Seaborn
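As an illustration of the web scraping workflow described earlier (fetch, parse, load into a DataFrame, export), here is a minimal sketch. To keep it runnable offline, the download step is replaced by an inline HTML string; with a live page that string would come from something like requests.get(url, timeout=10).text. The sample table, the TableParser class, and the use of the standard-library html.parser module are assumptions for illustration, since the project's actual parsing logic is not shown here.

```python
from html.parser import HTMLParser

import pandas as pd

# Inline HTML standing in for a downloaded page (hypothetical content).
# With a live site: html = requests.get(url, timeout=10).text
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>$4.50</td></tr>
  <tr><td>Gadget</td><td>$12.00</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


parser = TableParser()
parser.feed(SAMPLE_HTML)

# First scraped row is the header; the rest are records.
header, *records = parser.rows
df = pd.DataFrame(records, columns=header)

# Light cleanup: turn "$4.50" into the number 4.5.
df["Price"] = df["Price"].str.lstrip("$").astype(float)

# Export for analysis or storage (returned as text here; pass a path to write a file).
csv_text = df.to_csv(index=False)
print(csv_text)
```

Before scraping a real site, it is worth checking its terms of use and robots.txt, and setting a timeout on every request.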
These Python-based projects demonstrate the essential stages of the data analytics pipeline — data acquisition, cleaning, transformation, visualization, and interpretation. They show my ability to use Python and Pandas to:
- Understand messy datasets
- Clean & structure raw data
- Visualize insights through charts & metrics
- Generate real analytical value