Welcome to my Retail-Data-Analytics-Project-Python-SQL-Integration! This repository demonstrates a complete data analysis workflow using Python and SQL, focused on retail order data. It highlights my ability to handle real-world datasets, clean and preprocess data, and derive actionable insights.
This project showcases working with large datasets from extraction to analysis and visualization. I performed data preprocessing in Python, and then loaded the cleaned dataset into an SQL database to perform detailed analysis using SQL queries.
- Used Python and Pandas to clean and preprocess the dataset by:
  - Handling missing values
  - Formatting and transforming columns
  - Removing duplicates
- Saved the cleaned dataset as `cleaned_orders.csv` for database integration.
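
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample rows, the column names, and the choice to default a missing quantity to 1 are all assumptions made for the example.

```python
import io

import pandas as pd

# Hypothetical sample standing in for orders.csv. The real column names are
# not listed in this README, so these are assumptions.
RAW_CSV = """Order Id,Order Date,Product Id,Quantity,List Price
1,2023-03-01,FUR-001,2,100.0
1,2023-03-01,FUR-001,2,100.0
2,2023-03-02,TEC-002,,250.0
3,not a date,OFF-003,1,20.0
"""


def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the three cleaning steps to a raw orders frame."""
    df = df.copy()
    # Formatting: normalise headers to snake_case ("Order Date" -> "order_date").
    df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")
    # Transforming: parse dates; unparseable values become NaT instead of raising.
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
    # Missing values: default a missing quantity to 1 (an assumed business rule).
    df["quantity"] = df["quantity"].fillna(1).astype(int)
    # Missing values and duplicates: drop rows without a usable date, then keep
    # the first copy of each fully identical row.
    return (
        df.dropna(subset=["order_date"])
        .drop_duplicates()
        .reset_index(drop=True)
    )


raw = pd.read_csv(io.StringIO(RAW_CSV))
cleaned = clean_orders(raw)
print(cleaned)
```

Calling `cleaned.to_csv("cleaned_orders.csv", index=False)` afterwards would write the result without the pandas index column, which keeps the file easy to load into a database table.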
- Loaded the cleaned CSV file into a MySQL database.
- Ensured the dataset was ready for querying and analysis.
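
One possible way to do the load step is pandas' `DataFrame.to_sql`. The sketch below uses an in-memory SQLite database as a stand-in so it runs without a server; for MySQL, the `con` argument would instead be a SQLAlchemy engine pointing at the database. The table name `orders` and the sample columns are assumptions for illustration.

```python
import sqlite3

import pandas as pd

# Hypothetical cleaned rows; in the project these would come from the cleaned CSV.
cleaned = pd.DataFrame(
    {
        "order_id": [1, 2, 3],
        "product_id": ["FUR-001", "TEC-002", "OFF-003"],
        "quantity": [2, 1, 5],
        "sale_price": [95.0, 240.0, 18.5],
    }
)

# In-memory SQLite keeps the sketch self-contained. With MySQL, `con` would be
# a SQLAlchemy engine instead (connection details are omitted here).
con = sqlite3.connect(":memory:")

# if_exists="replace" recreates the table on each run, so reloading is repeatable.
cleaned.to_sql("orders", con, if_exists="replace", index=False)

# Readiness check: the table row count should match the DataFrame.
row_count = con.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(row_count == len(cleaned))
```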
- Performed exploratory data analysis (EDA) and derived actionable insights using SQL queries, such as:
  - Aggregating sales data
  - Identifying top-selling products
  - Segmenting customers
  - Determining peak sales periods
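
The query types listed above might look something like the sketch below. These are not the queries from the repository's SQL file: the schema, the sample rows, and the "two or more orders" segmentation threshold are assumptions, and SQLite stands in for MySQL so the example runs as-is.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical schema; the real table layout is not shown in this README.
con.execute(
    "CREATE TABLE orders (order_id INTEGER, order_date TEXT, customer_id TEXT,"
    " product_id TEXT, quantity INTEGER, sale_price REAL)"
)
con.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "2023-01-15", "C1", "FUR-001", 2, 100.0),
        (2, "2023-01-20", "C2", "TEC-002", 1, 250.0),
        (3, "2023-02-03", "C1", "TEC-002", 2, 250.0),
        (4, "2023-02-10", "C3", "OFF-003", 5, 20.0),
    ],
)

# Top-selling products: aggregate revenue per product and rank it.
TOP_PRODUCTS = """
SELECT product_id, SUM(quantity * sale_price) AS revenue
FROM orders
GROUP BY product_id
ORDER BY revenue DESC
LIMIT 2
"""

# Peak sales periods: revenue per calendar month (YYYY-MM prefix of the date).
MONTHLY_SALES = """
SELECT substr(order_date, 1, 7) AS month, SUM(quantity * sale_price) AS revenue
FROM orders
GROUP BY month
ORDER BY revenue DESC
"""

# Customer segments: order frequency and value, with an assumed threshold.
CUSTOMER_SEGMENTS = """
SELECT customer_id,
       COUNT(*) AS order_count,
       SUM(quantity * sale_price) AS total_spent,
       CASE WHEN COUNT(*) >= 2 THEN 'repeat' ELSE 'one-time' END AS segment
FROM orders
GROUP BY customer_id
ORDER BY customer_id
"""

top_products = con.execute(TOP_PRODUCTS).fetchall()
monthly_sales = con.execute(MONTHLY_SALES).fetchall()
segments = con.execute(CUSTOMER_SEGMENTS).fetchall()
print(top_products, monthly_sales, segments, sep="\n")
```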
- Python: Proficient use of Pandas for data cleaning and manipulation.
- SQL: Strong command of queries for filtering, aggregation, and analysis.
- ETL Workflow: Implemented a complete Extract-Transform-Load process.
- Problem-Solving: Identified and resolved data quality issues to ensure reliable analysis.
- `Order Data Analysis(EDA).ipynb` – Jupyter Notebook for data cleaning and preprocessing.
- `orders_SQL queries.sql` – Collection of SQL queries for data analysis.
- `orders.csv` – Raw dataset containing retail order information.
- `cleaned_orders_dataset.csv` – Preprocessed dataset ready for database loading.
- `project_architecture.png` – Visual representation of the project workflow.
- `README.md` – Project documentation.
- Identified top-selling products and their revenue contributions.
- Analyzed customer purchasing patterns to inform marketing strategies.
- Determined peak sales periods for inventory management optimization.
- Segmented customers based on order frequency and value for targeted promotions.
This project demonstrates a strong understanding of the end-to-end data analytics lifecycle, from raw data to actionable insights. It highlights my technical skills, attention to detail, and ability to work with multiple tools, which are essential for a career in data analytics.
Feel free to explore the project and reach out with any questions or feedback. I’m excited to connect with like-minded professionals and recruiters in the data analytics field.