Overview This script is a Flask-based application that performs data analysis on an Excel file containing sales data. It dynamically identifies relevant columns, cleans the data, and performs various analyses, saving the results in an Excel file with multiple sheets.
Prerequisites
- Python 3.x installed.
- Required Python libraries: o Flask o pandas o openpyxl o xlsxwriter
- An Excel file (sales_data.xlsx) located at C:/Users/raghu/OneDrive/Desktop/data_analysis/.
- data_cleaning.py module containing the functions clean_data and clean_dates.
Installation and Setup
- Install dependencies using pip: pip install flask pandas openpyxl xlsxwriter
- Ensure the data_cleaning.py module is in the same directory or accessible in the Python path.
- Save your sales data file in the specified directory.
Key Functionalities
- Flask Route • Endpoint: / • Methods: GET, POST • Purpose: Reads the input Excel file, cleans the data, performs analyses, and generates an output Excel file containing the results.
- Data Cleaning Uses the clean_data and clean_dates functions from data_cleaning.py to: • Remove duplicates. • Normalize and standardize date columns.
- Dynamic Column Detection Automatically identifies key columns based on partial matches (e.g., city, total, invoice ID). Raises an error if any required column is missing.
- Analytical Functions • Sales Analysis by Branch: Groups data by branch to calculate total sales, transaction count, and average rating. • Sales Analysis by Product Line: Groups data by product line to calculate total sales, transaction count, gross income, and average rating. • Top 3 Product Lines: Identifies the top 3 product lines by total sales. • Date-Wise Sales: Aggregates total sales and transactions for each date. • City-Wise Sales: Analyzes total sales, transaction count, and average rating by city. • Customer Type Analysis: Groups data by customer type to calculate total sales and average rating. • Top Performing Products: Analyzes product performance by quantity sold and total sales. • Customer Insights: Provides insights based on gender and customer type, including total sales and average rating. • Sales in Date Range: Filters sales data within a specified date range. • Sales on Specific Day: Calculates total sales for a specific date. • Branch Sales on Specific Day: Analyzes branch-level sales for a given date.
- Output Generation • Results are saved in an Excel file (sales_analysis_output.xlsx) with multiple sheets for each analysis. • If a file with the same name exists, a versioned file name is created (e.g., sales_analysis_output(1).xlsx).
Execution Run the script using the following command: python script_name.py • Open your browser and navigate to http://127.0.0.1:5000/. • The results are returned as a JSON response and saved in the output Excel file.
Error Handling The script gracefully handles errors and returns a JSON response with the error message if any issue occurs during execution.
Key Functions and Their Purpose Function Name Description analyze_sales_by_branch Analyzes sales by branch. analyze_sales_by_product_line Analyzes sales by product line. get_top_3_product_lines Identifies top 3 product lines based on sales. analyze_date_wise_sales Aggregates sales data by date. analyze_city_sales Analyzes sales data by city. analyze_customer_type_sales Analyzes sales data by customer type. analyze_top_performing_products Analyzes top-performing products based on quantity and sales. analyze_customer_insights Provides customer insights by gender and type. Filtered_data_by_date_range Filters sales data for a specific date range. specific_day_sale Calculates total sales for a specific date. Filter_branch_sales_by_specific_date Analyzes branch-level sales on a specific date.
Output Example The output file will contain the following sheets:
- Overall Sales by Branches
- Product Line Sales
- Top 3 Product Lines
- Date-Wise Sales
- City-Wise Sales
- Customer Type
- Top Selling Products
- Customer Purchase Behavior
- Date Range Filtered Data
- Specific Day Overall Sales
- Branch Sales on Specific Day
Contact Information For any queries or issues, contact the developer at raghul.officialmail@gmail.com