This project implements an XGBoost regression model to forecast sales for various products across different stores in the BigMart retail chain.
The goal is to create a model that helps BigMart understand which product and store characteristics have the biggest impact on sales, allowing them to optimize their strategies.
The project uses publicly available product and store data, accessible on Kaggle: BigMart Sales Data. The dataset includes information on:
- 1559 unique items
- 10 stores
- Product Information:
  - Item_Identifier (Unique product ID)
  - Item_Weight (Weight of product)
  - Item_Fat_Content (Low fat or not)
  - Item_Visibility (% of display area allocated to the product)
  - Item_Type (Category of the product)
  - Item_MRP (Maximum Retail Price)
- Store Information:
  - Outlet_Identifier (Unique store ID)
  - Outlet_Establishment_Year (Year the store opened)
  - Outlet_Size (Size of the store in terms of ground area covered)
  - Outlet_Location_Type (Type of city)
  - Outlet_Type (Grocery store or supermarket)
- Sales Data:
  - Item_Outlet_Sales (Sales figures for each product at each store)
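As a rough illustration of how columns like these could be turned into model inputs, here is a minimal pandas sketch. The rows are made up for the example and are not taken from the Kaggle data, and the `prepare` helper is hypothetical. The choices it makes (mean-filling missing weights, dropping the identifier columns, one-hot encoding the categorical columns) are assumptions, not necessarily what the notebook does.

```python
import pandas as pd

# Made-up rows for illustration only; they are not taken from the Kaggle data.
rows = pd.DataFrame({
    "Item_Identifier": ["A01", "B02", "C03", "D04"],
    "Item_Weight": [9.3, None, 17.5, 12.0],
    "Item_Fat_Content": ["Low Fat", "Regular", "Low Fat", "Regular"],
    "Item_Visibility": [0.016, 0.019, 0.017, 0.0],
    "Item_MRP": [249.8, 48.3, 141.6, 182.1],
    "Outlet_Identifier": ["OUT1", "OUT2", "OUT1", "OUT2"],
    "Outlet_Type": ["Supermarket", "Grocery Store", "Supermarket", "Grocery Store"],
    "Item_Outlet_Sales": [3735.1, 443.4, 2097.3, 732.4],
})


def prepare(df):
    """Hypothetical helper: return (X, y) with numeric features and the sales target."""
    df = df.copy()
    # Assumption: fill missing weights with the column mean (one simple choice among many).
    df["Item_Weight"] = df["Item_Weight"].fillna(df["Item_Weight"].mean())
    # Separate the target column from the features.
    y = df.pop("Item_Outlet_Sales")
    # Identifier columns are dropped here for brevity.
    X = df.drop(columns=["Item_Identifier", "Outlet_Identifier"])
    # Categorical columns become 0/1 indicator columns.
    X = pd.get_dummies(X, columns=["Item_Fat_Content", "Outlet_Type"], dtype=float)
    return X, y


X, y = prepare(rows)
print(X.columns.tolist())
```

One-hot encoding is used here because it makes no assumption about ordering among categories. Ordered columns such as Outlet_Size might be better served by an ordinal encoding.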
The project employs XGBoost, a powerful gradient boosting library, to build a regression model that predicts future sales.
This project offers BigMart valuable insights into their sales data, allowing them to:
- Optimize product placement and promotions based on factors influencing sales.
- Develop targeted sales strategies considering store location and customer demographics.
- Make data-driven decisions for improved profitability.
- Clone this repository:
git clone https://github.com/amangupta143/BigMart-Sales-Prediction.git
- Install required dependencies:
pip install numpy pandas matplotlib seaborn xgboost
- Open the analysis notebook:
jupyter notebook BigMart_Sales_Prediction.ipynb
Note: This command assumes you have Jupyter Notebook installed. If you don't, install it with pip install jupyter, then start it with jupyter notebook and open the notebook in your web browser.
Happy coding!