baked-bytes/Rossmann-Stores

Rossmann Stores

Rossmann operates over 3,000 drug stores in 7 European countries. Store sales are influenced by many factors, including promotions, competition, school and state holidays, seasonality, and locality. With thousands of individual managers predicting sales based on their unique circumstances, the accuracy of results can be quite varied.

To learn more about the dataset, see https://www.kaggle.com/c/rossmann-store-sales

The dataset lacks some useful information, such as store location and weather. Store locations were therefore inferred from the holiday data, and a weather dataset was then merged in based on the inferred location.
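The README does not spell out how the inference works, so the following is only a minimal sketch of one plausible approach: match each store's observed state-holiday dates against per-state holiday calendars, then join weather by state and date. The state codes, the toy calendars, the `MeanTempC` column and all values are hypothetical placeholders, not the project's actual data.

```python
import pandas as pd

# Toy stand-in for the sales data; values are made up for illustration.
sales = pd.DataFrame({
    "Store": [1, 1, 1, 2, 2, 2],
    "Date": pd.to_datetime(["2015-01-06", "2015-06-04", "2015-10-31"] * 2),
    "StateHoliday": [1, 1, 0, 0, 0, 1],
})

# Hypothetical per-state holiday calendars (placeholder state codes).
raw_calendars = {
    "STATE_A": ["2015-01-06", "2015-06-04"],
    "STATE_B": ["2015-10-31"],
}
state_holidays = {
    state: set(pd.to_datetime(dates)) for state, dates in raw_calendars.items()
}

def infer_state(store_rows):
    """Pick the state whose calendar best overlaps the store's holiday dates."""
    observed = set(store_rows.loc[store_rows["StateHoliday"] == 1, "Date"])

    def jaccard(calendar):
        union = observed | calendar
        return len(observed & calendar) / len(union) if union else 0.0

    # Ties resolve to the first state; real data would need a stricter rule.
    return max(state_holidays, key=lambda s: jaccard(state_holidays[s]))

inferred = {store: infer_state(rows) for store, rows in sales.groupby("Store")}
location = pd.DataFrame({"Store": list(inferred), "State": list(inferred.values())})

# Hypothetical daily weather per state.
weather = pd.DataFrame({
    "State": ["STATE_A"] * 3 + ["STATE_B"] * 3,
    "Date": pd.to_datetime(["2015-01-06", "2015-06-04", "2015-10-31"] * 2),
    "MeanTempC": [1.5, 18.0, 9.0, 0.5, 20.0, 8.0],
})

# Attach the inferred state to each sale, then the weather for that state and day.
merged = (
    sales.merge(location, on="Store", how="left")
         .merge(weather, on=["State", "Date"], how="left")
)
print(merged)
```

Left joins keep every sales row even when no weather record matches, so gaps show up as missing values instead of silently dropping data.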

Link to the datasets used: https://drive.google.com/drive/folders/1XC2Q6fZ58DclicGXP1ajgZD_nW0C9Dyp?usp=sharing

This project is split into 3 phases:

Phase 01

Phase 01 of the project covered data cleaning, EDA and feature engineering.

Phase 02

Phase 02 of the project used several ML models (Multiple Linear Regression, Lasso Regression, Gradient Boosted Trees, RNN) to predict sales for the Rossmann stores. Of the models tried, the gradient boosted trees model (LGBM) showed the most promise, with a score of 98%.
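The 98% score is not tied to a named metric here. As a hedged sketch, two scores commonly reported for this task can be computed as follows: R² (coefficient of determination) and RMSPE, the metric the Kaggle competition uses, which ignores days with zero sales. Which of these, if either, the notebooks report is an assumption; the function names are illustrative.

```python
import numpy as np

def rmspe(y_true, y_pred):
    """Root mean square percentage error, skipping rows where actual sales are 0."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = y_true != 0  # zero-sales days are excluded from scoring
    pct_err = (y_true[mask] - y_pred[mask]) / y_true[mask]
    return float(np.sqrt(np.mean(pct_err ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up numbers, only to show the call pattern.
actual = [100, 200, 0, 400]
predicted = [110, 180, 5, 400]
print("RMSPE:", rmspe(actual, predicted))
print("R^2:  ", r2(actual, predicted))
```

Lower is better for RMSPE, while R² approaches 1 for a good fit, so the two are not interchangeable when comparing models.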

Phase 03

Phase 03 used the data produced in the previous phases to create a simple business dashboard in Tableau.

Run the project locally

Requirements

  • Python 3.x
  • Jupyter
  • Required ML and visualisation libraries (scikit-learn, keras, tensorflow, numpy, pandas, seaborn, matplotlib)
  • Tableau Desktop

ipython files

  • Download the ipython files (Jupyter notebooks) from the code folder of the repo
  • Update the paths used for reading the datasets to match your local setup
  • Run all the cells of each notebook

Note that the first ipython file creates three .csv files: location.csv, cleaned_weather.csv and final_RossmannSales.csv. The second ipython file takes final_RossmannSales.csv as input; alternatively, you can download this file from the Drive link provided above.
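One way to keep the dataset paths easy to change is to define a single base directory and check for the intermediate files before running the second notebook. This is a sketch only: `DATA_DIR` and `missing_outputs` are hypothetical names, and the notebooks themselves may read paths differently.

```python
from pathlib import Path

# Hypothetical location of the datasets; change this one line for your machine.
DATA_DIR = Path("data")

# Files the first notebook is described as producing.
FIRST_NOTEBOOK_OUTPUTS = [
    "location.csv",
    "cleaned_weather.csv",
    "final_RossmannSales.csv",
]

def missing_outputs(data_dir):
    """Return the expected output files that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in FIRST_NOTEBOOK_OUTPUTS if not (data_dir / name).is_file()]

missing = missing_outputs(DATA_DIR)
if missing:
    print("Missing files:", ", ".join(missing))
    print("Run the first notebook, or download them from the Drive link.")
else:
    print("All intermediate files found in", DATA_DIR)
```

Only final_RossmannSales.csv is needed by the second notebook, so the other two can be absent if you download that file directly.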

Run the project on Colab

  • Rossmann_Stores_cleaning_EDA_feature_engg.ipynb: Open in Colab
  • predict_RossmannSales.ipynb: Open in Colab

Run the Tableau workbook locally

  • Download the Tableau workbook from the tableau folder of the repo, and final_RossmannSales.csv from the provided Drive link.
  • Establish a live data source connection to the .csv file and run the workbook.

Alternatively, you can view the dashboard on Tableau Online.
