This is a Time Series Analysis project where time series based data is used to extract patterns for predictions and other characteristics of the data. It uses a model for forecasting future values in a small time frame based on previous observations.

kp3283/SuperStore-Sale-Predictor

SuperStore Sales

In this project I want to find:

  1. What is the overall trend of the sales?
  2. What are the top 10 products by sales?
  3. Which Segment performs best?
  4. What is the most preferred Ship Mode?
  5. Which Category and Sub-category are the most profitable?

I also want to predict future sales.

Data Collection:

The data comes from Kaggle (https://www.kaggle.com/rohitsahoo/sales-forecasting).

Data Exploration and Cleaning:

  1. We drop the following columns: 'Row ID', 'Customer Name', 'Country', 'Product Name', 'Order ID' and 'Customer ID'.
  2. The date column is converted from a string to a date type.
  3. Some of the categorical columns are one-hot encoded so that they can be used as model inputs.
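
The cleaning steps above could be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the tiny in-memory table stands in for the Kaggle CSV, and the column names 'Order Date', 'Ship Mode' and 'Segment', the day/month/year date format, and the helper name `clean` are all assumptions.

```python
import pandas as pd

# Tiny stand-in for the Kaggle CSV; column names and date format are assumed.
raw = pd.DataFrame({
    "Row ID": [1, 2, 3],
    "Order ID": ["CA-1", "CA-2", "US-3"],
    "Order Date": ["08/11/2017", "12/06/2017", "11/10/2016"],
    "Ship Mode": ["Second Class", "Standard Class", "Standard Class"],
    "Customer ID": ["C1", "C2", "C3"],
    "Customer Name": ["A", "B", "C"],
    "Segment": ["Consumer", "Corporate", "Consumer"],
    "Country": ["United States"] * 3,
    "Product Name": ["Chair", "Label", "Table"],
    "Sales": [261.96, 14.62, 957.58],
})

def clean(df):
    """Hypothetical helper: drop columns, parse dates, one-hot encode."""
    drop_cols = ["Row ID", "Customer Name", "Country",
                 "Product Name", "Order ID", "Customer ID"]
    out = df.drop(columns=drop_cols)
    # Parse the date strings (assumed day/month/year) into datetime values.
    out["Order Date"] = pd.to_datetime(out["Order Date"], format="%d/%m/%Y")
    # One-hot encode the categorical columns chosen for this sketch.
    out = pd.get_dummies(out, columns=["Ship Mode", "Segment"])
    return out

cleaned = clean(raw)
print(cleaned.columns.tolist())
```

Which categorical columns are encoded in the real notebook is not stated here, so 'Ship Mode' and 'Segment' are only examples.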

Data Visualization and Analysis:

  1. We extract the month and year from the date column into separate columns.

  2. We study the sales trend by grouping the data into monthly sales. Sales per month are plotted for each year.

image

  3. Plot the top 10 products by sales.

image

  4. Plot the best-performing segments.

image

  5. Plot the most preferred shipment mode.

image

  6. Plot the most sold products per state.

image

  7. Plot the most profitable categories and sub-categories.

image
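
As a rough sketch of the month/year extraction and monthly grouping described in this section, assuming a datetime column named 'Order Date' and a numeric 'Sales' column (both names are assumptions, and the four rows are made-up sample data):

```python
import pandas as pd

# Made-up sample rows; the real data has one row per order line.
orders = pd.DataFrame({
    "Order Date": pd.to_datetime(
        ["2017-01-05", "2017-01-20", "2017-02-03", "2018-01-11"]),
    "Sales": [100.0, 50.0, 30.0, 80.0],
})

# Pull the year and month out of the date into their own columns.
orders["Year"] = orders["Order Date"].dt.year
orders["Month"] = orders["Order Date"].dt.month

# Total sales per (year, month), reshaped to one column per year so that
# each year can be drawn as its own line over the months.
monthly = orders.groupby(["Year", "Month"])["Sales"].sum().unstack("Year")
print(monthly)

# With matplotlib installed, monthly.plot() would draw one line per year.
```

Months with no orders in a given year show up as missing values in `monthly`, which a line plot simply leaves as gaps.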

Developing Model:

  1. Split the dataset into four parts, X_train, X_test, Y_train and Y_test, and scale the features.
  2. We use a Keras neural network with 2 input layers to train the model and predict the Sales value.
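
The split-and-scale step might look like the sketch below, using scikit-learn's `train_test_split` and `StandardScaler`. The synthetic feature matrix, the 70/30 split and the random seeds are assumptions made only for illustration; the Keras network itself is left out because its architecture is not described in detail here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the encoded feature matrix and the Sales target.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
Y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Hold out 30% of the rows for testing (the ratio is an assumption).
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=1)

# Fit the scaler on the training rows only, then apply the same
# transformation to the test rows so no test statistics leak into training.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print(X_train.shape, X_test.shape)
```

Fitting the scaler on the training split alone keeps the test set a fair stand-in for unseen data.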

Results:

The predicted values are close to the observed values. The coefficient of determination (R², the proportion of the variance in Sales that the model explains) is greater than 0, which indicates that the model does better than always predicting the mean, but the fit is not the best it could be.
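
To make the meaning of the coefficient of determination concrete, here is a small sketch that computes it from its standard definition, R² = 1 - SS_res / SS_tot. The function name `r_squared` and the sample numbers are illustrative assumptions, not values from this project.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot

observed = [10.0, 20.0, 30.0, 40.0]

perfect = r_squared(observed, observed)               # exact predictions
baseline = r_squared(observed, [25.0] * 4)            # always predict the mean
worse = r_squared(observed, [40.0, 30.0, 20.0, 10.0]) # worse than the mean

print(perfect, baseline, worse)
```

A score of 1 is a perfect fit, 0 matches a model that always predicts the mean, and negative scores are worse than that baseline, which is why a value above 0 only shows that the model has learned something.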

Key Learnings:

  1. Keras is an efficient library for building and training models.
  2. The model might be improved by increasing the number of neurons, which may raise the coefficient of determination.
  3. StandardScaler is used to scale the dataset when building the model.
