This is a project where I want to find:
- What is the overall trend of the sales?
- What are the top 10 products by sales?
- Which Segment performs best?
- What is the most preferred Ship Mode?
- Which Category and Sub-Category are the most profitable?
I also want to predict future sales.
The data comes from Kaggle (https://www.kaggle.com/rohitsahoo/sales-forecasting).
- We drop the following columns: 'Row ID', 'Customer Name', 'Country', 'Product Name', 'Order ID' and 'Customer ID'.
- The date column is converted from character (string) to Date format.
- Some of the categorical columns are one-hot encoded (onehot_encode) so that they can be used as model inputs.
- We extract the month and year from the date column into separate columns.
- We study the sales trend by grouping the data into monthly sales. Sales per month are plotted for each year.
- Plot the best-performing Segments.
- Plot the most preferred Ship Mode.
- Most sold products per state.
- Most profitable categories and sub-categories.
- Split and scale the dataset into 4 data frames: X_train, X_test, Y_train, Y_test.
- We use a Keras neural network built on 2 Input layers (Keras tensors) to train the model and predict the Sales value.
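
The preprocessing and grouping steps in the list above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: the tiny in-memory table stands in for the Kaggle CSV, and the column names ('Order Date', 'Sales', 'Segment', 'Ship Mode'), the day/month/year date format, and the helper name `preprocess` are assumptions.

```python
import pandas as pd

# Tiny stand-in for the Kaggle CSV. Column names and values are assumptions.
raw = pd.DataFrame({
    "Row ID": [1, 2, 3, 4],
    "Order ID": ["A-1", "A-2", "A-3", "A-4"],
    "Order Date": ["08/11/2017", "12/11/2017", "03/01/2018", "15/01/2018"],
    "Customer ID": ["C1", "C2", "C1", "C3"],
    "Customer Name": ["Ann", "Bob", "Ann", "Cy"],
    "Segment": ["Consumer", "Corporate", "Consumer", "Home Office"],
    "Country": ["United States"] * 4,
    "Product Name": ["Chair", "Desk", "Pen", "Lamp"],
    "Ship Mode": ["Standard Class", "First Class", "Standard Class", "Second Class"],
    "Sales": [100.0, 250.0, 20.0, 80.0],
})


def preprocess(df):
    """Hypothetical helper: drop unused columns, parse dates, add Month and Year."""
    drop_cols = ["Row ID", "Customer Name", "Country",
                 "Product Name", "Order ID", "Customer ID"]
    out = df.drop(columns=drop_cols)
    # Assumed date format: day/month/year.
    out["Order Date"] = pd.to_datetime(out["Order Date"], format="%d/%m/%Y")
    out["Month"] = out["Order Date"].dt.month
    out["Year"] = out["Order Date"].dt.year
    return out


clean = preprocess(raw)

# Monthly sales per year, the basis for the trend plot.
monthly_sales = clean.groupby(["Year", "Month"])["Sales"].sum()

# Total sales per Segment, largest first.
segment_sales = clean.groupby("Segment")["Sales"].sum().sort_values(ascending=False)

# One-hot encode the categorical columns used as model inputs.
encoded = pd.get_dummies(clean, columns=["Segment", "Ship Mode"])
```

Calling `monthly_sales.unstack("Year").plot()` would then draw one line per year, provided matplotlib is installed.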
The predicted values are close to the observed values. The coefficient of determination (R², i.e. the share of the variance in the observed Sales that the model explains) is greater than 0. This indicates that the prediction has not failed, since the model does better than always predicting the mean, but it is not the best possible fit.
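
To make the meaning of the coefficient of determination concrete, here is a small sketch that computes it directly from its definition, R² = 1 - SS_res / SS_tot. The function name `r_squared` and the sample numbers are illustrative assumptions, not results from this project.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # squared error of the model
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # squared error of the mean baseline
    return 1.0 - ss_res / ss_tot


# Made-up observed values, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]

perfect = r_squared(observed, observed)               # every point predicted exactly
baseline = r_squared(observed, [25.0] * 4)            # always predicting the mean
worse = r_squared(observed, [40.0, 30.0, 20.0, 10.0]) # worse than the mean baseline
```

A perfect prediction gives 1, always predicting the mean gives 0, and a model that does worse than the mean gives a negative value. A value between 0 and 1 therefore means the model is useful but leaves part of the variance unexplained.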
- Keras is an efficient library for building and training models.
- Our model could be improved by increasing the number of neurons, which may raise the coefficient of determination.
- StandardScaler is used to scale the dataset while building the model.
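
The split into X_train, X_test, Y_train, Y_test and the StandardScaler step can be sketched with scikit-learn as below. The synthetic data, the 80/20 split, and the choice to fit the scaler on the training set only are assumptions made for illustration. The project may order or configure these steps differently.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic features and target, standing in for the encoded sales data.
rng = np.random.default_rng(0)
X = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(size=100)

# Assumed 80/20 split with a fixed seed for reproducibility.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training data only, then apply it to both sets,
# so that no information from the test set leaks into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

After scaling, each training feature has mean 0 and unit variance, which usually helps a neural network train more stably.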