Link: https://www.kaggle.com/code/kseniyavishnevskaya/predict-future-sales-eda
We are provided with daily historical sales data. Our task is to analyse the data and highlight interesting features.
File descriptions:
sales_train.csv - the training set. Daily historical data from January 2013 to October 2015.
test.csv - the test set. You need to forecast the sales for these shops and products for November 2015.
sample_submission.csv - a sample submission file in the correct format.
items.csv - supplemental information about the items/products.
item_categories.csv - supplemental information about the item categories.
shops.csv - supplemental information about the shops.
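Since the supplemental files only add descriptive columns, one way to use them is to left-join them onto the sales rows. The snippet below is a minimal sketch of that idea, not the notebook's own code: the frames are tiny made-up stand-ins for the CSV files (in practice they would presumably come from `pd.read_csv`), and the join keys `item_id`, `item_category_id` and `shop_id` are assumed from the data field names.

```python
import pandas as pd

# Toy stand-ins for the CSV files; all values are invented for illustration.
sales = pd.DataFrame({
    "shop_id": [5, 7, 5],
    "item_id": [100, 200, 200],
    "item_cnt_day": [1, 3, 2],
})
items = pd.DataFrame({
    "item_id": [100, 200],
    "item_name": ["Item A", "Item B"],
    "item_category_id": [40, 55],
})
item_categories = pd.DataFrame({
    "item_category_id": [40, 55],
    "item_category_name": ["Category X", "Category Y"],
})
shops = pd.DataFrame({
    "shop_id": [5, 7],
    "shop_name": ["Shop One", "Shop Two"],
})

# Left joins keep every sales row; validate="many_to_one" raises an error
# if a lookup table unexpectedly contains duplicate keys.
enriched = (
    sales.merge(items, on="item_id", how="left", validate="many_to_one")
    .merge(item_categories, on="item_category_id", how="left", validate="many_to_one")
    .merge(shops, on="shop_id", how="left", validate="many_to_one")
)

print(enriched[["shop_name", "item_name", "item_category_name", "item_cnt_day"]])
```

Using `how="left"` means the row count of `enriched` matches `sales`, which is an easy sanity check after each merge.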
Data fields:
ID - an Id that represents a (Shop, Item) tuple within the test set
shop_id - unique identifier of a shop
item_id - unique identifier of a product
item_category_id - unique identifier of item category
item_cnt_day - number of products sold. You are predicting a monthly amount of this measure
item_price - current price of an item
date - date in format dd/mm/yyyy
date_block_num - a consecutive month number, used for convenience. January 2013 is 0, February 2013 is 1,..., October 2015 is 33
item_name - name of item
shop_name - name of shop
item_category_name - name of item category
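The relationship between the daily counts, the month index and the monthly target can be sketched as follows. This is an illustrative example on made-up rows, not the notebook's own code. The column name `item_cnt_month` is a hypothetical label for the aggregated target, and the format string `"%d/%m/%Y"` follows the description above; it may need adjusting if the actual file uses a different separator.

```python
import pandas as pd

# Toy rows shaped like sales_train.csv; all values are invented for illustration.
sales = pd.DataFrame({
    "date": ["02/01/2013", "15/01/2013", "03/02/2013", "10/10/2015"],
    "shop_id": [5, 5, 5, 7],
    "item_id": [100, 100, 100, 200],
    "item_price": [299.0, 299.0, 310.0, 1499.0],
    "item_cnt_day": [1, 2, 1, 3],
})

# Parse day-first dates explicitly so 02/01/2013 is read as 2 January.
sales["date"] = pd.to_datetime(sales["date"], format="%d/%m/%Y")

# Consecutive month number: January 2013 -> 0, ..., October 2015 -> 33.
sales["date_block_num"] = (
    (sales["date"].dt.year - 2013) * 12 + (sales["date"].dt.month - 1)
)

# Sum daily counts per month, shop and item to get a monthly amount.
# "item_cnt_month" is an assumed name for the aggregated column.
monthly = (
    sales.groupby(["date_block_num", "shop_id", "item_id"], as_index=False)["item_cnt_day"]
    .sum()
    .rename(columns={"item_cnt_day": "item_cnt_month"})
)

print(monthly)
```

Under this numbering, November 2015 (the month to forecast) would correspond to month index 34.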
Tools: Python🐍 - pandas, numpy, plotly, matplotlib, seaborn