- Designed a data preprocessing pipeline in Spark SQL to join, clean, and aggregate about 420,000 samples of Kaggle sales data, then reshaped the result into a coherent time-series format;
- Established a pipeline for seasonal differencing of the time series, feature engineering, and dense vectorization, using custom helper functions and Spark ML packages in Scala on Databricks;
- Developed a stochastic SMA model, then validated and tuned multiple ML models (such as RF, LR, and GBT) using rolling k-fold CV. GBT achieved the best SMAPE, under 5%, with a 1-day prediction window.
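
The evaluation steps named in the list above can be illustrated with a minimal sketch. It is written in plain Python rather than the project's Scala/Spark code, and it rests on several assumptions that the summary does not confirm: that SMA means a simple moving average, that "rolling k-fold CV" means rolling-origin splits in which each test block follows its training data, that seasonality is weekly, and that SMAPE uses the (|actual| + |predicted|) / 2 denominator. All function names and the sample series are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def seasonal_difference(series: Sequence[float], period: int) -> List[float]:
    """Return y[t] - y[t - period] for every t >= period."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def sma_forecast(history: Sequence[float], window: int) -> float:
    """Forecast the next value as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def smape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Symmetric MAPE in percent, using the (|a| + |p|) / 2 denominator."""
    terms = []
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        # Treat a 0/0 term as a perfect prediction instead of dividing by zero.
        terms.append(0.0 if denom == 0 else abs(a - p) / denom)
    return 100.0 * sum(terms) / len(terms)


def rolling_splits(n: int, n_folds: int, horizon: int = 1) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges; each test block follows its training data."""
    first_test = n - n_folds * horizon
    for k in range(n_folds):
        start = first_test + k * horizon
        yield range(0, start), range(start, start + horizon)


# Made-up daily sales with a weekly pattern (not data from the project).
sales = [100.0, 120.0, 130.0, 125.0, 140.0, 180.0, 170.0,
         104.0, 123.0, 134.0, 128.0, 146.0, 186.0, 174.0]

# Assumed weekly seasonality: subtract the value from 7 days earlier.
weekly_diff = seasonal_difference(sales, period=7)

# Score the SMA baseline on 4 rolling folds with a 1-day prediction window.
actuals, forecasts = [], []
for train_idx, test_idx in rolling_splits(len(sales), n_folds=4, horizon=1):
    history = [sales[i] for i in train_idx]
    forecasts.append(sma_forecast(history, window=3))
    actuals.extend(sales[i] for i in test_idx)

score = smape(actuals, forecasts)
print(f"weekly differences: {weekly_diff}")
print(f"SMA baseline SMAPE over 4 rolling folds: {score:.2f}%")
```

Splitting by time rather than at random keeps future observations out of the training data, which is the usual reason to prefer a rolling scheme over ordinary k-fold CV for forecasting. The same split and metric could in principle wrap any of the listed models in place of the SMA baseline.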
liyongh1/Department-Store-Sales-Time-Series-Forecasting