
MarcellaHarr/favorita-store-sales


Corporación Favorita Analysis

source: https://marcasecuador.club/corporacion-favorita/


Introduction

This repository showcases a time-series forecasting model that predicts store sales for Corporación Favorita, a major supermarket corporation based in Ecuador. The data comes from Kaggle's Store Sales - Time Series Forecasting competition, a "Getting Started" competition that runs indefinitely with a rolling leaderboard. My goal is to build a model that accurately predicts the unit sales of thousands of products sold at various Favorita stores. I also want to gain a deeper understanding of the data and the corporation through a full statistical analysis. The project gives me the opportunity to practice my machine-learning skills, enter my first Kaggle competition, and gain real-world experience.


Data

In this competition, I need to predict sales for the product families sold at Favorita stores located in Ecuador. The training dataset records the dates, stores, and product families, the number of items on promotion, and the sales figures. Supplementary files provide additional information that may be useful in building my models. These are the datasets used:

1. train.csv dataset:
| Column name | Description |
| --- | --- |
| store_nbr | Identifies the store at which the products are sold |
| family | Identifies the type of product sold |
| sales | Total sales for a product family at a particular store on a given date. Fractional values are possible because products can be sold in fractional units (e.g., 1.5 kg of cheese) |
| onpromotion | Total number of items in a product family that were being promoted at a store on a given date |
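
To make the column descriptions concrete, here is a minimal sketch of loading a train.csv-style file with pandas. The sample rows, the family names, and the helper `load_train` are made up for illustration, and the sketch assumes the file also carries `id` and `date` columns like test.csv does.

```python
import io

import pandas as pd

# Tiny made-up sample standing in for train.csv (assumed to include id and date).
SAMPLE_TRAIN = io.StringIO(
    "id,date,store_nbr,family,sales,onpromotion\n"
    "0,2017-01-01,1,DAIRY,12.0,0\n"
    "1,2017-01-01,1,CHEESE,1.5,2\n"
    "2,2017-01-02,1,DAIRY,15.0,1\n"
)


def load_train(source):
    """Read a train.csv-style file with parsed dates and explicit dtypes."""
    return pd.read_csv(
        source,
        parse_dates=["date"],
        dtype={
            "store_nbr": "int16",
            "family": "category",
            "sales": "float64",  # fractional units are possible
            "onpromotion": "int32",
        },
    )


train = load_train(SAMPLE_TRAIN)

# Total sales per (store, family) pair in the sample.
totals = train.groupby(["store_nbr", "family"], observed=True)["sales"].sum()
print(totals)
```

With the real data, `load_train` would be given the path to the downloaded train.csv instead of the in-memory sample.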



2. test.csv dataset:
| Column name | Description |
| --- | --- |
| id | Unique ID for the row |
| date | Date of the sales forecast |
| store_nbr | Identifies the store at which the products are sold |
| family | Identifies the type of product sold |
| onpromotion | Total number of items in a product family that were being promoted at a store on a given date |



3. sample_submission.csv dataset:
| Column name | Description |
| --- | --- |
| id | Unique ID for the row |
| sales | Predicted sales for the corresponding id in test.csv |
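
The submission format can be sketched as follows. The helper `make_submission`, the example ids and predictions, and the output file name are hypothetical; clipping negative predictions to zero is my own assumption (sales cannot be negative), not a rule stated by the competition.

```python
import pandas as pd


def make_submission(test_ids, predictions):
    """Pair each test id with a non-negative predicted sales value."""
    test_ids = list(test_ids)
    predictions = list(predictions)
    if len(test_ids) != len(predictions):
        raise ValueError("need exactly one prediction per test row")
    submission = pd.DataFrame({"id": test_ids, "sales": predictions})
    # Assumption: clip negative model output to zero, since sales cannot be negative.
    submission["sales"] = submission["sales"].clip(lower=0.0)
    return submission


# Made-up ids and model outputs for illustration.
submission = make_submission([100, 101, 102], [3.2, -0.4, 0.0])
print(submission)
# submission.to_csv("submission.csv", index=False)  # hypothetical output path
```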



4. stores.csv dataset:
| Column name | Description |
| --- | --- |
| store_nbr | Identifies the store |
| city | City where the store is located |
| state | State where the store is located |
| type | Type of store (A, B, or C) |
| cluster | Grouping of similar stores |
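
Because stores.csv shares the `store_nbr` key with the sales data, the store attributes can be attached with a left join. The sketch below uses made-up rows in place of the real files; the variable names are illustrative only.

```python
import pandas as pd

# Made-up rows standing in for train.csv and stores.csv.
sales = pd.DataFrame(
    {
        "store_nbr": [1, 1, 2],
        "family": ["DAIRY", "BREAD", "DAIRY"],
        "sales": [12.0, 4.0, 7.5],
    }
)
stores = pd.DataFrame(
    {
        "store_nbr": [1, 2],
        "city": ["Quito", "Guayaquil"],
        "state": ["Pichincha", "Guayas"],
        "type": ["A", "B"],
        "cluster": [3, 5],
    }
)

# Left join keeps every sales row; validate raises if a store appears twice in stores.
enriched = sales.merge(stores, on="store_nbr", how="left", validate="many_to_one")

# Example aggregate: total sales per store type.
sales_by_type = enriched.groupby("type")["sales"].sum()
print(sales_by_type)
```

The `validate="many_to_one"` argument is a cheap guard against accidentally duplicating sales rows when the lookup table contains repeated keys.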



5. oil.csv dataset:
| Column name | Description |
| --- | --- |
| date | Date of the oil price |
| dcoilwtico | Daily oil prices (West Texas Intermediate) |
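
Sales are recorded per calendar day, while a market price series may have gaps (for example, no quotes on weekends, or missing values). One possible way to align the two, shown here as a sketch on made-up prices, is to reindex the oil series to daily frequency and carry the last known price forward. Forward-filling is my assumption; interpolation would be an equally reasonable choice.

```python
import pandas as pd

# Made-up prices: Thursday, Friday (missing value), and the following Monday.
oil = pd.DataFrame(
    {
        "date": pd.to_datetime(["2017-01-05", "2017-01-06", "2017-01-09"]),
        "dcoilwtico": [53.0, None, 52.0],
    }
)

daily_oil = (
    oil.set_index("date")["dcoilwtico"]
    .asfreq("D")  # add the missing calendar days as NaN
    .ffill()      # carry the last known price forward
)
print(daily_oil)
```

Any leading missing values would stay empty under forward-fill, so a real pipeline would need a rule for the first rows as well.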



6. holidays_events.csv dataset:
| Column name | Description |
| --- | --- |
| date | Date of the holiday/event |
| type | Type of holiday/event (Holiday, Additional, Bridge, or Transfer) |
| locale | Locale of the holiday/event (National, Regional, or Local) |
| locale_name | Name of the holiday/event locale |
| description | Description of the holiday/event |
| transferred | Indicates whether a holiday was officially transferred to another date by the government |
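
The `transferred` flag matters when turning this file into a feature. The sketch below assumes that a row with `transferred` set to True marks the original calendar date, which was not actually observed, and that the corresponding Transfer row marks the day that was celebrated. The rows, descriptions, and the `is_national_holiday` column name are made up for illustration.

```python
import pandas as pd

# Made-up rows standing in for holidays_events.csv.
holidays = pd.DataFrame(
    {
        "date": pd.to_datetime(["2017-01-01", "2017-01-02", "2017-04-14"]),
        "type": ["Holiday", "Transfer", "Holiday"],
        "locale": ["National", "National", "National"],
        "locale_name": ["Ecuador", "Ecuador", "Ecuador"],
        "description": ["Example holiday", "Example holiday (moved)", "Another holiday"],
        "transferred": [True, False, False],
    }
)

# Keep only national holidays that were observed on their listed date.
observed = holidays[~holidays["transferred"] & (holidays["locale"] == "National")]

# Flag each calendar day that coincides with an observed national holiday.
calendar = pd.DataFrame({"date": pd.date_range("2017-01-01", "2017-01-03")})
calendar["is_national_holiday"] = calendar["date"].isin(observed["date"])
print(calendar)
```

Regional and local holidays would need an extra join on `locale_name` against the store's state or city, which this sketch leaves out.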

Citation: Alexis Cook, DanB, inversion, Ryan Holbrook. (2021). Store Sales - Time Series Forecasting. Kaggle.

