"Distributed Data Analysis and Mining" Class' Team Project - MSc in Data Science and Business Informatics @ University of Pisa

Grade0/DDAM

DDAM


eCommerce behavior Dataset

End of Course Project A.Y. 2023/24
Read the report »

Data source · View code

Team members

About the Dataset

eCommerce behavior data from multi-category store

The dataset covers a single month (March 2020) of behavior data from a large multi-category online store.

Each row in the file represents one event. Every event relates a product to a user, so the events together form a many-to-many relation between products and users.

There are different types of events. Each row can be read as follows:
User user_id, during session user_session, added to the shopping cart (property event_type equals cart) product product_id of brand brand and category category_code, with price price, at event_time.
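
To make the reading rule concrete, here is a minimal sketch in plain Python that turns one event record into the sentence above. The field names come from the description; the `Event` class, the `describe` helper, the `PHRASES` table, and all sample values are hypothetical and used only for illustration. Only the `cart` event type is confirmed by the text, so any other type falls back to a generic phrase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One row of the event log (field names taken from the description)."""
    event_time: str
    event_type: str
    product_id: str
    category_code: str
    brand: str
    price: float
    user_id: str
    user_session: str


# Hypothetical phrasing table: only "cart" is mentioned in the text above.
PHRASES = {"cart": "added to shopping cart"}


def describe(e: Event) -> str:
    """Render an event as a human-readable sentence."""
    # Unknown event types get a generic fallback phrase.
    action = PHRASES.get(e.event_type, f"performed '{e.event_type}' on")
    return (
        f"User {e.user_id} during session {e.user_session} {action} "
        f"product {e.product_id} of brand {e.brand} "
        f"of category {e.category_code} "
        f"with price {e.price:.2f} at {e.event_time}."
    )


# Made-up sample values, not taken from the real dataset.
sample = Event(
    event_time="2020-03-01 10:15:00 UTC",
    event_type="cart",
    product_id="1001",
    category_code="electronics.smartphone",
    brand="acme",
    price=199.9,
    user_id="42",
    user_session="s-1",
)
sentence = describe(sample)
print(sentence)
```

The same mapping would apply to every row once the file is loaded, for example as a Spark DataFrame; that loading step is not shown here because the README does not describe it.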

About the project

This project analyzes a large volume of eCommerce data to predict user behavior, using data mining techniques and Hadoop (Spark) tools.

The project is divided into four parts as follows:

  • Data Reduction
  • Understanding & Preparation
  • Features Extraction
  • Classification
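
The README does not say how each stage is implemented, so the following is only an illustrative sketch of what a feature extraction step could look like: events are grouped by session, counted per event type, and a binary label is derived for a later classifier. It uses plain Python rather than Spark so that it runs without a cluster. The `view` and `purchase` event types, the feature names, and the sample rows are assumptions, not taken from the project.

```python
from collections import Counter, defaultdict

# Hypothetical (user_session, event_type) pairs. Only "cart" is confirmed by
# the dataset description; "view" and "purchase" are assumed event types.
events = [
    ("s-1", "view"),
    ("s-1", "cart"),
    ("s-1", "purchase"),
    ("s-2", "view"),
    ("s-2", "view"),
]


def session_features(rows):
    """Count events per type for each session and derive a purchase label."""
    counts = defaultdict(Counter)
    for session, event_type in rows:
        counts[session][event_type] += 1

    features = {}
    for session, c in counts.items():
        features[session] = {
            "n_views": c["view"],                 # Counter returns 0 if absent
            "n_carts": c["cart"],
            "purchased": int(c["purchase"] > 0),  # candidate target label
        }
    return features


features = session_features(events)
print(features)
```

On a distributed dataset the same idea would typically be expressed as a group-by on user_session followed by per-type aggregations, with the resulting table fed to the classification stage.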
