BGBB model - Data structures to fit models #88

pschil · 2020-04-27T20:30:55Z

While implementing the BGBB model, it became clear that the transaction history alone does not suffice as input for all models. Because the BGBB model is for a discrete-time setting, it requires additional information on the transaction opportunities, which may differ for each customer.

Providing this functionality through the existing clv.data object, which represents the full transaction history, blurs the lines of responsibility (see the Single Responsibility Principle). It would require all kinds of internal case distinctions in clv.data (i.e. does it have transaction opportunities or not?) that hamper maintenance. It would also make the user interface considerably more complicated, although this functionality is in fact only used by a single model.

Rather, one class should do one thing only and do it well. Therefore, to simplify usage and encapsulate distinct functionality in separate objects, I suggest separating the transaction opportunity functionality from the transaction history:

clv.transactions
This is the full transaction history of each customer, to which static and dynamic covariate data can be added. This is what clv.data currently is.

clv.transaction.opportunities
A separate data structure that contains the transaction opportunities (TOs) for every customer, potentially with a duration in case TOs stretch over a period (e.g. a TO is a week). In combination with clv.transactions, this can be used to fit a discrete-time model.

Usage
Fitting a continuous-time model remains the same, while discrete-time models would require an additional input.

clv.trans <- clv.transactions(cdnow, "ymd", "w", 37)
pnbd(clv.trans)

clv.TO <- clv.transaction.opportunities(table)
bgbb(clv.trans, clv.TO)
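
To make the role of the transaction opportunities more concrete, here is a minimal base R sketch of how a per-customer TO table could be combined with a transaction history. It derives, per customer, the number of TOs with at least one transaction (x), the index of the last such TO (t.x), and the total number of TOs (n), which are the quantities a discrete-time model such as the BGBB is usually described with. The table layout, the column names, the helper summarise.customer and the fixed seven-day TO duration are all assumptions made for illustration; none of this is existing package code, and whether the initial purchase should be counted is left aside.

```r
# Hypothetical layout: one row per customer and transaction opportunity (TO).
# 'Start' is the first day of the TO; every TO is assumed to last 7 days.
opportunities <- data.frame(
  Id    = c(1, 1, 1, 1, 2, 2, 2),
  Start = as.Date(c("2020-01-06", "2020-01-13", "2020-01-20", "2020-01-27",
                    "2020-01-13", "2020-01-20", "2020-01-27"))
)

# Transaction history: one row per transaction.
transactions <- data.frame(
  Id   = c(1, 1, 2),
  Date = as.Date(c("2020-01-08", "2020-01-22", "2020-01-14"))
)

to.duration <- 7  # days per TO (assumption)

# Summarise one customer: which of their TOs contain a transaction?
summarise.customer <- function(id) {
  to <- sort(opportunities$Start[opportunities$Id == id])
  tr <- transactions$Date[transactions$Id == id]
  # TRUE for every TO that contains at least one transaction
  hit <- vapply(seq_along(to), function(i) {
    any(tr >= to[i] & tr < to[i] + to.duration)
  }, logical(1))
  data.frame(
    Id  = id,
    x   = sum(hit),                              # TOs with a transaction
    t.x = if (any(hit)) max(which(hit)) else 0L, # index of the last such TO
    n   = length(to)                             # number of TOs
  )
}

bgbb.input <- do.call(rbind, lapply(unique(opportunities$Id), summarise.customer))
print(bgbb.input)
```

Keeping the TOs in their own table, as suggested above, means customers can have different numbers of opportunities without any special cases in the transaction history itself.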

Another common use case is that end users do not have the full transaction history because it can be huge. Instead, users are given a summary of all transactions pulled from some DB (last transaction, number of transactions, mean spending, etc.). To support this use case, I suggest adding two more data structures:

clv.transaction.summary
Contains the minimal information per customer needed to create the model cbs. Notably, this differs from the cbs in that the values given to create it do not already imply a time unit: the recency is not given as a number (e.g. 34) but is rather calculated from dates to allow for different time units. Static covariates can be added, but not dynamic ones.

clv.cbs
Provides an additional way to fit a model, in order to reproduce results from papers such as the BGBB paper, or for expert users familiar with the models. Static covariates can be added, but not dynamic ones. This could replace the way the cbs is currently stored internally (i.e. as a simple data.table). Note that each cbs is specific to one model only (i.e. in its required columns).

Usage

trans.summary <- data.table(Id=1, first.trans="2005-03-01", last.trans="2007-08-21", n.trans=8, mean.spending=41)
clv.summary <- clv.transaction.summary(trans.summary, "ymd", "weeks")
pnbd(clv.summary)

cbs.pnbd <- clv.pnbd.cbs(data.table(Id=1, recency=1, frequency=8, mean.spending=41))
pnbd(cbs.pnbd)
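
The difference between the summary and the cbs comes down to when the time unit is fixed. As a minimal base R sketch (the helper name recency.in and the dates are illustrative assumptions, not package code), the same pair of dates yields a different recency depending on the chosen unit, which is why the summary would store dates rather than a number:

```r
# Illustrative dates for a single customer.
first.trans <- as.Date("2005-03-01")
last.trans  <- as.Date("2007-08-21")

# Hypothetical helper: time between first and last transaction in a given unit.
recency.in <- function(first, last, unit = c("days", "weeks")) {
  unit <- match.arg(unit)
  as.numeric(difftime(last, first, units = unit))
}

recency.days  <- recency.in(first.trans, last.trans, "days")   # 903
recency.weeks <- recency.in(first.trans, last.trans, "weeks")  # 129
```

A cbs, in contrast, would carry only the final number (903 or 129) and therefore already implies the time unit.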

@bachmannpatrick @mmeierer @niels89 critique and comments?

pschil · 2020-04-29T20:54:18Z

As discussed with Patrick, it might be more desirable to have entirely distinct classes for continuous- and discrete-time data. The reasons are to make users aware of the differences, and that the plots and summary statistics to produce are inherently different.

mmeierer · 2020-05-01T21:22:58Z

I see the reasons for having two different classes and agree with your line of argumentation.

pschil mentioned this issue Apr 27, 2020

Add BG/BB model #8

Open

mmeierer assigned pschil, niels89, mmeierer and bachmannpatrick Apr 27, 2020

mmeierer added the enhancement New feature or request label Apr 27, 2020

mmeierer added this to To do in v0.6 via automation Apr 27, 2020

mmeierer added this to the v0.6 milestone Apr 27, 2020

pschil removed this from To do in v0.6 Jun 2, 2020

pschil added this to To do in v0.7 via automation Jun 2, 2020

pschil modified the milestones: v0.6, v0.7 Jun 2, 2020

mmeierer removed this from To do in v0.7 Jun 16, 2020

mmeierer added this to To do in v1.0 via automation Jun 16, 2020

mmeierer modified the milestones: v0.7, v1.0 Jun 16, 2020

mmeierer moved this from To do to In progress in v1.0 Jun 16, 2020

mmeierer changed the title ~~Data structures to fit models~~ BGBB model - Data structures to fit models Jun 16, 2020

mmeierer modified the milestones: v1.0, v0.9 Oct 2, 2020

mmeierer removed this from In progress in v1.0 Oct 2, 2020

mmeierer added this to To do in v1.1 via automation Oct 2, 2020

mmeierer unassigned niels89 Oct 2, 2020

mmeierer modified the milestones: v0.9, v1.1 Jan 30, 2021

mmeierer added the help wanted Extra attention is needed label Mar 2, 2021

mmeierer removed this from the v1.1 milestone Apr 19, 2021
