While implementing the BGBB model, it became clear that the transaction history does not suffice as input for all models. Because the BGBB model is for a discrete-time setting, it requires additional information on the transaction opportunities, which may differ for each customer.
Providing this functionality through the existing clv.data object, which represents the full transaction history, blurs the lines of responsibility and violates the Single Responsibility Principle. It would require all kinds of internal case distinctions in clv.data (i.e. has transaction opportunities or not?) that hamper maintenance. It would also make the user interface considerably more complicated, although this functionality is in fact only used by a single model.
Rather, one class should do one thing only and do it well. Therefore, to simplify usage and encapsulate distinct functionality in separate objects, I suggest separating the transaction opportunity functionality from the transaction history:
clv.transactions
This is the full transaction history of each customer, to which static and dynamic covariate data can be added. This is what clv.data currently is.
clv.transaction.opportunities
A separate data structure that contains the transaction opportunities (TOs) for every customer, potentially with a duration in case a TO stretches over a period (e.g. a TO is a week). In combination with clv.transactions, this can be used to fit a discrete-time model.
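To make the separation more concrete, here is a minimal sketch in plain R (S4, no CLVTools code). The slot names, the column names (Id, Date, Start, End), and the helper count.used.opportunities are assumptions for illustration only and not an existing API. It shows how the two objects could be combined to derive, per customer, how many TOs contain at least one transaction.

```r
library(methods)

# Hypothetical class: full transaction history, one row per transaction.
setClass("clv.transactions", representation(history = "data.frame"))

# Hypothetical class: one row per customer and transaction opportunity (TO).
# Start and End are dates, so a TO may span a period such as a week.
setClass("clv.transaction.opportunities",
         representation(opportunities = "data.frame"))

# Hypothetical helper: per customer, count the TOs in which at least
# one transaction occurred.
count.used.opportunities <- function(transactions, opportunities) {
  h <- transactions@history
  o <- opportunities@opportunities
  used <- vapply(seq_len(nrow(o)), function(i) {
    any(h$Id == o$Id[i] & h$Date >= o$Start[i] & h$Date <= o$End[i])
  }, logical(1))
  tapply(used, o$Id, sum)
}

# Made-up example data: two customers, three weekly TOs each.
trans <- new("clv.transactions", history = data.frame(
  Id   = c("A", "A", "B"),
  Date = as.Date(c("2020-01-02", "2020-01-09", "2020-01-03")),
  stringsAsFactors = FALSE))

weeks <- as.Date(c("2020-01-01", "2020-01-08", "2020-01-15"))
tos <- new("clv.transaction.opportunities", opportunities = data.frame(
  Id    = rep(c("A", "B"), each = 3),
  Start = rep(weeks, times = 2),
  End   = rep(weeks + 6, times = 2),
  stringsAsFactors = FALSE))

used <- count.used.opportunities(trans, tos)
```

Keeping the opportunities in their own object means clv.transactions needs no knowledge of TOs at all, and customers with different TOs are simply different rows.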
Usage
Fitting a continuous-time model remains the same, while a discrete-time model would require an additional input.
Another common use case is that end users do not have the full transaction history because it can be huge. Rather, users are given a summary of all transactions pulled from some DB (last transaction, number of transactions, mean spending, etc.). To support this use case, I suggest adding two further data structures:
clv.transaction.summary
Contains the minimal information per customer needed to create the model cbs. Notably, this differs from the cbs in that the values given to create it do not already imply a time unit: the recency is not given as a number (e.g. 34) but is rather calculated from dates, to allow for different time units. It allows adding static covariates but not dynamic ones.
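As an illustration of why dates instead of numbers are useful here, a minimal base R sketch follows. The column names and the helper recency are assumptions for illustration, not part of the package. The same summary data yields the recency in whatever time unit is requested.

```r
# Hypothetical summary input: dates rather than pre-computed numbers.
cust.summary <- data.frame(
  Id                = c("A", "B"),
  first.transaction = as.Date(c("2020-01-01", "2020-02-01")),
  last.transaction  = as.Date(c("2020-03-11", "2020-02-01")),
  stringsAsFactors  = FALSE)

# Hypothetical helper: time between first and last transaction,
# expressed in the requested time unit.
recency <- function(first, last, time.unit = c("days", "weeks")) {
  time.unit <- match.arg(time.unit)
  as.numeric(difftime(last, first, units = time.unit))
}

recency.days  <- recency(cust.summary$first.transaction,
                         cust.summary$last.transaction, "days")
recency.weeks <- recency(cust.summary$first.transaction,
                         cust.summary$last.transaction, "weeks")
```

For customer A the recency is 70 when measured in days and 10 when measured in weeks, which is exactly the information a plain number such as 34 would lose.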
clv.cbs
Provides an additional way to fit a model, either to reproduce results from papers (such as for the BGBB) or for expert users familiar with the models. It allows adding static covariates but not dynamic ones. This could replace the way the cbs is currently stored internally (i.e. as a simple data.table). Note that a cbs is specific to one model only (i.e. in its required columns).
As discussed with Patrick, it might be more desirable to have entirely distinct classes for continuous- and discrete-time data. The reasons are to make users aware of the differences, and that the plots and summary statistics to produce are inherently different.
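Since a cbs is specific to one model, such a class could check the required columns when it is created. The following is only a sketch in plain S4. The model keys, the required column names, and the slot layout are assumptions for illustration, and a data.frame stands in for the data.table to keep the example dependency-free.

```r
library(methods)

# Hypothetical lookup of required cbs columns per model (names are assumed).
required.cbs.columns <- list(
  pnbd = c("Id", "x", "t.x", "T.cal"),
  bgbb = c("Id", "x", "t.x", "n.cal"))

# Hypothetical class: a cbs tied to exactly one model, validated on creation.
setClass("clv.cbs",
  representation(model = "character", cbs = "data.frame"),
  validity = function(object) {
    if (length(object@model) != 1) {
      return("model must be a single string")
    }
    required <- required.cbs.columns[[object@model]]
    if (is.null(required)) {
      return(paste("unknown model:", object@model))
    }
    missing.cols <- setdiff(required, colnames(object@cbs))
    if (length(missing.cols) > 0) {
      return(paste("missing columns:", paste(missing.cols, collapse = ", ")))
    }
    TRUE
  })

# A cbs with all columns assumed to be required for the BGBB is accepted.
cbs.ok <- new("clv.cbs", model = "bgbb",
              cbs = data.frame(Id = "A", x = 2, t.x = 3, n.cal = 6))

# A cbs lacking n.cal is rejected by the validity check.
cbs.bad <- tryCatch(
  new("clv.cbs", model = "bgbb", cbs = data.frame(Id = "A", x = 2, t.x = 3)),
  error = function(e) "rejected")
```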
@bachmannpatrick @mmeierer @niels89 critique and comments?