I was wondering what the "label times" would look like for a problem statement such as "Likelihood of making a purchase in the next 7 days". The data at hand is the customer data and all previous transactions (orders).
This guide (https://compose.alteryx.com/en/stable/examples/predict_next_purchase.html) is very similar, but with one big difference: it focuses on whether a customer will buy a particular product, not any product. Because of this difference, the guide gets both positive and negative labels.
If I use the same labeling function (slightly modified to return true if any purchase happened in that slice), I will get all positive labels, because the original data only contains records where the user did purchase something!
One solution I can think of is passing `drop_empty=False` to `lm.search`, so that I get all slices even when no purchase happened in a slice. If this is the right approach, I have two related questions:
1. Will this dataset be biased towards negative labels, since there might be long periods when a customer made no purchase?
2. Since we are dealing with a 7-day window, the order_date might fall on any day of the slice. Is this OK?
Inference will run every 2-3 days, and the data will be handed over to the marketing team.
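To make the idea concrete, here is a minimal sketch of the labels I would expect, written in plain Python rather than Compose. The function name `make_label_times`, the sample orders, and the 3-day step between cutoff times are assumptions for illustration only and are not part of the Compose API. Each row is a cutoff time plus a boolean saying whether any order fell in the following 7 days, so windows with no orders come out as negative labels.

```python
from datetime import datetime, timedelta

def make_label_times(orders, start, end, window_days=7, gap_days=3):
    """Build (customer_id, cutoff_time, label) rows.

    orders: dict mapping customer_id -> list of order datetimes (hypothetical input shape).
    label is True if any order falls in (cutoff, cutoff + window_days].
    """
    window = timedelta(days=window_days)
    gap = timedelta(days=gap_days)
    rows = []
    for customer_id, times in orders.items():
        cutoff = start
        # Only keep cutoffs whose full prediction window fits inside the observed data.
        while cutoff + window <= end:
            label = any(cutoff < t <= cutoff + window for t in times)
            rows.append((customer_id, cutoff, label))
            cutoff += gap  # overlapping windows, one cutoff every gap_days
    return rows

# Made-up sample data: two customers with sparse purchases.
orders = {
    "A": [datetime(2023, 1, 3), datetime(2023, 1, 20)],
    "B": [datetime(2023, 1, 10)],
}
rows = make_label_times(orders, datetime(2023, 1, 1), datetime(2023, 1, 29))

# Share of positive labels, to check how imbalanced the classes are.
positive_rate = sum(label for _, _, label in rows) / len(rows)
```

In this sketch it does not matter on which day of the window the order falls, since the label only asks whether at least one order exists in the window. The positive rate gives a quick read on how skewed the labels are towards negatives.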
Sorry if this seems like a very trivial problem; somehow I was not able to find enough resources online on how to handle this.