Predicted LTV Too High #313
Comments
What's the correlation between frequency and monetary value in your dataset?
Please, whenever posting code, don't use screenshots; copy-paste the code itself. It's not only easier on the eyes, but it will most likely help those who are trying to help. Even the example table you showed can be pasted as code.
@orenshk's question is relevant and is something you could show us before we arrive at any conclusions. Another thing you could plot in order to better diagnose a problem is the |
Here's the correlation data:
import scipy.stats
scipy.stats.spearmanr(rfm_data['frequency'], rfm_data['monetary_value'])
SpearmanrResult(correlation=0.0305697204235341, pvalue=0.11448729124506023)
scipy.stats.pearsonr(rfm_data['frequency'], rfm_data['monetary_value'])
(-0.04303217429907956, 0.02626314703027851)
import matplotlib.pyplot
matplotlib.pyplot.scatter(rfm_data['frequency'], rfm_data['monetary_value'])
Apologies for copy-pasting an image; here's the code:
# Perform average monetary value analysis for repeat purchasers only
from lifetimes import GammaGammaFitter
returning_customers_summary = rfm_data[rfm_data['frequency'] > 0]
ggf = GammaGammaFitter(penalizer_coef=0)
ggf.fit(
    returning_customers_summary['frequency'],
    returning_customers_summary['monetary_value']
)
# Calculate expected average order value for each customer
rfm_data['average_order_revenue'] = ggf.conditional_expected_average_profit(
    rfm_data['frequency'],
    rfm_data['monetary_value']
)
# Predict LTV over period
# There appears to be an issue in this method
# see https://github.com/CamDavidsonPilon/lifetimes/issues/313
rfm_data['90_day_predicted_ltv'] = ggf.customer_lifetime_value(
    bgf,
    rfm_data['frequency'],
    rfm_data['recency'],
    rfm_data['monetary_value'],
    rfm_data['T'],
    time=3,  # Number of months to predict
    discount_rate=0.01,
    freq='D'
)
rfm_data['90_day_predicted_ltv_manual'] = (
    rfm_data['90_day_expected_purchases'] * rfm_data['average_order_revenue']
).round(2)
rfm_data
It seems like there must be an issue in the code at lines 472 to 478 in 95b4d2a.
This does seem pretty weird indeed.
But isn't what the
I will have to do some more digging but, unfortunately, I don't have much time for that right now (maybe in a week or two). At any rate, you should check the fit of your Gamma-Gamma distributions. The first thing you could do is check the negative log likelihood and see if by varying the |
I've a good idea of what's causing this, which I've summed up here and will be working on in the |
Hi Cameron,
Thanks for creating this great library!
I'm having a few problems with the values predicted using customer_lifetime_value(). I would expect the LTV to be a slightly discounted version of predicted purchases * average order value, but as shown in the screenshot below it's actually coming out much higher.
In one example, a customer who has only spent $180 over 2 years is predicted to have a 12 month value of $2670.
Is there an error somewhere in the calculation being caused by the fix for Issue #180?
Cheers,
Duncan
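To make the expectation in the report above concrete, here is a minimal sketch of a month-by-month discounted sum. It is not the library's implementation: discounted_ltv, toy_cumulative, the 30-day month, and the toy numbers are all assumptions made up for illustration. It only shows why a discounted LTV should come out slightly below predicted purchases * average order value.

```python
def discounted_ltv(cumulative_purchases, avg_order_value, months,
                   monthly_discount_rate, days_per_month=30):
    """Sum each month's expected revenue, discounted back to today.

    cumulative_purchases(days) is a hypothetical stand-in for a model's
    prediction of the expected number of purchases within `days` days.
    """
    total = 0.0
    for month in range(1, months + 1):
        # Purchases expected in this month only: difference of two
        # cumulative predictions (end of month minus start of month).
        in_month = (cumulative_purchases(month * days_per_month)
                    - cumulative_purchases((month - 1) * days_per_month))
        total += avg_order_value * in_month / (1 + monthly_discount_rate) ** month
    return total


# Toy customer (assumed numbers): 0.01 expected purchases per day, $60 per order.
def toy_cumulative(days):
    return 0.01 * days


undiscounted = toy_cumulative(90) * 60.0
discounted = discounted_ltv(toy_cumulative, 60.0, months=3,
                            monthly_discount_rate=0.01)
print(round(undiscounted, 2), round(discounted, 2))
```

With a 1% monthly discount rate over three months, the discounted figure should sit only a few percent below the undiscounted one, so a result many times larger than the manual product points at the calculation rather than at the discounting.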