# 23 Long Term Treatment Effects on Lifetime Value

## Climbing the Right Hill

Let's take a moment to review where we are and what we've learned. First, we learned to estimate the average treatment effect (ATE):

$$
E[Y_1 - Y_0] \ \text{or} \ E[\partial{Y(t)}]
$$

This allowed us to say with confidence how large the effect of our interventions was on some target outcome we wished to change. The ATE tells us what would happen, on average, if we changed the treatment for everyone. For example, say the ATE of increasing prices by BRL 1.00 is a decrease of 2 units in products sold per month. This means that, on average, customers would buy 2 fewer products per month than they would have if the price hadn't changed. This **does not** mean that every customer will buy 2 fewer units. Some might still buy the same amount. Some might buy 10 fewer units. But, averaging all that out, sales should decrease by 2 units per customer.
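To make the "on average" part concrete, here is a minimal sketch with made-up numbers. The ten individual effects below are purely hypothetical (in real data we never observe them directly), but they show how an ATE of -2 can coexist with customers who don't react at all and one who reacts a lot.

```python
import numpy as np

# Hypothetical individual effects of a BRL 1.00 price increase on units bought
# per month, for 10 customers. These are invented for illustration only:
# individual effects are not observable in practice.
individual_effects = np.array([0, 0, 0, 0, 0, -2.5, -2.5, -2.5, -2.5, -10.0])

# The ATE is just the average of the individual effects.
ate = individual_effects.mean()

# Share of customers whose purchases don't change at all.
share_unaffected = (individual_effects == 0).mean()

print(f"ATE: {ate}")
print(f"Share of customers with zero effect: {share_unaffected}")
```

Half of these customers don't change their behaviour, yet the average effect is still a drop of 2 units.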

[IMG]

After we covered the ATE, we learned about the conditional average treatment effect (CATE), or treatment effect heterogeneity:

$$
E[Y_1 - Y_0 | X] \ \text{or} \ E[\partial{Y(t)}|X]
$$

where \\(X\\) is unit level characteristics. The CATE allowed us to take into account that units respond differently to the same treatment. For example, some customers might not be very sensitive to price increases while others might be super sensitive. Modeling the conditional average treatment effect allows us to estimate this difference in sensitivity to the treatment on a unit level. 

[IMG]

CATE estimation is also incredibly powerful because it allows us to personalise the treatment in a sound manner. We can give different treatments to different customers based on how they respond to them. For example, we can increase prices only for high income customers who are not very sensitive to price increases.

All in all, we can see that causal inference is an amazing tool to optimise any business strategy. With the ATE, we can understand which course of action or intervention would be better on average. With CATE we can personalise different interventions for different customers. 

So, we've covered how to estimate the impact of a treatment \\(T\\), how that impact can be differentiated by covariates \\(X\\), but we still haven't talked much about \\(Y\\). In my head, that is mildly concerning. We saw how causal inference can be an incredible tool for hill climbing a business objective, but all of it can be for nothing if we pick the wrong hill to climb.

[IMG]

Deciding the right thing to optimise is probably the most challenging part of any new data science project, so I can't promise an ultimate solution for everything. But I do hope that by discussing it, you will get a valuable intuition about how to approach the problem and the most common tools at your disposal. In this chapter, I'll walk you through the process of thinking about a solid \\(Y\\) to optimise and the causal inference challenges that come with it.

## Valuation Crash Course

I'm going to argue that the objective of any company (or firm, as we call them in Economics) is profit maximisation. You can reply that some companies also care about social good, the environment, ESG, yada yada. That might be true. Or not. (It could be only an elaborate marketing scheme to, once again, maximise profit). The point is, in this book, I'll go with the standard Economic theory which says that ["the social responsibility of business is to increase its profits"](https://en.wikipedia.org/wiki/Friedman_doctrine).

If profit is THE thing we are interested in, that makes it a good candidate for the outcome \\(Y\\) we wish to optimise with our treatment. This seems fairly obvious. What is still not obvious is how to define profit. Keep in mind that we want to optimise it either in aggregate, by estimating the ATE \\(E[\partial{Y_i(t)}]\\), or by personalising with the CATE \\(E[\partial{Y_i(t)}|X_i]\\). To do that, it would be great if we could define profit on the unit \\(i\\) level. Ok, we can do that. If we think about it, **profit is revenues - costs**. So, if we can compute how much revenue each unit is generating and at what cost, we can place a number on the unit level profit. 

But there is another issue. What time frame will we consider? One month? One year? This is not an innocuous question. Lots of companies have large upfront costs to attract new customers, and those costs only pay off in the long run. One way to visualise this is to compute the **unit level cash flow**. We've seen this before. 


![img](./data/img/industry-ml/cashflow-1.png)

A typical customer cash flow starts with lots of costs (red), generally marketing costs. Then, with time, the customer generates revenues (blue) and some costs. Hopefully, the revenues are bigger than the costs and, eventually, the customer will **break even**, meaning that the revenues it generated finally compensated for the previous costs. We can see that by plotting the cumulative cash flow.

![img](./data/img/industry-ml/cascade-1.png)
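If you want to play with this idea yourself, here is a small sketch of how the cumulative cash flow and the break-even point could be computed for a single customer. The monthly revenues and costs are invented for illustration, and `break_even_period` is a hypothetical helper, not something from a library.

```python
import numpy as np

# Hypothetical monthly revenues and costs (in BRL) for one customer.
# Costs are front-loaded, as in a typical acquisition-heavy business.
revenues = np.array([0, 0, 40, 60, 80, 80, 80])
costs = np.array([100, 50, 10, 10, 10, 10, 10])

# Unit level cash flow: revenues minus costs in each period.
cash_flow = revenues - costs

# Cumulative cash flow: total profit accumulated up to each period.
cumulative_cash_flow = cash_flow.cumsum()


def break_even_period(cash_flow):
    """Return the first period (0-indexed) where the cumulative cash flow
    is no longer negative, or None if that never happens."""
    recovered = np.cumsum(cash_flow) >= 0
    return int(np.argmax(recovered)) if recovered.any() else None


print("Cash flow:           ", cash_flow)
print("Cumulative cash flow:", cumulative_cash_flow)
print("Break-even period:   ", break_even_period(cash_flow))
```

Here I'm treating "break even" as the first period in which the cumulative cash flow reaches zero or more. That is one reasonable convention, not the only one.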

If everything goes well, the customer will keep generating more and more revenue. Eventually, the customer will stop buying for whatever reason, which will make the cashflow equal to zero from then on. 

You can probably see where this is going. We need to define a time frame over which we will compute a customer's profit but, if we pick too short a time frame, customers that will eventually be profitable will look unprofitable. So we should probably prefer a long time frame over a short one. But how long exactly? Well... ideally, infinite. 


### Discount Rates and Net Present Value

That's crazy! We can't possibly compute an infinite time frame. What were you thinking?! The trick is to realize that the value of money changes over time. Think about it this way: would you rather have BRL 1000.00 now or next year? Not a particularly hard question to answer, right? Ok, next question: why do you prefer it now rather than later? That's a more interesting one. 

The answer lies in the fact that money changes value over time. Money rewards you for your patience and for delaying gratification. That's why, if you take BRL 1000.00 now and invest it for one year, you will end up with more than BRL 1000.00. For instance, if the interest rate is 10% per year, you will end up with BRL 1100.00. This also means that BRL 1100.00 one year from now is worth BRL 1000.00 now. Take a moment to appreciate how amazing this is. We've just taken a monetary value in the future and converted it to a monetary value now. Which also means we can take a cash flow over time and convert it to a single value now.


[IMG]

The value of a future monetary amount now is called the **present value**. To bring any value in the future to its present value, all we have to do is multiply it by a discount factor, which depends on a discount rate \\(r\\) applied over a sequence of \\(t\\) periods:

$$
\dfrac{1}{(1+r)^t}
$$

In our example above, the discount rate was 10% per year because that's the rate at which we could invest our money. Simply put, the discount rate is the rate at which you can take your money and make it grow with time. Picking the correct discount rate is not that simple, though. Companies tend to use the **cost of capital** as a discount rate. That is, how much they have to pay back in order to get investments. This will depend on how risky the business is and how the general economy is going. My advice: go talk to people in your accounting/finance department. Not only will they tell you the right rate to use, but they will also be thrilled that you are interested in this sort of stuff.
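To see the discount factor in action, here is a minimal sketch that discounts a whole cash flow back to the present and sums it up, which is what is usually called the net present value (NPV). The function `net_present_value` is a hypothetical helper written for this example, the customer cash flow is made up, and the 1% monthly rate is just an assumption for illustration. I'm also assuming the first element of the cash flow happens now (at \\(t=0\\)), so it is not discounted.

```python
import numpy as np


def net_present_value(cash_flow, rate):
    """Discount each period's cash flow by 1 / (1 + rate)^t and sum the result.

    Assumes cash_flow[0] happens now (t=0), cash_flow[1] one period from now,
    and so on, with `rate` expressed per period.
    """
    cash_flow = np.asarray(cash_flow, dtype=float)
    periods = np.arange(len(cash_flow))           # t = 0, 1, 2, ...
    discount_factors = 1.0 / (1.0 + rate) ** periods
    return float(np.sum(cash_flow * discount_factors))


# BRL 1100.00 one year from now, at 10% per year, is worth about BRL 1000.00 now.
pv_of_1100 = net_present_value([0, 1100], rate=0.10)

# A made-up customer cash flow with upfront costs, at an assumed 1% per month.
customer_cash_flow = [-100, -50, 30, 50, 70, 70, 70]
customer_npv = net_present_value(customer_cash_flow, rate=0.01)

# A very long stream of BRL 100.00 per year, starting next year, at 10% per year.
# Far-away payments are discounted so heavily that the sum settles near 100 / 0.10.
long_stream_npv = net_present_value([0] + [100] * 200, rate=0.10)

print(f"Present value of BRL 1100 next year: {pv_of_1100:.2f}")
print(f"Customer NPV: {customer_npv:.2f} (undiscounted sum: {sum(customer_cash_flow)})")
print(f"NPV of 200 years of BRL 100 per year: {long_stream_npv:.2f}")
```

The last example is why a very long (even infinite) time frame is not as crazy as it sounds: with a positive discount rate, cash flows far in the future contribute almost nothing to the present value, so the sum stays finite.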


## We are Here for the Long Run

Talk about cash flows, long vs short term and NPV.

## Short Term Metrics

Talk about the variance in LT. Talk about short term proxies.
