## Initial considerations

Through the Low Probability Form tool, we aim to offer the customer a proposal that increases their chances of staying with the company.<br>
The tool complements the usual customer service procedures and makes recommendations based on the predicted churn probability. The probability ranges will be defined by costs, available capacity, and the chosen strategy.<br>
The argument will be quantitative in parts and qualitative in others, since this is a hypothetical exercise and we lack the additional data and guidelines that would ultimately define the scenarios.<br>
As we saw in the exploratory analysis, the main churn reasons fall into two categories: "Competitor" (the competitor offers a better service or offer), with 33% of the total, and "Attention" (the support and attitude provided), with 20% of cases. Our actions will target those two areas, and we should not confuse the model attributes that raise the predicted probability of churn with its causes.

## Key ideas for the solution

The reasons for churn listed above allow the teams responsible for commercial offerings, technology, training, and human resources to act with structural solutions.<br>
Our approach will be focused on using current resources to offer a better scenario in the short term, reducing the propensity for churn.<br>

**Service and offer**<br>
Regarding service, "Fiber optic" has the highest churn rate at 42%, followed by "DSL" at 19% and customers without internet at 7%. Fiber optic is the best technology the company offers, and improving it depends on medium and long-term infrastructure solutions, so we will not focus on actions in this area.<br>
Regarding the offer, we can consider short and medium-term value proposals that let us retain customers while the areas involved work on a better proposal. We will consider two alternatives: a percentage discount on the current subscription and/or free services.<br>

**Customer care**<br>
Here, the strategy is to compensate for insufficient service quality on the front line of technical support by referring customers to the "Tech Support" service. Remember that customers without this service showed a churn rate of 42%, compared to 15% for those who have it. We assume that this difference rests on three axes: service quality, training, and tools.<br><br>

Note: We will focus on managing inquiries/complaints that come in through the technical channel, but we could apply a similar solution for the commercial channel.

## Data import and preparation

In [None]:
# Libraries import
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
# Dataset imported from GitHub
url='https://raw.githubusercontent.com/marcelobour/telco_churn/main/data/Telco_customer_churn.csv'
telco = pd.read_csv(url, sep=';')

# Define the list of columns to convert to number
cols = ['Longitude', 'Latitude', 'Monthly Charges']

# Replace the decimal comma with a point, replace spaces with '0', and convert to numeric
telco[cols] = telco[cols].apply(lambda x: pd.to_numeric(x.str.replace(',', '.').str.replace(' ', '0')))

# Display data
pd.options.display.max_columns = None
display(telco.head())


Unnamed: 0,CustomerID,Count,Country,State,City,Zip Code,Lat Long,Latitude,Longitude,Gender,Senior Citizen,Partner,Dependents,Tenure Months,Phone Service,Multiple Lines,Internet Service,Online Security,Online Backup,Device Protection,Tech Support,Streaming TV,Streaming Movies,Contract,Paperless Billing,Payment Method,Monthly Charges,Total Charges,Churn Label,Churn Value,Churn Score,CLTV,Churn Reason
0,3668-QPYBK,1,United States,California,Los Angeles,90003,"33.964131, -118.272783",33.964131,-118.272783,Male,No,No,No,2,Yes,No,DSL,Yes,Yes,No,No,No,No,Month-to-month,Yes,Mailed check,53.85,10815,Yes,1,86,3239,Competitor made better offer
1,9237-HQITU,1,United States,California,Los Angeles,90005,"34.059281, -118.30742",34.059281,-118.30742,Female,No,No,Yes,2,Yes,No,Fiber optic,No,No,No,No,No,No,Month-to-month,Yes,Electronic check,70.7,15165,Yes,1,67,2701,Moved
2,9305-CDSKC,1,United States,California,Los Angeles,90006,"34.048013, -118.293953",34.048013,-118.293953,Female,No,No,Yes,8,Yes,Yes,Fiber optic,No,No,Yes,No,Yes,Yes,Month-to-month,Yes,Electronic check,99.65,8205,Yes,1,86,5372,Moved
3,7892-POOKP,1,United States,California,Los Angeles,90010,"34.062125, -118.315709",34.062125,-118.315709,Female,No,Yes,Yes,28,Yes,Yes,Fiber optic,No,No,Yes,Yes,Yes,Yes,Month-to-month,Yes,Electronic check,104.8,304605,Yes,1,84,5003,Moved
4,0280-XJGEX,1,United States,California,Los Angeles,90015,"34.039224, -118.266293",34.039224,-118.266293,Male,No,No,Yes,49,Yes,Yes,Fiber optic,No,Yes,Yes,No,Yes,Yes,Month-to-month,Yes,Bank transfer (automatic),103.7,50363,Yes,1,89,5340,Competitor had better devices


## Churn impact: changing from a monthly to an annual contract and adding tech support

The reduced model with 8 variables, which we showed has precision and recall just below those of the complete model, uses the following variables as determinants of the propensity to churn:
*   Tenure Months
*   Monthly Charges
*   Latitude
*   Longitude
*   Dependents
*   Contract
*   Paperless Billing
*   Internet Service

Regarding 'Tenure Months', in general, the longer a customer has been with the company, the lower their propensity to churn. Customers with a tenure of 1 to 5 months have a 54% churn rate (double the company's average).

Regarding 'Contract', customers on a monthly contract have a 43% churn rate, while annual contracts have an 11% churn rate, and biennial contracts have a 3% churn rate. While this is not a cause, it can help us reduce churn by offering promotions to change contract type.

The trained model assigns a higher propensity to churn to customers who have no dependents, have fiber optic internet service, and receive a digital bill (Paperless Billing).

Let's see how some scenarios of these variables relate to Tech Support:

In [None]:
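# Customers in the higher-propensity scenario: no dependents, fiber optic, paperless billing
df = telco.loc[(telco['Dependents'] == 'No') &
               (telco['Internet Service'] == 'Fiber optic') &
               (telco['Paperless Billing'] == 'Yes')]
# Average churn by contract type (rows) and Tech Support (columns)
data = pd.pivot_table(df, values='Churn Value', index=['Contract'], columns=['Tech Support'], aggfunc='mean').round(2)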
display(data)

| Contract | Tech Support: No | Tech Support: Yes |
| --- | --- | --- |
| Month-to-month | 0.64 | 0.44 |
| One year | 0.24 | 0.27 |
| Two year | 0.14 | 0.11 |


Our actions will focus, as appropriate in each case, on offering Tech Support, converting contracts to annual, and offering bonuses/discounts.<br>
The procedures are detailed at: https://docs.google.com/presentation/d/1eqBfX3qzZ_CABcjgoiHDySgU-elu64DJCei-NLWb4uM/edit#slide=id.p

## Capture of preventive retention

Let's assume some values that we could have obtained by consulting company references or that we could have estimated from a pilot test.<br>
*  Monthly probability of contact for technical reasons: 70%.
*  Reiteration index (additional contacts per customer): 60%.
*  Fraction of technical cases that the Tech Support team can handle: 40%.
*  Average idle capacity of the Tech Support team: 30%.
*  Desired average idle capacity of the Tech Support team: 10%.
*  Customers who mention they will cancel the service: 1 out of 3.
*  Customers who accept the offer of one free month of Tech Support plus an annual contract: 30%.

From the data above, it follows that 20% of the Tech Support team's capacity can be used for our implementation at no additional cost (actual idle capacity minus the idle capacity desired for service level agreements).

In [None]:
# How many active customers do we have?
activos = telco.loc[telco['Churn Value']==0].shape[0]
activos

5174

In [None]:
# How many technical support contacts?
motivo_tec = 0.7
contac_tec = round(activos * motivo_tec)
contac_tec

3622

In [None]:
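# What share of customers has Tech Support contracted?
# (the count covers every customer in the dataset and is divided by the number of active customers)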
with_tech_support = telco.loc[telco['Tech Support']=='Yes'].shape[0]
percen_with_tech_support = round(with_tech_support/activos, 2)
percen_with_tech_support

0.4

In [None]:
# How many are referred to Tech Support?
competencia_tech_support = 0.4
derivados_tech_support = round(contac_tec * competencia_tech_support * percen_with_tech_support)
derivados_tech_support

580

In [None]:
# How much capacity, in monthly contacts, can we take advantage of?
capactual = 0.7
capmax = 0.9
contactos_sin_costo = round(derivados_tech_support / capactual * capmax - derivados_tech_support)
contactos_sin_costo

166

In [None]:
# How many unique customers does this represent?
reitero = 0.6
casos_sin_costo = round(contactos_sin_costo / (1 + reitero))
casos_sin_costo

104

Therefore, our goal is to capture each month the 104 customers with the highest probability of churn who have neither Tech Support nor an annual contract. We will offer them one free month of the service and an immediate referral if they agree to an annual contract with a 10% discount.

In [None]:
# How many monthly contacts come from customers without Tech Support and on a month-to-month contract?
contactos_pool = round(telco.loc[(telco['Tech Support']=='No') & (telco['Contract']=='Month-to-month')].shape[0] * motivo_tec)
contactos_pool

1876

In [None]:
# How many unique clients do those contacts represent?
clientes_pool = round(contactos_pool / (1 + reitero))
clientes_pool

1172

In [None]:
# What percentage should we capture of those who enter?
recall_objetivo = round(casos_sin_costo / clientes_pool, 2)
recall_objetivo

0.09

In [None]:
# What percentage should we offer, knowing that only 30% of them will accept?
efectividad_ofrecimiento = 0.3
recall_objetivo_ofrecimiento = round(recall_objetivo / efectividad_ofrecimiento, 2)
recall_objetivo_ofrecimiento

0.3
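The cells above can be condensed into two small functions, which makes it easier to rerun the funnel when an assumption changes. This is only a sketch: the function and parameter names are hypothetical (they are not part of the notebook), and the default values are the assumed figures listed at the start of this section.

```python
def spare_capacity_customers(
    active_customers: int,
    tech_contact_rate: float = 0.7,    # monthly contact probability for technical reasons
    handled_fraction: float = 0.4,     # fraction of technical cases Tech Support can handle
    tech_support_share: float = 0.4,   # share of customers with Tech Support contracted
    current_utilization: float = 0.7,  # 1 - current idle capacity (30%)
    target_utilization: float = 0.9,   # 1 - desired idle capacity (10%)
    repeat_rate: float = 0.6,          # reiteration index
) -> int:
    """Unique customers per month that fit in the spare Tech Support capacity."""
    tech_contacts = round(active_customers * tech_contact_rate)
    referred = round(tech_contacts * handled_fraction * tech_support_share)
    # Contacts the team could absorb when moving from current to target utilization
    spare_contacts = round(referred / current_utilization * target_utilization - referred)
    # Convert contacts into unique customers using the reiteration index
    return round(spare_contacts / (1 + repeat_rate))


def required_offer_share(
    target_customers: int,
    pool_contacts: int,
    repeat_rate: float = 0.6,      # reiteration index
    acceptance_rate: float = 0.3,  # customers who accept the offer
) -> float:
    """Share of the eligible pool that must receive the offer to reach the target."""
    pool_customers = round(pool_contacts / (1 + repeat_rate))
    target_recall = round(target_customers / pool_customers, 2)
    return round(target_recall / acceptance_rate, 2)


target = spare_capacity_customers(active_customers=5174)
offer_share = required_offer_share(target_customers=target, pool_contacts=1876)
print(target, offer_share)
```

With the notebook's inputs (5174 active customers and 1876 monthly contacts in the pool), the two calls reproduce the 104 customers and the 0.3 offer share obtained above.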

Based on the values we had analyzed in: <br> https://colab.research.google.com/drive/1mAljIO4qR_3l-ufKnFnwRDu70YZ7J7I0 <br> we use the table that relates recall, precision, and threshold:

In [None]:
# Import data from the chosen model of 8 variables from GitHub.
url_data_8var='https://raw.githubusercontent.com/marcelobour/telco_churn/main/data/aucpr-precision-8var.csv'
data_8var = pd.read_csv(url_data_8var)
display(data_8var)

|  | Unnamed: 0 | re_avg | pre_avg | re_std | pre_std | re_rel_error | pre_rel_error | thresh |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 0.58 | 0.69 | 0.01784 | 0.012149 | 0.060288 | 0.034509 | 0.5 |
| 1 | 1 | 0.49 | 0.72 | 0.019678 | 0.015916 | 0.078714 | 0.043328 | 0.55 |
| 2 | 2 | 0.41 | 0.75 | 0.018495 | 0.016709 | 0.088416 | 0.043667 | 0.6 |
| 3 | 3 | 0.3 | 0.81 | 0.017499 | 0.025042 | 0.114325 | 0.060597 | 0.65 |
| 4 | 4 | 0.23 | 0.88 | 0.016761 | 0.029449 | 0.14283 | 0.065591 | 0.7 |
| 5 | 5 | 0.14 | 0.9 | 0.015421 | 0.036141 | 0.215898 | 0.078708 | 0.75 |
| 6 | 6 | 0.08 | 0.95 | 0.011885 | 0.034032 | 0.291194 | 0.070213 | 0.8 |
| 7 | 7 | 0.03 | 0.96 | 0.006477 | 0.044851 | 0.423177 | 0.091571 | 0.85 |
| 8 | 8 | NaN | 0.67 | 0.001826 | 0.479463 | inf | 1.402609 | 0.9 |
| 9 | 9 | NaN | NaN | NaN | NaN | NaN | NaN | 0.95 |


With this table, we can relate the thresh (churn probability) with the recall (capture rate).<br>

If we only followed the previous table and ignored that an expressed intention to churn increases the probability, we would choose a threshold of 0.65, which gives a recall of 30% (row 3 of the table).<br>
But as mentioned earlier, 1 out of 3 customers expresses the intention to churn, and when they do and the churn prediction is positive (thresh >= 0.5), the probability of churn increases. We also want to capture those customers, so we capture in two stages:<br>
*  For those who do not express churn (2/3), we capture them by their probability value. Their share of the target recall is 2/3 of 30%, that is, 20%. A churn probability of 0.7 gives approximately that recall: 23% (row 4 of the table). We refer to this group as "high probability."<br>
*  For those who express churn (1/3), we select a probability value that covers the rest of the 30% target. Since we already have 23% from row 4, we need another 7%, and since only 1 out of 3 customers will express churn, the additional recall must be triple that, 21%. That gives 23% + 21% = 44%. Approximating, and to be conservative with the capacity of the Tech Support team (the recall estimates carry relative errors of 14% and 9% for rows 4 and 2, respectively), we select the probability 0.6, which provides a recall of 41% (row 2 of the table). We refer to this group as "protesters."
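The two-stage reasoning can be sketched in plain Python. The recall values are copied from the table above; the helper name `closest_threshold` and the rule of picking the row whose recall is closest to the target are assumptions made for illustration, since the choice in the text is made by approximation.

```python
# (threshold, average recall) pairs copied from the 8-variable model table
RECALL_BY_THRESHOLD = [
    (0.50, 0.58), (0.55, 0.49), (0.60, 0.41), (0.65, 0.30),
    (0.70, 0.23), (0.75, 0.14), (0.80, 0.08), (0.85, 0.03),
]


def closest_threshold(target_recall, table=RECALL_BY_THRESHOLD):
    """Return the (threshold, recall) row whose recall is closest to the target."""
    return min(table, key=lambda row: abs(row[1] - target_recall))


target_recall = 0.30            # share of the pool that must receive the offer
share_expressing_churn = 1 / 3  # customers who say they will cancel

# Stage 1: customers who do not express churn cover 2/3 of the target
stage1_target = target_recall * (1 - share_expressing_churn)
high_thresh, high_recall = closest_threshold(stage1_target)

# Stage 2: the remaining recall is tripled because only 1 in 3 expresses churn
missing_recall = target_recall - high_recall
stage2_target = high_recall + missing_recall / share_expressing_churn
protest_thresh, protest_recall = closest_threshold(stage2_target)

print(high_thresh, high_recall)        # high probability group
print(protest_thresh, protest_recall)  # protesters group
```

Under these assumptions the sketch lands on the same rows as the text: threshold 0.7 (recall 23%) for the high probability group and threshold 0.6 (recall 41%) for the protesters.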


This results in two levels. The first is the critical probability level, from 0.7 to 1, where we will always make the offer (one month of Tech Support plus the annual contract offer). The second, from 0.6 to 0.7, is where we will make an offer only if the customer expresses the intention to churn (a one-time referral to Tech Support only). Below 0.6, we will not change the preventive retention procedure.
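As a minimal sketch, the resulting rule could be encoded as a single function for the front-line tool. The function name and the returned labels are hypothetical, and treating a probability of exactly 0.7 as part of the critical level is an assumption, since the text does not say which level owns the boundary.

```python
def recommend_offer(churn_probability: float, expresses_churn: bool) -> str:
    """Map a predicted churn probability and the customer's stated intent to an action."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    if churn_probability >= 0.7:
        # Critical level: always offer one month of Tech Support plus the annual contract
        return "tech_support_month_plus_annual_contract"
    if churn_probability >= 0.6 and expresses_churn:
        # Protesters: one-time referral to Tech Support only
        return "one_time_tech_support_referral"
    # Below 0.6, or 0.6 to 0.7 without expressed intent: no change to the procedure
    return "standard_procedure"


print(recommend_offer(0.82, expresses_churn=False))
print(recommend_offer(0.64, expresses_churn=True))
print(recommend_offer(0.64, expresses_churn=False))
```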

## Scope and impact

In [None]:
# High probability group
cuota_alta_prob = 0.23
precision_alta_prob = 0.88

# Protesters group
cuota_manif = (0.41 - 0.23)/3
precision_manif = 0.75

# Reached Customers
alcanzados_total = round(clientes_pool * efectividad_ofrecimiento * (cuota_alta_prob + cuota_manif), 0)
alcanzados_total

102.0

Recall that our goal was 104, so it is practically achieved.

To calculate the final precision, that is, the resulting weighted precision, we must combine the precisions of both quotas:<br>

In [None]:
# Weighted average of the precision of both quotas
precision_final = round(((cuota_alta_prob * precision_alta_prob) + (cuota_manif * precision_manif)) / (cuota_alta_prob + cuota_manif), 2)
precision_final

0.85
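Written as a formula, with $q_h$, $q_m$ for the quotas and $p_h$, $p_m$ for the precisions of the high probability and protesters groups (symbols introduced here only for illustration), the weighted average computed in the previous cell is:

$$
P = \frac{q_h\,p_h + q_m\,p_m}{q_h + q_m} = \frac{0.23 \cdot 0.88 + 0.06 \cdot 0.75}{0.23 + 0.06} = \frac{0.2474}{0.29} \approx 0.85
$$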

So, 85% of the cases that accept the offer would actually have canceled.<br>
But what is the cost of offering a bonus/deal to someone who ultimately would not have canceled?<br>
The protesters group does not generate any cost: we offer a one-time referral to Tech Support, and we absorb it with the available idle capacity. The customers in the high probability group who would not have canceled (15%) generate a cost (a 10% reduction in the rate).

In [None]:
# Participation of the high probability quota
part_cuota_alta_prob = round(cuota_alta_prob/(cuota_alta_prob + cuota_manif), 2)
part_cuota_alta_prob

0.79

That is, 15% of 79% generates a decrease in the fee (10% annual discount). But we cover that with 85% (of 100%) of the cases that accept the offer.

From here, we would need preliminary results or a pilot to calculate the payback and to evaluate whether a higher or lower discount would give us better scenarios. We have two major uncertainties:<br>
*  What is the churn rate of the customers we retained preventively?
*  The interaction of this action with other structural actions offered by the company, such as improving first-line service or proactively offering more competitive packages.

Regarding the first question, we could hypothesize (based on the pivot table of churn by Contract and Tech Support shown earlier) that the action lowers churn, in relative terms:
*  By approximately 60% for the "High Probability" group (from 64% to 24%)
*  By approximately 30% for the "Protesters" group (from 64% to 44%)
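The two percentages are relative reductions with respect to the 64% churn rate of month-to-month customers without Tech Support. A short sketch of that arithmetic follows; the variable and function names are hypothetical, and the rates are the ones quoted in the list above.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the churn rate drops relative to its starting value."""
    return (before - after) / before


baseline = 0.64  # month-to-month, no Tech Support
after_action = {
    "High Probability": 0.24,  # one-year contract, no Tech Support
    "Protesters": 0.44,        # month-to-month, with Tech Support
}

reductions = {
    group: relative_reduction(baseline, rate) for group, rate in after_action.items()
}
for group, value in reductions.items():
    print(f"{group}: {value:.1%}")
```

The exact values are 62.5% and 31.25%, which the text rounds to 60% and 30%.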


Therefore, we would have the following final balance:

In [None]:
# Will not cancel, but we apply a 10% discount
descuento = 0.10
costo = part_cuota_alta_prob * (1-precision_final) * descuento

# Accepts discount but we reduced the chance of cancellation by 60% (High Probability Group)
ganancia_alta_prob = part_cuota_alta_prob * precision_final * (1-descuento) * 0.60 

# Accepts transfer to Tech (Protesters Group)
ganancia_manif = (1-part_cuota_alta_prob) * 0.30

neto = round(ganancia_alta_prob + ganancia_manif - costo, 2)
neto

0.41

That is, with this action, we manage to retain 41% of the fees of the people who accept the offer that corresponds to their scenario.<br>
One option for implementation would be to expand the capacity of Tech Support, after analyzing the monthly cost of demand per customer and the profit margin in the fee, which would multiply the reach.
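Once pilot results are available, the final balance could be recomputed under different assumptions by wrapping it in a function. The sketch below mirrors the balance cell; the function and parameter names are hypothetical, and it assumes that the churn reductions do not depend on the size of the discount, which is precisely what a pilot would have to verify.

```python
def net_retained_fee_share(
    high_prob_share: float = 0.79,       # participation of the high probability quota
    precision: float = 0.85,             # weighted precision of both quotas
    discount: float = 0.10,              # discount on the annual contract
    reduction_high_prob: float = 0.60,   # assumed churn reduction, high probability group
    reduction_protesters: float = 0.30,  # assumed churn reduction, protesters group
) -> float:
    """Share of fees retained among customers who accept the offer."""
    # Customers who would not have canceled but still receive the discount
    cost = high_prob_share * (1 - precision) * discount
    # Customers who would have canceled and are retained at the discounted fee
    gain_high_prob = high_prob_share * precision * (1 - discount) * reduction_high_prob
    # Protesters referred to Tech Support at no extra cost
    gain_protesters = (1 - high_prob_share) * reduction_protesters
    return gain_high_prob + gain_protesters - cost


for d in (0.05, 0.10, 0.15):
    print(d, round(net_retained_fee_share(discount=d), 2))
```

With the default values the function returns the same 0.41 as the balance above. Because the reductions are held fixed, a larger discount can only lower the result in this sketch; real data would be needed to see whether a larger discount also raises acceptance or retention.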