# Revolut - Part II of Challenge - REENGAGEMENT_ACTIVE_FUNDS
## weaves

This uses the output of anal3.q to apply statistical tests on the effectiveness of the REENGAGEMENT_ACTIVE_FUNDS notification campaign.

In [1]:
import pandas as pd

%load_ext autoreload
%autoreload 2

# With this setting, the value of every expression in a cell is displayed, not just the last one.
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

pd.__version__

'0.24.2'

## Method

anal3.q created a rolling average of days between those days where at least one transaction took place - a transaction day.
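The rolling average described above can be sketched in pandas. This is an illustration only, not the q code in anal3.q: the column names, the sample transactions, and the window of 3 are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical transactions for one user; several can fall on the same day.
txns = pd.DataFrame({
    "user_id": ["u1"] * 6,
    "ts": pd.to_datetime([
        "2019-01-01 09:00", "2019-01-01 17:30",
        "2019-01-04 12:00", "2019-01-05 08:15",
        "2019-01-11 10:00", "2019-01-12 19:45",
    ]),
})

# A transaction-day is a calendar day with at least one transaction.
days = (
    txns.assign(day=txns["ts"].dt.normalize())
        .drop_duplicates(["user_id", "day"])
        .sort_values(["user_id", "day"])
        .reset_index(drop=True)
)

# Gap in days between consecutive transaction-days, per user.
days["gap_days"] = days.groupby("user_id")["day"].diff().dt.days

# Rolling mean of those gaps; the window length of 3 is an arbitrary choice.
days["avg_gap"] = (
    days.groupby("user_id")["gap_days"]
        .transform(lambda s: s.rolling(3, min_periods=1).mean())
)
```

A falling `avg_gap` means the user is transacting on more days, which is the effect the notification is meant to produce.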

An as-of join between the reengagement notification and the transaction was carried out. The number of transaction-days following the notification should increase; the average number of days between transaction-days should decrease. That is, of course, if the notification has re-engaged the user.
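For readers more familiar with pandas than q, an as-of join of this kind can be approximated with `pd.merge_asof`. The sketch below is not the join used in anal3.q; the frames, column names, and values are hypothetical, and the 7-day tolerance is taken from the window mentioned in the narrative further down.

```python
import pandas as pd

# Hypothetical per-user series of the average gap between transaction-days.
rates = pd.DataFrame({
    "user_id": ["u1", "u1", "u1", "u2", "u2"],
    "day": pd.to_datetime(["2019-01-05", "2019-01-12", "2019-01-20",
                           "2019-01-03", "2019-01-25"]),
    "avg_gap": [4.0, 3.5, 2.0, 6.0, 6.0],
})

# Hypothetical notifications, one per user.
ntfy = pd.DataFrame({
    "user_id": ["u1", "u2"],
    "sent": pd.to_datetime(["2019-01-10", "2019-01-10"]),
})

# merge_asof requires both frames to be sorted by their time key.
rates = rates.sort_values("day")
ntfy = ntfy.sort_values("sent")

# Last known rate at or before each notification.
pre = pd.merge_asof(ntfy, rates, left_on="sent", right_on="day",
                    by="user_id", direction="backward")

# First rate at or after the notification, but only within 7 days of it.
post = pd.merge_asof(ntfy, rates, left_on="sent", right_on="day",
                     by="user_id", direction="forward",
                     tolerance=pd.Timedelta(days=7))

# Negative change: fewer days between transaction-days after the notification.
change = post["avg_gap"] - pre["avg_gap"]
```

In this toy data `u2` has no transaction-day inside the window, so its `post` row is empty; that corresponds to the "no response" counts in the tabulation.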

## Tabulated Results

The results are given below. The field case0 explains the count n. The rdrate1 column indicates whether the average number of days between transaction-days has reduced (-1), stayed the same (0), or increased (1): that is, relatively more transaction-days, no change, or fewer.

(I've re-used the rdrate1 column for the context counts: rdrate1 is zero for these and the case0 name explains the metric.)

 - post or pre indicates whether the count is after a notification or before it. The before cases are the control (or expected) group.
 - all is over all time, near is within a response window of .ntfy.near.days days after the notification.
 - the counts are all of distinct users, except for post.ntfy, which is the number of notifications.
 - there are some summary counts for users: users0 is all the user accounts, and users0.reengaged is for those who did or did not receive a notification.

In [2]:
# The results file is a collation of results.
res0 = pd.read_csv("cache/out/xres0.csv")
res0

Unnamed: 0,rdrate1,case0,n
0,-1,post.all,10741
1,0,post.all,2288
2,1,post.all,3581
3,-1,post.near,5800
4,0,post.near,806
5,1,post.near,2634
6,0,post.ntfy.all,28139
7,0,post.ntfy.noresp.all,11529
8,0,post.ntfy.noresp.near,18899
9,-1,pre.all,1285


### Tabulation explanation

A narrative for this is:

28139 notifications were sent to 11117 users. 18899 notifications received no response at all within the 7-day window.

To compare: the number of users whose transaction-day rate decreased after receiving a notification in the 7-day window was 5800; whereas, of the same users in the period a month before, 814 transacted and reduced their transaction-day average.

In the comparison, the post numbers are across all the notifications (a period of nearly a year), whilst the pre numbers are for the month before up to the date of the first notification, and the pre.\*.1 numbers are for three months + 10 days before, up to the date of the first notification.

What follows is a quick proportion analysis.

In [3]:
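same0 = res0.iloc[4]['n']   # post.near, rdrate1 == 0
same0
pos0 = res0.iloc[5]['n']    # post.near, rdrate1 == 1
pos0
nneg0 = same0 + pos0        # explicit sum instead of the fragile _ + __ output history
neg0 = res0.iloc[3]['n']    # post.near, rdrate1 == -1
total0 = nneg0 + neg0
total0
neg0 / total0
nneg0 / total0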

806

2634

9240

0.6277056277056277

0.3722943722943723
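Selecting rows by position with `iloc` is easy to get wrong if the results file changes. As a hedged alternative, the same proportions can be computed by filtering on the case0 label. The frame below re-enters only the three post.near rows listed in the tabulation, and the helper name `proportions` is an invention for this sketch.

```python
import pandas as pd

# The three post.near rows as listed in the tabulated results.
res = pd.DataFrame({
    "rdrate1": [-1, 0, 1],
    "case0": ["post.near"] * 3,
    "n": [5800, 806, 2634],
})

def proportions(df, case):
    """Share of users per rdrate1 value for one case0 label."""
    counts = df.loc[df["case0"] == case].set_index("rdrate1")["n"]
    return counts / counts.sum()

p = proportions(res, "post.near")
p
```

`p.loc[-1]` is the share of users whose rdrate1 is -1, matching `neg0 / total0` in the cell above.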

In [4]:
n12 = res0.iloc[12]['n']
n12
n13 = res0.iloc[13]['n']
n13
nneg0 = n12 + n13           # explicit sum instead of the fragile _ + __ output history
neg0 = res0.iloc[14]['n']
total0 = nneg0 + neg0
total0
neg0 / total0
nneg0 / total0

814

130

1520

0.37894736842105264

0.6210526315789474

# Conclusion

So, looking at the near-window results: after notifications, of 9240 users, 63% responded in such a way that their average days between transactions was reduced. In the period 3 months + 10 days before, 1520 of those 9240 users transacted, and only 38% of them did so in such a way that they reduced their transaction-day rate.

I think that is so conclusive that there is no need to put it to a statistical test, which would be a Chi-square or log-likelihood test on a contingency table of the three rdrate1 outcomes, with the pre results as the expected counts and the post results as the observed.
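For reference, a simplified version of such a test fits in a few lines. This is a sketch under stated assumptions, not the test described above: it collapses the three outcomes into two (reduced, not reduced), follows the two proportion cells in assigning counts to those outcomes, runs a Pearson chi-square test of homogeneity rather than a goodness-of-fit test, and treats the pre and post groups as independent even though the pre users are a subset of the post users.

```python
import math

# Counts as [reduced, not reduced], taken from the two proportion cells.
post = [5800, 806 + 2634]               # 9240 users after a notification
pre = [1520 - (814 + 130), 814 + 130]   # 1520 users before

def chi_square_2x2(a, b):
    """Pearson chi-square for a 2x2 table, without continuity correction."""
    rows = [a, b]
    row_tot = [sum(r) for r in rows]
    col_tot = [a[0] + b[0], a[1] + b[1]]
    total = sum(row_tot)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_tot[i] * col_tot[j] / total
            stat += (rows[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper tail is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

stat, p_value = chi_square_2x2(post, pre)
print(stat, p_value)
```

Because the same users appear in both groups, a paired test would be the more defensible choice; this version only gives a rough sense of scale.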

It does make me question whether the control group has been correctly constructed and I should make sure with another tabulation of the before data in a different way.

I shan't be doing that just now.

# Recommendation

From this preliminary investigation, it's clear without a statistical test that the Reengagement notification has had the desired effect: users have logged proportionally more transaction days.

There are further investigations that could be conducted. The "pre" or before notifications control group (expected counts) could be tabulated differently as noted in the conclusion above.

It would also be worthwhile carrying out the analysis using a measure other than transaction-days: the number of transactions, or the value or volume of transactions, could be used.

Also, this analysis has only used the counts of users whose transaction-day rate decreased. It could also address the average magnitude of the change in the transaction-day rate, for example with a regression analysis.

But this preliminary result should be convincing enough.