# A/B testing performance analysis

In [1]:
import pandas as pd

In [3]:
df = pd.read_csv("../Data/Processed/df_kpi_clients.csv")

Variations:
- Control group - traditional online process
- Test group - new digital process

In [15]:
participants = (df.groupby("variation")["client_id"].nunique().reset_index(name="num_clients"))
participants


  variation  num_clients
0   Control        23532
1      Test        26968


## KPI - 1 : Completion Rate

### Completion rate by group

In [6]:
completion_rates = (df.groupby("variation")["completion_status"].mean().mul(100).round(2).reset_index(name="completion_rate"))
completion_rates

  variation  completion_rate
0   Control            65.59
1      Test            69.29


### Absolute difference in Completion Rate between the two experiment groups

In [11]:
control_rate = completion_rates.loc[
    completion_rates["variation"] == "Control", "completion_rate"
].iloc[0]

test_rate = completion_rates.loc[
    completion_rates["variation"] == "Test", "completion_rate"
].iloc[0]

absolute_diff = test_rate - control_rate

absolute_diff = round(absolute_diff, 2)
print(f"Absolute difference (Test - Control): {absolute_diff} percentage points")


Absolute difference (Test - Control): 3.7 percentage points


**Findings:**
- The Test group achieved a higher completion rate (69.29%) than the Control group (65.59%).

- The absolute increase in completion rate is 3.7 percentage points, which is roughly a 5.6% relative lift over the Control rate.

- This suggests that the redesigned interface improved users' ability to **complete** the process.
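
Whether a 3.7-point gap exceeds sampling noise can be checked with a two-proportion z-test. The sketch below is not part of the original analysis. It uses only the rounded figures reported above and assumes that `df` holds one row per client, so that the group sizes equal the unique client counts (23,532 and 26,968). The function name `two_proportion_ztest` is hypothetical.

```python
import math

def two_proportion_ztest(p1, n1, p2, n2):
    """Pooled two-sided z-test for the difference p2 - p1."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # two-sided p-value from the standard normal tail
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Rounded figures from the notebook output.
# Assumption: one row per client, so n equals the unique client count.
control_rate, control_n = 0.6559, 23532
test_rate, test_n = 0.6929, 26968

z_stat, p_value = two_proportion_ztest(control_rate, control_n, test_rate, test_n)
relative_lift = (test_rate - control_rate) / control_rate * 100

print(f"z = {z_stat:.2f}, two-sided p = {p_value:.2e}")
print(f"Relative lift over Control: {relative_lift:.1f}%")
```

Because the inputs are rounded percentages, the result is approximate. Running the test on the raw `completion_status` column would be more precise.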

_________________________________________________________________________________________________________________________________________________________________________________________

## KPI - 2 : Time spent on each step

In [None]:
# visit-level dataframe with per-step timings (seconds)
df_exp_visits = pd.read_csv("../Data/Processed/df_exp_visits.csv")


In [None]:
# per-step time columns
step_time_cols = [
    "time_in_start_sec",
    "time_in_step_1_sec",
    "time_in_step_2_sec",
    "time_in_step_3_sec",
    "time_in_confirm_sec"
]

# average time spent on each step by group (in seconds)
avg_step_time = (
    df_exp_visits.groupby("variation")[step_time_cols]
    .mean(numeric_only=True)
    .round(2)
    .reset_index()
)

avg_step_time


  variation  time_in_start_sec  time_in_step_1_sec  time_in_step_2_sec  time_in_step_3_sec  time_in_confirm_sec
0   Control              77.13               55.70              110.84              151.60                20.13
1      Test              85.72               76.05              106.98              139.78                45.73


**Findings:**
- The Test group completed **Step 2 and Step 3** faster than the Control group. The middle steps usually ask for the most important and relevant information, so the shorter times there point to better clarity at that stage of the process.

- However, the Test group was slower on the **Start, Step 1 and Confirmation** pages.

- While these *could* be UI/UX issues, the new process shows **cues, messages, hints, or instructions** directly in the context of the user's current task, so the additional seconds could be time spent reading them.

- To summarise, the extra time is more likely to reflect increased engagement than friction, although the timing data alone cannot separate the two.

- Although the process is slower in 3 steps, the Test group also has the better completion rate, so it is plausible that the in-context instructions supported completion. This analysis does not establish that link causally.
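
To make the "faster" and "slower" steps concrete, the following sketch recomputes the per-step gap (Test minus Control) from the averages in the table above. It is an illustration added to the findings, not part of the original notebook. The short step labels and variable names are hypothetical, and the values are the rounded means shown in the output.

```python
# Average seconds per step, copied from the avg_step_time output above
avg_step_time = {
    "start":   {"Control": 77.13,  "Test": 85.72},
    "step_1":  {"Control": 55.70,  "Test": 76.05},
    "step_2":  {"Control": 110.84, "Test": 106.98},
    "step_3":  {"Control": 151.60, "Test": 139.78},
    "confirm": {"Control": 20.13,  "Test": 45.73},
}

# Positive difference: Test is slower on that step; negative: Test is faster
diffs = {
    step: round(times["Test"] - times["Control"], 2)
    for step, times in avg_step_time.items()
}
slower = [step for step, d in diffs.items() if d > 0]
faster = [step for step, d in diffs.items() if d < 0]
net_diff = round(sum(diffs.values()), 2)

for step, d in diffs.items():
    print(f"{step:8s} {d:+.2f} s")
print(f"Test slower on: {slower}")
print(f"Test faster on: {faster}")
print(f"Sum of per-step gaps: {net_diff:+.2f} s")
```

These averages cover all visits in `df_exp_visits`, including incomplete ones, so the summed gap is not directly comparable to the completed-only completion time reported under KPI 4.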

_________________________________________________________________________________________________________________________________________________________________________________________

## KPI - 3 : Error rate

### Error / Backtracking rate by Group

In [23]:
# error (backtracking) rate per group
backtrack_rate = (
    df
    .groupby("variation")["backtrack_flag"]
    .mean()
    .mul(100)   # convert to percent first, then round, to avoid float artifacts
    .round(2)
    .reset_index(name="backtrack_rate_pct")
)

backtrack_rate


  variation  backtrack_rate_pct
0   Control               26.19
1      Test               33.44


### Error / Backtracking rate by Group - **Completed** participants

In [26]:
backtrack_rate_completed = (
    df[df["completion_status"] == 1]   # completed participants only
    .groupby("variation")["backtrack_flag"]
    .mean()
    .mul(100)   # convert to percent first, then round, to avoid float artifacts
    .round(2)
    .reset_index(name="backtrack_rate_pct")
)

backtrack_rate_completed


  variation  backtrack_rate_pct
0   Control               26.44
1      Test               28.45


**Findings:**

- Among all users, the Test group has a higher backtracking rate (33.44% vs 26.19%). The gap of 7.25 percentage points is substantial, although no significance test has been run on it here.

- Among users who completed the process, the Test group's error rate is still higher (28.45% vs 26.44%), but the gap shrinks to about 2 percentage points.

- This shows that backtracking does not prevent completion. It also suggests that the new digital process carries some friction, since its clients revisited earlier steps more often.
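
The two tables above also imply a backtracking rate for clients who did *not* complete, because the overall rate is a weighted average of the completed and non-completed rates. The sketch below works that out from the rounded figures. It assumes the completion rates and both backtracking rates were computed on the same rows of `df` (which the code above suggests) and that `completion_status` is binary. The helper name is hypothetical and the results are approximate because the inputs are rounded.

```python
def implied_noncompleter_rate(overall_pct, completed_pct, completion_rate_pct):
    """Solve overall = c * completed + (1 - c) * not_completed for not_completed."""
    c = completion_rate_pct / 100
    return (overall_pct - c * completed_pct) / (1 - c)

# (overall backtrack %, backtrack % among completers, completion rate %)
reported = {
    "Control": (26.19, 26.44, 65.59),
    "Test":    (33.44, 28.45, 69.29),
}

implied = {
    group: round(implied_noncompleter_rate(*figures), 1)
    for group, figures in reported.items()
}

for group, rate in implied.items():
    print(f"{group}: implied backtracking rate among non-completers = {rate}%")
```

If the assumptions hold, most of the overall gap between the groups comes from clients who did not finish, which is worth verifying directly on `df[df["completion_status"] == 0]`.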

_________________________________________________________________________________________________________________________________________________________________________________________

## KPI - 4 : Completion Time (Avg. per group)

In [28]:
# completed users only
df_completed = df[df["completion_status"] == 1]

avg_completion_time = (
    df_completed
    .groupby("variation")["total_time_sec"]
    .mean()
    .round(2)
    .reset_index(name="avg_completion_time_sec")
)

avg_completion_time


  variation  avg_completion_time_sec
0   Control                   393.11
1      Test                   357.95


**Findings:**

- Among users who completed the process, the Test group had a shorter average completion time than the Control group: 357.95 s vs 393.11 s, a reduction of 35.16 s (about 8.9%).

- Taken together with the previous findings, this suggests that the redesigned digital process achieved a lower overall completion time for completers, despite a higher backtracking rate and more time spent in the initial and final stages.

_________________________________________________________________________________________________________________________________________________________________________________________

## KPI - 5 : Drop-off Rate per Step

In [30]:
# consider only incomplete visits for drop-off analysis
df_incomplete = df_exp_visits[df_exp_visits["completion_status"] == 0].copy()

# drop-off distribution by group
dropoff = (
    df_incomplete
    .groupby(["variation", "drop_off_step"])
    .size()
    .reset_index(name="dropoff_count")
)

dropoff["dropoff_%_within_group"] = (
    dropoff["dropoff_count"] /
    dropoff.groupby("variation")["dropoff_count"].transform("sum") * 100
)

# largest drop-off step per group
largest_dropoff = (
    dropoff.sort_values(["variation", "dropoff_count"], ascending=[True, False])
          .groupby("variation")
          .head(1)
          .reset_index(drop=True)
)

dropoff, largest_dropoff


(  variation drop_off_step  dropoff_count  dropoff_%_within_group
 0   Control         start           9446               58.471062
 1   Control        step_1           3119               19.306716
 2   Control        step_2           1460                9.037450
 3   Control        step_3           2130               13.184773
 4      Test         start           9329               60.526828
 5      Test        step_1           2994               19.425161
 6      Test        step_2           1327                8.609615
 7      Test        step_3           1763               11.438396,
   variation drop_off_step  dropoff_count  dropoff_%_within_group
 0   Control         start           9446               58.471062
 1      Test         start           9329               60.526828)

**Findings:**

- In both the Control and Test groups, the majority of abandonment occurs at the start of the process.

- The percentages above are shares of each group's incomplete visits, not drop-off rates over all visits. By that measure, the Test group's share at the initial step is slightly higher (60.5% vs 58.5%), even though its absolute count is slightly lower (9,329 vs 9,446). Its shares at later steps, particularly Step 3 (11.4% vs 13.2%), are lower than the Control group's.

- This is consistent with the earlier finding that Test group users spent less time on Steps 2 and 3.

- Higher drop-offs at the start could be due to any of the following:
    - information overload, a gap between the design and user expectations, users not having the required information at hand, a longer process than expected, etc.

- Overall, the pattern is consistent with Test group users who get past the start being more likely to finish. Confirming this would require step-level conversion rates computed over all visits, not only incomplete ones.
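
The denominator matters for reading the table above, so here is a small sketch that recomputes the within-group shares from the reported counts. It mirrors what `transform("sum")` does in the cell above, using plain dictionaries with the counts copied from the output. The variable names are hypothetical.

```python
# Drop-off counts per step, copied from the dropoff output above
dropoff_counts = {
    "Control": {"start": 9446, "step_1": 3119, "step_2": 1460, "step_3": 2130},
    "Test":    {"start": 9329, "step_1": 2994, "step_2": 1327, "step_3": 1763},
}

totals = {}   # incomplete visits per group (the denominator)
shares = {}   # share of each step within the group's incomplete visits, in %

for group, counts in dropoff_counts.items():
    totals[group] = sum(counts.values())
    shares[group] = {
        step: round(n / totals[group] * 100, 2) for step, n in counts.items()
    }

for group in dropoff_counts:
    print(f"{group}: {totals[group]} incomplete visits, shares = {shares[group]}")
```

Dividing the same counts by the total number of visits per group (not shown in the output above) would give true per-step drop-off rates.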