title: A/B test for card tracking button
author: Vladas Jankus 
date: 2021-04-30
region: EU  
tags: cards, card, product, ab test, split test, ab
summary: The purpose of this A/B test was to find out whether adding card tracking information with an 'update tracking information' button would reduce the volume of CS contacts from customers seeking reassurance about their delivery. The sample size was around 30k customers. Around 10% of customers in the treatment group interacted with the button. The experiment showed that the new information did not have a significant impact on overall CS contacts: roughly 89.6% of customers in both the control and treatment groups did not contact CS. Analysis of tag dynamics showed that the button was clicked more often by customers who had card delivery issues, but they still ended up contacting CS.

In [1]:
!pip install duckdb

import pandas as pd
import altair as alt
from utils.datalib_database import df_from_sql
from IPython.display import HTML, Markdown as md
from datetime import datetime
import duckdb
from scipy.stats import binom_test



In [3]:
user_contacts_df = df_from_sql(
    "redshiftreader",
    """
        with cards as (
            select
                user_created,
                order_date,
                card_activated
            from dbt.zrh_cards
            where order_date >= '2021-04-15'
                and is_digital is false
                and delivery_method = 'STANDARD'
        ),
        --app version part
        app_version as (
            select
                distinct user_created
            from dbt.stg_logins
            where collector_tstamp >= date('2021-04-15')
                and stg_logins.app_version in ('n26-android_3.60', 'n26-ios_3.60')
        ),
        users as (
            select
                zrh_users.user_id,
                zrh_users.user_created,
                zrh_users.contact_id,
                order_date,
                card_activated,
                is_user_in_tg(zrh_users.user_id, 'rutherfordium.update_tracking_info', 50) as treatment_group
            from dbt.zrh_users
            inner join cards
                using (user_created)
            inner join app_version
                using (user_created)
        ),
        user_info as (
            select users.user_created,
                   users.treatment_group,
                   users.contact_id,
                   users.order_date,
                   users.card_activated,
                   coalesce(sum((sn.se_action = 'cards_tab_viewed')::int) > 0, false)                              as viewed_cards_tab,
                   coalesce(sum((sn.se_action = 'card_action_update_card_tracking_info_clicked')::int) > 0,
                            false)                                                                                 as button_clicked
            from users
                     left join dbt.snowplow as sn
                               on sn.user_created = users.user_created
                                   and sn.collector_date >= '2021-04-15' -- experiment start
                                   and
                                  sn.se_action in ('card_action_update_card_tracking_info_clicked', 'cards_tab_viewed')
            group by 1, 2, 3, 4, 5
        )
        select
            user_created,
            treatment_group,
            viewed_cards_tab,
            button_clicked,
            coalesce(sum((sf_all_contacts.id is not null)::int) > 0, false) as contacted_cs,
            coalesce(sum((sf_all_contacts.cs_tag in ('card_delivery', 'not_tagged', 'contact_data', 'legal_data'))::int) > 0, false) as contacted_for_cards,
            sum((sf_all_contacts.id is not null)::int) as contacts,
            sum((cs_tag = 'card_delivery')::int) as card_delivery_contacts,
            sum((cs_tag = 'not_tagged')::int) as not_tagged_contacts,
            sum((cs_tag = 'contact_data')::int) as contact_data_contacts,
            sum((cs_tag = 'legal_data')::int) as legal_data_contacts
        from user_info
        left join dbt.sf_all_contacts
            on user_info.contact_id = sf_all_contacts.related_contact
            and sf_all_contacts.initiated_date >= order_date
            and sf_all_contacts.initiated_date < coalesce(card_activated, order_date + interval '5 days')
            --and cs_tag in ('card_delivery', 'not_tagged', 'card_delivery', 'contact_data', 'legal_data')
        group by 1, 2, 3, 4
    """,
)


In [4]:
con = duckdb.connect(database=":memory:", read_only=False)

In [5]:
# con.execute('drop table raw_data')
con.register("user_contacts", user_contacts_df)
con.execute("create table raw_data as select * from user_contacts")

<duckdb.DuckDBPyConnection at 0x7f32454ada40>

# A/B test analysis for added card tracking information
Vladas Jankus<br/>
2021-04-30

### Contents:
* [1. Test description](#1)
    * [1.1. Defining the sample](#11)
    * [1.2. Sample criteria](#12)
    * [1.3. CS contacts and their tags](#13)
* [2. Effect on CS contacts](#2)
  * [2.1. Customers who reached out to CS](#21)
  * [2.2. Customers who clicked and reached out to CS](#22)
  * [2.3. CS tag dynamics](#23)
* [3. Conclusion](#3)

# TL;DR

The purpose of this A/B test was to find out whether adding card tracking information with an <em>'update tracking information'</em> button would reduce the volume of CS contacts from customers seeking reassurance about their delivery.

The sample size was around 30k customers. Around 10% of customers in the treatment group interacted with the button.

The experiment showed that the new information did not have a significant impact on overall CS contacts: roughly 89.6% of customers in both the control and treatment groups did not contact CS. Analysis of tag dynamics showed that the button was clicked more often by customers who had card delivery issues, but they still ended up contacting CS.

# 1. Test description <a class="anchor" id="1"></a>

The purpose of this test was to roll out an 'update tracking information' button to a subset of customers and check whether it reduces their CS contacts. The idea was that a number of customers contact CS for reassurance about their card shipping, and a convenient way to check their shipping information would divert them from contacting CS.

Experiment start - 2021-04-15. This is the date when the button was rolled out to 50% of customers.

Experiment end - 2021-04-30. Data up to this date was included in the analysis.

Customers eligible to participate in the experiment:
    <ul>
    <li><b>Has ordered a new physical card with standard shipment since the experiment start.</b> This targets only customers who fit the scenario of contacting CS because of a card shipping address change. Customers who did not order a card are not targeted by this experiment.</li>
    <li>Has logged in to the N26 app since the experiment start using version <b>n26-android_3.60 or n26-ios_3.60</b>. At the time of the experiment, this was the newest public app version.</li>
    </ul>
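
The 50% rollout is handled in the query above by `is_user_in_tg(user_id, 'rutherfordium.update_tracking_info', 50)`, whose internals are not shown in this notebook. As a rough illustration only, a deterministic split of this kind is often implemented by hashing the user and experiment name into a bucket. The function `in_treatment` below is a hypothetical stand-in, not the actual implementation, and the user ids are made up.

```python
import hashlib


def in_treatment(user_id: str, experiment: str, treatment_pct: int = 50) -> bool:
    """Hypothetical stand-in for is_user_in_tg: deterministic hash bucketing."""
    # Hash experiment + user so the same user can land in different
    # groups for different experiments, but always the same group for one.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in 0..99
    return bucket < treatment_pct


# Made-up user ids, only to show that the split is close to 50/50.
users = [f"user-{i}" for i in range(10_000)]
assignments = [in_treatment(u, "rutherfordium.update_tracking_info") for u in users]
treatment_share = sum(assignments) / len(assignments)
print(f"treatment share: {treatment_share:.3f}")
```

Because the assignment depends only on the user id and the experiment name, a customer sees the same variant on every login, which is what makes the later group comparison meaningful.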

### 1.1. Defining the sample <a class="anchor" id="11"></a>

Not all eligible customers are included in the sample, because not all of them are relevant for the test. There is a small pool of customers who ordered a card without ever opening the cards view in the app. These could be customers who either ordered a card by other means (webapp/CS) or whose data is missing in Snowplow. Since these customers did not visit the cards view, they had no chance to see the button and are therefore not the target of this experiment. The chart below displays the situation visually.

In [29]:
# See customer split
visited_screen = con.execute(
    """
        select 
            case when treatment_group = true then 'Treatment' else 'Control' end as user_group,
            case 
                when button_clicked = true then 'Clicked on button'
                when viewed_cards_tab = true then 'Viewed cards tab'
                else 'Did not view cards tab'
            end as customer_action,
            case 
                when button_clicked = true then 2
                when viewed_cards_tab = true then 1
                else 3
            end as order,
            count(*) as customers
        from raw_data
        group by 1, 2, 3
    """
).fetchdf()

# define colors
color_scale = alt.Scale(
    domain=["Clicked on button", "Viewed cards tab", "Did not view cards tab"],
    range=["#5396D5", "#58B99D", "#EAC645"],
)

# build chart
alt.Chart(visited_screen).mark_bar().encode(
    x=alt.X("sum(customers)", title="Customers"),
    y=alt.Y("user_group", title=None),
    color=alt.Color(
        "customer_action",
        legend=alt.Legend(title="Customer Action"),
        scale=color_scale,
    ),
    order=alt.Order("order", sort="ascending"),
).properties(
    width=600, height=300, title="Customers with app version 3.6 and ordered card"
).display()

In [31]:
current_date = datetime.today().strftime("%Y-%m-%d")
ovr_customers = round(user_data("ovr_customers"))
# round the share so the text does not print the full float
treatment_ratio = round(user_data("ovr_treatment_perc"), 1)
card_no_view_rate = round(user_data("no_card_view_perc"), 2)
click_cst = round(user_data("button_clicked"))
click_ratio = round(user_data("button_click_rate"), 1)

md(
    f"As of {current_date} there were {ovr_customers} users who had the required app version and ordered a card since \
    the 15th of April. {treatment_ratio}% of these customers are in the treatment group, the rest in the control group.<br/>\
    <br/>\
    Around {card_no_view_rate}% of customers in both groups did not view the cards tab.\
    These customers are also excluded from the experiment, because they are\
    not the target group for the test. They have probably ordered a card using the\
    webapp.<br/>\
    <br/>\
    In the treatment group, {click_cst} customers clicked on the button for updating tracking information.\
    This was {click_ratio}% of treatment customers in the experiment."
)

As of 2021-04-30 there were 30493 users who had the required app version and ordered a card since the 15th of April. 50.0% of these customers are in the treatment group, the rest in the control group.<br/> <br/> Around 10.72% of customers in both groups did not view the cards tab. These customers are also excluded from the experiment, because they are not the target group for the test. They have probably ordered a card using the webapp.<br/> <br/> In the treatment group, 1348 customers clicked on the button for updating tracking information. This was 9.9% of treatment customers in the experiment.

### 1.2. Sample criteria <a class="anchor" id="12"></a>

In [37]:
sample_size = round(user_data("total_sample"))
treatment_ratio = round(
    user_data("treatment_sample") / user_data("total_sample") * 100, 1
)

md(
    f"Here are the final requirements for customers to be included in the experiment:\
    <ol>\
        <b>\
        <li>App version 3.60</li>\
        <li>Ordered a new card</li>\
        <li>Visited cards tab</li>\
        </b>\
    </ol>\
    This leaves us with {sample_size} customers, {treatment_ratio}% in the treatment group and the rest in control."
)

Here are the final requirements for customers to be included in the experiment:    <ol>        <b>        <li>App version 3.60</li>        <li>Ordered a new card</li>        <li>Visited cards tab</li>        </b>    </ol>    This leaves us with 27224 customers, 50.1% in the treatment group and the rest in control.
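
The last criterion (visited the cards tab) is applied in the later queries with `where viewed_cards_tab = True`. A minimal sketch of that filtering step and of the resulting treatment share is shown here on a handful of made-up rows. The rows are toy values, not experiment data, and the variable names are illustrative.

```python
from collections import Counter

# Toy rows, NOT experiment data: one dict per eligible customer.
customers = [
    {"treatment_group": True, "viewed_cards_tab": True},
    {"treatment_group": True, "viewed_cards_tab": False},
    {"treatment_group": False, "viewed_cards_tab": True},
    {"treatment_group": False, "viewed_cards_tab": True},
]

# Keep only customers who could have seen the button.
sample = [c for c in customers if c["viewed_cards_tab"]]

# Count customers per group inside the final sample.
group_sizes = Counter(
    "treatment" if c["treatment_group"] else "control" for c in sample
)
treatment_share = round(group_sizes["treatment"] / len(sample) * 100, 1)
print(len(sample), dict(group_sizes), treatment_share)
```

Applying the filter to both groups in the same way keeps the comparison fair, since viewing the cards tab happens before a customer can notice the button.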

### 1.3. CS contacts and their tags <a class="anchor" id="13"></a>

This experiment includes only those CS contacts that were made between the card order time and card activation. If the card has not been activated, a 5-day period from the order date is used instead. This method should capture most CS contacts that happened during the shipment period, because the average order-to-activation time was 5 days between the 15th and 30th of April.
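
The window rule can be sketched as a small function that mirrors the join condition in the main query (`initiated_date >= order_date` and `initiated_date < coalesce(card_activated, order_date + interval '5 days')`). The function name and the example dates are illustrative assumptions, not part of the original analysis.

```python
from datetime import datetime, timedelta
from typing import Optional

# Fallback window used when the card has not been activated yet.
FALLBACK_WINDOW = timedelta(days=5)


def in_shipment_window(
    contact_time: datetime,
    order_date: datetime,
    card_activated: Optional[datetime] = None,
) -> bool:
    """Return True if a CS contact falls between card order and activation."""
    if card_activated is not None:
        window_end = card_activated
    else:
        window_end = order_date + FALLBACK_WINDOW
    # Start is inclusive, end is exclusive, as in the SQL join condition.
    return order_date <= contact_time < window_end


order = datetime(2021, 4, 16)
print(in_shipment_window(datetime(2021, 4, 18), order))  # inside fallback window
print(in_shipment_window(datetime(2021, 4, 18), order, datetime(2021, 4, 17)))  # after activation
```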

In this experiment, several tags were selected as relevant to the test. Their results are displayed separately from other CS tags to give a better understanding of how customers contacted CS. The tags and the reasons for selecting them are:
1. `card_delivery_contacts` - the primary tag for card delivery
2. `contact_data_contacts` - shipping address changes or questions should be tagged with contact data
3. `legal_data_contacts` - this tag may be selected for shipping address issues, or when the customer needed reassurance on a related field
4. `not_tagged_contacts` - there is a fair share of untagged contacts. They were included because customers who contact CS for reassurance may describe their issue vaguely, resulting in no tag. Also, since only about 5 days after the card order are considered on average, this tag could contain many of our target customers.

All other tags are grouped under the `other cs contact` label.
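
The grouping of customers by tag can be sketched as follows. It mirrors the precedence used in the chart queries further down, where a customer with at least one related tag counts as a related contact even if they also had other contacts. The function name and the tag `account_closure` in the example are hypothetical.

```python
# Tags treated as possibly related to the test, as listed above.
RELATED_TAGS = {"card_delivery", "contact_data", "legal_data", "not_tagged"}


def classify_customer(contact_tags):
    """Map the list of a customer's CS tags to one of three chart labels."""
    if not contact_tags:
        return "Did not contact CS"
    # Any related tag wins over other tags, matching the SQL case order.
    if any(tag in RELATED_TAGS for tag in contact_tags):
        return "Related CS Contact"
    return "Other CS Contact"


print(classify_customer([]))
print(classify_customer(["account_closure"]))  # hypothetical unrelated tag
print(classify_customer(["account_closure", "card_delivery"]))
```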

# 2. Effect on CS contacts <a class="anchor" id="2"></a>

### 2.1. Customers who reached out to CS <a class="anchor" id="21"></a>

First, let's look at the number of customers who reached out to CS. The chart below displays the percentage of customers who reached out to CS either with a possibly test-related issue or with another type of issue. Possibly test-related issues are defined here by the CS tags 'card_delivery', 'contact_data', 'legal_data' and 'not_tagged'.

In [38]:
data = con.execute(
    """
        with a as (
            select 
                case when treatment_group = true then 'Treatment' else 'Control' end as user_group,
                case 
                    when contacted_for_cards = true then 'Related CS Contact'
                    when contacted_cs = true then 'Other CS Contact'
                    else 'Did not contact CS'
                end as contacted_cs,
                case 
                    when contacted_for_cards = true then 1
                    when contacted_cs = true then 2
                    else 3
                end as row_order,
                count(*) as customers
            from raw_data
            where viewed_cards_tab = True
            group by 1, 2, 3
        )
        select 
            user_group,
            contacted_cs,
            row_order,
            customers,
            customers::numeric / sum(customers) over (partition by user_group) as percentage
        from a 
    """
).fetchdf()

# define colors
color_scale = alt.Scale(
    domain=["Related CS Contact", "Other CS Contact", "Did not contact CS"],
    range=["#58B99D", "#EAC645", "#98A5A5"],
)

bars = (
    alt.Chart(data)
    .mark_bar()
    .encode(
        x=alt.X("customers", title="Customers"),
        y=alt.Y("user_group", title=None),
        color=alt.Color(
            "contacted_cs",
            legend=alt.Legend(title="Cst-CS interaction"),
            scale=color_scale,
        ),
        order=alt.Order("row_order", sort="ascending"),
    )
    .properties(width=800, height=300, title="Treatment vs Control CS contacts")
)

text = (
    alt.Chart(data)
    .mark_text(align="center", fontSize=12, dx=-17, dy=3, color="white")
    .encode(
        x=alt.X("sum(customers)", stack="zero"),
        y=alt.Y("user_group"),
        detail="contacted_cs",
        text=alt.Text("sum(percentage)", format=".1%"),
        order=alt.Order("row_order", sort="ascending"),
    )
)

bars + text

In [45]:
customer_contacts_pp_change = round(
    (user_data("treatment_no_cs") / user_data("treatment_sample") * 100)
    - (user_data("control_no_cs") / user_data("control_sample") * 100),
    1,
)

# do a binomial test
p_val = round(
    binom_test(
        x=user_data("treatment_no_cs"),
        n=user_data("treatment_sample"),
        p=user_data("control_no_cs") / user_data("control_sample"),
        alternative="greater",
    ),
    3,
)

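md(
    f"The share of customers who did not contact CS changed by {customer_contacts_pp_change} percentage points\
    (treatment minus control). With a p-value of {p_val}, the one-sided binomial test gives no evidence that the\
    treatment increased the share of customers without CS contacts. The share of customers who contacted CS is\
    practically identical in both groups."
)

The share of customers who did not contact CS changed by -0.1 percentage points (treatment minus control). With a p-value of 0.624, the one-sided binomial test gives no evidence that the treatment increased the share of customers without CS contacts. The share of customers who contacted CS is practically identical in both groups.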
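
The binomial test above treats the control group's rate as a fixed, known value. One possible cross-check, which is not what this analysis used, is a pooled two-proportion z-test that accounts for sampling noise in both groups. The sketch below uses only the standard library, and the counts in the example are illustrative, not the experiment's figures.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative counts only: identical rates give z = 0 and p = 1.
z_same, p_same = two_proportion_z_test(896, 1000, 896, 1000)
# Illustrative counts only: a 5 pp gap on 1000 customers per group.
z_diff, p_diff = two_proportion_z_test(950, 1000, 900, 1000)
print(z_same, p_same)
print(round(z_diff, 2), p_diff)
```

With groups of roughly equal size, as in this experiment, both approaches would normally point in the same direction; the two-sample version is simply more conservative about the uncertainty in the control rate.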

### 2.2. Customers who clicked and reached out to CS <a class="anchor" id="22"></a>

In [46]:
control_sample = round(user_data("control_sample"))

md(
    f"We can also look at the customers who actually interacted with the button. It is possible that this\
    narrower group had fewer CS contacts, but it made up only {click_ratio}% of the overall treatment\
    group. The chart below displays CS contacts of the {click_cst} customers who clicked on the button vs. all\
    {control_sample} customers in the control group."
)

We can also look at the customers who actually interacted with the button. It is possible that this narrower group had fewer CS contacts, but it made up only 9.9% of the overall treatment group. The chart below displays CS contacts of the 1348 customers who clicked on the button vs. all 13586 customers in the control group.

In [47]:
data = con.execute(
    """
        with a as (
            select 
                case 
                    when button_clicked = true then 'Clicked Button' 
                    when treatment_group = false then 'Control' 
                    else null 
                end as user_group,
                case 
                    when contacted_for_cards = true then 'Related CS Contact'
                    when contacted_cs = true then 'Other CS Contact'
                    else 'Did not contact CS'
                end as contacted_cs,
                case 
                    when contacted_for_cards = true then 1
                    when contacted_cs = true then 2
                    else 3
                end as row_order,
                count(*) as customers
            from raw_data
            where viewed_cards_tab = True
                and (
                    button_clicked = true
                    or treatment_group = false
                )
            group by 1, 2, 3
        )
        select 
            user_group,
            contacted_cs,
            row_order,
            customers,
            customers::numeric / sum(customers) over (partition by user_group) as percentage
        from a 
    """
).fetchdf()

# define colors
color_scale = alt.Scale(
    domain=["Related CS Contact", "Other CS Contact", "Did not contact CS"],
    range=["#58B99D", "#EAC645", "#98A5A5"],
)

bars = (
    alt.Chart(data)
    .mark_bar()
    .encode(
        x=alt.X("percentage", title="% of Customers"),
        y=alt.Y("user_group", title=None),
        color=alt.Color(
            "contacted_cs",
            legend=alt.Legend(title="Cst-CS interaction"),
            scale=color_scale,
        ),
        order=alt.Order("row_order", sort="ascending"),
    )
    .properties(width=800, height=300, title="Control vs. Clicked button CS contacts")
)

text = (
    alt.Chart(data)
    .mark_text(align="center", fontSize=12, dx=-17, dy=3, color="white")
    .encode(
        x=alt.X("percentage", stack="zero"),
        y=alt.Y("user_group"),
        detail="contacted_cs",
        text=alt.Text("percentage", format=".1%"),
        order=alt.Order("row_order", sort="ascending"),
    )
)

bars + text

In [56]:
customer_contacts_pp_change = round(
    (user_data("button_no_cs") / user_data("button_clicked") * 100)
    - (user_data("control_no_cs") / user_data("control_sample") * 100),
    1,
)

related_contacts_pp_change = round(
    (user_data("button_related_cs") / user_data("button_clicked") * 100)
    - (user_data("control_related_cs") / user_data("control_sample") * 100),
    1,
)

unrelated_contacts_pp_change = round(
    (user_data("button_unrelated_cs") / user_data("button_clicked") * 100)
    - (user_data("control_unrelated_cs") / user_data("control_sample") * 100),
    1,
)

# do a binomial test
p_val = round(
    binom_test(
        x=user_data("button_no_cs"),
        n=user_data("button_clicked"),
        p=user_data("control_no_cs") / user_data("control_sample"),
        alternative="less",
    ),
    3,
)

# a p-value that rounds to 0.0 at three decimals is reported as "below 0.001"
p_text = f"of {p_val}" if p_val >= 0.001 else "below 0.001"

md(
    f"Looking at the chart, it appears that a higher percentage of customers who interacted with the button ended \
    up contacting CS. The change in customers without CS contacts was {customer_contacts_pp_change} percentage points. \
    With a p-value {p_text}, the difference is significant enough to confirm that the customers who clicked on\
    the button contacted CS more than the ones in the control group.\
    <br/>\
    <br/>\
    Additionally, customers who clicked on the button contacted CS much more often using the tags relevant to this\
    research (card_delivery, contact_data, legal_data and not_tagged). The percentage point change among these contacts\
    was {related_contacts_pp_change}, which is considerably larger than the {unrelated_contacts_pp_change} pp change in\
    other types of CS contacts.\
    <br/>\
    <br/>\
    Customers who clicked are a self-selected group, so this is not a causal effect of the button. It rather suggests\
    that the button is used more by customers who are actually facing a problem, but apparently\
    it does not really solve it."
)

Looking at the chart, it appears that a higher percentage of customers who interacted with the button ended up contacting CS. The change in customers without CS contacts was -4.6 percentage points. With a p-value below 0.001, the difference is significant enough to confirm that the customers who clicked on the button contacted CS more than the ones in the control group.<br/> <br/> Additionally, customers who clicked on the button contacted CS much more often using the tags relevant to this research (card_delivery, contact_data, legal_data and not_tagged). The percentage point change among these contacts was 3.8, which is considerably larger than the 0.8 pp change in other types of CS contacts.<br/> <br/> Customers who clicked are a self-selected group, so this is not a causal effect of the button. It rather suggests that the button is used more by customers who are actually facing a problem, but apparently it does not really solve it.

### 2.3. CS tag dynamics <a class="anchor" id="23"></a>

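Finally, let us take a look at the dynamics of separate tags for three groups of customers: a) the control group, b) treatment customers who did not click on the button, and c) treatment customers who clicked on the button. This could give us some insight into what issues the customers were facing.

Since far fewer customers clicked on the button, the side-by-side charts show a normalised rate rather than absolute counts. Note that one customer can have several contacts with different contact reasons, and all of them are counted in the numerator. In the code below, the denominator is the total number of contacts in each group (among customers with at least one contact), so each bar is effectively the share of the group's contacts that carry the given tag.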
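
The query in the next cell divides each tag's contacts by `sum(contacts)`, the total number of contacts in the group, even though the result column is named `contacts_per_cst`. The toy example below shows how that share of contacts differs from a literal contacts-per-customer figure. The rows are made up for illustration and are not experiment data.

```python
# Toy rows, NOT experiment data: one dict per customer with >= 1 contact.
rows = [
    {"contacts": 2, "card_delivery_contacts": 1},
    {"contacts": 1, "card_delivery_contacts": 1},
    {"contacts": 3, "card_delivery_contacts": 0},
]

tag_contacts = sum(r["card_delivery_contacts"] for r in rows)
total_contacts = sum(r["contacts"] for r in rows)
customers = len(rows)

# What the query computes: share of all contacts carrying the tag.
share_of_contacts = tag_contacts / total_contacts
# What the axis title suggests: tagged contacts per contacting customer.
contacts_per_customer = tag_contacts / customers

print(round(share_of_contacts, 3), round(contacts_per_customer, 3))
```

The share-of-contacts version is the one that fits the binomial tests in the cell, because there the number of trials `n` is the group's total contact count.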

In [57]:
tags = [
    "card_delivery_contacts",
    "contact_data_contacts",
    "legal_data_contacts",
    "not_tagged_contacts",
]
charts = []
test_dict = {}
colors = [
    "#63BA9D",
    "#EAC645",
    "#5296D6",
    "#925CB0",
    "#549E87",
    "#E6A03B",
    "#447EB4",
    "#8648A7",
]
for idx, val in enumerate(tags):
    test_setup = con.execute(
        f"""
            select
                case 
                    when treatment_group = False then 'Control group'
                    when button_clicked = True then 'Treatment (click)'
                    else 'Treatment (no click)'
                end as user_group,
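                sum({val}) as contacts,
                -- note: despite the alias, this is the group's total number of contacts
                sum(contacts) as customers_in_group,
                -- share of the group's contacts that carry this tag
                sum({val}) / sum(contacts) contacts_per_cst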
            from raw_data
            where contacts >= 1
            group by 1
            order by 1
        """
    ).fetchdf()

    bars = (
        alt.Chart(test_setup)
        .mark_bar()
        .encode(
            x=alt.X(
                "user_group",
                title=None,
                axis=alt.Axis(labels=False),
                sort=["Treatment (click)", "Control group", "Treatment (no click)"],
            ),
            y=alt.Y("contacts_per_cst:Q", title="Contacts per customer"),
            color=alt.condition(
                alt.datum.user_group == "Control group",
                alt.value(colors[idx + 4]),
                alt.value(colors[idx]),
            ),
        )
        .properties(width=150, height=400, title=f"{val}")
    )

    text = (
        alt.Chart(test_setup)
        .mark_text(
            align="left",
            fontSize=14,
            dy=6,
            dx=5,
            color="white",
            angle=270,
            baseline="bottom",
        )
        .encode(
            x=alt.X(
                "user_group",
                sort=["Treatment (click)", "Control group", "Treatment (no click)"],
            ),
            y=alt.value(400),
            text=alt.Text("user_group"),
        )
    )

    charts.append(bars + text)

    # perform tests for all values
    click_contacts = round(test_setup.iloc[1]["contacts"])
    click_size = round(test_setup.iloc[1]["customers_in_group"])
    no_click_contacts = round(test_setup.iloc[2]["contacts"])
    no_click_size = round(test_setup.iloc[2]["customers_in_group"])
    control_rate = test_setup.iloc[0]["contacts_per_cst"]

    click_less = round(
        binom_test(x=click_contacts, n=click_size, p=control_rate, alternative="less"),
        2,
    )
    click_more = round(
        binom_test(
            x=click_contacts, n=click_size, p=control_rate, alternative="greater"
        ),
        2,
    )
    no_click_less = round(
        binom_test(
            x=no_click_contacts, n=no_click_size, p=control_rate, alternative="less"
        ),
        2,
    )
    no_click_more = round(
        binom_test(
            x=no_click_contacts, n=no_click_size, p=control_rate, alternative="greater"
        ),
        2,
    )

    test_dict[val] = {
        "click": {
            "contacts": click_contacts,
            "group_size": click_size,
            "p_value_less": click_less,
            "p_value_more": click_more,
        },
        "no_click": {
            "contacts": no_click_contacts,
            "group_size": no_click_size,
            "p_value_less": no_click_less,
            "p_value_more": no_click_more,
        },
    }

test_setup = con.execute(
    f"""
        select
            case 
                when treatment_group = False then 'Control group'
                when button_clicked = True then 'Treatment (click)'
                else 'Treatment (no click)'
            end as user_group,
            sum(contacts - card_delivery_contacts - contact_data_contacts - legal_data_contacts - not_tagged_contacts) as contacts,
            sum(contacts) as customers_in_group,
            sum(contacts - card_delivery_contacts - contact_data_contacts - legal_data_contacts - not_tagged_contacts) / sum(contacts) contacts_per_cst
        from raw_data
        where contacts >= 1
        group by 1
        order by 1
    """
).fetchdf()

bars = (
    alt.Chart(test_setup)
    .mark_bar()
    .encode(
        x=alt.X(
            "user_group",
            title=None,
            axis=alt.Axis(labels=False),
            sort=["Treatment (click)", "Control group", "Treatment (no click)"],
        ),
        y=alt.Y("contacts_per_cst:Q", title="Contacts per customer"),
        color=alt.condition(
            alt.datum.user_group == "Control group",
            alt.value("#B24334"),
            alt.value("#D65746"),
        ),
    )
    .properties(width=150, height=400, title="other_cs_contacts")
)

text = (
    alt.Chart(test_setup)
    .mark_text(
        align="left",
        fontSize=14,
        dy=6,
        dx=5,
        color="white",
        angle=270,
        baseline="bottom",
    )
    .encode(
        x=alt.X(
            "user_group",
            sort=["Treatment (click)", "Control group", "Treatment (no click)"],
        ),
        y=alt.value(400),
        text=alt.Text("user_group"),
    )
)
charts.append(bars + text)

click_contacts = round(test_setup.iloc[1]["contacts"])
click_size = round(test_setup.iloc[1]["customers_in_group"])
no_click_contacts = round(test_setup.iloc[2]["contacts"])
no_click_size = round(test_setup.iloc[2]["customers_in_group"])
control_rate = test_setup.iloc[0]["contacts_per_cst"]

click_less = round(
    binom_test(x=click_contacts, n=click_size, p=control_rate, alternative="less"), 2
)
click_more = round(
    binom_test(x=click_contacts, n=click_size, p=control_rate, alternative="greater"), 2
)
no_click_less = round(
    binom_test(
        x=no_click_contacts, n=no_click_size, p=control_rate, alternative="less"
    ),
    2,
)
no_click_more = round(
    binom_test(
        x=no_click_contacts, n=no_click_size, p=control_rate, alternative="greater"
    ),
    2,
)

test_dict["other_cs_contacts"] = {
    "click": {
        "contacts": click_contacts,
        "group_size": click_size,
        "p_value_less": click_less,
        "p_value_more": click_more,
    },
    "no_click": {
        "contacts": no_click_contacts,
        "group_size": no_click_size,
        "p_value_less": no_click_less,
        "p_value_more": no_click_more,
    },
}


alt.hconcat(*charts)

##### Control group vs. treatment (no click)

In [58]:
var_1 = test_dict["contact_data_contacts"]["no_click"]["p_value_more"]
var_2 = test_dict["not_tagged_contacts"]["no_click"]["p_value_less"]

md(
    f"Comparing the control group with treatment customers who <b>did not click</b> on the button shows no significant\
    differences for any contact reason except contact data and not tagged. For all other tags the p-values are higher than 0.05.\
    <br/>\
    <br/>\
    Contact data showed a p-value of {var_1} (higher than control) and not tagged contacts also showed a p-value of {var_2}\
    (lower than control). This observation is interesting because the overall share of customers who contacted CS did not\
    differ between these groups. It could suggest that the presence of the tracking information allowed customers to express their\
    problem more clearly, leading to more contacts tagged with the contact data tag. This, however, is\
    just speculation, because the nature of untagged contacts is not entirely clear."
)

Comparing the control group with treatment customers who <b>did not click</b> on the button shows no significant differences for any contact reason except contact data and not tagged. For all other tags the p-values are higher than 0.05.<br/> <br/> Contact data showed a p-value of 0.01 (higher than control) and not tagged contacts also showed a p-value of 0.01 (lower than control). This observation is interesting because the overall share of customers who contacted CS did not differ between these groups. It could suggest that the presence of the tracking information allowed customers to express their problem more clearly, leading to more contacts tagged with the contact data tag. This, however, is just speculation, because the nature of untagged contacts is not entirely clear.

##### Control group vs. treatment (click)

In [59]:
var_1 = test_dict["contact_data_contacts"]["click"]["p_value_more"]
var_2 = test_dict["legal_data_contacts"]["click"]["p_value_more"]
var_3 = test_dict["not_tagged_contacts"]["click"]["p_value_less"]
var_4 = test_dict["contact_data_contacts"]["click"]["contacts"]
var_5 = test_dict["legal_data_contacts"]["click"]["contacts"]
var_6 = test_dict["card_delivery_contacts"]["click"]["p_value_more"]
var_7 = test_dict["other_cs_contacts"]["click"]["p_value_less"]

md(
    f"When comparing the control group against the customers <b>who clicked</b> on the button, there are no significant \
    differences for contact data, legal data and not tagged contacts, with respective p-values of {var_1}, {var_2}\
    and {var_3}. The contact data and legal data results also carry low confidence because they rest on few observations,\
    {var_4} and {var_5} respectively.<br>\
    <br>\
    Card delivery and other CS contacts had p-values of {var_6} and {var_7}, showing significant differences. This\
    could suggest that this group has a higher concentration of customers with card delivery problems. They clicked \
    on the button looking for a solution to their issue, but the button could not solve it."
)

When comparing the control group against the customers <b>who clicked</b> on the button, there are no significant differences for contact data, legal data and not tagged contacts, with respective p-values of 0.09, 0.24 and 0.44. The contact data and legal data results also carry low confidence because they rest on few observations, 20 and 7 respectively.<br> <br> Card delivery and other CS contacts had p-values of 0.0 and 0.01, showing significant differences. This could suggest that this group has a higher concentration of customers with card delivery problems. They clicked on the button looking for a solution to their issue, but the button could not solve it.
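
For readers who want to see how a one-sided p-value such as `p_value_more` can arise, here is a minimal sketch of an exact binomial tail test. It assumes the control group's contact rate is used as the null proportion, and it uses the card-related counts from the summary query at the end of this notebook (`button_related_cs`, `control_related_cs`), which may not match the `card_delivery_contacts` tag exactly. The helper `binom_tail_greater` is hypothetical and uses only the standard library; `scipy.stats.binomtest(k, n, p, alternative="greater")` computes the same tail.

```python
from math import exp, lgamma, log


def binom_tail_greater(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    log_p, log_q = log(p), log(1.0 - p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (
            lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        total += exp(log_pmf)
    return min(total, 1.0)


# Counts copied from the summary query output at the end of this notebook.
control_size, control_card_contacts = 13586, 612
click_size, click_card_contacts = 1348, 112

# Assumption: the control rate serves as the null proportion for the click group.
control_rate = control_card_contacts / control_size
click_rate = click_card_contacts / click_size
p_value_more = binom_tail_greater(click_card_contacts, click_size, control_rate)

print(f"control rate: {control_rate:.3f}, click rate: {click_rate:.3f}")
print(f"one-sided p-value (more contacts than control): {p_value_more:.2g}")
```

With these counts the click group's card-related contact rate is roughly 8.3% against roughly 4.5% in control, so the upper-tail p-value is far below 0.05. That is consistent with the card delivery p-value being displayed as 0.0.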

# 3. Conclusion <a class="anchor" id="3"></a>

Adding card tracking information did not affect the volume of CS contacts: the control and treatment groups had the same proportion of customers who contacted CS (89.6% of each group did not contact CS). Rolling out the card tracking information and the button in the app would therefore not be expected to reduce the overall number of CS contacts.

Looking at the customers who clicked the button, we saw a higher concentration of customers who were looking for a solution to their card delivery problem. These customers had a higher percentage of CS contacts: they were more likely to contact CS about card delivery issues and less likely to contact CS for other reasons. It must again be emphasized that this did not affect the overall number of CS contacts; it only showed that the customers who clicked on the button were looking for a solution.

In [60]:
HTML(
    """
<script>
    code_show=true; 
    function code_toggle() {
        if (code_show){
            $('div.input').hide();
        } else {
            $('div.input').show();
            }
        code_show = !code_show
 
        $('div.output_subarea').css("text-align", "center"); 
        $('body').css("font-family", "Montserrat, sans-serif");
        $('h1').css("font-family", "Karla, sans-serif");
        $('h2').css("font-family", "Karla, sans-serif");
    } 
    $( document ).ready(code_toggle);
    
    </script>
    <form action="javascript:code_toggle()">
        <input type="submit" value="Click here to toggle on/off the raw code.">
    </form>
"""
)

In [55]:
def user_data(col):
    return cst_data.iloc[0][col]


cst_data = con.execute(
    """
        select
           count(*) as ovr_customers,
           avg((treatment_group = True)::int) * 100 as ovr_treatment_perc,
           avg((viewed_cards_tab = False)::int) * 100 as no_card_view_perc,
           sum((button_clicked = True)::int) as button_clicked,
           avg(
                case
                    when viewed_cards_tab = True
                        and treatment_group = True
                    then (button_clicked = True)::int
                    else null
                end
            ) * 100 as button_click_rate,
            sum((viewed_cards_tab = True)::int) as total_sample,
            sum((viewed_cards_tab = True and treatment_group = True)::int) as treatment_sample,
            sum((viewed_cards_tab = True and treatment_group = False)::int) as control_sample,
            sum((viewed_cards_tab = True and treatment_group = True and contacted_cs = False)::int) as treatment_no_cs,
            sum((viewed_cards_tab = True and treatment_group = False and contacted_cs = False)::int) as control_no_cs,
            sum((button_clicked = True and contacted_cs = False)::int) as button_no_cs,
            sum((button_clicked = True and contacted_for_cards = True)::int) as button_related_cs,
            sum((button_clicked = True and contacted_cs = True and contacted_for_cards = False)::int) as button_unrelated_cs,
            sum((viewed_cards_tab = True and treatment_group = False and contacted_for_cards = True)::int) as control_related_cs,
            sum((viewed_cards_tab = True and treatment_group = False and contacted_cs = True and contacted_for_cards = False)::int) as control_unrelated_cs
            
        from raw_data
    """
).fetchdf()
cst_data

Unnamed: 0,ovr_customers,ovr_treatment_perc,no_card_view_perc,button_clicked,button_click_rate,total_sample,treatment_sample,control_sample,treatment_no_cs,control_no_cs,button_no_cs,button_related_cs,button_unrelated_cs,control_related_cs,control_unrelated_cs
0,30493,49.99836,10.720493,1348.0,9.884147,27224.0,13638.0,13586.0,12215.0,12179.0,1147.0,112.0,89.0,612.0,795.0
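
The counts in the row above are enough to cross-check the headline result. The sketch below computes the share of customers who did not contact CS in each group and runs a pooled two-proportion z-test. This is an illustrative check under a normal approximation, not the test used elsewhere in this notebook, and the function name `two_proportion_z_test` is made up for the example.

```python
from math import erfc, sqrt


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: both groups share one proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail of the standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return z, erfc(abs(z) / sqrt(2))


# Counts copied from the summary row (customers who viewed the cards tab).
treatment_sample, treatment_no_cs = 13638, 12215
control_sample, control_no_cs = 13586, 12179

treatment_share = treatment_no_cs / treatment_sample * 100
control_share = control_no_cs / control_sample * 100
z, p_value = two_proportion_z_test(
    treatment_no_cs, treatment_sample, control_no_cs, control_sample
)

print(f"no CS contact: treatment {treatment_share:.1f}%, control {control_share:.1f}%")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.2f}")
```

Both shares round to 89.6%, and the p-value is well above 0.05, in line with the conclusion that the treatment did not change the overall share of customers contacting CS.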
