# Add an Image Funnel Analysis

The Phabricator task for this work is [T300683](https://phabricator.wikimedia.org/T300683).

We're doing an analysis similar to the one we did for Add a Link, and we'll apply some of what we learned from it. Things to keep in mind are:

* Carefully define the steps of the funnel in such a way that it translates into a straightforward SQL query.
* Figure out whether certain questions are better answered by calculating user proportions (e.g. the overarching key question of task completion is not necessarily covered by the funnel, and should instead allow for users starting the task multiple times).
* The funnel does not end with a completed edit, but with the open question of whether the edit was reverted.
* The number of users on mobile vs desktop was unexpected. This might be because desktop users are asked to set topics and tasks to activate the module, whereas mobile users have immediate access to random tasks. We might therefore want to consider starting the funnel earlier (e.g. Visits Homepage) so that we can understand the path to clicking a task.

From the Add a Link discussion we got many notes about possible paths, such as "how many users do X?" and "how many do X then Y?". These are always interesting, but will either be outside the scope of this analysis or need to be answered with specific queries.



In [1]:
import datetime as dt

import pandas as pd
import numpy as np

from collections import defaultdict

from wmfdata import spark, mariadb

from scipy import stats

In [2]:
## We'll gather data from December 2021 and January 2022, because at the time of
## data gathering we have complete data for both months.

start_date = dt.date(2021, 12, 1)
end_date = dt.date(2022, 1, 31)

## List of wikis that we're gathering data from:
wikis = ['arwiki', 'bnwiki', 'cswiki']

## Lists of known users to ignore (e.g. test accounts and experienced users)
known_users = defaultdict(set)
known_users['cswiki'].update([14, 127629, 303170, 342147, 349875, 44133, 100304, 307410, 439792, 444907,
                              454862, 456272, 454003, 454846, 92295, 387915, 398470, 416764, 44751, 132801,
                              137787, 138342, 268033, 275298, 317739, 320225, 328302, 339583, 341191,
                              357559, 392634, 398626, 404765, 420805, 429109, 443890, 448195, 448438,
                              453220, 453628, 453645, 453662, 453663, 453664, 440694, 427497, 272273,
                              458025, 458487, 458049, 59563, 118067, 188859, 191908, 314640, 390445,
                              451069, 459434, 460802, 460885, 79895, 448735, 453176, 467557, 467745,
                              468502, 468583, 468603, 474052, 475184, 475185, 475187, 475188, 294174,
                              402906, 298011])

known_users['kowiki'].update([303170, 342147, 349875, 189097, 362732, 384066, 416362, 38759, 495265,
                              515553, 537326, 566963, 567409, 416360, 414929, 470932, 472019, 485036,
                              532123, 558423, 571587, 575553, 576758, 360703, 561281, 595100, 595105,
                              595610, 596025, 596651, 596652, 596653, 596654, 596655, 596993, 942,
                              13810, 536529])

known_users['viwiki'].update([451842, 628512, 628513, 680081, 680083, 680084, 680085, 680086, 355424,
                              387563, 443216, 682713, 659235, 700934, 705406, 707272, 707303, 707681, 585762])

known_users['arwiki'].update([237660, 272774, 775023, 1175449, 1186377, 1506091, 1515147, 1538902,
                              1568858, 1681813, 1683215, 1699418, 1699419, 1699425, 1740419, 1759328, 1763990])

## Grab the user IDs of known test accounts so they can be added to the exclusion list

def get_known_users(wiki):
    '''
    Get user IDs of known test accounts and return a set of them.
    '''
    
    username_patterns = ["MMiller", "Zilant", "Roan", "KHarlan", "MWang", "SBtest",
                         "Cloud", "Rho2019", "Test"]

    known_user_query = '''
SELECT user_id
FROM user
WHERE user_name LIKE "{name_pattern}%"
    '''
    
    known_users = set()
    
    for u_pattern in username_patterns:
        new_known = mariadb.run(known_user_query.format(
            name_pattern = u_pattern), wiki)
        known_users = known_users | set(new_known['user_id'])

    return(known_users)
        
for wiki in wikis:
    known_users[wiki] = known_users[wiki] | get_known_users(wiki)

## Helper Functions

In [3]:
def make_known_users_sql(kd, wiki_column, user_column):
    '''
    Based on the dictionary `kd` mapping wiki names to sets of user IDs of known users,
    create a SQL expression to exclude users based on the name of the wiki matching `wiki_column`
    and the user ID not matching `user_column`
    '''
    
    wiki_exp = '''({w_column} = '{wiki}' AND {u_column} NOT IN ({id_list}))'''
    
    expressions = list()

    ## Iteratively build the expression for each wiki
    for wiki_name, wiki_users in kd.items():
        expressions.append(wiki_exp.format(
            w_column = wiki_column,
            wiki = wiki_name,
            u_column = user_column,
            id_list = ','.join([str(u) for u in wiki_users])
        ))
    
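    ## We then join all the expressions with an OR, and wrap the result in parentheses
    ## so that it composes correctly with the surrounding AND clauses in the query.
    return('(' + ' OR '.join(expressions) + ')')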
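
A minimal sketch of how such an exclusion expression can be built defensively. The function name `make_known_users_sql_sketch`, the wiki names, and the user IDs are hypothetical and only serve as illustration. It sorts the IDs so the output is deterministic, handles an empty set (where `NOT IN ()` would be invalid SQL), and wraps the whole expression in parentheses so it can sit next to `AND` clauses.

```python
def make_known_users_sql_sketch(kd, wiki_column, user_column):
    """Build a parenthesised exclusion expression from a wiki -> user-ID-set mapping."""
    expressions = []
    for wiki_name, wiki_users in sorted(kd.items()):
        if not wiki_users:
            # `NOT IN ()` is not valid SQL, so an empty set keeps every user on that wiki
            expressions.append(f"({wiki_column} = '{wiki_name}')")
            continue
        id_list = ','.join(str(u) for u in sorted(wiki_users))
        expressions.append(
            f"({wiki_column} = '{wiki_name}' AND {user_column} NOT IN ({id_list}))"
        )
    # The outer parentheses keep the OR chain from mixing with surrounding AND clauses
    return '(' + ' OR '.join(expressions) + ')'


# Hypothetical wikis and user IDs, for illustration only
example = {'xxwiki': {3, 1}, 'yywiki': set()}
sql = make_known_users_sql_sketch(example, 'hpv.wiki', 'hpv.event.user_id')
print(sql)
```

Note that a wiki missing from the mapping entirely matches none of the sub-expressions, so its rows are dropped.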
    

In [4]:
def make_partition_statement(start_ts, end_ts, prefix = ''):
    '''
    This takes the two timestamps and creates a statement that selects
    partitions based on `year`, `month`, and `day` in order to make our
    data gathering not use excessive amounts of data. It assumes that
    `start_ts` and `end_ts` are either in the same year, or if spanning
    a year boundary are within a month apart.
    This assumption simplifies the code and output a lot.
    
    An optional prefix can be set to enable selecting partitions for
    multiple tables with different aliases.
    
    :param start_ts: start timestamp
    :type start_ts: datetime.datetime
    
    :param end_ts: end timestamp
    :type end_ts: datetime.datetime
    
    :param prefix: prefix to use in front of partition clauses, "." is added automatically
    :type prefix: str
    '''
    
    if prefix:
        prefix = f'{prefix}.' # adds "." after the prefix
    
    # there are five cases:
    # 1: month and year are the same, output a "BETWEEN" statement with the days
    # 2: the years are the same, and the months differ by 1: output a statement for each month
    # 3: the years are the same: create a list of statements from start_ts.month to end_ts.month,
    #    return them OR'ed together
    # 4: the years differ by 1, start_ts is December and end_ts is January, do the same as #2
    # 5: anything else, raise an exception because this isn't implemented yet.
    
    if start_ts.year == end_ts.year and start_ts.month == end_ts.month:
        return(f'''{prefix}year = {start_ts.year}
AND {prefix}month = {start_ts.month}
AND {prefix}day BETWEEN {start_ts.day} AND {end_ts.day}''')
    elif start_ts.year == end_ts.year and (end_ts.month - start_ts.month) == 1:
        return(f'''
(
    ({prefix}year = {start_ts.year}
     AND {prefix}month = {start_ts.month}
     AND {prefix}day >= {start_ts.day})
 OR ({prefix}year = {end_ts.year}
     AND {prefix}month = {end_ts.month}
     AND {prefix}day <= {end_ts.day})
)''')
    elif start_ts.year == end_ts.year:
        # do the start month as a list
        parts = [f'''({prefix}year = {start_ts.year}
     AND {prefix}month = {start_ts.month}
     AND {prefix}day >= {start_ts.day})''']
        # for month +1 to end month, add each month
        for m in range(start_ts.month+1, end_ts.month):
            parts.append(f'''({prefix}year = {start_ts.year}
            AND {prefix}month = {m})''')
        # then append the end month and return a parenthesis OR'ed together of all of it
        parts.append(f'''({prefix}year = {end_ts.year}
     AND {prefix}month = {end_ts.month}
     AND {prefix}day <= {end_ts.day})''')
        return('({})'.format(
            '\nOR\n'.join(parts)
        ))
    elif (end_ts.year - start_ts.year) == 1 and start_ts.month == 12 and end_ts.month == 1:
        return(f'''
(
    ({prefix}year = {start_ts.year}
     AND {prefix}month = {start_ts.month}
     AND {prefix}day >= {start_ts.day})
 OR ({prefix}year = {end_ts.year}
     AND {prefix}month = {end_ts.month}
     AND {prefix}day <= {end_ts.day})
)''')
    else:
        raise Exception('Difference between start and end timestamps is not implemented. See code for details.')
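
One way to sanity-check the December-to-January branch is to mirror its logic as a plain Python predicate and compare it against a direct date-range test. This is only a sketch: `in_year_boundary_range` is a hypothetical helper, not part of the notebook, and it covers only the year-boundary case with the start and end dates used in this analysis.

```python
import datetime as dt


def in_year_boundary_range(day, start_ts, end_ts):
    """Python mirror of the December-to-January partition clause."""
    in_start_month = (day.year == start_ts.year
                      and day.month == start_ts.month
                      and day.day >= start_ts.day)
    in_end_month = (day.year == end_ts.year
                    and day.month == end_ts.month
                    and day.day <= end_ts.day)
    return in_start_month or in_end_month


start = dt.date(2021, 12, 1)
end = dt.date(2022, 1, 31)

# Walk a window that extends beyond both ends of the range and
# collect every day where the clause disagrees with start <= day <= end
probe = dt.date(2021, 11, 1)
mismatches = []
while probe <= dt.date(2022, 2, 28):
    expected = start <= probe <= end
    if in_year_boundary_range(probe, start, end) != expected:
        mismatches.append(probe)
    probe += dt.timedelta(days=1)

print(mismatches)
```

An empty list means the clause selects exactly the days in the requested range for this window.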


In [5]:
def get_variant_data(wikis, variant_property = 'growthexperiments-homepage-variant'):
    '''
    Connects to the given wikis and queries for the value of the user property that
    defines what experiment variant the users are in. This can later be used to
    filter out all users who are not in a specific variant
    (e.g. "imagerecommendation" for Add an Image)
    '''
    
    variant_query = f'''
    SELECT
      DATABASE() AS wiki,
      up_user AS user_id,
      up_value AS variant_name
    FROM user_properties
    WHERE up_property = "{variant_property}"
    '''
    
    return(mariadb.run(variant_query, wikis))

## Funnel Definition

As we did for Add a Link, we want to understand what happens the first time a user encounters the Add an Image task. This is the only time we're guaranteed that they'll see the onboarding screen, so it's the session that lets us learn about their behaviour related to it.

One thing we miss with this approach is that users might click a task only to find that they weren't interested in it and go back to find another, thus this first encounter isn't the session where they're actually working on the task. We could mitigate that by identifying the session where they take a certain action, but that again would mean that they could've seen onboarding previously and skipped it. In other words, that means we're back to "users do X (in any session) followed by doing Y (in any session) within some amount of time", and that's not a funnel analysis.

```
Step 1: Homepage visit: HomepageVisit entry + at least one HomepageModule impression entry.
Step 2: Add an Image task click
Step 3: Add an Image impression
Step 4: Onboarding impression
Step 5.1: Skipping onboarding, move to "On task"
Step 5.2: Moving to Onboarding Step 2
Step 6: Onboarding Step 2 impression
Step 6.1: Skipping onboarding, move to "On task"
Step 6.2: Moving to Onboarding Step 3
Step 7: Onboarding Step 3 impression
Step 7.1: Skipping onboarding, move to "On task"
Step 7.2: Moving to onboarding Step 4
Step 8: Onboarding Step 4 impression
Step 8.1: Clicking "Get started", completing onboarding, move to "On task"
Step 9: On task
```

I'd like to figure out a way to flag users clicking the "Don't show this again" checkbox. The state of that will be captured in 5.1, 6.1, 7.1, and 8.1.

**Note:** None of the "Moving to Onboarding Step…" events can currently be part of the funnel, because the "back" and "next" buttons in the onboarding dialogue were not instrumented (filed as [T301486](https://phabricator.wikimedia.org/T301486)). We therefore use "Onboarding Step… impression" as the next step, and note that we won't be able to tell whether we lose users/data between clicking "next" and the impression of the next onboarding screen.
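
The step list above can be read as an ordered sequence of conditions, where a session only counts for a step if it also reached every earlier step. The following is a minimal sketch of that idea on toy data. `FUNNEL_STEPS`, `funnel_counts`, and the sessions are hypothetical, and the column names mirror the flags produced by the funnel query in this notebook.

```python
# Ordered funnel steps: (label, predicate on a session row)
FUNNEL_STEPS = [
    ('Homepage visit', lambda s: s['homepage_visit'] == 1),
    ('Task click', lambda s: s['click_number'] is not None),
    ('Add an Image impression', lambda s: s['addimage_impression'] == 1),
    ('Onboarding step 1 impression', lambda s: s['onboarding_step1_impression'] == 1),
]


def funnel_counts(sessions):
    """Count the sessions that reach each step after reaching all previous steps."""
    remaining = list(sessions)
    counts = []
    for label, reached in FUNNEL_STEPS:
        remaining = [s for s in remaining if reached(s)]
        counts.append((label, len(remaining)))
    return counts


# Toy sessions, for illustration only
toy_sessions = [
    {'homepage_visit': 1, 'click_number': None,
     'addimage_impression': 0, 'onboarding_step1_impression': 0},
    {'homepage_visit': 1, 'click_number': 1,
     'addimage_impression': 1, 'onboarding_step1_impression': 1},
    {'homepage_visit': 1, 'click_number': 1,
     'addimage_impression': 1, 'onboarding_step1_impression': 0},
    {'homepage_visit': 1, 'click_number': 1,
     'addimage_impression': 0, 'onboarding_step1_impression': 0},
]

counts = funnel_counts(toy_sessions)
for label, n in counts:
    print(f'{label}: {n}')
```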

In [6]:
onboarding_funnel_query = '''
WITH hp_visits AS (
    SELECT
        -- HomepageVisit is the authoritative source here, and we're grouping by
        -- homepage_pageview_token to deduplicate the multiple module impressions
        hpv.event.homepage_pageview_token,
        FIRST_VALUE(hpv.wiki) AS wiki,
        FIRST_VALUE(hpv.event.user_id) AS user_id,
        FIRST_VALUE(hpv.event.is_mobile) AS is_mobile,
        FIRST_VALUE(hpv.dt) AS visit_dt,
        FIRST_VALUE(1) AS homepage_visit,
        FIRST_VALUE(IF(ssac.event.userid IS NOT NULL, 1, 0)) AS is_newcomer,
        FIRST_VALUE(IF(unix_timestamp(hpv.dt, "yyyy-MM-dd'T'HH:mm:ss.SSSS'Z'") -
                        unix_timestamp(ssac.dt, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'") < 60*60*24, 1, 0))
            AS is_24hr_visit
    FROM event.homepagevisit AS hpv
    JOIN event.homepagemodule AS hpm
    ON hpv.event.homepage_pageview_token = hpm.event.homepage_pageview_token
    LEFT JOIN event.serversideaccountcreation AS ssac
    ON hpv.wiki = ssac.wiki
    AND hpv.event.user_id = ssac.event.userid
    WHERE {hpv_partition_statement}
    AND hpv.wiki IN ({wiki_list})
    AND {hpv_known_user_id_expression}
    AND {hpm_partition_statement}
    AND {ssac_partition_statement}
    AND ssac.wiki IN ({wiki_list})
    AND hpv.event.is_mobile = true
    AND hpm.event.action = "impression"
    GROUP BY hpv.event.homepage_pageview_token
),
newcomer_tasks AS (
    -- grab unique task token/task type data from newcomer tasks
    SELECT
        DISTINCT event.newcomer_task_token, event.task_type, event.page_id
    FROM event.newcomertask
    WHERE {partition_statement}
),
addimage_task_clicks AS (
    -- clicks to Add an Image tasks
    SELECT
        event.homepage_pageview_token,
        dt AS click_dt,
        row_number() OVER (PARTITION BY hpm.wiki, hpm.event.user_id ORDER BY hpm.dt) AS click_number
    FROM hp_visits AS hpv
    JOIN event.homepagemodule AS hpm
    ON hpv.homepage_pageview_token = hpm.event.homepage_pageview_token
    JOIN newcomer_tasks AS nt
    ON str_to_map(hpm.event.action_data, ";", "=")["newcomerTaskToken"] = nt.newcomer_task_token
    WHERE {partition_statement}
    AND hpm.wiki IN ({wiki_list})
    AND event.action IN ("se-task-click", "se-edit-button-click")
    AND nt.task_type = "image-recommendation"
    AND dt > hpv.visit_dt
),
first_task_click AS (
    SELECT
        *
    FROM addimage_task_clicks
    WHERE click_number = 1
),
nosuggestion_impression AS (
    SELECT
        stimg.homepage_pageview_token,
        MIN(dt) AS event_dt
    FROM event.mediawiki_structured_task_article_image_suggestion_interaction AS stimg
    JOIN first_task_click AS ftc
    ON stimg.homepage_pageview_token = ftc.homepage_pageview_token
    WHERE {partition_statement}
    AND active_interface = "nosuggestions_dialog"
    AND action = "impression"
    AND stimg.dt > ftc.click_dt
    GROUP BY stimg.homepage_pageview_token
),
addimage_impression AS (
    SELECT
        stimg.homepage_pageview_token,
        MIN(dt) AS event_dt
    FROM event.mediawiki_structured_task_article_image_suggestion_interaction AS stimg
    JOIN first_task_click AS ftc
    ON stimg.homepage_pageview_token = ftc.homepage_pageview_token
    WHERE {partition_statement}
    AND active_interface IN ("machinesuggestions_mode", "recommendedimagetoolbar_dialog")
    AND action = "impression"
    AND stimg.dt > ftc.click_dt
    GROUP BY stimg.homepage_pageview_token
),
onb_step1_impression AS (
    SELECT
        stimg.homepage_pageview_token,
        MIN(dt) AS event_dt
    FROM event.mediawiki_structured_task_article_image_suggestion_interaction AS stimg
    JOIN addimage_impression AS ai
    ON stimg.homepage_pageview_token = ai.homepage_pageview_token
    WHERE {partition_statement}
    AND active_interface = "onboarding_step_1_dialog"
    AND action = "impression"
    -- not limiting by timestamp because I'm unsure if it always occurs after
    -- the first impression of the interface
    GROUP BY stimg.homepage_pageview_token 
),
onb_step1_skip AS (
    SELECT
        stimg.homepage_pageview_token,
        MIN(dt) AS event_dt
    FROM event.mediawiki_structured_task_article_image_suggestion_interaction AS stimg
    JOIN onb_step1_impression AS onsi
    ON stimg.homepage_pageview_token = onsi.homepage_pageview_token
    WHERE {partition_statement}
    AND active_interface = "onboarding_step_1_dialog"
    AND action = "skip_all"
    AND stimg.dt > onsi.event_dt
    GROUP BY stimg.homepage_pageview_token
),
onb_step2_impression AS (
    SELECT
        stimg.homepage_pageview_token,
        MIN(dt) AS event_dt
    FROM event.mediawiki_structured_task_article_image_suggestion_interaction AS stimg
    JOIN onb_step1_impression AS onsi
    ON stimg.homepage_pageview_token = onsi.homepage_pageview_token
    WHERE {partition_statement}
    AND active_interface = "onboarding_step_2_dialog"
    AND action = "impression"
    AND stimg.dt > onsi.event_dt
    GROUP BY stimg.homepage_pageview_token
),
onb_step2_skip AS (
    SELECT
        stimg.homepage_pageview_token,
        MIN(dt) AS event_dt
    FROM event.mediawiki_structured_task_article_image_suggestion_interaction AS stimg
    JOIN onb_step2_impression AS onsi
    ON stimg.homepage_pageview_token = onsi.homepage_pageview_token
    WHERE {partition_statement}
    AND active_interface = "onboarding_step_2_dialog"
    AND action = "skip_all"
    AND stimg.dt > onsi.event_dt
    GROUP BY stimg.homepage_pageview_token
),
onb_step3_impression AS (
    SELECT
        stimg.homepage_pageview_token,
        MIN(dt) AS event_dt
    FROM event.mediawiki_structured_task_article_image_suggestion_interaction AS stimg
    JOIN onb_step2_impression AS onsi
    ON stimg.homepage_pageview_token = onsi.homepage_pageview_token
    WHERE {partition_statement}
    AND active_interface = "onboarding_step_3_dialog"
    AND action = "impression"
    AND stimg.dt > onsi.event_dt
    GROUP BY stimg.homepage_pageview_token
),
onb_step3_skip AS (
    SELECT
        stimg.homepage_pageview_token,
        MIN(dt) AS event_dt
    FROM event.mediawiki_structured_task_article_image_suggestion_interaction AS stimg
    JOIN onb_step3_impression AS onsi
    ON stimg.homepage_pageview_token = onsi.homepage_pageview_token
    WHERE {partition_statement}
    AND active_interface = "onboarding_step_3_dialog"
    AND action = "skip_all"
    AND stimg.dt > onsi.event_dt
    GROUP BY stimg.homepage_pageview_token
),
onb_step4_impression AS (
    SELECT
        stimg.homepage_pageview_token,
        MIN(dt) AS event_dt
    FROM event.mediawiki_structured_task_article_image_suggestion_interaction AS stimg
    JOIN onb_step3_impression AS onsi
    ON stimg.homepage_pageview_token = onsi.homepage_pageview_token
    WHERE {partition_statement}
    AND active_interface = "onboarding_step_4_dialog"
    AND action = "impression"
    AND stimg.dt > onsi.event_dt
    GROUP BY stimg.homepage_pageview_token
),
onb_step4_getstarted AS (
    SELECT
        stimg.homepage_pageview_token,
        MIN(dt) AS event_dt
    FROM event.mediawiki_structured_task_article_image_suggestion_interaction AS stimg
    JOIN onb_step4_impression AS onsi
    ON stimg.homepage_pageview_token = onsi.homepage_pageview_token
    WHERE {partition_statement}
    AND active_interface = "onboarding_step_4_dialog"
    AND action = "get_started"
    AND stimg.dt > onsi.event_dt
    GROUP BY stimg.homepage_pageview_token
),
on_task AS (
    SELECT
        homepage_pageview_token,
        MIN(event_dt) AS event_dt
    FROM (
        SELECT homepage_pageview_token, event_dt
        FROM onb_step1_skip
        UNION ALL
        SELECT homepage_pageview_token, event_dt
        FROM onb_step2_skip
        UNION ALL
        SELECT homepage_pageview_token, event_dt
        FROM onb_step3_skip
        UNION ALL
        SELECT homepage_pageview_token, event_dt
        FROM onb_step4_getstarted
    ) AS ontask_events
    GROUP BY homepage_pageview_token
)
SELECT
    hpv.*,
    ftc.click_dt,
    ftc.click_number,
    IF(nosuggestion_impression.homepage_pageview_token IS NOT NULL, 1, 0) AS nosuggestion_impression,
    nosuggestion_impression.event_dt AS nosuggestion_impression_dt,
    IF(addimage_impression.homepage_pageview_token IS NOT NULL, 1, 0) AS addimage_impression,
    addimage_impression.event_dt AS addimage_impression_dt,
    IF(onb_step1_impression.homepage_pageview_token IS NOT NULL, 1, 0) AS onboarding_step1_impression,
    onb_step1_impression.event_dt AS onboarding_step1_impression_dt,
    IF(onb_step1_skip.homepage_pageview_token IS NOT NULL, 1, 0) AS onboarding_step1_skipall,
    onb_step1_skip.event_dt AS onboarding_step1_skipall_dt,
    IF(onb_step2_impression.homepage_pageview_token IS NOT NULL, 1, 0) AS onboarding_step2_impression,
    onb_step2_impression.event_dt AS onboarding_step2_impression_dt,
    IF(onb_step2_skip.homepage_pageview_token IS NOT NULL, 1, 0) AS onboarding_step2_skipall,
    onb_step2_skip.event_dt AS onboarding_step2_skipall_dt,
    IF(onb_step3_impression.homepage_pageview_token IS NOT NULL, 1, 0) AS onboarding_step3_impression,
    onb_step3_impression.event_dt AS onboarding_step3_impression_dt,
    IF(onb_step3_skip.homepage_pageview_token IS NOT NULL, 1, 0) AS onboarding_step3_skipall,
    onb_step3_skip.event_dt AS onboarding_step3_skipall_dt,
    IF(onb_step4_impression.homepage_pageview_token IS NOT NULL, 1, 0) AS onboarding_step4_impression,
    onb_step4_impression.event_dt AS onboarding_step4_impression_dt,
    IF(onb_step4_getstarted.homepage_pageview_token IS NOT NULL, 1, 0) AS onboarding_step4_getstarted,
    onb_step4_getstarted.event_dt AS onboarding_step4_getstarted_dt,
    IF(on_task.homepage_pageview_token IS NOT NULL, 1, 0) AS on_task,
    on_task.event_dt AS on_task_dt
FROM hp_visits AS hpv
LEFT JOIN first_task_click AS ftc
ON hpv.homepage_pageview_token = ftc.homepage_pageview_token
LEFT JOIN nosuggestion_impression
ON hpv.homepage_pageview_token = nosuggestion_impression.homepage_pageview_token
LEFT JOIN addimage_impression
ON hpv.homepage_pageview_token = addimage_impression.homepage_pageview_token
LEFT JOIN onb_step1_impression
ON hpv.homepage_pageview_token = onb_step1_impression.homepage_pageview_token
LEFT JOIN onb_step1_skip
ON hpv.homepage_pageview_token = onb_step1_skip.homepage_pageview_token
LEFT JOIN onb_step2_impression
ON hpv.homepage_pageview_token = onb_step2_impression.homepage_pageview_token
LEFT JOIN onb_step2_skip
ON hpv.homepage_pageview_token = onb_step2_skip.homepage_pageview_token
LEFT JOIN onb_step3_impression
ON hpv.homepage_pageview_token = onb_step3_impression.homepage_pageview_token
LEFT JOIN onb_step3_skip
ON hpv.homepage_pageview_token = onb_step3_skip.homepage_pageview_token
LEFT JOIN onb_step4_impression
ON hpv.homepage_pageview_token = onb_step4_impression.homepage_pageview_token
LEFT JOIN onb_step4_getstarted
ON hpv.homepage_pageview_token = onb_step4_getstarted.homepage_pageview_token
LEFT JOIN on_task
ON hpv.homepage_pageview_token = on_task.homepage_pageview_token
'''

In [7]:
mobile_onb_funnel_data = spark.run(
    onboarding_funnel_query.format(
        wiki_list = ','.join(['"{}"'.format(w) for w in wikis]),
        hpv_known_user_id_expression = make_known_users_sql(known_users, 'hpv.wiki', 'hpv.event.user_id'),
        hpv_partition_statement = make_partition_statement(start_date, end_date, 'hpv'),
        hpm_partition_statement = make_partition_statement(start_date, end_date, 'hpm'),
        ssac_partition_statement = make_partition_statement(start_date, end_date, 'ssac'),
        partition_statement = make_partition_statement(start_date, end_date),
    ), session_type = 'yarn-large'
)

PySpark executors will use /usr/lib/anaconda-wmf/bin/python3.


In [None]:
mobile_onb_funnel_data[mobile_onb_funnel_data.duplicated(subset = ['homepage_pageview_token'], keep = False)]

In [9]:
variant_data = get_variant_data(wikis)

In [10]:
variant_data['variant_name'] = variant_data['variant_name'].apply(lambda v: v.decode('utf-8'))
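
The decoding step above assumes every value is a `bytes` object. A slightly more tolerant variant, sketched here under the assumption that some values might already be strings or missing, could look like this. `decode_variant` is a hypothetical helper and the sample values are illustrative. It could be passed to `.apply` in place of the lambda.

```python
def decode_variant(value, encoding='utf-8'):
    """Return `value` as str; bytes are decoded, str and None pass through unchanged."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding)
    return value


# Illustrative values: bytes as returned by the database, a plain string, a missing value
raw_values = [b'imagerecommendation', 'control', None]
decoded = [decode_variant(v) for v in raw_values]
print(decoded)
```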

In [None]:
variant_data.head()

In [11]:
mobile_onb_funnel_data = mobile_onb_funnel_data.merge(variant_data,
                                                     on = ['wiki', 'user_id'])

## Discarding Invalid Sessions

Later investigation of task completion based on whether users skipped or completed onboarding revealed that we have some invalid sessions where users did both. We'll discard those sessions.

In [None]:
mobile_onb_funnel_data.loc[
    ((mobile_onb_funnel_data['onboarding_step1_skipall'] == 1) |
     (mobile_onb_funnel_data['onboarding_step2_skipall'] == 1) |
     (mobile_onb_funnel_data['onboarding_step3_skipall'] == 1)
    ) &
    (mobile_onb_funnel_data['onboarding_step4_getstarted'] == 1),
    'homepage_pageview_token'
]
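
The same filter can be expressed with named boolean masks, which makes it easier to reuse when discarding these sessions. This is a sketch on a toy DataFrame with hypothetical tokens. The column names match the funnel query output, and the variable names are illustrative.

```python
import pandas as pd

# Toy sessions, for illustration only
toy = pd.DataFrame({
    'homepage_pageview_token': ['a', 'b', 'c', 'd'],
    'onboarding_step1_skipall': [1, 0, 0, 0],
    'onboarding_step2_skipall': [0, 0, 1, 0],
    'onboarding_step3_skipall': [0, 0, 0, 0],
    'onboarding_step4_getstarted': [1, 1, 0, 0],
})

skip_columns = [
    'onboarding_step1_skipall',
    'onboarding_step2_skipall',
    'onboarding_step3_skipall',
]

# A session is invalid if it both skipped at any step and completed onboarding
skipped_any = toy[skip_columns].eq(1).any(axis=1)
completed = toy['onboarding_step4_getstarted'].eq(1)
invalid = skipped_any & completed

valid_sessions = toy.loc[~invalid]
valid_tokens = valid_sessions['homepage_pageview_token'].tolist()
print(valid_tokens)
```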

## Homepage Visit to Onboarding

In [12]:
def addimage_mobile_newcomers(df):
    return(df.loc[
        (df['variant_name'] == 'imagerecommendation') &
        (df['is_newcomer'] == 1) &
        (df['is_24hr_visit'] == 1) &
        (df['is_mobile'] == True)])

In [13]:
def round_perc(x, y, prec = 1):
    return(round(100.0 * x / y, prec))

In [14]:
def round_perc_df(df_x, df_y, prec = 1):
    return(round(100.0 * len(df_x) / len(df_y), prec))
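
Both helpers divide by the size of the reference group, so they raise `ZeroDivisionError` if that group is empty. A guarded variant might look like the sketch below. `safe_round_perc` is a hypothetical name, and returning `None` for an empty denominator is an assumption about the desired behaviour. The example uses the task-click and visitor counts reported in this notebook.

```python
def safe_round_perc(x, y, prec=1):
    """Percentage of x out of y rounded to `prec` decimals; None if y is zero."""
    if y == 0:
        return None
    return round(100.0 * x / y, prec)


# 1,490 users who clicked a task out of 15,031 visitors
click_rate = safe_round_perc(1490, 15031)
# An empty reference group yields None instead of raising
empty_rate = safe_round_perc(0, 0)
print(click_rate, empty_rate)
```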

In [74]:
mob_newcomers = addimage_mobile_newcomers(mobile_onb_funnel_data.loc[
    ~mobile_onb_funnel_data['homepage_pageview_token'].isin(
        mobile_onb_funnel_data.loc[
            ((mobile_onb_funnel_data['onboarding_step1_skipall'] == 1) |
             (mobile_onb_funnel_data['onboarding_step2_skipall'] == 1) |
             (mobile_onb_funnel_data['onboarding_step3_skipall'] == 1)
            ) &
            (mobile_onb_funnel_data['onboarding_step4_getstarted'] == 1),
            'homepage_pageview_token'
        ]
    )
])

The number of visitors to the Homepage is the size of the whole dataset:

In [75]:
len(mob_newcomers)

15031

Number of users who clicked a task:

In [77]:
len(mob_newcomers.loc[mob_newcomers['click_number'] > 0])

1490

In [78]:
round_perc_df(
    mob_newcomers.loc[mob_newcomers['click_number'] > 0],
    mob_newcomers
)

9.9

Number of users who got the no suggestions dialogue:

In [79]:
len(mob_newcomers.loc[
    (mob_newcomers['click_number'] > 0) &
    (mob_newcomers['nosuggestion_impression'] == 1)
])

36

Proportion out of users who clicked a task:

In [80]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 1)
    ],
    mob_newcomers.loc[mob_newcomers['click_number'] > 0]
)

2.4

Proportion out of all visitors:

In [81]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 1)
    ],
    mob_newcomers
)

0.2

Number of users who saw onboarding Step 1:

In [82]:
len(mob_newcomers.loc[
    (mob_newcomers['click_number'] > 0) &
    (mob_newcomers['nosuggestion_impression'] == 0) &
    (mob_newcomers['onboarding_step1_impression'] == 1)
]) 

1214

Proportion out of users who clicked a task:

In [83]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)  &
        (mob_newcomers['onboarding_step1_impression'] == 1)
    ],
    mob_newcomers.loc[mob_newcomers['click_number'] > 0]
)

81.5

Proportion out of all visitors:

In [84]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1)
    ],
    mob_newcomers
)

8.1

Bounces are the remaining users, those who clicked a task but saw neither the no suggestions dialogue nor onboarding Step 1:

In [85]:
(
    len(mob_newcomers.loc[(mob_newcomers['click_number'] > 0)]) -
    len(mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 1)
    ]) -
    len(mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1)
    ]) 
)

240

Proportion out of users who clicked a task:

In [86]:
round_perc(
    (
        len(mob_newcomers.loc[(mob_newcomers['click_number'] > 0)]) -
        len(mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 1)
        ]) -
        len(mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1)
        ]) 
    ),
    len(mob_newcomers.loc[mob_newcomers['click_number'] > 0])
)

16.1

Proportion out of all visitors:

In [87]:
round_perc(
    (
        len(mob_newcomers.loc[(mob_newcomers['click_number'] > 0)]) -
        len(mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 1)
        ]) -
        len(mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1)
        ]) 
    ),
    len(mob_newcomers)
)

1.6
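
The cells in this section repeat the same filters several times. One way to make the breakdown easier to follow is to name each boolean mask once and derive the groups from them. The following is a sketch on a toy DataFrame with made-up rows. The column names match the funnel data, and the mask names are illustrative.

```python
import pandas as pd

# Toy data, for illustration only; a missing click_number means no task click
toy = pd.DataFrame({
    'click_number': [1.0, 1.0, 1.0, None, 1.0],
    'nosuggestion_impression': [0, 1, 0, 0, 0],
    'onboarding_step1_impression': [1, 0, 0, 0, 1],
})

clicked = toy['click_number'] > 0
no_suggestions = clicked & (toy['nosuggestion_impression'] == 1)
saw_step1 = (
    clicked
    & (toy['nosuggestion_impression'] == 0)
    & (toy['onboarding_step1_impression'] == 1)
)
# Bounces are users who clicked a task but fall in neither of the other groups
bounced = clicked & ~no_suggestions & ~saw_step1

summary = {
    'clicked': int(clicked.sum()),
    'no_suggestions': int(no_suggestions.sum()),
    'saw_step1': int(saw_step1.sum()),
    'bounced': int(bounced.sum()),
}
print(summary)
```

Because the three groups are disjoint, their counts add up to the number of users who clicked a task, which matches the subtraction used above.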

## Onboarding Step 1

Just a reminder of how many users saw Onboarding Step 1:

In [98]:
len(mob_newcomers.loc[
    (mob_newcomers['click_number'] > 0) &
    (mob_newcomers['nosuggestion_impression'] == 0) &
    (mob_newcomers['onboarding_step1_impression'] == 1)
])

1214

How many of them clicked "Skip all" at this stage?

In [99]:
len(mob_newcomers.loc[
    (mob_newcomers['click_number'] > 0) &
    (mob_newcomers['nosuggestion_impression'] == 0) &
    (mob_newcomers['onboarding_step1_impression'] == 1) &
    (mob_newcomers['onboarding_step1_skipall'] == 1)
])

357

Proportion out of users who saw onboarding step 1:

In [90]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1)
    ]
)

29.4

Proportion out of users who clicked a task and did not get the no suggestions dialogue:

In [91]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

24.6

Number of users who clicked "next" (meaning they didn't click "skip all" *and* saw step 2):

In [92]:
len(mob_newcomers.loc[
    (mob_newcomers['click_number'] > 0) &
    (mob_newcomers['nosuggestion_impression'] == 0) &
    (mob_newcomers['onboarding_step1_impression'] == 1) &
    (mob_newcomers['onboarding_step1_skipall'] == 0) &
    (mob_newcomers['onboarding_step2_impression'] == 1)
])

763

Proportion out of users who saw onboarding step 1:

In [93]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1)
    ]
)

62.9

Proportion out of users who clicked a task and did not get the no suggestions dialogue:

In [94]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

52.5

Users who bounced are those who didn't do either of the two preceding actions:

In [95]:
(
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1)
        ]
    )
)

94
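
Within the group that saw Step 1, "Skip all" and "next" are mutually exclusive, so the subtraction above should agree with a direct filter for users who did neither. The sketch below shows this on made-up data. It assumes the flag columns only take the values 0 and 1, which this section does not verify for `mob_newcomers`, and `toy` is a hypothetical stand-in.

```python
import pandas as pd

# Made-up rows; only the column names match mob_newcomers.
toy = pd.DataFrame({
    'click_number':                [1, 1, 1, 1],
    'nosuggestion_impression':     [0, 0, 0, 0],
    'onboarding_step1_impression': [1, 1, 1, 1],
    'onboarding_step1_skipall':    [1, 0, 0, 0],
    'onboarding_step2_impression': [0, 1, 0, 0],
})

saw_step1 = (
    (toy['click_number'] > 0) &
    (toy['nosuggestion_impression'] == 0) &
    (toy['onboarding_step1_impression'] == 1)
)
skipped = saw_step1 & (toy['onboarding_step1_skipall'] == 1)
advanced = (
    saw_step1 &
    (toy['onboarding_step1_skipall'] == 0) &
    (toy['onboarding_step2_impression'] == 1)
)

# Bounce count by subtraction, as in the cell above.
by_subtraction = int(saw_step1.sum() - skipped.sum() - advanced.sum())

# Bounce count as a direct filter: saw step 1, did not skip, never saw step 2.
bounced = (
    saw_step1 &
    (toy['onboarding_step1_skipall'] == 0) &
    (toy['onboarding_step2_impression'] == 0)
)
direct = int(bounced.sum())
print(by_subtraction, direct)
```

The direct mask also selects the bounced rows themselves (`toy.loc[bounced]`), which the subtraction cannot do.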

Proportion out of users who saw onboarding step 1:

In [96]:
round_perc(
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1)
        ]
    ),
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1)
        ]
    )
)

7.7

Proportion out of users who clicked a task:

In [97]:
round_perc(
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1)
        ]
    ),
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0)
        ]
    )
)

6.5

Number of users who saw Step 2 (after not skipping Step 1) and then clicked "Skip all" there:

In [100]:
len(mob_newcomers.loc[
    (mob_newcomers['click_number'] > 0) &
    (mob_newcomers['nosuggestion_impression'] == 0) &
    (mob_newcomers['onboarding_step1_impression'] == 1) &
    (mob_newcomers['onboarding_step1_skipall'] == 0) &
    (mob_newcomers['onboarding_step2_impression'] == 1) &
    (mob_newcomers['onboarding_step2_skipall'] == 1)
])

71

Proportion out of users who saw Step 2 (without skipping on Step 1):

In [101]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1)
    ]
)

9.3
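
As a cross-check, the percentage above can be recomputed from two counts already reported in this section: 71 users who skipped on Step 2 and 763 who saw it. The sketch assumes that `round_perc_df` amounts to a ratio of row counts rounded to one decimal. Its definition is not shown in this section, so `pct` is a hypothetical stand-in rather than the notebook's own helper.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal (hypothetical stand-in)."""
    return round(100 * numerator / denominator, 1)

skipped_on_step2 = 71   # output of In [100]
saw_step2 = 763         # output of In [92]

share = pct(skipped_on_step2, saw_step2)
print(share)
```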

Proportion out of users who clicked a task:

In [102]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

4.9

Number of users who saw Step 3 (after not skipping step 1 or 2):

In [103]:
len(mob_newcomers.loc[
    (mob_newcomers['click_number'] > 0) &
    (mob_newcomers['nosuggestion_impression'] == 0) &
    (mob_newcomers['onboarding_step1_impression'] == 1) &
    (mob_newcomers['onboarding_step1_skipall'] == 0) &
    (mob_newcomers['onboarding_step2_impression'] == 1) &
    (mob_newcomers['onboarding_step2_skipall'] == 0) &
    (mob_newcomers['onboarding_step3_impression'] == 1)
])

683

Proportion out of users who saw Step 2 (without skipping on Step 1):

In [104]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1)
    ]
)

89.5

Proportion out of users who clicked a task:

In [105]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

47.0

Users who bounced on Step 2 are those who saw it but did neither of the two preceding actions (clicking "Skip all" or advancing to Step 3):

In [106]:
(
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1)
        ]
    )
)

9

Proportion out of users who saw onboarding step 2:

In [107]:
round_perc(
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1)
        ]
    ),
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1)
        ]
    )
)

1.2

Proportion out of users who clicked a task:

In [108]:
round_perc(
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1)
        ]
    ),
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0)
        ]
    )
)

0.6

Number of users on Step 3 (after not skipping step 1 or 2), who then clicked "Skip all":

In [109]:
len(mob_newcomers.loc[
    (mob_newcomers['click_number'] > 0) &
    (mob_newcomers['nosuggestion_impression'] == 0) &
    (mob_newcomers['onboarding_step1_impression'] == 1) &
    (mob_newcomers['onboarding_step1_skipall'] == 0) &
    (mob_newcomers['onboarding_step2_impression'] == 1) &
    (mob_newcomers['onboarding_step2_skipall'] == 0) &
    (mob_newcomers['onboarding_step3_impression'] == 1) &
    (mob_newcomers['onboarding_step3_skipall'] == 1)
])

37

Proportion out of users who saw Step 3:

In [110]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1)
    ]
)

5.4

Proportion out of users who clicked a task:

In [111]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

2.5

Number of users on Step 3 who clicked "Next" (meaning they didn't skip and saw Step 4):

In [112]:
len(mob_newcomers.loc[
    (mob_newcomers['click_number'] > 0) &
    (mob_newcomers['nosuggestion_impression'] == 0) &
    (mob_newcomers['onboarding_step1_impression'] == 1) &
    (mob_newcomers['onboarding_step1_skipall'] == 0) &
    (mob_newcomers['onboarding_step2_impression'] == 1) &
    (mob_newcomers['onboarding_step2_skipall'] == 0) &
    (mob_newcomers['onboarding_step3_impression'] == 1) &
    (mob_newcomers['onboarding_step3_skipall'] == 0) &
    (mob_newcomers['onboarding_step4_impression'] == 1)
])

643

Proportion out of users who saw Step 3:

In [113]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1)
    ]
)

94.1

Proportion out of users who clicked a task:

In [114]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

44.2

Users who bounced on Step 3 are those who saw it but did neither of the two preceding actions (clicking "Skip all" or advancing to Step 4):

In [115]:
(
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1) &
            (mob_newcomers['onboarding_step3_skipall'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1) &
            (mob_newcomers['onboarding_step3_skipall'] == 0) &
            (mob_newcomers['onboarding_step4_impression'] == 1)
        ]
    )
)

3

Proportion out of users who saw onboarding step 3:

In [116]:
round_perc(
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1) &
            (mob_newcomers['onboarding_step3_skipall'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1) &
            (mob_newcomers['onboarding_step3_skipall'] == 0) &
            (mob_newcomers['onboarding_step4_impression'] == 1)
        ]
    ),
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1)
        ]
    )
)

0.4

Proportion out of users who clicked a task:

In [117]:
round_perc(
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1) &
            (mob_newcomers['onboarding_step3_skipall'] == 1)
        ]
    ) -
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0) &
            (mob_newcomers['onboarding_step1_impression'] == 1) &
            (mob_newcomers['onboarding_step1_skipall'] == 0) &
            (mob_newcomers['onboarding_step2_impression'] == 1) &
            (mob_newcomers['onboarding_step2_skipall'] == 0) &
            (mob_newcomers['onboarding_step3_impression'] == 1) &
            (mob_newcomers['onboarding_step3_skipall'] == 0) &
            (mob_newcomers['onboarding_step4_impression'] == 1)
        ]
    ),
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0)
        ]
    )
)

0.2

Number of users on Step 4 (after not skipping step 1, 2, or 3), who then completed onboarding by clicking "Get started":

In [118]:
len(mob_newcomers.loc[
    (mob_newcomers['click_number'] > 0) &
    (mob_newcomers['nosuggestion_impression'] == 0) &
    (mob_newcomers['onboarding_step1_impression'] == 1) &
    (mob_newcomers['onboarding_step1_skipall'] == 0) &
    (mob_newcomers['onboarding_step2_impression'] == 1) &
    (mob_newcomers['onboarding_step2_skipall'] == 0) &
    (mob_newcomers['onboarding_step3_impression'] == 1) &
    (mob_newcomers['onboarding_step3_skipall'] == 0) &
    (mob_newcomers['onboarding_step4_impression'] == 1) &
    (mob_newcomers['onboarding_step4_getstarted'] == 1)
])

635

Proportion out of users who saw Step 4:

In [119]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1) &
        (mob_newcomers['onboarding_step4_getstarted'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1)
    ]
)

98.8

Proportion out of users who clicked a task:

In [120]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1) &
        (mob_newcomers['onboarding_step4_getstarted'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

43.7

Users who bounced on Step 4 are those who reached that step but didn't click "Get started":

In [121]:
(
    len(mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1)
    ]) -
    len(mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1) &
        (mob_newcomers['onboarding_step4_getstarted'] == 1)
    ])
)

8
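
With the Step 4 bounce count in hand, the counts reported so far can be laid out as a decomposition: everyone who saw a step either skipped, moved on, or bounced. Because the bounce counts were themselves obtained by subtraction, the identities hold by construction. The sketch only restates the outputs above in one place, and the variable names are illustrative.

```python
# Counts copied from the cell outputs above.
saw_step2, skip_step2, bounce_step2 = 763, 71, 9
saw_step3, skip_step3, bounce_step3 = 683, 37, 3
saw_step4, get_started, bounce_step4 = 643, 635, 8

# Each step's population splits into skip + next step + bounce.
checks = {
    'step2': saw_step2 == skip_step2 + saw_step3 + bounce_step2,
    'step3': saw_step3 == skip_step3 + saw_step4 + bounce_step3,
    'step4': saw_step4 == get_started + bounce_step4,
}
all_consistent = all(checks.values())
print(checks)
```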

Proportion out of users who saw onboarding Step 4:

In [122]:
round_perc(
    len(mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1)
    ]) -
    len(mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1) &
        (mob_newcomers['onboarding_step4_getstarted'] == 1)
    ]),
    len(mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1)
    ])
)

1.2

Proportion out of users who clicked a task:

In [123]:
round_perc(
    len(mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1)
    ]) -
    len(mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1) &
        (mob_newcomers['onboarding_step4_getstarted'] == 1)
    ]),
    len(
        mob_newcomers.loc[
            (mob_newcomers['click_number'] > 0) &
            (mob_newcomers['nosuggestion_impression'] == 0)
        ]
    )
)

0.6
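
The per-step counts above (reached, skipped, advanced, bounced) all follow one pattern, so they could in principle be produced by a single loop over the four onboarding steps. This is a sketch under assumptions, not the notebook's method: `toy` is made-up data, `onboarding_summary` is a hypothetical helper, and the flag columns are assumed to hold only 0 or 1.

```python
import pandas as pd

# Made-up rows; only the column names match mob_newcomers.
toy = pd.DataFrame({
    'click_number':                [1, 1, 1, 1, 1, 1],
    'nosuggestion_impression':     [0, 0, 0, 0, 0, 0],
    'onboarding_step1_impression': [1, 1, 1, 1, 1, 1],
    'onboarding_step1_skipall':    [1, 0, 0, 0, 0, 0],
    'onboarding_step2_impression': [0, 1, 1, 1, 1, 0],
    'onboarding_step2_skipall':    [0, 1, 0, 0, 0, 0],
    'onboarding_step3_impression': [0, 0, 1, 1, 1, 0],
    'onboarding_step3_skipall':    [0, 0, 0, 0, 0, 0],
    'onboarding_step4_impression': [0, 0, 1, 1, 0, 0],
    'onboarding_step4_getstarted': [0, 0, 1, 0, 0, 0],
})

def onboarding_summary(df):
    """Counts per onboarding step (hypothetical helper)."""
    # Entry population: clicked a task, no "no suggestions" screen.
    entering = (df['click_number'] > 0) & (df['nosuggestion_impression'] == 0)
    rows = []
    for step in (1, 2, 3, 4):
        reached = entering & (df[f'onboarding_step{step}_impression'] == 1)
        if step < 4:
            skipped = reached & (df[f'onboarding_step{step}_skipall'] == 1)
            # "Advanced" = did not skip and saw the next step.
            advanced = (
                reached &
                (df[f'onboarding_step{step}_skipall'] == 0) &
                (df[f'onboarding_step{step + 1}_impression'] == 1)
            )
        else:
            # Step 4 has no "Skip all" in this funnel; advancing means "Get started".
            skipped = pd.Series(False, index=df.index)
            advanced = reached & (df['onboarding_step4_getstarted'] == 1)
        n_reached, n_skipped, n_advanced = (
            int(reached.sum()), int(skipped.sum()), int(advanced.sum())
        )
        rows.append({
            'step': step,
            'reached': n_reached,
            'skipped': n_skipped,
            'advanced': n_advanced,
            'bounced': n_reached - n_skipped - n_advanced,
        })
        # Only users who advanced can enter the next step.
        entering = advanced
    return pd.DataFrame(rows)

summary = onboarding_summary(toy)
print(summary)
```

Proportions against either denominator (previous step, or all users who clicked a task) could then be derived from the `reached` column rather than by repeating the filters.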

Number of users reaching "On Task":

In [124]:
len(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['on_task'] == 1)
    ]
)

1100

Proportion of users reaching "On task" out of users who saw Step 1:

In [125]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['on_task'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1)
    ]
)

90.6

Proportion of users reaching "On task" out of all users who clicked a task:

In [126]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['on_task'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

75.7

Proportion of users skipping on Step 2 out of users who saw Step 1, and out of all users who clicked a task:

In [127]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1)
    ]
)

5.8

In [128]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

4.9

Proportion of users skipping on Step 3 out of users who saw Step 1, and out of all users who clicked a task:

In [68]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1)
    ]
)

3.0

In [129]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

2.5

Proportion of users clicking "Get started" on Step 4, first out of users who saw Step 1, then out of all users who clicked a task:

In [130]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1) &
        (mob_newcomers['onboarding_step4_getstarted'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1)
    ]
)

52.3

In [131]:
round_perc_df(
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0) &
        (mob_newcomers['onboarding_step1_impression'] == 1) &
        (mob_newcomers['onboarding_step1_skipall'] == 0) &
        (mob_newcomers['onboarding_step2_impression'] == 1) &
        (mob_newcomers['onboarding_step2_skipall'] == 0) &
        (mob_newcomers['onboarding_step3_impression'] == 1) &
        (mob_newcomers['onboarding_step3_skipall'] == 0) &
        (mob_newcomers['onboarding_step4_impression'] == 1) &
        (mob_newcomers['onboarding_step4_getstarted'] == 1)
    ],
    mob_newcomers.loc[
        (mob_newcomers['click_number'] > 0) &
        (mob_newcomers['nosuggestion_impression'] == 0)
    ]
)

43.7
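
The onboarding cells in this section repeat the same cumulative conditions, adding one step at a time. The sketch below shows one way that pattern could be generated in a loop. `reached_step_mask` is a hypothetical helper, not part of the notebook, and the toy data is made up; only the column naming scheme is taken from the cells above.

```python
import pandas as pd

def reached_step_mask(df, step):
    """Hypothetical helper: rows for users who clicked a task, saw every
    onboarding step up to and including `step`, and did not skip on any
    earlier step."""
    mask = (df['click_number'] > 0) & (df['nosuggestion_impression'] == 0)
    for k in range(1, step + 1):
        mask &= df[f'onboarding_step{k}_impression'] == 1
        if k < step:
            mask &= df[f'onboarding_step{k}_skipall'] == 0
    return mask

# Toy data for illustration only: one row per user, values are made up.
toy = pd.DataFrame({
    'click_number':                [1, 1, 1, 1, 0],
    'nosuggestion_impression':     [0, 0, 0, 0, 0],
    'onboarding_step1_impression': [1, 1, 1, 0, 0],
    'onboarding_step1_skipall':    [0, 0, 1, 0, 0],
    'onboarding_step2_impression': [1, 1, 0, 0, 0],
    'onboarding_step2_skipall':    [0, 0, 0, 0, 0],
    'onboarding_step3_impression': [1, 1, 0, 0, 0],
    'onboarding_step3_skipall':    [0, 1, 0, 0, 0],
    'onboarding_step4_impression': [1, 0, 0, 0, 0],
    'onboarding_step4_getstarted': [1, 0, 0, 0, 0],
})

# Same conditions as the Step 3 "skip" and Step 4 "Get started" numerators.
skipped_step3 = reached_step_mask(toy, 3) & (toy['onboarding_step3_skipall'] == 1)
got_started = reached_step_mask(toy, 4) & (toy['onboarding_step4_getstarted'] == 1)

n_saw_step1 = int(reached_step_mask(toy, 1).sum())
n_skipped_step3 = int(skipped_step3.sum())
n_got_started = int(got_started.sum())
```

The resulting masks can be passed to `.loc[...]` in the same way as the hand-written conditions, so the existing `round_perc_df` calls would not need to change.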