In [1]:
# code for loading the format for the notebook
import os

# path : store the current path to convert back to it later
path = os.getcwd()
os.chdir('../../notebook_format')
from formats import load_style
load_style()

In [2]:
os.chdir(path)

# Template For AB Test 


## Generate High Level Business Goal

- **Define your business objectives.** e.g. A business objective for an online flower store is to "Increase our sales by receiving online orders for our bouquets."
- **Define your Key Performance Indicators.** e.g. Our flower store’s business objective is to sell bouquets. Our KPI could be number of bouquets sold online.
- **Define your target metrics.** e.g. For our imaginary flower store, we can define a monthly target of 175 bouquets sold.


## Understand the Whys

After defining the high-level goal, find out (rather than guess) which parts of your business are underperforming or trending, and why. Quantitative methods are better at answering "how many" and "how much" questions. Qualitative methods are better suited to answering why something happens or how to fix a problem. Examples include surveys and user experience groups, where you go deep with a few users, either by observing them perform tasks or by asking them to self-document their behavior.

**Take a look at your conversion funnel.** Examine the flow from the persuasive end (top of the funnel) to the transactional end (bottom of the funnel). While examining it, segment your visitors to spot underlying underperformance or trends.

- **Segment by source:** Separate people who arrive on your website from e-mail campaigns, Google, Twitter, YouTube, etc. Find answers to questions like: Is there a difference between bounce rates for those segments? Is there a difference in visitor loyalty between those who came from YouTube and those who came from Twitter? What products do people who come from YouTube care about more than people who come from Google?
- **Segment by behavior:** Focus on groups of people who have similar behaviors. For example, you can separate people who visit more than ten times a month from those who visit only twice. Do these people look for products in different price ranges? Are they from different regions? You can also separate people by the products they purchase, by order size, or by whether they have signed up.
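
As a rough illustration of segmenting by source, the sketch below computes a bounce rate per traffic source. The `sessions` records and the definition of a bounce (a session that viewed exactly one page) are assumptions made for the example, not data or definitions from a real analytics tool.

```python
from collections import defaultdict

# Hypothetical session records: (traffic source, number of pages viewed).
sessions = [
    ("email", 1), ("email", 4), ("google", 1), ("google", 1),
    ("google", 3), ("youtube", 5), ("youtube", 2), ("twitter", 1),
]

def bounce_rate_by_source(sessions):
    """Return {source: fraction of sessions that viewed exactly one page}."""
    total = defaultdict(int)
    bounced = defaultdict(int)
    for source, pages_viewed in sessions:
        total[source] += 1
        if pages_viewed == 1:  # assumed definition of a bounce
            bounced[source] += 1
    return {source: bounced[source] / total[source] for source in total}

rates = bounce_rate_by_source(sessions)
for source, rate in sorted(rates.items()):
    print(f"{source}: {rate:.2f}")
```

The same grouping pattern works for behavioral segments: replace the source label with a bucket such as visit frequency or order size.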

e.g. You’re looking at total active users over time and you see a spike at one point in the timeline. After confirming that the spike is not caused by seasonal variation, look at different segments of your visitors to see whether one of them is causing it. Suppose you segment by geography: you might find that a large proportion of the traffic comes from one specific region, in which case it is worth digging deeper to understand why.

**Three simple ideas for gathering qualitative data to understand the why**

- Add an exit survey on your site, asking why your visitors did or didn't complete the goal of the site.
- Send out feedback surveys to your clients to find out more about them and their motives.
- Simply track what your customers are saying in social media and on review sites.

## Generate a Well-Defined Metric

**Now that you've identified the overall business goal and a possible problem (e.g. less than one percent of visitors sign up for our newsletter), it's time to prioritize your website goals. Three categories of goals include:**

- Do x: Add better product images.
- Increase y: Increase click-through rates.
- Reduce z: Reduce our shopping cart abandonment rate.


**Define the Subject**

Decide which unit you will use to assign traffic to either the control or the experiment group. Three units are commonly used: user id, anonymous id (cookie) and event.

- **user id:** e.g. login user names. Choosing this as the proxy for your user means that all the events that correspond to the same user id are in either the control or the experiment group, regardless of whether that user switches between a mobile phone and a desktop. It also means that a user who has not logged in will not be assigned to either group.
- **anonymous id (cookie):** The cookie is specific to a browser and device, so if the user switches from Chrome to Firefox, they’ll be assigned a different cookie. Also note that users can clear the cookie, in which case the next time they visit the website they’ll be assigned a new cookie even if they’re still using the same browser and device. For experiments that cross the sign-in border, using a cookie is preferred. e.g. If you’re changing the layout of the page or the location of the sign-in bar, you should use a cookie.
- **event:** Should only be used when you’re testing a non-user-visible change, e.g. page load time. Otherwise, a user might see the change on their first visit to the page and not see it after reloading, which leads to confusion.
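
One common way to keep assignment stable for a given unit is to hash its id, so the same user id or cookie always lands in the same group. The sketch below is a minimal illustration under that assumption. The function name `assign_group`, the 50/50 split and the experiment name are all hypothetical.

```python
import hashlib

def assign_group(unit_id, experiment_name, n_buckets=1000):
    """Deterministically map a unit (user id or cookie id) to a group."""
    # Including the experiment name means the same unit can land in
    # different groups across different experiments.
    key = f"{experiment_name}:{unit_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % n_buckets
    # Assumed 50/50 split: lower half of buckets is the experiment group.
    return "experiment" if bucket < n_buckets // 2 else "control"

for unit_id in ["user_1", "user_2", "user_3"]:
    print(unit_id, assign_group(unit_id, "signup_button_test"))
```

Because the group depends only on the id and the experiment name, repeated visits by the same unit see a consistent experience. This is the property that event-level assignment lacks.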


**Define the Population**

If you think you can identify which population of your users will be affected by your experiment, you might want to target your experiment to that traffic (e.g. changing features specific to one language’s users) so that the rest of the population won’t dilute the effect.

Next, depending on the problem you’re looking at, you might want to use a cohort instead of a population. A cohort makes much more sense than looking at the entire population when testing out learning effects, examining user retention or anything else that requires the users to be established for some reason.

A quick note on cohorts. The gist of cohort analysis is putting your customers into buckets so you can track their behaviour over a period of time. A cohort is a group of customers bucketed by the time period (e.g. week or month) in which they first made a purchase (or took some other action that’s valuable to the business). Within a cohort, your two user groups have roughly the same parameters, which makes them more comparable.

e.g. Suppose you’re an educational platform with an existing course that’s already up and running. Some students have completed the course, some are midway through and some have not yet started. You want to change the structure of one of the lessons to see whether it improves the completion rate of the entire course, and you start the experiment at time X. Students who started before the experiment may have already finished that lesson, so they may never see the change. Running the experiment on the whole population of students therefore isn’t what you want. Instead, segment out the cohort, i.e. the group of students who started the lesson after the experiment was launched, and split that cohort into an experiment group and a control group.
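
The cohort selection in the course example can be sketched as a simple date filter. The student records and the launch date below are made up, and treating "on or after the launch date" as the cutoff is an assumption.

```python
from datetime import date

experiment_start = date(2024, 3, 1)  # hypothetical launch date (time X)

# Hypothetical records: student id -> date the student started the lesson.
lesson_start = {
    "s1": date(2024, 2, 10),
    "s2": date(2024, 3, 1),
    "s3": date(2024, 3, 5),
    "s4": date(2024, 2, 28),
    "s5": None,  # has not started the lesson yet
}

def select_cohort(lesson_start, experiment_start):
    """Keep students who started the lesson on or after the launch date."""
    return sorted(
        student
        for student, started in lesson_start.items()
        if started is not None and started >= experiment_start
    )

cohort = select_cohort(lesson_start, experiment_start)
print(cohort)  # ['s2', 's3']
```

Only the students in `cohort` would then be split into the experiment and control groups.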

**Define the Size and Duration**

**When do I want to run the experiment, and for how long?**

e.g. Suppose we’ve chosen the goal of increasing the click-through rate, defined as the number of unique users who click the button divided by the number of unique users who visited the page where the button is located. To actually use this definition, we also have to address some other questions. For instance, if the same user visits the page once and comes back a week or two later, do we still want to count that only once? To answer this, we also need to specify a time period.

To account for this, if 99% of your visitors convert within 1 week of arriving, then you should do the following.

- Run your test for two weeks.
- Include in the test only users who show up in the first week. If a user shows up on day 13, you have not given them enough time to convert (click-through).
- At the end of the test, if a user who showed up on day 2 converts more than 7 days after he first arrived, he must be counted as a non-conversion.
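
The three rules above can be sketched as a small counting function. The user records are hypothetical, and days are numbered 1 to 14 within the two-week test.

```python
ENROLL_DAYS = 7        # only users first seen on days 1..7 are included
CONVERSION_WINDOW = 7  # a click counts only within 7 days of first arrival

# Hypothetical records: user -> (day first seen, day of first click or None).
users = {
    "a": (2, 5),     # converts 3 days after arrival: counted
    "b": (2, 11),    # converts 9 days after arrival: non-conversion
    "c": (6, None),  # never clicks: non-conversion
    "d": (13, 13),   # arrived after the first week: excluded from the test
    "e": (7, 14),    # converts exactly 7 days after arrival: counted
}

def conversion_counts(users):
    """Return (number converted, number included) under the rules above."""
    included = 0
    converted = 0
    for first_seen, click_day in users.values():
        if first_seen > ENROLL_DAYS:
            continue  # not enough time left in the test to convert
        included += 1
        if click_day is not None and click_day - first_seen <= CONVERSION_WINDOW:
            converted += 1
    return converted, included

converted, included = conversion_counts(users)
print(f"{converted} of {included} included users converted")
```

Every included user gets the same seven-day window, so early and late arrivals are measured on equal terms.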

**So one version of the fully-defined metric will be: For each week, the number of cookies that clicked divided by the number of cookies that interacted with the page (also add the population definition).**
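
A minimal sketch of this metric, assuming a hypothetical event log of `(week, cookie, event type)` tuples. Each cookie is counted at most once per week, however many times it views or clicks.

```python
from collections import defaultdict

# Hypothetical event log: (week number, cookie id, event type).
events = [
    (1, "c1", "view"), (1, "c1", "click"), (1, "c1", "click"),
    (1, "c2", "view"), (1, "c3", "view"), (1, "c3", "view"),
    (2, "c1", "view"), (2, "c4", "view"), (2, "c4", "click"),
]

def weekly_click_through(events):
    """Return {week: unique cookies that clicked / unique cookies that viewed}."""
    viewers = defaultdict(set)
    clickers = defaultdict(set)
    for week, cookie, event_type in events:
        if event_type == "view":
            viewers[week].add(cookie)
        elif event_type == "click":
            clickers[week].add(cookie)
    return {
        # Only count clicks from cookies that also viewed the page that week.
        week: len(clickers[week] & viewers[week]) / len(viewers[week])
        for week in sorted(viewers)
    }

ctp = weekly_click_through(events)
for week, value in ctp.items():
    print(f"week {week}: {value:.2f}")
```

Using sets rather than raw event counts is what makes a repeat visit or a double click count only once.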

Running the test for at least a week is advised, since this ensures the experiment captures the different user behaviour on weekdays and weekends. Try to avoid holidays.

If your population is defined and you have enough traffic, another consideration is what fraction of the traffic you are going to send through the experiment. There are reasons why you might not want to run the experiment on all of your traffic, even though doing so would get you the result faster.

- The first consideration might be that you’re uncertain how your users will react to the new feature, so you might want to test it out a bit before you get users blogging about it.
- The same notion applies to riskier cases, such as completely switching your backend system: if it doesn’t work well, the site might go down.
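
One way to send only a fraction of traffic through an experiment is to gate on a hash of the cookie id, as in the sketch below. The function name, the salt and the 10% default are assumptions made for illustration.

```python
import hashlib

def in_experiment_traffic(cookie_id, exposure=0.10, salt="exposure-v1"):
    """Return True if this cookie falls in the exposed fraction of traffic."""
    digest = hashlib.sha256(f"{salt}:{cookie_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a position in [0, 1).
    position = int(digest[:8], 16) / 0x100000000
    return position < exposure

cookies = [f"cookie_{i}" for i in range(10000)]
exposed = [c for c in cookies if in_experiment_traffic(c, exposure=0.10)]
print(f"{len(exposed) / len(cookies):.1%} of cookies are in the experiment")
```

Because each cookie has a fixed position, raising `exposure` from 0.10 to 0.50 keeps every previously exposed cookie in the experiment and only adds new ones. This lets you ramp up gradually once the change looks safe.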

## Prioritize

**After collating all the ideas, prioritize them based on three simple metrics:** (give them scores)

- **Potential:** How much potential is there for a conversion rate increase? You can check whether this kind of idea has worked before.
- **Importance:** How many visitors will be affected by the test?
- **Ease:** How easy is it to implement the test? Go for the low-hanging fruit first.
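
A minimal sketch of the scoring step, assuming each idea is rated from 1 to 10 on the three criteria and the scores are averaged with equal weight. The ideas, the scores and the equal weighting are all hypothetical.

```python
# Hypothetical ideas, each scored 1-10 on the three criteria.
ideas = [
    {"name": "Rewrite call to action", "potential": 7, "importance": 9, "ease": 8},
    {"name": "Redesign checkout flow", "potential": 9, "importance": 6, "ease": 2},
    {"name": "Swap hero image", "potential": 4, "importance": 9, "ease": 9},
]

def priority_score(idea):
    """Equal-weight average of the three scores (an assumed weighting)."""
    return (idea["potential"] + idea["importance"] + idea["ease"]) / 3

ranked = sorted(ideas, key=priority_score, reverse=True)
for idea in ranked:
    print(f"{priority_score(idea):.2f}  {idea['name']}")
```

If one criterion matters more to your team, a weighted average is a natural variation.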

Every test that's developed is documented so that we can review and prioritize ideas that are inspired by winning tests.

Some ideas worth experimenting with are:

- Wording. e.g. Call to action or value proposition.
- Image. e.g. Replacing a general logistics image with the image of an actual employee.
- Layout. e.g. Increasing the size of the contact form or the amount of content on the page.

# Template for Problem Solving

It's all about reviewing the completed work, the current work in progress, the planned work, specific annual metrics and the performance gap.

## Review Current Status

- List the key metrics you're tracking and where they stand, and compare them with the last few weeks (to measure how things are trending).
- What did you learn last week or what was accomplished? And is everything on track?

## List Out Top Problems/Experiments

- List and prioritize the top (new) problems / experiments.
- What do we want to solve / learn and why? One way to answer the why is to ask what steps will be taken given the result.
- List out who's feeling the pain. (Figure out who your priorities are; this may be tied to the why.)

## For All Problems, List Corresponding Hypothesized Solution

- Written in the form "[Specific action] will create [expected result]."
- List out why you believe each solution will help solve the problem.
- List the metrics (quantitative) or proof (qualitative) you'll use to measure whether or not the solutions are doing what you expected (solving the problem), i.e. how we will conclude that the experiment succeeded. If it's a metric, set goals for it.
- How long will it take for you to run the experiment / get the problem solved?
- (optional) Include what measures will indicate the experiment isn't safe to continue.

## Reference

- [How to Build a Strong A/B Testing Plan That Gets Results](http://conversionxl.com/how-to-build-a-strong-ab-testing-plan-that-gets-results/)
- [The CRE Methodology](http://www.conversion-rate-experts.com/our-methodology/) (includes tools for checking user experience, maybe check later)