# Travel Website Recommendation System: A/B Testing
A company is testing a new feature on its website and wants to understand the feature's downstream effect on revenue. The feature is a new offering to customers that the company does not want to roll out yet, so a direct A/B test would not work. This notebook uses the DoWhy and EconML libraries to build a causal model of the problem.

![Travel Website](Images/philipp-kammerer-6Mxb_mZ_Q8E-unsplash.jpg)

The company in question is a travel website that would like to know whether joining its membership program increases website engagement and product purchases.

A straightforward A/B test is impractical because the website cannot compel users to become members. Nor can the travel company simply compare members with non-members in its existing data: users who opt into membership are probably already more engaged than other users, so the comparison would be confounded by self-selection.

The solution comes from an earlier experiment: the company had previously run a trial to assess the efficacy of a quicker registration procedure. This experimental nudge towards membership generates random variation in the likelihood of membership, which can be exploited with a class of estimators called instrumental variable (IV) estimators. An IV estimator isolates the causal effect of the treatment (here, membership) by using only the variation in treatment that is driven by an instrument: a variable that is as good as randomly assigned, shifts the probability of treatment, and affects the outcome only through the treatment. The nudge creates an *intent-to-treat* setting: the intention is to give a random group of users the "treatment" (access to the easier sign-up process), but not all of those users will actually take it.
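To make the instrument idea concrete, here is a minimal sketch on synthetic data (not the company's data). Everything in it is an assumption made for illustration: a hidden `engagement` confounder, a randomly assigned nudge `z`, a membership indicator `t`, an outcome `y`, and a true membership effect of 2.0. It compares the naive member vs. non-member difference with the Wald ratio, which is the simplest IV estimator for a binary instrument.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
true_effect = 2.0  # assumed effect of membership on the outcome (illustrative)

# Hidden confounder: engaged users both join more often and visit more often.
engagement = rng.normal(size=n)

# Instrument: the nudge is assigned at random, independently of engagement.
z = rng.integers(0, 2, size=n)

# Treatment: joining is more likely with the nudge and with higher engagement.
p_join = 1.0 / (1.0 + np.exp(-(-1.0 + 1.5 * z + 1.0 * engagement)))
t = rng.binomial(1, p_join)

# Outcome: depends on membership and on the hidden confounder.
y = true_effect * t + 3.0 * engagement + rng.normal(size=n)

# Naive comparison of members and non-members (biased by self-selection).
naive = y[t == 1].mean() - y[t == 0].mean()

# Intent-to-treat effect of the nudge on the outcome.
itt = y[z == 1].mean() - y[z == 0].mean()

# First stage: how much the nudge raises the membership rate.
first_stage = t[z == 1].mean() - t[z == 0].mean()

# Wald (IV) estimate: outcome shift per unit of membership shift.
wald = itt / first_stage

print(f"naive difference: {naive:.2f}")
print(f"first stage:      {first_stage:.2f}")
print(f"Wald IV estimate: {wald:.2f} (true effect {true_effect})")
```

In this simulated setup the naive difference overstates the effect because members are more engaged to begin with, while the Wald ratio uses only the variation in membership induced by the random nudge. The estimators used later in the notebook are more flexible than this ratio, but they rest on the same logic.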

In [1]:
!pip install dowhy lightgbm networkx econml numpy==1.25 



In [2]:
# Some imports to get us started
import warnings
warnings.simplefilter('ignore')

# Utilities
import os
import urllib.request
import numpy as np
import pandas as pd
from networkx.drawing.nx_pydot import to_pydot
from IPython.display import Image, display

# Generic ML imports
import lightgbm as lgb
from sklearn.preprocessing import PolynomialFeatures

# DoWhy imports 
import dowhy
from dowhy import CausalModel

In [3]:
# EconML imports
from econml.iv.dr import LinearIntentToTreatDRIV
from econml.cate_interpreter import (
    SingleTreeCateInterpreter,
    SingleTreePolicyInterpreter,
)

import matplotlib.pyplot as plt
%matplotlib inline

### Data 
The data is a sample A/B dataset loaded from the URL in the next cell; its source is credited at the end of this notebook.


In [4]:
# Import the sample AB data
file_url = "https://msalicedatapublic.z5.web.core.windows.net/datasets/RecommendationAB/ab_sample.csv"   
ab_data = pd.read_csv(file_url)

In [5]:
ab_data

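| Unnamed: 0 | days_visited_exp_pre | days_visited_free_pre | days_visited_fs_pre | days_visited_hs_pre | days_visited_rs_pre | days_visited_vrs_pre | locale_en_US | revenue_pre | os_type_osx | os_type_windows | easier_signup | became_member | days_visited_post |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 9 | 7 | 25 | 6 | 3 | 1 | 0.01 | 0 | 1 | 0 | 0 | 1 |
| 1 | 10 | 25 | 27 | 10 | 27 | 27 | 0 | 2.26 | 0 | 0 | 0 | 0 | 15 |
| 2 | 18 | 14 | 8 | 4 | 5 | 2 | 1 | 0.03 | 0 | 1 | 0 | 0 | 17 |
| 3 | 17 | 0 | 23 | 2 | 3 | 1 | 1 | 418.77 | 0 | 1 | 0 | 0 | 6 |
| 4 | 24 | 9 | 22 | 2 | 3 | 18 | 1 | 1.54 | 0 | 0 | 0 | 0 | 12 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 99995 | 27 | 27 | 8 | 4 | 25 | 20 | 1 | 0.02 | 1 | 0 | 1 | 1 | 23 |
| 99996 | 22 | 21 | 15 | 27 | 24 | 18 | 0 | 6.98 | 1 | 0 | 1 | 1 | 23 |
| 99997 | 13 | 5 | 5 | 25 | 28 | 24 | 1 | 0.01 | 0 | 1 | 0 | 0 | 7 |
| 99998 | 21 | 13 | 5 | 24 | 14 | 2 | 0 | 0.07 | 0 | 0 | 1 | 1 | 9 |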
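A quick sanity check before any modelling is to see how the nudge relates to membership and to later visits. The sketch below assumes, based on the column names and the introduction, that `easier_signup` is the randomized nudge, `became_member` is the treatment, and `days_visited_post` is the outcome. `nudge_summary` is a hypothetical helper, and `excerpt` contains only the nine rows displayed above, so its numbers are not representative of the full dataset.

```python
import pandas as pd

# The nine rows displayed above, restricted to the three columns used here.
excerpt = pd.DataFrame({
    "easier_signup":     [0, 0, 0, 0, 0, 1, 1, 0, 1],
    "became_member":     [0, 0, 0, 0, 0, 1, 1, 0, 1],
    "days_visited_post": [1, 15, 17, 6, 12, 23, 23, 7, 9],
})

def nudge_summary(df):
    """Hypothetical helper: group size, membership rate, and mean
    post-period visits for users with and without the easier sign-up."""
    return df.groupby("easier_signup").agg(
        users=("became_member", "size"),
        membership_rate=("became_member", "mean"),
        mean_days_visited_post=("days_visited_post", "mean"),
    )

summary = nudge_summary(excerpt)
print(summary)
```

On the full data, the same helper can be called as `nudge_summary(ab_data)`. A clear gap in `membership_rate` between the two groups would suggest that the nudge is a relevant instrument.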


All information has been derived from [EconML Use Cases](https://github.com/py-why/EconML/blob/main/notebooks/CustomerScenarios/Case%20Study%20-%20Recommendation%20AB%20Testing%20at%20An%20Online%20Travel%20Company%20-%20EconML%20+%20DoWhy.ipynb) and modified for learning purposes.