# Drill: What can data science do? 

Below we have a series of questions for you to translate into a technical plan. For each question, describe how you would make it testable and translate it from a general question into something statistically rigorous.


## Question 1
_You work at an e-commerce company that sells three goods: widgets, doodads, and fizzbangs. The head of advertising asks you which they should feature in their new advertising campaign. You have data on individual visitors' sessions (activity on a website, pageviews, and purchases), as well as whether or not those users converted from an advertisement for that session. You also have the cost and price information for the goods._

Steps: 
- The ultimate goal is to maximize profit. If we write out the profit function for each product, as in the formula below, we see there are two ways to increase it.

> profit = number_of_purchases * (price - cost) - advertising_spend  (profit function for one product)

- (1) Increase the number of purchases. For this, we count the pageviews and purchases for each product and compute its conversion ratio (purchases divided by pageviews). We then target the product with the lowest ratio and explore the website activity for that product. There could be several indicators. For example, if the least-selling product has far fewer pageviews than the other products, we probably do not have enough targeted advertising for it. Another indicator is the time spent on the product page together with whether the visitor converted in that session. If visitors spend time on the page but do not convert, the price may be higher than they are willing to pay. After calculating the profit margin for the product, the next step would be to offer discounts or free shipping to incentivise customers to buy it. A further step would be to test whether the conversion rate increased after offering the discounts.


- (2) Spend the advertising money more effectively. By analysing previous advertisement data, we can see whether visitors converted from an advertisement or not. Comparing the number of ad-driven conversions for each product tells us how successful each product's advertising was, and we can set the budget for each advertisement accordingly.
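
The two quantities discussed above, conversion ratio and profit per product, can be sketched as follows. This is a minimal illustration, not a full analysis: the `ProductStats` fields and all numbers are hypothetical stand-ins for aggregates that would come from the session, price, and cost data.

```python
from dataclasses import dataclass


@dataclass
class ProductStats:
    """Aggregated figures for one product (field names are hypothetical)."""
    pageviews: int
    purchases: int
    price: float
    cost: float
    ad_spend: float


def conversion_rate(stats: ProductStats) -> float:
    """Purchases per pageview; 0.0 when the product has no pageviews."""
    return stats.purchases / stats.pageviews if stats.pageviews else 0.0


def profit(stats: ProductStats) -> float:
    """profit = purchases * (price - cost) - advertising spend."""
    return stats.purchases * (stats.price - stats.cost) - stats.ad_spend


# Made-up numbers, for illustration only.
products = {
    "widgets": ProductStats(pageviews=1000, purchases=50, price=20.0, cost=12.0, ad_spend=100.0),
    "doodads": ProductStats(pageviews=400, purchases=10, price=50.0, cost=30.0, ad_spend=50.0),
    "fizzbangs": ProductStats(pageviews=800, purchases=80, price=10.0, cost=7.0, ad_spend=40.0),
}

summary = {name: (conversion_rate(s), profit(s)) for name, s in products.items()}

# The product with the lowest conversion ratio is the candidate for a closer look.
lowest_conversion = min(summary, key=lambda name: summary[name][0])

for name, (rate, prof) in summary.items():
    print(f"{name}: conversion={rate:.3f}, profit={prof:.2f}")
print("lowest conversion:", lowest_conversion)
```

Keeping conversion and profit side by side matters because the product with the lowest conversion ratio is not necessarily the least profitable one.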



## Question 2
_You work at a web design company that offers to build websites for clients. Signups have slowed, and you are tasked with finding out why. The onboarding funnel has three steps: email and password signup, plan choice, and payment. On a user level you have information on what steps they have completed as well as timestamps for all of those events for the past 3 years. You also have information on marketing spend on a weekly level._

(I am assuming that in "Signups have slowed", "signup" refers to the conversion at the end of step 3, and not the email **sign up**.)

Steps:
- (1) Get data on all non-converting customers and find the last recorded timestamp before each one dropped off. The step at which most users stop is a good indicator of the most difficult part of the user's journey. It may also indicate whether the problem was at the level of the plan choices or whether the customers faced other errors (e.g., not getting the sign up confirmation email because of a server connection problem).
- (2) From historical data, compare the time that converted and non-converted users take at each of the three steps in the onboarding funnel. This time could serve as a predictor of conversion and could be used to target offers at each step. For example, if a visitor takes a long time at step 2 (plan choice), offer a discount. We could also compare the sign up rate for each payment plan to see which plan is the most successful.
- (3) Plot weekly marketing spend and sign up rate from historical data to find out when the sign ups slowed and what might have caused the slowdown.
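
As a rough sketch of steps (1) and (3), the snippet below counts where non-converting users stopped, computes step-to-step conversion, and tallies paid conversions per ISO week so they can be set against weekly marketing spend. The record layout (`signup`, `plan`, `payment` keys holding timestamps or `None`) and the sample data are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

STEPS = ["signup", "plan", "payment"]

# Hypothetical per-user records: timestamp of each completed step, None if never reached.
users = [
    {"signup": datetime(2023, 1, 2), "plan": datetime(2023, 1, 2), "payment": datetime(2023, 1, 3)},
    {"signup": datetime(2023, 1, 4), "plan": datetime(2023, 1, 5), "payment": None},
    {"signup": datetime(2023, 1, 9), "plan": None, "payment": None},
    {"signup": datetime(2023, 1, 10), "plan": datetime(2023, 1, 10), "payment": None},
]


def last_completed_step(user):
    """Return the name of the last funnel step the user completed, or None."""
    last = None
    for step in STEPS:
        if user[step] is None:
            break
        last = step
    return last


# Step (1): where did the non-converting users stop?
drop_off = Counter(last_completed_step(u) for u in users if u["payment"] is None)

# Conversion from each step to the next.
reached = {step: sum(u[step] is not None for u in users) for step in STEPS}
step_rates = {
    f"{a}->{b}": reached[b] / reached[a]
    for a, b in zip(STEPS, STEPS[1:])
    if reached[a]
}

# Step (3): paid conversions per (ISO year, ISO week), to compare with weekly spend.
weekly_paid = Counter(
    tuple(u["payment"].isocalendar()[:2]) for u in users if u["payment"] is not None
)

print(drop_off)
print(step_rates)
print(weekly_paid)
```

With three years of real data, the same weekly counts could be plotted next to marketing spend to see whether the slowdown coincides with a change in spend or with a drop at one particular funnel step.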



## Question 3

_You work at a hotel website and currently the website ranks search results by price. For simplicity's sake, let's say it's a website for one city with 100 hotels. You are tasked with proposing a better ranking system. You have session information, price information for the hotels, and whether each hotel is currently available._

Option 1.

A/B testing:
- A. Select all available hotels, rank by price. 
- B. Select all hotels, rank by price. 
- Compare the click-through rates of A and B and keep the variant with the higher rate.
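
To decide whether a difference in click-through rate between A and B is more than noise, one common choice is a two-proportion z-test. The sketch below uses only the standard library. The function name and the click and view counts are hypothetical, and the test assumes large, independent samples.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the difference in click-through rates."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    # Pooled rate under the null hypothesis that both variants share one rate.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts, for illustration only.
z, p_value = two_proportion_z_test(clicks_a=120, views_a=1000, clicks_b=90, views_b=1000)
print(f"z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the two rankings really differ in click-through rate. The significance threshold and the required sample size should be fixed before the test starts.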

Option 2. 
- Incorporate a map and show only the available hotels, ranked within the map view.

Option 3. 
- Session information will give us the end user's search criteria. Rank according to the search criterion (e.g., "best hotels in ..." vs. "cheap hotels in ...").

- A map could be incorporated that shows the hotels within the viewing area.
- Review scores will be helpful for ranking (as TripAdvisor does for the "popularity" ranking). Within a price range, hotels could be ordered from the highest to the lowest review score.
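
The last idea, ordering by review score within a price range, can be sketched with a two-part sort key. Note that review scores are not part of the data listed in the question, so the `review_score` field is an assumption, as are the band width and the sample hotels.

```python
# Hypothetical hotel records; "review_score" is assumed extra data.
hotels = [
    {"name": "A", "price": 80, "review_score": 4.1, "available": True},
    {"name": "B", "price": 95, "review_score": 4.6, "available": True},
    {"name": "C", "price": 150, "review_score": 4.9, "available": True},
    {"name": "D", "price": 60, "review_score": 3.2, "available": False},
    {"name": "E", "price": 120, "review_score": 4.0, "available": True},
]

BAND_WIDTH = 50  # width of one price band, chosen arbitrarily for the example


def rank_hotels(hotels, band_width=BAND_WIDTH):
    """Keep available hotels; sort by price band, then by review score (best first)."""
    available = [h for h in hotels if h["available"]]
    return sorted(
        available,
        key=lambda h: (h["price"] // band_width, -h["review_score"]),
    )


ranked = [h["name"] for h in rank_hotels(hotels)]
print(ranked)
```

Whether this ordering is actually better than pure price ranking would still need to be checked, for example with the A/B test from Option 1.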


## Question 4


_You work at a social network, and the management is worried about churn (users stopping using the product). You are tasked with finding out if their churn is atypical. You have three years of data for users with an entry for every time they've logged in, including the timestamp and length of session._

**What does it mean -- "atypical churn"?**

Steps: 
- Separate churned users from continuing users. Compare their reaction times for each click and the length of each session.
- An increasing gap between two login sessions may indicate that the user is losing interest in the product, or in other words, that there is a higher likelihood of churn.
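
To make these steps testable, churn first needs an operational definition. A minimal sketch, assuming a user counts as churned after 30 days without a login (the threshold, the function names, and the sample logins are all hypothetical):

```python
from datetime import datetime, timedelta

CHURN_AFTER = timedelta(days=30)  # assumed inactivity threshold


def login_gaps(logins):
    """Return the time between consecutive logins, in chronological order."""
    ordered = sorted(logins)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def is_churned(logins, as_of, threshold=CHURN_AFTER):
    """A user is treated as churned if their last login is older than the threshold."""
    return as_of - max(logins) > threshold


# Made-up login timestamps, for illustration only.
logins_by_user = {
    "u1": [datetime(2024, 1, 1), datetime(2024, 1, 8), datetime(2024, 3, 1)],
    "u2": [datetime(2024, 1, 5), datetime(2024, 1, 6)],
    "u3": [datetime(2024, 2, 20), datetime(2024, 3, 9)],
}
as_of = datetime(2024, 3, 10)

churned = {user: is_churned(logins, as_of) for user, logins in logins_by_user.items()}
churn_rate = sum(churned.values()) / len(churned)
longest_gap = {user: max(login_gaps(logins)) for user, logins in logins_by_user.items()}

print(churned)
print(f"churn rate: {churn_rate:.2f}")
print(longest_gap)
```

Computing the same churn rate for each month of the three years gives a baseline, and "atypical" can then be framed as the current rate falling outside the usual range of that baseline.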