Seedbox Data Science Application Test

In this test, we would like you to write up an analysis about a mock experiment we've run at Seedbox. We aim to measure your current knowledge in Data Cleaning, Exploratory Data Analysis and Drawing Conclusions from observations.

Experimentation Details

We recently ran an A/B test on the cancellation page of our subscription service. Before running the test, members were able to cancel using a simple web form. The experiment aims to measure the impact of forcing members to phone in to our customer service line in order to cancel.

Information about the test:

The control group can cancel using a web form
The test group can only cancel by calling in
Users were randomly assigned to a group when they went to the website's cancel page for the first time
The assignment probability between the two groups is uneven (you can think of this as an unfair coin flip)
We've recorded additional transactions generated after users were randomized
REBILL transactions are recurring payments that were processed
CHARGEBACK and REFUND transactions represent payments that were cancelled

Information About the Data

You will find the required datasets for this analysis in the current git repo. The data is split across the following 2 CSV files:

In testSamples.csv, you will find a list of unique users that were randomized in the A/B test.

sample_id : is the unique identifier for the sample
test_group : is the group in which the sample was placed, 0 = control group, 1 = test group

In transData.csv, you will find a list of transactions generated by randomized users after their randomization:

transaction_id : is the unique identifier for the transaction
sample_id : is a foreign key that links transactions to test samples
transaction_type : is the type of the transaction, one of REBILL, CHARGEBACK or REFUND
transaction_amount : is the amount generated by the transaction; this can be a negative value
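
As a minimal sketch of how the two files relate, the snippet below parses CSV text with the columns described above and attaches each transaction to its user's test_group through sample_id. The rows are made up for illustration, and the helper names read_rows and join_transactions are hypothetical; in a real analysis the text would come from testSamples.csv and transData.csv.

```python
import csv
import io

# Made-up rows for illustration only; real data lives in
# testSamples.csv and transData.csv.
SAMPLES_CSV = """sample_id,test_group
1,0
2,1
3,0
"""

TRANS_CSV = """transaction_id,sample_id,transaction_type,transaction_amount
100,1,REBILL,24.95
101,2,REBILL,24.95
102,2,REFUND,-24.95
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_transactions(samples, transactions):
    """Attach each transaction to its sample's test_group via sample_id."""
    group_by_sample = {row["sample_id"]: int(row["test_group"]) for row in samples}
    joined = []
    for t in transactions:
        joined.append({
            "transaction_id": t["transaction_id"],
            "sample_id": t["sample_id"],
            "test_group": group_by_sample[t["sample_id"]],
            "transaction_type": t["transaction_type"],
            "transaction_amount": float(t["transaction_amount"]),
        })
    return joined


samples = read_rows(SAMPLES_CSV)
joined = join_transactions(samples, read_rows(TRANS_CSV))
```

Note that a user with no transactions (sample 3 in the made-up rows) does not appear in the joined list, so per-user metrics probably need to start from the list of samples rather than from the transactions.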

Analysis Requirements

In this analysis we would like you to answer the following questions:

What is the approximate probability distribution between the test group and the control group?
Is a user that must call in to cancel more likely to generate at least 1 additional REBILL?
Is a user that must call in to cancel more likely to generate more revenue?
Is a user that must call in more likely to produce a higher chargeback rate (CHARGEBACKs/REBILLs)?
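
The quantities behind these questions can be sketched as per-group summaries. The example below is one possible reading, not a prescribed method: it assumes revenue is the net sum of transaction_amount and computes the chargeback rate at group level as CHARGEBACKs/REBILLs. The data and the function name summarize are made up for illustration.

```python
# Made-up data for illustration: sample_id -> test_group
groups = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1}

# Made-up transactions: (sample_id, transaction_type, transaction_amount)
transactions = [
    (1, "REBILL", 20.0),
    (1, "CHARGEBACK", -20.0),
    (2, "REBILL", 20.0),
    (4, "REBILL", 20.0),
    (4, "REBILL", 20.0),
    (5, "REFUND", -20.0),
]


def summarize(groups, transactions):
    """Return per-group metrics matching the four analysis questions."""
    stats = {
        g: {"users": 0, "rebill_users": set(), "revenue": 0.0,
            "rebills": 0, "chargebacks": 0}
        for g in set(groups.values())
    }
    # Count every randomized user, including those with no transactions.
    for g in groups.values():
        stats[g]["users"] += 1
    for sample_id, ttype, amount in transactions:
        s = stats[groups[sample_id]]
        s["revenue"] += amount  # assumption: revenue = net sum of amounts
        if ttype == "REBILL":
            s["rebills"] += 1
            s["rebill_users"].add(sample_id)
        elif ttype == "CHARGEBACK":
            s["chargebacks"] += 1
    total_users = len(groups)
    summary = {}
    for g, s in stats.items():
        summary[g] = {
            "share_of_users": s["users"] / total_users,
            "rebill_user_rate": len(s["rebill_users"]) / s["users"],
            "revenue_per_user": s["revenue"] / s["users"],
            # Undefined when the group has no REBILLs.
            "chargeback_rate": (s["chargebacks"] / s["rebills"]
                                if s["rebills"] else None),
        }
    return summary


summary = summarize(groups, transactions)
```

These are descriptive figures only; whether a difference between the groups is meaningful still has to be checked with a significance test.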

Technical Requirements:

Analysis must be coded in R or Python
Analysis must be submitted to a GitHub repository
Analysis must be written in markdown format
Please include at least 1 visualization with your analysis
Please use statistical significance tools to answer the questions we've asked
Include the code you used to perform the analysis in the GitHub repository
Send us the link to your repository once you've completed the analysis
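
As an illustration of what a statistical significance tool can look like, here is a sketch of a two-sided two-proportion z-test using only the Python standard library. It is one common option for comparing a rate between two groups (for example, the share of users with at least one REBILL), not a required method, and the function name two_proportion_z_test is hypothetical.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value), where z is positive when group B's
    proportion is higher than group A's.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of equal rates.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    variance = pooled * (1 - pooled) * (1 / n_a + 1 / n_b)
    if variance == 0:
        raise ValueError("pooled proportion is 0 or 1; test is undefined")
    z = (p_b - p_a) / math.sqrt(variance)
    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts for illustration: 10 of 100 in group A, 30 of 100 in group B.
z, p_value = two_proportion_z_test(10, 100, 30, 100)
```

The normal approximation used here is usually considered reasonable only when each group has enough successes and failures; with small counts an exact test may be a better fit.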
