# Media Attribution
Exploring single-touch and multi-touch attribution

Methods explored:

*   Last-touch Attribution
*   First-touch Attribution
*   U-shaped Attribution
*   Linear Attribution
*   Time-decay Attribution
*   Markov Chains


Installing the code from this GitHub repo for use in Markov chain attribution:

https://github.com/jerednel/markov-chain-attribution/blob/master/readme.md

In [1]:
!pip install markov-model-attribution

Collecting markov-model-attribution
  Downloading markov_model_attribution-0.42-py3-none-any.whl (3.9 kB)
Installing collected packages: markov-model-attribution
Successfully installed markov-model-attribution-0.42


In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure
import markov_model_attribution as mma

# User Paths

Data must be provided in a DataFrame with one column named "Paths". Each row is a single user and holds the path that user took through our marketing efforts.

Each user path must start with 'start' and end with 'null' (no conversion) or 'conv' (conversion). Steps must be separated by ' > '.

For example:

```
paths = pd.DataFrame({'Paths': ['start > S > P > null',
                                'start > S > E > null',
                                'start > S > P > E > conv']})
```
Where
*   S: Social
*   P: Paid Search
*   E: Email
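
The format rules above can be checked before any modelling. The helper below is a minimal sketch: `is_valid_path` is a hypothetical name, and requiring at least one touchpoint between 'start' and the end marker is an assumption added here, not something the package documents.

```python
def is_valid_path(path):
    """Return True if a path string follows the format described above.

    Hypothetical helper for illustration. Requiring at least one
    touchpoint between 'start' and the end marker is an assumption.
    """
    steps = path.split(' > ')
    return (
        len(steps) >= 3
        and steps[0] == 'start'
        and steps[-1] in ('null', 'conv')
    )


print(is_valid_path('start > S > P > null'))  # well-formed
print(is_valid_path('S > P > conv'))          # missing 'start'
```

On a DataFrame, something like `paths['Paths'].map(is_valid_path).all()` would flag malformed rows early.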


## Simple Example Data

In [3]:
paths = pd.DataFrame({'Paths': ['start > S > P > null',
                                'start > S > E > null',
                                'start > P > E > conv']})

df = paths.copy()

# Formatting Data

In [4]:
# Split each path string into a list of steps, e.g. ['start', 'S', 'P', 'null']
paths = [path.split(' > ') for path in paths['Paths']]
total_paths = len(paths)

# Heuristic Methods

*   Last-touch Attribution
*   First-touch Attribution
*   U-shaped Attribution
*   Linear Attribution
*   Time-decay Attribution
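
The five rules listed above differ only in how they split one conversion across the touchpoints of a converting path. As a rough sketch, the weights for a path with `n` touchpoints can be written out directly. `heuristic_weights` is a hypothetical name, and the 40/20/40 U-shape split and the halving time decay are the conventions this notebook uses, not universal definitions.

```python
def heuristic_weights(n):
    """Credit given to each of n touchpoints (ordered first to last).

    Hypothetical helper for illustration. Every weight list sums to 1.
    """
    if n < 1:
        raise ValueError('a converting path needs at least one touchpoint')

    last = [0.0] * (n - 1) + [1.0]
    first = [1.0] + [0.0] * (n - 1)

    # U-shaped: 40% first, 40% last, 20% shared by the middle touches
    if n == 1:
        u_shaped = [1.0]
    elif n == 2:
        u_shaped = [0.5, 0.5]
    else:
        u_shaped = [0.4] + [0.2 / (n - 2)] * (n - 2) + [0.4]

    linear = [1 / n] * n

    # Time decay: each touch gets twice the credit of the one before it
    raw = [2 ** k for k in range(n)]
    time_decay = [w / sum(raw) for w in raw]

    return {
        'last': last,
        'first': first,
        'u_shaped': u_shaped,
        'linear': linear,
        'time_decay': time_decay,
    }


for name, weights in heuristic_weights(4).items():
    print(name, [round(w, 3) for w in weights])
```

Printing the weights for a few values of `n` makes it easy to see how much each rule favours the start or the end of a path.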

In [5]:
# Every distinct step seen in any path (includes 'start', 'null' and 'conv')
unique_touch_list = set(x for element in paths for x in element)
# Credit accumulated per channel under each heuristic model
last_conv_dict = {}
first_conv_dict = {}
u_conv_dict = {}
lin_conv_dict = {}
time_decay_conv_dict = {}

total_conversions = 0
for item in unique_touch_list:
    last_conv_dict[item] = 0
    first_conv_dict[item] = 0
    u_conv_dict[item] = 0
    lin_conv_dict[item] = 0
    time_decay_conv_dict[item] = 0

for path in paths:
    if 'conv' in path:
        total_conversions += 1

        # Last Touch Attribution
        last_conv_dict[path[-2]] += 1

        # First Touch Attribution
        first_conv_dict[path[1]] += 1

        # U-shaped Attribution: 40% to the first touch, 40% to the last,
        # and the remaining 20% split evenly across the middle touches
        if len(path) == 3:
            # A single touchpoint gets full credit
            u_conv_dict[path[1]] += 1
        elif len(path) == 4:
            # Two touchpoints split the credit evenly
            u_conv_dict[path[1]] += 0.5
            u_conv_dict[path[-2]] += 0.5
        else:
            middle_touch_count = len(path) - 4
            u_conv_dict[path[1]] += 0.4
            u_conv_dict[path[-2]] += 0.4
            percent_to_distribute = 0.2 / middle_touch_count
            for i in range(2, 2 + middle_touch_count):
                u_conv_dict[path[i]] += percent_to_distribute

        # Linear Attribution: equal credit to every touchpoint
        touchpoints = len(path) - 2
        percent_to_distribute = 1 / touchpoints
        for i in range(1, len(path) - 1):
            lin_conv_dict[path[i]] += percent_to_distribute

        # Time Decay Attribution: each touchpoint gets twice the credit
        # of the one before it, normalised so the credits sum to 1
        touchpoints = len(path) - 2
        total = sum(1 / 2 ** i for i in range(touchpoints))
        denominator = 2 ** (touchpoints - 1)
        for i in range(touchpoints, 0, -1):
            time_decay_conv_dict[path[-i - 1]] += (1 / total) * (1 / denominator)
            denominator /= 2

# Drop the non-channel markers so only real channels remain
for conv_dict in (last_conv_dict, first_conv_dict, u_conv_dict,
                  lin_conv_dict, time_decay_conv_dict):
    for marker in ('conv', 'null', 'start'):
        conv_dict.pop(marker, None)

# Algorithmic Attribution

* Markov Chains

In [6]:
model = mma.run_model(paths=df)
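
`run_model` hides the modelling step, so it may help to see the removal-effect idea that Markov chain attribution is usually built on. The sketch below is not the package's code. `conversion_probability` and `removal_effect_attribution` are hypothetical names, and it assumes a first-order chain in which removing a channel sends all traffic into it to 'null'. It also assumes at least one path converts.

```python
from collections import Counter

import numpy as np


def conversion_probability(paths, removed=None):
    """P(reach 'conv' from 'start') in a first-order chain fitted to paths.

    Transitions into `removed` are treated as lost (assumption).
    """
    counts = Counter()
    for path in paths:
        for src, dst in zip(path, path[1:]):
            counts[(src, dst)] += 1
    out_totals = Counter()
    for (src, _), n in counts.items():
        out_totals[src] += n

    # Transient states: 'start' plus every channel except the removed one
    states = [s for s in out_totals if s != removed]
    index = {s: i for i, s in enumerate(states)}
    Q = np.zeros((len(states), len(states)))  # transient -> transient
    r = np.zeros(len(states))                 # transient -> 'conv'
    for (src, dst), n in counts.items():
        if src == removed:
            continue
        p = n / out_totals[src]
        if dst == 'conv':
            r[index[src]] += p
        elif dst in index:
            Q[index[src], index[dst]] += p
        # otherwise dst is 'null' or the removed channel: no conversion

    # Absorption probabilities x solve (I - Q) x = r
    x = np.linalg.solve(np.eye(len(states)) - Q, r)
    return x[index['start']]


def removal_effect_attribution(paths):
    """Share total conversions among channels by normalised removal effect."""
    channels = sorted({t for p in paths for t in p} - {'start', 'conv', 'null'})
    base = conversion_probability(paths)
    effects = {
        c: 1 - conversion_probability(paths, removed=c) / base
        for c in channels
    }
    total_effect = sum(effects.values())
    conversions = sum(p[-1] == 'conv' for p in paths)
    return {c: conversions * e / total_effect for c, e in effects.items()}


example_paths = [
    ['start', 'S', 'P', 'null'],
    ['start', 'S', 'E', 'null'],
    ['start', 'P', 'E', 'conv'],
]
print(removal_effect_attribution(example_paths))
```

On these three paths the sketch gives roughly 0.222 for P, 0.333 for S and 0.444 for E. Social receives credit even though it never appears in a converting path, because removing it cuts off traffic that could otherwise have reached Email.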

# Results

In [7]:
def round_values(test_dict):
  temp = {}
  for key in test_dict:
      temp[key] = round(test_dict[key], 3)
  return temp

In [8]:
print(f'Of all {total_conversions} conversions, how many can be attributed to each channel?\n')
print('Last: \t', round_values(last_conv_dict))
print('First: \t', round_values(first_conv_dict))
print('U-shape:', round_values(u_conv_dict))
print('Linear:\t', round_values(lin_conv_dict))
print('Decay: \t', round_values(time_decay_conv_dict))
print('Markov:\t', round_values(model['markov_conversions']))

Of all 1 conversions, how many can be attributed to each channel?

Last: 	 {'S': 0, 'P': 0, 'E': 1}
First: 	 {'S': 0, 'P': 1, 'E': 0}
U-shape: {'S': 0, 'P': 0.5, 'E': 0.5}
Linear:	 {'S': 0, 'P': 0.5, 'E': 0.5}
Decay: 	 {'S': 0, 'P': 0.333, 'E': 0.667}
Markov:	 {'S': 0.333, 'P': 0.222, 'E': 0.444}


## Proportions of Attribution

In [9]:
def proportion_of_attribution(test_dict, total_conversions):
  temp = {}
  for key in test_dict:
      temp[key] = round(test_dict[key] / total_conversions, 3)
  return temp
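
Each model hands out every conversion in full, so the credits in a dictionary should add up to the number of conversions, and the proportions to 1. A small check along these lines can catch bookkeeping mistakes. `credits_sum_to_total` and its tolerance are hypothetical and shown only as a sketch.

```python
def credits_sum_to_total(credit_by_channel, total_conversions, tol=1e-6):
    """Return True if the per-channel credits add up to the conversion count.

    Hypothetical sanity check. `tol` absorbs floating-point rounding.
    """
    return abs(sum(credit_by_channel.values()) - total_conversions) <= tol


# Time-decay credits for one conversion across two touchpoints
example_credits = {'S': 0.0, 'P': 1 / 3, 'E': 2 / 3}
print(credits_sum_to_total(example_credits, 1))
```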

In [10]:
print('What percentage of conversions can be attributed to each channel?\n')
print('Last: \t', proportion_of_attribution(last_conv_dict, total_conversions))
print('First: \t', proportion_of_attribution(first_conv_dict, total_conversions))
print('U-shape:', proportion_of_attribution(u_conv_dict, total_conversions))
print('Linear:\t', proportion_of_attribution(lin_conv_dict, total_conversions))
print('Decay: \t', proportion_of_attribution(time_decay_conv_dict, total_conversions))
print('Markov:\t', proportion_of_attribution(model['markov_conversions'], total_conversions))

What percentage of conversions can be attributed to each channel?

Last: 	 {'S': 0.0, 'P': 0.0, 'E': 1.0}
First: 	 {'S': 0.0, 'P': 1.0, 'E': 0.0}
U-shape: {'S': 0.0, 'P': 0.5, 'E': 0.5}
Linear:	 {'S': 0.0, 'P': 0.5, 'E': 0.5}
Decay: 	 {'S': 0.0, 'P': 0.333, 'E': 0.667}
Markov:	 {'S': 0.333, 'P': 0.222, 'E': 0.444}
