# Using SDBN Click Model To Overcome Position Bias

In this section we use the _Simplified Dynamic Bayesian Network_ (SDBN) click model to overcome the position bias that we saw with the raw click-through rate (CTR). We then look at the SDBN judgments and how they compare to judgments based on click-through rate alone.

In [1]:
import sys
sys.path.append('..')
from aips import *
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import random
from session_gen import SessionGenerator

# if using a Jupyter notebook, include:
%matplotlib inline

In [2]:
sessions = all_sessions()
sessions

Unnamed: 0,sess_id,query,rank,doc_id,clicked
0,35002,macbook,0.0,650450759814,False
1,35002,macbook,1.0,885909436705,False
2,35002,macbook,2.0,885909463626,False
3,35002,macbook,3.0,885909436002,False
4,35002,macbook,4.0,600603123061,False
...,...,...,...,...,...
149995,45001,star wars,25.0,886971404722,False
149996,45001,star wars,26.0,14633169546,False
149997,45001,star wars,27.0,883929200887,False
149998,45001,star wars,28.0,884116069973,False


In [3]:
products = fetch_products(doc_ids=sessions['doc_id'].unique())
products

Unnamed: 0,image,upc,name,manufacturer,shortDescription,longDescription,id,_version_
0,"<img height=""100"" src=""../data/retrotech/image...",885909472376,Apple&#xAE; - iPad&#xAE; 2 with Wi-Fi - 32GB -...,Apple&#xAE;,"9.7"" widescreen display; 802.11a/b/g/n Wi-Fi; ...",The all-new thinner and lighter design makes i...,ec31addc-241a-4266-9d22-1da243559a92,1787806796394528770
1,"<img height=""100"" src=""../data/retrotech/image...",814916010240,Amazon - Kindle DX - Graphite,Amazon,"9.7"" display with E-Ink technology; supports P...","Store up to 3,500 eBooks on this Kindle digita...",e7994184-e31f-42ab-ba65-cc3023eb29bf,1787806796409208834
2,"<img height=""100"" src=""../data/retrotech/image...",9781400532629,Barnes & Noble - NOOK WiFi eReader - White/White,Barnes & Noble,"6"" eInk display; supports PDF, ePub, JPEG, PNG...",This reader's built-in Wi-Fi wireless networki...,2791c987-912b-4e2b-b254-ec986f91b27e,1787806796441714691
3,"<img height=""100"" src=""../data/retrotech/image...",685387305636,Griffin Technology - PowerBlock Micro Charger ...,Griffin Technology,Compatible with select iPod and iPhone models;...,Keep your iPhone or iPod charged and ready for...,c43ad77d-c240-4a4d-b629-a6833aa1a08c,1787806796608438274
4,"<img height=""100"" src=""../data/retrotech/image...",27242708242,Sony - 900MHz Analog RF Wireless Headphones - ...,Sony,FM stereo sound; induction charging; compatibl...,Listen to your favorite music without worrying...,bf1f0573-6962-43c6-ad5d-5e272dfab677,1787806796634652690
...,...,...,...,...,...,...,...,...
306,"<img height=""100"" src=""../data/retrotech/image...",97361301747,Star Trek: Fan Collectives - DVD,\N,\N,\N,2d24ae31-f290-4b2d-b930-995fe83a51bf,1787806797059325953
307,"<img height=""100"" src=""../data/retrotech/image...",25192073007,Blues Brothers (Rated) (Unrated) - Widescreen ...,\N,\N,\N,401ea3a8-7307-4dc0-a617-da4d99806ed9,1787806797068763168
308,"<img height=""100"" src=""../data/retrotech/image...",25192073007,The Blues Brothers - Widescreen Dubbed Subtitl...,\N,\N,\N,de4d06b3-91c0-47c1-b1aa-69f7bf5888f5,1787806797077151746
309,"<img height=""100"" src=""../data/retrotech/image...",30206696622,Star Trek (Score) - Original Soundtrack - CD,Var&#xBF;se Sarabande (USA),\N,\N,3094a39a-9bf8-4d8f-9a39-f7da9c610749,1787806797719928863


# Listing 11.7

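Click models overcome position bias by estimating whether users actually examined each result. SDBN tracks examines relative to the last click: every result at or above the last clicked rank in a session counts as examined, and everything below it does not. This code marks the last click position per session so we can flag which results were examined.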

In [4]:
# Select all sessions for query "dryer"
query = "dryer"
sdbn_sess = sessions[sessions["query"] == query].copy().set_index("sess_id")

# Mapping of sess_id -> rank of the last click in that session
last_click_per_session = \
    sdbn_sess.groupby(["clicked", "sess_id"])["rank"].max()[True]

# Mark the last click rank in each session
sdbn_sess["last_click_rank"] = last_click_per_session

# Set each position's examined flag to True or False
sdbn_sess["examined"] = sdbn_sess["rank"] <= sdbn_sess["last_click_rank"]

# Inspect session 3
sdbn_sess.loc[3]

sess_id,query,rank,doc_id,clicked,last_click_rank,examined
3,dryer,0.0,12505451713,False,9.0,True
3,dryer,1.0,84691226727,False,9.0,True
3,dryer,2.0,883049066905,False,9.0,True
3,dryer,3.0,48231011396,False,9.0,True
3,dryer,4.0,74108056764,False,9.0,True
3,dryer,5.0,77283045400,False,9.0,True
3,dryer,6.0,783722274422,False,9.0,True
3,dryer,7.0,665331101927,False,9.0,True
3,dryer,8.0,14381196320,True,9.0,True
3,dryer,9.0,74108096487,True,9.0,True
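
The same rule can be checked by hand on a tiny example. The sketch below uses a made-up two-session log (the `toy` frame and its values are hypothetical, not taken from the session data above) and keeps `sess_id` as an ordinary column rather than the index. It also shows what happens to a session with no clicks: its last click rank is `NaN`, the comparison evaluates to `False`, and nothing in that session counts as examined.

```python
import pandas as pd

# Hypothetical toy data: two sessions, four ranks each.
toy = pd.DataFrame({
    "sess_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "rank":    [0, 1, 2, 3, 0, 1, 2, 3],
    "doc_id":  ["a", "b", "c", "d", "a", "b", "c", "d"],
    "clicked": [False, True, False, False,    # session 1: one click at rank 1
                False, False, False, False],  # session 2: no clicks
})

# Rank of the last click per session; sessions without clicks are absent.
last_click = toy[toy["clicked"]].groupby("sess_id")["rank"].max()

# Map the last click back onto every row; no-click sessions get NaN.
toy["last_click_rank"] = toy["sess_id"].map(last_click)

# Comparisons against NaN are False, so session 2 has no examined rows.
toy["examined"] = toy["rank"] <= toy["last_click_rank"]

print(toy)
```

In session 1 only ranks 0 and 1 are marked examined; ranks 2 and 3 sit below the last click and are treated as unseen.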


# Listing 11.8

Aggregate click and examine counts per document, keeping only the rows that were examined.

In [5]:
sdbn = sdbn_sess[sdbn_sess["examined"]].groupby("doc_id")[["clicked", "examined"]].sum()
sdbn

doc_id,clicked,examined
12505451713,355,2707
12505525766,268,974
12505527456,110,428
14381196320,217,1202
36172950027,97,971
36725561977,119,572
36725578241,130,477
48231011396,166,423
48231011402,213,818
74108007469,208,708


# Listing 11.9

We compute a grade, a probability of relevance, by dividing clicks by examines. This is SDBN's dynamic version of click-through rate: it accounts for whether users actually saw the result, not just whether it was shown on the screen.

In [8]:
# Clicks over examines
sdbn["grade"] = sdbn["clicked"] / sdbn["examined"]
sdbn = sdbn.sort_values("grade", ascending=False)
sdbn

doc_id,clicked,examined,grade
856751002097,133,323,0.411765
48231011396,166,423,0.392435
84691226727,804,2541,0.316411
74108007469,208,708,0.293785
12505525766,268,974,0.275154
36725578241,130,477,0.272537
48231011402,213,818,0.260391
12505527456,110,428,0.257009
74108096487,235,1097,0.214221
36725561977,119,572,0.208042
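
To make the difference from raw click-through rate concrete, here is a minimal sketch on a hypothetical three-session log (the `toy` data is invented for illustration). Every session shows the same three documents in the same order, so document `c` always sits at the bottom. Raw CTR divides clicks by all impressions, while the SDBN grade divides clicks by examined impressions only.

```python
import pandas as pd

# Hypothetical toy log: three sessions, docs a, b, c always at ranks 0, 1, 2.
toy = pd.DataFrame({
    "sess_id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "rank":    [0, 1, 2, 0, 1, 2, 0, 1, 2],
    "doc_id":  ["a", "b", "c"] * 3,
    "clicked": [True, False, False,    # session 1: click at rank 0
                True, False, False,    # session 2: click at rank 0
                False, False, True],   # session 3: click at rank 2
})

# SDBN examine rule: everything at or above the last click was examined.
last_click = toy[toy["clicked"]].groupby("sess_id")["rank"].max()
toy["examined"] = toy["rank"] <= toy["sess_id"].map(last_click)

# Raw CTR: clicks over every impression, examined or not.
ctr = toy.groupby("doc_id")["clicked"].mean()

# SDBN grade: clicks over examined impressions only.
seen = toy[toy["examined"]].groupby("doc_id")[["clicked", "examined"]].sum()
sdbn_grade = seen["clicked"] / seen["examined"]

comparison = pd.DataFrame({"ctr": ctr, "sdbn_grade": sdbn_grade})
print(comparison)
```

Document `c` has a CTR of one click in three impressions, but it was examined only once and clicked that one time, so its SDBN grade is 1.0. A grade built on a single examine is clearly not trustworthy on its own, which is the low-confidence problem the next notebook takes up.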


# Figure 11.8 source code

In [9]:
render_judged(products, sdbn, grade_col="grade", label=f"SDBN judgments for q={query}")

Unnamed: 0,grade,image,upc,name,shortDescription
0,0.411765,,856751002097,Practecol - Dryer Balls (2-Pack),"Suitable for use on most dry cycles; reduces lint, static and wrinkles; improves heat circulation; 2-pack"
1,0.392435,,48231011396,LG - 3.5 Cu. Ft. 7-Cycle High-Efficiency Washer - White,ENERGY STAR QualifiedDigital controls; 7 cycles; SpeedWash cycle; 9 wash options; delay-wash; SenseClean system; 6Motion technology; TrueBalance antivibration system
2,0.316411,,84691226727,GE - 6.0 Cu. Ft. 3-Cycle Electric Dryer - White,Rotary electromechanical controls; 3 cycles; 3 heat selections; DuraDrum interior; Quiet-By-Design
3,0.293785,,74108007469,Conair - 1875-Watt Folding Handle Hair Dryer - Blue,2 heat/speed settings; cool shot button; dual voltage; professional-length line cord
4,0.275154,,12505525766,Smart Choice - 6' 30 Amp 3-Prong Dryer Cord,Heavy-duty PVC insulation; strain relief safety clamp
5,0.272537,,36725578241,Samsung - 7.3 Cu. Ft. 7-Cycle Electric Dryer - White,Soft-touch dial controls; 7 preset drying cycles; 4 temperature settings; powdercoat drum; noise reduction package
6,0.260391,,48231011402,LG - 7.1 Cu. Ft. 7-Cycle Electric Dryer - White,Electronic controls with LED display; 7 cycles; Dial-A-Cycle option; sensor dry system; 5 temperature levels; 5 drying levels; NeveRust drum; LoDecibel quiet operation
7,0.257009,,12505527456,"Smart Choice - 1/2"" Safety+PLUS Stainless-Steel Gas Dryer Connector","Safety+PLUS automatic shut-off valve; leak detection solution; pipe thread sealant; 60,500 BTU; CSA approved"
8,0.214221,,74108096487,Conair - Infiniti Cord-Keeper Professional Tourmaline Ionic Hair Dryer - Fuchsia,Tourmaline ceramic technology; ionic technology; 1875 watts; Cool Shot function; 3 heat settings; 2 speed settings; 5' retractable cord; includes diffuser
9,0.208042,,36725561977,Samsung - 3.5 Cu. Ft. 6-Cycle High-Efficiency Washer - White,ENERGY STAR QualifiedSoft dial touch pad controls; 6 cycles; delay-start; child lock; Vibration Reduction Technology


# Figure 11.9 Source Code

In [8]:
# Mark the last click in each session for this query
query = "dryer"
sdbn_sess = sessions[sessions["query"] == query].copy().set_index("sess_id")

last_click_per_session = sdbn_sess.groupby(["clicked", "sess_id"])["rank"].max()[True]

sdbn_sess["last_click_rank"] = last_click_per_session
sdbn_sess["examined"] = sdbn_sess["rank"] <= sdbn_sess["last_click_rank"]

sdbn = sdbn_sess[sdbn_sess["examined"]].groupby("doc_id")[["clicked", "examined"]].sum()
sdbn["grade"] = sdbn["clicked"] / sdbn["examined"]

sdbn = sdbn.sort_values("grade", ascending=False)
render_judged(products, sdbn, grade_col="grade", label=f"SDBN judgments for q={query}")


Unnamed: 0,grade,image,upc,name,shortDescription
0,0.411765,,856751002097,Practecol - Dryer Balls (2-Pack),"Suitable for use on most dry cycles; reduces lint, static and wrinkles; improves heat circulation; 2-pack"
1,0.392435,,48231011396,LG - 3.5 Cu. Ft. 7-Cycle High-Efficiency Washer - White,ENERGY STAR QualifiedDigital controls; 7 cycles; SpeedWash cycle; 9 wash options; delay-wash; SenseClean system; 6Motion technology; TrueBalance antivibration system
2,0.316411,,84691226727,GE - 6.0 Cu. Ft. 3-Cycle Electric Dryer - White,Rotary electromechanical controls; 3 cycles; 3 heat selections; DuraDrum interior; Quiet-By-Design
3,0.293785,,74108007469,Conair - 1875-Watt Folding Handle Hair Dryer - Blue,2 heat/speed settings; cool shot button; dual voltage; professional-length line cord
4,0.275154,,12505525766,Smart Choice - 6' 30 Amp 3-Prong Dryer Cord,Heavy-duty PVC insulation; strain relief safety clamp
5,0.272537,,36725578241,Samsung - 7.3 Cu. Ft. 7-Cycle Electric Dryer - White,Soft-touch dial controls; 7 preset drying cycles; 4 temperature settings; powdercoat drum; noise reduction package
6,0.260391,,48231011402,LG - 7.1 Cu. Ft. 7-Cycle Electric Dryer - White,Electronic controls with LED display; 7 cycles; Dial-A-Cycle option; sensor dry system; 5 temperature levels; 5 drying levels; NeveRust drum; LoDecibel quiet operation
7,0.257009,,12505527456,"Smart Choice - 1/2"" Safety+PLUS Stainless-Steel Gas Dryer Connector","Safety+PLUS automatic shut-off valve; leak detection solution; pipe thread sealant; 60,500 BTU; CSA approved"
8,0.214221,,74108096487,Conair - Infiniti Cord-Keeper Professional Tourmaline Ionic Hair Dryer - Fuchsia,Tourmaline ceramic technology; ionic technology; 1875 watts; Cool Shot function; 3 heat settings; 2 speed settings; 5' retractable cord; includes diffuser
9,0.208042,,36725561977,Samsung - 3.5 Cu. Ft. 6-Cycle High-Efficiency Washer - White,ENERGY STAR QualifiedSoft dial touch pad controls; 6 cycles; delay-start; child lock; Vibration Reduction Technology


In [9]:
sdbn

doc_id,clicked,examined,grade
856751002097,133,323,0.411765
48231011396,166,423,0.392435
84691226727,804,2541,0.316411
74108007469,208,708,0.293785
12505525766,268,974,0.275154
36725578241,130,477,0.272537
48231011402,213,818,0.260391
12505527456,110,428,0.257009
74108096487,235,1097,0.214221
36725561977,119,572,0.208042
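
Since the consolidated cell hard-codes the query, one option is to wrap the same steps in a function so they can be rerun for other queries. The sketch below is only an illustration: `sdbn_grades` is a hypothetical helper name, it assumes a frame with the `sess_id`, `query`, `rank`, `doc_id`, and `clicked` columns shown earlier, and the `demo` data is made up.

```python
import pandas as pd

def sdbn_grades(sessions: pd.DataFrame, query: str) -> pd.DataFrame:
    """Hypothetical helper: SDBN clicks, examines, and grade per doc for one query."""
    sess = sessions[sessions["query"] == query].copy()

    # Rank of the last click per session (sessions without clicks drop out).
    last_click = sess[sess["clicked"]].groupby("sess_id")["rank"].max()

    # A row is examined if it sits at or above its session's last click.
    sess["examined"] = sess["rank"] <= sess["sess_id"].map(last_click)

    # Count clicks and examines per document over examined rows only.
    out = sess[sess["examined"]].groupby("doc_id")[["clicked", "examined"]].sum()
    out["grade"] = out["clicked"] / out["examined"]
    return out.sort_values("grade", ascending=False)

# Made-up sessions for two queries, used only to exercise the helper.
demo = pd.DataFrame({
    "sess_id": [1, 1, 1, 2, 2, 2, 3, 3],
    "query":   ["dryer"] * 6 + ["macbook"] * 2,
    "rank":    [0, 1, 2, 0, 1, 2, 0, 1],
    "doc_id":  ["x", "y", "z", "x", "y", "z", "m", "n"],
    "clicked": [False, True, False, True, False, False, True, False],
})

grades = sdbn_grades(demo, "dryer")
print(grades)
```

Documents that were never examined for the query (here `z`) do not appear in the result at all, because they contribute no examined rows to aggregate.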


Up next: [Dealing with Low Confidence Situations](3.sdbn-Confidence-Bias.ipynb)