# Using the SDBN Click Model to Overcome Position Bias

In this section we use the _Simplified Dynamic Bayesian Network_ (SDBN) click model to overcome the position bias that we saw with raw click-through rate (CTR). We then look at the SDBN judgments and how they compare to plain CTR.

In [2]:
import sys
sys.path.append('..')
from aips import *
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import random
from session_gen import SessionGenerator

# If using a Jupyter notebook, include:
%matplotlib inline

In [3]:
sessions = all_sessions()
sessions

Unnamed: 0,sess_id,query,rank,doc_id,clicked
0,2,ipad,0.0,92636260712,False
1,2,ipad,1.0,635753493559,True
2,2,ipad,2.0,885909393404,False
3,2,ipad,3.0,843404073153,False
4,2,ipad,4.0,885909457595,False
...,...,...,...,...,...
149995,60001,bluray,25.0,23942973416,False
149996,60001,bluray,26.0,25192107191,False
149997,60001,bluray,27.0,27242809710,False
149998,60001,bluray,28.0,600603132872,False


In [4]:
products = fetch_products(doc_ids=sessions['doc_id'].unique())
products

Unnamed: 0,image,upc,name,name_txt_en_split,manufacturer,shortDescription,longDescription,id,_version_,promotion_b
0,"<img height=""100"" src=""../data/retrotech/image...",27242827790,"Sony - 15.5"" VAIO Laptop - 4GB Memory - 640GB ...","Sony - 15.5"" VAIO Laptop - 4GB Memory - 640GB ...",Sony,ENERGY STAR QualifiedWindows 7 Home Premium 64...,"This 15.5"" Sony VAIO VPCEH14FM/W laptop featur...",bf8fe184-4710-483e-93ec-a2fa0a1e0688,1789076339145310208,
1,"<img height=""100"" src=""../data/retrotech/image...",813774010904,Samsung - Refurbished Wi-Fi Ready Blu-ray Pla...,Samsung - Refurbished Wi-Fi Ready Blu-ray Pla...,Samsung,RefurbishedENERGY STAR QualifiedPlays DVD and ...,See movies come to life in brilliant high-defi...,a8fa2255-4a92-4353-8992-38bb39742340,1789076339166281729,
2,"<img height=""100"" src=""../data/retrotech/image...",885909457588,Apple&#xAE; - iPad&#xAE; 2 with Wi-Fi - 16GB -...,Apple&#xAE; - iPad&#xAE; 2 with Wi-Fi - 16GB -...,Apple&#xAE;,"9.7"" widescreen display; 802.11a/b/g/n Wi-Fi; ...",The all-new thinner and lighter design makes i...,17c461c1-d3be-4696-bf27-04d727c8b040,1789076339171524614,
3,"<img height=""100"" src=""../data/retrotech/image...",600603135088,Alpha - eReader Universal AC Charger,Alpha - eReader Universal AC Charger,Alpha,Charge your eReader any way you need,Your eReader is your own personal library tuck...,d60321c7-5706-47e2-8142-a53ad9158860,1789076339196690442,True
4,"<img height=""100"" src=""../data/retrotech/image...",673796100317,Cosmic Headphones [LP] - VINYL,Cosmic Headphones [LP] - VINYL,ePistrophik Peach,\N,\N,80d7291a-95f4-49ce-a400-aee7c731f786,1789076339285819405,
...,...,...,...,...,...,...,...,...,...,...
307,"<img height=""100"" src=""../data/retrotech/image...",27242799127,Sony - Earbud Headphones - Black,Sony - Earbud Headphones - Black,Sony,9mm drivers; high-energy neodymium magnets; hy...,Rock out to your favorite songs with these ear...,2729ca39-4762-4f03-aa33-bddf41e15a2d,1789076340026114049,True
308,"<img height=""100"" src=""../data/retrotech/image...",27242799127,Sony - Earbud Headphones - Black,Sony - Earbud Headphones - Black,Sony,9mm drivers; high-energy neodymium magnets; hy...,Rock out to your favorite songs with these ear...,7f5daf3b-2f50-4428-ae82-e0240a669f55,1789076340026114053,True
309,"<img height=""100"" src=""../data/retrotech/image...",803238004525,Headphones - CD,Headphones - CD,Suicide Squeeze,\N,\N,bd065e9e-662e-4825-9719-be6b7282c77e,1789076340230586380,True
310,"<img height=""100"" src=""../data/retrotech/image...",803238004525,Headphones - CD,Headphones - CD,Suicide Squeeze,\N,\N,a1b1ab80-eb64-4262-bb36-88f3f70bd2de,1789076340230586387,True


# Listing 11.7

Click models overcome position bias by estimating, for each ranked result, whether users actually examined it. SDBN assumes users scan results from the top and treats every result at or above a session's last click as examined. This code marks the last click position in each session so we can count examines per document.

In [5]:
# Select all sessions for query "dryer"
query = "dryer"
sdbn_sess = sessions[sessions["query"] == query].copy().set_index("sess_id")

# Mapping of sess_id -> rank of the last click in that session
last_click_per_session = \
    sdbn_sess.groupby(["clicked", "sess_id"])["rank"].max()[True]

# Mark the last click rank in each session
sdbn_sess["last_click_rank"] = last_click_per_session

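# Set each position's examined flag to True or False
# (sessions with no clicks have a NaN last_click_rank, so nothing in them is marked examined)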
sdbn_sess["examined"] = sdbn_sess["rank"] <= sdbn_sess["last_click_rank"]

# Examine session 3
sdbn_sess.loc[3]

sess_id,query,rank,doc_id,clicked,last_click_rank,examined
3,dryer,0.0,12505451713,False,9.0,True
3,dryer,1.0,84691226727,False,9.0,True
3,dryer,2.0,883049066905,False,9.0,True
3,dryer,3.0,48231011396,False,9.0,True
3,dryer,4.0,74108056764,False,9.0,True
3,dryer,5.0,77283045400,False,9.0,True
3,dryer,6.0,783722274422,False,9.0,True
3,dryer,7.0,665331101927,False,9.0,True
3,dryer,8.0,14381196320,True,9.0,True
3,dryer,9.0,74108096487,True,9.0,True
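
The marking step above can be checked by hand on a small log. The following is a minimal sketch on hypothetical toy data (the `toy` frame and its values are invented for illustration, not taken from the `sessions` dataset). It uses `Series.map` rather than index alignment to attach the last click rank, which is a design choice of this sketch, not the notebook's approach. Session 3 has no clicks, so its `last_click_rank` is `NaN` and none of its rows count as examined.

```python
import pandas as pd

# Hypothetical toy click log: three sessions, four ranked results each
toy = pd.DataFrame({
    "sess_id": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "rank":    [0, 1, 2, 3] * 3,
    "doc_id":  ["a", "b", "c", "d"] * 3,
    "clicked": [False, True, False, False,    # session 1: last click at rank 1
                True, False, True, False,     # session 2: last click at rank 2
                False, False, False, False],  # session 3: no clicks at all
})

# Rank of the last click per session (sessions without clicks are absent)
last_click = toy[toy["clicked"]].groupby("sess_id")["rank"].max()

# Attach it to every row; sessions missing from last_click get NaN
toy["last_click_rank"] = toy["sess_id"].map(last_click)

# A row is examined if it sits at or above the session's last click.
# Comparisons against NaN are False, so session 3 is never examined.
toy["examined"] = toy["rank"] <= toy["last_click_rank"]

print(toy)
```

Sessions without any click contribute no examines at all under this rule, so they drop out of the later aggregation.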


# Listing 11.8

Aggregate the click and examine counts per document, keeping only the examined rows.

In [6]:
sdbn = sdbn_sess[sdbn_sess["examined"]].groupby("doc_id")[["clicked", "examined"]].sum()
sdbn

doc_id,clicked,examined
12505451713,355,2707
12505525766,268,974
12505527456,110,428
14381196320,217,1202
36172950027,97,971
36725561977,119,572
36725578241,130,477
48231011396,166,423
48231011402,213,818
74108007469,208,708


# Listing 11.9

We compute a grade, an estimated probability of relevance, by dividing clicks by examines. This is SDBN's version of click-through rate: it accounts for whether users actually saw the result, not just whether it was shown on the screen.

In [7]:
# Clicks over examines
sdbn["grade"] = sdbn["clicked"] / sdbn["examined"]
sdbn = sdbn.sort_values("grade", ascending=False)
sdbn

doc_id,clicked,examined,grade
856751002097,133,323,0.411765
48231011396,166,423,0.392435
84691226727,804,2541,0.316411
74108007469,208,708,0.293785
12505525766,268,974,0.275154
36725578241,130,477,0.272537
48231011402,213,818,0.260391
12505527456,110,428,0.257009
74108096487,235,1097,0.214221
36725561977,119,572,0.208042
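
To see how clicks over examines differs from clicks over impressions, here is a small sketch that computes both on a hypothetical four-session log (the data and the names `toy`, `ctr`, and `comparison` are made up for illustration). Every document is shown in every session, always in the same order, so plain CTR penalizes documents that sit low in the list.

```python
import pandas as pd

# Hypothetical log: docs a-d always shown at ranks 0-3; True marks a click
clicks_by_session = {
    1: [False, True, False, False],
    2: [True, False, False, False],
    3: [False, False, False, True],
    4: [True, False, False, False],
}
rows = [
    {"sess_id": sess_id, "rank": rank, "doc_id": doc_id, "clicked": clicked}
    for sess_id, clicks in clicks_by_session.items()
    for rank, (doc_id, clicked) in enumerate(zip("abcd", clicks))
]
toy = pd.DataFrame(rows)

# SDBN examine rule: everything at or above the session's last click
last_click = toy[toy["clicked"]].groupby("sess_id")["rank"].max()
toy["examined"] = toy["rank"] <= toy["sess_id"].map(last_click)

# Plain CTR: clicks divided by impressions (every row is an impression)
ctr = toy.groupby("doc_id")["clicked"].mean()

# SDBN grade: clicks divided by examines, over examined rows only
examined_rows = toy[toy["examined"]]
counts = examined_rows.groupby("doc_id")[["clicked", "examined"]].sum()
sdbn_grade = counts["clicked"] / counts["examined"]

comparison = pd.DataFrame({"ctr": ctr, "sdbn_grade": sdbn_grade})
print(comparison)
```

In this toy log, document `d` sits at the bottom rank and is examined only once, in the one session where it was clicked. Its CTR is 0.25 but its SDBN grade is 1.0. The same example also shows the weakness of the raw grade: a single examine is very little evidence.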


# Figure 11.8 source code

In [8]:
render_judged(products, sdbn, grade_col="grade", label=f"SDBN judgments for q={query}")

Unnamed: 0,grade,image,upc,name,shortDescription
0,0.411765,,856751002097,Practecol - Dryer Balls (2-Pack),"Suitable for use on most dry cycles; reduces lint, static and wrinkles; improves heat circulation; 2-pack"
1,0.392435,,48231011396,LG - 3.5 Cu. Ft. 7-Cycle High-Efficiency Washer - White,ENERGY STAR QualifiedDigital controls; 7 cycles; SpeedWash cycle; 9 wash options; delay-wash; SenseClean system; 6Motion technology; TrueBalance antivibration system
2,0.316411,,84691226727,GE - 6.0 Cu. Ft. 3-Cycle Electric Dryer - White,Rotary electromechanical controls; 3 cycles; 3 heat selections; DuraDrum interior; Quiet-By-Design
3,0.293785,,74108007469,Conair - 1875-Watt Folding Handle Hair Dryer - Blue,2 heat/speed settings; cool shot button; dual voltage; professional-length line cord
4,0.275154,,12505525766,Smart Choice - 6' 30 Amp 3-Prong Dryer Cord,Heavy-duty PVC insulation; strain relief safety clamp
5,0.272537,,36725578241,Samsung - 7.3 Cu. Ft. 7-Cycle Electric Dryer - White,Soft-touch dial controls; 7 preset drying cycles; 4 temperature settings; powdercoat drum; noise reduction package
6,0.260391,,48231011402,LG - 7.1 Cu. Ft. 7-Cycle Electric Dryer - White,Electronic controls with LED display; 7 cycles; Dial-A-Cycle option; sensor dry system; 5 temperature levels; 5 drying levels; NeveRust drum; LoDecibel quiet operation
7,0.257009,,12505527456,"Smart Choice - 1/2"" Safety+PLUS Stainless-Steel Gas Dryer Connector","Safety+PLUS automatic shut-off valve; leak detection solution; pipe thread sealant; 60,500 BTU; CSA approved"
8,0.214221,,74108096487,Conair - Infiniti Cord-Keeper Professional Tourmaline Ionic Hair Dryer - Fuchsia,Tourmaline ceramic technology; ionic technology; 1875 watts; Cool Shot function; 3 heat settings; 2 speed settings; 5' retractable cord; includes diffuser
9,0.208042,,36725561977,Samsung - 3.5 Cu. Ft. 6-Cycle High-Efficiency Washer - White,ENERGY STAR QualifiedSoft dial touch pad controls; 6 cycles; delay-start; child lock; Vibration Reduction Technology


# Figure 11.9 source code

In [9]:
# Mark the last click in each session for the query, then compute SDBN grades
query = "dryer"
sdbn_sess = sessions[sessions["query"] == query].copy().set_index("sess_id")

last_click_per_session = sdbn_sess.groupby(["clicked", "sess_id"])["rank"].max()[True]

sdbn_sess["last_click_rank"] = last_click_per_session
sdbn_sess["examined"] = sdbn_sess["rank"] <= sdbn_sess["last_click_rank"]

sdbn = sdbn_sess[sdbn_sess["examined"]].groupby("doc_id")[["clicked", "examined"]].sum()
sdbn["grade"] = sdbn["clicked"] / sdbn["examined"]

sdbn = sdbn.sort_values("grade", ascending=False)
render_judged(products, sdbn, grade_col="grade", label=f"SDBN judgments for q={query}")


Unnamed: 0,grade,image,upc,name,shortDescription
0,0.411765,,856751002097,Practecol - Dryer Balls (2-Pack),"Suitable for use on most dry cycles; reduces lint, static and wrinkles; improves heat circulation; 2-pack"
1,0.392435,,48231011396,LG - 3.5 Cu. Ft. 7-Cycle High-Efficiency Washer - White,ENERGY STAR QualifiedDigital controls; 7 cycles; SpeedWash cycle; 9 wash options; delay-wash; SenseClean system; 6Motion technology; TrueBalance antivibration system
2,0.316411,,84691226727,GE - 6.0 Cu. Ft. 3-Cycle Electric Dryer - White,Rotary electromechanical controls; 3 cycles; 3 heat selections; DuraDrum interior; Quiet-By-Design
3,0.293785,,74108007469,Conair - 1875-Watt Folding Handle Hair Dryer - Blue,2 heat/speed settings; cool shot button; dual voltage; professional-length line cord
4,0.275154,,12505525766,Smart Choice - 6' 30 Amp 3-Prong Dryer Cord,Heavy-duty PVC insulation; strain relief safety clamp
5,0.272537,,36725578241,Samsung - 7.3 Cu. Ft. 7-Cycle Electric Dryer - White,Soft-touch dial controls; 7 preset drying cycles; 4 temperature settings; powdercoat drum; noise reduction package
6,0.260391,,48231011402,LG - 7.1 Cu. Ft. 7-Cycle Electric Dryer - White,Electronic controls with LED display; 7 cycles; Dial-A-Cycle option; sensor dry system; 5 temperature levels; 5 drying levels; NeveRust drum; LoDecibel quiet operation
7,0.257009,,12505527456,"Smart Choice - 1/2"" Safety+PLUS Stainless-Steel Gas Dryer Connector","Safety+PLUS automatic shut-off valve; leak detection solution; pipe thread sealant; 60,500 BTU; CSA approved"
8,0.214221,,74108096487,Conair - Infiniti Cord-Keeper Professional Tourmaline Ionic Hair Dryer - Fuchsia,Tourmaline ceramic technology; ionic technology; 1875 watts; Cool Shot function; 3 heat settings; 2 speed settings; 5' retractable cord; includes diffuser
9,0.208042,,36725561977,Samsung - 3.5 Cu. Ft. 6-Cycle High-Efficiency Washer - White,ENERGY STAR QualifiedSoft dial touch pad controls; 6 cycles; delay-start; child lock; Vibration Reduction Technology


In [10]:
sdbn

doc_id,clicked,examined,grade
856751002097,133,323,0.411765
48231011396,166,423,0.392435
84691226727,804,2541,0.316411
74108007469,208,708,0.293785
12505525766,268,974,0.275154
36725578241,130,477,0.272537
48231011402,213,818,0.260391
12505527456,110,428,0.257009
74108096487,235,1097,0.214221
36725561977,119,572,0.208042
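
Because the same steps are repeated for each query, they could be collected into one function. The sketch below is one possible way to do that; `sdbn_grades` is a hypothetical helper name, not part of `aips`, and it assumes a frame with `sess_id`, `query`, `rank`, `doc_id`, and `clicked` columns like the `sessions` frame shown earlier. The toy data is invented for illustration.

```python
import pandas as pd

def sdbn_grades(sessions: pd.DataFrame, query: str) -> pd.DataFrame:
    """Hypothetical helper: SDBN clicks, examines, and grade per doc for one query."""
    sess = sessions[sessions["query"] == query].copy()

    # Rank of the last click in each session that has at least one click
    last_click = sess[sess["clicked"]].groupby("sess_id")["rank"].max()

    # Rows at or above the last click are examined; no-click sessions map to NaN
    sess["examined"] = sess["rank"] <= sess["sess_id"].map(last_click)

    # Sum clicks and examines over examined rows only, then divide
    examined = sess[sess["examined"]]
    out = examined.groupby("doc_id")[["clicked", "examined"]].sum()
    out["grade"] = out["clicked"] / out["examined"]
    return out.sort_values("grade", ascending=False)

# Hypothetical toy data: two "dryer" sessions and one "ipad" session
toy_sessions = pd.DataFrame({
    "sess_id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "query":   ["dryer"] * 6 + ["ipad"] * 3,
    "rank":    [0, 1, 2] * 3,
    "doc_id":  [10, 20, 30, 10, 20, 30, 40, 50, 60],
    "clicked": [False, True, False,
                True, False, False,
                False, False, True],
})

grades = sdbn_grades(toy_sessions, "dryer")
print(grades)
```

Document 30 never appears at or above a last click in the toy data, so it has no examines and is absent from the result rather than receiving a grade of zero.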


Up next: [Dealing with Low Confidence Situations](3.sdbn-Confidence-Bias.ipynb)