# A/B Testing Simulation to Active Learning

In this notebook, users have a hidden preference for a single query. We use this to explore A/B testing to see whether a given LTR model actually gives the users what they want.

Then we ask, much like in real life, how can we learn what the user _actually_ wants? We employe active learning to try to escape the 'echo chamber' of presentation bias we learned about at the end of chapter 11. After all users can't click on results that never show up in their search results!

## 🚨 We're putting it all together in this chapter

As this chapter puts together everything from chapters 10 and 11, much of the setup code below wraps up a lot of chapter 11 and 10 into a 'single function' so we can very easily run through the steps in 'one liners'

### Getting training data (Ch 11)

Chapter 11 is all about turning raw clickstream data into search training data (aka judgments). This involves overcoming biases in how users percieve search. But here we put that in one function call `calculate_sdbn`.

### Train a model (Ch 10)

Chapter 10 is about training an LTR model, including interacting with Solr to extract features, how a ranking model works, how to train a model, and how to perform a good test/train split for search. But here we similarly wrap that up into a handful of function calls, `split_training_data`, and `evaluate_model`.

*long story short, if you see a reference to chapter 10 and 11, it's probably omited from chapter 12* - don't expect it to be covered in chapter 12 extensively.


## Setup - gather some sessions (omitted)

To get started, we first load a set of simulated search sessions for all queries. 

Much of this setup is omitted from the chapter. This first part is just loading and synthesizing a bunch of clickstream sessions, like we used in chapter 11.

In [1]:
import sys

sys.path.append('..')
import glob
import time

import numpy
import pandas
import requests
from aips import *
from ltr.client.solr_client import SolrClient

import random; random.seed(0)


ltr = get_ltr_engine()
engine = get_engine()

client = SolrClient(solr_base=SOLR_URL)
products_collection = engine.get_collection("products")

In [2]:
def all_sessions():
    sessions = pandas.concat([pandas.read_csv(f, compression='gzip')
                          for f in glob.glob('retrotech/sessions/*_sessions.gz')])
    sessions = sessions.sort_values(['query', 'sess_id', 'rank'])
    return sessions.rename(columns={'clicked_doc_id': 'doc_id'})
    
sessions = all_sessions()
sessions

Unnamed: 0,sess_id,query,rank,doc_id,clicked
0,50002,blue ray,0.0,600603141003,True
1,50002,blue ray,1.0,827396513927,False
2,50002,blue ray,2.0,24543672067,False
3,50002,blue ray,3.0,719192580374,False
4,50002,blue ray,4.0,885170033412,True
...,...,...,...,...,...
74995,5001,transformers dark of the moon,10.0,47875841369,False
74996,5001,transformers dark of the moon,11.0,97363560449,False
74997,5001,transformers dark of the moon,12.0,93624956037,False
74998,5001,transformers dark of the moon,13.0,97363532149,False


In [3]:
sessions["query"].unique()

array(['blue ray', 'bluray', 'dryer', 'headphones', 'ipad', 'iphone',
       'kindle', 'lcd tv', 'macbook', 'nook', 'star trek', 'star wars',
       'transformers dark of the moon'], dtype=object)

## Setup Part 2 - Add some more query sessions (omitted)

Here we duplicate the simulated queries from above, but we flip a handful of the clicks. This just fills out our data a bit more, gives a bit more data to work with.

In [4]:
random.seed(0)

def copy_query_sessions(sessions, src_query, dest_query, flip=False):
    new_sessions = sessions[sessions["query"] == src_query].copy()  
    new_sessions["draw"] = numpy.random.rand(len(new_sessions), 1)
    new_sessions.loc[new_sessions["clicked"] & (new_sessions["draw"] < 0.04), "clicked"] = False
    new_sessions["query"] = dest_query
    return pandas.concat([sessions, new_sessions.drop("draw", axis=1)])


sessions = copy_query_sessions(sessions, "transformers dark of the moon", "transformers dark of moon")
sessions = copy_query_sessions(sessions, "transformers dark of the moon", "dark of moon")
sessions = copy_query_sessions(sessions, "transformers dark of the moon", "dark of the moon")
sessions = copy_query_sessions(sessions, "headphones", "head phones")
sessions = copy_query_sessions(sessions, "lcd tv", "lcd television")
sessions = copy_query_sessions(sessions, "lcd tv", "television, lcd")
sessions = copy_query_sessions(sessions, "macbook", "apple laptop")
sessions = copy_query_sessions(sessions, "iphone", "apple iphone")
sessions = copy_query_sessions(sessions, "kindle", "amazon kindle")
sessions = copy_query_sessions(sessions, "kindle", "amazon ereader")
sessions = copy_query_sessions(sessions, "blue ray", "blueray")

sessions

Unnamed: 0,sess_id,query,rank,doc_id,clicked
0,50002,blue ray,0.0,600603141003,True
1,50002,blue ray,1.0,827396513927,False
2,50002,blue ray,2.0,24543672067,False
3,50002,blue ray,3.0,719192580374,False
4,50002,blue ray,4.0,885170033412,True
...,...,...,...,...,...
149995,55001,blueray,25.0,22265004517,False
149996,55001,blueray,26.0,885170038875,False
149997,55001,blueray,27.0,786936817232,False
149998,55001,blueray,28.0,600603132872,False


In [5]:
sessions["query"].unique()

array(['blue ray', 'bluray', 'dryer', 'headphones', 'ipad', 'iphone',
       'kindle', 'lcd tv', 'macbook', 'nook', 'star trek', 'star wars',
       'transformers dark of the moon', 'transformers dark of moon',
       'dark of moon', 'dark of the moon', 'head phones',
       'lcd television', 'television, lcd', 'apple laptop',
       'apple iphone', 'amazon kindle', 'amazon ereader', 'blueray'],
      dtype=object)

## Setup Part 3 - Our test query, `transformers dvd`, with hidden, 'true' preferences

We add a new query to our set of queries `transformers dvd` and we note the users' hidden preferences in the variables `desired_movies` as well as what they consider mediocre `meh_transformers_movies` and not at all relevant `irrelevant_transformers_products`. Each holds the UPC of the associated product.

This simulates biased sessions in the data, as if the user never actually sees (and hence never clicks) their actual desired item. If the users desired results are shown, those results get a higher probability of click. Otherwise there is a lower probability of clicks.

In [6]:
next_sess_id = sessions["sess_id"].max()

# For some reason, the sessions only capture examines on the "dubbed" transformers movies
# ie the Japanese shows brought to an English-speaking market. But we'll see this is not what the 
# user wants (ie presentation bias). These are "meh" mildly interesting. There are also many many
# completely irrelevant movies.

# What the user wants, but never visible! Never gets clicked!
# These are the widescreen transformers dvds of the hollywood movies
desired_transformers_movies = ["97360724240", "97360722345", "97368920347"] 

# Bunch of random merchandise
irrelevant_transformers_products = ["708056579739", "93624995012", "47875819733", "47875839090", "708056579746",
                                     "47875332911", "47875842328", "879862003524", "879862003517", "93624974918"] 

# Other transformer movies
meh_transformers_movies = ["97363455349", "97361312743", "97361372389", "97361312804", "97363532149", "97363560449"]

displayed_transformer_products = meh_transformers_movies + irrelevant_transformers_products

new_sessions = []
for i in range(0,5000):
    random.shuffle(displayed_transformer_products)

    # shuffle each session
    for rank, upc in enumerate(displayed_transformer_products):
        draw = random.random()        
        clicked = (upc in meh_transformers_movies and draw < 0.13 or
                   upc in irrelevant_transformers_products and draw < 0.005 or
                   upc in desired_transformers_movies and draw < 0.65)

        new_sessions.append({"sess_id": next_sess_id + i, 
                             "query": "transformers dvd", 
                             "rank": rank,
                             "clicked": clicked,
                             "doc_id": upc})


sessions = pandas.concat([sessions, pandas.DataFrame(new_sessions)])
sessions

Unnamed: 0,sess_id,query,rank,doc_id,clicked
0,50002,blue ray,0.0,600603141003,True
1,50002,blue ray,1.0,827396513927,False
2,50002,blue ray,2.0,24543672067,False
3,50002,blue ray,3.0,719192580374,False
4,50002,blue ray,4.0,885170033412,True
...,...,...,...,...,...
79995,65000,transformers dvd,11.0,47875842328,False
79996,65000,transformers dvd,12.0,879862003517,False
79997,65000,transformers dvd,13.0,97361372389,False
79998,65000,transformers dvd,14.0,93624995012,False


## Setup 4 - chapter 11 In One Function (omitted) 

Wrapping up Chapter 11 in a single function `generate_training_data`. 

This function computes a relevance grade out of raw clickstream data. Recall that the SDBN (Simplified Dynamic Bayesian Network) click model we learned about in chapter 11 helps overcome position bias. We also use a beta prior so that a single click doesn't count as much as an observation with hundreds.

In [7]:
#%load -s calculate_ctr,calculate_average_rank,caclulate_examine_probability,calculate_clicked_examined,calculate_grade,calculate_prior,calculate_sdbn ../ltr/sdbn_functions.py
def calculate_ctr(sessions):
    click_counts = sessions.groupby("doc_id")["clicked"].sum()
    sess_counts = sessions.groupby("doc_id")["sess_id"].nunique()
    ctrs = click_counts / sess_counts
    return ctrs.sort_values(ascending=False)

def calculate_average_rank(sessions):
    avg_rank = sessions.groupby("doc_id")["rank"].mean()
    return avg_rank.sort_values(ascending=True)

def caclulate_examine_probability(sessions):
    last_click_per_session = sessions.groupby(["clicked", "sess_id"])["rank"].max()[True]
    sessions["last_click_rank"] = last_click_per_session
    sessions["examined"] = sessions["rank"] <= sessions["last_click_rank"]
    return sessions

def calculate_clicked_examined(sessions):
    sessions = caclulate_examine_probability(sessions)
    return sessions[sessions["examined"]] \
        .groupby("doc_id")[["clicked", "examined"]].sum()

def calculate_grade(sessions):
    sessions = calculate_clicked_examined(sessions)
    sessions["grade"] = sessions["clicked"] / sessions["examined"]
    return sessions.sort_values("grade", ascending=False)

def calculate_prior(sessions, prior_grade, prior_weight):
    sessions = calculate_grade(sessions)
    sessions["prior_a"] = prior_grade * prior_weight
    sessions["prior_b"] = (1 - prior_grade) * prior_weight
    return sessions

def calculate_sdbn(sessions, prior_grade, prior_weight):
    sessions = calculate_prior(sessions, prior_grade, prior_weight)
    sessions["posterior_a"] = (sessions["prior_a"] + 
                               sessions["clicked"])
    sessions["posterior_b"] = (sessions["prior_b"] + 
      sessions["examined"] - sessions["clicked"])
    sessions["beta_grade"] = (sessions["posterior_a"] /
      (sessions["posterior_a"] + sessions["posterior_b"]))
    return sessions.sort_values("beta_grade", ascending=False)

def generate_training_data(sessions, prior_grade=0.2, prior_weight=10):
    all_sdbn = pandas.DataFrame()
    for query in sessions["query"].unique():        
        query_sessions = sessions[sessions["query"] == query].copy().set_index("sess_id")
        query_sessions = calculate_sdbn(query_sessions, prior_grade, prior_weight)
        query_sessions["query"] = query
        all_sdbn = pandas.concat([all_sdbn, query_sessions])
    return all_sdbn[["query", "clicked", "examined", "grade", "beta_grade"]].reset_index().set_index(["query", "doc_id"])

## Listing 12.1 Generating the sdbn training data

We kickoff with the data we left off with in chapter 11.

In this listing we user our "chapter 11 in one function" `generate_training_data` to rebuild training data.

In [8]:
training_data = generate_training_data(sessions,
                                       prior_grade=0.2,
                                       prior_weight=10)
training_data

Unnamed: 0_level_0,Unnamed: 1_level_0,clicked,examined,grade,beta_grade
query,doc_id,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
blue ray,27242815414,42,42,1.000000,0.846154
blue ray,600603132872,46,88,0.522727,0.489796
blue ray,827396513927,1304,3381,0.385685,0.385137
blue ray,600603141003,978,2620,0.373282,0.372624
blue ray,885170033412,568,2184,0.260073,0.259799
...,...,...,...,...,...
transformers dvd,47875819733,24,1679,0.014294,0.015394
transformers dvd,708056579739,23,1659,0.013864,0.014979
transformers dvd,879862003524,23,1685,0.013650,0.014749
transformers dvd,93624974918,19,1653,0.011494,0.012628


## Chapter 10 Functions (omitted from book)

Now with the chapter 11 setup out of the way, we'll need to give Chapter 10's code a similar treatment, wrapping that LTR system into a black box.

All of the following are support functions for the chapter:

1. Convert the sdbn dataframe into individual `Judgment` objects needed for training the model from chapter 10
2. Pairwise transformation of the data
3. Normalization of the data
4. Training the model
5. Uploading the model to Solr

All of these steps are covered in Chapter 10.

In [19]:

from tkinter.tix import Tree
import requests
import numpy
from ltr.judgments import judgments_from_file, judgments_to_nparray
from sklearn import svm
import json
import math
from itertools import groupby
from ltr.log import FeatureLogger
from ltr.judgments import judgments_open
from itertools import groupby
from ltr import download
from ltr.judgments import judgments_writer

from ltr.judgments import Judgment

def sdbn_to_judgments(training_data):
    """Turn pandas dataframe into ltr judgments objects."""
    judgments = []
    queries = {}
    next_qid = 0
    for row_dict in training_data.reset_index().to_dict(orient="records"):
        # Round grade to 10ths, Map 0.3 -> 3, etc
        grade = round(row_dict["beta_grade"], 1) * 10
        qid = -1
        if row_dict["query"] in queries:
            qid = queries[row_dict["query"]]
        else:
            queries[row_dict["query"]] = next_qid
            qid = next_qid
            next_qid += 1
        assert qid != -1
        
        judgments.append(Judgment(doc_id=row_dict["doc_id"],
                                  keywords=row_dict["query"],
                                  qid=qid,
                                  grade=int(grade)))
    return judgments

def write_judgments(judgments, dest="retrotech_judgments.txt"):
    with judgments_writer(open(dest, "wt")) as writer:
        for judgment in judgments:
            writer.write(judgment)

def normalize_features(logged_judgments):
    print(logged_judgments[0].features)
    print(logged_judgments[len(logged_judgments) - 1].features)
    print(logged_judgments[len(logged_judgments) - 1])
    all_features = []
    means = [0] * len(logged_judgments[0].features)
    for judgment in logged_judgments:
        for idx, f in enumerate(judgment.features):
            means[idx] += f
        all_features.append(judgment.features)
    
    for i in range(len(means)):
        means[i] /= len(logged_judgments)
      
    std_devs = [0.0] * len(logged_judgments[0].features)
    for judgment in logged_judgments:
        for idx, f in enumerate(judgment.features):
            std_devs[idx] += (f - means[idx])**2
            
    for i in range(len(std_devs)):
        std_devs[i] /= len(logged_judgments)
        std_devs[i] = math.sqrt(std_devs[i])
    
    for i in range(len(std_devs)):
        if std_devs[i] == 0:
            std_devs[i] = 0.00001
        
    # Normalize!
    normed_judgments = []
    for judgment in logged_judgments:
        normed_features = [0.0] * len(judgment.features)
        for idx, f in enumerate(judgment.features):
            normed = 0.0
            if std_devs[idx] > 0: 
                normed = (f - means[idx]) / std_devs[idx]
            normed_features[idx] = normed
        normed_judgment=Judgment(qid=judgment.qid,
                                 keywords=judgment.keywords,
                                 doc_id=judgment.doc_id,
                                 grade=judgment.grade,
                                 features=normed_features)
        normed_judgment.old_features=judgment.features
        normed_judgments.append(normed_judgment)

    return means, std_devs, normed_judgments


def pairwise_transform(normed_judgments, weigh_difference = True):
        
    predictor_deltas = []
    feature_deltas = []
    
    # For each query's judgments
    for qid, query_judgments in groupby(normed_judgments, key=lambda j: j.qid):

        # Annoying issue consuming python iterators, we ensure we have two
        # full copies of each query"s judgments
        query_judgments_copy_1 = list(query_judgments) 
        query_judgments_copy_2 = list(query_judgments_copy_1)

        # Examine every judgment combo for this query, 
        # if they"re different, store the pairwise difference:
        # +1 if judgment1 more relevant
        # -1 if judgment2 more relevant
        for judgment1 in query_judgments_copy_1:
            for judgment2 in query_judgments_copy_2:
                
                j1_features=numpy.array(judgment1.features)
                j2_features=numpy.array(judgment2.features)
                
                if judgment1.grade > judgment2.grade:
                    diff = judgment1.grade - judgment2.grade if weigh_difference else 1.0
                    predictor_deltas.append(+1)
                    feature_deltas.append(diff * (j1_features-j2_features))
                elif judgment1.grade < judgment2.grade:
                    diff = judgment2.grade - judgment1.grade if weigh_difference else 1.0
                    predictor_deltas.append(-1)
                    feature_deltas.append(diff * (j1_features-j2_features))

    # For training purposes, we return these as numpy arrays
    return numpy.array(feature_deltas), numpy.array(predictor_deltas)

def upload_model(model, model_name, means, std_devs, feature_set):
    feature_names = [ftr["name"] for ftr in feature_set]
    linear_model = ltr.generate_model(model_name, feature_names,
                                      means, std_devs, model.coef_[0])
    
    print(f"Delete model {model_name}")
    response = ltr.delete_model_store(products_collection, model_name)
    print(response)
    
    print(f"Upload model {model_name}")
    response = ltr.upload_model(products_collection, linear_model)
    print(response)
    
## TODO - can't easily to test/train split on these few queries
##   make more queries?

def train_ranksvm_model(training_data, model_name, feature_set):
    """Train a RankSVM model via Solr, store in Solr."""
    judgments = sdbn_to_judgments(training_data)
    judgments_path = "retrotech_judgments.txt"
    write_judgments(judgments, judgments_path)
    
    ltr.delete_feature_store(products_collection, model_name)
    ltr.delete_model_store(products_collection, model_name)
    print("Put feature-store:")
    response = ltr.upload_features(products_collection, feature_set, model_name)
    print(response)
    collection = engine.get_collection("products")
    ftr_logger = FeatureLogger(client, collection, feature_set=model_name,
                               id_field="upc")

    with judgments_open(judgments_path) as judgment_list:
        for qid, query_judgments in groupby(judgments, key=lambda j: j.qid):
            ftr_logger.log_for_qid(judgments=query_judgments, 
                                   qid=qid,
                                   keywords=judgment_list.keywords(qid), log=True)

    logged_judgments = ftr_logger.logged
    means, std_devs, normed_judgments = normalize_features(logged_judgments)
    feature_deltas, predictor_deltas = pairwise_transform(normed_judgments)

    model = svm.LinearSVC(max_iter=10000, verbose=1)
    model.fit(feature_deltas, predictor_deltas) 
    
    upload_model(model, model_name, means, std_devs, feature_set)

sdbn_to_judgments(training_data)            
write_judgments(sdbn_to_judgments(training_data))
!cat retrotech_judgments.txt


# qid:0: blue ray*1
# qid:1: bluray*1
# qid:2: dryer*1
# qid:3: headphones*1
# qid:4: ipad*1
# qid:5: iphone*1
# qid:6: kindle*1
# qid:7: lcd tv*1
# qid:8: macbook*1
# qid:9: nook*1
# qid:10: star trek*1
# qid:11: star wars*1
# qid:12: transformers dark of the moon*1
# qid:13: transformers dark of moon*1
# qid:14: dark of moon*1
# qid:15: dark of the moon*1
# qid:16: head phones*1
# qid:17: lcd television*1
# qid:18: television, lcd*1
# qid:19: apple laptop*1
# qid:20: apple iphone*1
# qid:21: amazon kindle*1
# qid:22: amazon ereader*1
# qid:23: blueray*1
# qid:24: transformers dvd*1

8	qid:0	 # 27242815414	blue ray
5	qid:0	 # 600603132872	blue ray
4	qid:0	 # 827396513927	blue ray
4	qid:0	 # 600603141003	blue ray
3	qid:0	 # 885170033412	blue ray
3	qid:0	 # 883929140855	blue ray
2	qid:0	 # 24543672067	blue ray
2	qid:0	 # 813774010904	blue ray
2	qid:0	 # 36725617605	blue ray
2	qid:0	 # 786936817232	blue ray
2	qid:0	 # 36725608443	blue ray
2	qid:0	 # 719192580374	blue ray
2	qid:0	 # 25192

## Also Chapter 10 - Perform a test / train split on the SDBN data (omitted)

This function is broken out from the model training. It lets us train a model on one set of data (reusing the chapter 10 training code), reserving test queries for evaluation.

In [10]:
from math import floor

def split_training_data(training_data, train_proportion=0.8):
    """Split queries in training_data into train / test split with `train` proportion going to training set."""
    queries = training_data.index.get_level_values('query').unique().copy().tolist()
    random.shuffle(queries)
    num_queries = len(queries)
    split_point = floor(num_queries * train_proportion)
    
    train_queries = queries[:split_point]
    test_queries = queries[split_point:]
    return training_data.loc[train_queries, :], training_data.loc[test_queries]


## Chapter 10 - Search Code (omitted)

Also from Chapter 10, a simple function to search using the LTR model and return a list of search results.

In [11]:
def search_and_grade(query, model_name, judgments, desired=[]):
    results = ltr.search_with_model(query, model_name, rows=10)
    results = pandas.DataFrame(results)
    results["desired"] = False
    for upc in desired:
        results.loc[results["upc"] == upc, "desired"] = True
        
    sdbn_query = judgments.loc[query].copy().reset_index()
    return results.merge(sdbn_query, left_on='upc', right_on='doc_id', how='left')

## Chapter 10 - Evaluate the model on the test set (omitted)

This function computes the model's performance on a set of test queries. The `test_data` is the control set not used to train the model. We compute the precision of these queries

In [12]:
def evaluate_model(test_data, model_name, training_data, rows=10, log=True):
    queries = test_data.index.get_level_values("query").unique()
    
    query_results = {}
    
    for query in queries:
        search_results = ltr.search_with_model(query, model_name,
                                               rows=rows, log=True)
    
        results = pandas.DataFrame(search_results).reset_index()
        judgments = training_data.loc[query, :].copy().reset_index()
        judgments["doc_id"] = judgments["doc_id"].astype(str)
        if len(results) == 0:
            print(f"No Results for {query}")
            query_results[query] = 0
        else:
            graded_results = results.merge(judgments, left_on="upc", right_on="doc_id", how="left")
            print(graded_results)
            graded_results[["clicked", "examined", "grade", "beta_grade"]] = graded_results[["clicked", "examined", "grade", "beta_grade"]].fillna(0)
            graded_results = graded_results.drop("doc_id", axis=1)

            query_results[query] = (graded_results["beta_grade"].sum() / rows)
    return query_results

## Listing 12.2 - model training

We wrap all the important decisions from chapter 10 in a few lines 

In [13]:
def train_and_evaluate_model(sessions, model_name, feature_set):
    judgments = generate_training_data(sessions)
    train, test = split_training_data(judgments, 0.8)
    train_ranksvm_model(train, model_name, feature_set=feature_set)
    evaluation = evaluate_model(test, model_name, judgments)
    return evaluation

In [22]:
random.seed(1234)
ltr.delete_feature_store(products_collection, "ltr_model_variant_0")

feature_set = [
    ltr.generate_query_feature(feature_name="long_description_bm25",
                               field_name="long_description"),
    ltr.generate_query_feature(feature_name="short_description_constant",
                               field_name="short_description",
                               constant_score=True)
]
feature_set[0]["store"] = "ltr_model_variant_0"
feature_set[1]["store"] = "ltr_model_variant_0"
print(json.dumps(feature_set, indent=2))
train_and_evaluate_model(sessions, "ltr_model_variant_0", feature_set)

[
  {
    "name": "long_description_bm25",
    "class": "org.apache.solr.ltr.feature.SolrFeature",
    "params": {
      "q": "long_description:(${keywords})"
    },
    "store": "ltr_model_variant_0"
  },
  {
    "name": "short_description_constant",
    "class": "org.apache.solr.ltr.feature.SolrFeature",
    "params": {
      "q": "short_description:(${keywords})^=1"
    },
    "store": "ltr_model_variant_0"
  }
]
Put feature-store:
{'responseHeader': {'status': 0, 'QTime': 2}}
Search Request:
{
  "query": "upc:(36725236271 883393003458 36725234789 882777064009 22265004289 27242817197 729507810218 885170042704 812491010310 827912072969 812491010334 74000373105 884483335329 827912068467 97278016000 827912068474 882777064207 36725235564 696211503197 885170042667 600603139758 13803112610 719192579996 22265004258 723755834491 605342041546 22265004302 729507813059)",
  "limit": 1000,
  "params": {},
  "fields": [
    "upc",
    "[features store=ltr_model_variant_0 efi.keywords=\"lcd telev


Liblinear failed to converge, increase the number of iterations.



Delete model ltr_model_variant_0
{'responseHeader': {'status': 0, 'QTime': 6}}
Upload model ltr_model_variant_0
{'responseHeader': {'status': 0, 'QTime': 2}}
search_with_model: search request:
{'fields': ['upc', 'name', 'manufacturer', 'score'], 'limit': 10, 'params': {'rq': '{!ltr reRankDocs=60000 reRankWeight=10.0 model=ltr_model_variant_0 efi.fuzzy_keywords="~dryer" efi.squeezed_keywords="dryer" efi.keywords="dryer"}', 'qf': 'name name_ngram upc manufacturer short_description long_description', 'defType': 'edismax', 'q': 'dryer'}}
search_with_model: search response:
{'responseHeader': {'zkConnected': True, 'status': 0, 'QTime': 4}, 'response': {'numFound': 812, 'start': 0, 'maxScore': 4.134992, 'numFoundExact': True, 'docs': [{'upc': '50946924311', 'name': 'Whirlpool - Duet&#x2122; Stack Kit III', 'manufacturer': 'Whirlpool', 'score': 0.060465567}, {'upc': '48231010702', 'name': 'LG - SteamDryer 7.3 Cu. Ft. 14-Cycle Ultra Capacity Gas Dryer - Graphite Steel', 'manufacturer': 'LG', '

{'dryer': 0.0,
 'blue ray': 0.0,
 'headphones': 0.0,
 'dark of moon': 0.03537007299011319,
 'transformers dvd': 0.005110964992700785}

In [None]:
# # What the user wants, but never visible! Never gets clicked!
# These are the widescreen transformers dvds of the hollywood movies
judgments = generate_training_data(sessions)
desired_movies = ["97360724240", "97360722345", "97368920347"] 
result = search_and_grade("transformers dvd", "ltr_model_variant_0", judgments, desired_movies)
upcs1 = result["upc"]
result

Unnamed: 0,upc,name,manufacturer,score,rank,desired,doc_id,clicked,examined,grade,beta_grade
0,708056579746,Nintendo - Transformers 3 Stylus 2-Pack,Nintendo,0.071629,0,False,708056579746.0,26.0,1664.0,0.015625,0.016726
1,47875332911,Transformers: Revenge of the Fallen - Windows,Activision,0.06615,1,False,47875332911.0,24.0,1630.0,0.014724,0.015854
2,47875841420,Transformers: Dark of the Moon Decepticons - N...,Activision,0.059614,2,False,,,,,
3,47875841369,Transformers: Dark of the Moon - PlayStation 3,Activision,0.057303,3,False,,,,,
4,34707056190,Memorex - 50-Pack 16x DVD+R Disc Spindle,Memorex,0.055923,4,False,,,,,
5,47875842328,Transformers: Dark of the Moon Stealth Force E...,Activision,0.055477,5,False,47875842328.0,29.0,1663.0,0.017438,0.01853
6,47875842335,Transformers: Dark of the Moon Stealth Force E...,Activision,0.055477,6,False,,,,,
7,47875841406,Transformers: Dark of the Moon Autobots - Nint...,Activision,0.055477,7,False,,,,,
8,23942950585,Verbatim - 25-Pack 16x DVD-R Disc Spindle,Verbatim,0.055429,8,False,,,,,
9,659846419028,Digital Innovations - DvdDr Laser Lens Cleaner...,Digital Innovations,0.054677,9,False,,,,,


## Listing 12.3

Train a model that hypothetically performs better offline called `ltr_model_variant_1`

In [16]:
random.seed(1234)

ltr.delete_feature_store(products_collection, "ltr_model_variant_1")

feature_set = [
    ltr.generate_fuzzy_query_feature("name_fuzzy", "name"),
    ltr.generate_bigram_query_feature("name_bigram", "name"),
    ltr.generate_bigram_query_feature("short_description_bigram", "short_description")
]

train_and_evaluate_model(sessions, "ltr_model_variant_1", feature_set)

KeyboardInterrupt: 

## Simulate a user querying, clicking, purchasing (omitted)

This function simulates a user performing a query and possibly taking an action as they scan down the results.

In [None]:
def simulate_live_user_session(query, model_name,
                               desired_probability=0.15,
                               indifferent_probability=0.03,
                               uninterested_probability=0.01,
                               quit_per_result_probability=0.2):
    """Simulates a user 'query' where purchase probability depends on if 
       products upc is in one of three sets.
       
       Users purchase a single product per session.    
       
       Users quit with `quit_per_rank_prod` after scanning each rank
       
       """   
    desired_products = ["97360724240", "97363560449", "97363532149",
                        "97360810042"]
    indifferent_products = ["97361312743", "97363455349", "97361372389"]
    search_results = ltr.search_with_model(query, model_name, rows=10)

    results = pandas.DataFrame(search_results).reset_index()
    for doc in results.to_dict(orient="records"): 
        draw = random.random()
        
        if doc["upc"] in desired_products:
            if draw < desired_probability:
                return True
        elif doc["upc"] in indifferent_products:
            if draw < indifferent_probability:
                return True
        elif draw < uninterested_probability:
            return True
        if random.random() < quit_per_result_probability:
            return False
        
    return False

## Listing 12.4 - Simulated A/B test on just `transformers dvd` query

Here we simulate 1000 users being served two rankings for `transformers dvd` and based on the hidden preferences here (`wants_to_purchase` and `might_purchase`) we see which performs better with conversions.

In [None]:
def a_b_test(query, model_a, model_b):
    """Randomly assign this user to a or b"""
    draw = random.random()
    model_name = model_a if draw < 0.5 else model_b
    
    purchase_made = simulate_live_user_session(query, model_name)
    return (model_name, purchase_made)

def simulate_user_a_b_test(query, model_a, model_b, number_of_users=1000):
    purchases = {model_a: 0, model_b: 0}
    for _ in range(number_of_users): 
        model_name, purchase_made = a_b_test(query, model_a, model_b)
        if purchase_made:
            purchases[model_name] += 1
    return purchases

In [None]:
random.seed(1234)

simulate_user_a_b_test("transformers dvd",
                       "ltr_model_variant_0",
                       "ltr_model_variant_1")

{'ltr_model_variant_0': 21, 'ltr_model_variant_1': 15}

## New helper: show the features for each SDBN entry (omitted)

This function shows us the logged features of each training row for the given sdbn data for debugging.

So not just

| query   | doc      | grade
|---------|----------|---------
|transformers dvd | 1234 | 1.0

But also a recording of the matches that occured

| query           | doc      | grade    | short_desc_match  | long_desc_match |...
|-----------------|----------|----------|-------------------|-----------------|---
|transformers dvd | 1234     | 1.0      | 0.0               | 1.0             |...

In [None]:
def associate_with_features(training_data, feature_set, model_name):
    """Log features alongside sdbn training_data into a dataframe"""
    judgments = sdbn_to_judgments(training_data)
    judgments_path = "retrotech_judgments.txt"
    write_judgments(judgments, judgments_path)
    
    ltr.delete_feature_store(products_collection, model_name)
    ltr.upload_features(products_collection, feature_set, model_name)

    ftr_logger = FeatureLogger(client, index=products_collection, feature_set=model_name, id_field="upc")
    
    with judgments_open(judgments_path) as judgment_list:
        for qid, query_judgments in groupby(judgments, key=lambda j: j.qid):
            ftr_logger.log_for_qid(judgments=query_judgments, 
                                   qid=qid,
                                   keywords=judgment_list.keywords(qid))

    logged_judgments = ftr_logger.logged
    means, std_devs, normed_judgments = normalize_features(logged_judgments)
    feature_deltas, predictor_deltas = pairwise_transform(normed_judgments)
    features, predictors = judgments_to_nparray(logged_judgments)
    logged_judgments_dataframe = pandas.concat([pandas.DataFrame(predictors),
                                            pandas.DataFrame(features)], 
                                           axis=1,
                                           ignore_index=True)
    columns = {idx + 2: ftr["name"] for idx, ftr in enumerate(feature_set)}
    columns[0] = "grade"
    columns[1] = "qid"
    
    qid_to_query = {}
    for j in logged_judgments:
        qid_to_query[j.qid] = j.keywords
        
    qid_to_query = pandas.DataFrame(qid_to_query.values()).reset_index().rename(columns={"index": "qid", 0: "query"})
    
    logged_judgments_dataframe = logged_judgments_dataframe.rename(columns=columns)
    logged_judgments_dataframe = logged_judgments_dataframe.merge(qid_to_query, how="left", on="qid")
    cols_order = ["query", "grade"] + [ftr["name"] for idx, ftr in enumerate(feature_set)]
    logged_judgments_dataframe["grade"] = logged_judgments_dataframe["grade"] / 10.0 
    return logged_judgments_dataframe[cols_order].sort_values("query")

## Listing 12.5 - Output matches for one feature set

Another way of formulating `presentation_bias` is to look at the kinds of documents not being shown to users, so we can strategically show those to users. Below we show the value of each feature in `explore_feature_set` for each document in the sdbn judgments.

In [None]:
def get_exploit_feature_set(store="aips_feature_store"):
    return [
        ltr.generate_fuzzy_query_feature("name_fuzzy", "name"),
        ltr.generate_query_feature("long_description_bm25", "long_description"),
        ltr.generate_query_feature("short_description_match", "short_description", True)]

def get_latest_explore_feature_set():
    return [
        ltr.generate_query_feature("long_description_match", "long_description", True),
        ltr.generate_query_feature("short_description_match", "short_description", True),
        ltr.generate_query_feature("name_match", "name", True),
        ltr.generate_query_feature("has_promotion", "has_promotion", value="true")]
    
explore_feature_set = get_latest_explore_feature_set()

judgments = generate_training_data(sessions)    
feature_names = [f["name"] for f in explore_feature_set]

judgments_with_features = \
  associate_with_features(judgments, explore_feature_set, "explore")
transformers_judgments = \
  judgments_with_features[(judgments_with_features["query"] ==
                            "transformers dvd")]
transformers_judgments


Missing doc 600603132872
Missing doc 600603141003
Missing doc 600603141003
Missing doc 600603132872
Missing doc 600603124570
Missing doc 600603132827
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603139758
Missing doc 600603123061
Missing doc 600603135101
Missing doc 600603124570
Missing doc 600603139758
Missing doc 600603139758
Missing doc 600603123061
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603132872
Missing doc 600603141003
[1.0, 1.0, 1.0, 0.009569731]
[1.0, 0.0, 1.0, 0.0]
grade:0 qid:24 (transformers dvd) docid:47875839090


Unnamed: 0,query,grade,long_description_match,short_description_match,name_match,has_promotion
618,transformers dvd,0.0,1.0,0.0,1.0,0.0
623,transformers dvd,0.0,1.0,1.0,1.0,0.0
622,transformers dvd,0.0,1.0,1.0,1.0,0.0
621,transformers dvd,0.0,1.0,0.0,1.0,0.0
620,transformers dvd,0.0,1.0,0.0,1.0,0.0
619,transformers dvd,0.0,1.0,0.0,1.0,0.0
617,transformers dvd,0.0,0.0,0.0,1.0,0.0
610,transformers dvd,0.3,0.0,0.0,1.0,0.0
615,transformers dvd,0.3,0.0,0.0,1.0,0.0
614,transformers dvd,0.3,0.0,0.0,1.0,0.0


## Listing 12.6 - Train Gaussian Process Regressor

We train data on just the `transformers_training_data`. 

NOTE we could also train on the full sdbn training data, and see globally what's missing. However it's often convenient to zero in on specific queries to round out their training data.

In [None]:
from sklearn.gaussian_process import GaussianProcessRegressor

def train_gpr(training_data, feature_names):
    feature_data = training_data[feature_names]
    grades = training_data["grade"]
    gpr = GaussianProcessRegressor()
    gpr.fit(feature_data, grades)
    return gpr

In [None]:
train_gpr(transformers_judgments, feature_names)

## Listing 12.7: Predict on every value

Here `gpr` predicts on every possible feature value. This lets us analyze which set of feature values to use when exploring with users.

In [None]:
def calculate_prediction_std_dev(training_data, feature_names):
    zero_or_one = [0, 1]
    index = pandas.MultiIndex.from_product(
        [zero_or_one] * 4, names=feature_names)
    with_prediction = pandas.DataFrame(index=index).reset_index()

    gpr = train_gpr(training_data, feature_names)
    predictions_with_std = gpr.predict(
        with_prediction[feature_names],
                    return_std=True)
    with_prediction["predicted_grade"] = predictions_with_std[0]
    with_prediction["predicted_stddev"] = predictions_with_std[1]
   
    return  with_prediction.sort_values("predicted_stddev", ascending=True)

In [None]:
calculate_prediction_std_dev(transformers_judgments, feature_names)

Unnamed: 0,long_description_match,short_description_match,name_match,has_promotion,predicted_grade,predicted_stddev
2,0,0,1,0,0.2250004,4e-06
10,1,0,1,0,1.192093e-07,4e-06
14,1,1,1,0,0.0,7e-06
6,0,1,1,0,1.192093e-07,1e-05
0,0,0,0,0,0.1364695,0.79506
3,0,0,1,1,0.1364695,0.79506
8,1,0,0,0,0.0,0.79506
11,1,0,1,1,0.0,0.79506
12,1,1,0,0,0.0,0.79506
15,1,1,1,1,0.0,0.79506


## Listing 12.8 - Calculate Expected Improvement


We use [Expected Improvement](https://distill.pub/2020/bayesian-optimization/) scoring to select candidates for exploration within the `transformers dvd` query.

In [None]:
from scipy.stats import norm

def calculate_expected_improvement(training_data, feature_names, theta=0.6):
    improvement_data = calculate_prediction_std_dev(training_data, feature_names)
    improvement_data["opportunity"] = (improvement_data["predicted_grade"] -
                                       training_data["grade"].mean() - theta)

    improvement_data["prob_of_improvement"] = (norm.cdf(improvement_data["opportunity"] /
                                              improvement_data["predicted_stddev"]))

    improvement_data["expected_improvement"] = \
        (improvement_data["opportunity"] * 
        improvement_data["prob_of_improvement"] + 
        improvement_data["predicted_stddev"] *
        norm.pdf(improvement_data["opportunity"] /
                 improvement_data["predicted_stddev"]))
    
    return improvement_data.sort_values("expected_improvement", ascending=False)

In [None]:
calculate_expected_improvement(transformers_judgments, feature_names)

Unnamed: 0,long_description_match,short_description_match,name_match,has_promotion,predicted_grade,predicted_stddev,opportunity,prob_of_improvement,expected_improvement
1,0,0,0,1,0.08277285,0.929873,-0.629727,0.249134,0.138061
5,0,1,0,1,-5.960464e-08,0.929873,-0.7125,0.221769,0.118584
13,1,1,0,1,-5.960464e-08,0.929873,-0.7125,0.221769,0.118584
9,1,0,0,1,-5.960464e-08,0.929873,-0.7125,0.221769,0.118584
0,0,0,0,0,0.1364695,0.79506,-0.576031,0.234376,0.108956
3,0,0,1,1,0.1364695,0.79506,-0.576031,0.234376,0.108956
4,0,1,0,0,0.0,0.79506,-0.7125,0.185084,0.080412
7,0,1,1,1,0.0,0.79506,-0.7125,0.185084,0.080412
12,1,1,0,0,0.0,0.79506,-0.7125,0.185084,0.080412
15,1,1,1,1,0.0,0.79506,-0.7125,0.185084,0.080412


In [None]:
def predict_best_explore_candidate(query, training_data, feature_set, theta=0.6):
    with_features = associate_with_features(training_data, feature_set, "explore")
    fields = [f["name"] for f in feature_set]
    training_data = with_features[with_features["query"] == query]
    to_explore = calculate_expected_improvement(training_data, fields)
    options = to_explore.loc[:, fields]
    return options.loc[0]

In [None]:
explore_feature_set = get_latest_explore_feature_set()
predict_best_explore_candidate("transformers dvd", judgments, explore_feature_set)

Missing doc 600603132872
Missing doc 600603141003
Missing doc 600603141003
Missing doc 600603132872
Missing doc 600603124570
Missing doc 600603132827
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603139758
Missing doc 600603123061
Missing doc 600603135101
Missing doc 600603124570
Missing doc 600603139758
Missing doc 600603139758
Missing doc 600603123061
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603132872
Missing doc 600603141003
[1.0, 1.0, 1.0, 0.009569731]
[1.0, 0.0, 1.0, 0.0]
grade:0 qid:24 (transformers dvd) docid:47875839090


long_description_match     0
short_description_match    0
name_match                 0
has_promotion              0
Name: 0, dtype: int64

## Create a query to fetch `explore` docs (omitted)

Based on the selected features from the GaussianProcessRegressor, we create a query to fetch a doc that contains those features.

In [None]:
def explore_query_string(explore_vector, query=""):
    config_explore = {
        "long_description_match": {"field": "long_description", "query_dependent": True},
        "short_description_match": {"field": "short_description", "query_dependent": True},
        "name_match": {"field": "name", "query_dependent": True},
        "long_description_bm25": {"field": "long_description", "query_dependent": True},
        "manufacturer_match": {"field": "manufacturer", "query_dependent": True},
        "has_promotion": {"field": "has_promotion", "query_dependent": False, "1_value": "true"}
    }
    clauses = []
    for col_name, config in config_explore.items():
        try:
            clause = ""
            if explore_vector[col_name] == 1.0 and config["field"] == "has_promotion":
                clause = f'+{config["field"]}:'
            elif explore_vector[col_name] == -1.0:
                clause = f'-{config["field"]}:'
            if len(clause) > 0:  
                if config["query_dependent"]:
                    clause += f"({query})"
                else:
                    clause += f'{config["1_value"]}'

            clauses.append(clause)
        except KeyError as e:
            pass
    
    final_query = " ".join(clauses)
    final_query = final_query.strip()
    if len(final_query) == 0:
        return "*"
    return final_query

## Listing 12.9 - Find document to explore from Solr

Here we fetch a document that matches the properties of something missing from our training set to display to the user

In [None]:
def explore(collection, query, training_data, feature_set):
    """Explore according to the provided explore vector, select
       a random doc from that group."""
    judgments_with_features = associate_with_features(training_data, feature_set, "explore")
    fields = [f["name"] for f in feature_set]
    to_explore = calculate_expected_improvement(judgments_with_features, fields)
    explore_vector = to_explore.sort_values("expected_improvement",
                                            ascending=False) \
                                .head().iloc[0][fields]
    print(to_explore.sort_values("expected_improvement", ascending=False).head().iloc[0])

    q = explore_query_string(explore_vector, query)
    print(q)
    response = collection.search_for_random_document(q)
    return response["docs"][0]["upc"]

In [None]:
#executes same query from pre-refactor, get a different result. 
{"fields": ["upc", "name", "manufacturer", "score"],
 "params": {"q": "+has_promotion:true +short_description:(transformers dvd)",
            "sort": "random_0.9664535356921388 DESC"}}
#Should yield: 97368920347, 
#First seed to yield 97368920347: 12

random.seed(12)
judgments = generate_training_data(sessions)
explore_feature_set = get_latest_explore_feature_set()
explore(products_collection, "transformers dvd",
        judgments, explore_feature_set)

Missing doc 600603132872
Missing doc 600603141003
Missing doc 600603141003
Missing doc 600603132872
Missing doc 600603124570
Missing doc 600603132827
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603139758
Missing doc 600603123061
Missing doc 600603135101
Missing doc 600603124570
Missing doc 600603139758
Missing doc 600603139758
Missing doc 600603123061
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603132872
Missing doc 600603141003
[1.0, 1.0, 1.0, 0.009569731]
[1.0, 0.0, 1.0, 0.0]
grade:0 qid:24 (transformers dvd) docid:47875839090
long_description_match      1.000000
short_description_match     1.000000
name_match                  1.000000
has_promotion               1.000000
predicted_grade            41.516327
predicted_stddev            0.510606
opportunity                40.739650
prob_of_improvement         1.000000
expected_improvement       

'9781400532711'

## New heavily clicked doc is promoted!

```
      {
        "upc":"97360810042",
        "name":"Transformers: Dark of the Moon - Blu-ray Disc",
        "name_ngram":"Transformers: Dark of the Moon - Blu-ray Disc",
        "name_omit_norms":"Transformers: Dark of the Moon - Blu-ray Disc",
        "name_txt_en_split":"Transformers: Dark of the Moon - Blu-ray Disc",
        "manufacturer":"\\N",
        "short_description":"\\N",
        "long_description":"\\N",
        "promotion_b":true,
        "id":"72593b1c-313b-4f25-a4f2-04eae29d858b",
        "_version_":1710117636920049669
      },
```


## Simulate new sessions with the new data

We simulate new sessions, if the upc is in `might_purchase` or `wants_to_purchase`, we set it to 'clicked' with a given probability.

In [None]:
def generate_simulated_exploration_sessions(query, sessions, training_data, feature_set, n=500):
    """Conducts N (500) searches with the query and returns session data with
       simulated the simulated user behavior"""
    wants_to_purchase = ["97360724240", "97363560449", "97363532149", "97360810042", "97368920347"]
    might_purchase = ["97361312743", "97363455349", "97361372389"]
    explore_on_rank = 2.0
    with_explore_sessions = sessions.copy()
    for i in range(0, n):
        explore_upc = explore(products_collection, query, training_data, feature_set)
        print(i, explore_upc)
        sess_ids = list(set(sessions["sess_id"].tolist()))
        random.shuffle(sess_ids)
        sess_ids[0]
        new_session = sessions[sessions["sess_id"] == sess_ids[0]].copy()
        new_session["sess_id"] = 100000 + i
        new_session.loc[new_session["rank"] == explore_on_rank, "doc_id"] = explore_upc
        draw = random.random()
        new_session.loc[new_session["rank"] == explore_on_rank, "clicked"] = False
        if explore_upc in wants_to_purchase:
            if draw < 0.8:
                print(f"click {explore_upc}")
                new_session.loc[new_session["rank"] == explore_on_rank, "clicked"] = True
        elif explore_upc in might_purchase:
            if draw < 0.5:
                print(f"click {explore_upc}")
                new_session.loc[new_session["rank"] == explore_on_rank, "clicked"] = True
        else:
            if draw < 0.01:
                print(f"click {explore_upc}")
                new_session.loc[new_session["rank"] == explore_on_rank, "clicked"] = True

        with_explore_sessions = pandas.concat([with_explore_sessions, new_session])
    return with_explore_sessions

def get_sessions_with_exploration(query, sessions, training_data, feature_set):
    return generate_simulated_exploration_sessions(query, sessions, training_data, feature_set)

## Listing 12.10 - Update judgments from new sessions

Have we added any new docs that appear to be getting more clicks?

In [None]:
random.seed(1234)

query = "transformers dvd"
judgments = generate_training_data(sessions)
sessions_with_exploration = generate_simulated_exploration_sessions(query, sessions, judgments, explore_feature_set)
training_data_with_exploration = generate_training_data(
                                   sessions_with_exploration)
        
print(training_data_with_exploration.loc["transformers dvd"])
#print(sessions_with_exploration[sessions_with_exploration["sess_id"] == 100049])

Missing doc 600603132872
Missing doc 600603141003
Missing doc 600603141003
Missing doc 600603132872
Missing doc 600603124570
Missing doc 600603132827
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603139758
Missing doc 600603123061
Missing doc 600603135101
Missing doc 600603124570
Missing doc 600603139758
Missing doc 600603139758
Missing doc 600603123061
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603132872
Missing doc 600603141003
[1.0, 1.0, 1.0, 0.009569731]
[1.0, 0.0, 1.0, 0.0]
grade:0 qid:24 (transformers dvd) docid:47875839090
long_description_match      1.000000
short_description_match     1.000000
name_match                  1.000000
has_promotion               1.000000
predicted_grade            41.516327
predicted_stddev            0.510606
opportunity                40.739650
prob_of_improvement         1.000000
expected_improvement       

## Listing 12.11 - Rebuild model using updated judgments

After showing the new document to users, we can rebuild the model using judgments that cover this feature blindspot.

In [None]:
random.seed(1234)

# {'blue ray': 0.0,
# 'dryer': 0.07068309073137659,
# 'headphones': 0.06426395939086295,
# 'dark of moon': 0.25681268708548055,
# 'transformers dvd': 0.10077083021678328}

ltr.delete_feature_store(products_collection, "ltr_model_variant_2")

promotion_feature_set = [
    ltr.generate_fuzzy_query_feature("name_fuzzy", "name"),
    ltr.generate_bigram_query_feature("name_bigram", "name"),
    ltr.generate_bigram_query_feature("short_description_bigram", "short_description"),
    ltr.generate_query_feature("has_promotion", "has_promotion", value="true")]
           
train_and_evaluate_model(sessions_with_exploration, "ltr_model_variant_2",
                         feature_set)

Put feature-store:
{'responseHeader': {'status': 0, 'QTime': 1}}
Duplicate Doc in qid:0 36725236271
Duplicate Doc in qid:0 36725236271
Missing doc 600603139758
Missing doc 600603132827
Duplicate Doc in qid:2 27242815414
Duplicate Doc in qid:2 27242815414
Missing doc 600603132872
Missing doc 600603141003
Duplicate Doc in qid:4 9781400532711
Duplicate Doc in qid:4 9781400532711
Duplicate Doc in qid:5 97360810042
Duplicate Doc in qid:5 97360810042
Missing doc 600603123061
Missing doc 600603123061
Duplicate Doc in qid:8 36725236271
Duplicate Doc in qid:8 36725236271
Missing doc 600603139758
Missing doc 600603135101
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Duplicate Doc in qid:11 97360810042
Duplicate Doc in qid:11 97360810042
Duplicate Doc in qid:12 97360810042
Duplicate Doc in qid:12 97360810042
Duplicate Doc in qid:13 9781400532711
Duplicate Doc in qid:13 9781400532711
Duplicate Doc in qid:14 803238004525
Duplicate Doc in qid:14 


Liblinear failed to converge, increase the number of iterations.



Delete model ltr_model_variant_2
{'responseHeader': {'status': 0, 'QTime': 6}}
Upload model ltr_model_variant_2
{'responseHeader': {'status': 0, 'QTime': 2}}
search_with_model: search request:
{'fields': ['upc', 'name', 'manufacturer', 'score'], 'limit': 10, 'params': {'rq': '{!ltr reRankDocs=60000 reRankWeight=10.0 model=ltr_model_variant_2 efi.fuzzy_keywords="~dryer" efi.squeezed_keywords="dryer" efi.keywords="dryer"}', 'qf': 'name name_ngram upc manufacturer short_description long_description', 'defType': 'edismax', 'q': 'dryer'}}
search_with_model: search response:
{'responseHeader': {'zkConnected': True, 'status': 0, 'QTime': 4}, 'response': {'numFound': 812, 'start': 0, 'maxScore': 4.134992, 'numFoundExact': True, 'docs': [{'upc': '14633195798', 'name': 'The Sims 3: Town Life Stuff - Mac/Windows', 'manufacturer': 'Electronic Arts', 'score': 0.04277634}, {'upc': '50946924311', 'name': 'Whirlpool - Duet&#x2122; Stack Kit III', 'manufacturer': 'Whirlpool', 'score': 0.04277634}, {'up

{'dryer': 0.0,
 'blue ray': 0.0,
 'headphones': 0.0,
 'dark of moon': 0.10870000186379851,
 'transformers dvd': 0.07329888093414069}

In [None]:
ltr.search_with_model("transformers dvd", "ltr_model_variant_2", rows=5)

[{'upc': '32429037763',
  'name': 'Transformers - DVD',
  'manufacturer': ' ',
  'score': 0.19473614,
  'rank': 0},
 {'upc': '93624995012',
  'name': 'Transformers - Original Soundtrack - CD',
  'manufacturer': 'Warner Bros.',
  'score': 0.06461063,
  'rank': 1},
 {'upc': '826663126044',
  'name': 'Transformers Japanese Collection: Headmasters - DVD',
  'manufacturer': ' ',
  'score': 0.062151596,
  'rank': 2},
 {'upc': '47875839090',
  'name': 'Transformers: Cybertron Adventures - Nintendo Wii',
  'manufacturer': 'Activision',
  'score': 0.059729535,
  'rank': 3},
 {'upc': '47875819733',
  'name': 'Transformers: The Game - PlayStation 3',
  'manufacturer': 'Activision',
  'score': 0.05891931,
  'rank': 4}]

## Listing 12.12 - Rerun A/B test on new `promotion` model

In [None]:
random.seed(1234)

simulated_purchases = simulate_user_a_b_test("transformers dvd",
                                             "ltr_model_variant_0",
                                             "ltr_model_variant_2")
simulated_purchases 

{'ltr_model_variant_0': 21, 'ltr_model_variant_2': 16}

## Listing 12.13 - Fully Automated LTR Loop

These lines expand Listing 12.13 from the book (the book content is a truncated form of what's below). You could put this in a loop and constantly try new features to try to get closer at a generalized ranking solution of what users actually want.

In [None]:
random.seed(1234)
ltr.delete_feature_store(products_collection, "aips_feature_store")

def gather_latest_sessions(query, sessions, feature_set):
    """For the sake of the examples, returns a static list of session data.
       In a production environment, this would the most up to date user interactions"""
    training_data = generate_training_data(sessions)
    latest_sessions = generate_simulated_exploration_sessions(query,
                                                              sessions,
                                                              training_data,
                                                              feature_set,
                                                              1)
    return latest_sessions

def train_and_deploy_model(sessions, model_name, feature_set):
    judgments = generate_training_data(sessions)
    train, test = split_training_data(judgments, 0.8)
    train_ranksvm_model(train, model_name, feature_set=feature_set)

def is_improvement(evaluation1, evaluation2):
    #Model comparison is stubbed out
    return True
    
def wait_for_more_sessions(t):
    time.sleep(t)

def ltr_retraining_loop(latest_sessions, iterations=sys.maxsize,
                        retrain_frequency=60 * 60 * 24):
    exploit_feature_set = get_exploit_feature_set()
    train_and_deploy_model(latest_sessions,
                           "exploit",
                           exploit_feature_set)
    for i in range(0, iterations):
        judgments = generate_training_data(latest_sessions)
        train, test = split_training_data(judgments)
        if i > 0:
            previous_explore_model_name = f"explore_variant_{i-1}"
            exploit_model_evaluation = evaluate_model(test, "exploit", judgments)
            explore_model_evaluation = evaluate_model(test, previous_explore_model_name, judgments)
            print(f"Exploit evaluation: {exploit_model_evaluation}")
            print(f"Explore evaluation: {explore_model_evaluation}")
            if is_improvement(explore_model_evaluation, exploit_model_evaluation):
                print("Promoting previous explore model")
                train_and_deploy_model(latest_sessions,
                                      "exploit",
                                       explore_feature_set)
                
        explore_feature_set = get_latest_explore_feature_set()
        train_and_deploy_model(latest_sessions,
                               f"explore_variant_{i}",
                               explore_feature_set)
        
        wait_for_more_sessions(retrain_frequency)
        latest_sessions = gather_latest_sessions("transformers dvd", latest_sessions, explore_feature_set)

ltr_retraining_loop(sessions, 5, 0)

Put feature-store:
{'responseHeader': {'status': 0, 'QTime': 2}}
Missing doc 600603139758
Missing doc 600603132827
Missing doc 600603132872
Missing doc 600603141003
Missing doc 600603123061
Missing doc 600603123061
Missing doc 600603139758
Missing doc 600603135101
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603124570
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603141003
Missing doc 600603132872
Missing doc 600603139758
[2.7444975, 3.962, 0.0]
[2.1676683, 1.6966469, 0.0]
grade:0 qid:19 (television, lcd) docid:729507813059
[LibLinear]


Liblinear failed to converge, increase the number of iterations.



Delete model exploit
{'responseHeader': {'status': 0, 'QTime': 7}}
Upload model exploit
{'responseHeader': {'status': 0, 'QTime': 2}}
Put feature-store:
{'responseHeader': {'status': 0, 'QTime': 1}}
Missing doc 600603139758
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
Missing doc 600603139758
Missing doc 600603124570
Missing doc 600603123061
Missing doc 600603123061
Missing doc 600603139758
Missing doc 600603135101
Missing doc 600603132872
Missing doc 600603141003
Missing doc 600603141003
Missing doc 600603132872
Missing doc 600603140631
Missing doc 600603125065
Missing doc 600603132827
Missing doc 600603133237
[1.0, 0.0, 1.0, 0.009569731]
[1.0, 0.0, 1.0, 0.0]
grade:1 qid:19 (dark of the moon) docid:47875842335
[LibLinear]Delete model explore_variant_0
{'responseHeader': {'status': 0, 'QTime': 3}}
Upload model explore_variant_0
{'responseHeader': {'status': 0, 'QTime': 2}}
Missing doc 600603132872
Missing doc 600603141003
Missing d

KeyboardInterrupt: 

Up next: [Chapter 13: Semantic Search with Dense Vectors](../ch13/1.setting-up-the-outdoors-dataset.ipynb)