Before you start creating custom scorers, make sure you have created and configured a Solr cluster and trained a ranker, either by following the steps in the notebook "Answer-Retrieval.ipynb" or by using some other tooling.

If you used the Python notebook mentioned above, you are ready to start. Otherwise, enter the required information in the credentials.json file (the code cells below load it from ../config/credentials.json, relative to this notebook's directory). The cells in this notebook use that file to read and write the credentials and constant values needed in the steps ahead.

## 1. Create a Custom Scorer

Custom scorers should be defined and configured within the custom-scorer folder.

In this step, you will create a custom feature to improve the ranking phase. In general, you can engineer multiple custom features, depending on the use case.

The custom scorer in this tutorial adds a feature related to the number of up votes an answer received on Stack Exchange. This feature is related to the content of a document. You can also create scorers that work on the content of the query or on both (query and document).

The class that implements your scorer can be added to the corresponding package (one of document, query, query_document) in the rr_custom_scorers project. This is what the class for the up votes scorer looks like:

--------------------------------------------------------------------------
from document_scorer import DocumentScorer

class UpVoteScorer(DocumentScorer):

    def __init__(self, name='UpVoteScorer', short_name='uvs1',
                 description='Score based on the number of up votes a document received',
                 include_stop=False):
        """ Scorer that consumes a Solr document and scores it based on the
            number of up votes the answer received

            Args:
                name (str): Name of the Scorer
                short_name (str): Used for the header which is sent to ranker
                description (str): Description of the scorer
                include_stop (bool): Not used by this scorer
        """
        super(UpVoteScorer, self).__init__(name=name, short_name=short_name, description=description)

    def score(self, document):
        upVote = document['upModVotes']

        # A missing or non-positive vote count gets the lowest score
        if upVote is None or upVote <= 0:
            return 0
        elif upVote <= 3:
            return 0.15
        elif upVote <= 5:
            return 0.35
        elif upVote <= 8:
            return 0.55
        elif upVote <= 11:
            return 0.75
        elif upVote <= 14:
            return 0.85
        else:
            return 1

----------------------------------------------------------------------------

Notice that the scorer uses a field stored in the document (upModVotes). Make sure the data your scorer needs is stored in your Solr documents.

Some preprocessing might be required in order to define the logic of the scorer (the normalization criteria, for example). For this example, a histogram of up votes was analyzed in order to define the normalization ranges.
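The normalization ranges can also be written as a lookup table, which may be easier to adjust after inspecting the histogram. The following is a minimal sketch, assuming the same thresholds as the class above; `normalize_up_votes`, `UPPER_BOUNDS` and `BUCKET_SCORES` are hypothetical names and are not part of the rr_custom_scorers project. Unlike a plain chain of comparisons, it also maps negative counts to 0.

```python
from bisect import bisect_left

# Inclusive upper bound of each bucket (same thresholds as the scorer above)
UPPER_BOUNDS = [0, 3, 5, 8, 11, 14]
# Score for each bucket; the last entry covers every count above 14
BUCKET_SCORES = [0, 0.15, 0.35, 0.55, 0.75, 0.85, 1]

def normalize_up_votes(up_votes):
    """Map a raw up vote count to a score between 0 and 1."""
    if up_votes is None or up_votes <= 0:
        return 0
    # bisect_left finds the first bucket whose upper bound is >= up_votes
    return BUCKET_SCORES[bisect_left(UPPER_BOUNDS, up_votes)]

for votes in (0, 3, 4, 14, 15):
    print(votes, normalize_up_votes(votes))
```

Changing a range then only requires editing the two lists, not the control flow.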

Once you have created the class that implements the custom scorer, edit the file features.json within the config directory of the project rr_custom_scorers_proxy_app to add the custom feature information. It should look like this:

{
  "scorers":[
    {
      "init_args":{
        "name":"UpVoteScorer",
        "short_name":"uvs1",
        "description":"Score based on the number of up votes a document received"
      },
      "type":"document",
      "module":"document_upvote_scorer",
      "class":"UpVoteScorer"
    }
  ]
}

If you write more than one scorer, add an entry for each one to the "scorers" list.
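Before starting the proxy app, it can help to sanity-check the configuration. The sketch below is only an illustration: `check_features_config` is a hypothetical helper, and the set of required keys is an assumption inferred from the example entry above, not a documented schema of the proxy app.

```python
import json

# Assumed from the example entry above; the proxy app may accept other keys
REQUIRED_KEYS = ("init_args", "type", "module", "class")
# The three scorer packages named earlier in this notebook
SCORER_TYPES = ("document", "query", "query_document")

def check_features_config(config):
    """Return a list of problems found in a parsed features.json dict."""
    problems = []
    for index, scorer in enumerate(config.get("scorers", [])):
        for key in REQUIRED_KEYS:
            if key not in scorer:
                problems.append("scorer %d: missing key %r" % (index, key))
        if scorer.get("type") not in SCORER_TYPES:
            problems.append("scorer %d: unexpected type %r" % (index, scorer.get("type")))
    return problems

example = json.loads("""
{
  "scorers": [
    {
      "init_args": {"name": "UpVoteScorer", "short_name": "uvs1",
                    "description": "Score based on the number of up votes a document received"},
      "type": "document",
      "module": "document_upvote_scorer",
      "class": "UpVoteScorer"
    }
  ]
}
""")
print(check_features_config(example))
```

An empty list means no problems were found under these assumptions.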


## 2. Install Dependencies and Compile Custom Scorers

In [None]:
import subprocess
import shlex
import os
from shutil import copyfile

#getting current directory
curdir = os.getcwd()

CUSTOM_SCORERS_PATH=curdir+'/../custom-scorer'

WHEEL_FILE='retrieve_and_rank_scorer-0.0.1-py2-none-any.whl'

#creating wheel in /custom-scorer/ and installing it using pip install
#Note: Wheel is a packaging framework for Python and is used
#for packaging the custom scorer project

#Both commands run in one shell so that pip install runs in the directory
#where the wheel was built. os.system does not raise when a command fails,
#so the exit status is checked instead of relying on try/except.
status = os.system("cd "+CUSTOM_SCORERS_PATH+" && pip wheel . && pip install "+WHEEL_FILE)
if status == 0:
    print('Successfully installed '+WHEEL_FILE+' package.')
else:
    print('Command to install custom scorer whl file failed.')


## 3. Create service.cfg File

This step writes your Retrieve and Rank (RR) credentials to the service.cfg file, which the proxy app expects.
To create the file, run the code below.


In [None]:
import os
import json

#getting current directory
curdir = os.getcwd()

#loading credentials
credFilePath = curdir+'/../config/credentials.json'
with open(credFilePath) as credFile:
    credentials = json.load(credFile)

SERVICE_CFG_PATH=curdir+'/../config'

#creating service.cfg file
serviceCfgPath = SERVICE_CFG_PATH+'/service.cfg'
with open(serviceCfgPath, 'w') as serviceCfgFile:
    serviceCfgFile.write('SOLR_CLUSTER_ID='+credentials['cluster_id']+'\n')
    serviceCfgFile.write('SOLR_COLLECTION_NAME='+credentials['collection_name']+'\n')
    serviceCfgFile.write('RETRIEVE_AND_RANK_BASE_URL=https://gateway.watsonplatform.net/retrieve-and-rank/api'+'\n')
    serviceCfgFile.write('RETRIEVE_AND_RANK_USERNAME='+credentials['username']+'\n')
    serviceCfgFile.write('RETRIEVE_AND_RANK_PASSWORD='+credentials['password']+'\n')
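To confirm that the file was written as intended, the KEY=VALUE lines can be read back into a dictionary. This is a minimal sketch; `parse_service_cfg` is a hypothetical helper, and how the proxy app itself parses service.cfg is not shown in this notebook.

```python
def parse_service_cfg(text):
    """Parse KEY=VALUE lines into a dict, ignoring blank lines."""
    settings = {}
    for line in text.splitlines():
        if line.strip():
            # Split on the first '=' only, so values may contain '='
            key, _, value = line.partition('=')
            settings[key.strip()] = value.strip()
    return settings

sample = "SOLR_CLUSTER_ID=my_cluster\nSOLR_COLLECTION_NAME=my_collection\n"
settings = parse_service_cfg(sample)
print(sorted(settings.keys()))
```

With a real file, pass the result of `open(serviceCfgPath).read()` instead of the sample string.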


## 4. Start the Server

Start the python flask server by using your command line or terminal as shown below.

$ python server.py 

## 5. Generate Training Data

Once the training ground truth file is ready, you need to generate a file that contains the feature vectors for each question. This file will also contain the new features added by your custom scorers.

To generate the trainingdata.csv file:

0. Make sure the proxy app is running
1. Edit bin/python/trainproxy.py: update fl (fields) in lines 83 and 95 to include the fields used by the added custom scorers
2. Run the code below. It has two phases: generating the trainingdata.csv file and sending the request to
    create a ranker to the service. You know the request has been sent when the output shows something like:
    {"ranker_id":"3b140ax15-rank-2018", "name":"rr_ask_ranker_cs", "created":"2016-05-19T14:51:50.635Z", "url":"https://gateway.watsonplatform.net/retrieve-and-rank/api/v1/rankers/3b140ax15-rank-2018", "status":"Training", "status_description":"The ranker instance is in its training phase"}
3. Validate trainingdata.csv has the new feature
    (e.g. question_id,f0,f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,r1,r2,s,newfeature,ground_truth)
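The header check in step 3 can be scripted. The following is a minimal sketch; `missing_feature_columns` is a hypothetical helper, and the column name `uvs1` assumes that the new feature appears in the header under the scorer's short_name.

```python
import csv
import io

def missing_feature_columns(csv_file, expected_features):
    """Return the expected feature names that are absent from the CSV header."""
    header = next(csv.reader(csv_file))
    return [name for name in expected_features if name not in header]

# In-memory stand-in for trainingdata.csv; the values are made up
sample = io.StringIO(u"question_id,f0,f1,r1,r2,s,uvs1,ground_truth\n"
                     u"1,0.1,0.2,0.3,0.4,0.5,0.15,1\n")
print(missing_feature_columns(sample, ["uvs1"]))
```

For the real file, open trainingdata.csv and pass the file object instead of the in-memory sample. An empty list means every expected feature column is present.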
    
### NOTE : THIS STEP COULD TAKE A LONG TIME! 

In [None]:
import subprocess
import json
import shlex
import os

#getting current directory
curdir = os.getcwd()

#loading credentials
credFilePath = curdir+'/../config/credentials.json'
with open(credFilePath) as credFile:
    credentials = json.load(credFile)

USERNAME=credentials['username']
PASSWORD=credentials['password']
SOLR_CLUSTER_ID=credentials['cluster_id']
COLLECTION_NAME=credentials['collection_name']
TRAIN_FILE_PATH=curdir+'/../bin/python'
GROUND_TRUTH_FILE=curdir+"/../data/groundtruth/answerGT_train.csv"

#Running command that trains a ranker
cmd = 'python %s/trainproxy.py -u %s:%s -i %s -c %s -x %s -n %s' %\
    (TRAIN_FILE_PATH, USERNAME, PASSWORD, GROUND_TRUTH_FILE, SOLR_CLUSTER_ID, COLLECTION_NAME, "ranker")
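output = None
try:
    process = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE)
    output = process.communicate()[0]
    #showing the response so you can confirm the request was sent
    print (output)
except Exception:
    print ('Command:')
    print (cmd)
    print ('Response:')
    print (output)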

## 6. Train a New Ranker with Custom Scorer(s)

To train a ranker which will consider the added features:

0. Make sure the proxy app is running
1. Run the code below

    NOTE: In this step you will need to provide the value for the RANKER_NAME constant. 

### NOTE : THIS STEP COULD TAKE A LONG TIME! 

In [None]:
import subprocess
import json
import shlex
import os


#getting current directory
curdir = os.getcwd()

#loading credentials
credFilePath = curdir+'/../config/credentials.json'
with open(credFilePath) as credFile:
    credentials = json.load(credFile)
    

BASEURL=credentials['url']
RANKER_URL=BASEURL+"rankers"
USERNAME=credentials['username']
PASSWORD=credentials['password']
TRAINING_DATA=curdir+'/../data/groundtruth/trainingdata.csv'

#please provide the ranker name
RANKER_NAME="rr_ask_cs_ranker"

#Checking if ranker with same name already exists
curl_cmd = 'curl -u "%s":"%s" "%s"' %(USERNAME, PASSWORD, RANKER_URL)
process = subprocess.Popen(shlex.split(curl_cmd), stdout=subprocess.PIPE)
output = process.communicate()[0]
found = False
ranker_id = ''
try:
    parsed_json = json.loads(output)
    rankers = parsed_json['rankers']
    for i in range(len(rankers)):
        ranker_json = rankers[i]
        if ranker_json['name'] == RANKER_NAME:
            found = True
            ranker_id = ranker_json['ranker_id']
except:
    print ('Command:')
    print (curl_cmd)
    print ('Response:')
    print (output) 

if found:
    print ("Ranker "+RANKER_NAME+" already exists with ID "+ranker_id+".")
    print (json.dumps(parsed_json, sort_keys=True, indent=4))
else:
    #Running command that trains a ranker
    cmd = 'curl -k -X POST -u %s:%s -F training_data=@%s -F training_metadata="{\\"name\\":\\"%s\\"}" %s' %\
        (USERNAME, PASSWORD, TRAINING_DATA, RANKER_NAME, RANKER_URL)
    process = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE)
    output = process.communicate()[0]
    print (cmd)
    try:
        parsed_json = json.loads(output)
        print (json.dumps(parsed_json, sort_keys=True, indent=4))
        credentials['cs_ranker_id'] = parsed_json['ranker_id']
        with open(credFilePath, 'w') as credFileUpdated:
            json.dump(credentials, credFileUpdated)
            
    except:
        print ('Command:')
        print (cmd)
        print ('Response:')
        print (output)

## 7. Check Status of a Ranker with Custom Scorers

If the generation of training data finished successfully and the output showed that a new ranker has been created and is in its training phase, you can query the status to check when it becomes "Available" by running the code below.

In [None]:
import subprocess
import json
import shlex
import os

#getting current directory
curdir = os.getcwd()

#loading credentials
credFilePath = curdir+'/../config/credentials.json'
with open(credFilePath) as credFile:
    credentials = json.load(credFile)

BASEURL=credentials['url']
RANKER_URL=BASEURL+"rankers"
USERNAME=credentials['username']
PASSWORD=credentials['password']
RANKER_ID=credentials['cs_ranker_id']

#Running command that checks the status of a ranker
curl_cmd = 'curl -u %s:%s %s/%s' % (USERNAME, PASSWORD, RANKER_URL, RANKER_ID)
process = subprocess.Popen(shlex.split(curl_cmd), stdout=subprocess.PIPE)
output = process.communicate()[0]
try:
    parsed_json = json.loads(output)
    print (json.dumps(parsed_json, sort_keys=True, indent=4))
except:
    print ('Command:')
    print (curl_cmd)
    print ('Response:')
    print (output)
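Because training can take a while, the status check can be repeated until the ranker leaves the "Training" state. This is a minimal sketch: `wait_until_available` and its `fetch_status` argument are hypothetical, with `fetch_status` standing in for any function that runs the status request above and returns the "status" string. The demo uses canned responses and a no-op sleep so that it runs instantly.

```python
import time

def wait_until_available(fetch_status, interval_seconds=30, max_checks=120, sleep=time.sleep):
    """Call fetch_status() until it stops returning 'Training' or the checks run out.

    Returns the last status seen, so the caller can tell "Available"
    apart from any other outcome.
    """
    status = None
    for _ in range(max_checks):
        status = fetch_status()
        if status != "Training":
            break
        sleep(interval_seconds)
    return status

# Demo with canned responses instead of real requests
fake_responses = iter(["Training", "Training", "Available"])
print(wait_until_available(lambda: next(fake_responses), sleep=lambda seconds: None))
```

The interval and the maximum number of checks are arbitrary defaults; adjust them to how long training takes for your data.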

### Note : Before running experiments, you should update the .env file with the ranker ID of the trained ranker and restart the Python Flask application

## 8. Run Experiments with New Ranker using Custom Scorer(s)

To test a ranker using custom scorers:

0. Make sure the proxy app is running
1. Edit bin/python/testproxy.py
2. Edit fl (fields) in lines 85, 178 to consider the fields used by the added custom scorers
3. Run the code below to:
    - Edit service.cfg
    - Create the experiments folder in /data (if you run this multiple times, rename the folder to keep the
        results you want; otherwise they will be overwritten)
    - Run experiment_with_custom_scorers.sh to run the test experiment. This step will generate the files needed
        to analyze the custom scorers' performance

### NOTE : THIS STEP COULD TAKE A LONG TIME! 


In [None]:
import subprocess
import json
import shlex
import os

#getting current directory
curdir = os.getcwd()

#please provide the full path to the file answerGT_test.csv
TEST_FILE=curdir+"/../data/groundtruth/answerGT_test.csv"

#loading credentials
credFilePath = curdir+'/../config/credentials.json'
with open(credFilePath) as credFile:
    credentials = json.load(credFile)

#editing service.cfg file
SERVICE_CFG_PATH=curdir+'/../config'

SERVICE_CFG_PATH = SERVICE_CFG_PATH+'/service.cfg'

#adding ranker and test file
with open(SERVICE_CFG_PATH, 'w') as serviceCfgFile:
    serviceCfgFile.write('SOLR_CLUSTER_ID='+credentials['cluster_id']+'\n')
    serviceCfgFile.write('SOLR_COLLECTION_NAME='+credentials['collection_name']+'\n')
    serviceCfgFile.write('RETRIEVE_AND_RANK_BASE_URL=https://gateway.watsonplatform.net/retrieve-and-rank/api'+'\n')
    serviceCfgFile.write('RETRIEVE_AND_RANK_USERNAME='+credentials['username']+'\n')
    serviceCfgFile.write('RETRIEVE_AND_RANK_PASSWORD='+credentials['password']+'\n')
    serviceCfgFile.write('RANKER_ID='+credentials['ranker_id']+'\n')
    serviceCfgFile.write('TEST_RELEVANCE_FILE='+TEST_FILE+'\n')
    
#creating experiments directory
curdir = os.getcwd()

DATA_PATH=curdir+'/../data'

cmd1 = 'mkdir '+DATA_PATH+'/experiments'
try:
    process = subprocess.Popen(shlex.split(cmd1), stdout=subprocess.PIPE)
    output = process.communicate()[0]
except:
    print ('Command to create experiments directory failed:')
    print (cmd1)

#running experiment script
try:
    os.system("cd "+curdir+"/../; ./bin/bash/experiment.sh "+SERVICE_CFG_PATH\
              +" "+DATA_PATH+"/experiments")
except:
    print ('Command to run experiment failed.')

#changing ranker id to reflect custom scorer ranker
with open(SERVICE_CFG_PATH, 'w') as serviceCfgFile:
    serviceCfgFile.write('SOLR_CLUSTER_ID='+credentials['cluster_id']+'\n')
    serviceCfgFile.write('SOLR_COLLECTION_NAME='+credentials['collection_name']+'\n')
    serviceCfgFile.write('RETRIEVE_AND_RANK_BASE_URL=https://gateway.watsonplatform.net/retrieve-and-rank/api'+'\n')
    serviceCfgFile.write('RETRIEVE_AND_RANK_USERNAME='+credentials['username']+'\n')
    serviceCfgFile.write('RETRIEVE_AND_RANK_PASSWORD='+credentials['password']+'\n')
    serviceCfgFile.write('RANKER_ID='+credentials['cs_ranker_id']+'\n')
    serviceCfgFile.write('TEST_RELEVANCE_FILE='+TEST_FILE+'\n')
    
#running experiment with custom scorers script
try:
    os.system("cd "+curdir+"/../; ./bin/bash/experiment_with_custom_scorers.sh "+SERVICE_CFG_PATH\
              +" "+DATA_PATH+"/experiments/")
except:
    print ('Command to run experiment with custom scorers failed.')

## 9. Test Ranker with Custom Scorer(s)

To test the ranker, submit a query and observe the results returned by running the command below.

    NOTE: In this step you should provide the value for the QUESTION constant.

In [None]:
import subprocess
import json
import shlex
import os

#getting current directory
curdir = os.getcwd()

#loading credentials
credFilePath = curdir+'/../config/credentials.json'
with open(credFilePath) as credFile:
    credentials = json.load(credFile)

USERNAME=credentials['username']
PASSWORD=credentials['password']
RANKER_ID=credentials['cs_ranker_id']

#please provide the query to test
QUESTION="what is the best city to visit in brazil"

#Running command that queries the ranker through the local proxy app
#(curl sends a GET request by default, so no method argument is needed)
QUESTION = QUESTION.replace(" ","%20")
curl_cmd = 'curl "http://localhost:3000/api/custom_ranker?q=%s&wt=json&fl=id,title,subtitle,answer,\
answerScore,userReputation,views,upModVotes,downModVotes,userId,username,tags,authorUsername,authorUserId&\
ranker_id=%s&fq=\'\'"' %(QUESTION, RANKER_ID)
print (curl_cmd)
process = subprocess.Popen(shlex.split(curl_cmd), stdout=subprocess.PIPE)
output = process.communicate()[0]
try:
    parsed_json = json.loads(output)
    print (json.dumps(parsed_json, sort_keys=True, indent=4))
except:
    print ('Command:')
    print (curl_cmd)
    print ('Response:')
    print (output)

## 10. Analyze Experiment Results

Using the iPython notebook launched in Step 2 (testing step), repeat the analysis from Step 2. You will need to add a new experiment variable so that the experiment with custom features is included in the analysis charts and in the NDCG score calculations.

In [None]:
import os
import pandas as pd
import numpy as np
import json
import matplotlib.pyplot as plt
%matplotlib inline
import functools
import requests
import random
import seaborn as sns
import analysis_utils as au
sns.set_context("notebook", font_scale=1.5, rc={"lines.linewidth": 2.5})
sns.set_style('darkgrid')
from collections import defaultdict, Counter

In [None]:
# Parameters
#getting current directory
curdir = os.getcwd()
base_directory=curdir+'/../data'
experiments_directory = os.path.join(base_directory, 'experiments')

In [None]:
# Solr experiment
solr_experiment_path = os.path.join(experiments_directory, 'exp_solr_only.json')
solr_experiment = au.RetrieveAndRankExperiment(experiment_file_path=solr_experiment_path)
solr_entries = solr_experiment.experiment_entries

# RR experiment
rr_experiment_path = os.path.join(experiments_directory, 'exp_retrieve_and_rank.json')
rr_experiment = au.RetrieveAndRankExperiment(experiment_file_path=rr_experiment_path)
rr_entries = rr_experiment.experiment_entries

# RR experiment with custom scorers
rr_experiment_path_scorer = os.path.join(experiments_directory, 'exp_retrieve_and_rank_scorers.json')
rr_experiment_scorer = au.RetrieveAndRankExperiment(experiment_file_path=rr_experiment_path_scorer)
rr_entries_scorer = rr_experiment_scorer.experiment_entries

In [None]:
def query_solr(query, fq=None, wt='json', fl='id,title,subtitle,answer,answerScore,upModVotes', num_rows=10):
    " Query standalone Solr "
    params = dict(q=query, wt=wt, fl=fl, rows=num_rows)
    if fq is not None:
        params['fq'] = fq
    return solr_experiment.rr_service.select(params)

def query_retrieve_and_rank(query, fq=None, wt='json', fl='id,title,subtitle,answer,answerScore,upModVotes', num_rows=10):
    " Query the retrieve and rank API "
    ranker_id = rr_experiment.ranker_id
    params = dict(q=query, ranker_id=ranker_id, wt=wt, fl=fl, rows=num_rows)
    if fq is not None:
        params['fq'] = fq
    return rr_experiment.rr_service.fcselect(params)

def get_doc_by_id(doc_id, query=None):
    " Get the solr document, if we have the id"
    query = '*:*' if query is None else query
    # Pass the caller's query through instead of always using '*:*'
    resp = query_solr(query=query, fq='id:%s' % doc_id, fl='id,title,subtitle,answer,answerScore,upModVotes')
    if resp.ok:
        docs = resp.json().get('response', {}).get('docs', [])
        if len(docs) > 0 and docs[0]['id'] == doc_id:
            return docs[0]
        elif len(docs) == 0:
            raise ValueError('No docs returned. Response json : %r' % resp.json())
        else:
            raise ValueError('ID of top document does not match. Response json : %r' % resp.json())
    else:
        # raise_for_status() raises the HTTP error itself
        resp.raise_for_status()

## 11. Experiment Analysis

### Total Relevance

Total Relevance measures, for each query sent to the ranker, the % of answers in the top N documents that were relevant.

For example, suppose a query has 8 relevant documents. The first 5 documents in the response from the ranker are relevant but the next 5 documents are all irrelevant. Total Relevance would be calculated as follows:

    Total Relevance@001 = 1 relevant document in top 1 / 1 possible relevant document in top 1 = 1.00
    ...
    Total Relevance@005 = 5 relevant documents in top 5 / 5 possible relevant documents in top 5 = 1.00
    ...
    Total Relevance@008 = 5 relevant documents in top 8 / 8 possible relevant documents in top 8 = 0.625
    ...
    Total Relevance@010 = 5 relevant documents in top 10 / 8 possible relevant documents in top 10 = 0.625

Thus Total Relevance measures the % of documents in the top N documents that are relevant, as compared to the maximum number of relevant documents that could be returned in the top N.
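The calculation in this example can be sketched in a few lines. `total_relevance_sketch` is a hypothetical function written only to reproduce the arithmetic above; it is not the implementation in analysis_utils, which may differ.

```python
def total_relevance_sketch(relevance_flags, total_relevant, n):
    """Relevant documents found in the top n, divided by the most that could be found.

    relevance_flags: 1 (relevant) or 0 (irrelevant) for each returned
                     document, in rank order
    total_relevant:  number of relevant documents that exist for the query
    """
    found = sum(relevance_flags[:n])
    possible = min(n, total_relevant)
    return found / float(possible) if possible else 0.0

# The example above: 8 relevant documents exist, the first 5 results are relevant
flags = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
for n in (1, 5, 8, 10):
    print(n, total_relevance_sketch(flags, 8, n))
```

The denominator is capped at the number of relevant documents that exist, which is why the value stays at 0.625 from the top 8 onwards.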

This concludes the custom features section. The proxy app is probably still running in a terminal session. To stop it, press Ctrl+C in that terminal.

In [None]:
# Define the function
plot_total_relevance_at_n = functools.partial(au.plot_relevance_results, func=au.total_relevance_at_n,
                                              xlabel='Documents@00N Index', ylabel='Relevance %',
                                              title='Avg. % of Documents in Top N that are Relevant')

In [None]:
plot_total_relevance_at_n([solr_entries, rr_entries,rr_entries_scorer],
                          legend=['Solr','RR','RR with Custom Scorer'],
                          title='Total relevance with UpVote scorer')

### Normalized Discounted Cumulative Gain (NDCG)

Discounted Cumulative Gain (DCG) is an information retrieval metric that takes into account both the relevance of the returned documents and the positions at which they appear. Normalized Discounted Cumulative Gain (NDCG) normalizes DCG by the DCG of the optimal ranking of the results.

Notation:
rel_i = relevance of the ith document (i = 1 is the first document returned)

DCG@00N = rel_1 + sum(rel_i / log2(i + 1) for i in 2..N)

IDCG@00N = DCG@00N of the optimal ordering of the result set

NDCG@00N = DCG@00N / IDCG@00N

For example, consider a query that has 3 relevant documents. One of these 3 documents has relevance 2 and the other 2 have relevance 1 (and naturally all other documents have relevance 0). Assume the relevance of the documents in the result set is as follows:

RS = [1, 0, 2, 1, 0, ...] (first retrieved document has relevance 1, second has relevance 0, etc.)

The optimal result set would have most relevant documents first. Here is the ideal ordering for this problem:
IRS = [2, 1, 1, 0, 0, ...]

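After doing the math, we get:

DCG@005  = 1 + 0/log2(3) + 2/log2(4) + 1/log2(5) + 0/log2(6) = 2.43

IDCG@005 = 2 + 1/log2(3) + 1/log2(4) + 0 + 0 = 3.13

NDCG@005 = 2.43 / 3.13 = 0.78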
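The worked example can be reproduced with a short script. This is a minimal sketch of the formula as used in the example, with positions counted from 1 and a discount of log2(position + 1). `dcg_at_n` and `ndcg_at_n` are hypothetical names and not the functions in analysis_utils. The ideal ordering is obtained by sorting the result list itself, which assumes that every relevant document appears somewhere in that list, as it does in this example.

```python
import math

def dcg_at_n(relevances, n):
    """DCG over the first n relevances, with positions counted from 1."""
    return sum(rel / math.log(position + 1, 2)
               for position, rel in enumerate(relevances[:n], start=1))

def ndcg_at_n(relevances, n):
    """DCG divided by the DCG of the same relevances in descending order."""
    ideal = sorted(relevances, reverse=True)
    idcg = dcg_at_n(ideal, n)
    return dcg_at_n(relevances, n) / idcg if idcg else 0.0

rs = [1, 0, 2, 1, 0]   # result set from the example
irs = [2, 1, 1, 0, 0]  # ideal ordering from the example
print(round(dcg_at_n(rs, 5), 2))
print(round(dcg_at_n(irs, 5), 2))
print(round(ndcg_at_n(rs, 5), 2))
```

The first position has a discount of log2(2) = 1, so the first document contributes its full relevance.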

In [None]:
absolute_strategy_ndcg = functools.partial(au.experiment_average_ndcg, method='absolute')
relative_strategy_ndcg = functools.partial(au.experiment_average_ndcg, method='relative')
plot_absolute_ndcg = functools.partial(au.plot_relevance_results, func=absolute_strategy_ndcg,
                                      xlabel='Documents@00N Index', ylabel='NDCG',
                                      title='Absolute NDCG@00N')
plot_relative_ndcg = functools.partial(au.plot_relevance_results, func=relative_strategy_ndcg,
                                        xlabel='Documents@00N Index', ylabel='NDCG',
                                        title='Relative NDCG@00N')

In [None]:
plot_absolute_ndcg([solr_entries, rr_entries,rr_entries_scorer],
                    legend=['Solr', 'RR','RR with custom scorer'],
                    title='Absolute NDCG')

In [None]:
plot_relative_ndcg([solr_entries, rr_entries,rr_entries_scorer],
                    legend=['Solr', 'RR','RR with custom scorer'],
                    title='Relative NDCG')

## Cleanup and Shutdown

In [None]:
import os
import subprocess
import shlex

#getting current directory
curdir = os.getcwd()

CUSTOM_SCORERS_ANSWERS_PATH=curdir+'/../data/groundtruth/'

#Remove temporary answers csv files generated by custom scorers
cmd1 = 'rm '+CUSTOM_SCORERS_ANSWERS_PATH+'answer_*.csv'

try:
    os.system(cmd1)
    print (cmd1)
except:
    print ('Removal of .csv file from answers directory failed:')
    print (cmd1)
