## Step 5 - Random Click Model
Consider two different click models: (a) the Random Click Model (RCM), and (b) one of the remaining three aforementioned models. The parameters of some of these models can be estimated with Maximum Likelihood Estimation (MLE), while others require Expectation-Maximization (EM). Implement the two models so that each has (a) a method that learns the model parameters from a set of training data, (b) a method that predicts the click probability of each document given a ranked list of relevance labels, and (c) a method that decides, stochastically, whether each document is clicked based on these probabilities.
Having implemented the two click models, estimate the model parameters using the Yandex Click Log [file].
(Note 6: Do not learn the attractiveness parameter $a_{uq}$.)
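
For the RCM, the MLE has a closed form, which is why a simple count over the click log is enough. As a sketch of the reasoning, write $\rho$ for the single parameter `P_click`, $c_{s,r} \in \{0, 1\}$ for the click indicator of the document at rank $r$ in session $s$, $N$ for the total number of shown documents, and $C = \sum_{s,r} c_{s,r}$ for the total number of clicks. This notation is assumed here for illustration and is not taken from the assignment text.

$$
\mathcal{L}(\rho) = \prod_{s}\prod_{r} \rho^{c_{s,r}} (1 - \rho)^{1 - c_{s,r}}
\quad\Longrightarrow\quad
\log \mathcal{L}(\rho) = C \log \rho + (N - C) \log (1 - \rho)
$$

$$
\frac{\partial \log \mathcal{L}}{\partial \rho} = \frac{C}{\rho} - \frac{N - C}{1 - \rho} = 0
\quad\Longrightarrow\quad
\hat{\rho} = \frac{C}{N}
$$

So the estimate is the total number of clicks divided by the total number of shown documents.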

In [3]:
from random import uniform

def trainRCM(click_log_path):
    '''
    Estimates P_click, the only parameter of the Random Click Model, from a click log
    by dividing the total number of clicks by the total number of shown documents.
    
    Args:
        click_log_path (String): Location of the click log.
        
    Return:
        Float: The P_click parameter used to decide whether to click on a document.
        
    '''
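    shown_docs = 0
    clicks = 0
    with open(click_log_path, 'r') as f:
        for line in f:
            fields = line.split()

            # Skip blank or truncated lines that carry no action type
            if len(fields) < 3:
                continue

            # Query record: every field after the first five is a shown document
            if fields[2] == 'Q':
                shown_docs += len(fields) - 5

            # Any other record is counted as a single click
            else:
                clicks += 1

    # Without any shown documents the estimate is undefined
    if shown_docs == 0:
        raise ValueError('No query records found in ' + click_log_path)
    P_click = clicks / float(shown_docs)
    return P_click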

def predictProbRCM(ranking, P_click):
    '''
    Generates the probability for each document in a ranking to be clicked on
    based on a Random Click Model.
    
    Args:
        ranking (List): List of ranked documents represented by relevance.
        P_click (float): Click probability shared by all documents.
        
    Return:
        List: A list of click probabilities for each document in the ranking.
    '''
    # The RCM ignores relevance and rank, so every document gets the same probability
    return [P_click] * len(ranking)
        
def assignClicksRCM(click_probabilities):
    '''
    Based on their click probabilities, either do or do not assign a click to each document.
    
    Args:
        click_probabilities (List): A list of click probabilities, one per document in the ranking.
        
    Return:
        List: A list of 0's and 1's, with a 1 at the index of each clicked document.
    '''
    # Draw one uniform sample per document and click when it falls below the probability
    return [1 if uniform(0, 1) < prob else 0 for prob in click_probabilities]

def randomClickModel(click_log_path, ranking):
    '''
    Implements a Random Click Model. This model decides to click on a document
    with a probability P_click without taking anything else into account.
    
    Args:
        click_log_path (String): Location of the click log.
        ranking (List): List of ranked documents represented by relevance.
        
    Return:
        List: A list of 0's and 1's, with a 1 at the index of each clicked document in ranking.
    '''
    P_click = trainRCM(click_log_path)
    click_probabilities = predictProbRCM(ranking, P_click)
    clicks = assignClicksRCM(click_probabilities)
    return clicks
    

In [6]:
ranking_I = ['HR','R','HR','N','R']
click_log_path = 'YandexRelPredChallenge.txt'
clicks = randomClickModel(click_log_path, ranking_I)
print(clicks)

[0, 1, 0, 0, 0]
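
Because the clicks are sampled, the output above changes from run to run. One way to check the sampling step is to simulate many sessions with a seeded generator and compare the observed click rate with the probability that was used. The following is a minimal standalone sketch of that idea, not part of the assignment code: the helper names `sample_clicks` and `empirical_click_rate` are hypothetical, and the value `0.13` is a placeholder rather than the estimate obtained from the Yandex log.

```python
import random


def sample_clicks(click_probabilities, rng):
    """Draw one click (1) or skip (0) per probability using the given generator."""
    return [1 if rng.random() < p else 0 for p in click_probabilities]


def empirical_click_rate(p_click, list_length, n_sessions, seed=0):
    """Simulate n_sessions rankings under an RCM and return the observed click rate."""
    # A local generator keeps the run reproducible without touching global random state
    rng = random.Random(seed)
    total_clicks = 0
    for _ in range(n_sessions):
        total_clicks += sum(sample_clicks([p_click] * list_length, rng))
    return total_clicks / float(list_length * n_sessions)


# Placeholder probability for illustration only (assumption, not a measured value)
p_click = 0.13
rate = empirical_click_rate(p_click, list_length=5, n_sessions=20000)
print(rate)
```

With enough simulated sessions the printed rate should lie close to `p_click`. A large gap would suggest a bug in the sampling step rather than in the parameter estimate.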
