## List of Agents Compared

###### 1. Single Action Agent
An agent that always returns the same action (product ID).

###### 2. Random Action Agent
An agent that returns a random action drawn from the full range of product IDs.

###### 3. Organic Product Count Agent
An agent that selects the action corresponding to the most frequently viewed product.
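To make the idea concrete, here is a minimal sketch of a view-count agent. The class `PopularityCountAgent` and its methods are hypothetical names chosen for illustration. This is not recogym's `OrganicCount` implementation, and the fallback and tie-breaking rules are assumptions.

```python
from collections import Counter


class PopularityCountAgent:
    """Hypothetical sketch: recommend the most frequently viewed product."""

    def __init__(self, num_products):
        self.num_products = num_products
        self.view_counts = Counter()

    def observe_view(self, product_id):
        # Record one organic view of the given product.
        self.view_counts[product_id] += 1

    def act(self):
        # Assumption: with no observations yet, fall back to product 0.
        if not self.view_counts:
            return 0
        # Pick the highest count; ties go to the smallest product id.
        return min(self.view_counts, key=lambda p: (-self.view_counts[p], p))


agent = PopularityCountAgent(num_products=5)
for product_id in [3, 1, 3, 4, 3, 1]:
    agent.observe_view(product_id)
recommended = agent.act()  # product 3 was viewed three times
```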

###### 4. Organic Per User Count Agent
An agent that counts the organic views of each product per user and selects the action for the product that user has viewed most often.
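The per-user variant can be sketched by keeping one counter per user. `PerUserCountAgent`, `default_action`, and the event format below are hypothetical. The sketch always takes the most-viewed product and does not model the `select_randomly` option used later in this notebook.

```python
from collections import Counter, defaultdict


class PerUserCountAgent:
    """Hypothetical sketch: recommend each user's most viewed product."""

    def __init__(self, default_action=0):
        # Assumption: users with no history receive `default_action`.
        self.default_action = default_action
        self.counts = defaultdict(Counter)

    def observe_view(self, user_id, product_id):
        # Record one organic view of `product_id` by `user_id`.
        self.counts[user_id][product_id] += 1

    def act(self, user_id):
        user_counts = self.counts.get(user_id)
        if not user_counts:
            return self.default_action
        # Highest count wins; ties go to the smallest product id.
        return min(user_counts, key=lambda p: (-user_counts[p], p))


per_user_agent = PerUserCountAgent()
for user_id, product_id in [(0, 2), (0, 2), (0, 5), (1, 5)]:
    per_user_agent.observe_view(user_id, product_id)

action_user_0 = per_user_agent.act(0)  # user 0 viewed product 2 twice
action_user_1 = per_user_agent.act(1)  # user 1 only viewed product 5
action_unseen = per_user_agent.act(9)  # unknown user gets the default
```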

###### 5. Organic MF Agent
An agent that selects an action using a model that performs matrix factorisation on organic events.

###### 6. Bandit Click Count Agent
An agent that selects the bandit action that has been clicked most often so far.
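A click-count agent of this kind might look like the sketch below. `ClickCountAgent`, its `train`/`act` methods, and the `(action, click)` log format are illustrative assumptions rather than recogym's `BanditCount` API.

```python
class ClickCountAgent:
    """Hypothetical sketch: recommend the most frequently clicked action."""

    def __init__(self, num_products):
        self.clicks = [0] * num_products

    def train(self, action, click):
        # `click` is 1 if the recommended action was clicked, else 0.
        self.clicks[action] += int(click)

    def act(self):
        # max() returns the first maximum, so ties go to the smallest action id.
        return max(range(len(self.clicks)), key=lambda a: self.clicks[a])


click_agent = ClickCountAgent(num_products=3)
# Made-up bandit log of (action, click) pairs.
for action, click in [(0, 0), (2, 1), (1, 1), (2, 1), (0, 0)]:
    click_agent.train(action, click)
most_clicked = click_agent.act()  # action 2 received two clicks
```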

###### 7. Bandit MF Agent (Matrix Factorisation Model)
An agent that chooses the action with the maximum logit (as in logistic regression) among all possible actions.
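As a rough illustration of "choose the maximum logit", the sketch below scores each action by the dot product of a user embedding and a product embedding, then takes the argmax. The embeddings are invented numbers and the helper names are hypothetical. The actual `BanditMFSquare` model may be structured differently.

```python
import math


def logits(user_vec, product_vecs):
    # One logit per action: dot product of user and product embeddings.
    return [sum(u * p for u, p in zip(user_vec, vec)) for vec in product_vecs]


def sigmoid(x):
    # Maps a logit to a click probability, as in logistic regression.
    return 1.0 / (1.0 + math.exp(-x))


def best_action(user_vec, product_vecs):
    scores = logits(user_vec, product_vecs)
    # The sigmoid is monotonic, so the largest logit is also the
    # largest predicted click probability.
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up two-dimensional embeddings for one user and three products.
user_vec = [1.0, -0.5]
product_vecs = [[0.2, 0.1], [0.9, 0.4], [-0.3, -1.0]]

scores = logits(user_vec, product_vecs)
chosen = best_action(user_vec, product_vecs)
click_probability = sigmoid(scores[chosen])
```

Because only the ranking matters for action selection, the sigmoid is needed only when a probability estimate is wanted.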


## Environment Setup

In [1]:
from copy import deepcopy
import pandas as pd
import recogym
from recogym.bench_agents import add_agent_id, combine_stat
from single_action_agent import SingleActionAgent
%matplotlib inline
import gym
import matplotlib.pyplot as plt

from recogym import env_1_args, Configuration
from recogym.agents import OrganicUserEventCounterAgent, organic_user_count_args, OrganicCount, organic_count_args, \
    OrganicMFSquare, organic_mf_square_args, BanditCount, bandit_count_args, BanditMFSquare, bandit_mf_square_args
from recogym.agents import RandomAgent, random_args
from recogym.evaluate_agent import plot_verify_agents

# Set style for pretty plots
plt.style.use('ggplot')

Products = 1000

# You can overwrite environment arguments here:

env_1_args['random_seed'] = 42
env_1_args['num_products'] = Products

env_1_args['phi_var'] = 0.0
env_1_args['number_of_flips'] = 0
# env_1_args['sigma_mu_organic'] = 0.0
# env_1_args['sigma_omega']=0
# env_1_args['normalize_beta']=True

# Initialize the gym for the first time by calling .make() and .init_gym()

env = gym.make('reco-gym-v1')
env.init_gym(env_1_args)

env.reset()


1. Define Single Action Agent

In [2]:
single_action_agent = SingleActionAgent(Configuration({**env_1_args,}))

SingleActionAgent %%%% num_products: 1000


2. Define Random Action Agent

In [3]:
random_agent = RandomAgent(Configuration({
    **random_args,
    **env_1_args,
}))

RandomAgent %%%% num_products: 1000


3. Define Organic Product Count Agent

In [4]:
organic_count_agent = OrganicCount(Configuration({**organic_count_args,**env_1_args,}))

4. Define Organic Per User Count Agent

In [5]:
organic_user_counter_agent = OrganicUserEventCounterAgent(Configuration({**organic_user_count_args, **env_1_args, 'select_randomly': True,}))

5. Define Organic MF Agent

In [6]:
organic_mf_agent = OrganicMFSquare(Configuration({**organic_mf_square_args, **env_1_args, 'select_randomly': True,}))

6. Define Bandit Click Count Agent

In [7]:
bandit_count_agent = BanditCount(Configuration({**bandit_count_args,**env_1_args,}))

BanditCount %%%% num_products: 1000


7. Define Bandit MF Agent

In [8]:
bandit_mf_square_agent = BanditMFSquare(Configuration({
    **bandit_mf_square_args,
    **env_1_args,
}))

## A/B-Test Evaluation

In [9]:
result_bandit_mf = recogym.test_agent(deepcopy(env), deepcopy(bandit_mf_square_agent), 1000, 1000)
result_bandit_mf_id = add_agent_id('Bandit MF Agent', *result_bandit_mf)

Organic Users: 0it [00:00, ?it/s]
Users:   0%|          | 0/1000 [00:00<?, ?it/s]

START: Agent Training #0
START: Agent Training @ Epoch #0


Users: 100%|██████████| 1000/1000 [00:27<00:00, 35.86it/s]
Organic Users: 0it [00:00, ?it/s]
Users:   0%|          | 4/1000 [00:00<00:26, 37.00it/s]

END: Agent Training @ Epoch #0 (27.89681100845337s)
START: Agent Evaluating @ Epoch #0


Users: 100%|██████████| 1000/1000 [00:35<00:00, 28.47it/s]


END: Agent Evaluating @ Epoch #0 (35.2789249420166s)


In [None]:
result_single = recogym.test_agent(deepcopy(env), deepcopy(single_action_agent), 1000, 1000)
result_single_id = add_agent_id('Single Action Agent', *result_single)

Organic Users: 0it [00:00, ?it/s]
Users:   0%|          | 2/1000 [00:00<01:00, 16.62it/s]

START: Agent Training #0
START: Agent Training @ Epoch #0


Users: 100%|██████████| 1000/1000 [00:24<00:00, 40.07it/s]
Organic Users: 0it [00:00, ?it/s]
Users:   0%|          | 2/1000 [00:00<00:54, 18.38it/s]

END: Agent Training @ Epoch #0 (24.961561918258667s)
START: Agent Evaluating @ Epoch #0


Users:  46%|████▌     | 462/1000 [00:18<00:28, 19.03it/s]

In [None]:
result_random = recogym.test_agent(deepcopy(env), deepcopy(random_agent), 1000, 1000)
result_random_id = add_agent_id('Random Agent', *result_random)

In [None]:
result_organic_count = recogym.test_agent(deepcopy(env), deepcopy(organic_count_agent), 1000, 1000)
result_organic_count_id = add_agent_id('Organic Count Agent', *result_organic_count)

In [None]:
result_organic_user_counter = recogym.test_agent(deepcopy(env), deepcopy(organic_user_counter_agent), 1000, 1000)
result_organic_user_counter_id = add_agent_id('Organic Count per User Agent', *result_organic_user_counter)

In [None]:
result_organic_mf = recogym.test_agent(deepcopy(env), deepcopy(organic_mf_agent), 1000, 1000)
result_organic_mf_id = add_agent_id('Organic MF Agent', *result_organic_mf)

In [None]:
result_bandit_count = recogym.test_agent(deepcopy(env), deepcopy(bandit_count_agent), 1000, 1000)
result_bandit_count_id = add_agent_id('Bandit Count Agent', *result_bandit_count)

In [None]:
comb_result = combine_stat([result_single_id,
                            result_random_id,
                            result_organic_count_id,
                            result_organic_user_counter_id,
                            result_organic_mf_id,
                            result_bandit_mf_id,
                            result_bandit_count_id])

In [None]:
comb_result

In [None]:
fig = plot_verify_agents(comb_result)
plt.ylabel('CTR')
plt.show()
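
The CTR on the y-axis is the fraction of recommendations that were clicked. As a hedged sketch of how such a number and a rough uncertainty band can be computed, the helper below uses a normal-approximation interval. The function name `ctr_with_interval` and the example counts are hypothetical, and `plot_verify_agents` may compute its intervals differently.

```python
import math


def ctr_with_interval(clicks, impressions, z=1.96):
    """Return (ctr, lower, upper) using a normal-approximation interval."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    ctr = clicks / impressions
    # Standard error of a binomial proportion.
    half_width = z * math.sqrt(ctr * (1.0 - ctr) / impressions)
    # Clamp the bounds to the valid probability range.
    return ctr, max(0.0, ctr - half_width), min(1.0, ctr + half_width)


# Made-up counts purely for illustration.
ctr, lower, upper = ctr_with_interval(clicks=120, impressions=10_000)
```

The normal approximation becomes unreliable when click counts are very small. A Beta-based interval is a common alternative in that case.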




