## Comparing Agent Performance

This notebook compares the performance of a selection of our included agents. Each agent is trained on 1000 offline users and then
used to recommend products to 1000 test users. The results presented are the median CTR achieved on those test users, together with the 0.025 and 0.975 quantiles of its credible interval.

In [1]:
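import gym
import recogym
from recogym import env_1_args

from copy import deepcopy

# Fix the random seed for reproducibility and use a catalogue of 100 products.
env_1_args['random_seed'] = 42
env_1_args['num_products'] = 100

# Create the environment and initialise it with the settings above.
env = gym.make('reco-gym-v1')
env.init_gym(env_1_args)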

from recogym.agents import BanditMFSquare, bandit_mf_square_args
from recogym.agents import BanditCount, bandit_count_args
from recogym.agents import RandomAgent, random_args
from recogym import Configuration

agent_banditmfsquare = BanditMFSquare(Configuration({
    **bandit_mf_square_args,
    **env_1_args,
}))
agent_banditcount = BanditCount(Configuration({
    **bandit_count_args,
    **env_1_args,
}))
agent_rand = RandomAgent(Configuration({
    **random_args,
    **env_1_args,
}))

BanditCount %%%% num_products: 100
RandomAgent %%%% num_products: 100
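
Each agent above is configured by unpacking two dictionaries into one. As a minimal sketch of how that merge behaves in plain Python (the keys and values in `agent_defaults` are hypothetical and are not the real contents of any `recogym` argument dictionary), keys from the dictionary unpacked last take precedence:

```python
# Hypothetical agent defaults, for illustration only.
agent_defaults = {"num_products": 10, "embed_dim": 5}

# Environment settings mirroring the two values set in this notebook.
env_settings = {"num_products": 100, "random_seed": 42}

# Later unpacked dictionaries override earlier ones on key collisions.
merged = {**agent_defaults, **env_settings}
print(merged)
# {'num_products': 100, 'embed_dim': 5, 'random_seed': 42}
```

Under this pattern, putting `**env_1_args` last means the environment's `num_products` is the value each agent sees, which is consistent with the `num_products: 100` lines printed above.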


In [2]:
# Returns the credible interval of the CTR as (0.025 quantile, median, 0.975 quantile).
recogym.test_agent(deepcopy(env), deepcopy(agent_rand), 1000, 1000)

Organic Users: 0it [00:00, ?it/s]
Users:   0%|          | 0/1000 [00:00<?, ?it/s]

START: Agent Training #0
START: Agent <recogym.agents.random_agent.RandomAgent object at 0x10c209390> Training: offline users 1000 @ Epoch #0


Users: 100%|██████████| 1000/1000 [00:17<00:00, 55.71it/s]
Organic Users: 0it [00:00, ?it/s]
Users:   1%|          | 7/1000 [00:00<00:14, 69.30it/s]

END: Agent Training @ Epoch #0 (17.958183765411377s)
START: Agent Evaluating online users 1000 @ Epoch #0


Users: 100%|██████████| 1000/1000 [00:19<00:00, 51.17it/s]


END: Agent Evaluating @ Epoch #0 (19.678795099258423s)


(0.009163083488403513, 0.009841530644539048, 0.010552328652439158)
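
One common way to obtain a tuple like the one above is to treat every recommendation as a click or no-click trial and take quantiles of a Beta posterior over the CTR. The sketch below assumes a uniform prior and approximates the quantiles by sampling with the standard library; it is an illustration, not necessarily what `test_agent` does internally. The function name `beta_credible_interval` and the click and impression counts are hypothetical.

```python
import random


def beta_credible_interval(clicks, impressions, draws=20000, seed=0):
    """Approximate the (0.025, 0.5, 0.975) quantiles of a
    Beta(clicks + 1, non_clicks + 1) posterior by Monte Carlo sampling."""
    rng = random.Random(seed)
    alpha = clicks + 1                # successes plus uniform prior
    beta = impressions - clicks + 1   # failures plus uniform prior
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))

    def quantile(q):
        # Nearest-rank quantile on the sorted samples.
        return samples[min(draws - 1, int(q * draws))]

    return quantile(0.025), quantile(0.5), quantile(0.975)


# Hypothetical counts, chosen only to give a CTR near 1%.
low, median, high = beta_credible_interval(clicks=800, impressions=80000)
print(low, median, high)
```

With exact quantiles (for example from `scipy.stats.beta.ppf`) the sampling step would be unnecessary; sampling is used here only to keep the sketch dependency-free.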

In [3]:
# Returns the credible interval of the CTR as (0.025 quantile, median, 0.975 quantile).
recogym.test_agent(deepcopy(env), deepcopy(agent_banditcount), 1000, 1000)

Organic Users: 0it [00:00, ?it/s]
Users:   0%|          | 4/1000 [00:00<00:28, 34.61it/s]

START: Agent Training #0
START: Agent <recogym.agents.bandit_count.BanditCount object at 0x13116f910> Training: offline users 1000 @ Epoch #0


Users: 100%|██████████| 1000/1000 [00:20<00:00, 50.00it/s]
Organic Users: 0it [00:00, ?it/s]
Users:   1%|          | 8/1000 [00:00<00:12, 79.43it/s]

END: Agent Training @ Epoch #0 (20.004718780517578s)
START: Agent Evaluating online users 1000 @ Epoch #0


Users: 100%|██████████| 1000/1000 [00:17<00:00, 55.70it/s]


END: Agent Evaluating @ Epoch #0 (18.124396800994873s)


(0.012907625884488641, 0.013717019494475985, 0.01455908365001457)

In [4]:
# Returns the credible interval of the CTR as (0.025 quantile, median, 0.975 quantile).
recogym.test_agent(deepcopy(env), deepcopy(agent_banditmfsquare), 1000, 1000)

Organic Users: 0it [00:00, ?it/s]
Users:   0%|          | 4/1000 [00:00<00:31, 31.88it/s]

START: Agent Training #0
START: Agent BanditMFSquare(
  (product_embedding): Embedding(100, 5)
  (user_embedding): Embedding(100, 5)
) Training: offline users 1000 @ Epoch #0


Users: 100%|██████████| 1000/1000 [00:22<00:00, 44.30it/s]
Organic Users: 0it [00:00, ?it/s]
Users:   1%|          | 7/1000 [00:00<00:20, 49.61it/s]

END: Agent Training @ Epoch #0 (22.577062845230103s)
START: Agent Evaluating online users 1000 @ Epoch #0


Users: 100%|██████████| 1000/1000 [00:30<00:00, 33.12it/s]


END: Agent Evaluating @ Epoch #0 (30.32013511657715s)


(0.012335110219876598, 0.013120254911025349, 0.013937579562077085)

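Both bandit agents achieve a clearly higher median CTR than the random agent (about 0.98%). `BanditCount` has the highest median (about 1.37%), slightly ahead of `BanditMFSquare`, the _`Agent`_ which performs matrix factorisation on the bandit data (about 1.31%). The credible intervals of the two bandit agents overlap, so this single run does not clearly separate them.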
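
The three result tuples reported in this notebook can be compared directly. The following sketch copies them verbatim, ranks the agents by median CTR, and checks which credible intervals overlap. The helper `intervals_overlap` is a hypothetical name introduced here, and an interval-overlap check is only a rough heuristic, not a formal significance test.

```python
# (0.025 quantile, median, 0.975 quantile) copied from the cell outputs above.
results = {
    "RandomAgent": (0.009163083488403513, 0.009841530644539048, 0.010552328652439158),
    "BanditCount": (0.012907625884488641, 0.013717019494475985, 0.01455908365001457),
    "BanditMFSquare": (0.012335110219876598, 0.013120254911025349, 0.013937579562077085),
}


def intervals_overlap(a, b):
    """Return True if the [low, high] ranges of two result tuples intersect."""
    return a[0] <= b[2] and b[0] <= a[2]


# Rank agents from highest to lowest median CTR.
ranked = sorted(results, key=lambda name: results[name][1], reverse=True)
for name in ranked:
    low, median, high = results[name]
    print(f"{name:15s} median={median:.4%} interval=[{low:.4%}, {high:.4%}]")

# Report pairwise overlap between credible intervals.
for i, first in enumerate(ranked):
    for second in ranked[i + 1:]:
        overlap = intervals_overlap(results[first], results[second])
        print(f"{first} vs {second}: overlap={overlap}")
```

Where two intervals overlap, a single run like this one may not be enough to tell the agents apart; repeating the evaluation with different values of `random_seed` would give a firmer comparison.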