# Running Example of RL-Recruiter+

This Jupyter notebook shows how to build and train RL-Recruiter+ and then use it for prediction.

In [1]:
import sys
import json
import numpy as np
sys.path.append('../')
from rl_recruiter.rl_model import RL_Recruiter_plus
from rl_recruiter.entropy_cal import type_2_entro

First, we need to obtain the predictability of each participant.

In [2]:
type_2_entro('./data/trajectory.json', './data/predictability.json')

We build RL-Recruiter+ from two settings files.

The first is a JSON file holding the parameters needed to initialize the model. An example is on this [page](https://github.com/chungdz/RL-Recruiter-Plus/blob/master/example/data/para_settings.json).

The following parameters must be set:
* "total_person": the total number of participants.
* "max_user": the number of participants to select.
* "train_start": the index in the list of trajectories at which the model begins learning.
* "train_end": the index in the list of trajectories at which the model stops learning. For example, if the trajectories span 10 days and each day's trajectories are gathered in a list, set "train_start" to 0 and "train_end" to 10.
* "train_epoch": the number of training epochs.
* "layer": the number of rows in the value function table; it cannot be higher than "max_user". A higher value gives a larger table, which needs more data and more epochs to learn but may give a better result.
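
As a rough illustration of the list above, the sketch below writes a settings file with those six keys and checks the stated constraint on "layer". All values are made up for illustration and are not the contents of the repository's `para_settings.json`; the helper `check_settings` is hypothetical and not part of `rl_recruiter`. The checks other than the "layer" one are sanity checks that follow from the descriptions, not documented rules.

```python
import json
import os
import tempfile

# Illustrative values only; they are NOT copied from the repository's file.
settings = {
    "total_person": 200,  # total number of participants
    "max_user": 20,       # number of participants to select
    "train_start": 0,     # first index in the list of trajectories
    "train_end": 10,      # index at which learning stops
    "train_epoch": 5,     # number of training epochs
    "layer": 10,          # rows in the value function table
}


def check_settings(s):
    """Hypothetical helper: raise ValueError if a constraint is broken."""
    # Stated in the parameter list: "layer" cannot be higher than "max_user".
    if s["layer"] > s["max_user"]:
        raise ValueError('"layer" cannot be higher than "max_user"')
    # Assumed sanity checks, inferred from the parameter descriptions.
    if s["max_user"] > s["total_person"]:
        raise ValueError('"max_user" cannot exceed "total_person"')
    if not 0 <= s["train_start"] < s["train_end"]:
        raise ValueError('"train_start" must come before "train_end"')
    return s


check_settings(settings)

# Write the file to a temporary directory so nothing in ./data is overwritten.
path = os.path.join(tempfile.mkdtemp(), "para_settings.json")
with open(path, "w") as f:
    json.dump(settings, f, indent=2)
```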

A second file must also be passed to the model. It contains the threshold values used to discretize the entropy values, which are one of the inputs for training and prediction. The file is a JSON list. Our [thresholds](https://github.com/chungdz/RL-Recruiter-Plus/blob/master/example/data/thres.json) file, which contains 100 thresholds, works fine; you can also supply your own file in the same format.
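
To make the idea of discretizing entropy values concrete, here is a minimal sketch using `numpy.digitize`. It is only an illustration: the thresholds below are evenly spaced placeholder values, not the ones in `thres.json`, and how RL-Recruiter+ applies its thresholds internally is an assumption here.

```python
import json

import numpy as np

# Placeholder thresholds: 100 increasing cut points (NOT the values in thres.json).
thresholds = np.linspace(0.05, 5.0, 100)

# A threshold file in "JSON list" format would look like this string.
thres_json = json.dumps(thresholds.tolist())

# Made-up entropy values for four participants.
entropies = np.array([0.0, 0.32, 2.47, 9.9])

# np.digitize returns, for each value, how many thresholds are <= that value,
# so the result is an integer level between 0 and len(thresholds).
levels = np.digitize(entropies, thresholds)
print(levels)
```

With 100 thresholds there are 101 possible levels: level 0 means the value is below every threshold, and level 100 means it is at or above the largest one.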

In [3]:
rlp = RL_Recruiter_plus('./data/para_settings.json', './data/thres.json')

## Training Process
### Input
Two data files are required as input.

The first is the trajectory data; see the example [here](https://github.com/chungdz/RL-Recruiter-Plus/blob/master/example/data/trajectory.json). The trajectory data is a dictionary saved in JSON format. Each key is a participant id, and each value is a list of trajectory sets, one per time period. Participant ids must be mapped to consecutive integers: with 100 participants, the keys of the trajectory dictionary are ["0", "1", ..., "99"]. Each trajectory set contains non-repeating categorical integers representing the areas covered by that participant in that time period. The integer -1 in a trajectory set is ignored and not counted toward coverage.
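
The sketch below shows one way to produce a dictionary in this shape from raw visit records. The record layout `(participant, time period, area)` and the names `raw`, `id_map`, and `n_periods` are assumptions made for illustration; only the resulting structure follows the description above.

```python
import json

# Hypothetical raw records: (original participant id, time period, area id).
raw = [
    ("alice", 0, 12), ("alice", 0, 12), ("alice", 0, 40),
    ("bob", 0, 7), ("bob", 1, -1), ("alice", 1, 40),
]
n_periods = 2

# Map original ids to consecutive integers in first-seen order.
id_map = {}
for pid, _, _ in raw:
    id_map.setdefault(pid, len(id_map))

# Collect one set of areas per participant per time period (sets drop repeats).
sets = {str(i): [set() for _ in range(n_periods)] for i in id_map.values()}
for pid, period, area in raw:
    sets[str(id_map[pid])][period].add(area)

# JSON has no set type, so each set is stored as a sorted list.
trajectory = {k: [sorted(s) for s in v] for k, v in sets.items()}
trajectory_json = json.dumps(trajectory)
```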

The other is the participants' predictability data, which can be calculated from the trajectory data. It is also a dictionary: the keys are participant ids and each value is the list of entropy values, one per time period.
### Output
RL-Recruiter+ reports the training results for each time period by selecting participants at the beginning of the next time period and calculating their absolute and relative coverage (e.g., the value function trained through time period j-1 is used to select participants at the beginning of time period j to show performance).

Absolute coverage is the exact number of areas covered. Relative coverage is the absolute coverage divided by the highest possible coverage, which is obtained by selecting all participants.
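
The two definitions can be written out as a short sketch. The functions `absolute_coverage` and `relative_coverage` are hypothetical helpers written for this explanation, not functions from `rl_recruiter`, and the tiny `trajectory` dictionary is made-up data in the input format described earlier.

```python
def absolute_coverage(trajectory, selected, period):
    """Count the distinct areas covered by the selected participants in one period."""
    covered = set()
    for pid in selected:
        covered.update(trajectory[str(pid)][period])
    covered.discard(-1)  # -1 is ignored and not counted in coverage
    return len(covered)


def relative_coverage(trajectory, selected, period):
    """Absolute coverage divided by the coverage reached by selecting everyone."""
    everyone = [int(k) for k in trajectory]
    best = absolute_coverage(trajectory, everyone, period)
    if best == 0:
        return 0.0  # nothing can be covered in this period
    return absolute_coverage(trajectory, selected, period) / best


# Made-up data: three participants, one time period.
trajectory = {"0": [[1, 2, -1]], "1": [[2, 3]], "2": [[4]]}

abs_cov = absolute_coverage(trajectory, [0, 1], 0)   # areas {1, 2, 3}
rel_cov = relative_coverage(trajectory, [0, 1], 0)   # 3 of the 4 reachable areas
```

Here participants 0 and 1 together cover three areas, while all three participants cover four, so the relative coverage is 0.75.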

In the first time period, RL-Recruiter+ selects participants randomly.

In [4]:
rlp.train_and_evaluate('./data/trajectory.json', './data/predictability.json')

Loading trajectory Data...
Getting eligible user list
init record set
load entro_daily
train and evaluate from day to day


100%|██████████████████████████████████████████████████████████████████████████████████| 14/14 [03:03<00:00, 13.12s/it]

Each day's coverage:
[664, 778, 1115, 1191, 1267, 954, 1268, 1323, 1489, 1259, 1219, 1349, 1082, 1055]
relative coverage:
[values garbled in the source; not recoverable]




## Prediction Process

Prediction takes one optional input: a list of entropy values whose length equals the total number of participants. The value at index k is the entropy value for the participant with id "k".

RL-Recruiter+ returns the list of selected participants after prediction.

In [18]:
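# First, build the entropy list (optional).
total_person = rlp.hypara_dict['total_person']
train_last_day = rlp.hypara_dict['train_end']

with open('./data/predictability.json', 'r') as f:
    entro_dict = json.load(f)

# Use each participant's entropy from the last training period;
# participants without a usable entry keep the default value of 0.
entro_list = np.zeros(total_person)
for k, v in entro_dict.items():
    try:
        entro_list[int(k)] = v[train_last_day - 1]
    except (ValueError, TypeError, IndexError):
        # Skip keys that are not valid ids and lists that are too short.
        pass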

In [23]:
# Then do the prediction.
selected_participants = rlp.predict(entro_list=entro_list)
print(selected_participants)

[127, 4, 1, 69, 9, 8, 129, 170, 167, 85, 152, 12, 96, 145, 42, 7, 128, 93, 5, 32]
