Methods and Experiments
=============

<a id='sec1'></a>

## 1. Experimental Setup

Enumerate all trajectories for each user given the trajectory length (e.g. 3, 4, 5) and the (start, end) POIs.
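
The enumeration step can be sketched as follows. This is a minimal illustration, not the code used in the experiments: the function name `enumerate_trajectories` and its arguments are hypothetical, and it assumes that the start and end POIs differ and that no POI is visited twice.

```python
from itertools import permutations


def enumerate_trajectories(pois, start, end, length):
    """Return all trajectories of `length` POIs that go from `start` to `end`.

    Assumptions (not stated in the text): start != end, and a POI
    appears at most once in a trajectory.
    """
    if length < 2:
        raise ValueError("a trajectory needs at least a start and an end POI")
    # Intermediate POIs are drawn from everything except the two endpoints.
    middle = [p for p in pois if p not in (start, end)]
    return [(start, *mid, end) for mid in permutations(middle, length - 2)]


candidates = enumerate_trajectories([1, 2, 3, 4], start=1, end=4, length=3)
```

With `n` POIs in total, this produces `(n-2)! / (n-length)!` candidates, so the enumeration stays cheap only for short trajectories such as the lengths 3, 4 and 5 considered here.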

For each trajectory, compute a score based on the features below:
 * User Interest (time)
 * User Interest (frequency)
 * POI Popularity
 * Travelling Cost
 * Trajectory probability, based on the transition probabilities between POI categories and one of the following rules for choosing a specific POI within a given category (each rule gives a separate score):
  * The Nearest Neighbor of the current POI
  * The most Popular POI
  * A random POI chosen with probability proportional to the reciprocal of its distance to the current POI
  * A random POI chosen with probability proportional to its popularity
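
The four POI-selection rules can each be written as a probability distribution over the POIs of the chosen category. The sketch below is illustrative only: the function name, the rule labels and the `{poi: (distance, popularity)}` input layout are assumptions, distances are assumed to be strictly positive, and ties are broken by dictionary order.

```python
def poi_choice_probs(candidates, rule):
    """Probability of picking each POI of a category under one rule.

    candidates: {poi: (distance_to_current_poi, popularity)}  (hypothetical layout)
    rule: "nearest", "popular", "inv_distance" or "popularity" (hypothetical labels)
    """
    if rule == "nearest":
        # Deterministic: all probability mass on the nearest neighbour.
        best = min(candidates, key=lambda p: candidates[p][0])
        return {p: float(p == best) for p in candidates}
    if rule == "popular":
        # Deterministic: all probability mass on the most popular POI.
        best = max(candidates, key=lambda p: candidates[p][1])
        return {p: float(p == best) for p in candidates}
    if rule == "inv_distance":
        # Random choice, proportional to 1 / distance (assumes distance > 0).
        weights = {p: 1.0 / d for p, (d, _) in candidates.items()}
    elif rule == "popularity":
        # Random choice, proportional to popularity.
        weights = {p: float(pop) for p, (_, pop) in candidates.items()}
    else:
        raise ValueError(f"unknown rule: {rule}")
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}


probs = poi_choice_probs({"a": (1.0, 10), "b": (2.0, 30)}, "inv_distance")
```

One plausible way to obtain the trajectory probability is to multiply, for each step, the category transition probability by the POI probability returned here; the text does not spell out the exact combination, so this should be read as an assumption.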

Plot the scores of generated and actual trajectories for each (user, trajectoryLength, startPOI, endPOI) tuple with some degree of transparency (alpha).

Recommend the trajectory with the highest score and measure the recommendation performance using recall, precision and F1-score.
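
A minimal sketch of the three measures is given below. It treats each trajectory as a set of POIs and ignores visiting order, which matches the set-based definitions in [section 2.1](#sec2); the function name is hypothetical.

```python
def recall_precision_f1(recommended, actual):
    """Set-based recall, precision and F1 of a recommended trajectory."""
    rec, act = set(recommended), set(actual)
    overlap = len(rec & act)
    if overlap == 0:
        # No shared POI: define all three measures as zero.
        return 0.0, 0.0, 0.0
    recall = overlap / len(act)
    precision = overlap / len(rec)
    f1 = 2 * recall * precision / (recall + precision)
    return recall, precision, f1


recall, precision, f1 = recall_precision_f1([1, 2, 4], [1, 3, 4])
```

When the recommended and actual trajectories have the same length and no repeated POIs, recall and precision coincide, so F1 equals both.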

Optimise the parameters of the score function by learning. In this case, the cost function could be based on recall, precision or F1-score; the estimation of the transition matrix can also be controlled.

<a id='sec2'></a>

## 2. First Experiment Results

### 2.1 Basic computing steps

1. Split the actual trajectories into two parts, one for training and the other for testing.  
   Concretely, for each user, consider all trajectories of length `3`, `4` and `5`; pick one for the testing set and put all the others into the training set.
   
1. Use the trajectories in the training set to estimate (by MLE) a transition matrix where element `[i, j]` denotes the transition probability from POI category `i` to POI category `j`.

1. For each trajectory $T$ in the training set, enumerate all possible trajectories that satisfy the following requirements:
 * The trajectory length is the same as that of $T$
 * The start/end POI are the same as those of $T$
 * No sub-tour exists
 
1. Compute the `8` scores [described above](#sec1), rescale each score into the range `[-1, 1]`, and compute the weighted sum of these scores to get a single score (weights are normalised so that they lie in the range `[0, 1]` and sum to `1`).
 
1. Choose the trajectory with the highest score $T^*$ and compute F1 score as follows:  
 * recall = $\frac{|T^* \cap T|}{|T|}$  
 * precision = $\frac{|T^* \cap T|}{|T^*|}$  
 * F1-score = $\frac{2 \times \text{recall} \times \text{precision}}{\text{recall} + \text{precision}}$
 
1. Compute the mean F1 score over all trajectories $T$ in the training set.

1. Use coordinate-wise greedy search to find a good weight vector such that the mean F1 score is as large as possible.
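
The maximum-likelihood estimate of the category transition matrix (step 2 of the list above) amounts to counting observed transitions and normalising each row. The sketch below assumes that every trajectory has already been mapped to a sequence of integer category indices; the function name is hypothetical, and rows with no observed transition are left as zeros, which is a simplification (some form of smoothing may be preferable in practice).

```python
def transition_matrix(category_sequences, n_categories):
    """MLE of P(next category = j | current category = i) from count data."""
    counts = [[0] * n_categories for _ in range(n_categories)]
    for seq in category_sequences:
        # Count every consecutive (i -> j) pair in the sequence.
        for i, j in zip(seq, seq[1:]):
            counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: an unobserved source category keeps an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


P = transition_matrix([[0, 1, 2], [0, 2, 1]], n_categories=3)
```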

### 2.2 Some experimental results

Random weights: using 10 random weight vectors, the mean F1 scores are as follows:  
`[0.717, 0.705, 0.665, 0.719, 0.698, 0.719, 0.704, 0.724, 0.693, 0.665]`,  mean = `0.701`, std = `0.02`

Some arbitrary weights:  
`[[1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8, 1/8], # avgF1: 0.712`  
 `[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], # avgF1: 0.712`  
 `[0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], # avgF1: 0.707`  
 `[0.5, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], # avgF1: 0.700`  
 `[0.2, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1]] # avgF1: 0.708`
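
The sketch below illustrates how a weight vector could be turned into a single score per candidate trajectory. It is written under assumptions: the helper names are hypothetical, and min-max rescaling over the candidate set is only one possible way to map each score into `[-1, 1]`, since the text does not say which rescaling was used.

```python
def normalise_weights(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def rescale(values):
    """Linearly map values to [-1, 1] (assumed min-max rescaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates tie on this feature: map everything to 0.
        return [0.0] * len(values)
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]


def combined_scores(feature_table, weights):
    """feature_table[k][f] is the raw score of candidate k on feature f."""
    w = normalise_weights(weights)
    # Rescale each feature column across all candidates.
    columns = [rescale(list(col)) for col in zip(*feature_table)]
    return [
        sum(wf * col[k] for wf, col in zip(w, columns))
        for k in range(len(feature_table))
    ]


scores = combined_scores([[0, 10], [5, 0], [10, 5]], weights=[1, 1])
```

After normalisation, the first two weight vectors listed above are the same vector (every entry equals `1/8`), which would be consistent with their identical average F1 of `0.712`.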

Coordinate-wise greedy search: starting from `[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]`, for each coordinate in turn,  
search for the value in `[0, 0.05, 0.1, 0.15, 0.2, ..., 0.95, 1]` that maximises the mean F1 score while keeping the other coordinates fixed.
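
A single pass of this search can be sketched as follows. Here `objective` is a hypothetical callable that stands in for "mean F1 score on the training set for a given weight vector"; the toy objective in the usage lines exists only to make the example runnable and has nothing to do with the real data.

```python
def coordinate_search(objective, start, grid):
    """One greedy pass: optimise each coordinate in turn over `grid`."""
    weights = list(start)
    best = objective(weights)
    for i in range(len(weights)):
        best_value = weights[i]
        for candidate in grid:
            trial = weights[:i] + [candidate] + weights[i + 1:]
            score = objective(trial)
            # Strict improvement only, so ties keep the earlier value.
            if score > best:
                best, best_value = score, candidate
        # Fix coordinate i at its best value before moving on.
        weights[i] = best_value
    return weights, best


grid = [k / 20 for k in range(21)]  # 0, 0.05, ..., 0.95, 1


def toy_objective(w):
    # Stand-in for the mean F1 score; maximal at w = [0.25, 0.5].
    return -((w[0] - 0.25) ** 2 + (w[1] - 0.5) ** 2)


found, value = coordinate_search(toy_objective, [0.1, 0.1], grid)
```

Each coordinate costs one objective evaluation per grid point, so a pass over 8 weights with this grid needs `8 × 21` evaluations. The result depends on the starting point and on the order in which coordinates are visited, so it is a good weight vector rather than a guaranteed optimum.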

Search for the first coordinate,

<img src='images/coord1.png'></img>

Search for the second coordinate,

<img src='images/coord2.png'></img>

Search for the third coordinate,

<img src='images/coord3.png'></img>

Search for the fourth coordinate,

<img src='images/coord4.png'></img>

Search for the fifth coordinate,

<img src='images/coord5.png'></img>

Search for the sixth coordinate,

<img src='images/coord6.png'></img>

Search for the seventh coordinate,

<img src='images/coord7.png'></img>

Search for the eighth coordinate,

<img src='images/coord8.png'></img>

The search yielded the weight vector `[0.15, 0, 0.05, 0.25, 0.4, 0, 0, 0.9]`.
With these weights, the result on the testing set is:

Mean F1-score: `0.727` (`0.671` in the ijcai15 paper)