<img src="./images/logo.png" alt="Alt Text" width="700">


# DaisyRec Tutorial: Evaluating a New Algorithm

This tutorial shows how to add a new model to DaisyRec and evaluate its test metrics.

The tutorial is still under construction, so please report any bugs you find!

## Steps 

Say you want to re-implement the Neural Collaborative Filtering algorithm designed by He et al. (2017) (arXiv link: [arxiv.org/abs/1708.05031](https://arxiv.org/abs/1708.05031)). The following steps walk through the implementation.

First, **create a shortened name string** for the model. For this implementation, we use "neumf". Remember this string.



### Step 1 - Adding default hyperparameter configurations
 
1. Go to the folder 'daisy/assets'.

2. Create a YAML config file for the model, named after the shortened name string. In this case, the new file is 'daisy/assets/neumf.yaml'.

3. Inside the YAML file, enter **all** of the model's hyperparameters as keys. The values are the default values. In this case, we have:

```
    # Hyperparameters
    factors: 24
    num_layers: 2
    dropout: 0.5
    lr: 0.001
    epochs: 30
    reg_1: 0.001
    reg_2: 0.001
    GMF_model: ~
    MLP_model: ~

    # Model name
    model_name: NeuMF
```

You may specify your custom MLP or GMF model in the .yaml file, or put `~` to use our implementation. Note that these values are **defaults only**: they can be tuned, or different values can be tested using command-line arguments.
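As a rough illustration of how file defaults and command-line values can combine, the sketch below merges a dictionary of defaults with a dictionary of overrides. The `merge_config` helper is hypothetical and is not DaisyRec's actual config loader; the defaults dictionary simply mirrors the YAML above, with `~` read as `None`.

```python
# Hypothetical sketch: NOT DaisyRec's actual config loader.
# Defaults mirror neumf.yaml above (YAML `~` is read as None).
DEFAULTS = {
    "factors": 24,
    "num_layers": 2,
    "dropout": 0.5,
    "lr": 0.001,
    "epochs": 30,
    "reg_1": 0.001,
    "reg_2": 0.001,
    "GMF_model": None,
    "MLP_model": None,
    "model_name": "NeuMF",
}


def merge_config(defaults, overrides):
    """Return a new config where non-None overrides replace the defaults."""
    config = dict(defaults)  # copy, so the defaults stay untouched
    for key, value in overrides.items():
        if value is not None:  # None means "not given on the command line"
            config[key] = value
    return config


# Example: the user passes a different learning rate and factor count.
config = merge_config(DEFAULTS, {"lr": 0.01, "factors": 64, "dropout": None})
print(config["lr"], config["factors"], config["dropout"])
```

Treating `None` as "not provided" keeps every key from the YAML file present in the final configuration, so the model can always rely on a value being there.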

### Step 2 - Adding the model .py file

1. Create the new model Python file, following the naming convention of appending "Recommender" to the model name, i.e., NeuMFRecommender.py.

2. Put the file into the daisy/model folder, in the /accuracy or /diversity subfolder, depending on whether your model focuses on increasing the accuracy or the diversity of recommendations. In this case, since our model is accuracy-based, the file path of our model code (from the root folder) is /daisy/model/accuracy/NeuMFRecommender.py.


3. Inside the file, define your model class briefly; we will go into more detail later. Name the class after the `model_name` in your YAML file. Your model should usually be a child class of `GeneralRecommender`, imported from daisy/model/AbstractRecommender.py. Please add a class variable `tunable_param_names`, which is a list of all the hyperparameters as outlined in the YAML file. In this case, it would be:

```
from daisy.model.AbstractRecommender import GeneralRecommender

class NeuMF(GeneralRecommender):
    '''
    NeuMF Recommender class. It can be separated into two parts: GMF and MLP.
    '''
    # Hyperparameters that can be tuned for this model
    tunable_param_names = ['num_ng', 'factors', 'num_layers', 'dropout', 'lr', 'batch_size', 'reg_1', 'reg_2']

    def __init__(self, config):
        super(NeuMF, self).__init__(config)
        self.config = config
```

4. Now, import the model in daisy/model/Models.py, which is used for importing models into other files. Inside the RecommenderModel() function, you will see a large if-else block that matches each model name to its model class import. For this case, we add the `elif` block:

```
    elif algo_name == 'neumf':
        from daisy.model.accuracyRecommender.NeuMFRecommender import NeuMF
        return NeuMF
```
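
The lookup pattern can be sketched in isolation as follows. The classes here are empty stand-ins so that the snippet runs without DaisyRec installed, and the function name `recommender_model` and the `'mf'` branch are illustrative assumptions rather than the library's code.

```python
# Stand-in classes so that the sketch runs on its own (not the real models).
class MF:
    def __init__(self, config):
        self.config = config


class NeuMF:
    def __init__(self, config):
        self.config = config


def recommender_model(algo_name):
    """Return the model *class* registered under a short name."""
    if algo_name == "mf":
        return MF
    elif algo_name == "neumf":
        return NeuMF
    else:
        raise NotImplementedError(f"Unknown algo_name: {algo_name!r}")


# The caller receives a class and instantiates it with the config dictionary.
config = {"algo_name": "neumf", "factors": 24}
model = recommender_model(config["algo_name"])(config)
print(type(model).__name__)
```

Returning the class rather than an instance lets the caller decide when to build the model and which `config` to pass in.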

### Step 3 - Loading data into the model

The dataset is loaded in test.py and tune.py using:

```
# Train/test split
splitter = TestSplitter(config)
train_index, test_index = splitter.split(df)
train_set, test_set = df.iloc[train_index, :].copy(), df.iloc[test_index, :].copy()
```

`train_set` is a portion of the full data, stored in a Pandas DataFrame with columns for user IDs, item IDs, ratings and timestamps. User IDs and item IDs are numbered from 0 and do not necessarily appear in order. For example:

| User ID | Item ID | Rating | Timestamp |
|---------|---------|--------|-----------|
|   133 |   2023 |   4.5  |  1624165321  |
|   0 |   345 |   3.8  |  1624165487  |
|   210 |   293 |   5.0  |  1624165632  |

Some datasets have explicit feedback (i.e., ratings range from 0.0/5 to 5.0/5), whereas others only have implicit feedback (i.e., only the existence of an interaction is captured, and the rating is hard-set to 1.0/5). Inspect the dataset you want to use beforehand.

For negative sampling, we will use `BasicNegtiveSampler`. If you need special processing, feel free to create your own custom sampler, or explore `AEDataset` and `SkipGramNegativeSampler` to see whether these classes perform the processing that you are looking for.

`BasicNegtiveSampler` finds `num_ng` negative samples for each row, where `num_ng` is specified by you in the run configuration (default 4) and stored in the `config` object. This means that, for every row, it finds `num_ng` unique items that the user has _not_ interacted with. It then adds these items to a new column called "neg_set":

| User ID | Item ID | Rating | Timestamp | neg_set (assuming num_ng=4) |
|---------|---------|--------|-----------|-----------------------------|
|   133 |   2023 |   4.5  |  1624165321  | [51,135,246,78] |
|   0 |   345 |   3.8  |  1624165487  | [564,346,163,3] |
|   210 |   293 |   5.0  |  1624165632  | [1625,256,452,254] |
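
To illustrate the idea (this is not the library's actual implementation), the following sketch draws `num_ng` unique negative items per row using only the standard library. The function name `sample_negatives`, the `item_num` argument and the toy interactions are assumptions made for this example.

```python
import random


def sample_negatives(rows, item_num, num_ng, seed=0):
    """For each (user, item) row, draw num_ng unique items the user never interacted with.

    Illustrative only: this is not the library sampler's implementation.
    """
    rng = random.Random(seed)
    # Collect every item that each user has interacted with.
    seen = {}
    for user, item in rows:
        seen.setdefault(user, set()).add(item)

    neg_sets = []
    for user, _ in rows:
        candidates = [i for i in range(item_num) if i not in seen[user]]
        # random.sample draws without replacement, so the items are unique.
        neg_sets.append(rng.sample(candidates, num_ng))
    return neg_sets


rows = [(0, 3), (0, 5), (1, 2)]  # toy (user ID, item ID) interactions
neg_sets = sample_negatives(rows, item_num=10, num_ng=4)
for (user, item), negs in zip(rows, neg_sets):
    print(user, item, negs)
```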

Most methods (especially neural methods) need the data converted into a PyTorch DataLoader. In this case, we do:

```
sampler = BasicNegativeSampler(train_set, config)
train_samples = sampler.sampling() # This returns a numpy array or pandas df
train_dataset = BasicDataset(train_samples) # Converts pd.df/np.ndarray to simple pytorch Dataset
train_loader = get_dataloader(
    train_dataset, 
    batch_size=config['batch_size'], 
    shuffle=True, 
    num_workers=4) # Convert torch Dataset to torch DataLoader
```

Now, `train_loader` is the data loader that will be passed to `model.fit()`, which is explained shortly.


### Step 4 - Incorporating into tune.py and test.py

We now need to put the model name into the model builder in both tune.py and test.py. 

1. In each file, search for "if config['algo_name'].lower() in" in your code editor. You should come across the following code block in both files:
    
    <img src='./images/new_algo/testpy.png' alt="image in test.py" width="700"/>
    

2. This code block basically:
    - Builds the model that you want using `RecommenderModel()`. This function takes the model name string and returns the model class, which in turn takes the `config` dictionary to instantiate a model object.
    - Pre-processes the raw `train_set` data into the format that your model needs. This includes performing negative sampling and converting the raw pandas DataFrame or NumPy array into a torch DataLoader, `train_loader`.
    - Fits your model to the training data using `model.fit(train_loader)`.


3. Notice that each algorithm's shorthand name is in an array corresponding to the settings needed for building and fitting that model. As shown in the blue circle, we **add our model name 'neumf'** to the array corresponding to the data processing that we need to fit our model. If you need custom processing, feel free to create your own `elif` block.

### Step 5 - Understanding the `AbstractRecommender` and `GeneralRecommender` classes in AbstractRecommender.py  

#### <u>AbstractRecommender</u>

This should be treated as a classic OOP abstract class and the parent of all recommenders, outlining all methods needed for recommender systems. Note that this class is already a child of torch.nn.Module, so it already has all the built-in methods needed for deep learning in PyTorch.

The following are brief descriptions of the model methods. Override them in child classes as necessary.

##### Private Methods 

- `_init_weight(self, m)`: Initialises the model weights of a layer in a neural network
- `_build_criterion(self, loss_type)`: Identifies what type of loss function the model should be trained on. Takes a string called 'loss_type' and outputs the loss function
- `_build_optimiser(self, **kwargs)`: Builds a gradient descent optimiser from the self.config and returns it

Use the `self.config` dictionary for all the configuration and hyperparameter information needed for the model. It is initialised at the top of tune.py/test.py and subsequently stored in the model. We recommend printing `self.config` once to understand what the dictionary contains.

##### Abstract (unimplemented) Methods - Must implement downstream

- `calc_loss(self, batch)`: Calculate the loss from the loss function
- `fit(self, train_loader)`: Train the model using the training data loader
- `predict(self, u, i)`: Run the model (for neural methods, do one forward pass) to obtain the score/probability of a given item for a given user
- `rank(self, test_loader)`: Run the trained model to rank all candidate items in the test data loader according to score/probability, and return the top k
- `full_rank(self, u)`: Run the trained model to rank all items in the dataset for a given user
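
To make the contract concrete, here is a minimal sketch of a child class that fills in these five methods. It uses a plain Python stand-in for the base class so that it runs without PyTorch or DaisyRec. The popularity-count "model", the `topk` config key and the data shapes (batches of `(user, item)` pairs for training, batches of `(user, candidates)` rows for testing) are assumptions chosen for brevity, not the library's actual formats.

```python
from collections import Counter


class AbstractRecommenderSketch:
    """Plain-Python stand-in for the abstract interface described above."""

    def __init__(self, config):
        self.config = config

    def calc_loss(self, batch):
        raise NotImplementedError

    def fit(self, train_loader):
        raise NotImplementedError

    def predict(self, u, i):
        raise NotImplementedError

    def rank(self, test_loader):
        raise NotImplementedError

    def full_rank(self, u):
        raise NotImplementedError


class PopularitySketch(AbstractRecommenderSketch):
    """Toy model: an item's score is how often it appears in the training data."""

    def calc_loss(self, batch):
        return 0.0  # nothing to optimise in this toy model

    def fit(self, train_loader):
        self.counts = Counter(item for batch in train_loader for _, item in batch)

    def predict(self, u, i):
        return self.counts[i]  # unseen items score 0

    def rank(self, test_loader):
        k = self.config["topk"]
        ranked = []
        for batch in test_loader:
            for user, candidates in batch:
                best = sorted(candidates, key=lambda i: -self.predict(user, i))
                ranked.append(best[:k])  # one row per user, in input order
        return ranked

    def full_rank(self, u):
        return [item for item, _ in self.counts.most_common(self.config["topk"])]


train_loader = [[(0, 7), (1, 7)], [(2, 3), (0, 3), (1, 7), (2, 5)]]
model = PopularitySketch({"topk": 2})
model.fit(train_loader)
print(model.rank([[(0, [5, 3, 7]), (1, [9, 5, 3])]]))
print(model.full_rank(0))
```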

#### <u>GeneralRecommender</u>

This is a child class of AbstractRecommender, and is a parent of all gradient descent-based recommender models. This class:

1. Attaches the GPU in `__init__()`.
2. Implements the `self.fit()` method using gradient descent. Feel free to override the `fit()` method in child classes.

Thus, any model utilising gradient descent (which is almost all models in our library) should be a child of GeneralRecommender.



### Step 6 - Designing the `model.rank()`

The `rank()` function is the most important function in your model, as it produces the top-k predicted items for each user, **which will be used for evaluation metrics**. Thus, how well your model scores depends on your rank function.

#### Function signature
```
    def rank(self, test_loader: torch.utils.data.DataLoader) -> np.ndarray:
```

#### Function input: `test_loader`

By convention for PyTorch DataLoaders, the data in `test_loader` is served in batches (default batch size 128). The data is two-dimensional, so each batch contains 128 arrays (rows).

Each array corresponds to one user and contains 1. **the user ID** and 2. **another array of negatively-sampled items, called the candidate set** (i.e., the items that need to be scored and ranked for the user).

So each batch has 128 rows of this form. As an example, say row 103 looks like:

Index 103: [ user-id=192, [... 1000 candidate_item_ids ...]]

Now, use your logic to process the data and produce the top-k recommendations for the user, with the highest score first. When processing the data, **do not change the order (indices) of these rows**. This is because, in the evaluation, our library uses the index (rather than the user ID) to identify the user and compare the predictions to the ground truth. So, for row 103, given k=50, we expect:

Index 103: [... 50 predicted_item_ids ...]

#### Function output 

Finally, concatenate these rows for all batches and return the result. As the function signature indicates, you need to return an np.ndarray of shape (u, k), where u is the total number of users in the test set and k is the number of recommended items per user.
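
A minimal sketch of this flow with NumPy is shown below. It assumes each batch arrives as a pair `(users, candidates)` with shapes `(B,)` and `(B, C)`, and it uses a hypothetical `score_fn` in place of the model's forward pass; neither is taken from DaisyRec's code. The point is only the order-preserving top-k selection and the final concatenation.

```python
import numpy as np


def rank_sketch(test_loader, score_fn, k):
    """Return an array of shape (u, k) with each user's top-k candidate item IDs.

    test_loader yields (users, candidates) with shapes (B,) and (B, C); this
    batch layout is an assumption made for the sketch.
    """
    outputs = []
    for users, candidates in test_loader:
        scores = score_fn(users, candidates)           # shape (B, C)
        # Column indices of the k highest scores per row, best first.
        top_idx = np.argsort(-scores, axis=1)[:, :k]   # shape (B, k)
        # Map column indices back to item IDs; the row order is untouched.
        outputs.append(np.take_along_axis(candidates, top_idx, axis=1))
    return np.concatenate(outputs, axis=0)             # shape (u, k)


def toy_score(users, candidates):
    # Hypothetical scorer: prefers items whose ID is close to 10 * user ID.
    return -np.abs(candidates - 10 * users[:, None]).astype(float)


loader = [
    (np.array([1, 2]), np.array([[8, 30, 11, 50], [19, 2, 40, 22]])),
    (np.array([3]), np.array([[5, 29, 33, 32]])),
]
preds = rank_sketch(loader, toy_score, k=2)
print(preds.shape)
print(preds)
```

Because rows are only appended batch by batch and never shuffled, row `n` of the returned array still belongs to the `n`-th user served by the loader.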

Once `rank()` is implemented, run `python test.py` in your command line with --algo_name="your_model" (in this case, "neumf") and other arguments (as shown in the [DaisyRec command generator](https://daisyrec.netlify.app/)) to run the model.

Congratulations! If everything went right, the logger should output the metrics for your model, and you can fairly compare it to other models.

### Step 7 (optional): Tuning your model

If you wish to find better hyperparameters for your model, you can use tune.py. In tune.py, almost all the code in test.py is run multiple times to optimise the preferred metric using optuna.samplers.TPESampler(), which uses the Tree-structured Parzen Estimator (TPE) for hyperparameter optimisation.

Since most of the code is the same, if you have followed all the above steps, no further changes to the code should be needed. Simply run tune.py using --algo_name="your_model" and other arguments from the [DaisyRec command generator](https://daisyrec.netlify.app/).

The logger should output the hyperparameters in the console and create a folder /tune_res/ containing the resulting parameters for every trial.