
A tool that simplifies running experiments with collaborative filtering models

Ilyushin/rec-tool


  1. Install the tool:

     chmod +x ./build_local.sh
     ./build_local.sh

  2. Set up a configuration. An example is provided in config_example.yaml.

  3. Run the experiments:

     rec_tool --config ./config_example.yaml

Config options

Config example

This config starts several experiments with several models (embedding model, MLP, NCF, Matrix Factorization, SVD, etc.) on several datasets (MovieLens, Bookcrossing, Behance, Goodreads). Additional options include evaluating with several metrics at once and running a grid search over batch size, number of epochs and learning rate.

  data:
    input_data:
      clear: true
      movielens:
        use: true
        type: ml-1m
        path: /tmp/rec_tool/dataset/movielens
        transformations: rec_tool.transformations.movielens.prepare_data

Additional datasets can be defined as follows:

      goodreads:
        use: false
        type: goodreads
        transformations: rec_tool.transformations.goodreads.get_goodreads_data
      bookcrossing:
        use: false
        type: bookcrossing
        transformations: rec_tool.transformations.bookcrossing.bookcrossing_converting
      behance:
        use: false
        type: behance
        transformation: rec_tool.transformations.behance.behance_converting
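
As a rough illustration of how the `use` flags select datasets, the sketch below filters a parsed `input_data` mapping down to the enabled entries. The helper name `enabled_datasets` is hypothetical and not part of rec_tool, and the dict is a stand-in for the YAML after parsing; the tool's own loader may work differently.

```python
def enabled_datasets(input_data):
    """Return the names of dataset entries whose `use` flag is true.

    Hypothetical helper for illustration only; not rec_tool's API.
    """
    return [
        name
        for name, options in input_data.items()
        # Keys such as `clear` hold plain values rather than mappings, so skip them.
        if isinstance(options, dict) and options.get("use", False)
    ]


# Stand-in for the parsed `input_data` section of the config.
input_data = {
    "clear": True,
    "movielens": {"use": True, "type": "ml-1m"},
    "goodreads": {"use": False, "type": "goodreads"},
    "bookcrossing": {"use": False, "type": "bookcrossing"},
    "behance": {"use": False, "type": "behance"},
}

print(enabled_datasets(input_data))  # ['movielens']
```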
        

Several models can be run in one experiment:

  model:
    model: [
      rec_tool.models.embedding.embedding_model,
      rec_tool.models.mlp.mlp,
      rec_tool.models.ncf.ncf_model,
      rec_tool.models.mf.mf,
      rec_tool.models.svd.svd,
    ]
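
Models, losses, metrics and transformations are all referenced in the config by dotted paths. One common way to turn such a string into a callable is `importlib`; the sketch below shows that technique. Whether rec_tool resolves paths this way is an assumption, and `resolve_dotted_path` is a hypothetical name.

```python
import importlib


def resolve_dotted_path(path):
    """Import `package.module.attr` and return the named attribute.

    Hypothetical helper; rec_tool's actual resolution logic may differ.
    """
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted path, got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Demonstrated with a standard-library function so the sketch runs
# without rec_tool installed.
sqrt = resolve_dotted_path("math.sqrt")
print(sqrt(16.0))  # 4.0
```

With rec_tool installed, the same call would presumably accept a path such as `rec_tool.models.ncf.ncf_model`.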

Several metrics can be defined for model evaluation as well:

    loss: rec_tool.losses.mean_squared_error
    metrics: [
      rec_tool.metrics.accuracy,
      rec_tool.metrics.rmse,
      rec_tool.metrics.mae
    ]
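
For reference, the sketch below computes RMSE and MAE from their standard definitions in plain Python. It is not the code behind `rec_tool.metrics.rmse` or `rec_tool.metrics.mae`, which may differ, and the sample ratings are made up for the example.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared difference."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute differences."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Made-up ratings and predictions, for illustration only.
ratings = [4.0, 3.0, 5.0, 1.0]
predicted = [3.5, 3.0, 4.0, 2.0]

print(rmse(ratings, predicted))  # 0.75
print(mae(ratings, predicted))   # 0.625
```

RMSE penalizes large errors more heavily than MAE, which is why the two are often reported together.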

To run a grid search over batch_size, epoch and learning_rate, set grid_search to True and give the values to try for each parameter as a list:

    batch_size: [1024, 2048, 4096]
    epoch: [50, 100, 200]
    optimizers: adam
    grid_search: True
    learning_rate: 0.01
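
To make the size of such a search concrete, the sketch below expands the lists into individual runs with `itertools.product`. The function `expand_grid` is hypothetical, and treating a scalar value (like `learning_rate` here) as a single fixed choice is an assumption about how the tool behaves.

```python
from itertools import product


def expand_grid(params):
    """Yield one dict per parameter combination.

    Scalar values are wrapped in a one-element list, i.e. held fixed
    (an assumption, not confirmed rec_tool behaviour).
    """
    keys = list(params)
    value_lists = [v if isinstance(v, list) else [v] for v in params.values()]
    for combo in product(*value_lists):
        yield dict(zip(keys, combo))


params = {
    "batch_size": [1024, 2048, 4096],
    "epoch": [50, 100, 200],
    "learning_rate": 0.01,
}

runs = list(expand_grid(params))
print(len(runs))  # 9
print(runs[0])    # {'batch_size': 1024, 'epoch': 50, 'learning_rate': 0.01}
```

Under these assumptions, three batch sizes and three epoch counts give nine training runs per model and dataset, so the total grows quickly as more models are listed.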

To save the model, define a directory for it. Results can also be written to a CSV file.

  result:
    model: /tmp/rec_tool/model/
    log: /tmp/rec_tool/log/
    results_csv: run_results.csv
    log_to_ml_flow: True
    clear: true

Example with one model, the Goodreads dataset, one batch_size value and one epoch value:

config:
  data:
    input_data:
      clear: true
      movielens:
        use: false
        type: ml-1m
        path: /tmp/rec_tool/dataset/movielens
        transformations: rec_tool.transformations.movielens.prepare_data
      goodreads:
        use: true
        type: goodreads
        transformations: rec_tool.transformations.goodreads.get_goodreads_data
      bookcrossing:
        use: false
        type: bookcrossing
        transformations: rec_tool.transformations.bookcrossing.bookcrossing_converting
      behance:
        use: false
        type: behance
        transformation: rec_tool.transformations.behance.behance_converting
  model:
    model: [rec_tool.models.ncf.ncf_model]
    loss: rec_tool.losses.mean_squared_error
    metrics: [
      rec_tool.metrics.accuracy,
      rec_tool.metrics.rmse,
      rec_tool.metrics.mae
    ]
    batch_size: [1024]
    epoch: [50]
    optimizers: adam
    grid_search: True
    learning_rate: 0.01


  result:
    model: /tmp/rec_tool/model/
    log: /tmp/rec_tool/log/
    results_csv: run_results.csv
    log_to_ml_flow: True
    clear: true

Collaborative Filtering models description

| Model and paper | Examples |
| --- | --- |
| Variational Autoencoder for Collaborative Filtering (VAECF) | vae.py |
| Singular Value Decomposition (SVD) | svd.py |
| Matrix Factorization (MF) | mf.py |
| Multi-Layer Perceptron (MLP) | mlp.py |
| Neural Matrix Factorization (NeuMF) / Neural Collaborative Filtering (NCF) | ncf.py |
| Bayesian Personalized Ranking (BPR) | bpr.py |
| Weighted Matrix Factorization (WMF) | wmf.py |
| SVD++ | svdpp.py |
