UniRec

Introduction

UniRec is an easy-to-use, lightweight, and scalable implementation of recommender systems. Its primary objective is to enable users to swiftly construct a comprehensive ecosystem of recommenders using a minimal set of robust and practical recommendation models. These models are designed to deliver scalable and competitive performance, encompassing a majority of real-world recommendation scenarios.

It is important to note that this goal differs from those of other well-known public libraries, such as Recommender and RecBole, which include missions of providing an extensive range of recommendation algorithms or offering various datasets.

The term "Uni-" carries several implications:

Unit: Our aim is to employ a minimal set of models to facilitate the recommendation service onboarding process across most real-world scenarios. By maintaining a lightweight and extensible architecture, users can effortlessly modify and incorporate customized models into UniRec, catering to their specific future requirements.
United: In contrast to the Natural Language Processing (NLP) domain, it is challenging to rely on a single model to serve end-to-end business applications in recommender systems. It is desirable that various modules or stages (such as retrieval and ranking) within a recommender system are not isolated and trained independently but are closely interconnected.
Unified: While we acknowledge that model parameters cannot be unified, we believe there is potential to unify model structures. Consequently, we are exploring the possibility of utilizing a unified Transformer structure to serve different modules within recommender systems.
Universal: We aspire for UniRec to support a wide range of recommendation scenarios, including gaming, music, movies, ads, and e-commerce, using a universal data model.

Installation

Installation from PyPI

Ensure that PyTorch with CUDA supported (version 1.10.0-1.13.1) is installed:

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

python -c "import torch; print(torch.__version__)"

Install unirec with pip:
```
pip install unirec
```

Installation from Wheel Locally

Ensure that PyTorch with CUDA supported (version 1.10.0-1.13.1) is installed:

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113

python -c "import torch; print(torch.__version__)"

Clone Git Repo

git clone https://github.com/microsoft/UniRec.git

Build

cd UniRec
pip install --user --upgrade setuptools wheel twine
python setup.py sdist bdist_wheel

After building, the wheel package could be found in UniRec/dist.

Install

pip install dist/unirec-*.whl

The specific package name could be find in UniRec/dist.

Check if unirec is installed sucessfully:

python -c "from unirec.utils import general; print(general.get_local_time_str())"

Algorithms

Algorithm	Type	Paper	Code
MF	Collaborative Filtering	BPR	unirec/model/cf/mf.py
UserCF	Collaborative Filtering	-	unirec/model/cf/usercf.py
SLIM	Collaborative Filtering	SLIM	unirec/model/cf/slim.py
AdmmSLIM	Collaborative Filtering	ADMMSLIM	unirec/model/cf/admmslim.py
SAR	Collaborative Filtering	ItemCF, SAR	unirec/model/cf/sar.py
EASE	Collaborative Filtering	EASE	unirec/model/cf/ease.py
MultiVAE	Collaborative Filtering	MultiVAE	unirec/model/cf/multivae.py
SVDPlusPlus	Sequential Model	SVD++	unirec/model/sequential/svdplusplus.py
AvgHist	Sequential Model	-	unirec/model/sequential/avghist.py
AttHist	Sequential Model	-	unirec/model/sequential/atthist.py
GRU	Sequential Model	-	unirec/model/sequential/gru.py
SASRec	Sequential Model	SASRec	unirec/model/sequential/sasrec.py
ConvFormer	Sequential Model	ConvFormer	unirec/model/sequential/convformer.py
FastConvFormer	Sequential Model	ConvFormer	unirec/model/sequential/fastconvformer.py
FM	Ranking Model	Factorization Machine	unirec/model/rank/fm.py
BST	Ranking Model	Behavior sequence transformer	unirec/model/rank/bst.py
MoRec	Multi-objective	MoRec	unirec/facility/morec

Examples

To go through all the examples listed below, we provide a script for downloading and split for ml-100k dataset. Run:

python download_split_ml100k.py

The files for the raw dataset would be saved in your home dir: ~/.unirec/dataset/ml-100k

Next, it is essential to convert the raw dataset into a format compatible with UniRec. Use the script to process and save the files in UniRec/data/ml-100k.

cd examples/preprocess
bash preprocess_ml100k.sh

General Training

To train an existing model in UniRec, for instance, training SASRec with ml-100k dataset, refer to the script provided in examples/training/train_ml100k.sh.

Multi-GPU Training

UniRec supports multi-GPU training with the integration of Accelerate. An example script is available at examples/training/multi_gpu_train_ml100k.sh. The key arguments in the script could be found in line 3-12 in the script:

GPU_INDICES="0,1" # e.g. "0,1"

# Specify the number of nodes to use (one node may have multiple GPUs)
NUM_NODES=1

# Specify the number of processes in each node (the number should equal the number of GPU_INDICES)
NPROC_PER_NODE=2

For more details about the launching command, please refer to Accelerate Docs.

Hyperparameter Tuning with wandb

UniRec supports hyperparameter tuning (or hyperparameter optimization, HPO) with the intergration of WandB. There are three major steps to start a wandb experiment.

Compose a training script and enable wandb. An example is provided in examples/training/train_ml100k_with_wandb.sh. The key arguments are:
- --use_wandb=1: enable wandb in process
- --wandb_file=/path/to/configuration_file: the configuration file for wandb, including command, metrics, method, and search space.
Define sweep configuration. Write a YAML-format configuration file to set the command, monitor metrics, tuning method and search space.An example is available at examples/training/wandb.yaml. For more details about the configuration file, refer to WandB Docs
Initialize sweeps and start sweep agents. To start an experiment with wandb, first, initialize a sweep controller for selecting hyperparameters and issuing intructions; then an agent would actually perform the runs. An example for launching wandb experiments is provided in examples/training/wandb_start.sh. Note that we offer a pipeline command in the script to start the agent automatically after sweep initialization. However, we recommend the simpler manual two-step process:

## Step 1. Initialize sweeps with CLI using configuration file. 
## For more details, please refer to https://docs.wandb.ai/guides/sweeps/initialize-sweeps

wandb sweep config.yaml

## Step 2. After `wandb sweep`, you would get a sweep id and the hint to use `sweep agent`, like:

## wandb: Creating sweep from: ./wandb.yaml
## wandb: Created sweep with ID: xxx
## wandb: View sweep at: https://wandb.ai/xxx/xxx/xxx/xxx
## wandb: Run sweep agent with: wandb agent xxx/xxx/xxx/xxx

wandb agent entity/project/sweep_ID

Serving with C# and Java

UniRec supports C# and Java inference based on ONNX format. We provide inference for user embedding, item embedding, and user-item score.

For more details, please refer to examples/serving/README

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
examples		examples
tests		tests
unirec		unirec
.gitignore		.gitignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
LICENSE.txt		LICENSE.txt
MANIFEST.in		MANIFEST.in
README.md		README.md
SECURITY.md		SECURITY.md
SUPPORT.md		SUPPORT.md
requirements.unirec.yaml		requirements.unirec.yaml
setup.py		setup.py

License

microsoft/UniRec

Folders and files

Latest commit

History

Repository files navigation

UniRec

Introduction

Installation

Installation from PyPI

Installation from Wheel Locally

Algorithms

Examples

General Training

Multi-GPU Training

Hyperparameter Tuning with wandb

Serving with C# and Java

Contributing

Trademarks

About

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Languages