ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation
This is the pytorch implementation of ReLLa proposed in the paper ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation.
In this repo, we implement ReLLa with `transformers==4.28.1`. We also provide a newer implementation based on `transformers==4.35.2` in this repo.
pip install -r requirments.txt
You can directly use the processed data from this link (including data without and with retrieval: the full test set and a sampled training set, with history lengths of 30/30/60 for ML-1M/ML-25M/BookCrossing, respectively).
Alternatively, you can preprocess the data yourself. Scripts for preprocessing BookCrossing, MovieLens-1M, and MovieLens-25M are included in `data_preprocess`.
Get semantic item embeddings for retrieval.
python get_semantic_embed.py --model_path XXX --data_set BookCrossing/ml-1m/ml-25m --pooling average
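The script encodes each item's textual description with the model given by `--model_path` and pools the token embeddings into a single item vector. Below is a minimal sketch of the average-pooling variant, assuming a Hugging Face checkpoint; the placeholder checkpoint, the `encode` helper, and the example text are illustrative, not the script's actual API.

```python
# Minimal sketch: masked average pooling of token embeddings into an item vector.
import torch
from transformers import AutoTokenizer, AutoModel

model_path = "bert-base-uncased"  # placeholder; pass your own --model_path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModel.from_pretrained(model_path)
model.eval()

def encode(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state            # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()     # (B, T, 1)
    # average pooling: sum valid token embeddings, divide by number of valid tokens
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)      # (B, H)

item_emb = encode(["Title: Toy Story (1995), Genre: Animation"])  # illustrative item text
```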
Then retrieve the top-K relevant items with the per-dataset scripts below (a cosine-similarity sketch follows the list):
- BookCrossing
python topK_relevant_BookCrossing.py
- MovieLens-1M
python topK_relevant_ml1m.py
- MovieLens-25M
python topK_relevant_ml25m.py
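A minimal sketch of the retrieval idea, assuming the semantic item embeddings from the previous step: for each target item, rank the user's historical items by cosine similarity and keep the top K. Variable names are illustrative; the per-dataset scripts above may organize this differently.

```python
import numpy as np

def topk_relevant(target_emb, history_embs, k=10):
    """Return indices of the k history items most similar to the target item."""
    target = target_emb / np.linalg.norm(target_emb)
    history = history_embs / np.linalg.norm(history_embs, axis=1, keepdims=True)
    sims = history @ target                  # cosine similarity per history item
    return np.argsort(-sims)[:k]             # indices of the top-k most similar items

# usage (hypothetical ids): idx = topk_relevant(item_emb[target_id], item_emb[history_ids], k=10)
```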
Convert the processed data into JSON format for the LLM:
python data2json.py --K 10 --temp_type simple --set test --dataset ml-1m
Demo processed data is under ./data/ml-1m/proc_data/data/test/test_5_simple.json
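A quick sanity check on the demo file, assuming it holds a JSON array of samples (the field names inside each sample depend on `data2json.py` and are not assumed here):

```python
import json

# path taken from the demo note above
with open("./data/ml-1m/proc_data/data/test/test_5_simple.json") as f:
    samples = json.load(f)

print(len(samples), "samples")
print(samples[0])   # inspect the prompt/label fields of one sample
```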
This step samples training data from the whole training set and constructs a mixture dataset of both the original and the retrieval-enhanced data.
python training_set_construction.py --K 5
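A minimal sketch of the mixing idea: take a sample of training instances and include each one in both its original and its retrieval-enhanced form. The file names, the sample size, and the 1:1 mixing ratio are assumptions for illustration and may differ from `training_set_construction.py`.

```python
import json
import random

K, train_size = 5, 64
with open("train_original.json") as f:          # hypothetical file name
    original = json.load(f)
with open(f"train_retrieval_{K}.json") as f:    # hypothetical file name
    retrieval = json.load(f)

random.seed(42)
sampled = random.sample(range(len(original)), train_size)
# mix each sampled instance in both its original and retrieval-enhanced form
mixed = [original[i] for i in sampled] + [retrieval[i] for i in sampled]
random.shuffle(mixed)

with open(f"train_{train_size}_mixed_{K}.json", "w") as f:
    json.dump(mixed, f, indent=2)
```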
You should provide the model path in the scripts.
python scripts/script_inference.py --K 5 --dataset ml-1m --temp_type simple
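A minimal sketch of zero-shot scoring with a causal LLM: compare the model's next-token probabilities for "Yes" vs "No" after the prompt. The placeholder model path and the scoring rule are assumptions; `scripts/script_inference.py` may implement this differently.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "lmsys/vicuna-7b-v1.3"  # placeholder; use the model path set in the scripts
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, device_map="auto")
model.eval()

def yes_probability(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]               # next-token logits
    yes_id = tokenizer("Yes", add_special_tokens=False).input_ids[0]
    no_id = tokenizer("No", add_special_tokens=False).input_ids[0]
    probs = torch.softmax(logits[[yes_id, no_id]], dim=-1)   # renormalize over {Yes, No}
    return probs[0].item()                                    # P(Yes | prompt)
```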
python scripts/script_finetune.py --dataset ml-1m --K 5 --train_size 64 --train_type simple --test_type simple --epochs 5 --lr 1e-3 --total_batch_size 64
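A minimal sketch of LoRA fine-tuning with `peft`, reflecting the hyperparameters in the command above (5 epochs, lr 1e-3, total batch size 64). The placeholder model path, LoRA rank, and target modules are assumptions and may differ from `scripts/script_finetune.py`.

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

model_path = "lmsys/vicuna-7b-v1.3"  # placeholder; use the model path set in the scripts
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, device_map="auto")

lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # assumed attention projections for LLaMA-style models
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

training_args = TrainingArguments(
    output_dir="./rella_lora",
    num_train_epochs=5,
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=8,         # 8 x 8 = 64 effective (total) batch size
    fp16=True,
    logging_steps=1,
)
# A Trainer would then be constructed with these arguments and the tokenized mixed training set.
```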