SCPR

Paper: Interactive Path Reasoning on Graph for Conversational Recommendation (KDD 2020).

The Conversational Path Reasoning (CPR) framework introduces a graph to address the multi-round conversational recommendation problem. It tackles both what item to recommend and what attribute to ask through message propagation on the graph.

Please kindly cite our paper if you use our code/dataset!

@inproceedings{lei2020interactive,
  title={Interactive Path Reasoning on Graph for Conversational Recommendation},
  author={Lei, Wenqiang and Zhang, Gangyi and He, Xiangnan and Miao, Yisong and Wang, Xiang and Chen, Liang and Chua, Tat-Seng},
  booktitle={Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  pages={2073--2083},
  year={2020}
}
Shortcuts:

Code:

https://github.com/farrecall/SCPR

Data (Latest Version 2021.12): Suitable for SCPR

- Google Drive: https://drive.google.com/file/d/1Xkq5UGuE70P8QIBOWSmVtxDImnOkCoZA/view?usp=sharing
- Tencent Weiyun: https://share.weiyun.com/ctZX2rnq

Data (Early Release Version 2020.9): Suitable for SCPR_code_v2020, UNICORN, MCMIPL

- Google Drive: https://drive.google.com/file/d/1uIgF7hHAjjK3a48G43UJGYI14RMcd4fV/view?usp=sharing
- Tencent Weiyun: https://share.weiyun.com/SWYnQi8z

The latest data version (2021.12) contains essentially the same data as the earlier version (2020.9), but its file organization is simpler to understand.


This is our PyTorch implementation of the paper.

Environment Requirement

  • Python >= 3.6
  • Numpy >= 1.12
  • PyTorch >= 1.0
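
To double-check your environment, a quick version check (this snippet is just a convenience, not part of the repository):

import sys
import numpy
import torch

print(sys.version_info >= (3, 6))   # Python >= 3.6
print(numpy.__version__)            # expect >= 1.12
print(torch.__version__)            # expect >= 1.0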

Example to Run the Code

1. Graph Construction

python graph_init.py --data_name <data_name>

<data_name> is one of {LAST_FM, LAST_FM_STAR, YELP, YELP_STAR}

Note:
  • LAST_FM_STAR and YELP_STAR use the original attributes (pruning attributes with frequency < 10) for the binary question scenario.
  • Following the setting of EAR, LAST_FM is designed to evaluate the binary question scenario by merging related attributes into coarse-grained ones, and YELP is designed for enumerated questions by building a 2-layer taxonomy.
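
For example, to build the graph for the binary-question LastFM setting:

python graph_init.py --data_name LAST_FM_STAR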

2. Train FM Embedding

python FM_train.py --data_name <data_name>

More Details:

Use python FM_train.py -h to get more argument setting details.

  -h, --help             show this help message and exit
  -lr <lr>              learning rate
  -flr <flr>            learning rate of feature similarity learning
  -bs <bs>              batch size
  -hs <hs>              hidden size & embedding size
  -dr <dr>              dropout ratio
  -uf <uf>              update feature
  -me <me>              the number of training epochs
  -seed <seed>          random seed
  --data_name <data_name>
                        One of {LAST_FM, LAST_FM_STAR, YELP, YELP_STAR}.
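
For example (the hyperparameter values below are only illustrative, not tuned settings):

python FM_train.py --data_name LAST_FM_STAR -lr 0.01 -bs 64 -me 50 -seed 2020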

3. Train RL Agent & Evaluate

python RL_model.py --data_name <data_name> --fm_epoch <the epoch of FM embedding>

Note:
  • The default fm_epoch is 0, which refers to the FM embedding we trained at that particular FM epoch. To get started quickly, you can use this preset FM embedding for RL training; it can be found in tmp/<data_name>/FM-model-embeds.
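
For example, to train and evaluate on LAST_FM_STAR with the preset FM embedding:

python RL_model.py --data_name LAST_FM_STAR --fm_epoch 0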

More Details:

Use python RL_model.py -h to get more argument setting details.

  -h, --help            show this help message and exit
  --seed <seed>         random seed.
  --epochs <epochs>     the number of RL train epoch.
  --fm_epoch <fm_epoch> the epoch of FM embedding
  --batch_size <batch_size>
                        batch size.
  --gamma <gamma>       reward discount factor.
  --lr <lr>             learning rate.
  --hidden <hidden>     hidden size
  --memory_size <memory_size>
                        the size of memory
  --data_name <data_name>
                        One of {LAST_FM*, LAST_FM, YELP*, YELP}.
  --entropy_method <entropy_method>
                        entropy_method is one of {entropy, weight entropy}
  --max_turn <max_turn>
                        max conversation turn
  --ask_num <ask_num>   the number of attributes to ask
  --observe_num <observe_num>
                        the number of epochs between saving the RL model and metrics
  --target_update <target_update>
                        the number of epochs between updates of the target network parameters
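
The arguments above map onto a fairly standard deep Q-learning setup: transitions are stored in a replay memory of size memory_size, TD targets are discounted by gamma, and a target network is refreshed every target_update epochs. The sketch below is only our own minimal illustration of that pattern using random placeholder transitions; it is not the agent defined in RL_model.py, and all values are illustrative.

import random
from collections import deque

import torch
import torch.nn as nn

# Illustrative hyperparameters mirroring the flags above.
gamma, memory_size, target_update, hidden = 0.999, 5000, 20, 64
state_dim, n_actions, batch_size = 16, 10, 32

q_net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_actions))
target_net = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_actions))
target_net.load_state_dict(q_net.state_dict())       # start the target net in sync
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
memory = deque(maxlen=memory_size)                    # replay memory (--memory_size)

for epoch in range(100):
    # Placeholder transition; the real agent collects these from the conversation environment.
    s, a = torch.randn(state_dim), random.randrange(n_actions)
    r, s_next, done = random.random(), torch.randn(state_dim), random.random() < 0.1
    memory.append((s, a, r, s_next, done))

    if len(memory) >= batch_size:
        batch = random.sample(list(memory), batch_size)
        s_b = torch.stack([t[0] for t in batch])
        a_b = torch.tensor([t[1] for t in batch])
        r_b = torch.tensor([t[2] for t in batch])
        s2_b = torch.stack([t[3] for t in batch])
        d_b = torch.tensor([float(t[4]) for t in batch])
        q = q_net(s_b).gather(1, a_b.unsqueeze(1)).squeeze(1)
        with torch.no_grad():                         # TD target discounted by gamma (--gamma)
            target = r_b + gamma * (1 - d_b) * target_net(s2_b).max(1).values
        loss = nn.functional.mse_loss(q, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    if (epoch + 1) % target_update == 0:              # periodic sync (--target_update)
        target_net.load_state_dict(q_net.state_dict())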

Dataset

We provide two processed datasets: Last-FM, Yelp.

  • You can find the full versions of the recommendation datasets via Last-FM and Yelp.
  • Here we list the relation types in the different datasets to help readers better understand them.
Dataset                        LastFM*     Yelp*
User-Item      #Users          1,801       27,675
Interaction    #Items          7,432       70,311
               #Interactions   76,693      1,368,606
               #Attributes     33          29
Graph          #Entities       9,266       98,605
               #Relations      4           3
               #Triplets       138,217     2,884,567

Relation       Description         Number of Relations
                                   LastFM*     Yelp*
Interact       user---item         76,696      1,368,606
Friend         user---user         23,958      688,209
Like           user---attribute    7,276       *
Belong_to      item---attribute    30,290      350,175

Dataset                        LastFM      Yelp
User-Item      #Users          1,801       27,675
Interaction    #Items          7,432       70,311
               #Interactions   76,693      1,368,606
               #Attributes     8,438       590
Graph          #Entities       17,671      98,576
               #Relations      4           3
               #Triplets       228,217     2,533,827

Relation       Description         Number of Relations
                                   LastFM      Yelp
Interact       user---item         76,696      1,368,606
Friend         user---user         23,958      688,209
Like           user---attribute    33,120      *
Belong_to      item---attribute    94,446      477,012

Data Description

1. Graph Generate Data

  • user_item.json

    • Interaction file.
    • A dictionary of key-value pairs. The key and the value of a dictionary entry: [userID : a list of itemID].
  • tag_map.json

    • Map file.
    • A dictionary of key value pairs. The key and the value of a dictionary entry: [Real attributeID : attributeID].
  • user_dict.json

    • User file.
    • A dictionary of key-value pairs. The key is userID and the value is a nested dict: {"friends" : a list of userID, "like" : attributeID}.
  • item_dict.json

    • Item file.
    • A dictionary of key-value pairs. The key is itemID and the value is a nested dict: {"attribute_index" : a list of attributeID}.
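
To make the file layout concrete, here is a minimal loading sketch of our own (it is not the repository's graph_init.py; the data_dir path and the assumption that the "friends", "like", and "attribute_index" fields are lists are ours):

import json
import os

data_dir = "."   # hypothetical path: point this at the directory holding the JSON files

def load(name):
    with open(os.path.join(data_dir, name)) as f:
        return json.load(f)

user_item = load("user_item.json")    # userID -> list of itemID
user_dict = load("user_dict.json")    # userID -> {"friends": [...], "like": [...]}
item_dict = load("item_dict.json")    # itemID -> {"attribute_index": [...]}

triplets = []                         # (relation, head, tail), matching the relation table above
for u, items in user_item.items():
    triplets += [("interact", u, i) for i in items]                              # user---item
for u, info in user_dict.items():
    triplets += [("friend", u, v) for v in info.get("friends", [])]              # user---user
    triplets += [("like", u, a) for a in info.get("like", [])]                   # user---attribute
for i, info in item_dict.items():
    triplets += [("belong_to", i, a) for a in info.get("attribute_index", [])]   # item---attribute

print(len(triplets), "triplets")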

2. FM Sample Data

For the process of generating FM train data, please refer to Appendix B.2 of the paper.
  • sample_fm_data.pkl
    • The pickle file consists of five lists; the same index in each list forms one training tuple (user_id, item_id, neg_item, cand_neg_item, prefer_attributes).

user_pickle = pickle_file[0]           user id
item_p_pickle = pickle_file[1]         item id the user has interacted with
i_neg1_pickle = pickle_file[2]         negative item id the user has not interacted with
i_neg2_pickle = pickle_file[3]         negative item id from the candidate item set that the user has not interacted with
preference_pickle = pickle_file[4]     the user's preferred attributes in the current turn
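
The lists can be read back and re-paired like this (the file path is an assumption; adjust it to wherever your copy of sample_fm_data.pkl lives):

import pickle

with open("sample_fm_data.pkl", "rb") as f:    # path is an assumption
    pickle_file = pickle.load(f)

users, pos_items, neg_items, cand_neg_items, prefer_attrs = pickle_file

# The i-th element of each list belongs to the same training tuple.
for u, i_pos, i_neg1, i_neg2, attrs in zip(users, pos_items, neg_items,
                                           cand_neg_items, prefer_attrs):
    # e.g. feed (u, i_pos) vs. (u, i_neg1) / (u, i_neg2) into a pairwise FM objective
    pass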

3. UI Interaction Data

  • review_dict.json
    • Items that the user has interacted with
    • Used for generating FM sample data
    • Used for training and testing in RL
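
A hedged loading example (we assume review_dict.json follows the same [userID : a list of itemID] layout as user_item.json; check your copy of the file if it differs):

import json

with open("review_dict.json") as f:    # path is an assumption
    review_dict = json.load(f)

# Assumed layout: userID -> list of itemIDs the user has interacted with
some_user = next(iter(review_dict))
print(some_user, review_dict[some_user][:5])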
