Issue run retail dataset #7

Closed
KylinA1 opened this issue Aug 22, 2022 · 3 comments

KylinA1 commented Aug 22, 2022

Hi, thank you for your quick response.
I was unable to run the code on the retail dataset, although it ran successfully on the ml10m and yelp datasets.
BTW, I had to fix the following minor issues to get the code running:

  • change the folder name from 'Yelp' to 'yelp' (match the folder name in DataHandler_time.py)
  • change line 14 in DataHandler_time.py to 'retail' (instead of 'Tmall')

During the execution of

python labcode_retail.py --data retail --graphSampleN 15000 --reg 1e-1 --save_path model_name

The run fails with an index-out-of-range error. I suspect there is a problem with the dataset.
Please let me know whether you can get it to run.
Would you also mind sharing the original data and the data preprocessing code?

Owner

akaxlh commented Aug 22, 2022

Thank you for your bugfix! I'll check the issue as soon as possible.

Author

KylinA1 commented Sep 6, 2022

> Thank you for your bugfix! I'll check the issue as soon as possible.

Thanks for your response.
I don't understand the following steps in lines 76-78 of labcode_yelp.py:

		padTgtNodes = tf.concat([tgtNodes, tf.constant([maxNum], dtype=tf.int64)], axis=-1)
		attval = tf.reshape(tf.math.segment_sum(padAttval, padTgtNodes), [-1, args.latdim])
		attval = tf.slice(attval, [0, 0], [maxNum, -1])

whereas labcode_ml10m.py uses:

		padTgtNodes = tf.concat([tgtNodes, tf.reshape(maxNum-1, [1])], axis=-1)
		attval = tf.reshape(tf.math.segment_sum(padAttval, padTgtNodes), [-1, args.latdim])

Why do we need to pad the target nodes?

Owner

akaxlh commented Sep 26, 2022

Hi, sorry for the delay.

  1. The padding covers the case where the row with the largest id does not appear among the target nodes. Without it, the resulting embedding matrix would lose those trailing rows.
  2. The error. I don't get the index-out-of-range error here. Do you get this error every time, or only sometimes? If it occurs only occasionally, it may be related to the sub-graph sampling algorithm.
  3. Dataset and processing code. You can email me at aka_xia@foxmail.com, and I will send you the original data and processing code.
  4. Thank you for your bugfix.
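
The padding in point 1 can be illustrated with a minimal sketch. It uses a pure-Python stand-in for `tf.math.segment_sum` on 1-D values, which returns one row per segment id up to the largest id present. The names `max_num`, `tgt_nodes`, and `att_val` are hypothetical, and the sketch assumes that `padAttval` is the attention values with one extra zero entry appended, which the snippets quoted above do not show.

```python
def segment_sum(values, segment_ids):
    """Pure-Python stand-in for tf.math.segment_sum (1-D values, sorted ids).

    The output length is (largest segment id + 1), so ids above the
    largest one present produce no output rows at all.
    """
    if not segment_ids:
        return []
    out = [0.0] * (segment_ids[-1] + 1)
    for value, seg in zip(values, segment_ids):
        out[seg] += value
    return out


max_num = 5                     # hypothetical total number of nodes
tgt_nodes = [0, 0, 2, 3]        # node 4 (the largest id) receives no edges
att_val = [1.0, 2.0, 3.0, 4.0]  # one value per edge

# Without padding, the result has only 4 rows: the row for node 4 is lost.
unpadded = segment_sum(att_val, tgt_nodes)

# yelp-style: pad with id max_num (assumed zero value), then slice to max_num rows.
padded_yelp = segment_sum(att_val + [0.0], tgt_nodes + [max_num])[:max_num]

# ml10m-style: pad with id max_num - 1, so the result already has max_num rows.
padded_ml10m = segment_sum(att_val + [0.0], tgt_nodes + [max_num - 1])

print(len(unpadded), len(padded_yelp), len(padded_ml10m))
```

Under these assumptions both padding variants produce a result with exactly `max_num` rows. The yelp variant adds one extra row and slices it off, while the ml10m variant adds a zero contribution to the last real row.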

@akaxlh akaxlh closed this as completed Oct 11, 2022