Why put test[u][0] into the candidate sequences? #21

Open
BEbillionaireUSD opened this issue Feb 7, 2021 · 6 comments
Comments

@BEbillionaireUSD

In the evaluate function, an item_idx list is built and test[u][0] is put into it.
As I understand it, test[u][0] is the item we want to predict, but this way the model knows it should choose among these candidates, which include the very item we want to predict.
Is this a kind of data leakage? Or did I misunderstand something?

@BEbillionaireUSD
Author

Specifically, I mean this part:

for i in reversed(train[u]):
    seq[idx] = i
    idx -= 1
    if idx == -1: break
rated = set(train[u])
rated.add(0)
item_idx = []  # [test[u][0]]
for _ in range(101):
    t = np.random.randint(1, itemnum + 1)
    while t in rated: t = np.random.randint(1, itemnum + 1)
    item_idx.append(t)

predictions = -model.predict(*[np.array(l) for l in [[u], [seq], item_idx]])

@kang205
Owner

kang205 commented Feb 7, 2021 via email

@BEbillionaireUSD
Author

Thanks for your quick reply!
But there is another function, evaluate_valid(), that calculates the validation recall.
That function does not involve test, yet it still adds the validation item to item_idx.

Here is the relevant part of the function:

rated = set(train[u])
rated.add(0)
item_idx = [valid[u][0]]
for _ in range(100):
    t = np.random.randint(1, itemnum + 1)
    while t in rated: t = np.random.randint(1, itemnum + 1)
    item_idx.append(t)

predictions = -model.predict(sess, [u], [seq], item_idx)
predictions = predictions[0]

My understanding of this phase is: the evaluation code randomly samples 100 candidates from all items (excluding those the user has already interacted with) and adds the item to be predicted to the candidate set. The model then predicts a score for each of these 101 candidates.
The loop still seems a little strange to me.
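To make the procedure described above concrete, here is a minimal, self-contained sketch of sampled evaluation as I read it from the quoted snippets. It is not the repository's code: `sample_candidates`, `rank_of_target`, `hit_and_ndcg`, and `toy_score` are hypothetical names, and `toy_score` is only a stand-in for a real model. The point it tries to illustrate is that the scorer is given candidate ids only and is never told which one is the held-out item; the held-out item is placed at index 0 purely so that its rank can be read off afterwards.

```python
import math
import random


def sample_candidates(target, rated, itemnum, num_neg=100, rng=random):
    """Return [target] followed by num_neg random item ids not in `rated`."""
    candidates = [target]
    for _ in range(num_neg):
        t = rng.randint(1, itemnum)  # inclusive on both ends: 1..itemnum
        while t in rated:
            t = rng.randint(1, itemnum)
        candidates.append(t)
    return candidates


def rank_of_target(scores):
    """0-based rank of scores[0]: how many other candidates score strictly higher."""
    return sum(1 for s in scores[1:] if s > scores[0])


def hit_and_ndcg(rank, k=10):
    """HR@k and NDCG@k for one held-out item at the given 0-based rank."""
    if rank < k:
        return 1.0, 1.0 / math.log2(rank + 2)
    return 0.0, 0.0


def toy_score(item):
    # Hypothetical stand-in for the model. It sees only the candidate id,
    # not the candidate's position in the list.
    return -abs(item - 42)


rng = random.Random(0)
rated = {0, 3, 7, 11}  # items already seen, plus the padding id 0
candidates = sample_candidates(target=40, rated=rated, itemnum=1000, rng=rng)
scores = [toy_score(i) for i in candidates]
rank = rank_of_target(scores)
hr, ndcg = hit_and_ndcg(rank)
```

Under these assumptions, the metric only asks how highly the held-out item ranks against 100 sampled negatives; removing it from the list would leave nothing to rank.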

@kang205
Owner

kang205 commented Feb 7, 2021 via email

@BEbillionaireUSD
Author

Thanks! But what if I want to predict the next item without knowing the real next item? Should item_idx contain all items?

@coolsubbu

Hi,

I would like to know the answer to the same question asked by CherlyLbt.

How do we predict the next item without knowing the real next item?

Thanks
coolsubbu
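
One possible approach to the question above, in line with the suggestion of letting the candidate list contain all items, is to score every item the user has not yet interacted with and keep the top k. The sketch below assumes a per-item scoring function; `recommend_top_k` and `toy_score` are hypothetical names used for illustration, not part of the repository.

```python
import heapq


def recommend_top_k(score_fn, seen, itemnum, k=10):
    """Score every item id in 1..itemnum that is not in `seen`; return the k best."""
    candidates = [i for i in range(1, itemnum + 1) if i not in seen]
    return heapq.nlargest(k, candidates, key=score_fn)


def toy_score(item):
    # Hypothetical stand-in for a trained model's score for a single item.
    return -abs(item - 42)


seen = {41, 42}  # items the user has already interacted with
top3 = recommend_top_k(toy_score, seen, itemnum=100, k=3)
```

With the predict call quoted earlier in this thread, this would presumably amount to passing all unseen item ids as item_idx and sorting the returned scores, though I have not verified that against the repository.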
