Segmentation fault (core dump) at the prediction #254

Closed
phenric opened this issue Jan 20, 2018 · 17 comments

@phenric

phenric commented Jan 20, 2018

I'm using the WARP loss to train my model. Fitting works without any problem, but prediction fails with a segmentation fault (core dumped).

How can I solve it?

Thanks.

I'm using Ubuntu 16.04 with Python 2.7.

@maciejkula
Collaborator

How did you install LightFM? Can you post the inputs you are using and the functions you are calling?

@phenric
Author

phenric commented Jan 21, 2018

Thank you for your responsiveness.
I installed LightFM with pip.
I call `scores = model.predict(user_id, np.arange(n_items), data['items'], data['users'], 2)`,
where data['items'] and data['users'] are float matrices and np.arange(n_items) is an int array.

@maciejkula
Collaborator

data['items'] and data['users'] are dense numpy matrices? Or sparse matrices? If sparse, in what format?

@maciejkula
Collaborator

maciejkula commented Jan 21, 2018

Can you experiment and try to narrow down with what arguments, or under what conditions, the problem happens?

@phenric
Author

phenric commented Jan 21, 2018

Yes, data['items'] and data['users'] are sparse.csr_matrix objects built with:
`import csv
import numpy as np
from scipy import sparse

def csv_to_csr(file_given):
    # Read a ';'-delimited CSV of numbers into a dense float32 array.
    with open(file_given, 'r') as f:
        lines = list(csv.reader(f, delimiter=";"))
    lines = np.array(lines, dtype='float32')
    return sparse.csr_matrix(lines, dtype='float32')`

@maciejkula
Collaborator

What shape is lines?

@phenric
Author

phenric commented Jan 21, 2018

The shape of lines is (100, 8)

@maciejkula
Collaborator

OK, so you have 100 users/items in your dataset? Is it a dataset you can share so that I could reproduce the problem?

@phenric
Author

phenric commented Jan 21, 2018

Yes, that's right. My datasets are not real; I generated them for testing, so I can share them. Do you have an email address where I can send the files?

@maciejkula
Collaborator

Can you please make a gist or a github repo with the full code that I can run?

@phenric
Author

phenric commented Jan 21, 2018

You can find the repo here: https://github.com/phenric/reco
Feel free to make modifications.
Thank you.

@maciejkula
Collaborator

Your code doesn't actually work from line 34 onwards: https://github.com/phenric/reco/blob/master/recommander.py#L34

@phenric
Author

phenric commented Jan 21, 2018

When I comment out those lines, I still get a segmentation fault. Do you have any idea what the issue is? What am I doing wrong?

@maciejkula
Collaborator

I know what the issue is. You're passing a ludicrously large user_id into the predict function. Am I right?

@maciejkula
Collaborator

#256

For future reference, LightFM expects that user and item ids be contiguous and start at zero. This means that if you have 10 users the largest possible user index you should be passing in to predict is 9.
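
As a rough illustration of that requirement, raw ids from a source table can be remapped to contiguous zero-based indices before calling predict. This is only a minimal sketch in plain Python: `build_index` and the sample ids are hypothetical and not part of LightFM.

```python
def build_index(raw_ids):
    """Map raw ids to contiguous indices 0..n-1, in first-seen order."""
    index = {}
    for raw in raw_ids:
        if raw not in index:
            index[raw] = len(index)
    return index


# Hypothetical raw user ids as they might appear in a source table.
raw_user_ids = [90210, 1337, 90210, 42]
user_index = build_index(raw_user_ids)

n_users = len(user_index)        # number of distinct users
internal_id = user_index[1337]   # index to pass to predict instead of 1337
largest_valid_id = n_users - 1   # no user index may exceed this
```

The same mapping would need to be applied consistently when building the interaction and feature matrices, so that row i always refers to the same user.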

@phenric
Author

phenric commented Jan 21, 2018

Thank you very much for your help!
You're right. When the ID is smaller, I no longer get a segmentation fault, but I get `Exception: Number of user feature rows does not equal the number of users` instead. So I suppose it was my error?

@maciejkula
Collaborator

Yes, it was your error, but the library should never segfault. So thank you for the report!
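
Until such input is rejected by the library itself, a caller-side guard can catch out-of-range ids before they reach predict. The sketch below assumes, as the exception message above suggests, that every id must address a row of the corresponding feature matrix; `check_ids_against_rows` is a hypothetical helper, not a LightFM function.

```python
def check_ids_against_rows(ids, n_rows, kind="user"):
    """Raise ValueError unless every id is a valid row index in [0, n_rows)."""
    bad = [i for i in ids if not 0 <= i < n_rows]
    if bad:
        raise ValueError(
            "%s ids %r are outside the valid range 0..%d" % (kind, bad, n_rows - 1)
        )


# A feature matrix with 100 rows, like the (100, 8) one mentioned earlier,
# can only be addressed by ids 0..99.
check_ids_against_rows([0, 57, 99], n_rows=100)

try:
    check_ids_against_rows([100000], n_rows=100)
except ValueError as err:
    message = str(err)
```

Calling the guard with `n_rows` taken from the feature matrix's `shape[0]` turns a potential crash into an ordinary Python exception.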
