ValueError: negative row index found on input #4

dantlz · 2016-07-14T16:37:17Z

When I run the attached input, I get the following error:

    Traceback (most recent call last):
      File "/Users/username/Desktop/Recommendation/Implementation.py", line 206, in <module>
        collaborative_filter(formatted, result)
      File "/Users/username/Desktop/Recommendation/Implementation.py", line 80, in collaborative_filter
        df, plays = read_data(input_filename)
      File "/Users/username/Desktop/Recommendation/Implementation.py", line 25, in read_data
        data['user'].cat.codes.copy())))
      File "/usr/local/lib/python2.7/site-packages/scipy/sparse/coo.py", line 182, in __init__
        self._check()
      File "/usr/local/lib/python2.7/site-packages/scipy/sparse/coo.py", line 240, in _check
        raise ValueError('negative row index found')
    ValueError: negative row index found

From what I can tell, the input is correctly formatted with 3 columns separated by tabs. Thank you for your time!
faulty_input.txt

benfred · 2016-07-14T17:11:36Z

So - it seems like the row '4a81291db77648b0 nan 1' is tripping up the pandas read_table parser.
It interprets the artist there as a floating point NaN instead of a string. pandas assigns missing values a category code of -1, and that negative code is the index scipy rejects.

This looks like it is by design in the pandas.read_table function. Adding 'na_filter=False' to the argument list bypasses the NaN check and should work:

    data = pandas.read_table(filename,
                             usecols=[0, 1, 2],
                             names=['user', 'artist', 'plays'],
                             na_filter=False)
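
To see the effect in isolation, here is a minimal sketch. It assumes a made-up three-row sample: only the middle row comes from the report, and the other users and artists, along with the `load` helper, are hypothetical. It compares the category codes with and without the NaN filter.

```python
import io

import pandas

# Hypothetical tab-separated sample; only the middle row is taken from the report.
sample = (
    "u1\tradiohead\t3\n"
    "4a81291db77648b0\tnan\t1\n"
    "u3\tbjork\t2\n"
)


def load(na_filter):
    """Parse the sample and return the artist column as a categorical."""
    data = pandas.read_table(io.StringIO(sample),
                             usecols=[0, 1, 2],
                             names=['user', 'artist', 'plays'],
                             na_filter=na_filter)
    return data['artist'].astype("category")


# Default parsing: the literal text "nan" becomes a missing value,
# and missing values get the category code -1.
default_codes = load(na_filter=True).cat.codes.tolist()

# With na_filter=False the text stays a string and gets a regular code.
fixed_codes = load(na_filter=False).cat.codes.tolist()

print(default_codes)
print(fixed_codes)
```

The -1 in the first list is presumably what ends up as the negative index passed to the sparse matrix constructor. Note that na_filter=False disables missing-value detection for every column, so it only suits files that are known not to contain genuinely missing entries.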

dantlz · 2016-07-15T08:36:43Z

That completely resolved the issue. Thank you very much for the help!

eliasah · 2017-03-30T12:30:23Z

This issue also affects the code from your distance-metrics project.

benfred · 2017-03-30T18:07:09Z

@eliasah I've added nearest neighbour support to this project recently: #14. It should be better than the code I included with the original blog post: the calculation is parallelized and won't run out of memory if the full similarity matrix is large.
The lastfm example here shows how to use it: https://github.com/benfred/implicit/blob/master/examples/lastfm.py

I'll update that post/code to point here sometime soon.

dantlz closed this as completed Jul 15, 2016

MariosGr mentioned this issue Mar 17, 2018

model fit crushes with more than 2^31 positive interactions #86

Closed
