This repository has been archived by the owner on Mar 10, 2022. It is now read-only.

libgen.py: IndexError('basic_string::substr: __pos (which is 4294967295) > this->size() (which is 45)') #76

Closed
tmplt opened this issue Feb 25, 2019 · 4 comments
Labels
bug core this issue regards the bookwyrm core upstream Something is wrong in a library or dependent code

Comments

@tmplt
Owner

tmplt commented Feb 25, 2019

Querying bookwyrm -t 'mass effect' yields a pybind11::error_already_set during self.bookwyrm.feed(item). While the core::item is constructed, the std::string returned from core::get_string(py::dict &, const char*) apparently triggers this; stepping through the code, the value is successfully indexed from the dict, and both the py::str and the std::string are constructed. The exception appears to be thrown during function return (?).

Originally posted by @tmplt in #75 (comment)

@tmplt
Owner Author

tmplt commented Feb 25, 2019

To be more specific, the item we feed is constructed from the first entry in these search results. The exception is thrown when calling get_string(dict, "title"), which is the first call to get_string() in the nonexact_ts constructor.

@tmplt
Owner Author

tmplt commented Feb 25, 2019

Default-constructing the title field instead yields no exceptions from the subsequent calls to get_string(). Does the string contain something illegal?

@tmplt
Owner Author

tmplt commented Feb 25, 2019

This exception does not occur during item construction. It is thrown somewhere in item::matches(const item&, const unsigned int) const! Going by the error message, somewhere we do a bad substr with pos = 2^32 - 1 on a string of length 45, which correctly yields a std::out_of_range. That exception is then translated to a Python IndexError, which is caught in the module runner function.
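
As a small illustration of that failure mode (a sketch only, not bookwyrm or fuzzywuzzy code; substr_throws is a hypothetical helper), std::string::substr(pos) throws std::out_of_range whenever pos > size(), while pos == size() is still valid and returns an empty string:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: reports whether s.substr(pos) throws std::out_of_range.
bool substr_throws(const std::string &s, std::size_t pos) {
    try {
        (void)s.substr(pos);  // throws only when pos > s.size()
        return false;
    } catch (const std::out_of_range &) {
        return true;
    }
}
```

With a 45-character string and pos = 4294967295, the values from the error message, the helper reports a throw. The exact wording of the message depends on the standard library implementation.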

@tmplt
Owner Author

tmplt commented Feb 25, 2019

The exception is thrown in fuzzywuzzy's fuzz::partial_ratio, at lib/fuzzywuzzy/src/fuzzywuzzy.cpp:46 (the only occurrence of std::string::substr()), where in

size_t long_start = utils::max(0u, block.dpos - block.spos);
size_t long_end   = long_start + shorter.length();
auto long_substr = longer.substr(long_start, long_end);

long_start has been set to 4294967295.
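
A likely explanation, sketched below under assumptions: if block.spos and block.dpos are 32-bit unsigned values and dpos < spos, the subtraction wraps around to 2^32 - 1 before max(0u, ...) is applied, so the clamp to zero never takes effect. The block struct, its field types, and the helper names here are hypothetical stand-ins, and clamped_start is one possible guard rather than the change made in the commit that closed this issue. Note also that the second argument of substr is a character count, not an end position.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical stand-in for the matching block; the field types are assumed.
struct block {
    unsigned int spos;  // match offset in the shorter string
    unsigned int dpos;  // match offset in the longer string
};

// The computation quoted above: when dpos < spos, the unsigned subtraction
// wraps around before std::max ever sees the result.
std::size_t wrapped_start(const block &b) {
    return std::max(0u, b.dpos - b.spos);
}

// One possible guard: compare first, subtract only when it cannot wrap.
std::size_t clamped_start(const block &b) {
    return b.dpos > b.spos ? b.dpos - b.spos : 0u;
}

// substr(pos, count) takes a length as its second argument, so the window
// is requested with the shorter string's length rather than an end index.
std::string window(const std::string &longer, const block &b,
                   std::size_t shorter_len) {
    return longer.substr(clamped_start(b), shorter_len);
}
```

For a block with spos = 1 and dpos = 0, wrapped_start returns the largest unsigned int value (4294967295 where unsigned int is 32 bits wide), while clamped_start returns 0.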

@tmplt tmplt added bug core this issue regards the bookwyrm core upstream Something is wrong in a library or dependent code labels Feb 25, 2019
@tmplt tmplt closed this as completed in 2f4e495 Feb 25, 2019