Full Text Search - Documentation of Limits #11

Open
erichiller opened this issue Mar 7, 2017 · 1 comment
@erichiller

I haven't seen this in the documentation: is there a maximum string length for attribute values? I followed the tutorial and then inserted nodes with a large-ish string as an attribute intended for full text search (query). I was trying to implement a hybrid of graph and full text search, but insertion alone was taking seconds per page (HTML).

@krotik
Owner

krotik commented Mar 7, 2017

Hi Eric,

that is an interesting use-case which I am quite interested in myself. There is in general no maximum string length for attribute values.

The full text search index is word based: an input string is split into words on whitespace. The words are then stored in two ways:

node attribute name + word -> node keys + word positions
node attribute name + md5 hash of value -> node keys

The first mapping is for word and phrase search, the second one is for value lookup.
The code for this can be found under /eliasdb/graph/util/indexmanager.go
Have a look at the unit test to see how this component works.
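
To make the two mappings above concrete, here is a minimal in-memory sketch in Go. It is not the EliasDB IndexManager and does not use its API: the type toyIndex, its methods, the "\x00" key separator and the 1-based word positions are all assumptions made for illustration only.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strings"
)

// toyIndex is a hypothetical stand-in for illustration, NOT the EliasDB IndexManager.
type toyIndex struct {
	// words maps attr+"\x00"+word -> node key -> positions of the word in the value.
	words map[string]map[string][]int
	// values maps attr+"\x00"+md5(value) -> set of node keys.
	values map[string]map[string]bool
}

func newToyIndex() *toyIndex {
	return &toyIndex{
		words:  make(map[string]map[string][]int),
		values: make(map[string]map[string]bool),
	}
}

// valueKey builds the lookup key for a whole attribute value.
func valueKey(attr, value string) string {
	sum := md5.Sum([]byte(value))
	return attr + "\x00" + hex.EncodeToString(sum[:])
}

// index stores one attribute value of one node in both mappings.
func (ti *toyIndex) index(nodeKey, attr, value string) {
	// Split on whitespace; positions are 1-based here (an assumption).
	for i, w := range strings.Fields(value) {
		k := attr + "\x00" + w
		if ti.words[k] == nil {
			ti.words[k] = make(map[string][]int)
		}
		ti.words[k][nodeKey] = append(ti.words[k][nodeKey], i+1)
	}
	k := valueKey(attr, value)
	if ti.values[k] == nil {
		ti.values[k] = make(map[string]bool)
	}
	ti.values[k][nodeKey] = true
}

// lookupWord returns node keys and word positions for a single word.
func (ti *toyIndex) lookupWord(attr, word string) map[string][]int {
	return ti.words[attr+"\x00"+word]
}

// lookupValue returns the node keys whose attribute equals value.
func (ti *toyIndex) lookupValue(attr, value string) map[string]bool {
	return ti.values[valueKey(attr, value)]
}

func main() {
	ti := newToyIndex()
	ti.index("n1", "text", "the quick fox the end")
	ti.index("n2", "text", "a quick test")
	fmt.Println(ti.lookupWord("text", "quick"))
	fmt.Println(ti.lookupValue("text", "a quick test"))
}
```

In this sketch every distinct word becomes its own key, so the amount of work per inserted value depends on how the value splits into words.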

Since a lot of HTML doesn't have spaces between tags I would imagine that the words get quite long. It might help to chop them up a bit...

The best way forward to narrow down what exactly goes wrong would be to write a unit test or benchmark for the IndexManager with some suitable test data.
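
A rough sketch of what such a benchmark harness could look like, using only Go's standard testing package. The function indexWords is a hypothetical stand-in for the real indexing call and would need to be replaced with a call into the IndexManager; the test data is made up. As written, the sketch shows the shape of the measurement and does not reproduce the reported slowdown.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// indexWords is a hypothetical stand-in for the real indexing call.
// Replace it with a call into the component under test.
func indexWords(value string) map[string][]int {
	idx := make(map[string][]int)
	for i, w := range strings.Fields(value) {
		idx[w] = append(idx[w], i+1)
	}
	return idx
}

// benchIndex measures indexWords on doc with the standard benchmark runner.
func benchIndex(doc string) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			indexWords(doc)
		}
	})
}

func main() {
	// Made-up test data: the same markup with and without spaces around tags.
	spaced := strings.Repeat("<p> lorem ipsum </p> ", 5000)
	dense := strings.Repeat("<p>lorem</p><p>ipsum</p>", 5000)

	fmt.Println("spaced:", benchIndex(spaced))
	fmt.Println("dense: ", benchIndex(dense))
}
```

In a real project the same loop would normally live in a _test.go file as a func BenchmarkXxx(b *testing.B) and be run with go test -bench.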

@krotik krotik self-assigned this Nov 30, 2020