
Tokenizer not unserialized properly? #226

Closed
alexturpin opened this issue Aug 24, 2016 · 1 comment

Comments

@alexturpin

Hello,

I'm trying to use a serialized index with a different tokenizer and it's not working for me.

The tokenizer is first created and registered. It gets set on the index and then I add my documents. Once that's done, the index is serialized to JSON. I can confirm by looking at the JSON that my tokenizer's label was serialized.
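For concreteness, the build-and-serialize side looks roughly like this; a minimal sketch assuming lunr 0.7.x, with a hypothetical whitespace tokenizer standing in for my real one:

```js
// Minimal sketch (lunr 0.7.x). The tokenizer body is a stand-in;
// the registration and serialization calls are the documented API.
var lunr = require('lunr')

var whitespaceTokenizer = function (obj) {
  if (obj == null) return []
  return obj.toString().trim().toLowerCase().split(/\s+/)
}

// Register under a label so the index can serialize a reference to it.
lunr.tokenizer.registerFunction(whitespaceTokenizer, 'whitespace')

var idx = lunr(function () {
  this.ref('id')
  this.field('body')
})

idx.tokenizer(whitespaceTokenizer) // the setter stores it in tokenizerFn
idx.add({ id: 1, body: 'some document text' })

// The resulting JSON includes the tokenizer's label ('whitespace').
var serialised = JSON.stringify(idx.toJSON())
```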

The problem arises when subsequently loading the index. I make sure the tokenizer is registered before loading the serialized index. The code in lunr.Index.load then uses lunr.tokenizer.load to load the serialized tokenizer, and I've confirmed with logging that this part works. The problem is the line that assigns the newly loaded tokenizer to idx.tokenizer:

idx.tokenizer = lunr.tokenizer.load(serialisedData.tokenizer)

That's the method used to set the tokenizer for an index; the actual tokenizer eventually resides in tokenizerFn. I believe the code that unserializes a tokenizer when loading an index should instead be:

idx.tokenizerFn = lunr.tokenizer.load(serialisedData.tokenizer)

That works on my end, and I will submit a PR shortly with that fix.
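For anyone else hitting this, my understanding of why the plain assignment misbehaves, as a paraphrased sketch of the 0.7.x internals (not the verbatim source):

```js
// Paraphrased, not verbatim: lunr.Index.prototype.tokenizer is a setter
// method that stores the actual function on the instance.
lunr.Index.prototype.tokenizer = function (fn) {
  this.tokenizerFn = fn // what add() and search() actually call
  return this
}

// So the plain property assignment in lunr.Index.load
idx.tokenizer = lunr.tokenizer.load(serialisedData.tokenizer)
// only shadows the setter method on the instance; idx.tokenizerFn is never
// updated and keeps pointing at the default tokenizer.
```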

Cheers

@olivernn
Owner

Fixed in 0.7.2
