Create index without importing if reindexing by individual records only #747

Closed
MrHubble opened this issue Oct 7, 2016 · 5 comments
@MrHubble

MrHubble commented Oct 7, 2016

@ankane do you think we should update the readme to include something along the lines of: "If you are reindexing by individual records only, you should first create the index without importing (i.e. Product.reindex(import: false)) so that the Searchkick settings in your model are applied"?

Background
I've had a lot of issues with how I use default scopes in my multitenant app and how I reindex my data with Searchkick. A full discussion of the problem can be found here, but the end solution was to call reindex on individual records rather than on the model as a whole. My custom rake task for reindexing looks something like this:

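# Model class names are assumed to be singular (Business, Product), per Rails convention.
Business.find_each do |business|
  # Set the current tenant so the default scopes resolve to this business.
  set_current_business(business.id)
  Product.find_each do |product|
    # Index one record at a time instead of calling Product.reindex.
    product.reindex
  end
end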

This works for me at the moment. It lets me build search_data without resorting to .joins with unscoped, because I can chain through the associated models that are already scoped.

I ran into a problem when I decided to match with text_start on a column that already existed in the index. I added the following code to my model:

searchkick text_start: [:model_number], index_name: -> { [model_name.plural, Rails.env].join('_') }, settings: { number_of_shards: 1, number_of_replicas: 1 }

I reindexed and included it in my search with .search(query, fields: [{model_number: :text_start}]), but I could not get it to work. Looking at the field in Kibana, I saw two entries:
[Screenshot, 2016-10-06: the two entries for the field in Kibana]

I suspected that having two entries might be the issue, so I decided to start fresh and delete all of my indices with: curl -XDELETE 'http://localhost:9200/_all/'

After reindexing the individual record, I tried to search using text_start and got the following error: Searchkick::InvalidQueryError: [400] {"error":{"root_cause":[{"type":"query_parsing_exception","reason":"[match] analyzer [searchkick_autocomplete_search] not found","index":"services_development",

From this I realised that the numbers Searchkick usually appends to my index name were not present, which meant the index had not been created by Searchkick with my model's settings. I deleted the index again and this time created it with Service.reindex(import: false). Running curl 'localhost:9200/_cat/indices?v' showed that the index had been created with the appended numbers. With this index in place, I called reindex on the individual record again, and this time my text_start search returned the correct results.
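The workflow described above (create the index with the model's settings first, then index record by record) can be sketched roughly as follows. This is a minimal illustration, not part of Searchkick: rebuild_index_per_record is a hypothetical helper name, and it only assumes that the model responds to reindex(import: false) and that each record responds to reindex, as used earlier in this comment.

```ruby
# Hypothetical helper (not part of Searchkick): create the index with the
# model's Searchkick settings first, then index records one at a time.
#
# model   - any class that responds to reindex(import: false), e.g. a Searchkick model
# records - any enumerable whose items respond to reindex
def rebuild_index_per_record(model, records)
  # Creates the index with the analyzers and mappings declared in the model,
  # without importing any documents.
  model.reindex(import: false)

  indexed = 0
  records.each do |record|
    record.reindex # index a single document into the index created above
    indexed += 1
  end
  indexed # number of records that were reindexed
end

# Possible usage inside a Rails app with Searchkick loaded (not run here):
#   rebuild_index_per_record(Service, Service.all)
```

In a multitenant task like the one shown earlier, the reindex(import: false) call would presumably run once, before the loop over businesses, with the per-record reindex calls staying inside the loop.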

@ankane
Owner

ankane commented Oct 7, 2016

This use case seems pretty uncommon, so I don't think it belongs in the readme, but we could improve the error message.

@MrHubble
Author

MrHubble commented Oct 7, 2016

Thanks. Even though it's an uncommon use case, is it still acceptable for
me to create the index and reindex this way? Am I creating any other
problems for myself? Just looking for reassurance that it's ok.

I'll look into how to improve the error message.

@jimmybaker

I'm actually working on the same problem. We have a very large data set to index, and I need to report the progress of the reindex back to my users. To accomplish this, I was going to do as you did above and send the progress as each batch is reindexed. I'm just worried that once I call Product.reindex(import: false), Searchkick will begin using the newly created index even though it isn't up to date yet.
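A rough sketch of the batch-with-progress idea described above, in plain Ruby. The helper name reindex_with_progress, the total: and batch_size: parameters, and the progress block are all hypothetical; the only Searchkick call assumed is the per-record reindex already used in this thread. In a Rails app, an ActiveRecord batching method would probably be a better fit than each_slice for a large table. The sketch does not address the concern about the new index going live before it is fully populated.

```ruby
# Hypothetical helper (not part of Searchkick): reindex records in batches
# and report progress after each batch.
#
# records    - any enumerable whose items respond to reindex
# total      - total number of records, used only for progress reporting
# batch_size - number of records to process between progress reports
#
# Yields (done, total) after each batch when a block is given.
def reindex_with_progress(records, total:, batch_size: 100)
  done = 0
  records.each_slice(batch_size) do |batch|
    batch.each(&:reindex)             # per-record reindex, as in the rake task above
    done += batch.size
    yield done, total if block_given? # e.g. push the progress to the user here
  end
  done
end

# Possible usage inside a Rails app with Searchkick loaded (not run here):
#   reindex_with_progress(Product.all, total: Product.count) do |done, total|
#     puts "Reindexed #{done} of #{total}"
#   end
```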

@MrHubble
Author

@jimmybaker I assume that as soon as you run Product.reindex(import: false) then all previous documents are deleted and that is now your index that Searchkick will use. Please let us know if you've found anything to suggest otherwise.

@MrHubble
Author

MrHubble commented Oct 23, 2016

One thing to bear in mind when using Product.reindex(import: false) is that it will be trickier to reindex without downtime. One of the key features of the standard Product.reindex is the ability to reindex without downtime. Using (import: false), however, leaves you with an index that contains no documents, so if your users search before you have reindexed the individual records, they won't get the correct results.

I'm unsure if there's a way to reindex after changing a Searchkick setting with (import: false) without deleting all current documents.

Edit:
You could possibly use something like: #334 (comment)

@MrHubble MrHubble closed this as completed Dec 1, 2016
@lock lock bot locked as resolved and limited conversation to collaborators Dec 29, 2018