refresh_interval #27

Closed
missinglink opened this issue Nov 29, 2014 · 2 comments

Comments

@missinglink
Member

We used to set a very high refresh_interval but somewhere along the way the setting seems to have been removed.

I'm assuming we can get a performance improvement by raising the refresh interval to something like 1m instead of 1s. Newly indexed docs could then take up to 1m to appear in search, but ES wouldn't be creating new segments all the time.

Additionally, this may save RAM since we would be building fewer inverted indexes and fewer FSTs, plus the commit point log would be 60x smaller, resulting in faster startup times on larger indexes.
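
To make the 60x figure concrete, here is a rough sketch of the arithmetic. It assumes at most one scheduled refresh per interval for the whole duration of an import, so the result is an upper bound rather than a measurement. The function name max_refreshes and the one-hour import are made up for illustration.

```shell
#!/bin/sh
# Upper bound on scheduled refreshes during an import:
# import duration (seconds) divided by refresh interval (seconds).
# max_refreshes is a hypothetical helper, not part of Pelias or ES.
max_refreshes() {
  echo $(( $1 / $2 ))
}

max_refreshes 3600 1    # 1s interval over a one-hour import: prints 3600
max_refreshes 3600 60   # 1m interval over the same import: prints 60
```

Each refresh that sees new documents can produce a new segment, which is where the 60x ratio between the two settings comes from.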

@missinglink
Member Author

So, after playing with this for some time, I remember why it was disabled in the first place.

In my local config (~/pelias.json) I have the following set:

"elasticsearch": {
  "settings": {
    "index": {
      "refresh_interval": "-1"
    }
  }
}

I personally prefer this setting because it speeds up imports (by how much, I'm not quite sure), but at the expense of the index never being refreshed automatically.

In practice, this means that newcomers to Pelias can run a long import and then query the index only to find 0 results, which could be confusing and costly to debug.

Plus I'm pretty sure the segments are not written to disk, meaning data loss on failure.

For those reasons I'm going to leave the setting at the default (which is 1s) and allow advanced users to set their own refresh_interval as long as they are aware of the effects of doing so.

I would recommend that you either leave this at the default setting or make liberal use of a manual refresh: curl -s -X POST "localhost:9200/pelias/_refresh"
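
For anyone who does want to change the interval around an import without editing pelias.json, a minimal sketch using the Elasticsearch index settings API might look like the following. The helper names (refresh_settings_body, set_refresh_interval, refresh_index) and the ES_HOST/ES_INDEX variables are assumptions made for this example; the host and index name are taken from the curl command above. The Content-Type header is needed by newer ES versions and should be harmless on older ones.

```shell
#!/bin/sh
# Hypothetical helpers; adjust host and index to your setup.
ES_HOST="${ES_HOST:-localhost:9200}"
ES_INDEX="${ES_INDEX:-pelias}"

# Build the JSON body for the index settings API.
refresh_settings_body() {
  printf '{"index":{"refresh_interval":"%s"}}' "$1"
}

# Apply a refresh_interval (e.g. "-1", "1s", "1m") to the live index.
set_refresh_interval() {
  curl -s -X PUT "$ES_HOST/$ES_INDEX/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(refresh_settings_body "$1")"
}

# Force a refresh so that already indexed docs become searchable.
refresh_index() {
  curl -s -X POST "$ES_HOST/$ES_INDEX/_refresh"
}
```

A possible workflow would be to call set_refresh_interval "-1" before the import, then set_refresh_interval "1s" followed by refresh_index once it finishes, so the index does not stay unrefreshed after the import is done.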

cc/ @heffergm @sevko @hkrishna

@missinglink
Member Author

For more info on the ES internals, I would recommend this chapter: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/inside-a-shard.html
