Indexing VS Legacy Indexing

subvertallchris edited this page Oct 24, 2014 · 4 revisions

There's a lot of confusion about how indexing works with Neo4j. If you read the documentation, you'll find indexes described but you'll also find a lot about "legacy indexing." You'll also see that to get full-text search, you need legacy indexing. Naturally, this leads to the question of what this gem supports, why, and what that means. Let's go through that.

This restates some of the information presented here, so you may want to review that for some specifics.

What types of indexes does this gem support?

We support the 2.0+ property indexes and labels. Under the hood, these are Lucene exact indexes: they are case-sensitive and do not support partial-word matching. When you do a regex or range search in Cypher, you're stepping outside of the index and may take a performance hit. You can mitigate that by writing your query so that it searches through a limited subset of your data, but that's another story.
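To make the difference between an exact lookup and a pattern search concrete, here is a minimal sketch in plain Ruby. It is a toy model only, not how Neo4j or Lucene implement indexes, and the `ExactIndex` class and its methods are hypothetical names invented for this illustration.

```ruby
# Toy model of an exact-match property index (hypothetical, for illustration;
# this is not Neo4j's or Lucene's actual implementation).
class ExactIndex
  def initialize
    # Maps a property value to the ids of the nodes that hold that value.
    @entries = Hash.new { |hash, key| hash[key] = [] }
  end

  # Store a node id under the exact property value.
  def add(value, node_id)
    @entries[value] << node_id
  end

  # Exact lookup: a single hash probe, case-sensitive, whole value only.
  def find(value)
    @entries.fetch(value, [])
  end

  # Pattern lookup: the index cannot help, so every stored value is checked.
  def scan(pattern)
    @entries.select { |value, _ids| value.match?(pattern) }.values.flatten
  end
end

index = ExactIndex.new
index.add("Chris", 1)
index.add("Christine", 2)
index.add("chris", 3)

exact_hits   = index.find("Chris")    # only the node stored under exactly "Chris"
prefix_hits  = index.find("Chri")     # a partial word matches nothing
pattern_hits = index.scan(/\Achri/i)  # finds all three, but visits every entry
```

The `scan` method is the analogue of a regex search: it still returns the right nodes, but its cost grows with the number of indexed values, which is why narrowing the data set first helps.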

Why doesn't this gem support legacy indexes?

The main reason is that we do not think the technical debt of implementing a feature described as "legacy" is worth what it provides. Legacy indexing changes the way you query and adds expectations of support and features that, frankly, we're not prepared to provide. Labels and exact indexes are suitable for most cases, so we support those. Beyond that, the "legacy" designation implies a looming removal from the code base, so depending on it feels dangerous.

None of this means we would immediately reject a pull request that added comprehensive support, but it is not on the project's roadmap.

I really need full-text searches. What can I do?

The lack of full-text search was a big issue for the maintainers, especially since Neo4j.rb 2.3 did support it and it's such a no-brainer feature. To work around this, we've been very happy using Searchkick in our projects. We added the methods it requires for interoperability with the gem and have had a great experience with it so far.

Under the hood, Searchkick uses Elasticsearch, which uses Lucene, which is what Neo4j is using. As far as performance is concerned, aside from the extra traffic to/from Elasticsearch, your search performance should be roughly the same. We are of the opinion (as individuals; this gem is not owned by Neo Tech and we are not spokespeople for them!) that Searchkick/Elasticsearch will do a better job since search is all it does. Neo4j, on the other hand, is not built with global searches as its strong point. It's perfect otherwise as an all-purpose database, and its matching, filtering, and data modeling are unparalleled, but if you just want to plug a search query in, Elasticsearch is a better choice.
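For readers wondering what full-text search adds over an exact index, the following is a rough sketch of an inverted index in plain Ruby. It is a deliberately simplified illustration, not Lucene's, Elasticsearch's, or Searchkick's implementation or API, and `TinyTextIndex` is a hypothetical name made up for this example.

```ruby
# Toy inverted index (hypothetical, for illustration only): maps each
# lowercased word to the ids of the documents that contain it.
class TinyTextIndex
  def initialize
    @postings = Hash.new { |hash, key| hash[key] = [] }
  end

  # Split the text into words and record the document id under each one.
  def add(doc_id, text)
    tokenize(text).uniq.each { |word| @postings[word] << doc_id }
  end

  # Return the ids of documents that contain every word in the query.
  def search(query)
    words = tokenize(query)
    return [] if words.empty?
    words.map { |word| @postings.fetch(word, []) }.reduce(:&)
  end

  private

  # Lowercase the text and keep runs of letters and digits as words.
  def tokenize(text)
    text.downcase.scan(/[a-z0-9]+/)
  end
end

text_index = TinyTextIndex.new
text_index.add(1, "Graph databases model relationships")
text_index.add(2, "Full-text search with Lucene")
text_index.add(3, "Search a graph by relationships")

both_words  = text_index.search("GRAPH relationships")  # case-insensitive, word-level match
single_word = text_index.search("lucene")
no_stemming = text_index.search("graphs")               # this sketch does not stem words
```

Because each word is indexed separately and lowercased, a query can match part of a longer property value regardless of case, which is exactly what an exact index does not offer. Real search engines layer stemming, relevance scoring, and much more on top of this idea.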
