Allow custom Analyzers for indexing in Neo4j 2+ #1346

ccondit · 2013-10-22T16:52:59Z

Neo4j 1.x allowed clients to supply a custom Lucene Analyzer class, which provided tremendous flexibility for searching Neo4j graphs. As of 2.0.0M6 there is no way to specify a custom analyzer with the new schema indexes. It would be a great benefit to re-enable this feature.

See also:
http://stackoverflow.com/questions/19455802/custom-analyzer-with-neo4js-automatic-indexing

jakewins · 2013-10-23T13:28:52Z

This is by design: we don't want to expose Lucene-specific features through the API, so that we can build our own index implementations later on without breaking it. Exposing Lucene through our current API has indeed lent it tremendous power, but it has also made it virtually impossible for us to improve the indexing past Lucene.

If you want to use custom Lucene features, you'll have to roll your own stand-alone indexing or build a custom indexing provider.

However, I'm very interested in what specific use cases you have for the Analyzer. Any input on use cases will of course be valuable when we look at what features to add next to the new indexing. Could you give some examples of what you'd like to use the custom analyzer for?

ccondit · 2013-10-23T15:33:42Z

In my use case there are Lucene filter chains that deal with specific constructs, such as part numbers, which the "default" fulltext Lucene analyzer often mangles. We also stopped relying on Lucene's stemmers for fulltext search and switched to a lemmatization filter, which delivers much better results. Since we have different property types, the PerFieldAnalyzerWrapper is very useful for switching analyzers between a Neo4j part number property and a Neo4j fulltext property.

I completely understand not wanting to allow Lucene to leak through the abstraction. Is there documentation on building a custom index provider in Neo4j 2+?
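To make the per-property switching concrete, here is a minimal plain-Java sketch of the idea. It deliberately does not use Lucene or any Neo4j API: the class `PerFieldAnalysisSketch`, its methods, and the property names `partNumber` and `description` are all hypothetical stand-ins, and the two "analyzers" are toy functions rather than real filter chains. It only shows the dispatch pattern, where a part number is kept as one normalized token while fulltext is split into lowercase terms.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only: this is NOT Lucene's PerFieldAnalyzerWrapper
// and NOT a Neo4j index provider, just the same dispatch idea in plain Java.
class PerFieldAnalysisSketch {

    // Toy "part number" analyzer: keep the whole value as one upper-cased token,
    // so separators such as '-' and '/' are not split away.
    static List<String> analyzePartNumber(String value) {
        return List.of(value.trim().toUpperCase(Locale.ROOT));
    }

    // Toy "fulltext" analyzer: lower-case, then split on anything that is
    // not a letter or a digit.
    static List<String> analyzeFulltext(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    private final Map<String, Function<String, List<String>>> perProperty = new HashMap<>();
    private final Function<String, List<String>> fallback;

    PerFieldAnalysisSketch(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Registers a dedicated analyzer for one property name.
    PerFieldAnalysisSketch register(String property, Function<String, List<String>> analyzer) {
        perProperty.put(property, analyzer);
        return this;
    }

    // Uses the property's own analyzer if one is registered, else the fallback.
    List<String> analyze(String property, String value) {
        return perProperty.getOrDefault(property, fallback).apply(value);
    }

    public static void main(String[] args) {
        PerFieldAnalysisSketch analysis =
                new PerFieldAnalysisSketch(PerFieldAnalysisSketch::analyzeFulltext)
                        .register("partNumber", PerFieldAnalysisSketch::analyzePartNumber);

        System.out.println(analysis.analyze("partNumber", " ab-1234/x "));
        System.out.println(analysis.analyze("description", "Bracket AB-1234/X, steel"));
    }
}
```

With the fulltext function alone, the part number in the second call is broken into separate terms, which is the kind of damage a dedicated per-property analyzer avoids.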

jexp · 2013-10-24T00:06:55Z

It is pretty simple; I did it for MapDB.

You can base yours on the Lucene index provider.

jakewins · 2013-10-28T16:25:40Z

@ccondit brilliant, thanks for the feedback. As you will notice if you dig into the new backend, there is no clear support built in for fulltext indexes. It was deemed out of scope for 2.0 (for now we refer people to the legacy indexes for that feature). However, we fully intend to add full text search to the new indexes, and want to include good Cypher constructs for the various cool things that can be done with full text indexing.

I'll add your notes to the story that tracks full text indexing, and we'll make sure to take them into account when we get to working on it. It will be a bit further down the line, though: we've got some big ticket items in between that have higher priority, so if you want this now, rolling your own is your best option. Word of warning: these provider SPIs are internal, and so may change without notice.

jakewins · 2015-08-07T19:19:35Z

Closing this, as it remains a wontfix, with a pointer to the workaround outlined above. Alternatively, one could run Elasticsearch next to Neo4j to handle these full text cases.

ccondit · 2015-08-07T22:47:16Z

@jakewins - just out of curiosity, is it possible to run both Elasticsearch and Neo4j in the same JVM? It looks like the Lucene dependencies would conflict.

jakewins closed this as completed Aug 7, 2015
