Allow custom Analyzers for indexing in Neo4j 2+ #1346
Comments
This is by design: we don't want to expose Lucene-specific features through the API. This allows us to build our own index implementations later on without breaking the API. Exposing Lucene through our current API has indeed lent tremendous power to it, but it has also made it virtually impossible for us to improve the indexing past Lucene. If you want to use custom Lucene features, you'll have to roll your own stand-alone indexing, or build a custom indexing provider. However, I'm very interested in what specific use cases you have for the Analyzer. Any input on use cases will of course be valuable when we look at what features to add next to the new indexing. Could you give some examples of what you'd like to use the custom analyzer for?
In my use case there are Lucene filter chains dealing with specific constructs, such as part numbers, that the "default" fulltext Lucene analyzer often mangles. We also stopped relying on Lucene's stemmers for fulltext search and switched to a lemmatization filter, which delivers much better results. Since we have different property types, the PerFieldAnalyzerWrapper is very useful for switching analyzers between a Neo4j part number property and a Neo4j fulltext property. I completely understand not wanting to allow Lucene to leak through the abstraction. Is there documentation on building a custom index provider in Neo4j 2+?
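To make the per-property switching described above concrete, here is a minimal sketch in plain Java. It does not use Lucene or Neo4j classes: `PerFieldAnalysisSketch`, both analyzer functions, and the property names `partNumber` and `description` are hypothetical stand-ins, chosen only to illustrate the idea behind a per-field wrapper (look up an analyzer by property name, fall back to a default otherwise).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: plain-Java stand-ins, not Lucene or Neo4j classes.
final class PerFieldAnalysisSketch {

    // Keeps the whole value as a single lowercased token, so a part
    // number such as "AB-1234/X" is not split on punctuation.
    static List<String> keywordAnalyzer(String text) {
        return Collections.singletonList(text.trim().toLowerCase(Locale.ROOT));
    }

    // Naive fulltext analysis: lowercase, then split on anything that is
    // not a letter or digit. This is what breaks part numbers apart.
    static List<String> fulltextAnalyzer(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    private final Map<String, Function<String, List<String>>> perProperty = new HashMap<>();
    private final Function<String, List<String>> fallback;

    PerFieldAnalysisSketch(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    // Registers a dedicated analyzer for one property name (hypothetical names).
    PerFieldAnalysisSketch register(String property, Function<String, List<String>> analyzer) {
        perProperty.put(property, analyzer);
        return this;
    }

    // Dispatches to the property's analyzer, or to the fallback if none is registered.
    List<String> analyze(String property, String value) {
        return perProperty.getOrDefault(property, fallback).apply(value);
    }

    public static void main(String[] args) {
        PerFieldAnalysisSketch analysis =
                new PerFieldAnalysisSketch(PerFieldAnalysisSketch::fulltextAnalyzer)
                        .register("partNumber", PerFieldAnalysisSketch::keywordAnalyzer);

        System.out.println(analysis.analyze("partNumber", "AB-1234/X"));       // [ab-1234/x]
        System.out.println(analysis.analyze("description", "Bolt AB-1234/X")); // [bolt, ab, 1234, x]
    }
}
```

The second call shows the problem raised in this comment: a generic fulltext analyzer splits the part number into fragments, while the dedicated analyzer keeps it searchable as one token.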
It is pretty simple; I did it for MapDB.
You can base yours on the Lucene index provider.
@ccondit brilliant, thanks for the feedback. As you will notice if you dig into the new backend, there is no clear support built in for fulltext indexes. It was deemed out of scope for 2.0 (for now we refer people to the legacy indexes for that feature). However, we fully intend to add full text search to the new indexes, and want to include good Cypher constructs for the various cool things that can be done with full text indexing. I'll add your notes to the story that tracks full text indexing, and we'll make sure to take them into account when we get to working on it. It will be a bit further down the line though; we've got some big-ticket items in between that have higher priority, so if you want this now, rolling your own is your best option. Word of warning: these provider SPIs are internal, and so may change without notice.
Closing this as this remains a wontfix, with a pointer to the workaround outlined above. Alternatively, one could run ElasticSearch next to Neo to handle these full text cases.
@jakewins - just out of curiosity, is it possible to run both ElasticSearch and Neo4j on the same JVM? It looks like the Lucene dependencies would conflict.
Neo4j 1.x allowed clients to supply a custom Lucene Analyzer class, providing tremendous flexibility for searching Neo4j graphs. As of 2.0.0M6 there's no way to specify such a custom analyzer with the new schema indexes. It would be a great benefit to re-enable this feature.
See also:
http://stackoverflow.com/questions/19455802/custom-analyzer-with-neo4js-automatic-indexing