Allow custom Analyzers for indexing in Neo4j 2+ #1346

Closed
ccondit opened this issue Oct 22, 2013 · 6 comments

@ccondit

ccondit commented Oct 22, 2013

Neo4j 1.x allowed clients to supply a custom Lucene Analyzer class, which provided tremendous flexibility for searching Neo4j graphs. As of 2.0.0M6, there is no way to specify a custom analyzer with the new schema indexes. Re-enabling this feature would be a great benefit.

See also:
http://stackoverflow.com/questions/19455802/custom-analyzer-with-neo4js-automatic-indexing

@jakewins
Contributor

This is by design: we don't want to expose Lucene-specific features through the API, so that we can build our own index implementations later on without breaking the API. Exposing Lucene through our current API has indeed lent tremendous power to it, but it has also made it virtually impossible for us to improve the indexing beyond what Lucene offers.

If you want to use custom Lucene features, you'll have to roll your own stand-alone indexing or build a custom indexing provider.

However, I'm very interested in what specific use cases you have for the Analyzer. Any input on use cases will of course be valuable when we look at what features to add next to the new indexing. Could you give some examples of what you'd like to use the custom analyzer for?

@ccondit
Author

ccondit commented Oct 23, 2013

In my use case, there are Lucene filter chains dealing with specific constructs, such as part numbers, that the "default" fulltext Lucene analyzer often compromises. We also stopped relying on Lucene's stemmers for fulltext search and switched to a lemmatization filter, which delivers much better results. Since we have different property types, the PerFieldAnalyzerWrapper is very useful for switching analyzers between a Neo4j part number property and a Neo4j fulltext property.

I completely understand not wanting to allow Lucene to leak through the abstraction. Is there documentation on building a custom index provider in Neo4j 2+?
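
For illustration, the per-property switching described above can be sketched in plain Java. This is a minimal sketch that deliberately does not use Lucene or any Neo4j API: the class `PerFieldAnalysisSketch`, its methods, and the property names `partNumber` and `description` are all hypothetical, and the two "analyzers" are simplified stand-ins for real filter chains. It only shows why a part number needs different treatment from free text, and how a per-field dispatch with a fallback might look.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Illustrative only: mimics per-field analyzer dispatch without using Lucene. */
public class PerFieldAnalysisSketch {

    /** Full-text style: lowercase, then split on anything that is not an ASCII letter or digit. */
    static List<String> fullTextTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Part-number style: keep the whole value as a single token, normalising case and outer whitespace. */
    static List<String> partNumberTokens(String text) {
        return List.of(text.trim().toUpperCase(Locale.ROOT));
    }

    // Hypothetical registry: property name -> analysis function.
    private final Map<String, Function<String, List<String>>> perProperty = new HashMap<>();
    private final Function<String, List<String>> fallback;

    PerFieldAnalysisSketch(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    PerFieldAnalysisSketch register(String property, Function<String, List<String>> analyzer) {
        perProperty.put(property, analyzer);
        return this;
    }

    /** Uses the analyzer registered for the property, or the fallback if none is registered. */
    List<String> analyze(String property, String value) {
        return perProperty.getOrDefault(property, fallback).apply(value);
    }

    public static void main(String[] args) {
        PerFieldAnalysisSketch analysis =
            new PerFieldAnalysisSketch(PerFieldAnalysisSketch::fullTextTokens)
                .register("partNumber", PerFieldAnalysisSketch::partNumberTokens);

        // Free text falls through to the full-text analyzer, which splits the part number apart.
        System.out.println(analysis.analyze("description", "Bracket for AB-1234/X"));
        // prints [bracket, for, ab, 1234, x]

        // The part number property keeps the value intact as one searchable token.
        System.out.println(analysis.analyze("partNumber", "ab-1234/x"));
        // prints [AB-1234/X]
    }
}
```

With a real Lucene setup, the same dispatch role is played by PerFieldAnalyzerWrapper, with actual Analyzer instances in place of the two simplified functions here.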

@jexp
Member

jexp commented Oct 24, 2013

It is pretty simple; I did it for MapDB.

You can base yours on the Lucene index provider.

@jakewins
Contributor

@ccondit brilliant, thanks for the feedback. As you will notice if you dig into the new backend, there is no clear support built in for fulltext indexes. It was deemed out of scope for 2.0 (for now we refer people to the legacy indexes for that feature). However, we have all the intention in the world to add full text search to the new indexes, and want to include good Cypher constructs for the various cool things that can be done with full text indexing.

I'll add your notes to the story that tracks full text indexing, and we'll make sure to take that into account when we get to working on it. It will be a bit further down the line, though: we've got some big-ticket items in between that have higher priority, so if you want this now, rolling your own is your best option. Word of warning: these provider SPIs are internal, and so may change without notice.

@jakewins
Contributor

jakewins commented Aug 7, 2015

Closing this, as it remains a wontfix, with a pointer to the workaround outlined above. Alternatively, one could run ElasticSearch next to Neo4j to handle these full text cases.

@jakewins jakewins closed this as completed Aug 7, 2015
@ccondit
Author

ccondit commented Aug 7, 2015

@jakewins - just out of curiosity, is it possible to run both ElasticSearch and Neo4j in the same JVM? It looks like the Lucene dependencies would conflict.
