Caution
|
This feature is a work in progress. Make sure to read the Limitations section! |
The integration with Elasticsearch is in development and should be considered experimental. We do think we have the basics covered and we are looking for feedback.
Patches can be sent as pull requests to the GitHub repository; general feedback, suggestions and questions are also very welcome. To get in touch or find other interesting links for contributors, see the Community page of the Hibernate website.
The goal of integrating with Elasticsearch is to allow Hibernate Search users to benefit from the full-text capabilities integrated with Hibernate ORM while replacing the local Lucene-based index with a remote Elasticsearch service.
There could be various reasons to prefer this over an "embedded Lucene" approach:
-
wish to separate the service running your application from the Search service
-
integrate with an existing Elasticsearch instance
-
benefit from Elasticsearch’s out of the box horizontal scalability features
-
explore the data updated by a Hibernate-powered application using the Elasticsearch dashboard integrations such as Kibana
There are a couple of drawbacks compared to the embedded Lucene approach though:
-
incur a performance penalty of remote RPCs both for index updates and to run queries
-
need to manage an additional service
-
possibly need to buy and manage additional servers
Which solution is best will depend on the specific needs of your system and your organization.
Note
|
Why not use Elasticsearch directly
The #1 reason is that Hibernate Search integrates perfectly with Hibernate ORM. All changes done to your objects will trigger the necessary index changes transparently.
There is no more paradigm shift in your code. You are working on Hibernate ORM managed objects, doing your queries on object properties with a nice DSL. |
To experiment with the Elasticsearch integration you will have to download Elasticsearch and run it: Hibernate Search connects to an Elasticsearch node but does not provide one.
One option is to use the Elasticsearch Docker image (see here for Elasticsearch 2).
Hibernate Search expects an Elasticsearch cluster running version 2.x or 5.x. The version running on your cluster will be automatically detected on startup, and Hibernate Search will adapt its behavior based on the detected version.
When upgrading your Elasticsearch cluster though, some administrative tasks are still required on your cluster: Hibernate Search will not take care of those.
Warning
|
Hibernate Search does not support the |
The targeted version is largely transparent to Hibernate Search users, but there are a few differences in how Hibernate Search behaves depending on the Elasticsearch version that may affect you. The table details those differences.
2.x | 5.x | |
---|---|---|
Configuration required for purges |
None |
|
Datatype used for text fields in Elasticsearch |
|
|
Not implemented |
Implemented |
|
Implementation of |
|
|
Note
|
Hibernate Search internal tests run against Elasticsearch {testElasticsearchVersion} by default. |
In addition to the usual dependencies like Hibernate ORM and Hibernate Search,
you will need the new hibernate-search-elasticsearch
jar.
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-search-elasticsearch</artifactId>
    <version>{hibernateSearchVersion}</version>
</dependency>
Hibernate Search can work with an Elasticsearch server without altering its configuration.
However some features offered by Hibernate Search require specific configuration:
-
on Elasticsearch 2.x only (not necessary on 5.x): if you want to use the Hibernate Search MassIndexer with purgeAllOnStart enabled (it is enabled by default), or to use FullTextSession.purge() or FullTextSession.purgeAll(), install the delete-by-query plugin
-
if you want to retrieve the distance in a geolocation query, enable the lang-groovy plugin, see Elasticsearch Spatial queries
-
if you want to use paging (as opposed to scrolling) on result sets larger than 10000 elements (for instance access the 10001st result), you may increase the value of the index.max_result_window property (default is 10000).
Configuration is minimal.
Add the configuration properties to your persistence.xml
or where you put the rest of your Hibernate Search configuration.
- Select Elasticsearch as the backend
-
hibernate.search.default.indexmanager elasticsearch
- Hostname and port for Elasticsearch
-
hibernate.search.default.elasticsearch.host http://127.0.0.1:9200
(default)
You may also select multiple hosts (separated by whitespace characters), so that they are assigned requests in turns (load balancing):
hibernate.search.default.elasticsearch.host http://es1.mycompany.com:9200 http://es2.mycompany.com:9200
In the example above, the first request will go to es1, the second to es2, the third to es1, and so on.
Also note having multiple hosts will enable failover: if one node happens to fail to serve a request (timeout, server error, invalid HTTP response, …), the same request will be sent to the next one; if the second request is served without error, the failure will be blamed on the node and no error will be reported to the application.
The failover feature will also be enabled when you only have one configured host but other hosts have been added thanks to automatic discovery (see below).
- Username for Elasticsearch connection
-
hibernate.search.default.elasticsearch.username ironman
(default is empty, meaning anonymous access)
- Password for Elasticsearch connection
-
hibernate.search.default.elasticsearch.password j@rV1s
(default is empty)
Caution: if you use HTTP instead of HTTPS in any of the Elasticsearch host URLs (see above), your password will be transmitted in clear text over the network.
- Select the index creation strategy
-
hibernate.search.default.elasticsearch.index_schema_management_strategy CREATE
(default)
Let’s see the options for the index_schema_management_strategy property:
Value | Definition |
---|---|
none | The index, its mappings and the analyzer definitions will not be created, deleted nor altered. Hibernate Search will not even check that the index already exists. |
validate | The index, its mappings and analyzer definitions will be checked for conflicts with Hibernate Search’s metamodel. The index, its mappings and analyzer definitions will not be created, deleted nor altered. |
update | The index, its mappings and analyzer definitions will be created, existing mappings will be updated if there are no conflicts. Caution: if analyzer definitions have to be updated, the index will be closed automatically during the update. |
create | The default: an existing index will not be altered, a missing index will be created along with its mappings and analyzer definitions. |
drop-and-create | Indexes will be deleted if existing and then created along with their mappings and analyzer definitions. This will delete all content from the indexes! |
drop-and-create-and-drop | Similar to drop-and-create but will also delete the index at shutdown. Commonly used for tests. |
Caution
|
Strategies in production environments
It is strongly recommended to use either none or validate in a production environment. drop-and-create and drop-and-create-and-drop are obviously unsuitable in this context (unless you want to reindex everything upon every startup), and update may leave your mapping half-updated in case of conflict.
To be precise, if your mapping changed in an incompatible way, such as a field having its type changed, updating the mapping may be impossible without manual intervention. In this case, the update strategy will prevent Hibernate Search from starting, but it may already have successfully updated the mappings for another index, making a rollback difficult at best.
Also, when updating analyzer definitions, Hibernate Search will stop the affected indexes during the update. This means the update strategy should be used with caution when multiple clients use Elasticsearch indexes managed by Hibernate Search: those clients should be synchronized in such a way that while Hibernate Search is starting, no other client tries to use the index.
For these reasons, migrating your mapping should be considered a part of your deployment process and be planned cautiously. |
Note
|
Mapping validation is as permissive as possible. Fields or mappings that are unknown to Hibernate Search will be ignored, and settings that are more powerful than required (e.g. a field annotated with @Field(index = Index.NO) in Search but marked as "index": analyzed in Elasticsearch) will be deemed valid.
One exception should be noted, though: date formats must match exactly the formats specified by Hibernate Search, due to implementation constraints. |
- Maximum time to wait for the successful execution of a request to the Elasticsearch server before failing (in ms)
-
hibernate.search.default.elasticsearch.request_timeout 60000
(default)
The execution time of a request includes the time needed to establish a connection, to send the request, and to receive the whole response, optionally retrying in case of node failures.
- Maximum time to wait for a connection to the Elasticsearch server before failing (in ms)
-
hibernate.search.default.elasticsearch.connection_timeout 3000
(default)
- Maximum time to wait for a response from the Elasticsearch server before failing (in ms)
-
hibernate.search.default.elasticsearch.read_timeout 60000
(default)
- Maximum number of simultaneous connections to the Elasticsearch cluster
-
hibernate.search.default.elasticsearch.max_total_connection 20
(default)
- Maximum number of simultaneous connections to a single Elasticsearch server
-
hibernate.search.default.elasticsearch.max_total_connection_per_route 2
(default)
- Whether to enable automatic discovery of servers in the Elasticsearch cluster (true or false)
-
hibernate.search.default.elasticsearch.discovery.enabled false
(default)
When using automatic discovery, the Elasticsearch client will periodically probe for new nodes in the cluster, and will add those to the server list (see host above). Similarly, the client will periodically check whether registered servers still respond, and will remove them from the server list if they don’t.
- Time interval between two executions of the automatic discovery (in seconds)
-
hibernate.search.default.elasticsearch.discovery.refresh_interval 10
(default)
This setting will only be taken into account if automatic discovery is enabled (see above).
- Scheme to use when connecting to automatically discovered nodes (http or https)
-
hibernate.search.default.elasticsearch.discovery.default_scheme http
(default)
This setting will only be taken into account if automatic discovery is enabled (see above).
- Maximum time to wait for the indexes to become available before failing (in ms)
-
hibernate.search.default.elasticsearch.index_management_wait_timeout 10000
(default)
This setting is ignored when the NONE strategy is selected, since the index will not be checked on startup (see above).
This value must be lower than the read timeout (see above).
- Status an index must at least have in order for Hibernate Search to work with it (one of "green", "yellow" or "red")
-
hibernate.search.default.elasticsearch.required_index_status green
(default)
Only operate if the index is at this level or safer. In development, set this value to yellow if the number of nodes started is below the number of expected replicas.
- Whether to perform an explicit refresh after a set of operations has been executed against a specific index (true or false)
-
hibernate.search.default.elasticsearch.refresh_after_write false
(default)
This is useful in unit tests to ensure that a write is visible by a query immediately without delay. This keeps unit tests simpler and faster. But you should not rely on the synchronous behaviour for your production code. Leave at false for optimal performance of your Elasticsearch cluster.
- When scrolling, the minimum number of previous results kept in memory at any time
-
hibernate.search.elasticsearch.scroll_backtracking_window_size 10000
(default)
- When scrolling, the number of results fetched by each Elasticsearch call
-
hibernate.search.elasticsearch.scroll_fetch_size 1000
(default)
- When scrolling, the maximum duration ScrollableResults will be usable if no other results are fetched from Elasticsearch, in seconds
-
hibernate.search.elasticsearch.scroll_timeout 60
(default)
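The host rotation and failover described for the host property above can be pictured with the following minimal sketch. It is only an illustration of the documented behaviour, not the code of the Elasticsearch client used by Hibernate Search. The class name, the attempt callback (a stand-in for the actual HTTP call) and the choice to throw once every host has failed are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Illustrative only: rotates over hosts and retries a failed request on the next host. */
final class RoundRobinFailoverSketch {

    private final List<String> hosts;
    private int next = 0;

    RoundRobinFailoverSketch(List<String> hosts) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<>(hosts);
    }

    /**
     * Sends one request. The attempt callback is a hypothetical stand-in for the
     * HTTP call: it returns true when the given host served the request.
     * Returns the host that served the request (assumption: throws when all hosts failed).
     */
    String send(Predicate<String> attempt) {
        for (int tries = 0; tries < hosts.size(); tries++) {
            String host = hosts.get(next);
            // Rotate, so that the next request starts on the following host.
            next = (next + 1) % hosts.size();
            if (attempt.test(host)) {
                return host;
            }
            // The failure is blamed on this node only; try the next host.
        }
        throw new IllegalStateException("all hosts failed");
    }

    public static void main(String[] args) {
        RoundRobinFailoverSketch client =
                new RoundRobinFailoverSketch(Arrays.asList("es1", "es2"));
        System.out.println(client.send(host -> true));                // served by es1
        System.out.println(client.send(host -> true));                // served by es2
        System.out.println(client.send(host -> !host.equals("es1"))); // es1 fails, es2 takes over
    }
}
```

With a single configured host and discovery disabled, the loop above makes exactly one attempt, which matches the statement that failover needs more than one known host.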
Note
|
Properties prefixed with
This excludes properties related to the internal Elasticsearch client, which at the moment is common to every index manager (but this will change in a future version).
Excluded properties are |
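If you prefer to assemble these settings in code rather than in persistence.xml, a plain java.util.Properties object is enough to hold them. The sketch below only uses property names listed above; the chosen values (two hosts, validate, yellow) and the class and method names are assumptions for the example, and how you hand the properties to your bootstrap code depends on your setup.

```java
import java.util.Properties;

/** Illustrative only: collects a few of the settings listed above. */
final class ElasticsearchSettingsSketch {

    static Properties elasticsearchSettings() {
        Properties settings = new Properties();
        // Select Elasticsearch as the backend.
        settings.setProperty("hibernate.search.default.indexmanager", "elasticsearch");
        // Two hosts separated by whitespace: requests are assigned in turns.
        settings.setProperty("hibernate.search.default.elasticsearch.host",
                "http://es1.mycompany.com:9200 http://es2.mycompany.com:9200");
        // Example value: only check the existing index, never alter it.
        settings.setProperty(
                "hibernate.search.default.elasticsearch.index_schema_management_strategy",
                "validate");
        // Example value for a development cluster with fewer nodes than replicas.
        settings.setProperty("hibernate.search.default.elasticsearch.required_index_status",
                "yellow");
        return settings;
    }

    public static void main(String[] args) {
        elasticsearchSettings().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

In a JPA application, such a map would typically be passed as the second argument of Persistence.createEntityManagerFactory(String, Map).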
Like in Lucene embedded mode, indexes are transparently updated when you create or update entities mapped to Hibernate Search. Simply use familiar annotations from [search-mapping].
The name of the index will be the lowercased name provided to @Indexed
(non qualified class name by default).
Hibernate Search will map the fully qualified class name to the Elasticsearch type.
The org.hibernate.search.annotations.Field
annotation allows you to provide a replacement value for null properties through the indexNullAs
attribute (see [field-annotation]), but this value must be provided as a string.
In order for your value to be understood by Hibernate Search (and Elasticsearch), the provided string must follow one of those formats:
-
For string values, no particular format is required.
-
For numeric values, use formats accepted by Double.parseDouble, Integer.parseInt, etc., depending on the actual type of your field.
-
For booleans, use either true or false.
-
For dates (java.util.Calendar, java.util.Date, java.time.*), use the ISO-8601 format.
The full format is yyyy-MM-dd'T'HH:mm:ss.nZ[ZZZ] (for instance 2016-11-26T16:41:00.006+01:00[CET]). Please keep in mind that part of this format must be left out depending on the type of your field, though. For a java.time.LocalDateTime field, for instance, the provided string must not include the zone offset (+01:00) or the zone ID ([CET]), because those don’t make sense.
Even when they make sense for the type of your field, the time and time zone may be omitted (if omitted, the time zone will be interpreted as the default JVM time zone).
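To get a feel for which parts of the date format fit which field type, you can try candidate strings against the standard java.time parsers. This is only an approximation: it assumes that the default ISO parsers of java.time behave like the parsing performed by Hibernate Search, which is not guaranteed (for instance, java.time will not accept a missing time where the text above says it may be omitted). The class and method names are made up for the example.

```java
import java.time.LocalDateTime;
import java.time.ZonedDateTime;
import java.time.format.DateTimeParseException;

/** Illustrative only: checks candidate indexNullAs strings with the java.time ISO parsers. */
final class IndexNullAsFormatSketch {

    /** True if the string is a complete ISO-8601 local date-time (no offset, no zone ID). */
    static boolean parsesAsLocalDateTime(String value) {
        try {
            LocalDateTime.parse(value);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    /** True if the string is an ISO-8601 date-time with an offset and optionally a zone ID. */
    static boolean parsesAsZonedDateTime(String value) {
        try {
            ZonedDateTime.parse(value);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String full = "2016-11-26T16:41:00.006+01:00[CET]";
        String local = "2016-11-26T16:41:00.006";
        System.out.println(full + " as ZonedDateTime: " + parsesAsZonedDateTime(full));
        System.out.println(full + " as LocalDateTime: " + parsesAsLocalDateTime(full));
        System.out.println(local + " as LocalDateTime: " + parsesAsLocalDateTime(local));
    }
}
```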
Warning
|
Analyzers are treated differently than in Lucene embedded mode. |
Using the definition
attribute in the @Analyzer
annotation, you can refer to the name of a
built-in Elasticsearch analyzer, or of a custom analyzer already registered on your Elasticsearch instances.
More information on analyzers, in particular those already built in Elasticsearch, can be found in the Elasticsearch documentation.
# Custom analyzer
index.analysis:
  analyzer.custom-analyzer:
    type: custom
    tokenizer: standard
    filter: [custom-filter, lowercase]
  filter.custom-filter:
    type : stop
    stopwords : [test1, close]
From there, you can use the custom analyzers by name in your entity mappings.
@Entity
@Indexed(index = "tweet")
public class Tweet {

    @Id
    @GeneratedValue
    private Integer id;

    @Field
    @Analyzer(definition = "english") // Elasticsearch built-in analyzer
    private String englishTweet;

    @Field
    @Analyzer(definition = "whitespace") // Elasticsearch built-in analyzer
    private String whitespaceTweet;

    // Not analyzed, but stored:
    @Field(name = "tweetNotAnalyzed", analyze = Analyze.NO, store = Store.YES)
    // Custom analyzer:
    @Field(
        name = "tweetWithCustom",
        analyzer = @Analyzer(definition = "custom-analyzer")
    )
    private String multipleTweets;
}
You may also reference a built-in Lucene analyzer implementation using the @Analyzer.impl
attribute:
Hibernate Search will translate the implementation to an equivalent Elasticsearch built-in type, if possible.
Warning
|
Using the It should only be used when migrating an application that already used Hibernate Search, moving from an embedded Lucene instance to an Elasticsearch cluster. |
@Entity
@Indexed(index = "tweet")
public class Tweet {
@Id
@GeneratedValue
private Integer id;
@Field
@Analyzer(impl = EnglishAnalyzer.class) // Elasticsearch built-in "english" analyzer
private String englishTweet;
@Field
@Analyzer(impl = WhitespaceAnalyzer.class) // Elasticsearch built-in "whitespace" analyzer
private String whitespaceTweet;
}
You can also define analyzers within your Hibernate Search mapping using the @AnalyzerDef
annotation,
like you would do with an embedded Lucene instance.
When Hibernate Search creates the Elasticsearch indexes, the relevant definitions will then be automatically added as a
custom analyzer
in the index settings.
Two different approaches allow you to define your analyzers with Elasticsearch.
The first, recommended approach is to use the factories provided by the hibernate-search-elasticsearch
module:
-
org.hibernate.search.elasticsearch.analyzer.ElasticsearchCharFilterFactory
-
org.hibernate.search.elasticsearch.analyzer.ElasticsearchTokenFilterFactory
-
org.hibernate.search.elasticsearch.analyzer.ElasticsearchTokenizerFactory
Those classes can be passed to the factory
attribute of
the @CharFilterDef
, @TokenFilterDef
and @TokenizerDef
annotations.
The params
attribute may be used to define the type
parameter and any other parameter
accepted by Elasticsearch for this type.
The parameter values will be interpreted as JSON. The parser is not strict, though:
-
quotes around strings may be left out in some cases, as when a string only contains letters.
-
when quotes are required (e.g. your string may be interpreted as a number, and you don’t want that), you may use single quotes instead of double quotes (which are painful to write in Java).
Note
|
You may use the |
Elasticsearch*Factory types
@Entity
@Indexed(index = "tweet")
@AnalyzerDef(
name = "tweet_analyzer",
charFilters = {
@CharFilterDef(
name = "custom_html_strip",
factory = ElasticsearchCharFilterFactory.class,
params = {
@Parameter(name = "type", value = "'html_strip'"),
// One can use Json arrays
@Parameter(name = "escaped_tags", value = "['br', 'p']")
}
),
@CharFilterDef(
name = "p_br_as_space",
factory = ElasticsearchCharFilterFactory.class,
params = {
@Parameter(name = "type", value = "'pattern_replace'"),
@Parameter(name = "pattern", value = "'<p/?>|<br/?>'"),
@Parameter(name = "replacement", value = "' '"),
@Parameter(name = "tags", value = "'CASE_INSENSITIVE'")
}
)
},
tokenizer = @TokenizerDef(
factory = ElasticsearchTokenizerFactory.class,
params = {
@Parameter(name = "type", value = "'whitespace'"),
}
)
)
public class Tweet {
@Id
@GeneratedValue
private Integer id;
@Field
@Analyzer(definition = "tweet_analyzer")
private String content;
}
The second approach is to configure everything as if you were using Lucene: use the Lucene factories, their parameter names, and format the parameter values as required in Lucene. Hibernate Search will automatically convert these definitions to the Elasticsearch equivalent.
Warning
|
Referencing Lucene factories is not recommended with Elasticsearch because it will never allow you to take full advantage of Elasticsearch analysis capabilities. Here are the known limitations of the automatic translation:
Therefore, Lucene factories should only be referenced within analyzer definitions when migrating an application that already used Hibernate Search, moving from an embedded Lucene instance to an Elasticsearch cluster. |
@Entity
@Indexed(index = "tweet")
@AnalyzerDef(
name = "tweet_analyzer",
charFilters = {
@CharFilterDef(
name = "custom_html_strip",
factory = HTMLStripCharFilterFactory.class,
params = {
@Parameter(name = "escapedTags", value = "br,p")
}
),
@CharFilterDef(
name = "p_br_as_space",
factory = PatternReplaceCharFilterFactory.class,
params = {
@Parameter(name = "pattern", value = "<p/?>|<br/?>"),
@Parameter(name = "replacement", value = " ")
}
)
},
tokenizer = @TokenizerDef(
factory = WhitespaceTokenizerFactory.class
)
)
public class Tweet {
@Id
@GeneratedValue
private Integer id;
@Field
@Analyzer(definition = "tweet_analyzer")
private String content;
}
You can write custom field bridges and class bridges.
For class bridges and field bridges creating multiple fields,
make sure your bridge implementation also implements the MetadataProvidingFieldBridge
contract.
Caution
|
Creating sub-fields in custom field bridges is not supported. You create a sub-field when your This lack of support is due to Elasticsearch not allowing a field to have multiple types. In the example above, the field would have both the As an alternative, you may append a suffix to the original field name in order to create a sibling field, e.g. use This limitation is true in particular for field bridges applied to the |
/**
* Used as class-level bridge for creating the "firstName" and "middleName" document and doc value fields.
*/
public static class FirstAndMiddleNamesFieldBridge implements MetadataProvidingFieldBridge {
@Override
public void set(String name, Object value, Document document, LuceneOptions luceneOptions) {
Explorer explorer = (Explorer) value;
String firstName = explorer.getNameParts().get( "firstName" );
luceneOptions.addFieldToDocument( name + "_firstName", firstName, document );
document.add( new SortedDocValuesField( name + "_firstName", new BytesRef( firstName ) ) );
String middleName = explorer.getNameParts().get( "middleName" );
luceneOptions.addFieldToDocument( name + "_middleName", middleName, document );
document.add( new SortedDocValuesField( name + "_middleName", new BytesRef( middleName ) ) );
}
@Override
public void configureFieldMetadata(String name, FieldMetadataBuilder builder) {
builder
.field( name + "_firstName", FieldType.STRING )
.sortable( true )
.field( name + "_middleName", FieldType.STRING )
.sortable( true );
}
}
Note
|
This interface and |
You can write queries like you usually do in Hibernate Search: native Lucene queries and DSL queries (see [search-query]). We automatically translate the most common types of Apache Lucene queries and all queries generated by the Hibernate Search DSL, except more like this queries (see below).
Note
|
Unsupported Query DSL features
Queries written via the DSL work. Open a JIRA otherwise. The notable exception is more like this queries. Hibernate Search has a more advanced algorithm than Lucene (or Elasticsearch/Solr) which is not easily portable with what Elasticsearch exposes. If you need this feature, contact us. |
On top of translating Lucene queries, you can directly create Elasticsearch queries by using either its String format or a JSON format:
// Query string format
FullTextSession fullTextSession = Search.getFullTextSession(session);
QueryDescriptor query = ElasticsearchQueries.fromQueryString("title:tales");
List<?> result = fullTextSession.createFullTextQuery(query, ComicBook.class).list();

// JSON format
FullTextSession fullTextSession = Search.getFullTextSession(session);
QueryDescriptor query = ElasticsearchQueries.fromJson(
    "{ 'query': { 'match' : { 'lastName' : 'Brand' } } }");
List<?> result = fullTextSession.createFullTextQuery(query, GolfPlayer.class).list();
Caution
|
Date/time in native Elasticsearch queries
By default Elasticsearch interprets the date/time strings lacking the time zone as if they were represented using the UTC time zone. If overlooked, this can cause your native Elasticsearch queries to be completely off. The simplest way to avoid issues is to always explicitly provide time zones IDs or offsets when building native Elasticsearch queries. This may be achieved either by directly adding the time zone ID or offset in date strings, or by using the |
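One way to always include an offset is to format the date with java.time before building the JSON. The following is a minimal sketch under assumptions: the field name birthDate, the class name and the shape of the range query are only examples, and it presumes that the date format of your field in Elasticsearch accepts ISO-8601 strings with an offset.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

/** Illustrative only: builds a native range query whose date carries an explicit offset. */
final class NativeDateQuerySketch {

    static String rangeQueryFrom(String field, ZonedDateTime from) {
        // ISO_OFFSET_DATE_TIME always appends the offset, e.g. +01:00.
        String formatted = DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(from);
        return "{ 'query': { 'range': { '" + field + "': { 'gte': '" + formatted + "' } } } }";
    }

    public static void main(String[] args) {
        ZonedDateTime from =
                ZonedDateTime.of(2016, 11, 26, 16, 41, 0, 0, ZoneId.of("Europe/Paris"));
        // "birthDate" is a hypothetical field name.
        System.out.println(rangeQueryFrom("birthDate", from));
    }
}
```

The resulting string can then be given to ElasticsearchQueries.fromJson, as in the JSON example earlier in this section.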
The Elasticsearch integration supports spatial queries by using either the DSL or native Elasticsearch queries.
For regular usage, there are no particular requirements for spatial support.
However, if you want to calculate the distance from your entities to a point without sorting by the distance to this point, you need to enable the Groovy plugin by adding the following snippet to your Elasticsearch configuration:
script.engine.groovy.inline.search: on
You may handle large result sets in two different ways, with different limitations.
For (relatively) smaller result sets, you may use the traditional offset/limit querying provided by the FullTextQuery
interfaces: setFirstResult(int)
and setMaxResults(int)
.
Limitations:
-
This will only get you as far as the 10000 first documents, i.e. when requesting a window that includes documents beyond the 10000th result, Elasticsearch will return an error. If you want to raise this limit, see the
index.max_result_window
property in Elasticsearch’s settings.
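The limit applies to the end of the requested window, that is, to the sum of the offset and the page size. A small guard such as the sketch below can tell you beforehand whether a page can be served with offset/limit querying or whether scrolling is needed. The class and method names are made up for the example, and the limit must be passed in because Hibernate Search does not expose it here.

```java
/** Illustrative only: checks a page request against index.max_result_window. */
final class ResultWindowSketch {

    /** Default value of index.max_result_window, as stated above. */
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** True if the window [firstResult, firstResult + maxResults) stays within the limit. */
    static boolean fitsInWindow(int firstResult, int maxResults, int maxResultWindow) {
        // Use long arithmetic so that large arguments cannot overflow.
        return (long) firstResult + maxResults <= maxResultWindow;
    }

    public static void main(String[] args) {
        // Results 9991 to 10000: still within the default window.
        System.out.println(fitsInWindow(9_990, 10, DEFAULT_MAX_RESULT_WINDOW));
        // The 10001st result: beyond the default window.
        System.out.println(fitsInWindow(10_000, 1, DEFAULT_MAX_RESULT_WINDOW));
    }
}
```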
If your result set is bigger, you may take advantage of scrolling by using the scroll
method on org.hibernate.search.FullTextQuery
.
Limitations:
-
This method is not available in org.hibernate.search.jpa.FullTextQuery.
-
The Elasticsearch implementation has poor performance when an offset has been defined (i.e. setFirstResult(int) has been called on the query before calling scroll()). This is because Elasticsearch does not provide such a feature, thus Hibernate Search has to scroll through every previous result under the hood.
-
The Elasticsearch implementation allows only limited backtracking. Calling scrollableResults.setRowNumber(4) when currently positioned at index 1006, for example, may result in a SearchException being thrown if only 1000 previous elements are kept in memory. You may work around this by tweaking the property hibernate.search.elasticsearch.scroll_backtracking_window_size (see Elasticsearch integration configuration).
-
The ScrollableResults will become stale and unusable after a given period of time spent without fetching results from Elasticsearch. You may work around this by tweaking two properties: hibernate.search.elasticsearch.scroll_timeout and hibernate.search.elasticsearch.scroll_fetch_size (see Elasticsearch integration configuration). Typically, you will solve timeout issues by reducing the fetch size and/or increasing the timeout limit, but this will also increase the performance hit on Elasticsearch.
Sorting is performed the same way as with the Lucene backend.
If you happen to need an advanced Elasticsearch sorting feature that is not natively supported in SortField
or in Hibernate Search sort DSL, you may still create a sort from JSON, and even mix it with DSL-defined sorts:
QueryBuilder qb = fullTextSession.getSearchFactory()
    .buildQueryBuilder().forEntity(Book.class).get();
Query luceneQuery = /* ... */;
FullTextQuery query = fullTextSession.createFullTextQuery( luceneQuery, Book.class );
Sort sort = qb.sort()
    .byNative( "authors.name", "{'order':'asc', 'mode': 'min'}" )
    .andByField("title")
    .createSort();
query.setSort(sort);
List results = query.list();
All fields are stored by Elasticsearch in the JSON document it indexes,
so there is no specific need to mark fields as stored when you want to project them.
The downside is that to project a field, Elasticsearch needs to read the whole JSON document.
If you want to avoid that, use the Store.YES
marker.
You can also retrieve the full JSON document by using org.hibernate.search.elasticsearch.ElasticsearchProjectionConstants.SOURCE
.
query = ftem.createFullTextQuery(
qb.keyword()
.onField( "tags" )
.matching( "round-based" )
.createQuery(),
VideoGame.class
)
.setProjection( ElasticsearchProjectionConstants.SCORE, ElasticsearchProjectionConstants.SOURCE );
projection = (Object[]) query.getSingleResult();
If you’re looking for information about execution time, you may also use org.hibernate.search.elasticsearch.ElasticsearchProjectionConstants.TOOK
and org.hibernate.search.elasticsearch.ElasticsearchProjectionConstants.TIMED_OUT
:
query = ftem.createFullTextQuery(
qb.keyword()
.onField( "tags" )
.matching( "round-based" )
.createQuery(),
VideoGame.class
)
.setProjection(
ElasticsearchProjectionConstants.SOURCE,
ElasticsearchProjectionConstants.TOOK,
ElasticsearchProjectionConstants.TIMED_OUT
);
projection = (Object[]) query.getSingleResult();
Integer took = (Integer) projection[1]; // Execution time (milliseconds)
Boolean timedOut = (Boolean) projection[2]; // Whether the query timed out
The Elasticsearch integration supports the definition of full text filters.
Your filters need to implement the ElasticsearchFilter
interface.
public class DriversMatchingNameElasticsearchFilter implements ElasticsearchFilter {
private String name;
public DriversMatchingNameElasticsearchFilter() {
}
public void setName(String name) {
this.name = name;
}
@Override
public String getJsonFilter() {
return "{ 'term': { 'name': '" + name + "' } }";
}
}
You can then declare the filter in your entity.
@Entity
@Indexed
@FullTextFilterDef(name = "namedDriver",
impl = DriversMatchingNameElasticsearchFilter.class)
public class Driver {
@Id
@DocumentId
private int id;
@Field(analyze = Analyze.YES)
private String name;
...
}
From then on, you can use it as usual.
ftQuery.enableFullTextFilter( "namedDriver" ).setParameter( "name", "liz" );
For static filters, you can simply extend SimpleElasticsearchFilter
and provide an Elasticsearch filter in JSON form.
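When a filter parameter comes from user input, concatenating it directly into the JSON string (as in the getJsonFilter() example above) produces invalid JSON as soon as the value contains a quote. A hedged way around this is to escape the value before embedding it. The helper below is a sketch written for this purpose, not part of Hibernate Search; its names are hypothetical, and it emits strict, double-quoted JSON rather than the single-quoted form used in the examples.

```java
/** Illustrative only: builds a term filter with properly escaped JSON strings. */
final class JsonFilterSketch {

    /** Returns the value as a double-quoted JSON string with special characters escaped. */
    static String quote(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be written as \\uXXXX escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Builds a term filter; both the field name and the value are escaped. */
    static String termFilter(String field, String value) {
        return "{ \"term\": { " + quote(field) + ": " + quote(value) + " } }";
    }

    public static void main(String[] args) {
        System.out.println(termFilter("name", "liz"));
        System.out.println(termFilter("name", "say \"hi\""));
    }
}
```

A getJsonFilter() implementation could then return termFilter("name", name) instead of concatenating the raw value.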
The optimization features documented in [search-optimize] are only partially implemented. That kind of optimization is rarely needed with recent versions of Lucene (on which Elasticsearch is based), but some of it is still provided for the very specific case of indexes meant to stay read-only for a long period of time:
-
The automatic optimization is not implemented and most probably never will be.
-
The manual optimization (
searchFactory.optimize()
) is implemented.
Search queries are logged to the org.hibernate.search.fulltext_query
category at DEBUG
level,
as when using an embedded Lucene instance (the query format is Elasticsearch’s, though).
In addition, you can enable the logging of every single request sent to the Elasticsearch cluster
by enabling TRACE
logging for the log category org.hibernate.search.elasticsearch.request
.
Not everything is implemented yet. Here is a list of known limitations.
Please check with JIRA and the mailing lists for updates, but at the time of writing this at least the following features are known to not work yet:
-
Query timeouts: HSEARCH-2399
-
MoreLikeThis queries: HSEARCH-2395
-
@IndexedEmbedded.indexNullAs: HSEARCH-2389
-
@AnalyzerDiscriminator: HSEARCH-2428
-
Mixing Lucene based indexes and Elasticsearch based indexes (partial support is here though)
-
Hibernate Search does not make use of nested objects nor parent child relationship mapping (HSEARCH-2263). This is largely mitigated by the fact that Hibernate Search does the denormalization itself and maintains data consistency when nested objects are updated.
-
There is room for improvement in the performance of the MassIndexer implementation
-
Our new Elasticsearch integration module does not work in OSGi environments. If you need this, please vote for: HSEARCH-2524.
Depending on the Elasticsearch version you use, you may encounter bugs that are specific to that version. Here is a list of known Elasticsearch bugs, and what to do about them.
-
Mapping java.time.ZonedDateTime won’t work with Elasticsearch 2.4.1 because of a JodaTime bug affecting Elasticsearch: HSEARCH-2414.
Fix: Upgrade to Elasticsearch 2.4.2.
More information about Elasticsearch can be found on the Elasticsearch website and its reference documentation.