Pivotal GemFire allows Indexes (or Indices) to be created on Region data to improve the performance of OQL queries.
In Spring Data GemFire (SDG), Indexes are declared with the index element:
<gfe:index id="myIndex" expression="someField" from="/SomeRegion" type="HASH"/>
In Spring Data GemFire’s XML schema (a.k.a. the SDG namespace), Index bean declarations are not bound to a Region, unlike in GemFire’s native cache.xml. Rather, they are top-level elements, just like <gfe:cache>. This allows a developer to declare any number of Indexes on any Region, whether the Region was just created or already exists, a significant improvement over GemFire’s native cache.xml format.
An Index must have a name. A developer may give the Index an explicit name using the name attribute; otherwise, the bean name (i.e. the value of the id attribute) of the Index bean definition is used as the Index name.
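The naming rule above can be sketched in a few lines of plain Java. This is only an illustration: IndexNames and resolve are hypothetical names, not part of SDG, and treating a blank name the same as a missing one is my assumption.

```java
import java.util.Optional;

// Hypothetical helper; it is NOT part of Spring Data GemFire's API.
final class IndexNames {

    private IndexNames() {}

    // Returns the explicit name when it has text; otherwise falls back to the bean id.
    // Treating a blank name like a missing one is an assumption of this sketch.
    static String resolve(String explicitName, String beanId) {
        return Optional.ofNullable(explicitName)
            .map(String::trim)
            .filter(name -> !name.isEmpty())
            .orElse(beanId);
    }

    public static void main(String[] args) {
        System.out.println(resolve("CustomersLastNameIndex", "myIndex"));
        System.out.println(resolve(null, "myIndex"));
    }
}
```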
The expression and from clause form the main components of an Index, identifying the data to index (i.e. the Region identified in the from clause) along with the criteria (i.e. the expression) used to index the data. The expression should be based on the application domain object fields used in the predicates of the application-defined OQL queries that look up the objects stored in the Region.
For example, if I have a Customer that has a lastName property…
@Region("Customers")
class Customer {
    @Id
    Long id;
    String lastName;
    String firstName;
    ...
}
And, I also have an application-defined SDG Repository to query for Customers…
interface CustomerRepository extends GemfireRepository<Customer, Long> {
    Customer findByLastName(String lastName);
    ...
}
Then, the SDG Repository finder/query method would result in the following OQL statement being executed, with the lastName argument bound to the $1 parameter…
SELECT * FROM /Customers c WHERE c.lastName = $1
Therefore, I might want to create an Index like so…
<gfe:index id="myIndex" name="CustomersLastNameIndex" expression="lastName" from="/Customers" type="HASH"/>
The from clause must refer to a valid, existing Region and is how an Index gets applied to a Region. This is not specific to Spring Data GemFire; it is a feature of Pivotal GemFire.
The Index type may be one of three enumerated values defined by Spring Data GemFire’s IndexType enumeration: FUNCTIONAL, HASH and PRIMARY_KEY.
Each of the enumerated values corresponds to one of the QueryService create[|Key|Hash]Index methods invoked when the actual Index is to be created (or "defined"; more on "defining" Indexes below). For instance, if the IndexType is PRIMARY_KEY, then QueryService.createKeyIndex(..) is invoked to create a KEY Index.
The default is FUNCTIONAL and results in one of the QueryService.createIndex(..) methods being invoked.
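To make the mapping concrete, here is a small, self-contained sketch. IndexKind is a hypothetical stand-in for SDG's IndexType (it is not the real class); only the three constant names and the QueryService method names are taken from the text above.

```java
// Hypothetical stand-in for SDG's IndexType enumeration; this is NOT the real class.
// Only the constant names and the QueryService method names come from the text above.
enum IndexKind {

    FUNCTIONAL("createIndex"),
    HASH("createHashIndex"),
    PRIMARY_KEY("createKeyIndex");

    private final String queryServiceMethod;

    IndexKind(String queryServiceMethod) {
        this.queryServiceMethod = queryServiceMethod;
    }

    // Name of the QueryService method family this kind of Index maps to.
    String queryServiceMethod() {
        return queryServiceMethod;
    }

    // FUNCTIONAL is the default when no type is declared.
    static IndexKind orDefault(IndexKind declared) {
        return declared != null ? declared : FUNCTIONAL;
    }

    public static void main(String[] args) {
        for (IndexKind kind : values()) {
            System.out.println(kind + " -> QueryService." + kind.queryServiceMethod() + "(..)");
        }
    }
}
```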
See the Spring Data GemFire XML schema for a full set of options.
For more information on Indexing in Pivotal GemFire, see Working with Indexes in Pivotal GemFire’s User Guide.
In addition to creating Indexes upfront as Index bean definitions are processed by Spring Data GemFire on Spring container initialization, you may also define all of your application Indexes prior to creating them by using the define attribute, like so…
<gfe:index id="myDefinedIndex" expression="someField" from="/SomeRegion" define="true"/>
When define is set to true (it defaults to false), the Index is not actually created right then and there.
All "defined" Indexes are created at once when the Spring ApplicationContext is "refreshed", that is, when a ContextRefreshedEvent is published by the Spring container. Spring Data GemFire registers itself as an ApplicationListener listening for the ContextRefreshedEvent. When the event fires, Spring Data GemFire calls QueryService.createDefinedIndexes().
Defining Indexes and creating them all at once helps promote speed and efficiency when creating Indexes.
See Creating Multiple Indexes at Once for more details.
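The difference in timing between "creating" and "defining" can be illustrated with a toy model. Nothing here calls GemFire or Spring: DeferredIndexRegistry, register and onContextRefreshed are hypothetical names, and otherField in the usage below is invented for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only; it does not call GemFire or Spring. All names are hypothetical.
final class DeferredIndexRegistry {

    // Index name -> definition, for Indexes declared with define=true.
    private final Map<String, String> defined = new LinkedHashMap<>();

    // Names of Indexes that have actually been "created", in creation order.
    private final List<String> created = new ArrayList<>();

    // define=false creates immediately; define=true only records the definition.
    void register(String name, String expression, String from, boolean define) {
        if (define) {
            defined.put(name, expression + " FROM " + from);
        } else {
            created.add(name);
        }
    }

    // Stands in for the single batch call made once the context has been refreshed.
    List<String> onContextRefreshed() {
        List<String> batch = new ArrayList<>(defined.keySet());
        created.addAll(batch);
        defined.clear();
        return batch;
    }

    List<String> created() {
        return new ArrayList<>(created);
    }

    public static void main(String[] args) {
        DeferredIndexRegistry registry = new DeferredIndexRegistry();
        registry.register("myDefinedIndex", "someField", "/SomeRegion", true);
        registry.register("myIndex", "otherField", "/SomeRegion", false);
        System.out.println("Before refresh: " + registry.created());
        registry.onContextRefreshed();
        System.out.println("After refresh: " + registry.created());
    }
}
```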
Two Spring Data GemFire Index configuration options warrant special mention here: ignoreIfExists and override. These options correspond to the ignore-if-exists and override attributes on the <gfe:index> element in Spring Data GemFire’s XML schema, respectively.
Warning: Make sure you absolutely understand what you are doing before using either of these options. These options can affect the performance and/or resources (e.g. memory) consumed by your application at runtime. As such, both of these options are disabled (i.e. set to false) in SDG by default.
Note: These options are only available in Spring Data GemFire and exist to work around known limitations with Pivotal GemFire; there are no equivalent options or functionality available in GemFire itself.
The behavior of each option differs significantly and depends entirely on the type of GemFire Index Exception thrown. This also means that neither option has any effect if a GemFire Index-type Exception is not thrown. These options are meant specifically to handle GemFire IndexExistsExceptions and IndexNameConflictExceptions, which can occur for various, sometimes obscure reasons. But, in general…
- An IndexExistsException is thrown when there exists another Index with the same definition but different name when attempting to create an Index.
- An IndexNameConflictException is thrown when there exists another Index with the same name but possibly different definition when attempting to create an Index.
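The two rules above can be modeled with a toy, in-memory registry. This is not how GemFire is implemented; IndexConflicts and Outcome are hypothetical names, and checking the name before the definition is my assumption, based on the wording of the two rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the two rules above; hypothetical names, no GemFire calls.
final class IndexConflicts {

    enum Outcome { CREATED, INDEX_EXISTS, INDEX_NAME_CONFLICT }

    // Index name -> definition (here just "expression FROM region").
    private final Map<String, String> definitionsByName = new LinkedHashMap<>();

    Outcome create(String name, String definition) {
        if (definitionsByName.containsKey(name)) {
            // Same name; the definition may or may not differ.
            return Outcome.INDEX_NAME_CONFLICT;
        }
        if (definitionsByName.containsValue(definition)) {
            // Same definition under a different name.
            return Outcome.INDEX_EXISTS;
        }
        definitionsByName.put(name, definition);
        return Outcome.CREATED;
    }

    public static void main(String[] args) {
        IndexConflicts indexes = new IndexConflicts();
        System.out.println(indexes.create("CustomersLastNameIndex", "lastName FROM /Customers"));
        System.out.println(indexes.create("AnotherName", "lastName FROM /Customers"));
        System.out.println(indexes.create("CustomersLastNameIndex", "firstName FROM /Customers"));
    }
}
```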
Spring Data GemFire’s default behavior is to fail-fast, always! So, neither Index Exception will be "handled" by default; these Index Exceptions are simply wrapped in an SDG GemfireIndexException and rethrown. If you wish for Spring Data GemFire to handle them for you, then you can set either of these Index bean definition options.
IgnoreIfExists always takes precedence over Override, primarily because it uses fewer resources given it returns the "existing" Index in both exceptional cases.
When an IndexExistsException is thrown and ignoreIfExists is set to true (or <gfe:index ignore-if-exists="true">), then the Index that would have been created by this Index bean definition / declaration will be "ignored", and the "existing" Index will be returned.
There is very little consequence in returning the "existing" Index since the Index "definition" is the same, as deemed by GemFire itself, not SDG.
However, this also means that no Index with the "name" specified in your Index bean definition / declaration will "actually" exist from GemFire’s perspective either (i.e. with QueryService.getIndexes()). Therefore, you should be careful when writing OQL query statements that use Query Hints, especially Hints that refer to the application Index being "ignored". Those Query Hints will need to be changed.
Now, when an IndexNameConflictException is thrown and ignoreIfExists is set to true (or <gfe:index ignore-if-exists="true">), then the Index that would have been created by this Index bean definition / declaration will also be "ignored", and the "existing" Index will be returned, just like when an IndexExistsException is thrown.
However, there is more risk in returning the "existing" Index and "ignoring" the application’s definition of the Index when an IndexNameConflictException is thrown since, for an IndexNameConflictException, while the "names" of the conflicting Indexes are the same, the "definitions" could very well be different! This could have implications for OQL queries specific to the application, where you would presume the Indexes were defined specifically with the application data access patterns and queries in mind. However, if like-named Indexes differ in definition, this might not be the case. So, make sure you verify.
Note: SDG makes a best effort to inform the user when the Index being ignored is significantly different in its definition from the "existing" Index. However, in order for SDG to accomplish this, it must be able to "find" the existing Index, which is looked up using the GemFire API (the only means available).
When an IndexExistsException is thrown and override is set to true (or <gfe:index override="true">), then the Index is effectively "renamed". Remember, IndexExistsExceptions are thrown when multiple Indexes exist, all having the same "definition" but different "names".
Spring Data GemFire can only accomplish this using GemFire’s API, by first "removing" the "existing" Index and then "recreating" the Index with the new name. It is possible that either the remove or the subsequent create invocation could fail. There is no way to execute both actions atomically and roll back this joint operation if either fails.
However, if it succeeds, then you have the same problem as before with the "ignoreIfExists" option. Any existing OQL query statement using "Query Hints" referring to the old Index by name must be changed.
Now, when an IndexNameConflictException is thrown and override is set to true (or <gfe:index override="true">), then potentially the "existing" Index will be "re-defined". I say "potentially", because it is possible for the "like-named", "existing" Index to have exactly the same definition and name when an IndexNameConflictException is thrown.
If so, SDG is smart and will just return the "existing" Index as is, even on override. There is no harm in this since both the "name" and the "definition" are exactly the same. Of course, SDG can only accomplish this when SDG is able to "find" the "existing" Index, which is dependent on GemFire’s APIs. If it cannot find it, nothing happens and an SDG GemfireIndexException is thrown wrapping the IndexNameConflictException.
However, when the "definition" of the "existing" Index is different, then SDG will attempt to "recreate" the Index using the Index definition specified in the Index bean definition / declaration. Make sure this is what you want and make sure the Index definition matches your expectations and application requirements.
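The handling rules described so far can be summarized as a small decision table. The sketch below is my reading of the text, not SDG's actual implementation; IndexConflictPolicy, Conflict, Action and decide are hypothetical names, and the sketch assumes the "existing" Index can always be returned when ignoreIfExists is set.

```java
// Toy decision table for the handling rules described above.
// All names are hypothetical; this is not SDG's implementation.
final class IndexConflictPolicy {

    enum Conflict { INDEX_EXISTS, INDEX_NAME_CONFLICT }

    enum Action { FAIL_FAST, RETURN_EXISTING, RENAME_EXISTING, REDEFINE_EXISTING }

    private IndexConflictPolicy() {}

    static Action decide(Conflict conflict, boolean ignoreIfExists, boolean override,
            boolean existingFound, boolean sameDefinition) {

        if (ignoreIfExists) {
            // Takes precedence over override in both exceptional cases.
            // Assumption of this sketch: the "existing" Index can be returned.
            return Action.RETURN_EXISTING;
        }
        if (!override) {
            // Default: wrap in a GemfireIndexException and rethrow.
            return Action.FAIL_FAST;
        }
        if (conflict == Conflict.INDEX_EXISTS) {
            // Remove the existing Index, then recreate it under the new name (not atomic).
            return Action.RENAME_EXISTING;
        }
        // IndexNameConflictException with override=true.
        if (!existingFound) {
            return Action.FAIL_FAST;
        }
        return sameDefinition ? Action.RETURN_EXISTING : Action.REDEFINE_EXISTING;
    }

    public static void main(String[] args) {
        System.out.println(decide(Conflict.INDEX_EXISTS, false, true, true, true));
        System.out.println(decide(Conflict.INDEX_NAME_CONFLICT, false, true, true, false));
    }
}
```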
It is probably not all that uncommon for IndexExistsExceptions to be thrown, especially when multiple configuration sources are used to configure GemFire (e.g. Spring Data GemFire, GemFire Cluster Config, maybe GemFire native cache.xml, the API, etc.). You should definitely prefer one configuration method here and stick with it.
However, when does an IndexNameConflictException get thrown?
One particular case is an Index defined on a PARTITION Region (PR). When an Index is defined on a PARTITION Region (e.g. "X"), GemFire distributes the Index definition (and name) to other peer members in the cluster that also host the same PARTITION Region (i.e. "X"). The distribution of this Index definition to, and subsequent creation of this Index by, peer members on a "need-to-know" basis (i.e. those hosting the same PR) is performed asynchronously.
During this window of time, it is possible that these "pending" PR Indexes will not be identifiable by GemFire, such as with a call to QueryService.getIndexes(), QueryService.getIndexes(:Region), or even QueryService.getIndex(:Region, indexName:String).
As such, the only way for SDG or other GemFire cache client applications (not involving Spring) to know for sure is to just attempt to create the Index. If it fails with either an IndexNameConflictException or even an IndexExistsException, then you will know. This is because the QueryService Index creation waits on "pending" Index definitions, whereas the other GemFire API calls do not.
In any case, SDG makes a best effort and attempts to inform the user of what has happened or is happening, along with the corrective action. Given that all GemFire QueryService.createIndex(..) methods are synchronous, "blocking" operations, the state of GemFire should be consistent and accessible after either of these Index-type Exceptions is thrown, in which case SDG can inspect the state of the system and respond/act accordingly, based on the user’s desired configuration.
In all other cases, SDG will simply fail-fast!