Entity Stores

Vyacheslav Lukianov edited this page Jul 1, 2020 · 15 revisions

The Entity Stores layer is designed to access data as entities with attributes and links. Use a transaction to create, modify, read and query data. Transactions are quite similar to those on the Environments layer, though the Entity Store API is much richer in terms of querying data. The API and the implementation live in the jetbrains.exodus.entitystore package.

PersistentEntityStore
Transactions
Entities
Properties
Links
Blobs
Queries
    Iterating over All Entities of Specified Type
    EntityIterable
    EntityIterator
    Searching by Property Value
    Searching in Range of Property Values
    Traversing Links
    SelectDistinct and SelectManyDistinct
    Binary Operations
    Searching for Entities Having Property, Link, Blob
    Sorting
    Other Goodies
Sequences

PersistentEntityStore

To open or create an entity store, create an instance of PersistentEntityStore with the help of the PersistentEntityStores utility class:

PersistentEntityStore entityStore = PersistentEntityStores.newInstance("/home/me/.myAppData");

PersistentEntityStore works over Environment, so the method that is shown above implicitly creates an Environment with the same location. Each PersistentEntityStore has a name. You can create several entity stores with different names over an Environment. If you don't specify a name, the default name is used.

PersistentEntityStores has different methods to create an instance of a PersistentEntityStore. In addition to the underlying Environment, you can specify the BlobVault and PersistentEntityStoreConfig. BlobVault is a base class that describes an interface to binary large objects (BLOBs) that are used internally by the implementation of a PersistentEntityStore. If you don't specify a BlobVault when you create a PersistentEntityStore, an instance of the FileSystemBlobVault class is used. If you don't specify the PersistentEntityStoreConfig when you create a PersistentEntityStore, PersistentEntityStoreConfig.DEFAULT is used.

Like ContextualEnvironment, PersistentEntityStore is always aware of the transaction that is started in the current thread. The getCurrentTransaction() method returns the transaction that is started in the current thread or null if there is no such transaction.

When you are finished working with a PersistentEntityStore, call the close() method.

Transactions

Entity store transactions are quite similar to the Environment layer transactions. To manually start a transaction, use beginTransaction():

final StoreTransaction txn = store.beginTransaction(); 

or beginReadonlyTransaction():

final StoreTransaction txn = store.beginReadonlyTransaction();

An attempt to modify data in a read-only transaction fails with a ReadonlyTransactionException.

Any transaction should be finished, meaning it is either aborted or committed, where abort() is the only way to finish a read-only transaction. The transaction can also be flushed or reverted. The methods commit() and flush() return true if they succeed. If either method returns false, a database version mismatch has occurred. In this case, there are two possibilities: abort the transaction and finish, or revert the transaction and continue. An unsuccessful flush implicitly reverts the transaction and moves it to the latest (newest) database snapshot, so database operations can be repeated against it:

StoreTransaction txn = store.beginTransaction();
try {
    do {
        // do something
        // if txn has already been aborted in user code
        if (txn != store.getCurrentTransaction()) {
            txn = null;
            break;
        }
    } while (!txn.flush());
} finally {
    // if txn has not already been aborted in user code
    if (txn != null) {
        txn.abort();
    }
}

If you don't care for such spinning and don't want to control the results of flush() and commit(), you can use the executeInTransaction(), executeInExclusiveTransaction(), executeInReadonlyTransaction(), computeInTransaction() and computeInReadonlyTransaction() methods.
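The retry-until-flush loop that such helper methods hide can be illustrated without Xodus. The following is a minimal plain-Java sketch, assuming that a database snapshot can be modeled as an AtomicLong and a flush as a compare-and-set. The class OptimisticRetrySketch and its methods are hypothetical names for illustration and are not part of the Xodus API.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

/** Plain-Java model of an optimistic retry loop; hypothetical, not Xodus code. */
final class OptimisticRetrySketch {

    // Stands in for the latest committed database snapshot.
    private final AtomicLong committed = new AtomicLong();

    /** Repeats the change until it is applied on top of the newest value. */
    long computeWithRetry(LongUnaryOperator change) {
        while (true) {
            final long seen = committed.get();             // "begin": read the current snapshot
            final long updated = change.applyAsLong(seen); // "do something"
            if (committed.compareAndSet(seen, updated)) {  // "flush": fails on a concurrent commit
                return updated;
            }
            // Mismatch: loop and redo the work against the newest value.
        }
    }

    long current() {
        return committed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        final OptimisticRetrySketch store = new OptimisticRetrySketch();
        final Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    store.computeWithRetry(v -> v + 1);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(store.current()); // no increment is lost
    }
}
```

The point of the model is that the work inside the loop may run more than once, so it should not have side effects outside the transaction.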

Entities

Entities can have properties and blobs, and can be linked. Each property, blob, or link is identified by its name. Although entity properties are expected to be Comparable, only Java primitive types, Strings, and ComparableSet values can be used by default. Use the PersistentEntityStore.registerCustomPropertyType() method to define your own property type.

Imagine that your application must include a user management system. All further samples assume that a StoreTransaction txn is accessible. Let's create a new user:

final Entity user = txn.newEntity("User");

Each Entity has a string entity type and a unique ID, which is described by EntityId:

final String type = user.getType();
final EntityId id = user.getId();

The entity ID may be used as part of a URL or in any other way to load the entity:

final Entity user = txn.getEntity(id);

Properties

Let's create a user with a specific loginName, fullName, email and password:

final Entity user = txn.newEntity("User");
user.setProperty("login", loginName);
user.setProperty("fullName", fullName);
user.setProperty("email", email);
final String salt = MessageDigestUtil.sha256(Double.valueOf(Math.random()).toString());
user.setProperty("salt", salt);
user.setProperty("password", MessageDigestUtil.sha256(salt + password));

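The MessageDigestUtil class from the utils module is used to hash the salted password. SHA-256 is a one-way hash function, not encryption, so the stored value cannot be decrypted back into the password.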
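For readers without the utils module at hand, the same salt-and-hash idea can be sketched with the Java standard library only. This is a minimal sketch, not the implementation of MessageDigestUtil: the class SaltedHashSketch and its methods are hypothetical, the salt comes from SecureRandom instead of Math.random(), and whether the hex output format matches MessageDigestUtil.sha256() is not verified here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

/** Standard-library sketch of salted SHA-256 hashing; hypothetical helper. */
final class SaltedHashSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    /** Encodes bytes as a lower-case hex string. */
    static String toHex(byte[] bytes) {
        final StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    /** Returns the SHA-256 digest of the UTF-8 bytes of the input, as hex. */
    static String sha256Hex(String input) {
        try {
            final MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return toHex(digest.digest(input.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** Generates a random 16-byte salt, hex encoded. */
    static String newSalt() {
        final byte[] salt = new byte[16];
        RANDOM.nextBytes(salt);
        return toHex(salt);
    }

    /** Checks a password against a stored salt and hash. */
    static boolean matches(String salt, String password, String storedHash) {
        final byte[] actual = sha256Hex(salt + password).getBytes(StandardCharsets.UTF_8);
        final byte[] expected = storedHash.getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(actual, expected); // constant-time comparison
    }

    public static void main(String[] args) {
        final String salt = newSalt();
        final String stored = sha256Hex(salt + "secret");
        System.out.println(matches(salt, "secret", stored));
    }
}
```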

Links

The user management system should probably be able to save additional information about a user, including age, bio, and avatar. It's reasonable not to save this information directly in a User entity, but to create a UserProfile one and link it with the user:

final Entity userProfile = txn.newEntity("UserProfile");
userProfile.setLink("user", user);
user.setLink("userProfile", userProfile);
userProfile.setProperty("age", age);

Reading the profile of a user:

final Entity userProfile = user.getLink("userProfile");
if (userProfile != null) {
    // read properties of userProfile
}

The setLink() method sets the new link and overrides the previous one. It is also possible to add a new link that does not affect existing links. Suppose users can be logged in with the help of different AuthModules, such as LDAP or OpenID. It makes sense to create an entity for each auth module and link it with the user:

final Entity authModule = txn.newEntity("AuthModule");
authModule.setProperty("type", "LDAP");
user.addLink("authModule", authModule);
authModule.setLink("user", user);

Iterating over all of a user's auth modules:

for (Entity authModule: user.getLinks("authModule")) {
    // read properties of authModule
}

It's also possible to delete a specific auth module:

user.deleteLink("authModule", authModule);

or delete all available auth modules:

user.deleteLinks("authModule");

Blobs

Some values cannot be expressed as Strings or primitive types, or are too large to be stored as properties. Save large strings (like the biography of a user) in a blob string instead of a property. For raw binary data like images and media, use blobs:

userProfile.setBlobString("bio", bio);
userProfile.setBlob("avatar", file);

A blob string is similar to a property, but it cannot be used in Search Queries. To read a blob string, use the Entity.getBlobString() method.

The value of a blob can be set as java.io.InputStream or java.io.File. The second method is preferred when setting a blob from a file. To read a blob, use the Entity.getBlob() method. You are not required to and should not close the input stream that is returned by the method. Concurrent access to a single blob within a single transaction is not possible.
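Since the returned stream should not be closed by the caller, a blob can be read with a helper that drains the stream and leaves it open. The following is a minimal sketch using only the Java standard library. The class BlobReadSketch and its readFully() method are hypothetical names, and returning null for a missing blob is an assumption of this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Reads a stream completely without closing it; hypothetical helper. */
final class BlobReadSketch {

    static byte[] readFully(InputStream in) {
        if (in == null) {
            return null; // assumption: a missing blob is represented by null
        }
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        final byte[] buffer = new byte[8192];
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is intentionally left open.
        return out.toByteArray();
    }

    public static void main(String[] args) {
        final byte[] avatar = {1, 2, 3};
        System.out.println(readFully(new ByteArrayInputStream(avatar)).length);
    }
}
```

With the samples above, the helper would be called as readFully(userProfile.getBlob("avatar")).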

Queries

StoreTransaction contains a lot of methods to query, sort, and filter entities. All of them return an instance of EntityIterable.

Iterating over All Entities of Specified Type

Let's iterate over all users and print their full names:

final EntityIterable allUsers = txn.getAll("User");
for (Entity user: allUsers) {
    System.out.println(user.getProperty("fullName"));
}

As you can see, EntityIterable is Iterable<Entity>.

EntityIterable

EntityIterable lets you lazily iterate over entities. EntityIterable is valid only against a particular database snapshot, so finishing the transaction or moving it to the newest snapshot (flush(), revert()) breaks the iteration. If you need to flush the current transaction during an iteration over an EntityIterable, you have to manually load the entire entity iterable to a list and then iterate over the list.
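Loading an iterable to a list needs nothing Xodus-specific. The following is a minimal sketch of a generic helper, assuming only that the source is a java.lang.Iterable; the class MaterializeSketch and its toList() method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Copies any Iterable into a list so it can be traversed later; hypothetical helper. */
final class MaterializeSketch {

    static <T> List<T> toList(Iterable<? extends T> source) {
        final List<T> result = new ArrayList<>();
        for (T item : source) {
            result.add(item); // forces the lazy iteration to run to the end
        }
        return result;
    }

    public static void main(String[] args) {
        final List<String> copy = toList(Arrays.asList("a", "b", "c"));
        System.out.println(copy.size());
    }
}
```

With the samples above, the call would look like toList(txn.getAll("User")), made before the first flush(). The resulting list can then be iterated while the transaction is flushed inside the loop.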

You can find out the size of EntityIterable without iterating:

final long userCount = txn.getAll("User").size();

Even though the size() method performs faster than an iteration, it can be quite slow for some iterables. Xodus does a lot of caching internally, so sometimes the size of an EntityIterable can be computed quite quickly. You can check if it can be computed quickly by using the count() method:

final long userCount = txn.getAll("User").count();
if (userCount >= 0) {
    // result for txn.getAll("User") is cached, so user count is known
}

The count() method checks if the result (a sequence of entity ids) is cached for the EntityIterable. If the sequence is cached, the size is returned quickly. If the result is not cached, the count() method returns -1.

In addition to size() and count(), which always return an actual value (if not -1), there are eventually consistent methods getRoughCount() and getRoughSize(). If the result for the EntityIterable is cached, these methods return the same value as count() and size() do. If the result is not cached, Xodus can internally cache the value of last known size of the EntityIterable. If the last known size is cached, getRoughCount() and getRoughSize() return it. Otherwise, getRoughCount() returns -1 and getRoughSize() returns the value of size().

Use the isEmpty() method to check if an EntityIterable is empty. In most cases, it is faster than getting size(), and it returns immediately if the EntityIterable's result is cached.

EntityIterator

EntityIterator is an iterator of EntityIterable. It is an Iterator<Entity>, but it also lets you enumerate entity IDs instead of entities using the nextId() method. Getting only IDs provides better iteration performance.

EntityIterator contains the dispose() method that releases all resources that the iterator possibly consumes. The shouldBeDisposed() method tells whether the iterator still needs to be disposed. The dispose() method is called implicitly in two cases: when an iteration finishes and hasNext() returns false, and when the transaction finishes or moves to the latest snapshot (any of commit(), abort(), flush() or revert() is called). Sometimes, it makes sense to call dispose() manually. For example, a check whether an EntityIterable is empty can look like this:

boolean isEmptyIterable(final EntityIterable iterable) {
    final EntityIterator it = iterable.iterator();
    final boolean result = !it.hasNext();
    if (!result && it.shouldBeDisposed()) {
        it.dispose();
    }
    return result;
}

Searching by Property Value

To log in a user with the provided credentials (loginName and password), you must first find all of the users with the specified loginName:

final EntityIterable candidates = txn.find("User", "login", loginName);

Then, you have to iterate over the candidates and check if the password matches:

Entity loggedInUser = null;
for (Entity candidate: candidates) {
    final String salt = (String) candidate.getProperty("salt");
    if (MessageDigestUtil.sha256(salt + password).equals(candidate.getProperty("password"))) {
        loggedInUser = candidate;
        break;
    }
}

return loggedInUser; 

If you want to log in users with email also, calculate candidates as follows:

final EntityIterable candidates = txn.find("User", "login", loginName).union(txn.find("User", "email", email));

To find user profiles of users with a specified age:

final EntityIterable little15Profiles = txn.find("UserProfile", "age", 15);

Please note that search by string property values is case-insensitive.

Searching in Range of Property Values

To search for user profiles of users whose age is in the range of [17-23], inclusively:

final EntityIterable studentProfiles = txn.find("UserProfile", "age", 17, 23);

Another case of range search is to search for entities with a string property that starts with a specific value:

final EntityIterable userWithFullNameStartingWith_a = txn.findStartingWith("User", "fullName", "a");

Please note that search by string property values is case-insensitive.
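To see why a prefix search is a special case of a range search, consider a purely conceptual model in which the property values are kept in a sorted set. This sketch does not show how Xodus implements these queries: the class RangeSearchSketch is hypothetical, and lower-casing the values is only an assumed way to model case-insensitivity.

```java
import java.util.Locale;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Conceptual model of range and prefix search over sorted values; not Xodus code. */
final class RangeSearchSketch {

    // Values are stored lower-cased to model case-insensitive search.
    private final NavigableSet<String> values = new TreeSet<>();

    void add(String value) {
        values.add(value.toLowerCase(Locale.ROOT));
    }

    /** Inclusive range [min, max]. */
    NavigableSet<String> between(String min, String max) {
        return values.subSet(min.toLowerCase(Locale.ROOT), true,
                max.toLowerCase(Locale.ROOT), true);
    }

    /** A prefix search is the range [prefix, prefix + highest char). */
    NavigableSet<String> startingWith(String prefix) {
        final String from = prefix.toLowerCase(Locale.ROOT);
        // Values that continue with Character.MAX_VALUE itself are excluded,
        // which is an accepted simplification of this model.
        return values.subSet(from, true, from + Character.MAX_VALUE, false);
    }

    public static void main(String[] args) {
        final RangeSearchSketch names = new RangeSearchSketch();
        names.add("Alice");
        names.add("albert");
        names.add("Bob");
        System.out.println(names.startingWith("A"));
    }
}
```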

Traversing Links

One method for traversing links is already mentioned above: Entity.getLinks(). It is considered a query because it returns an EntityIterable. It lets you iterate over outgoing links of an entity with a specified name.

It is also possible to find incoming links. For example, let's search for a user who uses a particular auth module:

final EntityIterable ldapUsers = txn.findLinks("User", ldapAuthModule, "authModule");
final EntityIterator ldapUsersIt = ldapUsers.iterator();
return ldapUsersIt.hasNext() ? ldapUsersIt.next() : null; 

SelectDistinct and SelectManyDistinct

To search for users whose age is in the range of [17-23], inclusively:

final EntityIterable studentProfiles = txn.find("UserProfile", "age", 17, 23);
final EntityIterable students = studentProfiles.selectDistinct("user");

To get all auth modules of users whose age is in the range of [17-23], inclusively:

final EntityIterable studentProfiles = txn.find("UserProfile", "age", 17, 23);
final EntityIterable students = studentProfiles.selectDistinct("user");
final EntityIterable studentAuthModules = students.selectManyDistinct("authModule");

Use the selectDistinct operation if the corresponding link is single, meaning that it is set using the setLink() method. If the link is multiple, meaning that it is set using the addLink() method, use selectManyDistinct. Results of both the selectDistinct and selectManyDistinct operations never contain duplicate entities. In addition, the result of selectManyDistinct can contain null, for example, if there is a user with no auth module.

Binary Operations

There are four binary operations that are defined for EntityIterable: intersect(), union(), minus() and concat(). For all of them, the instance is a left operand, and the parameter is a right operand.

Let's search for users whose login and fullName start with "xodus" (case-insensitively):

final EntityIterable xodusUsers = txn.findStartingWith("User", "login", "xodus").intersect(txn.findStartingWith("User", "fullName", "xodus"));

Users whose login or fullName start with "xodus":

final EntityIterable xodusUsers = txn.findStartingWith("User", "login", "xodus").union(txn.findStartingWith("User", "fullName", "xodus"));

Users whose login, but not fullName, starts with "xodus":

final EntityIterable xodusUsers = txn.findStartingWith("User", "login", "xodus").minus(txn.findStartingWith("User", "fullName", "xodus"));

There is no suitable sample for the concat() operation. It just concatenates the results of two entity iterables.

The result of a binary operation (EntityIterable) itself can be an operand of a binary operation. You can use these results to construct a query tree of an arbitrary height.
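The semantics of the four operations can be modeled with ordinary Java collections. This is a minimal sketch, assuming that intersect(), union() and minus() behave like set operations and that concat() keeps duplicates. The class BinaryOperationsSketch is hypothetical, and the order of results in real EntityIterables is not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Collection-based model of the binary operations; not Xodus code. */
final class BinaryOperationsSketch {

    static <T> Set<T> intersect(Collection<T> left, Collection<T> right) {
        final Set<T> result = new LinkedHashSet<>(left);
        result.retainAll(right); // keep only elements present in both operands
        return result;
    }

    static <T> Set<T> union(Collection<T> left, Collection<T> right) {
        final Set<T> result = new LinkedHashSet<>(left);
        result.addAll(right); // duplicates collapse into one element
        return result;
    }

    static <T> Set<T> minus(Collection<T> left, Collection<T> right) {
        final Set<T> result = new LinkedHashSet<>(left);
        result.removeAll(right); // drop everything found in the right operand
        return result;
    }

    static <T> List<T> concat(Collection<T> left, Collection<T> right) {
        final List<T> result = new ArrayList<>(left);
        result.addAll(right); // plain concatenation, duplicates are kept
        return result;
    }

    public static void main(String[] args) {
        final List<Integer> left = Arrays.asList(1, 2, 3);
        final List<Integer> right = Arrays.asList(3, 4);
        // A small query tree: (left union right) minus (left intersect right)
        System.out.println(minus(union(left, right), intersect(left, right)));
    }
}
```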

Searching for Entities Having Property, Link, Blob

The StoreTransaction.findWithProp method returns entities of a specified type that have a property with the specified name. There are also methods StoreTransaction.findWithBlob and StoreTransaction.findWithLinks.

For example, if we do not require a user to enter a full name, the fullName property can be null. You can get users with or without full name by using findWithProp:

final EntityIterable usersWithFullName = txn.findWithProp("User", "fullName");
final EntityIterable usersWithoutFullName = txn.getAll("User").minus(txn.findWithProp("User", "fullName"));

To get user profiles with avatars using findWithBlob:

final EntityIterable userProfilesWithAvatar = txn.findWithBlob("UserProfile", "avatar");

The findWithBlob method is also applicable to blob strings:

final EntityIterable userProfilesWithBio = txn.findWithBlob("UserProfile", "bio");

To get users with auth modules:

final EntityIterable usersWithAuthModules = txn.findWithLinks("User", "authModule");

Sorting

To sort all users by login property:

final EntityIterable sortedUsersAscending = txn.sort("User", "login", true);
final EntityIterable sortedUsersDescending = txn.sort("User", "login", false);

To sort all users that have LDAP authentication by login property:

// at first, find all LDAP auth modules
final EntityIterable ldapModules = txn.find("AuthModule", "type", "ldap"); // case-insensitive!
// then select users
final EntityIterable ldapUsers = ldapModules.selectDistinct("user");
// finally, sort them
final EntityIterable sortedLdapUsers = txn.sort("User", "login", ldapUsers, true);

Sorting can be stable. For example, to get users sorted by login in ascending order and by fullName in descending (users with the same login name are sorted by full name in descending order):

final EntityIterable sortedUsers = txn.sort("User", "login", txn.sort("User", "fullName", false), true);

You can implement custom sorting algorithms with EntityIterable.reverse(). Wrap the sort results from EntityIterable with EntityIterable.asSortResult(). This lets the sorting engine recognize the sort result and use a stable sorting algorithm. If the source is not a sort result, the engine uses a non-stable sorting algorithm which is generally faster.
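The nested sort shown earlier relies on a general property of stable sorting: sort by the secondary key first, then sort stably by the primary key. The following plain-Java sketch illustrates that property with List.sort(), which is stable. The classes StableSortSketch and User are hypothetical and unrelated to the Xodus sorting engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Two-key ordering built from two stable sorts; not Xodus code. */
final class StableSortSketch {

    static final class User {
        final String login;
        final String fullName;

        User(String login, String fullName) {
            this.login = login;
            this.fullName = fullName;
        }
    }

    /** Orders by login ascending, and by fullName descending within equal logins. */
    static List<User> byLoginThenFullNameDescending(List<User> users) {
        final List<User> result = new ArrayList<>(users);
        // Inner sort: secondary key, descending.
        result.sort(Comparator.comparing((User u) -> u.fullName).reversed());
        // Outer sort: primary key, ascending. Stability keeps the inner order for ties.
        result.sort(Comparator.comparing((User u) -> u.login));
        return result;
    }

    public static void main(String[] args) {
        final List<User> sorted = byLoginThenFullNameDescending(Arrays.asList(
                new User("bob", "Bob A"),
                new User("alice", "Alice B"),
                new User("alice", "Alice Z")));
        for (User user : sorted) {
            System.out.println(user.login + " / " + user.fullName);
        }
    }
}
```

If the outer sort were not stable, users with the same login could come out in any fullName order. That is the trade-off behind wrapping a source with asSortResult().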

Sequences

Sequences let you get unique successive non-negative long IDs. Sequences are named. You can request a sequence by name with the StoreTransaction.getSequence() method. Sequences are persistent, which means that any flushed or committed transaction saves all dirty (modified) sequences which were requested by transactions created against the current EntityStore.