
Change Tracking and Lazy Loading

Martin Ledvinka edited this page May 27, 2024 · 4 revisions

There are two important features whose implementation is based on similar concepts:

  1. Change tracking
  2. Lazy loading

Change Tracking

Change tracking determines which entity attributes have changed during a transaction, and how, and writes these changes into the underlying repository. It applies in two situations. In the first, an application loads an entity into the persistence context and modifies it; these changes are automatically written into the repository. In the second, the application merges a detached entity into the persistence context.

From the change tracking point of view, merging a detached instance is simpler: it just means comparing the merged object with the existing data and registering the differences.

Tracking changes on a managed object is more difficult, because entities are just regular POJOs with no special hooks that would register calls to getters and setters.

Lazy Loading

Similarly to tracking changes on managed objects, lazy loading requires a mechanism that notifies the persistence provider that a lazily loaded attribute has been accessed and that its value should be loaded.

Since 2.0.0

JOPA 2.0.0 brought a major rewrite of the change tracking and lazy loading strategies.

Change Tracking

Since version 2.0.0, JOPA uses two change tracking strategies. The first, compatible with change tracking in versions prior to 2.0.0, uses generated proxy classes that subclass entity classes and override setters with versions that notify the corresponding persistence context about the setter invocation, so that changes are tracked and immediately propagated into the persistence context. This strategy is the default when using the Jena and OWLAPI OntoDrivers, which implement transactions by creating transactional snapshots of the underlying repository. This way, transactional changes can be written into the snapshot and used in reasoning during the transaction itself. However, this strategy may be problematic if the application relies on the exact type of entity classes, because the generated subclasses do not match it (for example, using entity classes as keys in a Map and then looking up the corresponding value by entity.getClass()).
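
To make the setter-overriding idea concrete, here is a minimal hand-written sketch of what such a generated subclass could look like. It is not JOPA's actual generated code: Person, RecordingContext, PersonTrackingProxy and attributeChanged are hypothetical names chosen for illustration. The demo also shows the exact-type pitfall described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain entity class, as the application would write it.
class Person {
    private String firstName;

    public String getFirstName() { return firstName; }

    public void setFirstName(String firstName) { this.firstName = firstName; }
}

// Hypothetical stand-in for a persistence context; it only records notifications.
class RecordingContext {
    final List<String> changes = new ArrayList<>();

    void attributeChanged(Object entity, String attribute) {
        changes.add(attribute);
    }
}

// Hand-written equivalent of a generated tracking subclass (illustrative only).
class PersonTrackingProxy extends Person {
    private final RecordingContext context;

    PersonTrackingProxy(RecordingContext context) { this.context = context; }

    @Override
    public void setFirstName(String firstName) {
        super.setFirstName(firstName);
        // Notify the context right away so the change can be propagated immediately.
        context.attributeChanged(this, "firstName");
    }
}

class SetterProxyDemo {
    public static void main(String[] args) {
        RecordingContext context = new RecordingContext();
        Person managed = new PersonTrackingProxy(context);
        managed.setFirstName("Alice");
        System.out.println(context.changes); // [firstName]

        // Exact-type pitfall: a lookup keyed by Person.class misses the subclass.
        Map<Class<?>, String> labels = new HashMap<>();
        labels.put(Person.class, "person");
        System.out.println(labels.get(managed.getClass()));                 // null
        System.out.println(labels.get(managed.getClass().getSuperclass())); // person
    }
}
```

In this sketch the application only ever sees the Person type, but the runtime class is the subclass, which is why code keyed on getClass() has to account for it.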

The second strategy does not use generated proxies. Instead, it lets the application make any changes it wants during a transaction. At commit time, it calculates the changes made to objects that were managed in the transactional persistence context and propagates them to the underlying repository. This strategy is used by default by the RDF4J OntoDriver, as it is the most compatible with the way the RDF4J API works. It also does not require generated subclasses for change tracking, which may be a benefit in some scenarios.
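
The commit-time approach can be illustrated with a small snapshot-and-compare sketch. This is an assumption about the general technique, not JOPA's implementation: Employee and SnapshotChangeCalculator are hypothetical, and the snapshot is shallow, so a collection mutated in place would not be detected here.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Plain entity; no proxy and no notification on modification.
class Employee {
    String firstName;
    String lastName;

    Employee(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

// Hypothetical helper that compares an entity with an earlier snapshot of its fields.
class SnapshotChangeCalculator {

    // Shallow copy of the instance field values, taken when the entity becomes managed.
    static Map<String, Object> snapshot(Object entity) {
        Map<String, Object> values = new LinkedHashMap<>();
        for (Field field : entity.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            try {
                field.setAccessible(true);
                values.put(field.getName(), field.get(entity));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return values;
    }

    // Called at commit time: returns the fields whose value differs from the snapshot.
    static Map<String, Object> changes(Map<String, Object> original, Object entity) {
        Map<String, Object> changed = new LinkedHashMap<>();
        for (Map.Entry<String, Object> current : snapshot(entity).entrySet()) {
            if (!Objects.equals(original.get(current.getKey()), current.getValue())) {
                changed.put(current.getKey(), current.getValue());
            }
        }
        return changed;
    }
}

class CommitTimeDemo {
    public static void main(String[] args) {
        Employee employee = new Employee("Alice", "Smith");
        Map<String, Object> original = SnapshotChangeCalculator.snapshot(employee);

        employee.lastName = "Jones"; // the application changes the object freely

        System.out.println(SnapshotChangeCalculator.changes(original, employee)); // {lastName=Jones}
    }
}
```

The trade-off in this sketch is that nothing is known about a change until the comparison runs, whereas the setter-overriding strategy learns about each change as it happens.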

Lazy Loading

Lazy loading is also implemented using generated proxy objects (similar to the way Hibernate does it). These proxy objects extend the corresponding attribute type, and whenever an operation is invoked on them, they load the attribute value and are replaced by it. Consider the following example entity:

@OWLClass(iri = "skos:Concept")
public class Concept {

  @Id
  private URI id;

  @OWLAnnotationProperty(iri = "skos:prefLabel")
  private MultilingualString label;

  @OWLObjectProperty(iri = "skos:related", fetch = FetchType.LAZY)
  private Set<Concept> related;
}

Now, if we load an instance, the following assertions will hold, in this order:

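// The getter initially returns a lazy loading proxy, not the actual collection
assert entity.getRelated() instanceof LazyLoadingProxy;
assert !(entity.getRelated() instanceof HashSet);
// Calling a method on the proxy triggers loading and replaces the proxy
assert !entity.getRelated().isEmpty();
assert entity.getRelated() instanceof HashSet;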

We can see that entity.getRelated() first returns a proxy object. Once we call a method on it, it fetches the underlying data and is replaced by the loaded value. Singular attributes work the same way.

This also means that lazy loading can be triggered only while an entity is managed in a persistence context. Once an entity is detached, lazy loading proxies are replaced by empty values (an empty collection for plural attributes, null for singular ones).
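
The proxy lifecycle described above (load on first access, replace the proxy with the loaded value, substitute an empty value on detach) can be sketched as follows. This is a simplified illustration under stated assumptions, not JOPA's generated proxy: LazySetProxy, Topic, the loader and replacer callbacks, and detach() are all hypothetical names.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Illustrative lazy proxy for a Set-valued attribute.
class LazySetProxy<E> extends AbstractSet<E> {
    private final Supplier<Set<E>> loader;   // fetches the value, e.g. from the repository
    private final Consumer<Set<E>> replacer; // writes the loaded value into the owner's field
    private Set<E> loaded;

    LazySetProxy(Supplier<Set<E>> loader, Consumer<Set<E>> replacer) {
        this.loader = loader;
        this.replacer = replacer;
    }

    // Loads the value on first use and replaces the proxy in the owning entity.
    private Set<E> load() {
        if (loaded == null) {
            loaded = loader.get();
            replacer.accept(loaded);
        }
        return loaded;
    }

    @Override public Iterator<E> iterator() { return load().iterator(); }
    @Override public int size() { return load().size(); }
    @Override public boolean add(E element) { return load().add(element); }
}

// Hypothetical entity holding a lazily loaded plural attribute.
class Topic {
    private Set<String> related;

    Topic(Supplier<Set<String>> loader) {
        this.related = new LazySetProxy<String>(loader, value -> this.related = value);
    }

    Set<String> getRelated() { return related; }

    // On detach, a proxy that was never triggered is replaced by an empty collection.
    void detach() {
        if (related instanceof LazySetProxy) {
            related = new HashSet<>();
        }
    }
}

class LazyLoadingDemo {
    public static void main(String[] args) {
        Topic topic = new Topic(() -> new HashSet<>(Arrays.asList("a", "b")));
        System.out.println(topic.getRelated() instanceof LazySetProxy); // true
        System.out.println(topic.getRelated().isEmpty());               // false, triggers loading
        System.out.println(topic.getRelated() instanceof HashSet);      // true, proxy was replaced

        Topic detached = new Topic(() -> new HashSet<>(Arrays.asList("c")));
        detached.detach();
        System.out.println(detached.getRelated().isEmpty());            // true
    }
}
```

In this sketch, a reference to the proxy obtained before loading keeps working because the proxy delegates to the loaded set; only the entity's field is swapped.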

Since 2.0.0, all @Types, @OWLDataProperty and @OWLAnnotationProperty attributes are always loaded eagerly. This is simpler to implement and makes no difference at the repository access level.

Prior to 2.0.0

Before JOPA 2.0.0, AspectJ aspects were used to provide change tracking and lazy loading capabilities. JOPA's entity listening aspect join points had to be woven into entity classes either at compile time or at load time. The setter join point would call the setter aspect whenever an entity setter was called, so that the corresponding persistence context would be notified of the modification. Analogously, the getter join point would notify the persistence context so that it could check whether the attribute was lazily loaded and whether its value should be fetched.

Weaving was usually done at compile time, requiring the AspectJ Maven plugin to process the entity classes. Getters and setters of a compiled entity class may thus look as follows:

public String getFirstName() {
  JoinPoint var1 = Factory.makeJP(ajc$tjp_0, this, this);
  BeanListenerAspect.aspectOf().beforeGetter(var1);
  return this.firstName;
}

public void setFirstName(String firstName) {
  JoinPoint var2 = Factory.makeJP(ajc$tjp_1, this, this, firstName);
  this.firstName = firstName;
  BeanListenerAspect.aspectOf().afterSetter(var2);
}

However, the weaving step often caused problems: when entity classes were not woven correctly, the application would fail. This usually happened in IDEs during development; a full Maven build typically resolved the issue. The weaving step also caused problems when working with annotation processors and similar compilation-related tools such as Lombok.