Skip to content
Nextgen Spring Data module for Neo4j supporting (not only) reactive data access and immutable support
Branch: master
Clone or download

README.adoc

Spring Data Neo4j⚡️RX

This is a work in progress project determining a possible future form of Spring Data Neo4j. Expect the README and even more the project to change quite a lot in the future weeks.

Architectural guidelines and principles

The next version of Spring Data Neo4j should be designed with the following principles in mind:

  • Rely completely on the Neo4j Java Driver, without introducing another "driver" or "transport" layer between the mapping framework and the driver.

  • Immutable entities and thus full support for Kotlin’s data classes right from the start.

  • Work result item / record and not result set oriented, thus not reading the complete result set before the mapping starts, but make a "row" the foundation for any mapping. This encourages generation of optimized queries, which should greatly reduce the object graph impedance mismatch we see in some projects using Neo4j-OGM.

  • Follow up on the reactive story for database access in general. Being immutable and row oriented are two main requirements for making things possible.

Modules

So far we have identified the following modules:

  • Schema: Should read a set of given classes and create a schema (Metagraph) from it

  • Mapping: Should take care of hydrating from results to domain objects and dehydrating vice versa. It can depend on schema, but only as a provider for property matching

  • Lifecycle: Lifecycle must not depend directly on mapping, but should only care whether an Object and its Relations are managed or not

  • Querying: Generates cypher queries, depends on schema

Those will be reassembled as packages inside Spring Data Neo4j RX. There are no short-term planes to create additional artifacts from those.

Schema

We used the new Spring Data JDBC project as blueprint for some ideas now. Spring Data JDBC doesn’t build up the schema upfront. Each time, a persistent entity is requested from the mapping context, that entity is read and fully described, including properties and all associations. We can implement it in the same way. The mapping context would return instances of Neo4jPersistentEntity which implements a Spring Data interface. To fulfill the contract however, we would read the classes and store them in a schema that is free of Spring dependencies. That way we we can avoid a compile time dependency to Spring Data and have an independent schema module, in which the mapping context is the connecting adapter.

Spring Data JDBC doesn’t restrict the supported or scanned classes from Spring Data sides. Our schema should also support non-annotated classes and be smart about naming things, but we will require at least the @Node or @Relationship annotation to the outside world.

The schema will life independent from Spring classes in org.neo4j.springframework.data.core.schema. Each property of a class that is not identified as a simple type by org.neo4j.springframework.data.core.schema.Neo4jSimpleTypes will be considered describing a relationship and thus required to be part of the schema as well.

Context

Note
Context in this sections refers especially to dirty tracking and dealing with state of entities.

We decided against a context for tracking changes, much like Spring Data JDBC did.

Other principles

  • Ensure that the underlying store (Neo4j) leaks as little as possible into the mapping. I.e. reuse @Id etc. and avoid adding custom variants of annotation whenever possible.

The embedded "problem"

Supporting the embedded use case will be solved on the drivers level.

Relationships

Simple relationships

We provide @Relationship for mapping relationships without properties. This annotation shall be used for 1:1 and 1:n mappings. It provides an attribute to specifiy the name of the relationship. When a relationship is mapped on both sides (that is a circular mapping in the domain model from start to endnode and back) the annotation provides the inverse attribute to cater the following use case:

@Node("User")
static class UserNode {

	@Relationship(type = "OWNS", inverse = "owner")
	List<BikeNode> bikes;
}

static class BikeNode {

	UserNode owner;

	UserNode renter;
}

Without the inverse that mapping would lead to three relationships:

  1. (user) - [:OWNS] → (bike)

  2. (user) ← [:OWNER] - (bike)

  3. (user) ← [:RENTER] - (bike)

Which is probably not the intended graph. Specifying inverse removes ambiguity here.

"Rich" relationships

There should be no means of using a relationship as aggregate root in SDN-RX (like it is today the case with @RelationshipEntity). Instead we suggest that properties of relationships are mapped to POJOs. This has the following requirements: On the node representing the start node, the relationship (either 1:1 or 1:n) has to be annotated with @Relationship specifying the type of the end node like this:

@Relationship("HAS_MEMBER", endNodeType = SoloArtistEntity.class)
private List<Member> member = new ArrayList<>();

The POJO, in this case Member is required to have the following structure :

public static class Member {

	private SoloArtistEntity artist;

	private Year joinedIn;

	private Year leftIn;
}

It must have exactly one attribute of endNodeType. All other attributes are mapped from the properties of the relationship.

The motivation behind this is that a relationship needs to be manifested in the domain model, but as the domain model usually isn’t a graph, it manifests itself as a thing, not a relationship as is. We prefer that people use the mapping framework domain centric, not database centric. In the relational world it is an anti pattern to map out n:m (intersection) tables. If they have attributes, a schema is usually refactored into a 1:n and a n:1 table and an entity structure. We don’t need another entity, though.

Labels

Note
Do we need support for dynamic labels? We propose a new @Node annotation that takes in an array of strings as labels for that object. We like to get rid of @Label annotation supporting dynamic labels for objects

Integration tests

Integration tests take more time by their very nature. To get fast feedback we have split up the tests in unit and integration tests. Unit tests will run when the test goal is issued and should have a name ending with Test or Tests. Integration tests will get executed withing the verify goal and their class name have to end with IntegrationTest to get picked up.

Configuration

Spring Data Neo4j RX takes a "ready to use" drivers instance and uses that. We won’t provide any additional configuration for aspects that are configurable through the driver. We will however provide support to configure the drivers instance in Spring Boot. The current SDN Spring Boot Starter only configures the Neo4j-OGM transport and not the "real" driver. Our plans for a future starter a have been described separately.

Closing the driver is not the the concern of Spring Data Neo4j RX. The lifecycle of that bean should be managed by the application. Therefore, the starter need to take care of register the drivers instance with the application.

Architecture

This is definitely not the last version of the architecture. It is only meant to be a basic for discussions.

Package structure

A rough outline of the current and maybe future package structure
@startuml
note "Implementation of Spring Data Commons SPI" as SDC_note
package "org.neo4j.springframework.data" {
package "core" {
    interface Neo4jClient
    interface ReactiveNeo4jClient
    package "schema" {
            package "internal" {
                note "Schema description" as schemaDescription
            }
            annotation Node
            annotation Property
        }
    package "mapping" {
            interface Neo4jPersistentEntity
            interface Neo4jPersistentProperty
        }
    package "transaction" {
        class Neo4jTransactionManager
    }
    package "convert" {
        note "conversion support" as conversionNote
    }
}

package "repository" {
SDC_note..config
    package "config" {
        class EnableNeo4jRepository
        class Neo4jRepositoryRegistrar
        class Neo4jRepositoryConfigExtension
    }
    package "query" {
        annotation Query
    }
    package "support" {
        class Neo4jRepositoryFactoryBean
        class SimpleNeo4jRepository
        class Neo4jQueryLookupStrategy
    }
    interface Neo4jRepository
    interface ReactiveNeo4jRepository
}

core-[hidden]--->repository
}

@enduml
Package Comment

core

Neo4jTemplate and related classes.

core.schema

Annotations for marking classes as nodes to be saved as well as internal schema description.

Infrastructure for dirty tracking etc.

core.mapping

Spring mapping information.

core.mapping.internal

Neo4j data mapping.

core.session

Connection to the Driver instance.

core.convert

not used yet place for conversion related classes.

repository

Repository interfaces like Neo4jRepository.

repository.config

Register all needed beans for Spring context.

repository.query

Place where @Query and other query method related annotations go in.

repository.support

Architecture validation

The structure of this project can be explored as a Graph. We use jQAssistant to verify our architecture during the build. Run the following two commands

./mvnw clean compile jqassistant:scan
./mvnw jqassistant:server

and point your browser to http://localhost:7474.

SimpleNeo4jRepository initialization

  1. @EnableNeo4jRepositories defines

    • the repositoryFactoryBeanClass that defaults to Neo4jRepositoryFactoryBean.class. (I)

    • Neo4jRepositoriesRegistrar as a configuration via the @Import annotation.

  2. Neo4jRepositoriesRegistrar connects @EnableNeo4jRepositories with Neo4jRepositoryConfigurationExtension.

  3. Neo4jRepositoryConfigurationExtension creates Neo4jRepositoryFactoryBean (the class defined (I)).

    • Adds manually created Neo4jTemplate (as an implementation of Neo4jOperations) bean by setting it (setNeo4jOperations) in the Neo4jRepositoryFactoryBean. (II)

    • Defines the default/fallback RepositoryFactoryBeanClassName as Neo4jRepositoryFactoryBean.class.getName() in getRepositoryFactoryBeanClassName.

  4. Neo4jRepositoryFactoryBean has a super constructor that gets called from the infrastructure code. As a consequence the neo4jOperations property has to get set in (II) after initialization.

    • Creates a new instance of Neo4jRepositoryFactory with the in (II) provided Neo4jOperations in doCreateRepositoryFactory.

  5. Neo4jRepositoryFactory will then create a SimpleNeo4jRepository.

    • It does this by calling getTargetRepositoryViaReflection in getTargetRepository and providing the neo4jOperations.

  6. SimpleNeo4jRepository (the repository behind every user defined repository) is initialized.

Query execution

Note
This section contains the already straight-forward implemented support for custom queries via @Query. The other execution paths are only drafts right now and marked with a *.

Neo4jRepositoryFactory overrides the getQueryLookupStrategy method to provide the Neo4jQueryLookupStrategy. From our previous experience and handling in other Spring Data stores this would branch off in two (technical three) directions:

  1. StringBasedNeo4jQuery for custom Cypher queries that are provided with the @Query annotation.

  2. StringBasedNeo4jQuery for named queries that are outsourced in property files.

  3. PartTreeNeo4jQuery for derived finder methods.

All three of them will get a custom Neo4jQueryMethod besides Neo4jClient and QueryMethodEvaluationContextProvider (not used yet) provided. This is a wrapper around the java.lang.reflect.Method passed into the resolveQuery method of the Neo4jQueryLookupStrategy to provide additional metadata.

StringBasedNeo4jQuery execution

At the moment the implementation just takes the value of the provided @Query annotation by calling getAnnotatedQuery on the Neo4jQueryMethod and executes it through the neo4jOperations (Neo4jTemplate) class.

Dirty tracking

We considered several approaches of dirty tracking in SDN-RX:

  1. No dirty tracking at all. Not an option when it comes to relationships.

  2. Dirty tracking through hashes. Not on the level of detail (fields) we want to have it.

  3. Using some kind of event / listener to track changes.

  4. Shallow copy of objects to get compared on save. A full copy of the objects will occupy twice the memory.

We have settled with option 1 (See ADR-004), analogue to Spring Data JDBC.

Spring Boot Starter

The Spring Data Neo4j RX Spring Boot Starter provides automatic configuration to

  • Create an instance of the neo4j-java-driver

  • Configure Spring Data Neo4j RX itself inside a Spring Boot application and enabling Spring Data repositories

Read me about it here.

Open questions

  • Dynamic label support

  • Reloading nodes from the database and the affect on already loaded and changed objects.

You can’t perform that action at this time.