This repository has been archived by the owner on Feb 27, 2020. It is now read-only.
Michael Hunger edited this page Oct 7, 2015 · 6 revisions

Tutorial Spring Data Graph

Originally Written by Michael Hunger, updated for Spring Data Neo4j 4.

Allow me to introduce - Cineasts.net

Once upon a time I wanted to build a social movie database myself. First things first - I had a name: "Cineasts" - the people crazy about movies. So I went ahead and bought the domain, cineasts.net. So, the project was almost done.

I had some ideas as well. Of course there should be Actors who play Roles in Movies. I needed the Cineast, too, someone had to rate the movies after all. And while they were there, they could also make friends. Find someone to accompany them to the cinema or share movie preferences. Even better, the engine behind all that could recommend new friends and movies to them, derived from their interests and existing friends.

I looked for possible data sources. IMDB was my first stop, but they charge 15k for data usage. Fortunately I found themoviedb.org, which has liberal terms and conditions and a nice API for fetching the data.

There were many more ideas but I wanted to get something done over the course of one day. So this was the scope I was going to tackle.

Scope: Spring

Being a Spring Developer, I would, of course, choose components of the Spring Framework to do most of the work. I'd already come up with the ideas - that should be enough.

What database would fit the complex network of cineasts, movies, actors, roles, ratings and friends, and also support the recommendation algorithms that I had in mind? I had no idea. But wait, there was the new Spring Data project, started in 2010, which brings the convenience of the Spring programming model to NoSQL databases. That should fit my experience and help me get started. I looked at the list of projects supporting the different NoSQL databases. Only one mentioned the kind of social network I was thinking of - Spring Data Graph for Neo4j, a graph database. Neo4j's pitch of "value in relationships" and the accompanying docs looked like what I needed. I decided to give it a try.

Preparations - Required Setup

To set up the project I created a public GitHub account and began setting up the infrastructure for a Spring web project using Maven as the build system. I added the dependencies for the Spring Framework libraries and put the web.xml for the DispatcherServlet and the applicationContext.xml in the webapp directory.

<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-core</artifactId>
    <version>4.1.4.RELEASE</version>
    <exclusions>
        <exclusion>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-webmvc</artifactId>
    <version>4.1.4.RELEASE</version>
    <exclusions>
        <exclusion>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-aspects</artifactId>
    <version>4.1.4.RELEASE</version>
</dependency>
<dependency>
    <groupId>org.springframework.security</groupId>
    <artifactId>spring-security-web</artifactId>
    <version>4.0.0.RELEASE</version>
</dependency>
<dependency>
    <groupId>org.springframework.security</groupId>
    <artifactId>spring-security-config</artifactId>
    <version>4.0.0.RELEASE</version>
</dependency>
<dependency>
    <groupId>opensymphony</groupId>
    <artifactId>sitemesh</artifactId>
    <version>2.4.2</version>
</dependency>

With this setup I was ready for the first spike: creating a simple MovieController showing a static view. Check. Next was the setup for Spring Data Graph. I looked at the README on GitHub and then checked it against the manual: just one Maven dependency.

<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-neo4j</artifactId>
    <version>4.0.0.M1</version>
</dependency>

I spun up Jetty to see if there were any obvious issues with the config. Check.

Setting the Stage - Movies Domain

The domain model was the next thing I planned to work on. I wanted to flesh it out first before diving into library details. Following the ideas outlined before, I came up with this. I also peeked at the data model of my import data source, themoviedb (http://docs.themoviedb.apiary.io/), to confirm that it matched my expectations.

class Movie {
    String id;
    String title;
    Set<Role> roles;
    Set<Director> directors;
    Set<Rating> ratings;
    ...
}


class Person {
    String id;
    String name;
    ...
}

class Actor extends Person {
    Set<Role> roles;
    Role playedIn(Movie movie, String roleName);
}

class Role {
    Movie movie;
    Actor actor;
    String name;
}

class User {
    String login;
    String name;
    String password;
    Set<Rating> ratings;
    Set<User> friends;
    Rating rate(Movie movie, int stars, String comment);
    void addFriend(User friend);
}

class Rating {
    User user;
    Movie movie;
    int stars;
    String comment;
}

I wrote some basic tests to ensure that the basic plumbing worked. Check.

Graphs ahead - Learning Neo4j

Then came the unknown - how to put these domain objects into the graph. First I read up about graph databases, especially Neo4j. Their data model consists of nodes and relationships all of which can have properties. Relationships as first class citizens - I liked that. It also has a powerful declarative query language, Cypher, which can be used to efficiently read or write data by expressing patterns in an easy to understand ASCII art form. That all seemed pretty easy.
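To make the property graph model a bit more concrete, here is a minimal in-memory sketch in plain Java. It is not Neo4j's API: the class names GraphNode, GraphRelationship and PropertyGraphSketch are made up for illustration. It only shows that both nodes and relationships carry properties, and that a relationship is a typed, directed link, roughly what the Cypher pattern (a:Actor)-[r:ACTS_IN]->(m:Movie) describes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model, not Neo4j's API.
class GraphNode {
    final String label;
    final Map<String, Object> properties = new HashMap<>();
    final List<GraphRelationship> outgoing = new ArrayList<>();
    final List<GraphRelationship> incoming = new ArrayList<>();

    GraphNode(String label) {
        this.label = label;
    }

    // Creates a typed, directed relationship from this node to the target
    // and registers it on both ends.
    GraphRelationship relateTo(GraphNode target, String type) {
        GraphRelationship rel = new GraphRelationship(this, type, target);
        outgoing.add(rel);
        target.incoming.add(rel);
        return rel;
    }
}

// A relationship is a first-class object with its own properties.
class GraphRelationship {
    final GraphNode start;
    final String type;
    final GraphNode end;
    final Map<String, Object> properties = new HashMap<>();

    GraphRelationship(GraphNode start, String type, GraphNode end) {
        this.start = start;
        this.type = type;
        this.end = end;
    }
}

class PropertyGraphSketch {
    // Builds by hand what the pattern (a:Actor)-[r:ACTS_IN]->(m:Movie) would match.
    static GraphRelationship example() {
        GraphNode actor = new GraphNode("Actor");
        actor.properties.put("name", "Tom Hanks");

        GraphNode movie = new GraphNode("Movie");
        movie.properties.put("title", "Forrest Gump");

        GraphRelationship role = actor.relateTo(movie, "ACTS_IN");
        role.properties.put("name", "Forrest Gump");
        return role;
    }
}
```

In a real graph database the traversal from actor to movie follows stored relationship records rather than Java references, but the shape of the data is the same.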

I also learned that Neo4j was transactional and provided the known ACID guarantees for my data. This was unusual for a NoSQL database but easier for me to get my head around than non-transactional eventual persistence. That also meant that I had to manage transactions somehow. Keep that in mind.

Conjuring Magic - Spring Data Graph

Decorations - Annotated Domain

Using the graph database api in my domain would pollute my classes with lots of database details. I didn't want that. Spring Data Graph promised to do the heavy lifting for me. So I checked that next.

I looked at the documentation again, found a simple Hello-World example and tried to understand it. The entities were annotated with @NodeEntity. That was simple, so I added it too. Relationships got their own annotation named @RelationshipEntity. Property fields should be taken care of automatically. However, Spring Data Neo4j needs a Long field to store the node or relationship id, so I added this to my domain classes.

 @GraphId
 Long nodeId;

I'm also going to create repositories for my domain objects. This will give us the ability to save and find our entities.

public interface ActorRepository extends GraphRepository<Actor> {
}

public interface DirectorRepository extends GraphRepository<Director> {
}

public interface MovieRepository extends GraphRepository<Movie> {
}

Finally, for the Java-based config, I created a class Application.java that extends Neo4jConfiguration which comes with Spring Data Neo4j, making sure to override neo4jServer(), getSessionFactory() and getSession() and provide the context for my application.

@Configuration
@EnableNeo4jRepositories("org.neo4j.cineasts.repository")
@EnableTransactionManagement
@ComponentScan("org.neo4j.cineasts")
public class Application extends Neo4jConfiguration {

    public static final int NEO4J_PORT = 7474;

    @Override
    public SessionFactory getSessionFactory() {
        return new SessionFactory("org.neo4j.cineasts.domain");
    }

    @Bean
    public Neo4jServer neo4jServer() {
        return new RemoteServer("http://localhost:" + NEO4J_PORT);
    }

    @Override
    @Bean
    @Scope(value = "session", proxyMode = ScopedProxyMode.TARGET_CLASS)
    public Session getSession() throws Exception {
        return super.getSession();
    }
}

OK, let's put this into a test. How could I ensure that a field was persisted to the graph store? There seemed to be two possibilities. The first was to get a GraphRepository injected and use its loadByProperty() method. The other was a Finder approach, which I ignored for now. Let's keep things simple. We can persist the entity using the save() method on the repository.

So my test looked like this.

@Autowired
MovieRepository movieRepository;

@Test
public void persistedMovieShouldBeRetrievableFromGraphDb() {
    Movie forrest = new Movie("1", "Forrest Gump");
    forrest = movieRepository.save(forrest);

    Movie foundForrest = findMovieByProperty("title", forrest.getTitle()).iterator().next();
    assertEquals(forrest.getId(), foundForrest.getId());
    assertEquals(forrest.getTitle(), foundForrest.getTitle());
}

That worked, cool. But what about transactions? I hadn't declared the test to be transactional. After further reading I learned that save() creates an implicit transaction, much like an EntityManager would. OK for me. I also learned that for more complex operations on the entities I needed external transactions.

A convincing act - Relationships

Value in Relationships - Creating them

Next were relationships. Direct relationships didn't require any annotation. Unfortunately I had none of those. So I went for the Role relationship between Movie and Actor. It had to be annotated with @RelationshipEntity and the @StartNode and @EndNode had to be marked. This will create an outgoing relation from an actor to a movie with relation type "Role". I prefer the relationship name to be more descriptive, so I'll specify a type on the RelationshipEntity called "ACTS_IN". So my Role looked like this:

@RelationshipEntity(type="ACTS_IN")
class Role {
    @EndNode
    Movie movie;
    @StartNode
    Actor actor;
    String name;
}

When writing a test for that I tried to create the relationship entity with new, setting an actor and movie on it, but found that it wasn't persisted correctly. So I realized that the actor and movie must be navigable in both directions via the Role relationship entity. I added the method for connecting movies and actors to the actor - seemed more natural.

public Role playedIn(Movie movie, String roleName) {
    final Role role = new Role(this, movie, roleName);
    roles.add(role);
    movie.addRole(role);
    return role;
}
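The point about both sides being navigable can be illustrated without any persistence layer. The following is a framework-free sketch with hypothetical class names (SketchActor, SketchMovie, SketchRole) that mirror the domain classes above: playedIn() creates the role once and registers it on the actor and on the movie, so the same object is reachable from either end.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins for Movie, Actor and Role, without any annotations.
class SketchMovie {
    final String title;
    final Set<SketchRole> roles = new HashSet<>();

    SketchMovie(String title) {
        this.title = title;
    }

    void addRole(SketchRole role) {
        roles.add(role);
    }
}

class SketchActor {
    final String name;
    final Set<SketchRole> roles = new HashSet<>();

    SketchActor(String name) {
        this.name = name;
    }

    // Creates the role and wires it into both the actor and the movie.
    SketchRole playedIn(SketchMovie movie, String roleName) {
        SketchRole role = new SketchRole(this, movie, roleName);
        roles.add(role);
        movie.addRole(role);
        return role;
    }
}

class SketchRole {
    final SketchActor actor;
    final SketchMovie movie;
    final String name;

    SketchRole(SketchActor actor, SketchMovie movie, String name) {
        this.actor = actor;
        this.movie = movie;
        this.name = name;
    }
}
```

Keeping the wiring in one method means callers cannot forget one of the two sides.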

Who's there? - Accessing related entities

What was left was accessing those relationships. I already had the appropriate fields in both classes. Time to annotate them correctly. For the fields providing access to the entities on the other side of the relationship this was straightforward. After providing the target type again (thanks to Java's type erasure) and the relationship type (which I had learned about in the Neo4j lesson before), only the direction was left. It defaults to OUTGOING, so I only had to specify it for the movie. Since I specified a relationship type for Role called "ACTS_IN", I also needed to specify the same type on the fields in both Actor and Movie.

@NodeEntity
class Movie {
    @GraphId
    Long nodeId;
    String id;
   
    @Relationship(type = "ACTS_IN", direction = Relationship.INCOMING)
    Set<Role> roles = new HashSet<>();
}

@NodeEntity
public class Actor extends Person {

    @Relationship(type = "ACTS_IN", direction = Relationship.OUTGOING)
    Set<Role> roles = new HashSet<Role>();

    public Role playedIn(Movie movie, String roleName) {
        final Role role = new Role(this, movie, roleName);
        roles.add(role);
        movie.addRole(role);
        return role;
    }
}

Now a test to make sure we can create and access relationship entities:

@Test
public void shouldAllowActorToActInMovie() {
    Movie forrest = new Movie("1", "Forrest Gump");

    Actor tomHanks = new Actor("1", "Tom Hanks");
    tomHanks.playedIn(forrest, "Forrest Gump");
    tomHanks = actorRepository.save(tomHanks);

    Actor foundTomHanks = findActorByProperty("name", tomHanks.getName()).iterator().next();
    assertEquals(tomHanks.getName(), foundTomHanks.getName());
    assertEquals(tomHanks.getId(), foundTomHanks.getId());
    assertEquals("Forrest Gump", foundTomHanks.getRoles().iterator().next().getName());
}

Requisites - Populating the database

Time to put this on display. But I needed some test data first. So I wrote a small class, DatabasePopulator, for populating the database, which could be called from my controller. To make it safe to call several times I added lookups to check for existing entries. A simple /populate endpoint for the controller that called it would be enough for now, so I added a populate endpoint in MovieController, which invoked DatabasePopulator.populateDatabase.

@Transactional
public List<Movie> populateDatabase() {
    importService.importImageConfig();
    User me = userRepository.save(new User("micha", "Micha", "password", User.SecurityRole.ROLE_ADMIN, User.SecurityRole.ROLE_USER));
    User ollie = new User("ollie", "Olliver", "password", User.SecurityRole.ROLE_USER);
    me.addFriend(ollie);
    userRepository.save(me);
    List<Integer> ids = asList(19995 , 194, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 13, 20526, 11, 1893, 1892, 1894, 168, 193, 200, 157, 152, 201, 154, 12155, 58, 285, 118, 22, 392, 5255, 568, 9800, 497, 101, 120, 121, 122);
    List<Movie> result = new ArrayList<Movie>(ids.size());
    for (Integer id : ids) {
        result.add(importService.importMovie(String.valueOf(id)));
    }

    return result;
}
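The "look up before creating" idea that makes repeated population safe can be sketched independently of the repositories. The class below is a hypothetical in-memory stand-in (IdempotentPopulatorSketch and findOrCreateUser are illustrative names, not part of the project): it returns the existing entry when the lookup succeeds and creates one only otherwise.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a repository keyed by login.
class IdempotentPopulatorSketch {
    private final Map<String, String> namesByLogin = new LinkedHashMap<>();

    // Returns the stored name for the login, creating the entry only if the lookup finds nothing.
    String findOrCreateUser(String login, String name) {
        String existing = namesByLogin.get(login);
        if (existing != null) {
            return existing;
        }
        namesByLogin.put(login, name);
        return name;
    }

    int userCount() {
        return namesByLogin.size();
    }
}
```

Calling findOrCreateUser("micha", "Micha") twice leaves a single entry, which is the property the populate endpoint relies on when it is hit more than once.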

Showing off - Web views

After I had the means to put some data in the graph database, I also wanted to show it. So adding the controller method to show a single movie with its attributes and cast in a JSP was straightforward: it just uses the repository to look the movie up and adds it to the model, then forwards to the /movies/show view, and voilà.

What was his name? - Searching //todo

The next thing was to allow users to search for some movies. So I needed some fulltext-search capabilities. As the index provider implementation of Neo4j builds on Lucene, I was delighted to see that fulltext indexes are supported out of the box.

So I happily annotated the title field of my Movie class with @Indexed(fulltext = true) and was told by an exception that I had to specify a separate index name for that. So it became @Indexed(fulltext = true, indexName = "search"). The corresponding finder method is called findAllByQuery. So there was my second repository method for searching movies. To restrict the size of the returned set I just added a limit for now that cuts the result after that many entries.

public List<Movie> searchForMovie(String query, int count) {
    List<Movie> movies = new ArrayList<Movie>(count);
    for (Movie movie : movieFinder.findAllByQuery("title", query)) {
        // Stop once the requested number of results has been collected
        if (movies.size() >= count) break;
        movies.add(movie);
    }
    return movies;
}
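A loop that caps an Iterable needs some care at the boundaries: a limit of zero, and a result set smaller than the limit. As a small, generic sketch (ResultLimiter and firstN are hypothetical names, not part of the project), the capping logic can be isolated and tested on its own:

```java
import java.util.ArrayList;
import java.util.List;

class ResultLimiter {
    // Copies at most maxResults elements from the source, preserving iteration order.
    static <T> List<T> firstN(Iterable<T> source, int maxResults) {
        List<T> result = new ArrayList<>();
        if (maxResults <= 0) {
            return result;
        }
        for (T item : source) {
            result.add(item);
            if (result.size() >= maxResults) {
                break;
            }
        }
        return result;
    }
}
```

Because the loop breaks as soon as the cap is reached, it does not pull more elements from a lazily evaluated query result than it needs.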

Look what I've found - Listing Results

I then used this result in the controller to render a list of movies driven by a search box. The movie properties and the cast were accessed through the getters in the domain classes.

<c:forEach items="${movies}" var="movie">
    <li>
      <div class="search-result-details">
      <c:set var="image" value="${movie.imageUrl}"/>
      <c:if test="${empty image}"><c:set var="image" value="/images/movie-placeholder.png"/></c:if>
      <a class="thumbnail" href="<c:url value="/movies/${movie.id}" />"> <img src="<c:url value="${image}" />" /></a>
        <a href="/movies/${movie.id}">${movie.title}</a> <img alt="${movie.stars} stars" src="/images/rated_${movie.stars}.png"/>
        <p><c:out value="${movie.tagline}" escapeXml="true" /></p>
      </div>
    </li>
</c:forEach>

Movies 2.0 - Adding social

But this was just a plain old movie database (POMD). My idea of socializing this business was not realized.

See, Mom, a Cineast! - Users

So I took the User class that I had already coded up and made it a full-fledged Spring Data Graph member.

@NodeEntity
class User {
    @GraphId
    Long nodeId;
    String login;
    String name;
    String password;

    @Relationship(type = "RATED")
    private Set<Rating> ratings = new HashSet<>();

    
    @Relationship(type = "FRIEND", direction = Relationship.UNDIRECTED)
    Set<User> friends = new HashSet<>();
    
    public Rating rate(Movie movie, int stars, String comment) {
        if (ratings == null) {
            ratings = new HashSet<>();
        }

        Rating rating = new Rating(this, movie, stars, comment);
        ratings.add(rating);
        movie.addRating(rating);
        return rating;
    }

    public void addFriend(User friend) {
        this.friends.add(friend);
    }
}

@RelationshipEntity(type="RATED")
public class Rating {

    @GraphId
    private Long id;
    @StartNode
    private User user;
    @EndNode
    private Movie movie;
    private int stars;
    private String comment;

    ...
}

Beware, Critics - Rating

I also put a ratings field into the movie to be able to show its ratings, along with a method to average the stars it got.

@NodeEntity
class Movie {
    @Relationship(type = "RATED", direction = Relationship.INCOMING)
    private Set<Rating> ratings = new HashSet<>();

    public int getStars() {
        Iterable<Rating> allRatings = ratings;

        if (allRatings == null) {
            return 0;
        }
        int stars = 0, count = 0;
        for (Rating rating : allRatings) {
            stars += rating.getStars();
            count++;
        }
        return count == 0 ? 0 : stars / count;
    }
}
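One detail of getStars() worth noting is that stars / count is integer division, so the fractional part is dropped: ratings of 4 and 5 average to 4. If rounding to the nearest star is preferred, a small variation does it. The sketch below uses a hypothetical helper class, StarAverage, operating on plain integers rather than Rating objects, and shows both behaviours side by side.

```java
import java.util.List;

class StarAverage {
    // Integer division, as in getStars(): the fractional part is dropped.
    static int truncated(List<Integer> stars) {
        if (stars.isEmpty()) {
            return 0;
        }
        int sum = 0;
        for (int s : stars) {
            sum += s;
        }
        return sum / stars.size();
    }

    // Alternative: compute the average as a double and round to the nearest whole star.
    static int rounded(List<Integer> stars) {
        if (stars.isEmpty()) {
            return 0;
        }
        int sum = 0;
        for (int s : stars) {
            sum += s;
        }
        return (int) Math.round((double) sum / stars.size());
    }
}
```

Which of the two is right depends on how the star images in the view should behave; the truncating version never shows more stars than the true average.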

I also added a few users and ratings to the database population code, and three methods to register users, look up users and add friends.

public interface CineastsUserDetailsService extends UserDetailsService {
    @Override
    CineastsUserDetails loadUserByUsername(String login) throws UsernameNotFoundException;

    User getUserFromSession();

    @Transactional
    User register(String login, String name, String password);

    @Transactional
    void addFriend(String login, final User userFromSession);
}

Protecting Assets - Adding Security

To use the user in the webapp I had to put it in the session and add login and registration pages. Of course the pages that only worked with a valid user account had to be secured as well.

I used Spring Security to do that, writing a simple authentication provider that used my repository for looking up the users and validating their credentials.

<security:http> <!-- use-expressions="true" -->
    <security:intercept-url pattern="/admin/*" access="hasRole('ROLE_ADMIN')"/>
    <security:intercept-url pattern="/import/*" access="hasRole('ROLE_ADMIN')"/>
    <security:intercept-url pattern="/user/**" access="hasRole('ROLE_USER')"/>
    <security:intercept-url pattern="/auth/login" access="isAnonymous()"/>
    <security:intercept-url pattern="/auth/register" access="isAnonymous()"/>
    <security:intercept-url pattern="/resources/**" access="permitAll"/>
    <security:intercept-url pattern="/images/**" access="permitAll"/>
    <security:intercept-url pattern="/**" access="isAnonymous() || hasRole('ROLE_USER')"/>
    <security:form-login login-page="/auth/login" authentication-failure-url="/auth/login?login_error=true"
                         default-target-url="/user" login-processing-url="/j_spring_security_check" username-parameter="username" password-parameter="password"/>
    <security:logout logout-success-url="/" invalidate-session="true" logout-url="/j_spring_security_logout"/>
    <security:access-denied-handler error-page="/auth/denied" />
    <security:csrf/>
</security:http>

<security:authentication-manager>
    <security:authentication-provider user-service-ref="userRepository">
        <security:password-encoder hash="md5">
            <security:salt-source system-wide="cewuiqwzie"/>
        </security:password-encoder>
    </security:authentication-provider>
</security:authentication-manager>

public class UserRepositoryImpl implements CineastsUserDetailsService {

    @Override
    @Transactional
    public User register(String login, String name, String password) {
        User found = findByLogin(login);
        if (found != null) {
            throw new RuntimeException("Login already taken: " + login);
        }
        if (name == null || name.isEmpty()) {
            throw new RuntimeException("No name provided.");
        }
        if (password == null || password.isEmpty()) {
            throw new RuntimeException("No password provided.");
        }
        User user=userRepository.save(new User(login,name,password, User.SecurityRole.ROLE_USER));
        setUserInSession(user);

        return user;
    }

    @Override
    public CineastsUserDetails loadUserByUsername(String login) throws UsernameNotFoundException {
        final User user = findByLogin(login);
        if (user == null) {
            throw new UsernameNotFoundException("Username not found: " + login);
        }
        return new CineastsUserDetails(user);
    }

    private User findByLogin(String login) {
        return IteratorUtil.firstOrNull(findByProperty("login", login).iterator());
    }

    ...
}
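The password-encoder element in the security configuration tells Spring Security to compare MD5 hashes salted with a system-wide value, so whatever stores a password at registration time has to hash it the same way. The following plain-Java sketch shows the general idea with java.security.MessageDigest. The class name SaltedMd5Sketch is hypothetical, and the "password{salt}" merge format is an assumption about how the legacy encoder combines the two values; it should be verified against the Spring Security version in use before relying on it.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class SaltedMd5Sketch {
    // Hashes the password together with the salt and returns a 32-character lowercase hex string.
    // Assumption: password and salt are merged as "password{salt}" before hashing.
    static String encode(String rawPassword, String salt) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            String merged = rawPassword + "{" + salt + "}";
            byte[] digest = md5.digest(merged.getBytes(StandardCharsets.UTF_8));
            // %032x pads with leading zeros so the result is always 32 hex digits
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

The same salt has to be used when storing and when checking a password, which is why the configuration names it in one central place.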

After that a logged-in user was available in the session and could thus be used for all the social interactions. Most of the work done next was adding controller methods and JSPs for the views.

Oh the Glamour - More UI

The dusty archives - Importing Data

Now it was time to pull the data from themoviedb.org. Registering there and getting an API key was simple, as was using the API on the command line with curl. Looking at the JSON returned for movies and people, I decided to pimp my domain model and add some more fields so that the representation in the UI was worth the effort.

For the import process I created a separate importer that used Jackson's ObjectMapper to fetch and parse the JSON data, and then some transactional methods to actually insert it as movies, roles and actors. I also created a version of the importer that read the JSON files from local disk, so that I didn't have to strain the remote API that much and that often.

@Service
public class MovieDbImportService {

...

    private Movie doImportMovie(String movieId) {

        Movie movie = movieRepository.findById(movieId);
        if (movie == null) { // Not found: Create fresh
            movie = new Movie(movieId, null);
        }

        Map data = loadMovieData(movieId);
        if (data.containsKey("not_found")) {
            throw new RuntimeException("Data for Movie " + movieId + " not found.");
        }
        movieDbJsonMapper.mapToMovie(data, movie, baseImageUrl);
        movieRepository.save(movie);
        relatePersonsToMovie(movie, (Map) data.get("credits"));
        return movie;
    }

...
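The mapping step itself is not shown above. As a rough illustration of what copying parsed JSON onto a domain object can look like, here is a self-contained sketch. ImportedMovie and MovieMappingSketch are hypothetical names, the JSON is assumed to have already been parsed into a Map, and the keys "title" and "tagline" are assumptions about the payload rather than a description of the real mapper.

```java
import java.util.Map;

// Hypothetical, simplified stand-in for the Movie domain class.
class ImportedMovie {
    final String id;
    String title;
    String tagline;

    ImportedMovie(String id) {
        this.id = id;
    }
}

class MovieMappingSketch {
    // Copies selected fields from already-parsed JSON data onto the movie.
    // Rejects payloads that carry the "not_found" marker checked by the importer.
    static ImportedMovie mapToMovie(Map<String, Object> data, ImportedMovie movie) {
        if (data.containsKey("not_found")) {
            throw new IllegalArgumentException("Data for Movie " + movie.id + " not found.");
        }
        movie.title = (String) data.get("title");     // assumed key
        movie.tagline = (String) data.get("tagline"); // assumed key
        return movie;
    }
}
```

Mapping onto an existing object, rather than always creating a new one, is what lets the importer update a movie that is already in the graph instead of duplicating it.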

Movies! Friends! Bargains! - Recommendations

In the last part of this exercise I wanted to add some recommendation algorithms to my app. One was the recommendation of movies that my friends liked very much (and their friends in descending importance). The second was recommendations for new friends that also liked the movies that I liked most.

Writing this kind of ranking algorithm is the real fun with graph databases, and it is so simple with Cypher.

Let's say I'm only interested in the top 10 recommendations each. I wrote up a Cypher query and annotated a repository method with @Query. The results of this query are mapped to a POJO called MovieRecommendation which is annotated with @QueryResult.

@Query("match (user:User {login: {0}})-[r:RATED]->(movie)<-[r2:RATED]-(other)-[r3:RATED]->(otherMovie) " +
       " where r.stars >= 3 and r2.stars >= r.stars and r3.stars >= r.stars " +
       " and not((user)-[:RATED]->(otherMovie)) " +
       " with otherMovie, toInt(round(avg(r3.stars))) as rating, count(*) as cnt" +
       " order by rating desc, cnt desc" +
       " return otherMovie.id as movieId, otherMovie.title as title, otherMovie.tagline as tagline, rating as rating limit 10")
List<MovieRecommendation> getRecommendations(String login);

@QueryResult
public class MovieRecommendation {

    String movieId;
    String title;
    String tagline;

    int rating;

    //Getters and setters

}
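To see what the query computes, it can help to spell out the same traversal over an in-memory list of ratings. The sketch below is only an approximation of the Cypher semantics, written in plain Java with hypothetical names (RatingFact, Recommendation, RecommendationSketch). It walks from the user's ratings of 3 stars or more, to other users who rated the same movie at least as highly, to the other movies those users rated at least as highly, skips movies the user has already rated, and then averages, sorts and limits. Rounding of exact halves may differ from Cypher's round().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// One RATED relationship: user -[stars]-> movie.
class RatingFact {
    final String user;
    final String movie;
    final int stars;

    RatingFact(String user, String movie, int stars) {
        this.user = user;
        this.movie = movie;
        this.stars = stars;
    }
}

class Recommendation {
    final String movie;
    final int rating; // rounded average of the matching r3 ratings
    final int count;  // number of matched paths

    Recommendation(String movie, int rating, int count) {
        this.movie = movie;
        this.rating = rating;
        this.count = count;
    }
}

class RecommendationSketch {
    static List<Recommendation> recommend(String login, List<RatingFact> ratings, int limit) {
        // Movies the user has already rated are never recommended.
        Set<String> seen = new HashSet<>();
        for (RatingFact r : ratings) {
            if (r.user.equals(login)) seen.add(r.movie);
        }

        // movie -> {sum of r3 stars, number of matched paths}
        Map<String, int[]> stats = new HashMap<>();
        for (RatingFact r : ratings) {                      // (user)-[r]->(movie)
            if (!r.user.equals(login) || r.stars < 3) continue;
            for (RatingFact r2 : ratings) {                 // (movie)<-[r2]-(other)
                if (r2.user.equals(login) || !r2.movie.equals(r.movie) || r2.stars < r.stars) continue;
                for (RatingFact r3 : ratings) {             // (other)-[r3]->(otherMovie)
                    if (!r3.user.equals(r2.user) || seen.contains(r3.movie) || r3.stars < r.stars) continue;
                    int[] s = stats.computeIfAbsent(r3.movie, k -> new int[2]);
                    s[0] += r3.stars;
                    s[1]++;
                }
            }
        }

        List<Recommendation> result = new ArrayList<>();
        for (Map.Entry<String, int[]> e : stats.entrySet()) {
            int[] s = e.getValue();
            result.add(new Recommendation(e.getKey(), (int) Math.round((double) s[0] / s[1]), s[1]));
        }
        // order by rating desc, cnt desc
        result.sort((a, b) -> a.rating != b.rating
                ? Integer.compare(b.rating, a.rating)
                : Integer.compare(b.count, a.count));
        return new ArrayList<>(result.subList(0, Math.min(Math.max(limit, 0), result.size())));
    }
}
```

The three nested loops scan every rating at each step, which is exactly the work the graph database avoids by following relationships from the starting node instead.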