Skip to content

Buzzardo/gs-accessing-data-cassandra

 
 

Repository files navigation

This guide walks you through the process of Spring Data Cassandra to build an application that stores data in and retrieves it from Apache Cassandra, a high-performance distributed database.

What you’ll build

You will store and retrieve data from Apache Cassandra by using Spring Data Cassandra.

Build with Gradle

Build with Maven

Setting up a database

Before you can build the application, you need to set up a Cassandra database. Apache Cassandra is an open-source NoSQL data store optimized for fast reads and fast writes in large datasets. In the next subsections, you can choose between using DataStax Astra DB Cassandra-as-a-Service or running it locally on a Docker container. This guide will describe using the free tier of DataStax Astra Cassandra-as-a-Service so you can create and store data in your Cassandra database in a matter of minutes.

Add the following properties in your application.properties (src/main/resources/application.properties) to configure Spring Data Cassandra:

spring.data.cassandra.schema-action=CREATE_IF_NOT_EXISTS
spring.data.cassandra.request.timeout=10s
spring.data.cassandra.connection.connect-timeout=10s
spring.data.cassandra.connection.init-query-timeout=10s

The spring.data.cassandra.schema-action property defines the schema action to take at startup and can be none, create, create-if-not-exists, recreate or recreate-drop-unused. We’ll be using create-if-not-exists to create the required schema. See the documentation for details.

Note
It is a good security practice to set this to none in production, to avoid the creation / recreation of the database at startup.

We will also be increasing the default timeouts which might be needed when first creating the schema or with slow remote network connections.

Astra DB Setup

To use a managed database, you can use the rebust free tier of DataStax Astra DB Cassandra-as-a-Service. It will scale to zero when unused. Follow the instructions in the following link to create a database and keystore named spring_cassandra.

The Spring Boot Astra starter will pull in and autoconfigure all the required dependencies, add it to your pom.xml:

<dependency>
	<groupId>com.datastax.astra</groupId>
	<artifactId>astra-spring-boot-starter</artifactId>
	<version>0.1.13</version>
</dependency>
Note
For Gradle, add implementation 'com.datastax.astra:astra-spring-boot-starter:0.1.13' to your build.gradle

The Astra auto-configuration needs the configuration to connect to our cloud database:

  • Define the credentials: client ID, client secret and application token.

  • Select your instance with the cloud region, database id and keyspace (spring_cassandra).

Add these extra properties in your application.properties (src/main/resources/application.properties) to configure Astra:

# Credentials to Astra DB
astra.client-id=<CLIENT_ID>
astra.client-secret=<CLIENT_SECRET>
astra.application-token=<APP_TOKEN>

# Select an Astra instance
astra.cloud-region=<DB_REGION>
astra.database-id=<DB_ID>
astra.keyspace=spring_cassandra

Docker Setup

If you prefer to run Cassandra locally in a containerized environment, execute the following docker run command:

docker run -p 9042:9042 --rm --name cassandra -d cassandra:3.11

After the container is created, access the Cassandra query language shell:

docker exec -it cassandra cqlsh

And create a keyspace for the application:

CREATE KEYSPACE spring_cassandra WITH replication = {'class' : 'SimpleStrategy', 'replication_factor' : 1};

Now that you have your database running, configure Spring Data Cassandra to access your database.

Add the following properties in your application.properties (src/main/resources/application.properties) to connect to your local database:

spring.data.cassandra.local-datacenter=datacenter1
spring.data.cassandra.keyspace-name=spring_cassandra

Alternatively, for a convenient bundle of Cassandra and related Kubernetes ecosystem projects, you can spin up a single node Cassandra cluster on K8ssandra in about 10 minutes.

Create the Cassandra Entity

In this example, you’ll define a Vet (Veterinarian) entity. The following listing shows the Vet class (in src/main/java/com/example/accessingdatacassandra/Vet.java):

link:complete/src/main/java/com/example/accessingdatacassandra/Vet.java[role=include]

The Vet class is annotated with @Table which maps it to a Cassandra Table. Each property will be mapped to a column.

The class uses a simple @PrimaryKey of type UUID. Choosing the right primary key is essential, this will determine our partition key and cannot be changed later.

Note
Why is it so important? The partition key not only defines data uniqueness but also controls data locality. When inserting data, the primary key is hashed and used to choose the node where to store the data, this way we know the data will always be found in that node.

Cassandra denormalizes data and does not need table joins like SQL/RDBMS does, which allows you to retrieve data much faster. For that reason, we have modelled our specialties as a Set<String>.

Create simple queries

Spring Data Cassandra is focused on storing data in Apache Cassandra. But it inherits functionality from the Spring Data Commons project, including the ability to derive queries. Essentially, you need not learn the query language of Cassandra. Instead, you can write a handful of methods and let the queries be written for you.

To see how this works, create a repository interface that queries Vet entities, as the following listing (in src/main/java/com/example/accessingdatacaddandra/VetRepository.java) shows:

link:complete/src/main/java/com/example/accessingdatacassandra/VetRepository.java[role=include]

VetRepository extends the CassandraRepository interface and specifies types for the generic type parameters for both the value and the key that the Repository works with, i.e. Vet and UUID, respectively. Out-of-the-box, this interface comes with many operations, including basic CRUD (CREATE, READ UPDATE, DELETE) and simple query (e.g. findById(..)) data access operations. CassandraRepository doesn’t extend from PagingAndSortingRepository, because classic paging patterns using limit/offset are not applicable to Cassandra.

You can define other queries as needed by simply declaring their method signature. However, you can only perform queries that include the primary key. The method findByFirstName is a valid Spring Data method but won’t be allowed in Cassandra as firstName is not part of the primary key.

Note
Some generated methods in the repository might require a full table scan. One example is the findAll method, which requires querying all nodes in the cluster. Such queries are not recommended with large datasets as they can impact performance.

Adding a CommandLineRunner

Define a bean of type CommandLineRunner and inject the VetRepository to set up some data and use its methods.

Spring Boot automatically handles those repositories as long as they are included in the same package (or a sub-package) of your @SpringBootApplication class. For more control over the registration process, you can use the @EnableCassandraRepositories annotation.

Note
By default, @EnableCassandraRepositories scans the current package for any interfaces that extend one of Spring Data’s repository interfaces. You can use its basePackageClasses=MyRepository.class to safely tell Spring Data Cassandra to scan a different root package by type if your project layout has multiple projects and it does not find your repositories.

Spring Data Cassandra uses the CassandraTemplate to execute the queries behind your find* methods. You can use the template yourself for more complex queries, but this guide does not cover that. (see the Spring Data Cassandra Reference Guide[https://docs.spring.io/spring-data/cassandra/docs/current/reference/html/#reference]).

The following listing shows the finished AccessingDataCassandraApplication class (at /src/main/java/com/example/accessingdatacassandra/AccessingDataCassandraApplication.java):

link:complete/src/main/java/com/example/accessingdatacassandra/AccessingDataCassandraApplication.java[role=include]

Summary

Congratulations! You’ve just developed a Spring application that uses Spring Data Cassandra to access distributed data.

About

Accessing Data with Cassandra :: Learn how to persist data in Cassandra.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Java 90.2%
  • Shell 9.8%