
AllRowsReader All rows query 

miket-ap edited this page · 3 revisions

A common (and arguably bad) use case for Cassandra clients is reading all the data in a column family. Astyanax provides a recipe that performs this operation in parallel and paginates the reads so as not to put excessive heap pressure on the Cassandra nodes.

boolean result = new AllRowsReader.Builder<String, String>(keyspace, CF_STANDARD1)
        .withPageSize(100) // Read 100 rows at a time
        .withConcurrencyLevel(10) // Split entire token range into 10.  Default is by number of nodes.
        .withPartitioner(null) // this will use keyspace's partitioner
        .forEachRow(new Function<Row<String, String>, Boolean>() {
            @Override
            public Boolean apply(@Nullable Row<String, String> row) {
                // Process the row here ...
                // This will be called from multiple threads so make sure your code is thread safe
                return true;
            }
        })
        .build()
        .call();

Note: Astyanax uses the "Function" class from com.google.common.base (Guava). "@Nullable" comes from the javax.annotation package, which is provided by the com.google.code.findbugs:jsr305 artifact. Maven does not pull in jsr305 automatically because Google declares its scope as "provided", so if your IDE reports errors for @Nullable, add the dependency explicitly.
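
Because the callback is invoked from multiple threads, any state it touches has to be safe for concurrent access. The following is a minimal sketch of one way to do that, not Astyanax code: it uses java.util.function.Function and plain String keys as stand-ins for Guava's Function and Astyanax's Row, and the class and method names (ThreadSafeRowCallbackSketch, CountingCallback, countFromThreads) are hypothetical. It counts rows with an AtomicLong and simulates the concurrent invocation with a thread pool.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

public class ThreadSafeRowCallbackSketch {

    // Hypothetical stand-in for the per-row callback. A String key replaces
    // Astyanax's Row<K, C> so the sketch runs without any dependencies.
    static final class CountingCallback implements Function<String, Boolean> {
        // AtomicLong makes the increment safe when called from many threads.
        private final AtomicLong rowCount = new AtomicLong();

        @Override
        public Boolean apply(String rowKey) {
            rowCount.incrementAndGet();
            return true; // same return value as the callback in the example above
        }

        long count() {
            return rowCount.get();
        }
    }

    // Simulates concurrent invocation: each worker thread calls the shared
    // callback once per simulated row. Returns the total number of rows seen.
    static long countFromThreads(int threads, int rowsPerThread) throws InterruptedException {
        CountingCallback callback = new CountingCallback();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int threadId = t;
            pool.submit(() -> {
                for (int i = 0; i < rowsPerThread; i++) {
                    callback.apply("row-" + threadId + "-" + i);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return callback.count();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countFromThreads(10, 1000));
    }
}
```

A plain long field incremented with ++ could lose updates under the same load. For richer shared state, the same idea applies with a concurrent collection such as ConcurrentHashMap.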

Reading only the row keys

boolean result = new AllRowsReader.Builder<String, String>(keyspace, CF_STANDARD1)
        .withColumnRange(null, null, false, 0) // Column limit of 0: fetch row keys only, no columns
        .withPartitioner(null) // this will use keyspace's partitioner
        .forEachRow(new Function<Row<String, String>, Boolean>() {
            @Override
            public Boolean apply(@Nullable Row<String, String> row) {
                // Process the row here ...
                return true;
            }
        })
        .build()
        .call();