Cassandra backup/restore solution - alternative to cqlsh COPY FROM/TO command

Cassandra CSV export/import. Better than cqlsh COPY FROM/TO. Requires Java 8.

Tested on Cassandra 2.1.15 and 2.2.4.


kani was written because COPY FROM/TO did not work for escaped characters (or any Unicode). It also produced invalid CSV if a value was stringified JSON.

Suppose you have the value "ABC\nDEF", where ABC and DEF are separated by a newline character. COPY TO stores it as the literal "ABC\nDEF" in the CSV. However, when COPY FROM reads it back, it ignores the \ character, so the restored value is ABCnDEF.

Another example is a stringified JSON value: if you have "{"list": [1,2,3]}" as a value, COPY TO stores it as "{\"list\": [1,2,3]}", which does not work with CSV readers.

kani solves this simply: instead of escaping characters or dealing with quotes, it stores every text value in hexadecimal format.
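To illustrate why hexadecimal storage sidesteps both problems, here is a minimal Python sketch. It is not kani's code: the UTF-8 encoding and the helper names encode_text and decode_text are assumptions made for the example.

```python
import csv
import io


def encode_text(value: str) -> str:
    """Hex-encode the UTF-8 bytes of a text value (UTF-8 is an assumption)."""
    return value.encode("utf-8").hex()


def decode_text(hex_value: str) -> str:
    """Reverse encode_text: turn hex digits back into the original string."""
    return bytes.fromhex(hex_value).decode("utf-8")


# The two problem values from above: an embedded newline and stringified JSON.
values = ["ABC\nDEF", '{"list": [1,2,3]}']

# Write one CSV row; hex digits never need quoting or escaping.
buffer = io.StringIO()
csv.writer(buffer).writerow([encode_text(v) for v in values])
csv_line = buffer.getvalue()

# Read the row back and decode each field.
row = next(csv.reader(io.StringIO(csv_line)))
restored = [decode_text(field) for field in row]
```

Because the encoded fields contain only the characters 0-9 and a-f, the CSV line has no quotes, backslashes, or embedded newlines to misinterpret, and restored equals values.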


Cassandra custom types and COUNTER columns are not supported.


Using jar build

Export / import both schema and tables for a specific keyspace

java -Xmx2g -Xms2g -jar kani-standalone.jar -c "config.edn" [-d "data"] (export | import)


  • -h --help
  • -c --config - EDN config file
  • -d --directory (optional) - directory where export/import will write/read files to/from

Note: Recommended heap size is about 2GB

Export / import schema only

java -Xmx2g -Xms2g -cp kani-standalone.jar eyeota.kani.application.schema -c "config.edn" -f "file.cql" (export | import)


  • -h --help
  • -c --config - EDN config file
  • -f --file - CQL file to export to / import from

Export / import specific table

java -Xmx2g -Xms2g -cp kani-standalone.jar eyeota.kani.application.table -c "config.edn" (export <table name> | import <table name> <csv file>)


  • -h --help
  • -c --config - EDN config file

Comparing CSV files

Useful for testing that import/export works as expected. For example, export the db to a directory data-original, import it back from data-original, then re-export it to data-re-exported. The CSV files in data-original and data-re-exported can then be compared (a plain diff might not work because the row ordering might have changed).
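The point about row ordering can be shown with a short Python sketch of an order-insensitive comparison. This is only an illustration of the idea, not the comparison kani implements; the function names and sample data are made up for the example, and every line is treated as a data row.

```python
import csv
import io
from collections import Counter


def row_multiset(csv_text: str) -> Counter:
    """Count each row (as a tuple of fields), ignoring row order."""
    return Counter(tuple(row) for row in csv.reader(io.StringIO(csv_text)))


def same_rows(csv_a: str, csv_b: str) -> bool:
    """True when both CSV texts hold the same rows the same number of times."""
    return row_multiset(csv_a) == row_multiset(csv_b)


original = "1,aa\n2,bb\n3,cc\n"
re_exported = "3,cc\n1,aa\n2,bb\n"  # same rows, different order
corrupted = "3,cc\n1,aa\n2,bX\n"    # one value changed

print(same_rows(original, re_exported))
print(same_rows(original, corrupted))
```

A line-by-line diff would report original and re_exported as different, while the multiset comparison treats them as equal and still catches the changed value in corrupted. Holding every row in memory this way is one reason a comparison can need more heap than an export.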

java -Xmx6g -Xms6g -cp kani-standalone.jar eyeota.kani.application.compare_csv directory-1 directory-2

OR use the provided shell script

./ directory-1 directory-2

Note: Comparing CSV files requires much more memory than export / import. It is a good idea to specify 2-3x more heap space than for export (just in case).

Using the shell script uses less memory but takes a little longer to complete.

EDN Config files

{:port              9042                      ; Cassandra native port to connect to (default: 9042)
 :hosts             [""]             ; Cassandra hosts (can specify multiple hosts)
 :keyspace          "db_keyspace"             ; Keyspace to work on
 :fetch-size        2000                      ; Cassandra fetch size (if not specified, default to 5000)
 :null-value        "<null>"                  ; What value should be stored in CSV for null values (default: "<null>")
 :table-fetch-size  {"huge_table" 20}         ; Table-specific fetch size (if the row is huge, might want to reduce the
                                              ; number to avoid timeouts)
 :consistency       :quorum                   ; Cassandra read/write consistency level (default: quorum). Accepted values are:
                                              ; [:all :any :each-quorum :local-one :local-quorum :local-serial :one :quorum
                                              ;  :serial :three :two]
 :excluded-tables   ["do_not_backup_table"]}  ; Tables whose contents are not imported / exported (their schema still is)
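The relationship between :fetch-size and :table-fetch-size can be sketched as a simple lookup. The Python below is a hypothetical illustration: it assumes a table-specific entry takes precedence over the global value, which the config comments suggest but do not state, and the function name effective_fetch_size is invented for the example.

```python
DEFAULT_FETCH_SIZE = 5000  # default stated in the config comments


def effective_fetch_size(config: dict, table: str) -> int:
    """Pick the fetch size for a table.

    Assumed precedence: table-specific value, then the global
    fetch size, then the documented default of 5000.
    """
    per_table = config.get("table-fetch-size", {})
    if table in per_table:
        return per_table[table]
    return config.get("fetch-size", DEFAULT_FETCH_SIZE)


# Python stand-in for the EDN map shown above.
config = {"fetch-size": 2000, "table-fetch-size": {"huge_table": 20}}

print(effective_fetch_size(config, "huge_table"))
print(effective_fetch_size(config, "other_table"))
```

With the sample values from the config, huge_table is read 20 rows at a time and every other table 2000 rows at a time.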

Using leiningen

By default, running through leiningen sets the heap size to 2 GB.

Export / import both schema and tables for a specific keyspace

lein run -c "config.edn" [-d "data"] (export | import)

Export / import schema only

lein run -m eyeota.kani.application.schema -c "config.edn" -f "file.cql" (export | import)

Export / import specific table

lein run -m eyeota.kani.application.table -c "config.edn" (export <table-name> | import <table-name> <csv-file>)

Comparing CSV files

lein with-profile +6gheap run -m directory-1 directory-2



lein with-profile +test check - runs both code quality checks and tests

lein quality - runs only code quality checks

lein with-profile +test test - runs tests


lein build - runs tests and copies libraries into target

lein jar - creates jar file

lein uberjar - creates self-contained jar


Copyright © 2016 Eyeota

Distributed under the GNU Lesser General Public License v3.0