GoBigdis is a persistent database that implements the Redis server protocol. Any Redis client can interface with it and start using it right away. It's basically a Redis-for-secondary-memory.
The main feature of GoBigdis is that it's very friendly with huge keys and huge values, much friendlier than Redis itself, as the Redis author states (see the credits section).
It has no external dependencies of any kind: it gets by with Go's comprehensive standard library alone. Also, since it uses one goroutine per client connection, it scales to multiple cores "for free".
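For the curious, the general shape of that model in Go looks roughly like the sketch below. This is the generic one-goroutine-per-connection pattern, not GoBigdis's actual code, and `handleConn` is a hypothetical name:

```go
package main

import (
	"log"
	"net"
)

func main() {
	// Listen on the default GoBigdis address (localhost:6389).
	ln, err := net.Listen("tcp", "localhost:6389")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Println(err)
			continue
		}
		// One goroutine per client connection: the Go runtime schedules
		// these across OS threads, so multiple cores get used "for free".
		go handleConn(conn)
	}
}

// handleConn is a hypothetical per-client handler that would parse
// Redis protocol commands off the connection and write replies back.
func handleConn(conn net.Conn) {
	defer conn.Close()
	// ... parse the Redis protocol and dispatch commands ...
}
```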
GoBigdis is my weekend attempt at Bigdis (see the credits section for further info).
This is the subset of commands it currently implements:
Command | Status |
---|---|
PING | Fully implemented ✔️ |
GET | Fully implemented ✔️ |
SET | Only setting values works, no key expiration logic as of now 🔧 |
DEL | Fully implemented ✔️ |
COMMAND | Placeholder reply only 🔧 |
SELECT | Fully implemented ✔️ |
FLUSHDB | Does what's expected, but only without arguments 🔧 |
Nothing other than the basic KV type has been implemented as of now.
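Since the wire protocol is Redis's, any Redis client can exercise these commands. For example, a quick session with the standard `redis-cli` pointed at the default port should look roughly like this:

```
$ redis-cli -p 6389 PING
PONG
$ redis-cli -p 6389 SET mykey "some value"
OK
$ redis-cli -p 6389 GET mykey
"some value"
$ redis-cli -p 6389 DEL mykey
(integer) 1
```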
GoBigdis, with its `gobigdis` command, currently accepts the following command flags:

- `-h STRING` specifies the IP the server TCP socket should listen on (defaults to `localhost` if not set)
- `-p INTEGER` tells on what port to listen (defaults to `6389` if not set)
- `-d PATH` sets the root database directory to use, creating it if it doesn't already exist (defaults to `$HOME/.gobigdis` if not set)
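For example, to listen on all interfaces, use a non-default port and keep the data under a custom directory (the path below is just illustrative):

```
gobigdis -h 0.0.0.0 -p 6390 -d /var/lib/gobigdis
```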
You need `go` installed on your system. If you do, simply run:

```
go install github.com/RcrdBrt/gobigdis@latest
```

Make sure you have put the Go binaries folder in your `PATH` and you're good to go with the `gobigdis` command.
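If the `gobigdis` command is not found, the Go install directory is probably missing from your `PATH`. On a typical setup, adding something like the following to your shell profile fixes it (the exact path depends on your Go configuration):

```
export PATH="$PATH:$(go env GOPATH)/bin"
```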
Launch `gobigdis -h` for a basic command overview.
I used the benchmark suite `redis-benchmark`. It ships with the default `redis 6.2.4` package of my Linux distribution. The stock Redis server is used with the default configuration provided by said package.
I used the same host for both the server and the (benchmark) client.
My machine specs:
- Ryzen 7 2700x
- 32GB DDR4 3200MHz RAM (2x16GB)
- NVMe Samsung 970evo 512GB
100,000 requests; 50 connections; 2,000-byte values; TCP KeepAlive ON

```
redis-benchmark -t get,set -n 100000 -c 50 -d 2000 -r 2000 -k 1 -q -p PORT
```
Results:
database | GET (req/s) | GET delay (p50 msec) | SET (req/s) | SET delay (p50 msec) |
---|---|---|---|---|
GoBigdis | 93545.37 | 0.279 | 13681.76 | 3.199 |
stock Redis | 124533.01 | 0.207 | 115340.26 | 0.215 |
As expected, GoBigdis takes a performance hit on the `SET` command, since it has to write the file to the NVMe every time the command is issued and it adopts the Copy-On-Write pattern: there is an order of magnitude of difference from both the requests-per-second and the delay perspective. Still, it's not that bad, and it performs surprisingly well for `GET`s.
100,000 requests; 50 connections; 20,000-byte values; TCP KeepAlive ON

```
redis-benchmark -t get,set -n 100000 -c 50 -d 20000 -r 2000 -k 1 -q -p PORT
```
Results:
database | GET (req/s) | GET delay (p50 msec) | SET (req/s) | SET delay (p50 msec) |
---|---|---|---|---|
GoBigdis | 68399.45 | 0.327 | 5688.93 | 4.079 |
stock Redis | 71530.76 | 0.423 | 74962.52 | 0.551 |
With bigger values, stock Redis starts to suffer a little while GoBigdis keeps going at almost the same speed, narrowing the gap between the two. `SET` performance for GoBigdis halves compared to the previous benchmark configuration, but access delays stay almost the same.
100,000 requests; 50 connections; 2,000,000-byte values; TCP KeepAlive ON

```
redis-benchmark -t get,set -n 100000 -c 50 -d 2000000 -r 2000 -k 1 -q -p PORT
```
Results:
database | GET (req/s) | GET delay (p50 msec) | SET (req/s) | SET delay (p50 msec) |
---|---|---|---|---|
GoBigdis | 945.42 | 1.847 | 178.83 | 178.815 |
stock Redis | 1071.27 | 31.599 | 1418.58 | 18.703 |
At this point, almost unexpectedly, both databases balloon in terms of delays and performance degradation.
I thought they would show better latencies at least. Since a 2 MB value is basically a picture for each key stored, it apparently makes them suffer quite a bit. `GET` commands now have almost the same requests-per-second performance between the two, but GoBigdis has an order of magnitude better p50 delays compared to the stock Redis server.
GoBigdis uses the Filesystem-As-A-Database pattern. I respected the convention antirez implemented in his project and extended it by one depth level. It currently stores the keys as a SHA256 filename using the following format:

```
ROOT_DBDIR/DATABASE_NUMBER/FIRST_BYTE_OF_SHA/SECOND_BYTE_OF_SHA/THIRD_BYTE_OF_SHA/FULL_SHA
```

so the path of a key with a SHA256 of `f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2` stored in, e.g., the third database would become:

```
ROOT_DBDIR/2/f2/ca/1b/f2ca1bb6c7e907d06dafe4687e579fce76b37e4e93b7605022da52e6ccc26fd2
```
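As an illustration of that layout, here's a small Go sketch (a hypothetical helper, not the project's actual function) that derives the on-disk path for a key:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// keyPath builds ROOT_DBDIR/DB_NUMBER/xx/yy/zz/FULL_SHA for a given key,
// where xx, yy and zz are the hex encodings of the first three bytes of
// the key's SHA256 digest.
func keyPath(rootDir string, dbNum int, key string) string {
	sum := sha256.Sum256([]byte(key))
	sha := hex.EncodeToString(sum[:])
	return filepath.Join(rootDir, fmt.Sprint(dbNum), sha[0:2], sha[2:4], sha[4:6], sha)
}

func main() {
	// SHA256("hello") is 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824,
	// so this prints ROOT_DBDIR/2/2c/f2/4d/2cf24dba...
	fmt.Println(keyPath("ROOT_DBDIR", 2, "hello"))
}
```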
GoBigdis implements the Copy-On-Write pattern, so `SET` is expensive while `GET` is relatively cheap. It also has a coarse-grained RWLock for filesystem access. An expansion of this project should take into consideration a more fine-grained approach and probably use some more sophistication on top of or beside the Copy-On-Write. GoBigdis also has a cache layer that makes `GET` super-fast for some non-existent keys, by avoiding hitting the filesystem entirely under certain circumstances.
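To make the trade-off concrete, here is a hedged sketch of how a single coarse-grained RWMutex and a Copy-On-Write style `SET` could fit together. The `store` type and its methods are hypothetical illustrations, not GoBigdis's actual code:

```go
package db

import (
	"os"
	"path/filepath"
	"sync"
)

// store illustrates the idea: one coarse-grained RWMutex guards every
// filesystem operation.
type store struct {
	mu sync.RWMutex
}

// Set is the expensive side of the Copy-On-Write approach: the new value is
// written to a fresh temporary file and then renamed over the key's path,
// so readers never observe a half-written value.
func (s *store) Set(path string, value []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()

	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return err
	}
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, value, 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// Get only takes the read lock and reads the key's file, which is why it
// stays relatively cheap.
func (s *store) Get(path string) ([]byte, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return os.ReadFile(path)
}
```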
With any experimental database project should come a reasonable expectation of low overall stability. Although the persistence part simply uses filesystem primitives with no trickery of any sort and could be considered to work well enough, no battle-testing has been done other than the benchmarks above in this README, never mind putting it in production.
This project is heavily inspired by the Tcl experiment that antirez - the creator of Redis - did in this repo in July 2010. My project is an answer to the question in his README, "Do you think this idea is useful?". I think it really is, so I implemented it in Go.
Most of the parsing code for client requests and replies is taken from here to jumpstart the implementation.
The choice of Go opens up a lot of easy-concurrency optimizations (it's an oxymoron, I know). Check out the `DEL` command implementation code for an example of it.
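As a generic illustration of the kind of concurrency Go makes easy (this is just the pattern, not the project's actual `DEL` code), removing several key files in parallel takes only a few goroutines and a `sync.WaitGroup`:

```go
package db

import (
	"os"
	"sync"
)

// deleteAll removes all the given key files concurrently, one goroutine per
// path, and reports how many files were actually deleted.
func deleteAll(paths []string) int {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		deleted int
	)
	for _, p := range paths {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			if err := os.Remove(p); err == nil {
				mu.Lock()
				deleted++
				mu.Unlock()
			}
		}(p)
	}
	wg.Wait()
	return deleted
}
```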