Choosing an implementation

If you already have a database server running and you know how to administer it properly, you probably just want to use the appropriate implementation for that. Let's say you're already running a MySQL server. Setting up a Redis server and using the gokv Redis implementation will probably give you higher performance, but to be honest, you should always go with the service you already know well, unless you're willing to learn a new service anyway.

Otherwise (if you don't already have a database server running), you might be overwhelmed by the number of implementations to choose from.

First of all, you need to know the key differences between the store categories. Then you can look into the differences between the concrete stores / implementations.

Contents

  1. Categories
  2. Implementations

Categories

  • Local in-memory
    • These are the fastest stores, because the data doesn't need to be transferred anywhere, not even to disk.
    • But if your application crashes, all data is lost, because it's not persisted anywhere.
    • Some of the implementations offer the possibility to limit the memory usage, leading to old data being evicted from the cache.
  • Embedded
    • The data is written to disk. Easy to back up, no servers to handle.
    • Some implementations cache the data and only write to disk periodically, which gives high performance, but carries the risk that data which hasn't been persisted yet is lost when the system crashes.
    • Well suited for standalone client-side applications.
    • Not suited for web services, because you usually want to scale your services horizontally. When the DB gets filled with data by your first service instance and then you start another service instance, this second instance won't see any of the data of the first instance. Sharing the DB usually doesn't work due to open file handles and doesn't make sense anyway, because your services are probably (and should be) located on different servers with different disks.
  • Distributed store
    • The database runs as a separate server.
    • The data is persisted (either immediately or periodically, depending on the implementation), so a server crash doesn't lead to data loss.
    • The servers can be run as a cluster, so the data remains available even when a database server crashes.
    • Most implementations are specifically engineered to be key-value stores with very high performance.
    • Perfectly suited for web services, because you can scale horizontally and each service instance can access the same data.
  • Distributed cache
    • Similar to the distributed stores: runs as a separate server, can be run as a cluster, and is engineered as a key-value store with high performance.
    • But without any persistence, so when a database server crashes, its data is lost.
      • (Some implementations offer optional persistence, but discourage it. If you need a distributed store with persistence, you should use one that's specifically engineered for that use-case, and not a cache.)
    • Can be useful if you only need to cache values temporarily and don't want a database to constantly grow.
  • Cloud
    • Databases running in the cloud can be great if you need a database server (in-memory and embedded are not an option), but you can't or don't want to deal with the server administration.
    • Most DB-as-a-Service providers offer high availability and redundancy across regions, so even if an entire datacenter is unreachable, for example due to a natural disaster or power lines being accidentally cut (both have happened in the past), your applications and web services still have access to all the data.
    • It might be more expensive than running your own database server, but possibly only if you don't take the administrative overhead of managing your own servers into account.
    • It might be slower than running your own database server due to network latency, depending on whether your applications / web services run in the same cloud or not.
  • SQL, NoSQL, NewSQL
    • Most of these implementations are probably only interesting for people who are already running their own database servers.
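
To make the "local in-memory" category more concrete, here is a minimal sketch of an in-memory store with a size limit. It is not taken from gokv: `BoundedStore`, `NewBoundedStore` and the `maxKeys` parameter are hypothetical names chosen for illustration, and the eviction strategy (drop the oldest inserted key first) is an assumption. Real implementations may limit memory differently, for example by byte size or with LRU eviction.

```go
package main

import (
	"fmt"
	"sync"
)

// BoundedStore is a hypothetical in-memory key-value store (not part of gokv).
// All data lives in a map, so nothing survives a crash or restart.
type BoundedStore struct {
	mu      sync.Mutex
	maxKeys int               // maximum number of keys; <= 0 means unlimited
	data    map[string]string // the stored values
	order   []string          // keys in insertion order, oldest first
}

// NewBoundedStore creates a store that holds at most maxKeys entries.
func NewBoundedStore(maxKeys int) *BoundedStore {
	return &BoundedStore{maxKeys: maxKeys, data: make(map[string]string)}
}

// Set stores a value. If the key is new and the store is full,
// the oldest inserted key is evicted first.
func (s *BoundedStore) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, exists := s.data[key]; !exists {
		if s.maxKeys > 0 && len(s.order) >= s.maxKeys {
			oldest := s.order[0]
			s.order = s.order[1:]
			delete(s.data, oldest)
		}
		s.order = append(s.order, key)
	}
	s.data[key] = value
}

// Get returns the value for key and whether it was found.
func (s *BoundedStore) Get(key string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.data[key]
	return v, ok
}

func main() {
	store := NewBoundedStore(2)
	store.Set("a", "1")
	store.Set("b", "2")
	store.Set("c", "3") // the store is full, so "a" is evicted

	_, found := store.Get("a")
	fmt.Println(found) // false

	v, _ := store.Get("c")
	fmt.Println(v) // 3
}
```

The sketch shows both properties described for this category: reads and writes never leave the process, and once the limit is reached, old data silently disappears. Restarting the program starts with an empty map again, which is the data loss the embedded and distributed categories are meant to avoid.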

Implementations