v2.3.5-srh-extra
This is a back-ported release of a soft durability flush interval feature.
Funding for this was provided by ZeroTier, a company that makes network virtualization software that allows networks to be created and managed effortlessly over both LAN and WAN.
This release is called v2.3.5-srh-extra. Its cluster protocol is not compatible with the official 2.3.5 binaries. The file format is, however, downgrade-compatible. You can switch to this new version, try it out, and then switch back if you don't like it.
What's new:
- Everything in the official 2.3.6 RC1 release is here
- Soft durability writes now get their data flushed every 5 seconds, instead of being flushed immediately. If you write and rewrite the same documents over and over again, you might see your disk write bandwidth drop by a factor of 1000. Or more! (Or less.)
- A new field in the table configuration called 'user_value' has been added. It's initially set to the empty object, { }. You can do the following things with it:
  - Set it to { "srh/flush_interval": 23.3 }. This will change the flush interval to 23.3 seconds, instead of 5 seconds.
  - Set it to { "srh/flush_interval": 0.01 }, if you want your flushes to happen relatively quickly.
  - Set it to { "srh/flush_interval": "default" }. This sets the flush interval to the default value, 5 seconds.
  - Set it to { "srh/flush_interval": "never" }. This makes flushing "never" happen, under ideal conditions.
  - Set it to whatever you like. Add new fields to the object for your own operational needs.
Under cache memory pressure, a flush could still happen sooner than the flush interval demands (even if set to "never"). Setting "never", or a very long flush interval, is only a good idea for tables that comfortably fit into memory. Otherwise, those tables will tend to hog memory that could be put to better use.
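The "factor of 1000" figure for repeatedly rewritten documents can be reasoned about with a rough model. The sketch below assumes that, with immediate flushing, every rewrite costs one disk write, and that, with a flush interval, all rewrites of a document within one interval collapse into a single write. The function name coalescingFactor is made up for this illustration, and the model ignores cache pressure and other real-world effects.

```javascript
// Rough model of write coalescing (an assumption, not a measurement):
// rewrites that land in the same flush interval are written to disk once.
function coalescingFactor(rewritesPerSecond, flushIntervalSeconds) {
  const rewritesPerInterval = rewritesPerSecond * flushIntervalSeconds;
  // A document rewritten less than once per interval still costs one write
  // per rewrite, so the factor never drops below 1.
  return Math.max(1, rewritesPerInterval);
}

console.log(coalescingFactor(200, 5));  // 200 rewrites/s over a 5 s interval
console.log(coalescingFactor(0.1, 5));  // rarely rewritten document
```

Under this model, a document rewritten 200 times per second with the default 5 second interval gives a factor of 1000, while a document rewritten once every 10 seconds sees no reduction at all.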
Example queries (in JavaScript) for configuring the flush interval of a table:
r.table('foo').config().update({'user_value': {'srh/flush_interval': 23.3}})
r.table('foo').config().update({'user_value': {'srh/flush_interval': 0.01}})
r.table('foo').config().update({'user_value': {'srh/flush_interval': 'default'}})
r.table('foo').config().update({'user_value': {'srh/flush_interval': 'never'}})
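If you set flush intervals from scripts, a small helper can build the update object and catch typos before the query is sent. This is only a sketch: flushIntervalUpdate is a hypothetical helper, not part of RethinkDB or its drivers, and the rule that numeric intervals must be positive and finite is an assumption rather than something this release is known to enforce.

```javascript
// Hypothetical helper: builds the object passed to
// r.table(...).config().update(...).
function flushIntervalUpdate(interval) {
  const isKeyword = interval === 'default' || interval === 'never';
  // Assumption: numeric intervals are a positive, finite number of seconds.
  const isSeconds =
    typeof interval === 'number' && Number.isFinite(interval) && interval > 0;
  if (!isKeyword && !isSeconds) {
    throw new TypeError(
      'flush interval must be a positive number of seconds, "default", or "never"'
    );
  }
  return { user_value: { 'srh/flush_interval': interval } };
}

// With the JavaScript driver, usage would look like:
//   r.table('foo').config().update(flushIntervalUpdate(23.3))
console.log(JSON.stringify(flushIntervalUpdate('never')));
```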
An example that sets the flush interval to 23.3 seconds, with some other user data:
r.table('foo').config().update({
  'user_value': {'srh/flush_interval': 23.3, 'john/blah': "john's data"}
})
It's possible to set the user value to whatever you want:
r.table('foo').config().update({'user_value': "hello world"})
To set the user value back to what it was initially:
r.table('foo').config().update({'user_value': { }})
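One thing to keep in mind: ReQL's update is documented to merge nested objects rather than replace them. Assuming that also applies to table configuration, updating with { } replaces a string such as "hello world", but leaves an object-valued user value unchanged. The plain-JavaScript sketch below mimics that merge behavior; deepMerge and isPlainObject are hypothetical stand-ins for illustration, not driver functions.

```javascript
// Stand-in for how ReQL's update() is documented to combine objects:
// nested objects are merged field by field, anything else is overwritten.
function isPlainObject(v) {
  return v !== null && typeof v === 'object' && !Array.isArray(v);
}

function deepMerge(current, patch) {
  if (!isPlainObject(current) || !isPlainObject(patch)) {
    return patch; // non-objects (and arrays) are replaced wholesale
  }
  const result = { ...current };
  for (const [key, value] of Object.entries(patch)) {
    result[key] = deepMerge(current[key], value);
  }
  return result;
}

const config = { user_value: { 'srh/flush_interval': 23.3 } };

// Adding a field keeps the existing flush interval.
const withExtra = deepMerge(config, { user_value: { 'john/blah': "john's data" } });

// Merging in an empty object changes nothing.
const afterEmpty = deepMerge(config, { user_value: {} });

// A string is not an object, so it replaces the whole value.
const afterString = deepMerge(config, { user_value: 'hello world' });

console.log(withExtra, afterEmpty, afterString);
```

If the merge behavior does apply here, wrapping the new value in r.literal, as in r.table('foo').config().update({'user_value': r.literal({ })}), should replace the user value outright. That query has not been verified against this release.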
You might also want to set the default durability to "soft":
r.table('foo').config().update({
  'durability': 'soft',
  'user_value': {'srh/flush_interval': 'never'}
})
Warning: If you set the user value, then downgrade to RethinkDB 2.3.5, and then upgrade back to 2.3.5-srh-extra, you might lose your user value configuration. (By design, you will lose your user value configuration if you modify the table's configuration at all while running RethinkDB 2.3.5.)
You might also notice some performance improvements in secondary index creation, bringing up new replicas, and running unit tests.
You might also notice performance regressions. Please report them to https://github.com/srh/rethinkdb/issues. My take is, if you're running the official RethinkDB 2.3.5, this release is far more likely to improve performance than hurt it, so it's worth using 2.3.5-srh-extra.
Binaries for Debian Jessie, Ubuntu 16.04, and CentOS 7 are coming soon. Let me know if you'd like your platform on this list.
These changes will (presumably) get pushed upstream for RethinkDB 2.5.