There have been a couple of issues on bitcask, and I think at least one on leveldb, where stale lock files keep the system or particular vnodes from starting up. There have been repeated requests (basho/bitcask#99, basho/bitcask#163) to manage this at the backend level, but IMO that's the wrong place to handle it, since there are various corner cases that the backend really shouldn't be responsible for detecting. I lay out some of my thinking here:
The ideal thing in my mind is to have Riak guarantee uniqueness at the top/service level, and then to extend the backend API with a `cleanup_locks` call that Riak (or some other containing application) invokes once it has determined that it's totally safe to do so.
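
To make the split of responsibilities concrete, here is a rough sketch in Python rather than Erlang. It is not the actual backend API. Everything in it is an assumption for illustration: the `ServiceGuard` class, the `service.guard` file name, the `*.lock` glob, the `start_vnodes` helper, and the signature of `cleanup_locks`. The guard uses `flock`, which the kernel drops when the owning process exits, so the guard itself cannot go stale the way an ordinary lock file can. It is Unix-only.

```python
import fcntl
from pathlib import Path


class ServiceGuard:
    """Service-level proof that this is the only instance using a data root."""

    def __init__(self, root):
        self.root = Path(root)
        self._fh = None

    def acquire(self):
        # "service.guard" is a made-up file name for this sketch.
        fh = open(self.root / "service.guard", "w")
        try:
            # Non-blocking exclusive lock; fails if another holder exists.
            fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            fh.close()
            return False
        self._fh = fh
        return True

    def release(self):
        if self._fh is not None:
            fcntl.flock(self._fh, fcntl.LOCK_UN)
            self._fh.close()
            self._fh = None


def cleanup_locks(vnode_dir):
    """Hypothetical backend call: delete leftover lock files, report their names.

    The backend does no liveness checks here; the caller has already
    established that nothing else can be using the directory.
    """
    removed = []
    for path in sorted(Path(vnode_dir).glob("*.lock")):  # assumed naming
        path.unlink()
        removed.append(path.name)
    return removed


def start_vnodes(root, vnode_dirs):
    """Take the service-level guard first, and only then clean backend locks."""
    guard = ServiceGuard(root)
    if not guard.acquire():
        raise RuntimeError("another instance owns this data root; leaving locks alone")
    cleaned = {str(d): cleanup_locks(d) for d in vnode_dirs}
    return guard, cleaned
```

The point of the ordering in `start_vnodes` is that the backend never has to decide whether a lock is stale: if the guard cannot be taken, no lock file is touched at all.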
This used to be a fairly distant corner case, but as containerization gets more common we're seeing it more and more, and that trend looks set to continue, so it's best to deal with it soon.
cc @bsparrow435 @joecaswell