commit 3b9620bedd7d3d0c471936b872c5e2a499274012
tree 874b96746e53bc026be1a35df50695d4d43c08dc
parent f38b63b4701b2d5f0e1ee9398c382003049d9219
| name | age | message |
|---|---|---|
| .gitignore | Tue Aug 05 05:16:04 -0700 2008 | |
| .gitmodules | Sun Sep 21 04:03:52 -0700 2008 | |
| MIT-LICENSE | Tue Aug 05 04:04:56 -0700 2008 | |
| README | Tue Aug 05 04:05:46 -0700 2008 | |
| Rakefile | Thu Sep 11 02:50:47 -0700 2008 | |
| TODO | Sun Aug 24 17:54:43 -0700 2008 | |
| bin/ | Sun Sep 21 04:03:52 -0700 2008 | |
| docs/ | Sun Aug 10 20:22:30 -0700 2008 | |
| examples/ | Fri Aug 22 09:27:58 -0700 2008 | |
| lib/ | Sun Sep 21 04:03:52 -0700 2008 | |
| spec/ | Sat Sep 06 11:34:07 -0700 2008 | |
| test/ | Mon Aug 25 08:04:26 -0700 2008 | |
| vendor/ | Sun Sep 21 04:03:52 -0700 2008 | |
StrokeDB Core. Minimalistic modular database engine.

Authors: Oleg Andreev <oleganza@idbns.com> & Yurii Rashkovskii <yrashk@idbns.com>
Copyright 2008 IDBNS, inc.

= Architecture overview

StrokeDB stores a set of versioned documents identified by UUID. A document is a hash-like container of slots: a flat set of values tagged with string keys. Slots store arbitrary serializable values (the most common types are booleans, numbers, strings, arrays and times).

The basic repository implements three retrieval methods:

* get(uuid)
* get_version(version)
* each{|doc| ... }

Efficient indexing is implemented with Views. A View is an indexed, sorted list of key-value pairs. Each view relies on a map(doc) method to produce key-value pairs (as in map-reduce). View requests are implemented as prefix-based searches using the find() method.

The StrokeDB Core API relies on:

* two methods of a document: doc["slot"], doc["slot"]=
* slots: "uuid", "version", "previous_version"
* the ability to store Arrays and Strings in the "previous_version" slot. (An Array is used when the version is the result of a merge operation.)

== Versions

Every version of a document references zero or more previous versions. The very first version of a document does not reference any previous version. Regular versions reference exactly one previous_version. In case of a merge, the document references two or more previous versions.

Deleting a document simply creates a new version with the {"deleted"=>true} slot.

== Guarantees

StrokeDB doesn't guarantee referential integrity. Any document may reference any UUID, even one that is not available in the repository.

== Views

TODO

== Synchronization (fetch and merge operations)

Data cases:

1. Many new documents, few versions.
2. Many documents with many versions organized linearly.

Branches must be handled efficiently as well, but the first priority is to make the first two cases as performant as possible.

Fetch contexts:

1. Occasional fetch (hourly, daily etc.)
2. Streamed syncing (i.e. master-slave replication)

The concept.

1. Every repository is identified by UUID.
2. Every repository writes a log of commits.
3. Commit tuples:
   (timestamp, "store", uuid, version)
   (timestamp, "pull", repo_uuid, repo_timestamp)
4. When you pull from a repository:
   1. Find out the latest timestamp in your history (index: repo_uuid -> repo_timestamp)
   2. If there is no timestamp yet, pull the whole log.
   3. If there is a timestamp for a repository UUID, pull the tail of the log.
   4. For each "store" record: fetch the version.
   5. For each "pull" record: add the repository to a deferred list of repositories waiting for update.
   6. When the whole log is fetched, fetch the deferred repositories. We have two options here:
      1. Fetch from the same repository we've been fetching from a few moments ago (say, fetch B log from A)
      2. Or, fetch directly from the desired repository (B log from B repository)

Partial fetches. A repository may expose only a limited set of documents for synchronization, specified with a view. Thus, the view needs to run an update log.

TODO: how to efficiently determine which versions are already fetched?

Security. We may add git-like security features this way:

1. Each version number is a UUIDv5 (SHA-1) of the doc's content.
2. Each subsequent doc UUID is a UUIDv5 for previous commit content.

== Availability
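For illustration, the pull procedure from the Synchronization section can be sketched in a few lines of Ruby. This is only a toy in-memory model under stated assumptions, not StrokeDB's actual code: the class name SketchRepo, its methods (store, log_since, pull), the integer clock used as a timestamp, and the choice to return the deferred repository UUIDs to the caller are all hypothetical.

```ruby
# Hypothetical in-memory model of a repository with a commit log.
# None of these names come from StrokeDB itself.
class SketchRepo
  attr_reader :uuid, :log, :versions, :pulled

  def initialize(uuid)
    @uuid     = uuid
    @log      = []  # [timestamp, "store", uuid, version] or
                    # [timestamp, "pull", repo_uuid, repo_timestamp]
    @versions = {}  # version => document (a Hash of slots)
    @pulled   = {}  # index: repo_uuid => latest repo_timestamp seen
    @clock    = 0   # assumption: a local counter stands in for time
  end

  # Store a document version and record it in the commit log.
  def store(doc)
    @versions[doc["version"]] = doc
    @log << [tick, "store", doc["uuid"], doc["version"]]
  end

  # Whole log when `since` is nil, otherwise only the newer tail.
  def log_since(since)
    since ? @log.select { |entry| entry[0] > since } : @log.dup
  end

  # Pull from `remote`; returns UUIDs of repositories whose logs
  # still need fetching (the "deferred list").
  def pull(remote)
    tail = remote.log_since(@pulled[remote.uuid])
    return [] if tail.empty?

    deferred = []
    tail.each do |_timestamp, kind, id, ref|
      case kind
      when "store" then @versions[ref] ||= remote.versions[ref]
      when "pull"  then deferred << id unless id == uuid
      end
    end

    latest = tail.last[0]
    @pulled[remote.uuid] = latest
    @log << [tick, "pull", remote.uuid, latest]
    deferred.uniq
  end

  private

  def tick
    @clock += 1
  end
end

a = SketchRepo.new("repo-a")
b = SketchRepo.new("repo-b")
a.store({ "uuid" => "doc-1", "version" => "v1" })
b.pull(a) # first pull: no timestamp yet, so the whole log is read
a.store({ "uuid" => "doc-1", "version" => "v2", "previous_version" => "v1" })
b.pull(a) # later pull: only the tail after the recorded timestamp
```

The sketch leaves the choice between the two fetch options to the caller: the returned UUIDs can be resolved either through the repository just pulled from or by contacting each listed repository directly.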











