add godoc badge
KaiserKarel committed Oct 24, 2018
1 parent 2fc7a95 commit a48750f
110 changes: 7 additions & 103 deletions README.md

[![Build Status](https://travis-ci.com/carapace/cellar.svg?branch=master)](https://travis-ci.com/carapace/cellar)
[![CircleCI](https://circleci.com/gh/carapace/cellar/tree/master.svg?style=svg)](https://circleci.com/gh/carapace/cellar/tree/master)
[![Go Report Card](https://goreportcard.com/badge/github.com/carapace/cellar)](https://goreportcard.com/report/github.com/carapace/cellar)
[![Coverage Status](https://coveralls.io/repos/github/carapace/cellar/badge.svg?branch=master)](https://coveralls.io/github/carapace/cellar?branch=master)

[![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
[![](https://godoc.org/github.com/carapace/cellar?status.svg)](http://godoc.org/github.com/carapace/cellar)

Cellar is an append-only storage backend in Go designed for analytical
workloads. It is based on Abdullin's Cellar, which replaced
[geyser-net](https://github.com/abdullin/geyser-net). This fork is
currently being redesigned, so the API should be considered unstable.

Core features:

- events are automatically split into chunks;
- chunks are compressed (LZ4) and may be encrypted via the Cipher
  interface (a possible shape for such a cipher is sketched after this list);
- designed for batching operations (high throughput);
- supports a single writer and multiple concurrent readers;
- secondary indexes and lookups are stored in the metadata DB.
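
Purely as an illustration of the pluggable encryption mentioned above, a
chunk cipher of this kind might look roughly like the sketch below. The
interface name, method signatures and the AES-GCM choice are assumptions
made for the example, not the library's actual `Cipher` API:

```go
package example

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"io"
)

// ChunkCipher is a hypothetical stand-in for cellar's Cipher interface:
// something that seals a chunk buffer before it hits disk and opens it
// again on read. The real interface lives in the repository.
type ChunkCipher interface {
	Encrypt(plain []byte) ([]byte, error)
	Decrypt(sealed []byte) ([]byte, error)
}

// aesGCM is a minimal AES-GCM implementation of ChunkCipher.
type aesGCM struct{ aead cipher.AEAD }

func NewAESGCM(key []byte) (ChunkCipher, error) {
	block, err := aes.NewCipher(key) // key must be 16, 24 or 32 bytes
	if err != nil {
		return nil, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &aesGCM{aead: aead}, nil
}

func (c *aesGCM) Encrypt(plain []byte) ([]byte, error) {
	nonce := make([]byte, c.aead.NonceSize())
	if _, err := io.ReadFull(rand.Reader, nonce); err != nil {
		return nil, err
	}
	// Prepend the nonce so Decrypt can recover it from the sealed chunk.
	return c.aead.Seal(nonce, nonce, plain, nil), nil
}

func (c *aesGCM) Decrypt(sealed []byte) ([]byte, error) {
	n := c.aead.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed chunk too short")
	}
	return c.aead.Open(nil, sealed[:n], sealed[n:], nil)
}
```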

This storage takes ideas from the [Message Vault](https://github.com/abdullin/messageVault),
which was based on the ideas of Kafka and append-only storage in
[Lokad.CQRS](https://github.com/abdullin/lokad-cqrs).

An analytical pipeline built on top of this library was deployed at
HappyPancake to run real-time aggregation and long-term data analysis
on the largest social website in Sweden. You can read more about it in
[Real-time Analytics with Go and LMDB](https://abdullin.com/bitgn/real-time-analytics/).

# Contributors

In alphabetical order:
Don't hesitate to send a PR to include your profile.

Cellar stores data in a very simple manner:

- a MetaDB database (LMDB-backed; see metadb.go) is used for keeping metadata (including user-defined);
- a single pre-allocated file is used to buffer all writes;
- when the buffer fills, it is compressed, encrypted and added to the chunk list.

# Writing

You can have **only one writer at a time**. This writer has two operations:

- `Append` - adds new bytes to the buffer, but doesn't flush it.
- `Checkpoint` - performs all the flushing and saves the checkpoints.

The store is optimized for throughput. You can efficiently execute
thousands of appends followed by a single call to `Checkpoint`.

Whenever a buffer is about to overflow (exceed the predefined max
size), it will be "sealed" into an immutable chunk (compressed,
encrypted and added to the chunk table) and replaced by a new buffer.

See the tests in `writer_test.go` for sample usage patterns (for both
writing and reading).
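
A minimal writing sketch, assuming a constructor along the lines of
`NewWriter(folder, maxBufferSize)`; only the `Append`/`Checkpoint` pattern
is taken from this README, and `writer_test.go` remains the authoritative
reference:

```go
package example

import "github.com/carapace/cellar"

// writeBatch sketches the single-writer pattern described above: many
// Appends followed by one Checkpoint. NewWriter's name, signature and the
// return values of Append/Checkpoint are assumptions; consult
// writer_test.go for the real API.
func writeBatch(folder string, batch [][]byte) error {
	w, err := cellar.NewWriter(folder, 100*1024*1024) // assumed: folder + max buffer size
	if err != nil {
		return err
	}
	defer w.Close()

	for _, event := range batch {
		// Append only buffers the bytes; nothing is durable yet.
		if _, err := w.Append(event); err != nil {
			return err
		}
	}

	// A single Checkpoint flushes the buffer and records the position in
	// the metadata DB, which is what makes batched writes fast.
	_, err = w.Checkpoint()
	return err
}
```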

# Reading

At any point in time **multiple readers can be created** via
`NewReader(folder, encryptionKey)`. You can optionally configure a
reader after creation by setting `StartPos` or `EndPos` to constrain
reading to a part of the database.


Readers have the following operations available:

- `Scan` - reads the database by executing the passed function against
each record;
- `ReadDb` - executes an LMDB transaction against the metadata database
(used to read lookup tables or indexes stored by the
custom writing logic);
- `ScanAsync` - launches reading in a goroutine and returns a buffered
channel that will be filled up with records.

Unit tests in `writer_test.go` demonstrate the use of readers as well.

Note that the reader tries to help you achieve maximum throughput:
while reading events from a chunk, it decrypts and unpacks the entire
file in one go, allocating a memory buffer. All individual event reads
are then performed against this buffer.
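
A corresponding reading sketch, using `NewReader(folder, encryptionKey)`
as described above; the callback signature passed to `Scan` and the
commented position fields are assumptions about the API shape:

```go
package example

import (
	"fmt"

	"github.com/carapace/cellar"
)

// scanAll sketches a full scan with a reader. NewReader(folder, key) comes
// from the README; the rest of the shapes here are guesses.
func scanAll(folder string, key []byte) error {
	r := cellar.NewReader(folder, key)

	// Optionally constrain the scan to a slice of the database:
	// r.StartPos = 0
	// r.EndPos = someChunkBoundary

	return r.Scan(func(data []byte) error { // callback shape is a guess
		fmt.Printf("read record of %d bytes\n", len(data))
		return nil
	})
}
```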

# Example: Incremental Reporting

This library was used as a building block for capturing millions and
billions of events and then running reports on them. Consider the
following example of building an incremental reporting pipeline.

There is an external append-only storage with billions of events and a
few terabytes of data (events are compressed separately with an
equivalent of Snappy). It is located on remote storage (cloud or a
NAS). Custom reports need to be run on this data, refreshing them
every hour.

Cellar storage can be used as a local cache on a dedicated reporting
machine (e.g. you can find an instance with 32GB of RAM, an Intel Xeon
and a 500GB NVMe SSD for under 100 EUR per month). Since Cellar
storage compresses events in chunks, a high compression ratio can be
achieved; for instance, protobuf messages tend to compress by a factor
of 2-10 in chunks.

A solution might include an equivalent of a cron job that executes the
following apps in sequence:

- import job - a Go console app that reads the last retrieved offset
  from the cellar, requests any new data from the remote storage and
  stores it locally in raw format (see the sketch after this list);
- compaction job - a Go console app that incrementally pumps data from
  the "raw" cellar storage to another one (using checkpoints to
  determine the location), while compacting and filtering events to
  keep only the ones needed for reporting;
- report jobs - apps that perform a full scan of the compacted data,
  building reports in memory and then dumping them into TSV (or
  whatever format is used by your data processing framework).
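
To make the shape of such a pipeline concrete, the import job might be
sketched as follows; the `RemoteStore` interface, the writer constructor
and the offset handling are hypothetical, and only the
`Append`/`Checkpoint` pattern comes from this README:

```go
package example

import "github.com/carapace/cellar"

// RemoteStore is a hypothetical client for the external append-only
// storage described above.
type RemoteStore interface {
	FetchSince(offset int64) (events [][]byte, next int64, err error)
}

// importJob sketches the first step of the pipeline: resume from the last
// imported offset, pull fresh events, append them locally, checkpoint once.
func importJob(folder string, remote RemoteStore, lastOffset int64) (int64, error) {
	w, err := cellar.NewWriter(folder, 100*1024*1024) // assumed constructor
	if err != nil {
		return lastOffset, err
	}
	defer w.Close()

	events, next, err := remote.FetchSince(lastOffset)
	if err != nil {
		return lastOffset, err
	}
	for _, e := range events {
		if _, err := w.Append(e); err != nil {
			return lastOffset, err
		}
	}
	// In a real pipeline the new offset would also be stored as
	// user-defined metadata in the metadata DB so the next run can
	// resume from it.
	if _, err := w.Checkpoint(); err != nil {
		return lastOffset, err
	}
	return next, nil
}
```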

All these steps usually execute fast even on large datasets, since (1)
and (2) are incremental and operate only on the fresh data. (3) can
require a full scan of the DB; however, it works with the optimized and
compacted data, hence it will be fast as well. To get the most
performance, you might need to structure your messages for very fast
reads without unnecessary memory allocations or CPU work (e.g. using
something like FlatBuffers instead of JSON or Protobuf).

Note that the compaction job is optional. However, on fairly large
datasets it might make sense to optimize messages for very fast reads
while discarding all the unnecessary information. Should the job
requirements change, you'll need to update the compaction logic,
discard the compacted store and re-process all the raw data from the
start.

# License

3-clause BSD license.
