
Conversation

kocolosk
Member

Overview

Describes the implementation of _all_docs and the GET /dbname (database info) endpoint on FoundationDB.

Checklist

  • Documentation is written and is accurate
  • make check passes with no errors
  • Update rebar.config.script with the commit hash once this PR is rebased and merged

term representation of the JSON. This is a somewhat awkward choice as the
internal Erlang term representation is liable to change over time (e.g. with the
introduction of Maps in newer Erlang releases, or plausibly even a JSON decoder
that directly emits the format defined in the document storage RFC).
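
As an aside (not part of the RFC text), the two internal representations being contrasted look roughly like this for the same JSON object; this is only an illustration of the jiffy-style EJSON term versus an Erlang map:

```erlang
%% Illustration only: the same JSON object {"_id": "doc1", "value": 42}
%% as a jiffy-style EJSON term (a proplist wrapped in a 1-tuple) and as a map.
Ejson = {[{<<"_id">>, <<"doc1">>}, {<<"value">>, 42}]},
Map = #{<<"_id">> => <<"doc1">>, <<"value">> => 42}.
```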
Member

This changed recently: apache/couchdb#1983

@rnewson
Member

rnewson commented Apr 18, 2019

Firstly, I feel we can deprecate or change anything in the db info blob as we see fit. We are planning a major version bump, 3.0, to signal the path forward, and a further major version bump, 4.0, which is when the foundationdb backplane appears.

The paragraphs on the _all_docs subspace are clear to me as are the rules on incrementing and decrementing doc_count and doc_del_count. Both of those values retain meaning to the user, given that tombstones will still exist.
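
For illustration, here is a minimal sketch of the kind of counter maintenance being described, assuming the erlfdb bindings; the {<<"meta">>, ...} key layout and function name are invented for this example and are not the RFC's actual schema:

```erlang
%% Sketch only: maintain doc_count / doc_del_count with FDB atomic adds so
%% concurrent transactions never conflict on the counter keys.
update_doc_counts(Tx, DbPrefix, Before, After) ->
    DocCount = erlfdb_tuple:pack({<<"meta">>, <<"doc_count">>}, DbPrefix),
    DelCount = erlfdb_tuple:pack({<<"meta">>, <<"doc_del_count">>}, DbPrefix),
    case {Before, After} of
        {missing, live}    -> erlfdb:add(Tx, DocCount, 1);
        {missing, deleted} -> erlfdb:add(Tx, DelCount, 1);
        {live, deleted}    -> erlfdb:add(Tx, DocCount, -1),
                              erlfdb:add(Tx, DelCount, 1);
        {deleted, live}    -> erlfdb:add(Tx, DocCount, 1),
                              erlfdb:add(Tx, DelCount, -1);
        _                  -> ok   % live -> live, deleted -> deleted
    end.
```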

It would be good if purge_seq were to vanish, replaced by atomic removal of all trace of a document including secondary indexes, but that does seem unlikely (or, at least, it makes third party indexing difficult to impossible). Purge is a large topic of its own and so I won't say more here.

On sizes, the new couch_ejson_size:encoded_size(Body) calculation should yield the same value as today and is worth preserving. Beyond that, given the absence of compaction, is there any need for other sizes? The only other obvious one we can calculate is the sum of the lengths of all the keys and values used to store the database (that is, including all the overhead of the exploded form and the metadata). I think that's doable with atomic operations but, I'm imagining, involves housekeeping in many places.

We need to decide what these values are for first: what decisions do they help users make? external exists, largely, for Cloudant, as it represents the data as the user understands it, and Cloudant charges by the GB for that. It is obviously also useful to other users, but I suspect most users would only care about the file value, in order to direct compaction efforts or other cleanup (including deleting databases entirely). The distinction between active data and trash is eliminated by FoundationDB and our choice of update semantics (the old revision is atomically removed when the new revision supersedes it), so we should drop that item.
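
To make the discussion concrete, a sketch of how the surviving fields might be read back for GET /dbname, assuming they are kept as little-endian counters maintained by atomic adds; the key names and layout are invented for illustration:

```erlang
%% Sketch: serve GET /dbname metadata from a handful of 8-byte counters.
db_info(Db, DbPrefix) ->
    erlfdb:transactional(Db, fun(Tx) ->
        Read = fun(Name) ->
            Key = erlfdb_tuple:pack({<<"meta">>, Name}, DbPrefix),
            case erlfdb:wait(erlfdb:get(Tx, Key)) of
                not_found -> 0;
                <<Val:64/unsigned-little-integer>> -> Val
            end
        end,
        {[
            {doc_count, Read(<<"doc_count">>)},
            {doc_del_count, Read(<<"doc_del_count">>)},
            {sizes, {[{external, Read(<<"size_external">>)}]}}
        ]}
    end).
```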

For r/w/n/q I feel strongly we should drop them. They are intimately coupled to the current architecture and I think it would be misleading to carry these values across. The configuration of the FoundationDB cluster can take many forms and can change at runtime; we should cleanly decouple ourselves from it. I don't think we should pass through any FoundationDB status or config either, though our documentation / release notes for 4.0 should include a user guide for the basics. Others might disagree on how much handholding we should be doing. Are we, in effect, embedding FoundationDB or are we a layer above it? As code, it's emphatically the latter, but a user could legitimately hold either perspective. This matters for this RFC as it's the deciding factor on the scope of the "info" in /dbname and elsewhere.

+1 on removing total_rows, offset, compact_running, other, data_size and disk_size but I note that the replicator looks for instance_start_time. We can enhance the replicator in 3.0 to no longer do so and then state that 4.0 cannot replicate to < 3.0 systems, but perhaps it wouldn't hurt to include the field either hardcoded to 0 (as we've done from 2.0 onward for compatibility reasons) or set it to the start time of the fdb cluster if that's cheaply available.

@janl janl left a comment
Member

+1 mostly to what Bob says modulo my question about offset and total_rows.

Also:

> Beyond that, given the absence of compaction, is there any need for other sizes?

I think if a user need arises for getting anything but the Erlang-term data size (e.g. the exploded docs as @rnewson posits), we can think about adding this later, but I’d leave this and any disk-level data sizes out.

I can’t yet tell how much introspection we get from FDB, but I assume there is enough in there to make storage ops decisions.


## HTTP API deprecations

The `total_rows` and `offset` fields are removed from the response to
Member

Why are we dropping those, and if we do, how do we tell users to “iterate over all docs in a db”? I realise the semantics of this are going to change with the 5s transaction limit, but users need at least guidance on how to do this, and what to expect.

Member Author

My initial thinking was to drop offset because there's a fair bit of overhead to compute it (FDB doesn't maintain counts of keys within a range by default), and it seemed to be of limited value. I thought we had gotten to the point where the stock advice on paging was to use start_key / end_key and limit. And then I just went ahead and dropped total_rows to avoid the extra lookup and for parity with _changes.

If there's an argument to keep total_rows, that's a pretty cheap thing to include.
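
For reference, a hedged sketch of what key-based paging looks like at the FDB level, assuming erlfdb's get_range/4 (returning {Key, Value} pairs) and an invented ?ALL_DOCS subspace tag; the point is that resuming from the last key seen needs no offset bookkeeping:

```erlang
%% Sketch of offset-free paging: callers resume from the last doc id they
%% saw rather than skipping N rows. Subspace tag and layout are illustrative.
-define(ALL_DOCS, 16#10).

all_docs_page(Tx, DbPrefix, AfterDocId, Limit) ->
    Prefix = erlfdb_tuple:pack({?ALL_DOCS}, DbPrefix),
    EndKey = <<Prefix/binary, 16#FF>>,
    StartKey = case AfterDocId of
        undefined ->
            Prefix;
        _ ->
            %% Append a zero byte so the range starts strictly after the
            %% key for AfterDocId itself.
            Packed = erlfdb_tuple:pack({?ALL_DOCS, AfterDocId}, DbPrefix),
            <<Packed/binary, 16#00>>
    end,
    %% Each Key is the tuple-packed {?ALL_DOCS, DocId}; the Value would hold
    %% whatever the RFC ends up storing there (e.g. the winning revision).
    erlfdb:get_range(Tx, StartKey, EndKey, [{limit, Limit}]).
```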

Member Author

In fact, the current implementation appears to include total_rows. It also includes an offset field set to null.


* `compact_running`

* `disk_format_version`: this is a tricky one. We define "format versions" for
Member

Just a thought, but would choosing smart values for format versions and ORing them into a chmod-style bitfield work?

If that’s impractical, do we expect format definitions to mix and match over time, or could we define “the set of per-let format versions XYZ is 1, XZZ is 2” and so on? This would be close to what we have now, where we make arbitrary changes to the db header record and each change gets incremented by one.

Member Author

I think this would devolve to "CouchDB version 4.x.y writes data using on-disk format version 12"; i.e., it'd be a property of the running CouchDB code rather than a property of a database. Individual databases are just portions of a global key space, and we don't have any background process that's sitting down and rewriting an entire database into the latest format, so at first glance we'll need to support full schema evolution for each data type we're storing in FDB.
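
A minimal sketch of what per-value schema evolution could look like; the version numbers, encoding, and helper names here are made up for illustration, not taken from the implementation:

```erlang
%% Sketch: each stored value carries a small version tag and the decoder
%% dispatches on it, so old and new encodings can coexist in one keyspace.
encode_doc_body(Body) ->
    term_to_binary({2, Body}).          % 2 = hypothetical current version

decode_doc_body(Bin) ->
    case binary_to_term(Bin) of
        {2, Body}    -> Body;
        {1, OldBody} -> upgrade_v1_to_v2(OldBody)
    end.

upgrade_v1_to_v2(OldBody) ->
    %% Placeholder: rewrite the older representation into the current one.
    OldBody.
```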

atomic operations, and a record of the size of each revision in the ?REVISIONS
subspace so that a transaction can compute the delta for each document.
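
A sketch of the delta computation being described, assuming (for illustration only) that the per-revision value is just an 8-byte size and using invented subspace tags and key names:

```erlang
%% Sketch: with each revision's size recorded in the ?REVISIONS subspace, an
%% update transaction can adjust the external size counter by a delta
%% without re-reading the old document body.
-define(REVISIONS, 16#20).

update_external_size(Tx, DbPrefix, DocId, OldRev, NewRev, NewBody) ->
    NewSize = couch_ejson_size:encoded_size(NewBody),
    OldSize = case OldRev of
        undefined ->
            0;
        _ ->
            OldKey = erlfdb_tuple:pack({?REVISIONS, DocId, OldRev}, DbPrefix),
            case erlfdb:wait(erlfdb:get(Tx, OldKey)) of
                not_found -> 0;
                <<Size:64/unsigned-little-integer>> -> Size
            end
    end,
    NewKey = erlfdb_tuple:pack({?REVISIONS, DocId, NewRev}, DbPrefix),
    erlfdb:set(Tx, NewKey, <<NewSize:64/unsigned-little-integer>>),
    SizeKey = erlfdb_tuple:pack({<<"meta">>, <<"size_external">>}, DbPrefix),
    erlfdb:add(Tx, SizeKey, NewSize - OldSize).
```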

### Clustering
Member

I agree with @rnewson that we should drop these.

@wohali wohali marked this pull request as ready for review August 12, 2020 00:42
@wohali wohali merged commit e692ef4 into master Aug 12, 2020