Reduce size of #leaf.atts keys
The `#leaf.atts` data structure is a `[{Position, AttachmentLength}, ...]`
proplist that keeps track of attachment lengths and is used when calculating
the external data size of documents. `Position` is supposed to uniquely
identify an attachment in a file stream. Initially it was just an integer file
offset. Then, after some refactoring work, it became a list of
`{Position, Size}` tuples.

During the PSE work, streams were abstracted such that each engine can supply
its own stream implementation. The position in the stream then became a tuple
that looks like `{couch_bt_engine_stream,{<0.1922.0>,[{4267,21}]}}`. This tuple
was written to the file as part of the `#leaf.atts` data structure. While still
correct, it is unnecessarily verbose, wasting around 100 bytes per attachment,
per leaf.

To fix it, use the disk-serialized version of the stream position as returned
from `couch_stream:to_disk_term`. In the case of the default CouchDB engine
implementation, this should avoid writing the module name and the pid value for
each attachment entry.
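
To make the size difference concrete, here is a minimal sketch that compares the serialized size of one `{Position, AttLen}` entry in both shapes. The module name `att_pos_size_sketch` and its functions are hypothetical and not part of CouchDB. It assumes that the compact position for the default engine is just the `[{Position, Size}]` list, and it uses `self()` as a stand-in for the pid in the example above. The exact numbers depend on the node name and on how the term is written to disk, so treat the result as indicative only.

```erlang
-module(att_pos_size_sketch).
-export([sizes/0, saved_bytes/0]).

%% Hypothetical helper, not part of CouchDB: compares the external term
%% format size of one {Position, AttLen} entry before and after the change.
sizes() ->
    AttLen = 21,
    %% Verbose position, shaped like the example in the commit message.
    %% self() is only a stand-in for the pid stored in the stream position.
    Verbose = {{couch_bt_engine_stream, {self(), [{4267, 21}]}}, AttLen},
    %% Compact position: ASSUMED to be the [{Pos, Size}] list alone, i.e.
    %% what remains once the module name and the pid are dropped.
    Compact = {[{4267, 21}], AttLen},
    {byte_size(term_to_binary(Verbose)), byte_size(term_to_binary(Compact))}.

%% Bytes saved per entry in this sketch (verbose size minus compact size).
saved_bytes() ->
    {VerboseSize, CompactSize} = sizes(),
    VerboseSize - CompactSize.
```

Calling `att_pos_size_sketch:sizes()` in an Erlang shell returns both sizes as a tuple; the saving is multiplied by the number of attachments in every leaf.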
nickva committed Aug 15, 2018
1 parent d3453d2 commit 861a3c0
Showing 1 changed file with 8 additions and 2 deletions.
10 changes: 8 additions & 2 deletions src/couch/src/couch_att.erl
@@ -308,8 +308,14 @@ size_info([]) ->
     {ok, []};
 size_info(Atts) ->
     Info = lists:map(fun(Att) ->
-        [{_, Pos}, AttLen] = fetch([data, att_len], Att),
-        {Pos, AttLen}
+        AttLen = fetch(att_len, Att),
+        case fetch(data, Att) of
+            {stream, StreamEngine} ->
+                {ok, SPos} = couch_stream:to_disk_term(StreamEngine),
+                {SPos, AttLen};
+            {_, SPos} ->
+                {SPos, AttLen}
+        end
     end, Atts),
     {ok, lists:usort(Info)}.
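
The new branching can be tried outside CouchDB with a small stand-alone model. This is a sketch under stated assumptions, not the project's code: the module `size_info_sketch` is hypothetical, attachments are modelled as plain maps instead of going through `fetch/2`, and the local `to_disk_term/1` is a stub that assumes the stream engine has the shape `{Module, {Fd, Written}}` and that only `Written` is kept. The second attachment in `demo/0` uses an assumed two-element tuple form simply to exercise the `{_, SPos}` clause.

```erlang
-module(size_info_sketch).
-export([size_info/1, demo/0]).

%% Hypothetical stub standing in for couch_stream:to_disk_term/1.
%% ASSUMPTION: the engine term is {Module, {Fd, Written}} and only
%% Written is needed to identify the stream on disk.
to_disk_term({_Module, {_Fd, Written}}) ->
    {ok, Written}.

%% Same branching as the patched size_info/1, with attachments modelled
%% as maps that carry `data` and `att_len` keys.
size_info([]) ->
    {ok, []};
size_info(Atts) ->
    Info = lists:map(fun(#{data := Data, att_len := AttLen}) ->
        case Data of
            {stream, StreamEngine} ->
                {ok, SPos} = to_disk_term(StreamEngine),
                {SPos, AttLen};
            {_, SPos} ->
                {SPos, AttLen}
        end
    end, Atts),
    {ok, lists:usort(Info)}.

%% Two attachments that point at the same stream position, one per clause.
demo() ->
    Streamed = #{data => {stream, {couch_bt_engine_stream, {self(), [{4267, 21}]}}},
                 att_len => 21},
    Other = #{data => {self(), [{4267, 21}]}, att_len => 21},
    size_info([Streamed, Other]).
```

In this model both attachments reduce to the same `{[{4267, 21}], 21}` key, so `lists:usort/1` keeps a single entry and neither the module name nor the pid appears in the result.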
