Skip to content

Conversation

@garrensmith
Copy link
Member

Overview

Add an endpoint that returns the partition size and doc count

Testing recommendations

Create documents in different partitions. Compare the partition size versus the GET /db size information.

Checklist

  • Code is written and works correctly;
  • Changes are covered by tests;
  • Documentation reflects the changes;

garrensmith and others added 23 commits August 6, 2018 14:20
This props list is recorded in each database shard as well as the
shard document in the special _dbs database.

Co-authored-by: Garren Smith <garren.smith@gmail.com>
Co-authored-by: Robert Newson <rnewson@apache.org>
Co-authored-by: Garren Smith <garren.smith@gmail.com>
Co-authored-by: Robert Newson <rnewson@apache.org>
Default to database's partitioned setting if not present in ddoc.
Co-authored-by: Robert Newson <rnewson@apache.org>
Co-authored-by: Paul J. Davis <paul.joseph.davis@gmail.com>
Co-authored-by: Garren Smith <garren.smith@gmail.com>
Co-authored-by: Robert Newson <rnewson@apache.org>
Adds tests to validate the all_docs optimisations works for partitions
* Block design documents with partitioned option in non-partitioned db
* Prohibit javascript reduces in partitioned:true ddocs
* Prohibit include_docs=true for _view in partitioned db
'skip' is implemented efficiently at the worker level but we've
disabled it for clustered views because of the multiple shards (and
not being able to calculate the right skip value to pass to each
worker). With a partitioned query, this problem is gone, as the value
the query specifies will be the right value for all workers (as they
hit the same shard range).

This commit removes the old fix_skip_and_limit function from
fabric_rpc and moves the logic up to the coordinators.
Copy link
Member

@rnewson rnewson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The broad strokes are right but I'd like you to look at how db_info is done and follow it a bit more closely.


%% @doc returns the size of a given partition
-spec get_partition_info(dbname(), Partition::binary()) ->
{ok, {partition_info, non_neg_integer()}}.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that spec doesn't look quite right.

case fabric_dict:size(Counters1) =:= 0 of
true ->
[FirstInfo | RestInfos] = Acc2,
PartitionInfo = get_max_partition_size(FirstInfo, RestInfos),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

take a look at fabric_db_info.erl for how it merges results.

[FirstInfo | RestInfos] = Acc2,
PartitionInfo = get_max_partition_size(FirstInfo, RestInfos),
{stop,
[{db_name, Name} | format_partition(PartitionInfo)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

peculiar indentation (here and elsewhere).

{doc_del_count, DocDelCount},
{active, SizeInfo#size_info.active},
{external, SizeInfo#size_info.external}
{size,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"sizes"

send_json(Req, 202, {[{ok, true}]}).


handle_partition_req(#httpd{method='GET',path_parts=[DbName, <<"_partition">>, Partition, <<"_info">>]}=Req, _Db) ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's put this at /dbname/_partition/partition as a parallel to getting db info at /dbname

Sizes = couch_db_engine:get_partition_info(Db, Partition),
{ok, Sizes};
get_partition_info(_Db, _Partition) ->
throw({badrequest, <<"`partition` is not valid">>}).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bad_request

-type epochs() :: [{Node::atom(), UpdateSeq::non_neg_integer()}].
-type size_info() :: [{Name::atom(), Size::non_neg_integer()}].
-type partition_info() :: [
{Partition::atom(), Partition::binary()} |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

odd. why aren't we saying {partition, Partition::binary()} here? using the same name twice is confusing and probably doesn't even work?

@rnewson rnewson force-pushed the feature/user-partitioned-databases branch 2 times, most recently from 8b7f146 to fcb1795 Compare October 1, 2018 16:33
{DocCount::atom(), DocCount::non_neg_integer()} |
{DocDelCount::atom(), DocDelCount::non_neg_integer()} |
{Size::atom(), size_info()}
{partition, Partition::binary()} |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

db_name is missing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't return the db name at this level. We do that at the fabric level.

Copy link
Member

@rnewson rnewson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs some squashing but otherwise +1

Add an endpoint that returns the partition size and doc count
@garrensmith garrensmith force-pushed the add-partition-info-endpoint branch from fac4775 to 2f3aa8b Compare October 2, 2018 12:34
@garrensmith
Copy link
Member Author

This is merged into the partitions branch.

@garrensmith garrensmith closed this Oct 2, 2018
@garrensmith garrensmith deleted the add-partition-info-endpoint branch October 4, 2018 08:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants