Data aggregation C++ API #248

mellis13 · 2022-04-14T23:05:32Z

This is a draft PR for the data aggregation API in C++. It is being opened as a draft so that work on multi-threading the dataset retrieval from cluster shards can proceed in parallel. CI tests will likely fail until all functionality is complete.
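For anyone picking up the multi-threading work, here is a rough sketch of one possible shape: one task per shard, with results gathered in shard order. Everything in it is an assumption made for illustration. `FakeDataSet`, `fetch_from_shard`, and `fetch_all` are hypothetical names, not part of this PR or of the client API, and the per-shard fetch is a stub that fabricates results rather than talking to a database.

```cpp
#include <cstddef>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dataset; not the real DataSet class.
struct FakeDataSet {
    std::string name;
};

// Hypothetical per-shard fetch. A real client would send a pipeline of
// commands to the shard here; this stub only fabricates named results.
std::vector<FakeDataSet> fetch_from_shard(std::size_t shard_id,
                                          const std::vector<std::string>& keys)
{
    std::vector<FakeDataSet> out;
    out.reserve(keys.size());
    for (const std::string& key : keys)
        out.push_back(FakeDataSet{"shard" + std::to_string(shard_id) + ":" + key});
    return out;
}

// Launch one task per shard, then gather the results in shard order.
// The launch policy is a parameter so callers can pass
// std::launch::deferred to run everything on the calling thread.
std::vector<FakeDataSet> fetch_all(
    const std::vector<std::vector<std::string>>& keys_by_shard,
    std::launch policy = std::launch::async)
{
    std::vector<std::future<std::vector<FakeDataSet>>> tasks;
    tasks.reserve(keys_by_shard.size());
    for (std::size_t i = 0; i < keys_by_shard.size(); ++i) {
        tasks.push_back(std::async(policy, [&keys_by_shard, i] {
            return fetch_from_shard(i, keys_by_shard[i]);
        }));
    }

    std::vector<FakeDataSet> all;
    for (auto& task : tasks) {
        // get() blocks until that shard's task has finished.
        std::vector<FakeDataSet> part = task.get();
        for (FakeDataSet& ds : part)
            all.push_back(std::move(ds));
    }
    return all;
}
```

Gathering in shard order keeps the output deterministic even though the tasks may finish in any order. Whether the real implementation should preserve list order instead is a separate design question that this sketch does not address.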

The following items still need to be addressed before final review and merge of this PR:

codecov-commenter · 2022-04-15T00:03:54Z

Codecov Report

Merging #248 (d0c0d8c) into develop (7f2f9ea) will decrease coverage by 0.69%.
The diff coverage is 80.45%.

@@             Coverage Diff             @@
##           develop     #248      +/-   ##
===========================================
- Coverage    80.85%   80.15%   -0.70%     
===========================================
  Files           51       54       +3     
  Lines         3008     3326     +318     
===========================================
+ Hits          2432     2666     +234     
- Misses         576      660      +84

Impacted Files	Coverage Δ
include/client.h	`100.00% <ø> (ø)`
include/commandlist.h	`100.00% <ø> (ø)`
include/commandreply.h	`100.00% <ø> (ø)`
include/redisserver.h	`33.33% <ø> (ø)`
src/cpp/pipelinereply.cpp	`36.84% <36.84%> (ø)`
include/pipelinereply.h	`50.00% <50.00%> (ø)`
src/cpp/commandlist.cpp	`87.80% <66.66%> (-3.63%)`	⬇️
src/cpp/client.cpp	`83.89% <88.20%> (+0.44%)`	⬆️
include/dataset.h	`100.00% <100.00%> (ø)`
src/cpp/redis.cpp	`54.54% <100.00%> (-4.51%)`	⬇️
... and 14 more

include/client.h

src/cpp/client.cpp

src/cpp/pipelinereply.cpp

tests/cpp/unit-tests/test_aggregation_list.cpp

retrieval from aggregation lists. Tests now also check that dataset names match.

error handling is easier to understand. This required some changes to the PipelineReply object and an additional vector of command pointers in the unordered pipeline execution.
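To illustrate why a parallel vector of command pointers helps with error reporting, here is a minimal sketch under stated assumptions. `Cmd`, `Reply`, `execute`, and `run_unordered` are hypothetical, simplified names and do not reflect the actual Command or PipelineReply classes in this PR. The idea shown is only that when commands are regrouped by shard, replies no longer arrive in submission order, so a vector aligned with the replies is needed to say which command failed.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified types for illustration only.
struct Cmd {
    std::string text;  // e.g. "GET key"
    int shard;         // shard assumed to own the key
};

struct Reply {
    bool ok;
    std::string payload;
};

// Stub executor: a command fails if its text contains "bad".
Reply execute(const Cmd& cmd)
{
    if (cmd.text.find("bad") != std::string::npos)
        return Reply{false, "ERR simulated failure"};
    return Reply{true, "OK"};
}

// Run commands grouped by shard, so replies come back in shard order
// rather than submission order. sent[i] records which command produced
// replies[i], which lets an error message name the failing command.
std::vector<std::string> run_unordered(const std::vector<Cmd>& cmds)
{
    std::map<int, std::vector<const Cmd*>> by_shard;
    for (const Cmd& c : cmds)
        by_shard[c.shard].push_back(&c);

    std::vector<Reply> replies;
    std::vector<const Cmd*> sent;
    for (const auto& entry : by_shard) {
        for (const Cmd* c : entry.second) {
            replies.push_back(execute(*c));
            sent.push_back(c);
        }
    }

    std::vector<std::string> errors;
    for (std::size_t i = 0; i < replies.size(); ++i) {
        if (!replies[i].ok) {
            errors.push_back("command '" + sent[i]->text + "' on shard " +
                             std::to_string(sent[i]->shard) +
                             " failed: " + replies[i].payload);
        }
    }
    return errors;
}
```

The pointers refer to elements of the caller's vector, so that vector must outlive the call and must not be resized while the pipeline is running.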

ashao

Really nice work here @mellis13. You and @billschereriii already anticipated/addressed the things that I thought might have been issues. The main change that I'm suggesting is to add/expose pop functionality to the list in the event that you have multiple consumers of the aggregated list.
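To make the pop suggestion concrete, here is a small in-memory sketch of the semantics I have in mind, assuming each entry should be handed to exactly one consumer. `AggregationListSketch`, `append`, `pop_front`, and `length` are hypothetical names chosen for illustration, not the API in this PR. On the database side, Redis list pops such as LPOP are atomic, so the mutex here only stands in for that guarantee.

```cpp
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical in-memory stand-in for an aggregation list.
class AggregationListSketch {
public:
    // Add a dataset name to the end of the list.
    void append(std::string name)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        names_.push_back(std::move(name));
    }

    // Remove the oldest entry and store it in `out`.
    // Returns false, leaving `out` untouched, if the list is empty.
    bool pop_front(std::string& out)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        if (names_.empty())
            return false;
        out = std::move(names_.front());
        names_.pop_front();
        return true;
    }

    // Current number of entries, returned as int for illustration.
    int length() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return static_cast<int>(names_.size());
    }

private:
    mutable std::mutex mutex_;
    std::deque<std::string> names_;
};
```

Because the check for emptiness and the removal happen under one lock, two consumers calling `pop_front` at the same time can never receive the same entry, which is the property that matters with multiple consumers.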

doc/data_structures.rst

src/cpp/rediscluster.cpp

src/cpp/client.cpp

than or equal.

ashao

Changes all look good to me

mellis13 added 14 commits April 19, 2022 14:57

Initial implementation of data aggregation API.

b4c95b4

Prototype of pipelining metadata command.

b8c4b66

Remove debug output.

3669899

Remove another debug output.

9f48dfe

Prototype of full pipe.

c1ac837

Prototype of zero copy version of pipeline reply.

5e6aeb3

Cleanup of some of the pipeline routines.

86bd35a

More cleanup.

78fc86a

Add run_via_unordered_pipelines to Redis.

e9834fd

Add pipelinereply to CMakeLists.txt

8bdd13e

Change list_length to be of type int. This is done because get_list_length() returns type int (not size_t). This seems more consistent.

c4bf4cf

Fix type on operator[].

4f4ca1f

Forgot to add source file to last change.

a6c163d

Add copy and rename functions for aggregation lists and unit tests.

ac17d2c

mellis13 force-pushed the data_aggregation_pipeline branch from c29b2f8 to ac17d2c Compare April 19, 2022 22:02

mellis13 added 2 commits April 19, 2022 15:08

Fix unwanted commit.

f1dc193

Add docs.

63a2640