This repository has been archived by the owner on Feb 8, 2019. It is now read-only.

QUICKSTEP-70-71 Improve aggregation performance #179

Merged
merged 1 commit into from Feb 7, 2017

Contributor

@jianqiao jianqiao commented Feb 5, 2017

This PR implements two features that improve aggregation performance:

  1. Adds CollisionFreeVectorTable to support specialized high-performance aggregation.
  2. Adds support for aggregation copy elision, so that we only materialize intermediate results for non-trivial expressions.

For feature 1, when the group-by attribute is a single range-bounded attribute of INT or LONG type, we can use a vector of type std::vector<std::atomic<StateT>> to store the aggregation states, where StateT is the aggregation state type (currently restricted to LONG and DOUBLE). During aggregation, for each tuple, we locate the aggregation state by using the group-by key's value as an index into the state vector, and update the state concurrently with C++ atomic primitives.
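A minimal sketch of the idea behind feature 1, not the actual CollisionFreeVectorTable code. The class name CollisionFreeSumTable, its methods, and the helper atomicAddDouble are hypothetical and only illustrate indexing a vector of atomics by the key value; the compare-exchange loop is one possible way to handle DOUBLE states.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a collision-free SUM table.
// Assumes every group-by key lies in [min_key, max_key].
class CollisionFreeSumTable {
 public:
  CollisionFreeSumTable(std::int64_t min_key, std::int64_t max_key)
      : min_key_(min_key), states_(numEntries(min_key, max_key)) {
    // Explicitly zero every state before use.
    for (std::atomic<std::int64_t> &state : states_) {
      state.store(0, std::memory_order_relaxed);
    }
  }

  // Adds 'value' to the SUM state of 'key'. The key value itself is the
  // index, so there are no collisions and no locks; many threads may call
  // this concurrently.
  void upsert(std::int64_t key, std::int64_t value) {
    const std::size_t index = static_cast<std::size_t>(key - min_key_);
    assert(index < states_.size());
    states_[index].fetch_add(value, std::memory_order_relaxed);
  }

  std::int64_t sum(std::int64_t key) const {
    const std::size_t index = static_cast<std::size_t>(key - min_key_);
    assert(index < states_.size());
    return states_[index].load(std::memory_order_relaxed);
  }

 private:
  static std::size_t numEntries(std::int64_t min_key, std::int64_t max_key) {
    assert(max_key >= min_key);
    return static_cast<std::size_t>(max_key - min_key + 1);
  }

  const std::int64_t min_key_;
  std::vector<std::atomic<std::int64_t>> states_;
};

// For DOUBLE states, an atomic add can be written as a compare-exchange loop.
inline void atomicAddDouble(std::atomic<double> *state, double value) {
  double expected = state->load(std::memory_order_relaxed);
  while (!state->compare_exchange_weak(expected, expected + value,
                                       std::memory_order_relaxed)) {
    // On failure 'expected' holds the latest value; retry with it.
  }
}
```

The memory cost is proportional to the key range rather than to the number of distinct groups, which is why the range has to be bounded.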

For feature 2, note that the current implementation of aggregation always creates a ColumnVectorsValueAccessor to store the results of ALL the input expressions. However, we can avoid creating a column vector (and thus avoid copying values into it) if the aggregation is on a simple attribute, e.g. SUM(x). With this PR, when performing aggregation we prepare two input ValueAccessors: one BASE accessor that is created directly from the input relation's storage block, and one DERIVED accessor that is the temporary result ColumnVectorsValueAccessor. Each aggregation argument comes either from the base accessor (meaning that it is a simple attribute) or from the derived accessor (meaning that it is a non-trivial expression that gets evaluated). The aggregation handles and aggregation hash tables then read each argument from the appropriate accessor.
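A rough sketch of the base/derived split, under simplifying assumptions: columns are plain std::vector<std::int64_t>, and the names ValueAccessorSource, ArgumentRef, Accessors and sumArgument are made up for illustration and are not the Quickstep types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Column = std::vector<std::int64_t>;

// Where an aggregation argument lives (hypothetical names).
enum class ValueAccessorSource { kBase, kDerived };

struct ArgumentRef {
  ValueAccessorSource source;
  std::size_t column_index;
};

struct Accessors {
  const std::vector<Column> *base;     // Columns of the storage block, not copied.
  const std::vector<Column> *derived;  // Materialized results of expressions.
};

// SUM over one argument. A simple attribute is read directly from the base
// columns; only non-trivial expressions go through the derived columns.
inline std::int64_t sumArgument(const Accessors &accessors,
                                const ArgumentRef &arg) {
  const std::vector<Column> *columns =
      (arg.source == ValueAccessorSource::kBase) ? accessors.base
                                                 : accessors.derived;
  assert(columns != nullptr && arg.column_index < columns->size());
  std::int64_t sum = 0;
  for (const std::int64_t value : (*columns)[arg.column_index]) {
    sum += value;
  }
  return sum;
}
```

With this layout, SUM(x) would use an ArgumentRef with source kBase, while SUM(x * 2) would point at a derived column that holds the evaluated expression.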

Main changes:
expressions/aggregation: Updated the aggregation handles to support copy elision. Also did some cleanups.
relational_operators: Added InitializeAggregationOperator to support parallel initialization of the aggregation state (it just memsets the memory to 0), because initialization takes a relatively long time with a single thread if the aggregation hash table is large.
storage: Added CollisionFreeVectorTable. Renamed FastHashTable to PackedPayloadHashTable, made it support copy elision, and cleaned up the class to remove unused methods. Refactored AggregationOperationState to support copy elision and support the new aggregation. Moved aggregation code out of StorageBlock.
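To illustrate the parallel initialization mentioned under relational_operators, here is a small sketch of how the state memory could be split so that each work order zeroes one slice. The function initializePartition and its parameters are hypothetical; the real operator and work order classes are not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Zeroes the partition_id-th of num_partitions roughly equal byte ranges of
// 'memory'. Each call touches a disjoint range, so one call per partition
// can be handed to a different worker thread.
inline void initializePartition(void *memory,
                                std::size_t num_bytes,
                                std::size_t num_partitions,
                                std::size_t partition_id) {
  assert(num_partitions > 0 && partition_id < num_partitions);
  // Round up so that the partitions together cover all bytes.
  const std::size_t chunk = (num_bytes + num_partitions - 1) / num_partitions;
  const std::size_t begin = std::min(partition_id * chunk, num_bytes);
  const std::size_t end = std::min(begin + chunk, num_bytes);
  std::memset(static_cast<char *>(memory) + begin, 0, end - begin);
}
```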

This PR significantly improves some TPC-H queries' performance. For example, it improves TPC-H Q18 from ~27.5s to ~3.5s, with scale factor 100 on a cloudlab machine.

The table below shows the TPC-H performance (scale factor 100 on a cloudlab machine) with recently committed optimizations up to this point:

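| TPC-H SF100 | master (ms) | w/ optimizations (ms) |
| --- | ---: | ---: |
| Q01 | 13629 | 11221 |
| Q02 | 537 | 460 |
| Q03 | 4824 | 4124 |
| Q04 | 2185 | 2203 |
| Q05 | 5517 | 5282 |
| Q06 | 399 | 401 |
| Q07 | 18563 | 3456 |
| Q08 | 1746 | 899 |
| Q09 | 7247 | 5586 |
| Q10 | 6745 | 5665 |
| Q11 | 1053 | 247 |
| Q12 | 1713 | 1698 |
| Q13 | 22896 | 15582 |
| Q14 | 805 | 745 |
| Q15 | 897 | 431 |
| Q16 | 9942 | 9158 |
| Q17 | 1588 | 1117 |
| Q18 | 27459 | 3507 |
| Q19 | 1711 | 1609 |
| Q20 | 1204 | 1014 |
| Q21 | 8671 | 7886 |
| Q22 | 6178 | 724 |
| Total | 145509 | 83016 |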

@asfgit asfgit force-pushed the collision-free-agg branch 2 times, most recently from 7285f90 to fe2ec54 Compare February 5, 2017 18:11
@zuyu
Member

zuyu commented Feb 5, 2017

Hi @jianqiao,

A quick question on Feature 1, the vector-based aggregation: for a group-by with a known bounded range (i.e., known min and max values), do we always choose this approach over the hash-based one, or does it depend on the range size (i.e., if the range is too wide, we fall back to the hash-based one)? Thanks!

@jianqiao
Contributor Author

jianqiao commented Feb 5, 2017

It depends on the range size. Currently there is a gflag for setting the upper bound of the range at ExecutionGenerator.cpp line 440.
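A sketch of what such a range check could look like. The function name, its parameters, and the way the threshold is passed in are assumptions for illustration; the actual gflag name and the logic in ExecutionGenerator.cpp are not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>

// Decides whether a group-by key with values in [min_value, max_value] is
// narrow enough for the vector-based table. 'max_num_entries' stands in for
// the value of the gflag (hypothetical parameter).
// On success, *num_entries receives the required vector length.
inline bool canUseCollisionFreeAggregation(std::int64_t min_value,
                                           std::int64_t max_value,
                                           std::uint64_t max_num_entries,
                                           std::size_t *num_entries) {
  if (max_value < min_value) {
    return false;
  }
  // Unsigned subtraction avoids signed overflow for very wide ranges.
  const std::uint64_t range = static_cast<std::uint64_t>(max_value) -
                              static_cast<std::uint64_t>(min_value);
  if (range >= max_num_entries) {
    return false;  // Too wide: fall back to hash-based aggregation.
  }
  *num_entries = static_cast<std::size_t>(range + 1);
  return true;
}
```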

Member

@zuyu zuyu left a comment

My only concern regarding this PR is the way it deals with partitions. I guess we may merge PartitionedHashTablePool and HashTablePool so that the latter is the special case of the former with a single partition.

@@ -61,7 +61,7 @@ class HashTableStateUpserterFast {
 * table. The corresponding state (for the same key) in the destination
 * hash table will be upserted.
 **/
-HashTableStateUpserterFast(const HandleT &handle,
+HashTableStateUpserter(const HandleT &handle,
 const std::uint8_t *source_state)
Member

Align with the line above.

Contributor Author

Updated.

estimated_num_groups,
&max_num_groups);

if (can_use_collision_free_aggregation) {
Member

Minor, we actually don't need this extra bool variable.

Contributor Author

Updated.

#endif

// Supports only single group-by key.
if (aggregate->grouping_expressions().size() != 1) {
Member

For multiple small group-by keys, I think we could create a multi-dimensional array to achieve the same goal as for the single key.

Contributor Author

@jianqiao jianqiao Feb 7, 2017

Yes, we will have a follow-up PR that deals with multiple group-by keys to improve TPC-H Q01.

break;
}
default:
return false;
Member

Is overflow the reason for supporting only these types?

Contributor Author

We can support more types later. For any type/any number of group-by keys, if we can define a one-to-one mapping function that maps the keys to range-bounded integers, then this aggregation is applicable.
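As an illustration of such a one-to-one mapping, the sketch below flattens several range-bounded integer keys into a single dense index, in the same way a multi-dimensional array is laid out in row-major order. KeyRange and denseIndex are hypothetical names, and this is not code from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Inclusive value range of one group-by key (hypothetical helper type).
struct KeyRange {
  std::int64_t min;
  std::int64_t max;
};

// Maps a composite key to a unique index in [0, product of range widths).
// Distinct in-range keys always yield distinct indices.
inline std::size_t denseIndex(const std::vector<KeyRange> &ranges,
                              const std::vector<std::int64_t> &key) {
  assert(ranges.size() == key.size());
  std::size_t index = 0;
  for (std::size_t i = 0; i < key.size(); ++i) {
    assert(key[i] >= ranges[i].min && key[i] <= ranges[i].max);
    const std::size_t width =
        static_cast<std::size_t>(ranges[i].max - ranges[i].min + 1);
    index = index * width + static_cast<std::size_t>(key[i] - ranges[i].min);
  }
  return index;
}
```

The resulting index can then address a state vector of length equal to the product of the range widths, so the approach only pays off when that product stays small.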

}

// TODO(jianqiao): Support AggregationID::AVG.
switch (agg_func->getAggregate().getAggregationID()) {
Member

Refactor using QUICKSTEP_EQUALS_ANY_CONSTANT defined in utility/EqualsAnyConstant.hpp.

Contributor Author

Updated.


void *aligned_memory_start = this->blob_->getMemoryMutable();
std::size_t available_memory = num_storage_slots * kSlotSizeBytes;
if (align(alignof(Header),
Member

Refactor using CHECK.

Similarly below.

"StorageBlob used to hold resizable "
"SeparateChainingHashTable is too small to meet alignment "
"requirements of SeparateChainingHashTable::Header.");
} else if (aligned_memory_start != this->blob_->getMemoryMutable()) {
Member

Refactor using LOG_IF.

// Separate chaining ensures that any resized hash table with more buckets
// than the original table will be able to hold more entries than the
// original.
DEBUG_ASSERT(retry_num == 0);
Member

Use DCHECK_EQ instead.

variable_storage_required;
const std::size_t resized_storage_slots =
this->storage_manager_->SlotsNeededForBytes(resized_memory_required);
if (resized_storage_slots == 0) {
Member

Refactor using CHECK.

return true;
} else {
return false;
}
Member

Minor, but we could flip the condition:


  if (*entry_num >= header_->buckets_allocated.load(std::memory_order_relaxed)) {
    return false;
  }

  const char *bucket =
      static_cast<const char *>(buckets_) + (*entry_num) * bucket_size_;
  *key = key_manager_.getKeyComponentTyped(bucket, 0);
  *value = reinterpret_cast<const std::uint8_t *>(bucket + kValueOffset);
  ++(*entry_num);
  return true;

Similarly below.

@jianqiao
Contributor Author

jianqiao commented Feb 7, 2017

Regarding the question about PartitionedHashTablePool and HashTablePool: note that their use patterns are different, so perhaps it is not natural to merge them into one class.

PartitionedHashTablePool creates a fixed number of hash tables at construction. The use pattern is that every AggregationWorkOrder updates all of these hash tables and every FinalizeAggregationWorkOrder finalizes one of these hash tables.

HashTablePool creates hash tables on demand. The current use pattern is that every AggregationWorkOrder checks out one hash table for exclusive use, updates it, and returns it to the pool. Then a single FinalizeAggregationWorkOrder is created to merge all the tables in the pool into the final hash table.
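The on-demand pattern can be sketched roughly as follows. This is a simplified illustration only: OnDemandTablePool, its methods, and the use of std::unordered_map as the table type are assumptions and do not mirror the real HashTablePool interface.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in table type: group-by key -> SUM state.
using Table = std::unordered_map<std::int64_t, std::int64_t>;

class OnDemandTablePool {
 public:
  // Hands out a table for exclusive use, creating one only if none is free.
  std::unique_ptr<Table> checkOut() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (free_tables_.empty()) {
      return std::unique_ptr<Table>(new Table());
    }
    std::unique_ptr<Table> table = std::move(free_tables_.back());
    free_tables_.pop_back();
    return table;
  }

  // Returns a table to the pool once the work order is done with it.
  void giveBack(std::unique_ptr<Table> table) {
    std::lock_guard<std::mutex> lock(mutex_);
    free_tables_.push_back(std::move(table));
  }

  // Single finalization step: merges every pooled table into one result.
  Table mergeAll() {
    std::lock_guard<std::mutex> lock(mutex_);
    Table merged;
    for (const std::unique_ptr<Table> &table : free_tables_) {
      for (const std::pair<const std::int64_t, std::int64_t> &entry : *table) {
        merged[entry.first] += entry.second;
      }
    }
    return merged;
  }

 private:
  std::mutex mutex_;
  std::vector<std::unique_ptr<Table>> free_tables_;
};
```

A partitioned pool, by contrast, would allocate all its tables up front and let each finalization step handle one partition, which is the difference in use pattern described above.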

@zuyu
Member

zuyu commented Feb 7, 2017

Please resync with the master branch, and I will merge it. Thanks.

@jianqiao
Contributor Author

jianqiao commented Feb 7, 2017

Just rebased on master.

@jianqiao
Contributor Author

jianqiao commented Feb 7, 2017

There is a CMakeLists to be updated -- do not merge at this moment.

…egation for range-bounded single integer group-by key.

- Supports copy elision for aggregation.
@jianqiao
Contributor Author

jianqiao commented Feb 7, 2017

Updated, and tested locally.

@asfgit asfgit merged commit 2d89e4f into master Feb 7, 2017
@asfgit asfgit deleted the collision-free-agg branch February 7, 2017 22:35
@pateljm
Contributor

pateljm commented Feb 7, 2017

Very impressive algorithmic change @jianqiao !!
