-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storage dictionary #763
Storage dictionary #763
Conversation
dbms/src/Dictionaries/IDictionary.h
Outdated
@@ -53,6 +55,8 @@ struct IDictionaryBase | |||
|
|||
virtual bool isInjective(const std::string & attribute_name) const = 0; | |||
|
|||
virtual BlockInputStreamPtr getBlockInputStream(const Names & column_names) const = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment.
{ | ||
const auto keys_size = key_columns.size(); | ||
size_t sum_keys_size{}; | ||
for (const auto i : ext::range(0, keys_size)) | ||
|
||
Arena arena; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Temporary Arena for every row is very expensive.
Better to implement overload IColumn::serializeValueIntoArena
for ArenaWithFreeLists
.
(or call it IColumn::serializeValueIntoArenaWithFreeLists
and provide a wrapper here.)
@@ -208,6 +210,10 @@ class CacheDictionary final : public IDictionary | |||
const std::vector<Key> & requested_ids, PresentIdHandler && on_cell_updated, | |||
AbsentIdHandler && on_id_not_found) const; | |||
|
|||
PaddedPODArray<Key> getCachedIds() const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should not be public.
BTW, you don't have the same method in ComplexKeyCacheDictionary
and this is inconsistent. Better to remove this method from here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to private
} | ||
|
||
using BlockInputStreamType = DictionaryBlockInputStream<ComplexKeyCacheDictionary, UInt64>; | ||
return std::move(std::make_unique<BlockInputStreamType>(*this, keys, column_names)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not std::make_shared
instead of std::move(std::make_unique<BlockInputStreamType>(
?
std::vector<StringRef> ComplexKeyHashedDictionary::getKeys(const Attribute & attribute) const | ||
{ | ||
const ContainerType<T> & attr = *std::get<ContainerPtrType<T>>(attribute.maps); | ||
std::vector<StringRef> keys; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should reserve space for vector.
In other places too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added reserve call
BlockInputStreamPtr ComplexKeyHashedDictionary::getBlockInputStream(const Names & column_names) const | ||
{ | ||
using BlockInputStreamType = DictionaryBlockInputStream<ComplexKeyHashedDictionary, UInt64>; | ||
return std::move(std::make_unique<BlockInputStreamType>(*this, getKeys(), column_names)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See prev. comment.
|
||
/* | ||
* BlockInputStream implementation for external dictionaries | ||
* read() returns single block consisting of the in-memory contents of the dictionaries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single block will be too large for some dictionaries. It's better to generate blocks of max_block_size.
if (attribute) { | ||
name = attribute->name; | ||
} | ||
if (column_names.find(name) != column_names.end()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style.
columns.emplace_back(column, attribute.type, attribute.name); | ||
} | ||
} | ||
block = Block(columns); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now std::move
will be appropriate.
dbms/src/Storages/StorageFactory.cpp
Outdated
@@ -262,7 +249,7 @@ StoragePtr StorageFactory::get( | |||
} | |||
else if (name == "View") | |||
{ | |||
return StorageView::create( | |||
return StorageView::create( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should setup your IDE to trim trailing whitespaces.
dbms/src/Storages/StorageFactory.cpp
Outdated
@@ -225,15 +225,6 @@ static void setGraphitePatternsFromConfig(const Context & context, | |||
} | |||
|
|||
|
|||
/// Some types are only for intermediate values of expressions and cannot be used in tables. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bad merge. You have deleted my changes.
} | ||
}; | ||
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to setup your IDE to add newlines at end of file. This does not matter a lot (for C++11), but will be more consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
…ion in ComplexKeyCacheDictionary; style fixes
1e25259
to
38e9e83
Compare
Added:
IBlockInputStream
representing in-memry stored dictionary dataChanged:
ComplexKeyHashDictionary
andComplexKeyCacheDictionary
key serialization format