Add AVX-512VL to dynamic dispatch and optimise QBit [un]transposition #88445
Workflow [PR], commit [47c15c5] Summary: ❌
| 11100000 | ||
| 11100000 | ||
| 11100000 | ||
| 11100000 | ||
| 11100000 | ||
| 11100000 | ||
| 00000111 | ||
| 00000111 | ||
| 00000111 | ||
| 00000111 | ||
| 00000111 | ||
| 00000111 |
The internal representation of values was reversed to simplify serialization, which is why these changed.
| SELECT vec.7 FROM qbit ORDER BY id; | ||
| SELECT vec.15 FROM qbit ORDER BY id; | ||
| SELECT vec.23 FROM qbit ORDER BY id; | ||
| SELECT vec.31 FROM qbit ORDER BY id; | ||
| SELECT bin(vec.7) FROM qbit ORDER BY id; | ||
| SELECT bin(vec.15) FROM qbit ORDER BY id; | ||
| SELECT bin(vec.23) FROM qbit ORDER BY id; | ||
| SELECT bin(vec.31) FROM qbit ORDER BY id; |
It makes sense to look at the underlying binary values, because displaying bytes as characters doesn't tell us what went wrong.
| if (size > DEFAULT_MAX_STRING_SIZE) | ||
| throw Exception(ErrorCodes::TOO_LARGE_ARRAY_SIZE, "Too large QBit dimension (maximum: {})", DEFAULT_MAX_STRING_SIZE); |
We have a setting max_binary_array_size in FormatSettings for binary formats. Let's use it instead of DEFAULT_MAX_STRING_SIZE.
| /// If the dimension % 8 != 0, the buffer will contain padding floats. Thus, `size` can be larger, equal, but never smaller than dimension | ||
| if (size < dimension) | ||
| throw Exception( | ||
| ErrorCodes::SERIALIZATION_ERROR, "Size of the read QBit {} doesn't match expected size {}", size, (dimension / 8) * 8); | ||
|
|
||
| return size; |
Wait, does this mean that in RowBinary format we output padding floats? If so, we need to fix this: we should output an array with dimension floats. Otherwise users will get unexpected zeros in their vectors during deserialization.
Isn't the client aware of the dimension? It is one of the members of SerializationQBit. If not, it might also be a good idea to remove

    if (size != dimension)
        throw Exception(
            ErrorCodes::SERIALIZATION_ERROR, "Dimension of the read QBit {} doesn't match expected dimension {}", size, dimension);

in validateAndReadQBitSize too.
I removed the trailing zeroes.
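The outcome of this thread (keep padding internal, never emit it in a format) can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper names `paddedDimension`, `stripPadding` and `addPadding` are hypothetical, and rounding up to a multiple of 8 is an assumption based on the `dimension % 8` comment in the diff above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: round the dimension up to the next multiple of 8.
// The multiple-of-8 rule is an assumption taken from the `dimension % 8` comment.
inline std::size_t paddedDimension(std::size_t dimension)
{
    return (dimension + 7) / 8 * 8;
}

// Hypothetical helper: drop the internal padding before writing,
// so the output contains exactly `dimension` floats.
inline std::vector<float> stripPadding(const std::vector<float> & padded, std::size_t dimension)
{
    return std::vector<float>(padded.begin(), padded.begin() + static_cast<std::ptrdiff_t>(dimension));
}

// Hypothetical helper: restore zero padding after reading `dimension` floats,
// so that internal code can keep working on whole groups of 8.
inline std::vector<float> addPadding(const std::vector<float> & values)
{
    std::vector<float> padded(values);
    padded.resize(paddedDimension(values.size()), 0.0f);
    return padded;
}
```

With this split, a user who writes a 3-element vector reads back 3 elements, and the zeros exist only in the in-memory buffer.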
| } | ||
|
|
||
| const char * value_bytes = reinterpret_cast<const char *>(value_floats.data()); | ||
| /// We do not need to worry about skipping padding floats at the tail here like we do in deserializeFloatsToQBitTuple(...) . |
We just should not have padding floats in any format at all.
Avogar
left a comment
Just 2 small comments, everything else looks good
| /// Transpose data | ||
| std::vector<char> transposed_bytes(bytes_per_fixedstring * element_size); | ||
| transposeBits<Word>(reinterpret_cast<const Word *>(value_bytes), reinterpret_cast<Word *>(transposed_bytes.data()), padded_n); | ||
| while (i < dimension) |
Now it can be just a simple for loop.
Co-authored-by: Pavel Kruglov <48961922+Avogar@users.noreply.github.com>
…se/ClickHouse into qbit-transposition-optimisation
Not for changelog because QBit hasn't been released yet. New things in this PR:
- Added the VL instruction set as an option for AVX-512 dispatch. Previously, it was only available with VBMI instructions, but there are machines that have VL without VBMI. This change was needed for ↓.
- Optimised QBit untransposition, speeding up distance calculations on it. To make this possible, simplified serialisation and removed the MSB -> LSB -> MSB trickery we used before.
- QBit serialisation and gtest_qbit_serialization.cpp test fix.
- QBit [de]serialised the trailing padding zeroes that QBit had. This was not necessary, as we read/write a vector. Thus, the zeroes were removed.

Changelog category (leave one):
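For readers unfamiliar with the transposition mentioned in the description, here is a minimal scalar sketch of the idea: bit b of every 32-bit word is gathered into bit plane b, and untransposition reverses that. It is not the vectorised code from this PR. The function names `transposeBitsScalar` and `untransposeBitsScalar` are hypothetical, and the bit order (plane 0 holds the least significant bit, element i maps to bit i % 8 of byte i / 8) is an assumption chosen for illustration, not necessarily the layout QBit uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using BitPlanes = std::vector<std::vector<std::uint8_t>>;

// Hypothetical scalar reference: plane b collects bit b of every input word.
// Assumed layout: element i is stored in bit (i % 8) of byte (i / 8) of each plane.
inline BitPlanes transposeBitsScalar(const std::vector<std::uint32_t> & words)
{
    const std::size_t bytes_per_plane = (words.size() + 7) / 8;
    BitPlanes planes(32, std::vector<std::uint8_t>(bytes_per_plane, 0));
    for (std::size_t i = 0; i < words.size(); ++i)
        for (std::size_t b = 0; b < 32; ++b)
            if ((words[i] >> b) & 1u)
                planes[b][i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
    return planes;
}

// Inverse operation: rebuild n words from the 32 bit planes.
inline std::vector<std::uint32_t> untransposeBitsScalar(const BitPlanes & planes, std::size_t n)
{
    std::vector<std::uint32_t> words(n, 0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t b = 0; b < 32; ++b)
            if ((planes[b][i / 8] >> (i % 8)) & 1u)
                words[i] |= (std::uint32_t{1} << b);
    return words;
}
```

A scalar version like this is mainly useful as a reference to test a SIMD implementation against: both must agree on every input.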