This repository has been archived by the owner on May 9, 2024. It is now read-only.

Refragmenting table #628

Merged
merged 4 commits into from
Sep 1, 2023

Conversation

@akroviakov akroviakov commented Aug 10, 2023

This PR introduces table refragmentation, which avoids re-importing the same table just to change its fragment size.
Refragmenting a table is relatively cheap (as opposed to re-inserting it), since we don't have to modify the physical layout or location of the data in memory. The price we pay is recalculating the per-fragment metadata (essentially each fragment's offset, row count, and small materialized aggregates) in order to change the logical view of the table.

Addresses #572 .

Quite some time is spent in ArrowStorage::computeStats(); the code in FixedLengthEncoder::updateStatsEncoded() appears to be scalar, so there is a possible performance improvement in computing min/max and detecting nulls using SIMD.
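For illustration, a branch-light scan like the following tends to auto-vectorize well; the ColumnStats type and computeInt32Stats function are hypothetical names for this sketch, not HDK's actual FixedLengthEncoder API:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical sketch of a vectorization-friendly min/max/null scan.
// Nulls are detected via a sentinel value and masked out of the min/max
// updates, keeping the loop body free of unpredictable branches.
struct ColumnStats {
  int32_t min = std::numeric_limits<int32_t>::max();
  int32_t max = std::numeric_limits<int32_t>::min();
  bool has_nulls = false;
};

ColumnStats computeInt32Stats(const int32_t* data, size_t size,
                              int32_t null_sentinel) {
  ColumnStats s;
  bool nulls = false;
  int32_t mn = s.min;
  int32_t mx = s.max;
  for (size_t i = 0; i < size; ++i) {
    const int32_t v = data[i];
    const bool is_null = (v == null_sentinel);
    nulls |= is_null;
    mn = std::min(mn, is_null ? mn : v);  // nulls leave the running min alone
    mx = std::max(mx, is_null ? mx : v);  // and the running max
  }
  s.min = mn;
  s.max = mx;
  s.has_nulls = nulls;
  return s;
}
```

The select-style `is_null ? mn : v` updates map to SIMD blend instructions, which is the shape a compiler needs to vectorize the min/max reduction.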

@ienkovich (Contributor)

It is not enough to simply refragment a table inside the storage. Assigning a new fragment size changes which data is assigned to already-used chunk IDs. This means all data cached in buffer managers becomes stale, but it would still be used by subsequent queries.

There is other data that is referenced by chunk ID and can be cached, e.g. chunk stats.

So, refragmenting needs additional cleanups but it's not clear how to do it all atomically.

Did you consider creating a new table as a result of refragmentation? Same price, no conflicts.
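The staleness hazard can be shown with a toy cache keyed by chunk ID (purely illustrative; the BufferManager and ChunkKey below are hypothetical stand-ins, not HDK's real buffer manager):

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Toy illustration: chunks are cached by {table_id, fragment_id}. If
// refragmentation reassigns row ranges to already-used keys without
// invalidating the cache, lookups keep returning the old rows.
using ChunkKey = std::pair<int, int>;  // {table_id, fragment_id}

struct BufferManager {
  std::map<ChunkKey, std::vector<int>> cache;

  // Returns the cached chunk if present; otherwise caches what storage holds.
  const std::vector<int>& fetch(const ChunkKey& key,
                                const std::vector<int>& storage_rows) {
    auto it = cache.find(key);
    if (it != cache.end()) {
      return it->second;  // cache hit: storage is never consulted again
    }
    return cache.emplace(key, storage_rows).first->second;
  }
};
```

With fragment size 2, chunk {1, 0} caches rows {10, 20}; after refragmenting to size 3 the same key should cover {10, 20, 30}, but the cache keeps serving the old two rows.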

@akroviakov (Contributor, Author)

@ienkovich Do we update anything above the storage layer when we append to a table, the last fragment wasn't full, and it now starts to span Arrow chunk boundaries (so we update its metadata)? https://github.com/intel-ai/hdk/blob/ad924b50a2e909a98efba9a6363f9b5415742bf6/omniscidb/ArrowStorage/ArrowStorage.cpp#L876

@ienkovich (Contributor)

> @ienkovich Do we update anything above the storage layer when we append to a table, the last fragment wasn't full, and it now starts to span Arrow chunk boundaries (so we update its metadata)?

Appending data was accounted for in the buffer managers' original design: a chunk's size only grows, and buffer managers ask the storage for the new data. If you reduce the size of a chunk, they would simply keep using the old data.

mapd_unique_lock<mapd_shared_mutex> table_lock(table.mutex);
data_lock.unlock();
const size_t new_frag_count =
    table.row_count / new_frag_size + ((table.row_count % new_frag_size) != 0);
Contributor

Maybe std::ceil? It would be clearer.

? dynamic_cast<const hdk::ir::ArrayBaseType*>(col_type)->elemType()
: col_type;
bool compute_stats = !col_type->isString();
if (compute_stats) {
Contributor

I think it's better to move this condition out of the loop. You have already checked earlier which frag_ids have a String type.

for (auto& pr : column_infos_) {
if (pr.second->type->isExtDictionary()) {
dicts_to_remove.erase(pr.second->type->as<hdk::ir::ExtDictionaryType>()->dictId());
if (!is_view) {
Contributor

Please simplify it, e.g. with an early return:

if (is_view) {
  return;
}

for (auto& col_info : col_infos) {
if (col_info->type->isExtDictionary()) {
dicts_to_remove.insert(col_info->type->as<hdk::ir::ExtDictionaryType>()->dictId());
if (!is_view) {
Contributor

Same simplification here, please.

TableInfoPtr createRefragmentedView(const std::string& table_name,
const std::string& new_table_name,
const size_t new_frag_size);
void refragmentTable(const std::string& table_name, const size_t new_frag_size);
Contributor

Do we need these methods to be public? As far as I can see, getRefragmentedView uses both of them; maybe that single method is enough?

@@ -128,7 +131,7 @@ class ArrowStorage : public SimpleSchemaProvider, public AbstractDataProvider {
void appendParquetFile(const std::string& file_name, int table_id);

void dropTable(const std::string& table_name, bool throw_if_not_exist = false);
void dropTable(int table_id, bool throw_if_not_exist = false);
void dropTable(int table_id, bool is_view = false, bool throw_if_not_exist = false);
Contributor

Do we guarantee the table exists if there's a view for it? What happens to views if a non-view table is dropped?

@ienkovich ienkovich (Contributor) left a comment

The general implementation looks fine, but there are some synchronization issues. To correctly create a new table you need to:

  1. Get data unique lock
  2. Get schema unique lock
  3. Get dict unique lock
  4. Make all checks and add a new table + modify dict data
  5. Release schema and dict locks
  6. Create new table data
  7. Get a table unique lock
  8. Release the data lock
  9. Make the rest of the manipulations with the table
  10. Release the table lock

It's hard to do that when all three methods take the table name as a parameter. I suggest making createRefragmentedView the entry point (a better name than getRefragmentedView, because it is not a simple getter), doing steps 1-8 there, and then calling refragmentTable with a TableData reference as a parameter. Then you should be thread-safe.
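A sketch of that lock hand-off (illustrative only: the Storage and TableData types below are simplified stand-ins for HDK's, and the dict lock from steps 3-5 is omitted for brevity):

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

struct TableData {
  std::shared_mutex mutex;
  size_t row_count = 0;
};

struct Storage {
  std::shared_mutex data_mutex;    // guards tables_
  std::shared_mutex schema_mutex;  // guards schema state
  std::unordered_map<int, std::unique_ptr<TableData>> tables_;

  TableData& createRefragmentedView(int new_table_id) {
    // Steps 1-2: take the outer unique locks in a fixed order.
    std::unique_lock data_lock(data_mutex);
    std::unique_lock schema_lock(schema_mutex);
    // Step 4: perform checks and register the new table under both locks.
    auto [it, inserted] =
        tables_.emplace(new_table_id, std::make_unique<TableData>());
    assert(inserted);
    // Step 5: schema state is now consistent; release the schema lock.
    schema_lock.unlock();
    TableData& table = *it->second;
    // Steps 7-8: lock the new table, then release the data lock so other
    // tables remain accessible while this one is being filled in.
    std::unique_lock table_lock(table.mutex);
    data_lock.unlock();
    // Step 9: the per-table work (e.g. building fragments) happens here,
    // protected only by table_lock.
    table.row_count = 0;
    return table;
  }  // Step 10: table_lock is released on scope exit.
};
```

The key property is that some lock always covers the new table from creation until it is fully initialized, while the coarse data and schema locks are held only as long as strictly necessary.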

@@ -128,7 +131,7 @@ class ArrowStorage : public SimpleSchemaProvider, public AbstractDataProvider {
void appendParquetFile(const std::string& file_name, int table_id);

void dropTable(const std::string& table_name, bool throw_if_not_exist = false);
void dropTable(int table_id, bool throw_if_not_exist = false);
void dropTable(int table_id, bool is_view = false, bool throw_if_not_exist = false);
Contributor

I don't see any reason to care about view vs. non-view tables here. I think it's better to make no distinction. The fact that two tables share Arrow data shouldn't affect anything, since this data is immutable and no additional control over its lifetime is required.

TableInfoPtr ArrowStorage::createRefragmentedView(const std::string& table_name,
const std::string& new_table_name,
const size_t new_frag_size) {
if (getTableInfoNoLock(db_id_, new_table_name)) {
Contributor

You shouldn't access the schema provider's data without obtaining a lock. Since you are going to add a new table, a unique lock should be obtained first; you can release the schema lock after the new table is created.

}
}

auto [iter, inserted] = tables_.emplace(new_table_id, std::make_unique<TableData>());
Contributor

You shouldn't modify the tables_ member without obtaining a unique data lock first. After adding a new table, you can unique-lock the table and release the data lock.

@@ -1052,10 +1212,10 @@ void ArrowStorage::dropTable(const std::string& table_name, bool throw_if_not_ex
}
return;
}
dropTable(tinfo->table_id);
dropTable(tinfo->table_id, tinfo->is_view);
Contributor

Please don't add new is_view usages; this field is legacy and has to be removed.

col_type);
}
} else {
if (col_type->isString()) {
Contributor

I don't see how this change is related to the new feature. Can we remove it from the patch?

throw std::runtime_error("Cannot refragment to fragment size 0");
}

if (table_name.empty()) {
Contributor

new_table_name?

void ArrowStorage::refragmentTable(TableData& table,
const int table_id,
const size_t new_frag_size) {
if (!new_frag_size) {
Contributor

This duplicated check gives the impression that we create a new table and then leave it with uninitialized fragments in the case of a zero fragment size. I'd use CHECK here instead.

throw std::runtime_error("Cannot refragment to fragment size 0");
}
const size_t new_frag_count =
std::ceil(static_cast<float>(table.row_count) / new_frag_size);
Contributor

Float precision is not good enough here: if you have 1'000'000'001 rows and the fragment size is 10'000'000, the result would be 100 fragments, not 101. Use integer arithmetic to avoid precision issues.
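For reference, here is a standalone sketch contrasting the exact integer form (which the original diff used) with the float version at the scale mentioned above:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Exact ceiling division using integer arithmetic only.
size_t ceil_div(size_t rows, size_t frag_size) {
  return rows / frag_size + (rows % frag_size != 0);
}

// float has a 24-bit mantissa, so 1'000'000'001 rounds to 1'000'000'000
// before the division even happens, and the extra fragment is lost.
size_t ceil_div_float(size_t rows, size_t frag_size) {
  return static_cast<size_t>(std::ceil(static_cast<float>(rows) / frag_size));
}
```

Using `double` would also fix this particular case, but the integer form is exact for all inputs and avoids the cast entirely.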

@@ -131,7 +131,7 @@ cdef extern from "omniscidb/ArrowStorage/ArrowStorage.h":
CArrowStorage(int, string, int, shared_ptr[CConfig]) except +;

CTableInfoPtr createTable(const string&, const vector[CColumnDescription]&, const CTableOptions&) except +

CTableInfoPtr createRefragmentedView(const string& , const string&, const size_t) except +
Contributor

Our main API access point is the pyhdk.HDK class. Do you use it in your demo? I'd expect the usage scenario to be something like:

ht1 = hdk.import_csv(file, fragment_size=10)
...
ht2 = ht1.refragmented_view(fragment_size=20)
...

Contributor Author

The demo will be redone in a separate PR.

@akroviakov akroviakov force-pushed the akroviak/varied_frag_size branch 2 times, most recently from 321b258 to 6775459 Compare August 30, 2023 07:23
@ienkovich ienkovich (Contributor) left a comment

Looks good!

@alexbaden (Contributor)

It would be nice to have a googletest/C++ test for this. Also, I am wondering what happens if you enable lazy dictionary materialization and then re-fragment. Could we add something to ArrowStorageTest.cpp?

@kurapov-peter kurapov-peter merged commit a75f540 into main Sep 1, 2023
23 checks passed
@kurapov-peter kurapov-peter deleted the akroviak/varied_frag_size branch September 1, 2023 14:36