Fix plan in case of select operation with custom sharding key #213

Closed
ligurio opened this issue Sep 14, 2021 · 0 comments · Fixed by #242
Labels: bug

ligurio commented Sep 14, 2021

Cases:

  • The primary key consists of fields 'id' and 'name'. A select with an == condition on field 'name' triggers map-reduce if there is no separate index on field 'name' that is itself named 'name'.
  • pk = {id1, id3}, sk1 = {id1, id2}, sk2 = {id3, id4}. When the conditions use both sk1 and sk2, the sharding key (the pair id1, id3) can be recovered as well.
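
To make the second case concrete, here is a minimal standalone sketch in plain Lua. It is not the crud implementation: the `indexes` table, the simplified `{operator, operand, values}` condition shape and the `recover_sharding_key` helper are all assumptions made for illustration.

```lua
-- Hypothetical, simplified model: each index is a list of field names.
local indexes = {
    sk1 = {'id1', 'id2'},
    sk2 = {'id3', 'id4'},
}
local sharding_key_fields = {'id1', 'id3'}

-- Conditions in a simplified form: {operator, index or field name, values}.
local conditions = {
    {'==', 'sk1', {10, 20}},
    {'==', 'sk2', {30, 40}},
}

-- Returns the sharding key values in order, or nil if any field is
-- not fixed by an equality condition.
local function recover_sharding_key(conditions, indexes, sharding_key_fields)
    local field_values = {}
    for _, cond in ipairs(conditions) do
        local operator, operand, values = cond[1], cond[2], cond[3]
        if operator == '==' then
            -- An operand that is not an index name is treated as a field name.
            local parts = indexes[operand] or {operand}
            for i, field_name in ipairs(parts) do
                if values[i] ~= nil and field_values[field_name] == nil then
                    field_values[field_name] = values[i]
                end
            end
        end
    end

    local key = {}
    for _, field_name in ipairs(sharding_key_fields) do
        local value = field_values[field_name]
        if value == nil then
            return nil
        end
        table.insert(key, value)
    end
    return key
end

local key = recover_sharding_key(conditions, indexes, sharding_key_fields)
```

With both conditions present, the key is assembled from the first part of sk1 and the first part of sk2. With only one of them, the helper returns nil and the request would still need map-reduce.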

rough patch:

diff --git a/crud/select/plan.lua b/crud/select/plan.lua
index 528ac54..7990fb4 100644
--- a/crud/select/plan.lua
+++ b/crud/select/plan.lua
@@ -49,6 +49,34 @@ local function get_index_for_condition(space_indexes, space_format, condition)
     end
 end
 
+-- Collect sharding key values from the conditions that cover sharding
+-- key fields with the '==' operator.
+local function extract_sharding_key_from_conditions(conditions, ddl_sharding_key)
+    if ddl_sharding_key == nil then
+        return nil
+    end
+
+    local conditions_map = {}
+    for _, condition in ipairs(conditions) do
+        conditions_map[condition.operand] = {
+            iter = condition.operator,
+            values = condition.values,
+        }
+    end
+
+    local sharding_key_values = {}
+    for _, field_name in ipairs(ddl_sharding_key) do
+        local condition = conditions_map[field_name]
+        if condition == nil or condition.iter ~= '==' then
+            -- A partial sharding key cannot identify a bucket.
+            return nil
+        end
+        table.insert(sharding_key_values, condition.values)
+    end
+
+    return sharding_key_values
+end
+
 local function extract_sharding_key_from_scan_value(scan_value, scan_index, sharding_index)
     if #scan_value < #sharding_index.parts then
         return nil
@@ -241,6 +269,9 @@ function select_plan.new(space, conditions, opts)
     if scan_value ~= nil and (scan_iter == box.index.EQ or scan_iter == box.index.REQ) then
         sharding_key = extract_sharding_key_from_scan_value(scan_value, scan_index, sharding_index)
     end
+    if sharding_key == nil then
+        sharding_key = extract_sharding_key_from_conditions(conditions, ddl_sharding_key)
+    end
 
     if sharding_key ~= nil and opts.force_map_call ~= true then
         total_tuples_count = 1

Part of #166

ligurio added a commit that referenced this issue Sep 16, 2021
Previously there were two different ways to obtain bucket id in CRUD:

- calculate bucket id automatically using primary key (default)
- pass it from outside explicitly in options on CRUD operation call

Users of the DDL module [1] may specify a sharding key (a list of tuple
field names), but it was not possible to use a DDL sharding key for
bucket id calculation. Now CRUD can use that custom sharding key to
calculate the bucket id. This happens automatically when a DDL schema
with a non-empty sharding_key [1] is used, or when the space
_ddl_sharding_key contains a tuple with the space name and its
sharding key.
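
As an illustration only, the idea of deriving a bucket id from sharding key fields instead of the primary key can be sketched in plain Lua as follows. The helper names are hypothetical, and `toy_bucket_id` uses a stand-in hash that is NOT the function vshard uses; it only shows that the bucket id depends solely on the sharding key values.

```lua
-- Pick the sharding key values out of an object by field name.
-- Returns nil and an error message if a field is missing.
local function extract_sharding_key(object, sharding_key_fields)
    local key = {}
    for _, field_name in ipairs(sharding_key_fields) do
        local value = object[field_name]
        if value == nil then
            return nil, ('sharding key field %q is missing'):format(field_name)
        end
        table.insert(key, value)
    end
    return key
end

-- Stand-in hash (assumption): maps a key to a bucket id in [1, bucket_count].
local function toy_bucket_id(key, bucket_count)
    local hash = 0
    for _, value in ipairs(key) do
        local str = tostring(value)
        for i = 1, #str do
            hash = (hash * 31 + str:byte(i)) % 2147483647
        end
    end
    return hash % bucket_count + 1
end

local customer = {id = 1, name = 'Ivan', age = 30}
local key = extract_sharding_key(customer, {'name'})
local bucket_id = toy_bucket_id(key, 3000)
```

Two objects with the same 'name' land in the same bucket here regardless of 'id', which is the property a custom sharding key is meant to provide.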

The table below describes which operations support a custom sharding key:

| CRUD method                  | Added sharding key support |
| ---------------------------- | -------------------------- |
| get()                        | Yes                        |
| insert() / insert_object()   | Yes                        |
| delete()                     | Yes                        |
| replace() / replace_object() | Yes                        |
| upsert() / upsert_object()   | Yes                        |
| select() / pairs()           | Yes                        |
| update()                     | Yes                        |
| min() / max()                | No (not required)          |
| cut_rows() / cut_objects()   | No (not required)          |
| truncate()                   | No (not required)          |
| len()                        | No (not required)          |

Limitations:

- Sharding keys cannot be updated automatically when the schema is
  updated on storages, see [2]. However, this can be done manually with
  sharding_key.update_sharding_keys_cache().
- CRUD select may lead to map-reduce in some cases, see [3].

1. https://github.com/tarantool/ddl
2. #212
3. #213

Closes #166
kyukhin added the teamE and bug labels Sep 24, 2021
ligurio added a commit that referenced this issue Sep 28, 2021
Describe functionality and current limitations (#212 and #213) with
custom sharding key in CHANGELOG and README.

Closes #166
ligurio added a commit that referenced this issue Sep 29, 2021
Describe functionality and current limitations (#212, #213 and #219)
with custom sharding key in CHANGELOG and README.

Closes #166
ligurio added a commit that referenced this issue Nov 19, 2021
Describe functionality and current limitations (#212, #213 and #219)
with custom sharding key in CHANGELOG and README.

Closes #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
AnaNek pushed a commit that referenced this issue Nov 24, 2021
Describe functionality and current limitations (#212, #213 and #219)
with custom sharding key in CHANGELOG and README.

Closes #166
DifferentialOrange added a commit that referenced this issue Nov 24, 2021
PR #181 introduced support for DDL sharding keys. But if the sharding
key has no separate index in the schema, a select with equality
conditions for all required sharding key fields still led to map-reduce
instead of a single storage call. This patch introduces improved
sharding key extraction and fixes the issue.

Closes #213
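
The difference between the two outcomes can be sketched as a tiny planner step. This is a simplified model rather than the crud code: `plan_storage_calls` and the `replicasets` list are hypothetical, and only the `force_map_call` option name comes from the patch in this issue.

```lua
-- Decide how many storages a select has to touch.
-- sharding_key is either a list of values or nil when it could not be extracted.
local function plan_storage_calls(sharding_key, replicasets, opts)
    opts = opts or {}
    if sharding_key ~= nil and opts.force_map_call ~= true then
        -- The key pins the request to one bucket, hence one replicaset.
        return {mode = 'single', calls = 1}
    end
    -- No usable key: ask every replicaset and merge the results.
    return {mode = 'map_reduce', calls = #replicasets}
end

local replicasets = {'s-1', 's-2', 's-3'}
local with_key = plan_storage_calls({'Ivan'}, replicasets)
local without_key = plan_storage_calls(nil, replicasets)
local forced = plan_storage_calls({'Ivan'}, replicasets, {force_map_call = true})
```

In this model, failing to extract the sharding key turns one call into one call per replicaset, which is why the extraction matters even when an index on the sharding key fields is absent.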
DifferentialOrange added a commit that referenced this issue Nov 24, 2021
Unit tests for the plan module fail to run without a configured
vshard.router after the introduction of improved sharding key extraction
in 6c67d9b. This patch sets up dummy 'crud.common.sharding_key' modules
for plan unit tests.

Follows up #213
DifferentialOrange added a commit that referenced this issue Nov 24, 2021
Unit tests for the plan module fail to run without a configured
vshard.router after the introduction of improved sharding key extraction
in 408d1cf. This patch sets up dummy 'crud.common.sharding_key' modules
for plan unit tests.

Follows up #213
ligurio added a commit that referenced this issue Nov 25, 2021
Describe functionality and current limitations (#212, #213 and #219)
with custom sharding key in CHANGELOG and README.

Closes #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
Co-authored-by: Georgy Moiseev <Georgy.moiseev@corp.mail.ru>
ligurio added a commit that referenced this issue Nov 26, 2021
Describe functionality and current limitations (#212, #213, #219, #243)
with custom sharding key in CHANGELOG and README.

Closes #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
Co-authored-by: Georgy Moiseev <Georgy.moiseev@corp.mail.ru>
ligurio added a commit that referenced this issue Nov 27, 2021
Describe functionality and current limitations (#212, #213, #219, #243)
with custom sharding key in CHANGELOG and README.

Thanks to Oleg Babin (@olegrok) and Alexander Turenko (@Totktonada) for
help with feature implementation.

Closes #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
Co-authored-by: Georgy Moiseev <Georgy.moiseev@corp.mail.ru>
Totktonada pushed a commit that referenced this issue Dec 1, 2021
PR #181 introduced support for DDL sharding keys. But if the sharding
key has no separate index in the schema, a select with equality
conditions for all required sharding key fields still led to map-reduce
instead of a single storage call. This patch introduces improved
sharding key extraction and fixes the issue.

Closes #213