Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

column_copy: add support for weight vector #1379

Merged
merged 6 commits into from
Aug 2, 2022

Conversation

komainu8
Copy link
Member

@komainu8 komainu8 commented Aug 1, 2022

I fixed the result of column_copy is incorrect against Float32 values as below.

Problem:

Values of destination_tags column incorrect.

table_create Tags TABLE_HASH_KEY ShortText
[[0,0.0,0.0],true]
table_create CopyFloat32Value TABLE_HASH_KEY ShortText
[[0,0.0,0.0],true]
column_create CopyFloat32Value source_tags COLUMN_VECTOR|WITH_WEIGHT|WEIGHT_FLOAT32 Tags
[[0,0.0,0.0],true]
column_create CopyFloat32Value destination_tags COLUMN_VECTOR|WITH_WEIGHT|WEIGHT_FLOAT32 Tags
[[0,0.0,0.0],true]
load --table CopyFloat32Value
[
{
  "_key": "Groonga is fast!!!",
  "source_tags": {
    "Groonga": 2.8,
    "full text search": 1.5
  }
}
]
[[0,0.0,0.0],1]
column_copy CopyFloat32Value source_tags CopyFloat32Value destination_tags
[[0,0.0,0.0],true]
select CopyFloat32Value
[
  [
    0,
    0.0,
    0.0
  ],
  [
    [
      [
        1
      ],
      [
        [
          "_id",
          "UInt32"
        ],
        [
          "_key",
          "ShortText"
        ],
        [
          "destination_tags",
          "Tags"
        ],
        [
          "source_tags",
          "Tags"
        ]
      ],
      [
        1,
        "Groonga is fast!!!",
        {
          "Groonga": 2.802597e-45,
          "full text search": 0.0
        },
        {
          "Groonga": 2.8,
          "full text search": 1.5
        }
      ]
    ]
  ]
]

Expected:

Values of destination_tags column same as values of source_tags column as below.

column_copy CopyFloat32Value source_tags CopyFloat32Value destination_tags
[[0,0.0,0.0],true]
select CopyFloat32Value
[
  [
    0,
    0.0,
    0.0
  ],
  [
    [
      [
        1
      ],
      [
        [
          "_id",
          "UInt32"
        ],
        [
          "_key",
          "ShortText"
        ],
        [
          "destination_tags",
          "Tags"
        ],
        [
          "source_tags",
          "Tags"
        ]
      ],
      [
        1,
        "Groonga is fast!!!",
        {
          "Groonga": 2.8,
          "full text search": 1.5
        },
        {
          "Groonga": 2.8,
          "full text search": 1.5
        }
      ]
    ]
  ]
]

@komainu8 komainu8 changed the title test: add a test for supporting float32 value column_copy: add a support for float32 value Aug 1, 2022
@komainu8 komainu8 marked this pull request as ready for review August 1, 2022 07:41
@komainu8
Copy link
Member Author

komainu8 commented Aug 1, 2022

@kou Could you review this PR?

lib/column.c Outdated
GRN_BULK_REWIND(&validated_value);
grn_obj_get_value(ctx, from_column, id, &value);
grn_obj_get_value(ctx, from_column, id, &validated_value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If value is valid value, it seems that value and validated_value expect into same value.
Therefore, I added a prepared process same as value on validated_value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validated_value has duplicated values because grn_obj_get_value(ctx, &validated_value) and grn_uvector_add_element_record() adds the same value twice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh...
I see. I understand. Thank you for your comment.

lib/column.c Outdated
Comment on lines 269 to 270
grn_obj_get_value(ctx, from_column, id, &value);
GRN_BULK_REWIND(&validated_value);
grn_obj_get_value(ctx, from_column, id, &value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change the order?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I thought it would be easier to read if the same processes were grouped together.

@kou
Copy link
Member

kou commented Aug 1, 2022

Does this work?

diff --git a/lib/column.c b/lib/column.c
index 32555a119..0f23f8fe8 100644
--- a/lib/column.c
+++ b/lib/column.c
@@ -262,13 +262,15 @@ grn_column_copy_same_table(grn_ctx *ctx,
   if (need_validate) {
     grn_obj validated_value;
     GRN_RECORD_INIT(&validated_value, to_range_flags, to_range_id);
+    if (to_range_flags & GRN_OBJ_WITH_WEIGHT) {
+      validated_value.header.flags |= GRN_OBJ_WITH_WEIGHT;
+    }
     bool is_uvector = grn_obj_is_uvector(ctx, &validated_value);
     grn_obj *range = grn_ctx_at(ctx, to_range_id);
     GRN_TABLE_EACH_BEGIN(ctx, table, cursor, id) {
       GRN_BULK_REWIND(&value);
-      GRN_BULK_REWIND(&validated_value);
       grn_obj_get_value(ctx, from_column, id, &value);
-      grn_obj_get_value(ctx, from_column, id, &validated_value);
+      GRN_BULK_REWIND(&validated_value);
       if (is_uvector) {
         uint32_t n_elements = grn_uvector_size(ctx, &value);
         uint32_t i;

@komainu8
Copy link
Member Author

komainu8 commented Aug 1, 2022

Thank you for your suggested code.
It works.

@kou
Copy link
Member

kou commented Aug 2, 2022

Could you also add a test for only WITH_WEIGHT (no WEIGHT_FLOAT32)?

It seems that this is not related to WEIGHT_FLOAT32. This is just only related to WITH_WEIGHT.

@kou kou changed the title column_copy: add a support for float32 value column_copy: add support for weight vector Aug 2, 2022
Because this problem is not related to WEIGHT_FLOAT32.
This is just only related to WITH_WEIGHT.
@komainu8
Copy link
Member Author

komainu8 commented Aug 2, 2022

Could you also add a test for only WITH_WEIGHT (no WEIGHT_FLOAT32)?
It seems that this is not related to WEIGHT_FLOAT32. This is just only related to WITH_WEIGHT.

Thank you for your comment. I see.
I added the test at 846e6c3.

@@ -0,0 +1,19 @@
table_create Tags TABLE_HASH_KEY ShortText
Copy link
Member

@kou kou Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you rename this file to weight_float32.test?

Copy link
Member Author

@komainu8 komainu8 Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.
I fixed the file name at 7b1ab44.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. weight_float32.test...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh.
I fixed the file name again at 4da11d7.

@kou kou merged commit e68b092 into master Aug 2, 2022
@kou kou deleted the add-support-float32-for-column-copy branch August 2, 2022 07:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants