Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FeatureStore.get_online_features is broken with int64 vals #1908

Closed
achals opened this issue Sep 25, 2021 · 9 comments · Fixed by #2944
Closed

FeatureStore.get_online_features is broken with int64 vals #1908

achals opened this issue Sep 25, 2021 · 9 comments · Fixed by #2944

Comments

@achals
Copy link
Member

achals commented Sep 25, 2021

Expected Behavior

I should be able to use an int64 value as the join key value for an entity, when the value type is value_type=ValueType.INT64.

Current Behavior

Only values -2147483648 <= number <= 2147483647 can be used as the value. This is equivalent to -2^31 <= number <= 2^31 - 1

Steps to reproduce

Specify a value greater than 2^31 - 1 as a the join key value when looking up values from the feature store.

>>> fs.get_online_features(features=['driver_hourly_stats:conv_rate'],  entity_rows=[{'driver_id': 2147483648}]).to_df()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/achal/tecton/feast/sdk/python/feast/usage.py", line 202, in exception_logging_wrapper
    result = func(*args, **kwargs)
  File "/Users/achal/tecton/feast/sdk/python/feast/feature_store.py", line 897, in get_online_features
    union_of_entity_keys,
  File "/Users/achal/tecton/feast/sdk/python/feast/feature_store.py", line 924, in _populate_result_rows_from_feature_view
    requested_features=requested_features,
  File "/Users/achal/tecton/feast/sdk/python/feast/infra/passthrough_provider.py", line 81, in online_read
    result = self.online_store.online_read(config, table, entity_keys)
  File "/Users/achal/tecton/feast/sdk/python/feast/infra/online_stores/redis.py", line 190, in online_read
    redis_key_bin = _redis_key(project, entity_key)
  File "/Users/achal/tecton/feast/sdk/python/feast/infra/online_stores/helpers.py", line 40, in _redis_key
    key: List[bytes] = [serialize_entity_key(entity_key), project.encode("utf-8")]
  File "/Users/achal/tecton/feast/sdk/python/feast/infra/key_encoding_utils.py", line 41, in serialize_entity_key
    val_bytes, value_type = _serialize_val(v.WhichOneof("val"), v)
  File "/Users/achal/tecton/feast/sdk/python/feast/infra/key_encoding_utils.py", line 17, in _serialize_val
    return struct.pack("<l", v.int64_val), ValueType.INT64
struct.error: 'l' format requires -2147483648 <= number <= 2147483647

Possible Solution

The trivial fix is to make this line use "q" as the format string, which refers to a signed long long value which is 8 bytes in size.

Changing this behavior is a backwards incompatible change, since it would change the bytes encoded for an entity key which takes on int64 values. This means that materialized records oh that value type would would be unreadable.

The "correct" way of fixing this would require us to version the encoding scheme in the feature store.

@woop
Copy link
Member

woop commented Sep 26, 2021

Only values -2147483648 <= number <= 2147483647 can be used as the value. This is equivalent to -2^31 <= number <= 2^31 - 1

Can you explain why only that range is available today? Where is the existing bug?

@achals
Copy link
Member Author

achals commented Sep 27, 2021

Only values -2147483648 <= number <= 2147483647 can be used as the value. This is equivalent to -2^31 <= number <= 2^31 - 1

Can you explain why only that range is available today? Where is the existing bug?

Entity keys of type int64 can actually only take int32 values. A larger value will break materialization.

@m0hitbansal
Copy link

Hi
I have also got the same error after I make changes in struct.pack("<l", v.int64_val), ValueType.INT64 it is working.

@NalinGHub
Copy link
Contributor

NalinGHub commented Jan 21, 2022

Hi! I'm new to feast but I would like to take up this issue if possible. I understand how the error is occurring and the drawbacks of the naive solution. However, for the desired solution, are there any resources that I could look at to understand the encoding scheme and how to work with the versioning? I understand how serialize_entity_key() is used in compute_entity_key() for online stores, but I am unsure on how to 'version' this fix with protobuf. Apologies if that's a beginner question. Thanks!

edit: currently looking at https://googleapis.dev/python/protobuf/latest/ for python protobuf generation, please lmk if this is the right direction! thanks :)

Would it be recommended to edit the Value.proto file? If I am understanding this issue correctly, a new val can be added (e.g. int64 int64long_val) that can be ignored by past versions in the proto message. This will maintain backwards compatibility while also allowing the current encoding scheme to change? This may not be the desired change though, as this is not a clean fix and may not be best practice to just add a new proto field for every issue like this. Again please correct me if I am mistaken!

@NalinGHub
Copy link
Contributor

NalinGHub commented Jan 25, 2022

With regards to versioning the encoding scheme as recommended in the original post, is the recommended action to version the proto files themselves? For example, the grpc repo uses a versioning scheme for their protobuf files (https://github.com/grpc/grpc/tree/master/src/proto/grpc/) in the case of a breaking change. To my understanding, this type of fix would maintain backwards compatibility for clients using the older proto version, while also being able to make this change in the proto message and maintaining forwards compatibility.
This idea is also discussed here (albeit with C#): https://docs.microsoft.com/en-us/aspnet/core/grpc/versioning?view=aspnetcore-6.0
Thank you and please let me know if I should be trying anything different!

@adchia
Copy link
Collaborator

adchia commented Jan 31, 2022

Hi! Sorry for the delayed response.

You can edit the Value.proto directly, but not sure how this addresses this issue since Value has an int64 field already.

Probably it makes more sense to add an encoding version in EntityKeyProto, and then pass it along to the method that achal referred to?

@stale
Copy link

stale bot commented Jun 12, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Jun 12, 2022
@vstrimaitis
Copy link

Would be great to not close this issue because it's still relevant :) Or should the other one be used to track the progress?

@stale stale bot removed the wontfix This will not be worked on label Jun 13, 2022
@ShubhamVashisth7
Copy link

ShubhamVashisth7 commented Jun 29, 2022

I am getting the exact same error while materializingMaterializing 652 feature views to 2022-06-11 00:00:00+00:00 into the sqlite online store.

Feature_view_41 from 2022-05-30 16:57:02+00:00 to 2022-06-11 00:00:00+00:00:
100%|█████████████████████████████████████████████████████████████| 45/45 [00:00<00:00, 1977.57it/s]
Feature_view_545 from 2022-05-30 16:57:03+00:00 to 2022-06-11 00:00:00+00:00:
100%|██████████████████████████████████████████████████████████████| 53/53 [00:00<00:00, 730.42it/s]
Feature_view_345 from 2022-05-30 16:57:03+00:00 to 2022-06-11 00:00:00+00:00:
100%|███████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 1050.04it/s]
Feature_view_651 from 2022-05-30 16:57:03+00:00 to 2022-06-11 00:00:00+00:00:
100%|███████████████████████████████████████████████████████████| 871/871 [00:00<00:00, 7257.81it/s]
Feature_view_261 from 2022-05-30 16:57:03+00:00 to 2022-06-11 00:00:00+00:00:
100%|████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 774.94it/s]
Feature_view_557 from 2022-05-30 16:57:03+00:00 to 2022-06-11 00:00:00+00:00:
100%|███████████████████████████████████████████████████████████| 639/639 [00:00<00:00, 4738.99it/s]
Feature_view_81 from 2022-05-30 16:57:03+00:00 to 2022-06-11 00:00:00+00:00:
100%|█████████████████████████████████████████████████████████| 2307/2307 [00:00<00:00, 6310.88it/s]
Feature_view_586 from 2022-05-30 16:57:04+00:00 to 2022-06-11 00:00:00+00:00:
100%|███████████████████████████████████████████████████████████| 635/635 [00:00<00:00, 4615.60it/s]
Feature_view_564 from 2022-05-30 16:57:04+00:00 to 2022-06-11 00:00:00+00:00:
100%|████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 446.61it/s]
Feature_view_456 from 2022-05-30 16:57:05+00:00 to 2022-06-11 00:00:00+00:00:
100%|█████████████████████████████████████████████████████████████| 15/15 [00:00<00:00, 1418.43it/s]
Feature_view_197 from 2022-05-30 16:57:05+00:00 to 2022-06-11 00:00:00+00:00:
100%|███████████████████████████████████████████████████████████| 751/751 [00:00<00:00, 2131.44it/s]
Feature_view_290 from 2022-05-30 16:57:05+00:00 to 2022-06-11 00:00:00+00:00:
  0%|                                                                       | 0/623 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/envs/kgfarm_env/bin/feast", line 8, in <module>
    sys.exit(cli())
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/feast/cli.py", line 527, in materialize_incremental_command
    store.materialize_incremental(
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/feast/usage.py", line 269, in wrapper
    return func(*args, **kwargs)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/feast/feature_store.py", line 1043, in materialize_incremental
    provider.materialize_single_feature_view(
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/feast/infra/passthrough_provider.py", line 164, in materialize_single_feature_view
    self.online_write_batch(
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/feast/infra/passthrough_provider.py", line 86, in online_write_batch
    self.online_store.online_write_batch(config, table, data, progress)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/feast/usage.py", line 280, in wrapper
    raise exc.with_traceback(traceback)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/feast/usage.py", line 269, in wrapper
    return func(*args, **kwargs)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/feast/infra/online_stores/sqlite.py", line 98, in online_write_batch
    entity_key_bin = serialize_entity_key(entity_key)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/feast/infra/key_encoding_utils.py", line 57, in serialize_entity_key
    val_bytes, value_type = _serialize_val(v.WhichOneof("val"), v)
  File "/mnt/envs/kgfarm_env/lib/python3.8/site-packages/feast/infra/key_encoding_utils.py", line 17, in _serialize_val
    return struct.pack("<l", v.int64_val), ValueType.INT64
struct.error: 'l' format requires -2147483648 <= number <= 2147483647 features:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants