Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug:postgres source invalid pull_latest_from_table_or_query query output #3804

Closed
david-dest01 opened this issue Oct 18, 2023 · 10 comments · Fixed by #4026
Closed

bug:postgres source invalid pull_latest_from_table_or_query query output #3804

david-dest01 opened this issue Oct 18, 2023 · 10 comments · Fixed by #4026

Comments

@david-dest01
Copy link

david-dest01 commented Oct 18, 2023

I would submit a PR but I'm on an M1 and had some issues getting the env running. Submitting a proposed solution for the sake of available time.

Expected Behavior

When using PostgreSQLSource, you should be able to use a table successfully. When running feast masterialize [START_DATE] [END_DATE], as a user I would expect the table to be selected on successfully with no errors.

    source=PostgreSQLSource(
        name="source_user_events_v0",
        timestamp_field="created_at",
        table="offline_store.user_events"
    )

Current Behavior

pull_latest_from_table_or_query attempts to pull the latest data using the outputted string from get_table_query_string. However on line 97 of postgres.py the query includes open and close parens assuming an inner query. These params should be removed from the query and handled by get_table_query_string (which it is).

Steps to reproduce

Create a source as part of a feature view. Reference an existing table.

    source=PostgreSQLSource(
        name="source_user_events_v0",
        timestamp_field="created_at",
        table="offline_store.user_events"
    )

execute feast materialize [START_DATE] [END_DATE]

expected output.

root@3b9fc17aa598:/usr/app# feast materialize 2023-10-16T00:00:00  2023-10-18T00:00:00
/usr/local/lib/python3.9/site-packages/feast/repo_config.py:233: RuntimeWarning: `entity_key_serialization_version` is either not specified in the feature_store.yaml, or is specified to a value <= 1.This serialization version may cause errors when trying to write fields with the `Long` data type into the online store. Specifying `entity_key_serialization_version` to 2 is recommended for new projects. 
  warnings.warn(
Materializing 1 feature views from 2023-10-16 00:00:00+00:00 to 2023-10-18 00:00:00+00:00 into the postgres online store.

fv_user_events_v0:
Traceback (most recent call last):
  File "/usr/local/bin/feast", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/feast/cli.py", line 546, in materialize_command
    store.materialize(
  File "/usr/local/lib/python3.9/site-packages/feast/usage.py", line 299, in wrapper
    raise exc.with_traceback(traceback)
  File "/usr/local/lib/python3.9/site-packages/feast/usage.py", line 288, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/feast/feature_store.py", line 1395, in materialize
    provider.materialize_single_feature_view(
  File "/usr/local/lib/python3.9/site-packages/feast/infra/passthrough_provider.py", line 254, in materialize_single_feature_view
    raise e
  File "/usr/local/lib/python3.9/site-packages/feast/infra/materialization/local_engine.py", line 156, in _materialize_one
    table = offline_job.to_arrow()
  File "/usr/local/lib/python3.9/site-packages/feast/infra/offline_stores/offline_store.py", line 122, in to_arrow
    return self._to_arrow_internal(timeout=timeout)
  File "/usr/local/lib/python3.9/site-packages/feast/infra/offline_stores/contrib/postgres_offline_store/postgres.py", line 282, in _to_arrow_internal
    cur.execute(query)
psycopg2.errors.SyntaxError: syntax error at or near ")"
LINE 8:                 FROM (offline_store.user_events) a
                                                       ^

root@3b9fc17aa598:/usr/app# 

Specifications

  • Version: 0.34.1
  • Platform: Linux
  • Subsystem: Ubuntu 20.04

Possible Solution

Update this query from:

        query = f"""
            SELECT
                {b_field_string}
                {f", {repr(DUMMY_ENTITY_VAL)} AS {DUMMY_ENTITY_ID}" if not join_key_columns else ""}
            FROM (
                SELECT {a_field_string},
                ROW_NUMBER() OVER({partition_by_join_key_string} ORDER BY {timestamp_desc_string}) AS _feast_row
                FROM ({from_expression}) a
                WHERE a."{timestamp_field}" BETWEEN '{start_date}'::timestamptz AND '{end_date}'::timestamptz
            ) b
            WHERE _feast_row = 1
            """

to:

        query = f"""
            SELECT
                {b_field_string}
                {f", {repr(DUMMY_ENTITY_VAL)} AS {DUMMY_ENTITY_ID}" if not join_key_columns else ""}
            FROM (
                SELECT {a_field_string},
                ROW_NUMBER() OVER({partition_by_join_key_string} ORDER BY {timestamp_desc_string}) AS _feast_row
                FROM {from_expression} a
                WHERE a."{timestamp_field}" BETWEEN '{start_date}'::timestamptz AND '{end_date}'::timestamptz
            ) b
            WHERE _feast_row = 1
            """
@danielsalvador
Copy link

hello! I ran exactly into the same issue and fixed it by applying the fix suggested by @david-dest01.

@david-dest01
Copy link
Author

thanks for testing @danielsalvador

Copy link

stale bot commented Mar 17, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Mar 17, 2024
@job-almekinders
Copy link
Contributor

job-almekinders commented Mar 20, 2024

Am I mistaken or was the PR to fix this never merged?

I'm experiencing the same issue at the moment.

@david-dest01
Copy link
Author

@job-almekinders the ticket was closed by maintainers - it was never resolved. #3807

@stale stale bot removed the wontfix This will not be worked on label Mar 20, 2024
@job-almekinders
Copy link
Contributor

I'll re-open one tomorrow one tomorrow, after rebasing on master!

@job-almekinders
Copy link
Contributor

I re-opened a similar PR #4026

@jeremyary
Copy link
Collaborator

@job-almekinders @david-dest01 thanks for re-opening and tackling this. We've recently (over the last several months) transitioned to new maintainers/contributors and it's been a bit tricky dealing with some of the older issues & PR's that we inherited and figuring out what's still relevant & active. We're definitely open to community contributions/issues! Much appreciated :)

@tokoko
Copy link
Collaborator

tokoko commented Mar 21, 2024

@jeremyary the previous PR was most likely eventually closed because of a failed DCO check. just saying... 😄

@david-dest01
Copy link
Author

@jeremyary - glad to see feast getting some love :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants