Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Verify the existence of Registry tables in snowflake before calling CREATE sql command. Allow read-only user to call feast apply. #3851

Merged
merged 1 commit into from Jan 24, 2024

Conversation

shuchu
Copy link
Collaborator

@shuchu shuchu commented Dec 1, 2023

What this PR does / why we need it:
For snowflake users who only have "USAGE" and "SELECT" privileges, we want to allow them able to run the "feast apply" command and do READ-only access existing Registry tables in snowflake cloud.

Please be aware that the Feast Registry is designed to allow every user can modify the records of registry tables under their own "project" name. This PR didn't change the original design. It only changes the previous implementation logic of snowflake.py which always calls "CREATE" sql command while initializing the SnowflakeRegistry(), which will be called while run statement:

store = FeatureStore()

This PR is tested with snowflake cloud privilege for a Role with setting as below:

for feast Registry database : SELECT - FUTURE TABLE, SELECT - FUTURE VIEW, USAGE
for feast Registry 11 tables: SELECT
for feast schema (default name PUBLIC): SELECT - FUTURE TABLE, SELECT - FUTURE VIEW, USAGE
for warehouse: USAGE

The test will expect the above specific role can call "feast apply" and does not see errors like:
snowflake.connector.errors.ProgrammingError: 003001 (42501): SQL access control error: Insufficient privileges to operate on *****
during the init() function call of SnowflakeRegistry().

Which issue(s) this PR fixes:
Fixes #3844

…ing CREATE sql command. Allow read-only user to call feast apply.

Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
@tmihalac
Copy link
Contributor

Please add tests to make sure the fix works as intended

@tmihalac
Copy link
Contributor

Can you please add more details on how to test the fix ?

@shuchu
Copy link
Collaborator Author

shuchu commented Jan 16, 2024

I did few manual tests with my free Snowflake account while I was creating this PR. To test it, we need to configure the Snowflake as I mentioned in the PR and check the result of "feast apply" on the tester's terminal.

It will be great if someone has a Snowflake account and test it.

@etirelli etirelli merged commit 9a3590e into feast-dev:master Jan 24, 2024
23 checks passed
etirelli added a commit to etirelli/feast that referenced this pull request Jan 25, 2024
…ore calling CREATE sql command. Allow read-only user to call feast apply. (feast-dev#3851)"

This reverts commit 9a3590e.

Signed-off-by: Edson Tirelli <etirelli@redhat.com>
etirelli added a commit that referenced this pull request Jan 25, 2024
Revert "fix: Verify the existence of Registry tables in snowflake before calling CREATE sql command. Allow read-only user to call feast apply. (#3851)"

This reverts commit 9a3590e.

Signed-off-by: Edson Tirelli <etirelli@redhat.com>
@shuchu
Copy link
Collaborator Author

shuchu commented Jan 26, 2024

This PR can not fix the feature that allows READ-only users to access the DB. There is an optimization in the get_historical_features() that will create a temporary table in the (offline store) DB for the point-of-time correct Join. (https://www.hopsworks.ai/dictionary/point-in-time-correct-joins#:~:text=A%20point%2Din%2Dtime%20correct,a%20specific%20point%20in%20time.) For example, in line 815-822 of bigquery.py (https://github.com/feast-dev/feast/blob/master/sdk/python/feast/infra/offline_stores/bigquery.py), there is a short description about this. The temporary table is deleted as in Line 298 after the usage.

On snowflake, if we can separate the offline store DB from the registry repo, I believe we can solve this problem by allow READ-only to registry repo DB, but allow Write+Read to the offline store DB.

tokoko pushed a commit to tokoko/feast that referenced this pull request Feb 6, 2024
…ing CREATE sql command. Allow read-only user to call feast apply. (feast-dev#3851)

Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
Signed-off-by: tokoko <togurg14@freeuni.edu.ge>
tokoko pushed a commit to tokoko/feast that referenced this pull request Feb 6, 2024
…dev#3907)

Revert "fix: Verify the existence of Registry tables in snowflake before calling CREATE sql command. Allow read-only user to call feast apply. (feast-dev#3851)"

This reverts commit 9a3590e.

Signed-off-by: Edson Tirelli <etirelli@redhat.com>
Signed-off-by: tokoko <togurg14@freeuni.edu.ge>
zseta pushed a commit to zseta/feast that referenced this pull request Feb 7, 2024
…ing CREATE sql command. Allow read-only user to call feast apply. (feast-dev#3851)

Signed-off-by: Shuchu Han <shuchu.han@gmail.com>
Signed-off-by: Attila Toth <hello@attilatoth.dev>
zseta pushed a commit to zseta/feast that referenced this pull request Feb 7, 2024
…dev#3907)

Revert "fix: Verify the existence of Registry tables in snowflake before calling CREATE sql command. Allow read-only user to call feast apply. (feast-dev#3851)"

This reverts commit 9a3590e.

Signed-off-by: Edson Tirelli <etirelli@redhat.com>
Signed-off-by: Attila Toth <hello@attilatoth.dev>
tqtensor pushed a commit to tqtensor/feast that referenced this pull request Mar 11, 2024
…dev#3907)

Revert "fix: Verify the existence of Registry tables in snowflake before calling CREATE sql command. Allow read-only user to call feast apply. (feast-dev#3851)"

This reverts commit 9a3590e.

Signed-off-by: Edson Tirelli <etirelli@redhat.com>
franciscojavierarceo pushed a commit that referenced this pull request Apr 16, 2024
# [0.36.0](v0.35.0...v0.36.0) (2024-04-16)

### Bug Fixes

* Add __eq__, __hash__ to SparkSource for correct comparison ([#4028](#4028)) ([e703b40](e703b40))
* Add conn.commit() to Postgresonline_write_batch.online_write_batch ([#3904](#3904)) ([7d75fc5](7d75fc5))
* Add missing __init__.py to embedded_go ([#4051](#4051)) ([6bb4c73](6bb4c73))
* Add missing init files in infra utils ([#4067](#4067)) ([54910a1](54910a1))
* Added registryPath parameter documentation in WebUI reference ([#3983](#3983)) ([5e0af8f](5e0af8f)), closes [#3974](#3974) [#3974](#3974)
* Adding missing init files in materialization modules ([#4052](#4052)) ([df05253](df05253))
* Allow trancated timestamps when converting ([#3861](#3861)) ([bdd7dfb](bdd7dfb))
* Azure blob storage support in Java feature server ([#2319](#2319)) ([#4014](#4014)) ([b9aabbd](b9aabbd))
* Bugfix for grabbing historical data from Snowflake with array type features. ([#3964](#3964)) ([1cc94f2](1cc94f2))
* Bytewax materialization engine fails when loading feature_store.yaml ([#3912](#3912)) ([987f0fd](987f0fd))
* CI unittest warnings ([#4006](#4006)) ([0441b8b](0441b8b))
* Correct the returning class proto type of StreamFeatureView to StreamFeatureViewProto instead of FeatureViewProto. ([#3843](#3843)) ([86d6221](86d6221))
* Create index only if not exists during MySQL online store update ([#3905](#3905)) ([2f99a61](2f99a61))
* Disable minio tests in workflows on master and nightly ([#4072](#4072)) ([c06dda8](c06dda8))
* Disable the Feast Usage feature by default. ([#4090](#4090)) ([b5a7013](b5a7013))
* Dump repo_config by alias ([#4063](#4063)) ([e4bef67](e4bef67))
* Extend SQL registry config with a sqlalchemy_config_kwargs key ([#3997](#3997)) ([21931d5](21931d5))
* Feature Server image startup in OpenShift clusters ([#4096](#4096)) ([9efb243](9efb243))
* Fix copy method for StreamFeatureView ([#3951](#3951)) ([cf06704](cf06704))
* Fix for materializing entityless feature views in Snowflake ([#3961](#3961)) ([1e64c77](1e64c77))
* Fix type mapping spark ([#4071](#4071)) ([3afa78e](3afa78e))
* Fix typo as the cli does not support shortcut-f option. ([#3954](#3954)) ([dd79dbb](dd79dbb))
* Get container host addresses from testcontainers ([#3946](#3946)) ([2cf1a0f](2cf1a0f))
* Handle ComplexFeastType to None comparison ([#3876](#3876)) ([fa8492d](fa8492d))
* Hashlib md5 errors in FIPS for python 3.9+ ([#4019](#4019)) ([6d9156b](6d9156b))
* Making the query_timeout variable as optional int because upstream is considered to be optional ([#4092](#4092)) ([fd5b620](fd5b620))
* Move gRPC dependencies to an extra ([#3900](#3900)) ([f93c5fd](f93c5fd))
* Prevent spamming pull busybox from dockerhub ([#3923](#3923)) ([7153cad](7153cad))
* Quickstart notebook example ([#3976](#3976)) ([b023aa5](b023aa5))
* Raise error when not able read of file source spark source ([#4005](#4005)) ([34cabfb](34cabfb))
* remove not use input parameter in spark source ([#3980](#3980)) ([7c90882](7c90882))
* Remove parentheses in pull_latest_from_table_or_query ([#4026](#4026)) ([dc4671e](dc4671e))
* Remove proto-plus imports ([#4044](#4044)) ([ad8f572](ad8f572))
* Remove unnecessary dependency on mysqlclient ([#3925](#3925)) ([f494f02](f494f02))
* Restore label check for all actions using pull_request_target ([#3978](#3978)) ([591ba4e](591ba4e))
* Revert mypy config ([#3952](#3952)) ([6b8e96c](6b8e96c))
* Rewrite Spark materialization engine to use mapInPandas ([#3936](#3936)) ([dbb59ba](dbb59ba))
* Run feature server w/o gunicorn on windows ([#4024](#4024)) ([584e9b1](584e9b1))
* SqlRegistry _apply_object update statement ([#4042](#4042)) ([ef62def](ef62def))
* Substrait ODFVs for online ([#4064](#4064)) ([26391b0](26391b0))
* Swap security label check on the PR title validation job to explicit permissions instead ([#3987](#3987)) ([f604af9](f604af9))
* Transformation server doesn't generate files from proto ([#3902](#3902)) ([d3a2a45](d3a2a45))
* Trino as an OfflineStore Access Denied when BasicAuthenticaion ([#3898](#3898)) ([49d2988](49d2988))
* Trying to import pyspark lazily to avoid the dependency on the library ([#4091](#4091)) ([a05cdbc](a05cdbc))
* Typo Correction in Feast UI Readme ([#3939](#3939)) ([c16e5af](c16e5af))
* Update actions/setup-python from v3 to v4 ([#4003](#4003)) ([ee4c4f1](ee4c4f1))
* Update typeguard version to >=4.0.0 ([#3837](#3837)) ([dd96150](dd96150))
* Upgrade sqlalchemy from 1.x to 2.x regarding PVE-2022-51668. ([#4065](#4065)) ([ec4c15c](ec4c15c))
* Use CopyFrom() instead of __deepycopy__() for creating a copy of protobuf object. ([#3999](#3999)) ([5561b30](5561b30))
* Using version args to install the correct feast version ([#3953](#3953)) ([b83a702](b83a702))
* Verify the existence of Registry tables in snowflake before calling CREATE sql command. Allow read-only user to call feast apply. ([#3851](#3851)) ([9a3590e](9a3590e))

### Features

* Add duckdb offline store ([#3981](#3981)) ([161547b](161547b))
* Add Entity df in format of a Spark Dataframe instead of just pd.DataFrame or string for SparkOfflineStore ([#3988](#3988)) ([43b2c28](43b2c28))
* Add gRPC Registry Server ([#3924](#3924)) ([373e624](373e624))
* Add local tests for s3 registry using minio ([#4029](#4029)) ([d82d1ec](d82d1ec))
* Add python bytes to array type conversion support proto ([#3874](#3874)) ([8688acd](8688acd))
* Add python client for remote registry server ([#3941](#3941)) ([42a7b81](42a7b81))
* Add Substrait-based ODFV transformation ([#3969](#3969)) ([9e58bd4](9e58bd4))
* Add support for arrays in snowflake ([#3769](#3769)) ([8d6bec8](8d6bec8))
* Added delete_table to redis online store ([#3857](#3857)) ([03dae13](03dae13))
* Adding support for Native Python feature transformations for ODFVs ([#4045](#4045)) ([73bc853](73bc853))
* Bumping requirements ([#4079](#4079)) ([1943056](1943056))
* Decouple transformation types from ODFVs ([#3949](#3949)) ([0a9fae8](0a9fae8))
* Dropping Python 3.8 from local integration tests and integration tests ([#3994](#3994)) ([817995c](817995c))
* Dropping python 3.8 requirements files from the project. ([#4021](#4021)) ([f09c612](f09c612))
* Dropping the support for python 3.8 version from feast ([#4010](#4010)) ([a0f7472](a0f7472))
* Dropping unit tests for Python 3.8 ([#3989](#3989)) ([60f24f9](60f24f9))
* Enable Arrow-based columnar data transfers  ([#3996](#3996)) ([d8d7567](d8d7567))
* Enable Vector database and retrieve_online_documents API ([#4061](#4061)) ([ec19036](ec19036))
* Kubernetes materialization engine written based on bytewax ([#4087](#4087)) ([7617bdb](7617bdb))
* Lint with ruff ([#4043](#4043)) ([7f1557b](7f1557b))
* Make arrow primary interchange for offline ODFV execution ([#4083](#4083)) ([9ed0a09](9ed0a09))
* Pandas v2 compatibility ([#3957](#3957)) ([64459ad](64459ad))
* Pull duckdb from contribs, add to CI ([#4059](#4059)) ([318a2b8](318a2b8))
* Refactor ODFV schema inference ([#4076](#4076)) ([c50a9ff](c50a9ff))
* Refactor registry caching logic into a separate class ([#3943](#3943)) ([924f944](924f944))
* Rename OnDemandTransformations to Transformations ([#4038](#4038)) ([9b98eaf](9b98eaf))
* Revert updating dependencies so that feast can be run on 3.11. ([#3968](#3968)) ([d3c68fb](d3c68fb)), closes [#3958](#3958)
* Rewrite ibis point-in-time-join w/o feast abstractions ([#4023](#4023)) ([3980e0c](3980e0c))
* Support s3gov schema by snowflake offline store during materialization ([#3891](#3891)) ([ea8ad17](ea8ad17))
* Update odfv test ([#4054](#4054)) ([afd52b8](afd52b8))
* Update pyproject.toml to use Python 3.9 as default ([#4011](#4011)) ([277b891](277b891))
* Update the Pydantic from v1 to v2 ([#3948](#3948)) ([ec11a7c](ec11a7c))
* Updating dependencies so that feast can be run on 3.11. ([#3958](#3958)) ([59639db](59639db))
* Updating protos to separate transformation ([#4018](#4018)) ([c58ef74](c58ef74))

### Reverts

* Reverting bumping requirements ([#4081](#4081)) ([1ba65b4](1ba65b4)), closes [#4079](#4079)
* Verify the existence of Registry tables in snowflake… ([#3907](#3907)) ([c0d358a](c0d358a)), closes [#3851](#3851)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

allow read-only feature while using snowflake as (sql) registry host.
3 participants