Skip to content

Conversation

michaelneely
Copy link
Contributor

What this PR does / why we need it:

There are many additional configuration options that can be specified when creating a Snowflake connection, for example: proxies, the session keep alive heartbeat interval, etc..

One concrete use case I've found already is setting the host name when testing Snowflake locally with LocalStack.

This PR allows setting additional configuration options in the feature_store.yaml file by changing the Snowflake Pydantic models to allow extra fields.

Which issue(s) this PR fixes:

Misc

@michaelneely michaelneely requested a review from a team as a code owner June 4, 2025 09:10
@michaelneely michaelneely force-pushed the flexible-snowflake-connection-model branch 2 times, most recently from 701e28c to 8e2a45b Compare June 4, 2025 09:14
Signed-off-by: michael neely <141048027+michaelneely@users.noreply.github.com>
@michaelneely michaelneely force-pushed the flexible-snowflake-connection-model branch from 8e2a45b to fb8f10b Compare June 4, 2025 09:14
Copy link
Contributor

@jyejare jyejare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should rather add config params explicitly because explicit fields are validated by Pydantic, catching typos, wrong data types, or misconfigurations early.

This also allows users to identify what fields are available for configuring the stores pre-hand.

Usability and Maintainability Comments ^

@michaelneely
Copy link
Contributor Author

michaelneely commented Jun 4, 2025

There are a lot of configuration options (see here). I think trying to maintain parity for dozens of options would be a difficult endeavour. Users could easily refer to the Snowflake connector documentation for any parameters considered non-standard.

@jyejare
Copy link
Contributor

jyejare commented Jun 4, 2025

Could we choose between those are regular in use (So we can add explicit) and those are irregular(comes in with extra), that way we will have some control over important params.

@michaelneely
Copy link
Contributor Author

I think the parameters already explicitly listed in the Snowflake models are considered the "standard" options. Allowing extra parameters increases the flexibility for advanced use cases. Maybe we could mention this in the docs?

@michaelneely
Copy link
Contributor Author

Another option would be to add an explicit "connection extras" (not sure on name) parameter that we unpack in the GetSnowflakeConnection class.

@jyejare
Copy link
Contributor

jyejare commented Jun 11, 2025

@michaelneely Do you have a quick example code how it would looks like ?

@michaelneely
Copy link
Contributor Author

michaelneely commented Jun 11, 2025

Under the proposed approach of allowing extras, if I wanted to add the host parameter to my snowflake connection, then my feature_store.yml would look like this:

registry:
  registry_type: snowflake.registry
  account: ...
  user: ...
  password: ...
  role: ...
  warehouse: ...
  database: ...
  schema: ...
  host: snowflake.localhost.localstack.cloud
provider: local
offline_store:
    type: snowflake.offline
    account: ...
    user: ...
    password: ...
    role: ...
    warehouse: ...
    database: ...
    schema: ...
    host: snowflake.localhost.localstack.cloud

And it "just works" because the host argument is passed through to the snowflake connector's connect method

Account, user, password, role, warehouse, database, schema are all currently documented parameters for the registry and offline store.

@jyejare
Copy link
Contributor

jyejare commented Jun 12, 2025

@michaelneely Maybe I did not explicitly mention about the connection extras. Sorry!!

So in above comment I was looking for connection extras method of passing extras. I think the shared example above it using the current extras method.

@michaelneely
Copy link
Contributor Author

Gotcha. I think it would be exactly like above, except host would sit under connection_extras, e.g.,:

registry:
  registry_type: snowflake.registry
  account: ...
  user: ...
  password: ...
  role: ...
  warehouse: ...
  database: ...
  schema: ...
  connection_extras:
    host: snowflake.localhost.localstack.cloud

And I would just update the pydantic model to include connection_extras: Dict[str, Any] and add some code to unpack that dictionary into the snowflake connector's connect() method.

The reason I'm not really a big fan of this approach is that all these parameters (except registry_type) are passed into the snowflake connector's connect() method anyway, so you're writing more code that needs to be maintained for very little gain.

@jyejare
Copy link
Contributor

jyejare commented Jun 13, 2025

Agree! Seems we are getting into more complex path. Rather lets keep this appraoch 👍

@michaelneely
Copy link
Contributor Author

Thanks, @jyejare, how can I get this merged?

Copy link
Member

@ntkathole ntkathole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@ntkathole ntkathole merged commit 4349130 into feast-dev:master Jul 9, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants