✨ New: Snowflake Cortex Destination 🚀 #36807

Merged
bindipankhudi merged 43 commits into master from new/destination-snowflake-cortex
May 14, 2024

Conversation

bindipankhudi
Contributor

@bindipankhudi bindipankhudi commented Apr 3, 2024

Related to: airbytehq/PyAirbyte#120

The destination is working as expected, and all unit and integration tests are passing. The majority of the code is generated from the vector-db SDK or taken from the Pinecone destination. The most interesting part, the Cortex-specific write logic, is in the SnowflakeCortexIndexer class.

To be done as fast follow

  • Add an option for embedding via Snowflake models.

@bindipankhudi bindipankhudi requested a review from a team as a code owner May 12, 2024 21:46
@bindipankhudi bindipankhudi removed the request for review from a team May 12, 2024 21:46
@aaronsteers aaronsteers changed the title from "New/destination snowflake cortex" to "✨ New: Snowflake Cortex Destination 🚀" May 13, 2024
Collaborator

@aaronsteers aaronsteers left a comment

This looks great! A few changes requested inline, but lmk if you want to push back on any of them, and/or defer.

The one that is probably hardest to implement (and also hardest to change later) is the layout of the password input in the config spec. Happy to help on that if needed.

Comment on lines 11 to 18
class SnowflakeCortexIndexingModel(BaseModel):
    account: str = Field(
        ...,
        title="Account",
        airbyte_secret=True,
        description="Enter the account name you want to use to access the database.",
        examples=["xxx.us-east-2.aws"],
    )
Collaborator

I compared these with the Java-based snowflake destination to see if these parameters match the similar destination. If we can make the auth/config flow consistent between the two, this could help end users who might end up using both for different use cases.

It looks like all of our specified inputs match except account, which the Java Snowflake destination calls host ("Host").

Also, it looks like we are not accepting schema ("Default Schema") and we should probably add that so users can control which schema is written to.

The last difference I noted was that the way we are configuring password is slightly different, just because the Java destination supports more auth options (OAuth and Key Pair private key). Even if we just have a single auth option, it probably makes sense to try to mirror the same structure, which is to place "password" under a "credentials" object. The challenge here is that we can't necessarily copy code from the Snowflake source - since it is in Java, but maybe another Python connector that supports multiple auth flows could be a model here.

Here is the spec.json for destination-snowflake: https://github.com/airbytehq/airbyte/blob/72b083cbd259b3d922e522bee70e81d60f88f481/airbyte-integrations/connectors/destination-snowflake/src/main/resources/spec.json

Contributor Author

@bindipankhudi bindipankhudi May 14, 2024

Agreed! The parameters should ideally match the snowflake destination. Made the following changes:

  • Replaced "account" with "host" in the UI.
  • Reordered the params to match the order in the Snowflake destination.
  • Added a "default_schema" field.
  • Added a new credentials section with a password field.
    Implementing the OAuth option seems like too much at the moment, but the changes above should help us do so in the future.
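
For readers following the thread, here is a minimal sketch of the config shape these changes describe. It is not the connector's actual code: it uses standard-library dataclasses rather than the pydantic model in the PR, the class and function names (`PasswordCredentials`, `IndexingConfig`, `parse_indexing_config`) are hypothetical, and it covers only the fields mentioned in this conversation.

```python
from dataclasses import dataclass, asdict


@dataclass
class PasswordCredentials:
    # Assumption: password is the only auth option for now; other options
    # (for example OAuth) could later be added alongside it.
    password: str


@dataclass
class IndexingConfig:
    # Field names follow the review thread; the real spec may define more.
    host: str
    database: str
    default_schema: str
    username: str
    credentials: PasswordCredentials


def parse_indexing_config(raw: dict) -> IndexingConfig:
    """Build an IndexingConfig, requiring 'password' under a 'credentials' object."""
    credentials = raw.get("credentials")
    if not isinstance(credentials, dict) or "password" not in credentials:
        raise ValueError("expected a 'credentials' object with a 'password' key")
    return IndexingConfig(
        host=raw["host"],
        database=raw["database"],
        default_schema=raw["default_schema"],
        username=raw["username"],
        credentials=PasswordCredentials(password=credentials["password"]),
    )
```

Keeping the password inside its own object means a second auth option could be added later without moving existing fields, which is the reason given above for mirroring the Snowflake destination's layout.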

@bindipankhudi
Contributor Author

@aaronsteers PR is ready for another look.

"database": "MYDATABASE",
"default_schema": "MYSCHEMA",
"username": "MYUSERNAME",
"password": "xxxxxxx"
Collaborator

Should password be nested under credentials now?

Weird that this didn't throw an error.

Contributor Author

Good catch. This file is for reference purposes only and doesn't get used :) I will update it.
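
As an illustration of the update being discussed, the sketch below moves a top-level "password" under a "credentials" object in a sample config. The helper name `nest_password` is hypothetical and not part of the connector; the sample values are the placeholders quoted in the review.

```python
import json


def nest_password(config: dict) -> dict:
    """Return a copy of config with a top-level 'password' moved under 'credentials'."""
    updated = dict(config)  # shallow copy so the caller's dict is left untouched
    if "password" in updated:
        password = updated.pop("password")
        credentials = dict(updated.get("credentials") or {})
        # Keep an existing nested password if one is already present.
        credentials.setdefault("password", password)
        updated["credentials"] = credentials
    return updated


flat_config = {
    "database": "MYDATABASE",
    "default_schema": "MYSCHEMA",
    "username": "MYUSERNAME",
    "password": "xxxxxxx",
}
nested_config = nest_password(flat_config)
print(json.dumps(nested_config, indent=2))
```

Running the helper on an already nested config returns it unchanged, so it would be safe to apply more than once.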

Collaborator

@aaronsteers aaronsteers left a comment

Looks great! 🚀

I think you can merge when ready. :shipit:

bindipankhudi and others added 4 commits May 14, 2024 14:31
…estination_snowflake_cortex/config.py

Co-authored-by: Aaron ("AJ") Steers <aj@airbyte.io>
…etadata.yaml

Co-authored-by: Aaron ("AJ") Steers <aj@airbyte.io>
@bindipankhudi bindipankhudi merged commit 4b84c63 into master May 14, 2024
35 checks passed
@bindipankhudi bindipankhudi deleted the new/destination-snowflake-cortex branch May 14, 2024 23:36
Labels
area/connectors: Connector related issues
area/documentation: Improvements or additions to documentation
connectors/destination/snowflake-cortex

3 participants