[Datastore] Add Snowflake source #1845

Merged
merged 4 commits into from Mar 28, 2022

Collaborator

@gtopper gtopper commented Mar 28, 2022

ML-1741

Contributor

@benbd86 benbd86 left a comment

Pretty much straightforward, just two minor comments.

mlrun/datastore/sources.py (outdated)
                # V3IO does not support this level of precision
                df = df.withColumn(col_name, funcs.col(col_name).cast("double"))
        return df

    def write_dataframe(
        self, df, key_column=None, timestamp_key=None, chunk_id=0, **kwargs
    ):
        if hasattr(df, "rdd"):
Contributor

I guess the "rdd" check means that the Spark engine is in use, but maybe we should add a comment to make that more straightforward.

Collaborator Author

Yeah, that's what it means... It's done in a few other places in the code base and is not affected by this PR. Feel free to PR separately.
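
For illustration, the check discussed in this thread could be wrapped in a small named helper that carries the comment the reviewer asked for. This is only a sketch: `is_spark_dataframe` is a hypothetical name that does not appear in this PR, and it assumes that the presence of an `rdd` attribute is a good enough signal that the object is a Spark DataFrame. The two `Fake...` classes are stand-ins so the example runs without pyspark or pandas installed.

```python
def is_spark_dataframe(df) -> bool:
    """Return True if df looks like a Spark DataFrame.

    Hypothetical helper, not part of this PR. Spark DataFrames expose an
    ``rdd`` attribute, so duck-typing on it avoids importing pyspark when
    the Spark engine is not in use.
    """
    return hasattr(df, "rdd")


class FakeSparkFrame:
    """Stand-in for a Spark DataFrame, used only for this example."""

    rdd = object()


class FakePandasFrame:
    """Stand-in for a pandas DataFrame, used only for this example."""


spark_like = is_spark_dataframe(FakeSparkFrame())
pandas_like = is_spark_dataframe(FakePandasFrame())
```

One caveat with this kind of duck typing: pandas allows attribute-style access to columns, so a pandas DataFrame with a column literally named `rdd` would also pass the check.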

Comment on lines +436 to +442
    def __init__(
        self,
        name: str = "",
        attributes: Dict[str, str] = None,
        key_field: str = None,
        time_field: str = None,
        schedule: str = None,
Contributor

It seems like you're not accepting all of the base class args: path, start_time and end_time are missing. Is that on purpose?
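
To make the question concrete, here is a minimal sketch of a subclass that forwards `start_time` and `end_time` to its base class. Everything in it is an assumption for illustration: `ExampleBaseSource` is a simplified stand-in and not the real mlrun base class, `ExampleSnowflakeSource` is not the class from this PR, and `path` is left out only on the guess that a query-based source has no file path. The thread does not say whether that matches the author's intent.

```python
from datetime import datetime
from typing import Dict, Optional


class ExampleBaseSource:
    """Simplified, hypothetical stand-in for the base source class."""

    def __init__(
        self,
        name: str = "",
        path: Optional[str] = None,
        attributes: Optional[Dict[str, str]] = None,
        key_field: Optional[str] = None,
        time_field: Optional[str] = None,
        schedule: Optional[str] = None,
        start_time: Optional[datetime] = None,
        end_time: Optional[datetime] = None,
    ):
        self.name = name
        self.path = path
        self.attributes = attributes or {}
        self.key_field = key_field
        self.time_field = time_field
        self.schedule = schedule
        self.start_time = start_time
        self.end_time = end_time


class ExampleSnowflakeSource(ExampleBaseSource):
    """Hypothetical subclass that passes the time-range arguments through."""

    def __init__(
        self,
        name: str = "",
        attributes: Optional[Dict[str, str]] = None,
        key_field: Optional[str] = None,
        time_field: Optional[str] = None,
        schedule: Optional[str] = None,
        start_time: Optional[datetime] = None,
        end_time: Optional[datetime] = None,
    ):
        # path is deliberately not accepted here (assumption: no file path
        # applies to a query-based source), so the base class default is used.
        super().__init__(
            name=name,
            attributes=attributes,
            key_field=key_field,
            time_field=time_field,
            schedule=schedule,
            start_time=start_time,
            end_time=end_time,
        )


source = ExampleSnowflakeSource(
    name="example",
    time_field="created_at",
    start_time=datetime(2022, 3, 1),
    end_time=datetime(2022, 3, 28),
)
```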

@Hedingber Hedingber changed the title [Datastore] [Feature Store] Add SnowflakeSource [Datastore] Add Snowflake source Mar 28, 2022
@Hedingber Hedingber merged commit 1c7820b into mlrun:development Mar 28, 2022
            schedule=schedule,
        )

    def get_spark_options(self):
Contributor

Can we add an option "application": "Iguazio" to indicate that the client is from Iguazio? This is for the partnership; @marcelonyc knows more details about it.
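
A rough sketch of what that suggestion could look like is below. It is not the body of `get_spark_options` from this PR: `build_spark_options` is a hypothetical free function, the attribute names (`url`, `user`, `database`, `query`) are assumptions, and the `sf...` keys follow the naming style of the Snowflake Spark connector but should be checked against its documentation before use.

```python
from typing import Dict, Optional


def build_spark_options(
    attributes: Dict[str, str], application: Optional[str] = "Iguazio"
) -> Dict[str, str]:
    """Build a Spark options dict, tagging the client application.

    Hypothetical helper: the attribute names and option keys are
    illustrative assumptions, not taken from this PR.
    """
    candidates = {
        "query": attributes.get("query"),
        "sfURL": attributes.get("url"),
        "sfUser": attributes.get("user"),
        "sfDatabase": attributes.get("database"),
    }
    # Keep only the options that were actually provided.
    options = {key: value for key, value in candidates.items() if value is not None}
    if application:
        # The option proposed in the review comment above.
        options["application"] = application
    return options


options = build_spark_options(
    {"url": "example.snowflakecomputing.com", "user": "demo_user"}
)
```

Making the application name a parameter rather than a hard-coded literal keeps the tag easy to change or disable if the partnership requirements differ from what is assumed here.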

4 participants