[feature branch] AST test infra #1701

Merged
merged 15 commits into from
May 30, 2024

sfc-gh-azwiegincew
Collaborator

@sfc-gh-azwiegincew sfc-gh-azwiegincew commented May 29, 2024

AST Tests

This driver enables testing of the AST generation that will be used in the server-side Snowpark implementation, starting with Phase 0.

All generated AST should be tested using this mechanism. To add a test, create a new file under tests/ast/data. Files look like the following example. The test driver sets up the session and looks at the accumulated lazy values in the resulting environment.

N.B. No eager evaluation is permitted, as any intermediate batches will not be observed. This can easily be changed if necessary, however.

## TEST CASE

df = session.table("test_table")
df = df.filter("STR LIKE '%e%'")

## EXPECTED OUTPUT

res1 = session.table('test_table')

res2 = res1.filter('STR LIKE '%e%'')
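As a rough illustration of the file layout above, here is a minimal sketch of how such a file could be split into its two sections. This is not the driver from this PR: `parse_test_file` and `AstTestCase` are hypothetical names, and the only assumption taken from the description is that a file contains a `## TEST CASE` section followed by a `## EXPECTED OUTPUT` section.

```python
from typing import NamedTuple

TEST_CASE_HEADER = "## TEST CASE"
EXPECTED_OUTPUT_HEADER = "## EXPECTED OUTPUT"


class AstTestCase(NamedTuple):
    """Hypothetical container for one parsed test file."""
    source: str    # Snowpark code to run against the session
    expected: str  # expected rendering of the generated AST


def parse_test_file(text: str) -> AstTestCase:
    """Split a test file into its test-case code and its expected output."""
    sections = {TEST_CASE_HEADER: [], EXPECTED_OUTPUT_HEADER: []}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in sections:
            # A header line switches the section being collected.
            current = stripped
        elif current is not None:
            sections[current].append(line)
    if not sections[TEST_CASE_HEADER]:
        raise ValueError("missing or empty '## TEST CASE' section")
    return AstTestCase(
        source="\n".join(sections[TEST_CASE_HEADER]).strip(),
        expected="\n".join(sections[EXPECTED_OUTPUT_HEADER]).strip(),
    )


example = """## TEST CASE

df = session.table("test_table")
df = df.filter("STR LIKE '%e%'")

## EXPECTED OUTPUT

res1 = session.table('test_table')

res2 = res1.filter('STR LIKE '%e%'')
"""

case = parse_test_file(example)
print(case.source)
print(case.expected)
```

The expected section is treated as opaque text here, so it is compared as-is rather than parsed as Python.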

To generate the expected output the first time the test is run, or when the AST generation changes, run:

pytest --update-expectations tests/ast
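The update behaviour can be sketched roughly as follows. This is an assumption about how a flag like `--update-expectations` might be applied to one test file, not the code in this PR: `check_or_update` is a hypothetical helper, and the pytest wiring (typically a `pytest_addoption` hook in `conftest.py`) is left out so the sketch runs with the standard library only.

```python
from pathlib import Path

EXPECTED_OUTPUT_HEADER = "## EXPECTED OUTPUT"


def check_or_update(path: Path, actual: str, update: bool) -> bool:
    """Compare `actual` with the file's expected section, or rewrite that
    section when `update` is set. Returns True when the test should pass."""
    text = path.read_text()
    # Everything before the header is the test case; everything after it
    # is the recorded expectation (empty if the header is absent).
    head, _, expected = text.partition(EXPECTED_OUTPUT_HEADER)
    if update:
        # Hypothetical update mode: keep the test case, replace the expectation.
        path.write_text(
            f"{head.rstrip()}\n\n{EXPECTED_OUTPUT_HEADER}\n\n{actual.strip()}\n"
        )
        return True
    return expected.strip() == actual.strip()
```

In this sketch a file without an expected section fails until it is run once in update mode, which mirrors the "first time the test is run" case described above.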

@sfc-gh-azwiegincew sfc-gh-azwiegincew marked this pull request as ready for review May 29, 2024 20:56
@sfc-gh-azwiegincew sfc-gh-azwiegincew requested a review from a team as a code owner May 29, 2024 20:56
@sfc-gh-azwiegincew sfc-gh-azwiegincew requested review from sfc-gh-smirzaei, sfc-gh-yuwang and sfc-gh-aalam and removed request for a team May 29, 2024 20:56
@sfc-gh-lspiegelberg
Contributor

Can we maybe avoid checking in the .jar file? It's 22MB, which will make any changeset large...

@@ -1133,7 +1133,8 @@ def select(

stmt = self._session._ast_batch.assign()
ast = stmt.expr
ast.sp_dataframe_select__columns.df.sp_dataframe_ref.id.bitfield1 = self._ast_id
# TODO: remove the None guard below once we generate the correct AST.
Contributor


What is 666 about? I assume it's just a random ID for now?

Collaborator Author


Yeah. Once we get full coverage of the API, this won't be necessary to make the mock setup pass.

@@ -795,6 +795,9 @@ def get_result_query_id(self, plan: SnowflakePlan, **kwargs) -> str:
# get the iterator such that the data is not fetched
result_set, _ = self.get_result_set(plan, to_iter=True, **kwargs)
return result_set["sfqid"]

def create_coprocessor(self):
Contributor


Will we add any functionality to this in the future, or is it a simple stub? In any case, a short comment would help, addressing either 1.) what needs to be added here or 2.) that this should not be removed, but left because of ... :)

Collaborator Author


Done.

Contributor

@sfc-gh-lspiegelberg sfc-gh-lspiegelberg left a comment


LGTM. My only concern is the 22MB file. Can we avoid checking that in?

@sfc-gh-azwiegincew sfc-gh-azwiegincew merged commit 037fc60 into server-side-snowpark May 30, 2024
6 of 8 checks passed
@sfc-gh-azwiegincew sfc-gh-azwiegincew deleted the azwiegincew-ast-test branch May 30, 2024 16:46
@github-actions github-actions bot locked and limited conversation to collaborators May 30, 2024
2 participants