Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vdk-core: add vdk sql-query command #2512

Merged
merged 4 commits into from
Aug 9, 2023
Merged

Conversation

antoniivanov
Copy link
Collaborator

Add vdk sql-query command which will use the same database as the rest of the SQL interfaces (SQL steps, job_input methods)

This will replace the per-db commends like vdk impala-query, vdk sqlite-query which are just problematic.

This would enable to generalize a lot of the tutorials/examples we have which currently give impression they can be used only wiht "sqlite" (sqlite-query) or trino. Which is primary driver for doing the change now.

The command would be used in functional tests where you want to query the default db without needing ot create a job for that.

The command for now is hidden (hidden feature) since there are a few TODOs that would be resolved in subsequent PRs.

@murphp15
Copy link
Collaborator

murphp15 commented Aug 3, 2023

It pulls the database from env variables?
I actually don't really like that we use env variables instead of a config file.
Because then to run it on the cluster its not clear how to set the properties. Also its very much a "it works on my machine" situation.

It would be much nicer if it used the config.ini of the folder you are in?

@antoniivanov
Copy link
Collaborator Author

It pulls the database from env variables? I actually don't really like that we use env variables instead of a config file. Because then to run it on the cluster its not clear how to set the properties. Also its very much a "it works on my machine" situation.

It would be much nicer if it used the config.ini of the folder you are in?

No, environment variables are just one source. Similar to how in spring variables can be auto-wired from application properties file or environment variables or system properties.

In VDK configuration variable can be auto-wired from env variables, from config.ini file ([vdk] section). With env variables having a higher precedence.

And in the future this would be extended. I think most certainly some "global" local config would be needed (like ~/vdk/config in the spirit of ~/.aws/config) but we will see.

The tests themselves use environment variables because it's the easiest way to pass the configuration values in the test.

But the vdk sql-query re-uses the standard configuration process of VDK so whatever we decide to do it in the future it would continue to work without any changes here.

Add vdk sql-query command which will use the same database as the rest
of the SQL interfaces (SQL steps, job_input methods)

This will replace the per-db commends like vdk impala-query, vdk
sqlite-query which are just problematic.

This would enable to generalize a lot of the tutorials/examples we have
which currently give impression they can be used only wiht "sqlite"
(sqlite-query) or trino. Which is primary driver for doing the change
now.

The command would be used in functional tests where you want to query
the default db without needing ot create a job for that.

The command for now is hidden (hidden feature) since there are a few
TODOs that would be resolved in subsequent PRs.
Add vdk sql-query command which will use the same database as the rest
of the SQL interfaces (SQL steps, job_input methods)

This will replace the per-db commends like vdk impala-query, vdk
sqlite-query which are just problematic.

This would enable to generalize a lot of the tutorials/examples we have
which currently give impression they can be used only wiht "sqlite"
(sqlite-query) or trino. Which is primary driver for doing the change
now.

The command would be used in functional tests where you want to query
the default db without needing ot create a job for that.

The command for now is hidden (hidden feature) since there are a few
TODOs that would be resolved in subsequent PRs.
@antoniivanov antoniivanov merged commit 2c3d492 into main Aug 9, 2023
7 checks passed
@antoniivanov antoniivanov deleted the person/aivanov/vdk-query-cmd branch August 9, 2023 07:05
DeltaMichael pushed a commit that referenced this pull request Aug 14, 2023
This reverts commit 2c3d492.

Why

This commit causes a circular dependency in the following modules

standalone_data_job -> builtin_hook_impl -> query_command_plugin -> standalone_data_job

What

Revert the commit

How was this tested

Ran vdk-ipython tests locally, because they
caught the circular dependency initially

What kind of change is this

Bugfix
DeltaMichael added a commit that referenced this pull request Aug 14, 2023
## Why

This reverts commit 2c3d492, because it
causes a circular dependency in the following modules

standalone_data_job -> builtin_hook_impl -> query_command_plugin ->
standalone_data_job

## What

Revert the commit

## How was this tested

Ran vdk-ipython tests locally, because they caught the circular
dependency initially

## What kind of change is this

Bugfix

Co-authored-by: Dilyan Marinov <mdilyan@vmware.com>
antoniivanov added a commit that referenced this pull request Sep 12, 2023
This was initially introduced in #2512 and reverted in #2549

The underlying issue of cyclic dependencies was resoveld by making the
StandaloneDataJobFactory import locally function scoped and not module
scope
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants