-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dbt init
Interactive profile creation
#3625
dbt init
Interactive profile creation
#3625
Conversation
This is very much draft @jtcohen6 but would nevertheless appreciate your review on A. the behaviour and B. the implementation approach. I'm struggling to work out how to easily run the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NiallRees This is slick!!
I left some comments below where I was working through each step in my mind. Then I took a step back, thought through the ideal flow, and threw together a quick flowchart (whimsical link):
What do you think? In particular, I'd love to:
- Prefer
click.prompt()
over CLI flags/args wherever possible - Include backwards-compatible behavior for the old way (
sample_profiles.yml
), while adapter plugin maintainers upgrade to the new way (target_options.yml
) - Ensure that
dbt init
never fails or raises an error. It's fine if it needs to skip steps due to missing information
I think yes to all I believe this addresses all three of your points. |
core/dbt/clients/yaml_helper.py
Outdated
@@ -1,17 +1,16 @@ | |||
import dbt.exceptions | |||
from typing import Any, Dict, Optional | |||
import yaml | |||
import yaml.scanner | |||
import oyaml as yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I replaced yaml with oyaml in order to retain ordering when prompting for user input. Rather than keep both I just replaced every reference with oyaml. It may well be preferred that we leave all other imports as-is, let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kwigley do you have any thoughts here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the switch from yaml
to oyaml
has broken one highly specific integration test, which checks the sorting behavior of the toyaml
Jinja context method. The sort_keys
argument is not being respected by oyaml.safe_dump
. Glad we have a test for it!
I think my preference would probably be to avoid switching from yaml
to oyaml
wherever possible. I'm also wondering if there's another way we can preserve prompt order for target_options.yml
, even if it requires an extra attribute
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would seem my reasoning for using oyaml was misplaced - I've removed it and apart from some changes to the order of dumped yaml in profiles.yml
the rest of the behaviour is identical. So the order of the questions to the user is still the order of the keys in the target_options.yml
. Hooray!
Ready for your 👀 again @jtcohen6. Once I've got a thumbs up on what this is doing I'll get some integration tests written up. Still curious for your thoughts on running |
Great work here @NiallRees! It's so cool to see this coming together, and in a way that's honestly fun to use. Some misc feedback:
@leahwicz I'd be curious to get your take on this, since you and I did some work on
|
@jtcohen6 @leahwicz Just so I know where to head - should this PR need to concern itself with this, given that you've said dbt Cloud doesn't use the init task currently? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NiallRees Amazing work :)
I only managed to find one bug, which is (I think) a quick fix in the create_profile_using_profile_template
method.
@leahwicz I think this is ready for review from an engineer on the Core team! The big question for me is around leveraging dbt.clients.system
methods in lieu of of direct file access.
else: | ||
return True | ||
|
||
def create_profile_using_profile_template(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I believe this method needs
profile_name
passed as an argument, from therun
entry point - This method will raise an exception if the template file is improperly formatted (e.g. missing a top-level key). It's tough to debug, since dbt doesn't log anything. What do you think of putting the call to
create_profile_using_profile_template
in a try/except that falls back to standard profile prompting/creation if it fails for any reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It retrieves the
profile_name
fromprofile_template.yml
, I'm not sure it needs to be passed from the run entry point - there is a possibility thatprofile_template.yml
's profile name differs from that of the project. - Good idea
@@ -2,12 +2,12 @@ | |||
# Name your project! Project names should contain only lowercase characters | |||
# and underscores. A good package name should reflect your organization's | |||
# name or the intended use of these models | |||
name: 'my_new_project' | |||
name: '{project_name}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
to connect to your database. You can find this file by running: | ||
|
||
{open_cmd} {profiles_path} | ||
Your new dbt project "{project_name}" was created! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(low priority)
This message feels like lot of text after the interactive bite-sized chunks. I wonder if we can do something cool with click
, to either:
- clear the terminal, leaving only the welcome message
- prompting with each item one by one, so that the user "acknowledges" each: project created!, here's a link to the docs, need help?, happy modeling!
core/dbt/clients/yaml_helper.py
Outdated
@@ -1,17 +1,16 @@ | |||
import dbt.exceptions | |||
from typing import Any, Dict, Optional | |||
import yaml | |||
import yaml.scanner | |||
import oyaml as yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the switch from yaml
to oyaml
has broken one highly specific integration test, which checks the sorting behavior of the toyaml
Jinja context method. The sort_keys
argument is not being respected by oyaml.safe_dump
. Glad we have a test for it!
I think my preference would probably be to avoid switching from yaml
to oyaml
wherever possible. I'm also wondering if there's another way we can preserve prompt order for target_options.yml
, even if it requires an extra attribute
return False | ||
logger.debug(f"No sample profile found for {adapter}.") | ||
else: | ||
with open(sample_profiles_path, "r") as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and elsewhere, we probably want to lead on the dbt.clients.system
module for cross-OS support / error handling. I'm thinking about load_file_contents
and write_file
in particular... though we'll need to adjust that method to support the r+
and a
modes you're using here.
cc @leahwicz: I'm pretty fuzzy on this stuff, so you should correct me if I'm wrong here. My sense is, the more we can lean on clients.system
methods for all file operations, the better-served we'll be in a world with storage adapters etc. Alternatively, we could simply say that since init
is a CLI-only task, we don't care about needing to support it outside of the local file system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So two things being unpacked here:
1: Should we use dbt.clients.system
for file read/writes that will work well across platforms?
This makes sense, although I fear that we kind of re-invented the wheel when we made that module. Python 3.4+ contains the pathlib
module which does the same thing (and a lot more). Assuming you only need basic read/writes I would highly suggest using that instead of dbt.clients.system
2: Does it make sense to support SAs here once they're fully available?
Probably not. I guess there might be an edge case where a user would want to bootstrap an adapter skeleton into a remote filesystem or something... but it's the edgiest of edge cases.
TL;DR pathlib
FTW!
self.copy_starter_repo(project_name) | ||
os.chdir(project_name) | ||
with open("dbt_project.yml", "r+") as f: | ||
content = f"{f.read()}".format( | ||
project_name=project_name, | ||
profile_name=project_name | ||
) | ||
f.seek(0) | ||
f.write(content) | ||
f.truncate() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's think about the "programmatic" entry-points to
init
, e.g. for the dbt Cloud IDE (which doesn't use the realinit
task today, but for consistency's sake, I'd feel better if it could). Do we need to preserve a "streamlined," CLI flag-based version ofinit
that just scaffolds the basic file structure, using a provided project name, and doesn't worry aboutprofiles.yml
at all?
@jtcohen6 @leahwicz Just so I know where to head - should this PR need to concern itself with this, given that you've said dbt Cloud doesn't use the init task currently?
I think you've done it! This snippet of code is exactly what we'd want in the "programmatic" version. I could imagine wrapping this into a create_project_files
method, re-adding a flag to init like --starter-project-only
, and then supporting this workflow as:
def run(self):
"""Entry point for the init task."""
if self.args.starter_project_only:
project_name = self.args.starter_project_only
self.create_project_files(project_name)
return
... otherwise proceed interactively ...
dbt init --starter-project-only my_cool_project_name
Or, the equivalent interface into the dbt-core API ;)
@leahwicz What do you think? I don't think we need to make that change in this PR, but I'm satisfied knowing it would be a small lift.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good to go from my perspective. Amazing work, @NiallRees—we simply must have this in v1.0. The next big undertaking will be writing really clear documentation :)
@leahwicz Could we have a member of the Core team give this a code review? I'm particularly interested in acceptable use of native python file interaction, vs. accessing all local files via dbt.clients.system
module methods. I'll feel much more comfortable giving final approval + merging after an engineer has taken a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NiallRees It's a pleasure and a privilege to approve this PR. On behalf of a few thousand future initializers — thank you!
* Initial * Further dev * Make mypy happy * Further dev * Existing tests passing * Functioning integration test * Passing integration test * Integration tests * Add changelog entry * Add integration test for init outside of project * Fall back to target_options.yml when invalid profile_template.yml is provided * Use built-in yaml with exception of in init * Remove oyaml and fix tests * Update dbt_project.yml in test comparison * Create the profiles directory if it doesn't exist * Use safe_load * Update integration test Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com> automatic commit by git-black, original commits: 11436fe
Resolves #3462
Overhaul
dbt init
to interactively create a profile withinprofiles.yml
based on:profile_template.yml
if configured within the projecttarget_options.yml
and user inputsample_profiles.yml
profilein descending order of preference. The existing
profiles.yml
is updated using the currentdbt_project.yml
's profile name in the case of 2. and 3.Checklist
CHANGELOG.md
and added information about my change to the "dbt next" section.