Move starter project into `dbt` repo #3474

leahwicz · 2021-06-18T12:49:24Z

Addresses issue #3005

Description

I moved the starter_project from its own repo into this repo. I copied the starter project from the dbt-yml-config-version-2 tag.

This will stop cloning the project from the other repo and instead will package the starter project within the dbt release and then copy it to the given directory.

Checklist

I have signed the CLA
I have run this code in development and it appears to resolve the stated issue
This PR includes tests, or tests are not required/relevant for this PR
I have updated the CHANGELOG.md and added information about my change to the "dbt next" section.

leahwicz · 2021-06-18T12:51:41Z

@jtcohen6 - do we need a changelog for changes that aren't customer facing?

@kwigley - can we test packaging this up for distribution to see if it works? I just really want to check that the files are included.

leahwicz · 2021-06-18T12:53:59Z

core/setup.py

-            'py.typed',
-        ]
-    },
+    include_package_data = True,


A couple articles I read said to use include_package_data instead of package_data with the MANIFEST.in file so that's why I made this change. You don't have to list the files then.

Example:
https://newbedev.com/how-include-static-files-to-setuptools-python-package

I really like it! We did the same in dbt-spark a few months ago: dbt-labs/dbt-spark#151. To quote a wise person, "This is much easier."

gshank

LGTM!

jtcohen6

Nice work!! I just took this for a spin, the new approach seems great.

Did we lose the model files (models/example/*.sql) themselves?
Have you checked in with our Experience colleagues about whether this approach is something they could leverage in the future (rather than reimplementing)?
I left some stylistic comments, things I've long been annoyed by but never mustered the energy to change. Now's our chance!
I also left a comment around some related-but-separate improvements to init. It shouldn't be considered blocking to this PR, but if we could find a neat way to do it, it will have been a long time coming.
While taking this for a spin, I played around with a fix to dbt init --adapter default to the available adapter #2814. I'll open a separate PR for that based off this branch.

core/dbt/include/starter_project/dbt_project.yml

core/dbt/include/starter_project/models/example/schema.yml

jtcohen6 · 2021-06-21T20:24:45Z

core/dbt/include/starter_project/dbt_project.yml

+name: 'my_new_project'
+version: '1.0.0'
+config-version: 2
+
+# This setting configures which "profile" dbt uses for this project.
+profile: 'default'


I would love if we could figure out a way to automatically populate name and profile with the project_name argument supplied to dbt init. Right now, that project_name arg is just used to name the file directory, but not the actual project name. Pretty confusing! Also, we tend to discourage using a profile named default, and yet here we are...

I know all we're really doing here is shutil.copytree. Is there any sane way to try editing the files after copying?

For reference, I was just triaging a related issue, so I mentioned this over there as well: #3462 (comment)

We have a lot of expertise with jinja templates :). Perhaps we could make a dbt_project jinja template and substitute the project name. Or just do a python substitution if it's simpler. I guess jinja would only make sense if there are other ways we could leverage it to customize the project file.

Heh, you're not wrong. Back in the day, we did a lot of customization for client projects over here: https://github.com/fishtown-analytics/dbt-init

The purpose there was cross-applying some pretty opinionated configs for a given adapter. I think, for our purposes, we shouldn't go much further beyond name and profile.

I'll take a crack at doing this in another PR to follow this one up. So we want project_name to be the name and profile field in the dbt_project.yml file (just to verify)?

Yes, that's right!

jtcohen6 · 2021-06-21T20:29:01Z

core/setup.py

-            'py.typed',
-        ]
-    },
+    include_package_data = True,


I really like it! We did the same in dbt-spark a few months ago: dbt-labs/dbt-spark#151. To quote a wise person, "This is much easier."

core/dbt/include/starter_project/README.md

leahwicz · 2021-06-21T23:15:24Z

* Did we lose the model files (`models/example/*.sql`) themselves?

🤦‍♀️ I had my global .gitignore file set to filter out sql and I missed this so good catch

Co-authored-by: Jeremy Cohen <jeremy@fishtownanalytics.com>

kwigley

👍 looks good!

I confirmed that the source and wheel distributions include the starter project files. In the dbt/core directory, I ran python setup.py sdist bdist_wheel which creates a source (*.tar.gz) and a wheel (*.whl) distribution. I installed these in separate temporary virtualenvs and confirmed the project existed and init command worked for both.

I think if we want a comprehensive project initialization, we could consider using a scaffolding tool like cookiecutter. This might be overkill for the current starter project thought.

jtcohen6 · 2021-06-22T14:22:31Z

I think if we want a comprehensive project initialization, we could consider using a scaffolding tool like cookiecutter. This might be overkill for the current starter project thought.

We've had this idea before: dbt-labs/dbt-init#31. I'm open to it! Agree that it's probably not necessary for now, but might be a good idea if we want init to do something more complex than just copy files (#3462). I also like how interactive cookiecutter is with setup Q&As, it inspired some of my thinking about an interactive install experience over in #3361.

Addresses issue #3005 automatic commit by git-black, original commits: c794600

leahwicz added 6 commits June 11, 2021 09:15

Adding starter repo

3c54ef4

Moved starter project to the include dir

01ba5f0

Changing from clone to copy for the project

9a7987d

Added newline

9d6a77d

Fixing linting errors

b1d7705

More linting fixes

fca7aa8

cla-bot bot added the cla:yes label Jun 18, 2021

leahwicz requested review from a team, gshank, kwigley and jtcohen6 and removed request for a team June 18, 2021 12:49

leahwicz commented Jun 18, 2021

View reviewed changes

leahwicz temporarily deployed to Postgres June 18, 2021 12:54 Inactive

leahwicz temporarily deployed to Bigquery June 18, 2021 12:54 Inactive

leahwicz had a problem deploying to Bigquery June 18, 2021 12:54 Failure

leahwicz temporarily deployed to Redshift June 18, 2021 12:54 Inactive

leahwicz temporarily deployed to Snowflake June 18, 2021 12:54 Inactive

gshank reviewed Jun 21, 2021

View reviewed changes

jtcohen6 reviewed Jun 21, 2021

View reviewed changes

core/dbt/include/starter_project/README.md Outdated Show resolved Hide resolved

jtcohen6 mentioned this pull request Jun 21, 2021

Touch up init, fix missing adapter errors #3483

Merged

4 tasks

Adding missed sql files

522b2f6

leahwicz and others added 2 commits June 21, 2021 19:16

Fixing up indentation

1f69952

Co-authored-by: Jeremy Cohen <jeremy@fishtownanalytics.com>

Fixing up community URL

822c277

Co-authored-by: Jeremy Cohen <jeremy@fishtownanalytics.com>

leahwicz temporarily deployed to Postgres June 21, 2021 23:22 Inactive

leahwicz temporarily deployed to Bigquery June 21, 2021 23:22 Inactive

leahwicz temporarily deployed to Snowflake June 21, 2021 23:22 Inactive

leahwicz temporarily deployed to Redshift June 21, 2021 23:22 Inactive

Fixing more spacing in yml files

7f28a4d

leahwicz temporarily deployed to Postgres June 22, 2021 13:21 Inactive

leahwicz temporarily deployed to Redshift June 22, 2021 13:21 Inactive

leahwicz temporarily deployed to Bigquery June 22, 2021 13:21 Inactive

leahwicz temporarily deployed to Snowflake June 22, 2021 13:21 Inactive

kwigley approved these changes Jun 22, 2021

View reviewed changes

leahwicz merged commit c794600 into develop Jun 22, 2021

leahwicz deleted the leahwicz/starter-repo-move branch June 22, 2021 15:03

jtcohen6 mentioned this pull request Jun 23, 2021

Move starter project for init task into dbt repo #3005

Closed

jtcohen6 mentioned this pull request Jul 6, 2021

dbt init: add gitignore, ignore pycache #3536

Merged

3 tasks

iknox-fa pushed a commit that referenced this pull request Feb 8, 2022

Move starter project into dbt repo (#3474)

fa0f907

Addresses issue #3005 automatic commit by git-black, original commits: c794600

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Move starter project into `dbt` repo #3474

Move starter project into `dbt` repo #3474

leahwicz commented Jun 18, 2021

leahwicz commented Jun 18, 2021

leahwicz Jun 18, 2021

jtcohen6 Jun 21, 2021

kwigley Jun 22, 2021

gshank left a comment

jtcohen6 left a comment •

edited

jtcohen6 Jun 21, 2021

gshank Jun 21, 2021

jtcohen6 Jun 21, 2021

leahwicz Jun 22, 2021

jtcohen6 Jun 22, 2021

jtcohen6 Jun 21, 2021

leahwicz commented Jun 21, 2021

kwigley left a comment

jtcohen6 commented Jun 22, 2021

Move starter project into dbt repo #3474

Move starter project into dbt repo #3474

Conversation

leahwicz commented Jun 18, 2021

Description

Checklist

leahwicz commented Jun 18, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gshank left a comment

Choose a reason for hiding this comment

jtcohen6 left a comment • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

leahwicz commented Jun 21, 2021

kwigley left a comment

Choose a reason for hiding this comment

jtcohen6 commented Jun 22, 2021

Move starter project into `dbt` repo #3474

Move starter project into `dbt` repo #3474

jtcohen6 left a comment •

edited