Feature: Use dbt-core starter project as baseline for split projects #206

Merged

nicholasyager
Collaborator

Description and motivation

A common question for users of dbt-meshify is "why does the split project look ... odd?" In most cases, this is a byproduct of the existing split command creating only necessary directories and files in the split project. While functional, this is not the ideal experience for users.

This PR changes how we create projects. The split command now creates a new starter project in the target directory, adds all directories and files of interest, and then updates the starter dbt_project.yml file using the sub-project's configuration. The benefits of this approach are:

  1. Sub-projects have all the standard files and directories for a dbt project.
  2. Sub-projects have a familiar dbt_project.yml file.

Along the way, I also implemented the following:

  1. Added tests that confirm the presence of standard dbt directories in sub-projects on creation.
  2. Added a new Change type (DirectoryChange), a file manager (DirectoryManager), and a directory editor (DirectoryEditor) for interacting with directory structures instead of files. This involved some minor refactoring along the way.
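
For readers unfamiliar with the pattern, here is a minimal sketch of how a directory-level change and its editor could fit together, plus the kind of check the new tests perform. Only the class names DirectoryChange and DirectoryEditor come from this PR; the Operation enum, the field names, the apply method, and missing_standard_directories are hypothetical and may not match the actual implementation. The directory list mirrors the default paths in the example dbt_project.yml below.

```python
import shutil
import tempfile
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import List, Optional

# Default directories taken from the paths in the example dbt_project.yml.
STANDARD_DIRECTORIES = ("models", "analyses", "tests", "seeds", "macros", "snapshots")


class Operation(Enum):
    """Hypothetical set of operations a directory change can request."""

    ADD = "add"
    COPY = "copy"
    REMOVE = "remove"


@dataclass
class DirectoryChange:
    """One directory-level change. Field names are assumptions."""

    operation: Operation
    path: Path                     # directory the change targets
    source: Optional[Path] = None  # only used by COPY


class DirectoryEditor:
    """Applies DirectoryChange objects to the file system."""

    def apply(self, change: DirectoryChange) -> None:
        if change.operation is Operation.ADD:
            change.path.mkdir(parents=True, exist_ok=True)
        elif change.operation is Operation.COPY:
            if change.source is None:
                raise ValueError("COPY requires a source directory")
            shutil.copytree(change.source, change.path, dirs_exist_ok=True)
        elif change.operation is Operation.REMOVE:
            shutil.rmtree(change.path, ignore_errors=True)


def missing_standard_directories(project_root: Path) -> List[str]:
    """Return the standard dbt directories that are absent from a project."""
    return [name for name in STANDARD_DIRECTORIES if not (project_root / name).is_dir()]


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    editor = DirectoryEditor()
    for name in STANDARD_DIRECTORIES:
        editor.apply(DirectoryChange(Operation.ADD, root / name))
    print(missing_standard_directories(root))
```

Modelling directory operations as data, separate from the object that performs them, keeps them consistent with how file changes are already queued and applied.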

Resolves: #124
Resolves: #153

Example dbt_project.yml

# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'customers'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'split_proj'

# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths:
  - models
analysis-paths:
  - analyses
test-paths:
  - tests
seed-paths:
  - seeds
  - jaffle_data
macro-paths:
  - macros
snapshot-paths:
  - snapshots

clean-targets:         # directories to be removed by `dbt clean`
  - target
  - dbt_packages
models:
  customers:
    +on_schema_change: append_new_columns
    example:
      +materialized: view
require-dbt-version:
  - '>=1.7.0'
  - <1.8.0
seeds:
  +schema: jaffle_raw
vars:
  truncate_timespan_to: '{{ current_timestamp() }}'

Note the preserved comments and formatting in the bog-standard parts of the YAML file.

nicholasyager added the enhancement label on May 5, 2024
nicholasyager self-assigned this on May 5, 2024
Collaborator

@dave-connors-3 dave-connors-3 left a comment

🦞

# was getting a weird serialization error from ruamel on this value
# it's been deprecated, so no reason to keep it
contents.pop("version")

# this one appears in the project yml, but i don't think it should be written
contents.pop("query-comment")
Collaborator

i remember this line being pretty /shrug at the time i wrote it -- any chance you tested removing this line and seeing if it worked? I feel like it may not be in the yml from the starter project, and therefore not something we'd need to worry about anymore.

Collaborator Author

Based on my tests, if we remove it from here, it will be rendered in the sub-project's dbt_project.yml. Is that the desired behavior?

Collaborator

no i wouldn't say so 😅 i think my thought was that it may no longer be in the YML we start with now that we're using the official starter project file. if it's still there, I'd vote not to render it
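
For reference, a minimal sketch of the behavior agreed on in this thread: keys that should not be rendered are dropped from the sub-project's configuration before it is written. The helper name drop_unrendered_keys and the UNRENDERED_KEYS constant are hypothetical. Unlike the bare pop calls quoted above, passing a default to pop avoids a KeyError when a key is absent.

```python
from typing import Any, Dict, Iterable

# Keys discussed in this thread that should not be written to the sub-project file.
UNRENDERED_KEYS = ("version", "query-comment")


def drop_unrendered_keys(
    contents: Dict[str, Any], keys: Iterable[str] = UNRENDERED_KEYS
) -> Dict[str, Any]:
    """Return a copy of the project config without the unrendered keys."""
    cleaned = dict(contents)  # shallow copy so the caller's dict is untouched
    for key in keys:
        cleaned.pop(key, None)  # the default makes a missing key a no-op
    return cleaned
```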

nicholasyager merged commit 31e1430 into dbt-labs:main on May 5, 2024
1 check passed
nicholasyager deleted the feature/better_split_projects branch on May 5, 2024 17:51