How would we select scope of which projects and tables to document? #13

Closed
bastienboutonnet opened this issue Dec 22, 2020 · 3 comments
Labels: documentation, type: discussion


bastienboutonnet commented Dec 22, 2020

One thing we'll need to worry about fairly soon is how we manage scopes of dbt projects to document.

Some users might have one repo per dbt project and some might have several dbt projects in one repo (monorepo-like setup).

We'll probably want to let users list, in a YAML file or similar, the dbt projects (and possibly tables) to include in or exclude from the documentation task. It would probably look something like this:

dbt_projects:
    - dbt_dwh:
        - exclude: ['table_1', 'table_2']
    - dbt_prediction

This will probably change, and I'm not even sure that YAML is valid, but it captures the gist.
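For what it's worth, the snippet above should be valid YAML. A parser such as PyYAML's `yaml.safe_load` would be expected to return the structure written out by hand below (it is typed in here, not actual parser output). The sketch then flattens it into a simple "project name to excluded tables" mapping; `normalise_projects` is a hypothetical helper name, not something that exists in dbt-sugar.

```python
from typing import Any, Dict, List

# Hand-written equivalent of what a YAML parser should return for the snippet above.
parsed: Dict[str, Any] = {
    "dbt_projects": [
        {"dbt_dwh": [{"exclude": ["table_1", "table_2"]}]},
        "dbt_prediction",
    ]
}


def normalise_projects(config: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each dbt project name to the tables excluded from documentation."""
    scope: Dict[str, List[str]] = {}
    for entry in config.get("dbt_projects", []):
        if isinstance(entry, str):
            # A bare project name means "document everything in it".
            scope[entry] = []
            continue
        # Otherwise the entry is a one-key mapping: {project_name: [options]}.
        for project_name, options in entry.items():
            excluded: List[str] = []
            for option in options or []:
                excluded.extend(option.get("exclude", []))
            scope[project_name] = excluded
    return scope


scope = normalise_projects(parsed)
```

Note that the entries come back in two shapes (a mapping for `dbt_dwh`, a plain string for `dbt_prediction`), which is why the helper has to branch on the type. A layout with explicit `name` keys would avoid that.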


bastienboutonnet commented Dec 31, 2020

@virvirlopez I think this is going to be relevant to your question around where to store paths and stuff.

I was thinking of having a central config file that would look something like this:

defaults:
  sugar_cane: cane_1
  target: dev
sugar_canes:
  - name: cane_1
    dbt_projects:
      - name: dwh
        path: path
        excluded_tables:
          - table_a
  - name: cane_2
    dbt_projects:
      - name: dwh
        path: path
        excluded_tables:
          - table_a
      - name: prediction
        path: path

When parsed, this gives the following Python dict:

{
  "defaults": {
    "sugar_cane": "cane_1", 
    "target": "dev"
  }, 
  "sugar_canes": [
    {
      "dbt_projects": [
        {
          "path": "path", 
          "name": "dwh", 
          "excluded_tables": [
            "table_a"
          ]
        }
      ], 
      "name": "cane_1"
    }, 
    {
      "dbt_projects": [
        {
          "path": "path", 
          "name": "dwh", 
          "excluded_tables": [
            "table_a"
          ]
        }, 
        {
          "path": "path", 
          "name": "prediction"
        }
      ], 
      "name": "cane_2"
    }
  ]
}

The user would then call dbt-sugar with, say, a --sugar-cane argument (in our example that would be cane_2, the config with two dbt projects). This would make all the scope we need available: it tells dbt-sugar where the dbt profiles are, which tables are excluded, and all sorts of other things down the line.
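As a rough sketch of how that scope could be read from the parsed dict: look up the requested sugar cane by name, then drop the excluded tables for a given dbt project. The function names `get_sugar_cane` and `tables_in_scope`, and the `all_tables` input, are made up for illustration.

```python
from typing import Any, Dict, List

# Same shape as the parsed config shown above.
CONFIG: Dict[str, Any] = {
    "defaults": {"sugar_cane": "cane_1", "target": "dev"},
    "sugar_canes": [
        {
            "name": "cane_1",
            "dbt_projects": [
                {"name": "dwh", "path": "path", "excluded_tables": ["table_a"]},
            ],
        },
        {
            "name": "cane_2",
            "dbt_projects": [
                {"name": "dwh", "path": "path", "excluded_tables": ["table_a"]},
                {"name": "prediction", "path": "path"},
            ],
        },
    ],
}


def get_sugar_cane(config: Dict[str, Any], name: str) -> Dict[str, Any]:
    """Return the sugar cane entry with the given name, or raise KeyError."""
    for cane in config["sugar_canes"]:
        if cane["name"] == name:
            return cane
    raise KeyError(f"No sugar_cane named {name!r} in config")


def tables_in_scope(
    cane: Dict[str, Any], project_name: str, all_tables: List[str]
) -> List[str]:
    """Filter a project's tables down to those not listed in excluded_tables."""
    for project in cane["dbt_projects"]:
        if project["name"] == project_name:
            excluded = set(project.get("excluded_tables", []))
            return [table for table in all_tables if table not in excluded]
    # Projects not listed in the sugar cane are out of scope entirely.
    return []


cane = get_sugar_cane(CONFIG, "cane_2")
dwh_tables = tables_in_scope(cane, "dwh", ["table_a", "table_b"])
prediction_tables = tables_in_scope(cane, "prediction", ["table_a", "table_b"])
```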

We could then of course have a "default" config which would sit on top of the sugar_canes dict and point to one of the project configs, so that users are not required to point to that config for every run.
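A minimal sketch of that fallback, assuming the flag ends up being called `--sugar-cane` and the defaults live under a top-level `defaults` key as in the example config. `resolve_sugar_cane_name` is a hypothetical helper, not existing dbt-sugar code.

```python
import argparse
from typing import Any, Dict, List, Optional

# Only the part of the example config that matters for the fallback.
CONFIG: Dict[str, Any] = {
    "defaults": {"sugar_cane": "cane_1", "target": "dev"},
    "sugar_canes": [{"name": "cane_1"}, {"name": "cane_2"}],
}


def build_parser() -> argparse.ArgumentParser:
    """CLI parser with an optional --sugar-cane flag (flag name is an assumption)."""
    parser = argparse.ArgumentParser(prog="dbt-sugar")
    parser.add_argument("--sugar-cane", dest="sugar_cane", default=None)
    return parser


def resolve_sugar_cane_name(argv: List[str], config: Dict[str, Any]) -> Optional[str]:
    """Prefer the CLI value; otherwise fall back to defaults.sugar_cane."""
    args = build_parser().parse_args(argv)
    if args.sugar_cane is not None:
        return args.sugar_cane
    return config.get("defaults", {}).get("sugar_cane")


from_default = resolve_sugar_cane_name([], CONFIG)
from_cli = resolve_sugar_cane_name(["--sugar-cane", "cane_2"], CONFIG)
```

With this, a plain run picks up `cane_1` from the defaults, and passing the flag overrides it for that run only.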

I'm sure this is not perfect so I'd like to hear your thoughts.

@bastienboutonnet

#31 should get us in a great state but I think I would want to discuss the following:

One thing still left to do, I think, is to decide where this sugar_config.yml file lives on disk.

Right now it assumes a directory provided via the CLI, but I think we should have one of the following:

  • if dbt-sugar is run inside of a dbt project, maybe it shouldn't need any config?
  • if dbt-sugar is run outside of a dbt project, maybe we want to have this config file version controlled somewhere. If so, we need to require dbt-sugar to run inside of a folder dedicated to it, or have some ability to recurse up to find a sugar_config.yml file
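The "recurse up" option could be sketched roughly as below: walk from the current directory towards the filesystem root and stop at the first directory containing the file. The same helper can check for `dbt_project.yml` to guess whether we are inside a dbt project. `find_upwards` is a hypothetical name, and the temporary directory layout is only there to make the example self-contained.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_upwards(start: Path, filename: str) -> Optional[Path]:
    """Return the closest `filename` in `start` or any of its parents, else None."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


# Throwaway layout: <tmp>/sugar_config.yml and <tmp>/dbt_projects/dwh/dbt_project.yml
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "sugar_config.yml").write_text("defaults: {}\n")
    project_dir = root / "dbt_projects" / "dwh"
    project_dir.mkdir(parents=True)
    (project_dir / "dbt_project.yml").write_text("name: dwh\n")

    # Running from inside the dbt project: both files are discoverable.
    found_config = find_upwards(project_dir, "sugar_config.yml")
    config_found_at_root = found_config == root / "sugar_config.yml"
    inside_dbt_project = find_upwards(project_dir, "dbt_project.yml") is not None

    # Running from the repo root: we are not inside any dbt project.
    root_is_dbt_project = find_upwards(root, "dbt_project.yml") is not None
```

One design question this leaves open is which file wins when both are found; the sketch only reports what is there.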

@virvirlopez What do you think?

@bastienboutonnet

Done in #31
