Timle/dbt deps tarball (branch rename) #6302

Closed
wants to merge 39 commits into from

timle2
Contributor

@timle2 timle2 commented Nov 22, 2022

Continued from #4689

Continued from #4220

Revision 3 added Nov 6 2022

In support of issue #4205, "[Feature] add new dbt.deps type: url to internally hosted tarball"

Proposed solution for feature request 4205

Description

Enable direct linking to tarball urls in packages.yml, for example:

# manufactured test, since you'd want to use hub to install these
# public tarball used here as example only!
# this would usually be a tarball hosted on an internal network
packages:
  - tarball: https://codeload.github.com/dbt-labs/dbt-utils/tar.gz/0.6.5
    name: 'dbt_utils_065'
    version: 0.6.5

  - tarball: https://codeload.github.com/dbt-labs/dbt-utils/tar.gz/0.6.4
    name: 'dbt_utils'
    version: 0.9.2

Rationale:

  • dbt projects that are self-hosted in larger enterprise environments often have no connection to the internet, so dbt Hub won't work.
  • dbt users in larger enterprise environments like to build internal private packages for non-public use (to help other dbt users in the company with specific functionality).
  • git package install is not a good option at scale for larger enterprise environments.
  • An internal file hosting service (such as an internal Artifactory service or internal cloud storage buckets) can easily be configured to host packages for install during deployment, so let's give dbt users a way to install from a direct tar file link.
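To make the idea concrete, here is a rough sketch of what installing from a tar file could involve, following the unpack / scan for package dir / copy to destination pattern mentioned in the commit notes. This is not the code in this PR or dbt's actual implementation: `install_tarball`, its parameters, and the use of `dbt_project.yml` to locate the package directory are assumptions for illustration. The download step is left out, so the function starts from a tarball already on disk.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def install_tarball(tar_path, name, packages_dir):
    """Hypothetical helper (not dbt's code): unpack, scan, copy."""
    dest = Path(packages_dir) / name
    with tempfile.TemporaryDirectory() as tmp:
        # 1) unpack into a scratch directory
        #    (assumes a trusted, internally hosted tarball)
        with tarfile.open(tar_path, "r:*") as tar:
            tar.extractall(tmp)
        # 2) scan for the package dir: the shallowest folder
        #    that contains a dbt_project.yml
        matches = sorted(
            Path(tmp).rglob("dbt_project.yml"),
            key=lambda p: len(p.parts),
        )
        if not matches:
            raise ValueError(f"no dbt_project.yml found in {tar_path}")
        package_root = matches[0].parent
        # 3) copy to the destination, replacing any earlier install
        if dest.exists():
            shutil.rmtree(dest)
        shutil.copytree(package_root, dest)
    return dest
```

The destination folder here is taken from the user-supplied `name` only, with no version in it, which matches the direction described in the later commit notes.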

Sketching out doc changes here:
https://github.com/timle2/docs.getdbt.com/blob/dbt-docs-tarball-package-updates/website/docs/docs/building-a-dbt-project/package-management.md#tar-files

Checklist

  • I have signed the CLA
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • [TODO] I have updated the CHANGELOG.md and added information about my change

sha1: allow user to specify sha1 in packages.yaml, will only install if package matches
subdirectory: allow user to specify subdirectory of package in tarfile, if the package is a non standard structure (like with git subdirectory option)
will hit url multiple times instead
- remove tarfile passing, always use tempfile instead
- reorganize system.* functions, removing duplicative code
- more notes on current flow and structure, especially the need for the pattern of 1) unpack 2) scan for package dir 3) copy to destination.
- cleaning
removing sha1 check to simplify/mirror hub install pattern
to simplify/mirror hub install pattern
- removing sha1 check
- supply name/version to act as our 'metadata' source
simplify with goal of mirroring hub install pattern
- supporting subfolders like git packages, and sha1 checks are removed
- existing code from RegistryPinnedPackage (install() and download_and_untar()) performs the operations
- RegistryPinnedPackage install() and download_and_untar() are not currently set up as functions that can be shared across classes. They should be moved to dbt.deps.base or to a dbt.deps.common file. Need dbt Labs feedback on how to proceed (or leave as is).
more complex features have been removed (sha1, subfolder) so testing is much simpler!
remove version from package folder name
I'm on the fence about whether this is the right approach, but it seems the most sensible after some thought
@timle2 timle2 requested a review from a team November 22, 2022 00:43
@timle2 timle2 requested review from a team as code owners November 22, 2022 00:43
@cla-bot cla-bot bot added the cla:yes label Nov 22, 2022
@timle2
Contributor Author

timle2 commented Nov 22, 2022

debugging #4689 where it was suggested that I try a different branch name

@timle2
Contributor Author

timle2 commented Nov 22, 2022

thanks for the assist here @dbeatty10 - looks like #4689 has started to run the tests properly again. Should I continue the PR here (new location, updated branch name) or resume at #4689 (original location, original branch name)?

@dbeatty10
Contributor

@timle2 awesome you got it working again!

You can close this one and resume with the original pull request location and original branch name.

@timle2 timle2 closed this Nov 22, 2022