add load_id to _dlt_versions table #321

Closed
rudolfix opened this issue May 9, 2023 · 0 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

Collaborator

rudolfix commented May 9, 2023

Background
We want users to be able to link loads in _dlt_loads to rows in _dlt_versions, so they can identify which load updated the schema without enabling the trace (https://dlthub.com/docs/running-in-production/running#inspect-and-save-the-load-info-and-trace).
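
A minimal sketch of the lookup this would enable, using an in-memory SQLite database as a stand-in for a real destination. The table layouts are simplified assumptions (only a few illustrative columns are shown), and the helper names `build_demo_db` and `loads_that_changed_schema` are hypothetical:

```python
import sqlite3


def build_demo_db():
    """Create simplified stand-ins for _dlt_loads and _dlt_versions (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE _dlt_loads (load_id TEXT NOT NULL, schema_name TEXT, status INTEGER)"
    )
    # load_id is nullable here, so rows written before the migration can stay as they are
    conn.execute(
        "CREATE TABLE _dlt_versions (version INTEGER, schema_name TEXT, load_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO _dlt_loads VALUES (?, ?, ?)",
        [
            ("1683600000.1", "demo", 0),
            ("1683600100.2", "demo", 0),
            ("1683600200.3", "demo", 0),
        ],
    )
    conn.executemany(
        "INSERT INTO _dlt_versions VALUES (?, ?, ?)",
        [
            (0, "demo", None),            # stored before load_id existed
            (1, "demo", "1683600000.1"),  # first load changed the schema
            (2, "demo", "1683600200.3"),  # third load changed it again
        ],
    )
    return conn


def loads_that_changed_schema(conn):
    """Return (load_id, schema_name, version) for every load that stored a schema version."""
    query = """
        SELECT l.load_id, v.schema_name, v.version
        FROM _dlt_loads AS l
        JOIN _dlt_versions AS v ON v.load_id = l.load_id
        ORDER BY v.version
    """
    return conn.execute(query).fetchall()
```

In this sketch the second load does not appear in the result because it stored no schema version, and the version row with a NULL load_id is skipped by the inner join.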

Tasks

    • Insert load_id into the _dlt_versions table; see SqlJobClientBase.
    • Change the _dlt_versions schema definition (schema/utils.py) to include the new column. Make sure the column is NULLABLE so old dlt installations can migrate.
    • Increase the schema engine version and write a migration that adds the column to existing schemas (schema/utils.py).
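
The migration task above could be sketched roughly as follows. This is not the code in schema/utils.py: the dictionary layout, the function name `migrate_schema_sketch`, and the engine version numbers are all assumptions made for illustration.

```python
import copy

# Placeholder values, NOT dlt's actual engine version numbers
OLD_ENGINE_VERSION = 1
NEW_ENGINE_VERSION = 2

# Assumed column definition; nullable so existing rows need no backfill
LOAD_ID_COLUMN = {"name": "load_id", "data_type": "text", "nullable": True}


def migrate_schema_sketch(schema_dict):
    """Return a migrated copy of a simplified schema dict (hypothetical layout)."""
    migrated = copy.deepcopy(schema_dict)
    if migrated["engine_version"] >= NEW_ENGINE_VERSION:
        # Already migrated: nothing to do
        return migrated
    versions_table = migrated["tables"].get("_dlt_versions")
    if versions_table is not None:
        # Add the column only if it is missing, keeping any existing definition
        versions_table["columns"].setdefault("load_id", dict(LOAD_ID_COLUMN))
    migrated["engine_version"] = NEW_ENGINE_VERSION
    return migrated
```

Working on a copy keeps the input untouched, which makes it easy to compare the schema before and after in a test. Running the function on an already migrated schema changes nothing.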

Tests

  1. Extend the existing tests for updating the schema and make sure the load_id is stored (test_job_client.py).
  2. Test the schema migration: mock the schema so that it does not have this column, load data, then load the schema again and load the data again to make sure the column was added.
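
The migration scenario in the second test can be illustrated with SQLite standing in for a destination. This is a sketch only, not the test that belongs in test_job_client.py; the table layout is simplified and the helpers `add_load_id_column` and `run_migration_scenario` are hypothetical.

```python
import sqlite3


def add_load_id_column(conn):
    """Hypothetical migration step: add a nullable load_id column if it is missing."""
    # PRAGMA table_info yields one row per column; index 1 is the column name
    columns = [row[1] for row in conn.execute("PRAGMA table_info(_dlt_versions)")]
    if "load_id" not in columns:
        # No NOT NULL constraint, so rows that already exist get NULL
        conn.execute("ALTER TABLE _dlt_versions ADD COLUMN load_id TEXT")


def run_migration_scenario():
    """Old layout -> first load -> migrate -> second load; return the stored rows."""
    conn = sqlite3.connect(":memory:")
    # Old layout: the table has no load_id column yet
    conn.execute("CREATE TABLE _dlt_versions (version INTEGER, schema_name TEXT)")
    conn.execute("INSERT INTO _dlt_versions VALUES (1, 'demo')")
    # Migrate; the second call checks that the step is safe to repeat
    add_load_id_column(conn)
    add_load_id_column(conn)
    # Second load stores a new schema version together with its load_id
    conn.execute(
        "INSERT INTO _dlt_versions (version, schema_name, load_id) VALUES (?, ?, ?)",
        (2, "demo", "1683600000.1"),
    )
    return conn.execute(
        "SELECT version, load_id FROM _dlt_versions ORDER BY version"
    ).fetchall()
```

The row written before the migration keeps a NULL load_id, while the row written afterwards carries the load_id of the load that stored it.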