master merge for 0.4.1 release #849

Merged: rudolfix merged 69 commits into master on Dec 23, 2023

rudolfix
Collaborator

Description

Details in: #763

rudolfix and others added 30 commits November 18, 2023 18:44
* Move destination modules to subfolder

* Mockup destination factory

* Destination factory replacing reference and dest __init__

* Update factories

* Defer duckdb credentials resolving in pipeline context

* Simplify destination config resolution

* capabilities are callable

* bigquery, athena factories

* Add rest of factories

* Cleanup

* Destination type vars

* Cleanup

* Fix test

* Create initial config from non-defaults only

* Update naming convention path

* Fix config in bigquery location test

* Only keep non-default config args in factory

* Resolve duckdb credentials in pipeline context

* Cleanup

* Union credentials arguments

* Common tests without dest dependencies

* Forward all athena arguments

* Delete commented code

* Reference docstrings

* Add deprecation warning for credentials argument

* Init docstrings for destination factories

* Fix tests

* Destination name in output

* Correct exception in unknown destination test

---------

Co-authored-by: Marcin Rudolf <rudolfix@rudolfix.org>
change config attribute to platform_dsn
add execution context info to pipeline trace
add pipeline name to pipeline trace
* basic schema freezing

* small changes

* temp

* add new schema update mode

* fix linting errors and one bug

* move freeze code to schema

* some work on schema evolution modes

* add tests

* small tests change

* small fix

* fix some tests

* add global override for schema evolution

* finish implementation of global override

* better tests

* carry over schema settings on update

* add tests for single values

* small changes to tests and code

* fix small error

* add tests for data contract interaction

* fix tests

* some PR work

* update schema management

* fix schema related tests

* add nice schema tests

* add docs page

* small test fix

* smaller PR fixes

* more work

* tests update

* almost there

* tmp

* fix freeze tests

* cleanup

* create data contracts page

* small cleanup

* add pydantic dep to destination tests

* rename contract settings

* rename schema contract dict keys

* some work

* more work...

* more work

* move checking of new tables into extract function

* fix most tests

* fix linter after merge

* small cleanup

* post merge code updates

* small fixes

* some cleanup

* update docs

* makes bumping version optional in Schema, preserves hashes on replace schema content

* extracts on single pipeline schema

* allows controlling relational normalizer descent with send

* refactors data contract apply to generate filters instead of actual filtering

* detects if a bytes string possibly contains PUA (private use area) characters

* applies schema contracts in item normalizer, uses binary stream, detects PUA to skip decoding

* methods to remove and rename arrow columns, need arrow 12+

* implements contracts in extract, fixes issues in apply hints, arrow data filtering still missing

* always uses pipeline schema when extracting

* returns new items count from buffered write

* bumps pyarrow to 12, temporarily removes snowflake extra

* fixes arrow imports and normalizer config

* fixes normalizer config tests and pipeline state serialization

* normalizes arrow tables before saving

* adds validation and model synth for contracts to pydantic helper

* splits extractor into files, improves pydantic validator

* runs tests on ci with minimal dependencies

* fixes deps in ci workflows

* re-adds snowflake connector

* updates pydantic helper

* improves contract violation exception

* splits source and resource in extract, adds more tests

* temp disable pydantic 1 tests

* fixes generic type parametrization on 3.8

---------

Co-authored-by: Marcin Rudolf <rudolfix@rudolfix.org>
# Conflicts:
#	dlt/pipeline/pipeline.py
* add schema ancestors

* remove name attribute and init arg from dltsource

* fix 2 tests

* fix statekey related errors

* pr fixes

* revert changes on validate dict

* fix one test
rudolfix and others added 19 commits December 3, 2023 23:19
step info (extract, normalize, load) refactor
* fix documentation typos

* test vars

* test custom env variable

* test custom env variable

* fix the skip decorator

* skip sample with Mongo client

* add comments
Co-authored-by: Jorrit Sandbrink <sandbj01@heiway.net>
* keeps metrics of closed data writer files

* fixes handling of multiple load storages in normalize

* separates immutable job id from actual job file name

* allows to import files into data item storage

* import parquet files into data items storage in normalize

* adds buffered writer tests

* bumps to alpha 0.4.1a1

* makes all injection contexts thread affine, except config providers

* tests running parallel pipelines in thread pool

* allows to set start method for process executor

* adds thread id to dlt log

* improves parallel run test

* fixes None in toml config writer

* adds parallel asyncio test

* updates performance docs
* adjusted black exclude logic

* comment job for testing purposes

* add matrix strategy

* install make on Windows runner

* move make install up

* specify bash shell

* python-version assignment

* comment

* check poetry install

* poetry shell command

* echo PATH var

* echo $GITHUB_PATH

* check make install

* check poetry install

* comment make install

* pwsh

* echo commands

* show make shell

* list dir

* ls

* ls

* echo $PATH

* extend path

* windows python matrix

* extend PATH

* upgrade connectorx

* reversed connectorx upgrade

* extended matrix

* PATH extension

* PATH extension

* cached venv key

* cached venv key

* removed debugging statement

* uncomment step

---------

Co-authored-by: Jorrit Sandbrink <sandbj01@heiway.net>
* conn string case-insensitivity + driver specification

* added missing = sign

* renamed odbc_driver to driver

---------

Co-authored-by: Jorrit Sandbrink <sandbj01@heiway.net>
* exclude docs group from poetry install

* adjust step name

---------

Co-authored-by: Jorrit Sandbrink <sandbj01@heiway.net>
* allows to remove incremental via EMPTY properly

* adds data writer metrics

* adds reference async generator tests

* adds extract metrics

* fixes data_tables incomplete columns bug

* fixes writing empty files in extract

* adds metrics to normalize

* fixes wrong table rename in test

* adds created and last modified metrics to writers + tests

* fixes workflow credentials

* improves shapes of traces

* more robust deleting of files in filesystem dest

* adds finished_at to step info

* fixes Pydantic annotated synth and type hint detection

netlify bot commented Dec 22, 2023

Deploy Preview for dlt-hub-docs canceled.

🔨 Latest commit: ebc250b
🔍 Latest deploy log: https://app.netlify.com/sites/dlt-hub-docs/deploys/6585e3039d68740008c9f8a5

@rudolfix rudolfix marked this pull request as ready for review December 23, 2023 12:08
@rudolfix rudolfix merged commit 84816c5 into master Dec 23, 2023
56 checks passed
7 participants