-
Notifications
You must be signed in to change notification settings - Fork 172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/fsspec destination #342
Conversation
✅ Deploy Preview for dlt-hub-docs canceled.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks good! decorator to create dynamic type hints seems to work well.
we'll need a few tests for it for a few mock credentials cases, somewhere in tests/common/configuration
8f918c2
to
61b5629
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a few changes
- default buckets are defined in
tests/config.toml
ENV override will still work - changes the scheme from gcs to gs
- some fixes when passing gcp credentials
my biggest issue is lack of test coverage, the client is tested in replace mode. you do not test if append is adding more files. not sure if you test if replace is testing if files are deleted.
you do
I do not see you testing if file layout is exactly as in ticket spec
I do not see tests for merge special case
Please add tests where filesystem is used as destination in pipelines. Maybe some of existing tests in load/pipelines could be converted (separating parts that require sql_client, you could look if files are present instead). it is some work but without that it is hard to say if this thing works
|
||
# - name: Install self | ||
# run: poetry install --no-interaction | ||
|
||
- run: | | ||
poetry run pytest tests/load tests/cli --ignore=tests/load/bigquery -k '(not bigquery)' | ||
poetry run pytest tests/load/filesystem tests/cli --ignore=tests/load/bigquery -k '(not bigquery)' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove this change after all tests passed
@@ -117,6 +117,9 @@ def _resolve_config_fields( | |||
unresolved_fields: Dict[str, Sequence[LookupTrace]] = {} | |||
|
|||
for key, hint in fields.items(): | |||
if key in config.__hint_resolvers__: | |||
# Type hint for this field is created dynamically | |||
hint = config.__hint_resolvers__[key](config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please raise a nice configuration error when the key is not available, the key name the available hint resolves and the configuration type being resolved should be present. the message should be clear so user can fix the problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added an exception with message when the decorator is used without matching fields, e.g. the message:
The config spec ConfigWithInvalidDynamicType has dynamic type resolvers for fields: ['a', 'b', 'c'] but these fields are not defined in the spec.
When using @resolve_type() decorator, Add the fields with 'Any' or another common type hint, example:
>>> class ConfigWithInvalidDynamicType(BaseConfiguration)
>>> a: Any
>>> b: Any
>>> c: Any
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not clear what you mean with the other cases. The standard config fields missing exception is raised as normal when e.g. aws credentials are missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks good now, thanks!
bba25c2
to
1f1c489
Compare
d115f6d
to
e74df63
Compare
def create_merge_job(self, table_chain: Sequence[TTableSchema]) -> NewLoadJob: | ||
return None | ||
|
||
def complete_load(self, load_id: str) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking at our use cases and actually it would be useful to create a file here. the reason: when the file is present we now that load to the bucket was finalized and we can use the loaded data. You could probably execute LoadFileSystemJob. write_disposition: append, the file can be empty, and the table name is LOADS_TABLE_NAME (_dlt_loads)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
63a97ec#diff-524215919bce1cfd1cd9bae6d4ae82c9f753d4b3ffb1215a418dd06277113a5dR107-R110
Will this do? Just creates empty file named {schema_name}.{_dlt_loads}.{load_id}
. Figure extension and file ID is not needed since it's just a marker per load ID?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it is not needed. thx!
14ec1b8
to
a0fe702
Compare
0f6315e
to
18c4a1e
Compare
18c4a1e
to
2597381
Compare
@@ -117,6 +117,9 @@ def _resolve_config_fields( | |||
unresolved_fields: Dict[str, Sequence[LookupTrace]] = {} | |||
|
|||
for key, hint in fields.items(): | |||
if key in config.__hint_resolvers__: | |||
# Type hint for this field is created dynamically | |||
hint = config.__hint_resolvers__[key](config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks good now, thanks!
def create_merge_job(self, table_chain: Sequence[TTableSchema]) -> NewLoadJob: | ||
return None | ||
|
||
def complete_load(self, load_id: str) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it is not needed. thx!
WIP