#1470 source s3: review fixes
davydov-d committed Feb 24, 2023
1 parent fb7272e commit 7ed677d
Showing 2 changed files with 4 additions and 4 deletions.
@@ -160,8 +160,8 @@ def get_time_ordered_file_infos(self) -> List[FileInfo]:
 
     @property
     def _raw_schema(self) -> Mapping[str, Any]:
-        if self._user_input_schema != {}:
-            return deepcopy(self._user_input_schema)
+        if self._user_input_schema and isinstance(self._user_input_schema, dict):
+            return self._user_input_schema
         return self._auto_inferred_schema
 
     @property
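For illustration, here is a minimal, self-contained sketch contrasting the old and new checks. `SchemaHolder` and its constructor are invented stand-ins for the connector's real stream class, not the actual implementation:

```python
from copy import deepcopy
from typing import Any, Mapping


class SchemaHolder:
    """Stripped-down, hypothetical stand-in for the connector's stream class."""

    def __init__(self, user_input_schema: Any, auto_inferred_schema: Mapping[str, Any]) -> None:
        self._user_input_schema = user_input_schema
        self._auto_inferred_schema = auto_inferred_schema

    @property
    def _raw_schema_old(self) -> Mapping[str, Any]:
        # Before the fix: anything other than {} passes, even a non-dict,
        # and the schema is deep-copied on every access.
        if self._user_input_schema != {}:
            return deepcopy(self._user_input_schema)
        return self._auto_inferred_schema

    @property
    def _raw_schema_new(self) -> Mapping[str, Any]:
        # After the fix: only a non-empty dict counts as a user schema,
        # and it is returned without the copy.
        if self._user_input_schema and isinstance(self._user_input_schema, dict):
            return self._user_input_schema
        return self._auto_inferred_schema


# A schema that arrived as an unparsed JSON string (invented example):
holder = SchemaHolder('{"id": "integer"}', {"id": {"type": "integer"}})
print(holder._raw_schema_old)  # the raw string leaks through
print(holder._raw_schema_new)  # falls back to the auto-inferred schema
```

The practical effect of the change: a user schema that is not a non-empty dict no longer leaks through, and a valid dict schema is returned without a deepcopy on every access.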
4 changes: 2 additions & 2 deletions docs/integrations/sources/s3.md
@@ -120,7 +120,7 @@ As you can probably tell, there are many ways to achieve the same goal with path
 
 ## User Schema
 
-Providing a schema allows for more control over the output of this stream. Without a provided schema, columns and datatypes will be inferred from the first file in the bucket matching your path pattern and suffix. This will probably be fine in most cases but there may be situations you want to enforce a schema instead, e.g.:
+Providing a schema allows for more control over the output of this stream. Without a provided schema, columns and datatypes will be inferred from the first created file in the bucket matching your path pattern and suffix. This will probably be fine in most cases but there may be situations you want to enforce a schema instead, e.g.:
 
 * You only care about a specific known subset of the columns. The other columns would all still be included, but packed into the `_ab_additional_properties` map.
 * Your initial dataset is quite small \(in terms of number of records\), and you think the automatic type inference from this sample might not be representative of the data in the future.
@@ -144,7 +144,7 @@ For example:
 
 :::note
 
-Please note, the S3 Source connector used to infer schemas from all the available files and then merge them to create a superset schema. Starting from version 1.0.0 the schema inference works based on the first file found only. The first file we consider is the oldest one written to the prefix.
+Please note, the S3 Source connector used to infer schemas from all the available files and then merge them to create a superset schema. Starting from version 2.0.0 the schema inference works based on the first file found only. The first file we consider is the oldest one written to the prefix.
 
 :::
 
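As a sketch of what a user-provided schema might look like (the docs' own example sits at the collapsed line 144; the column names and types below are invented), the schema arrives as a JSON string mapping column names to datatypes:

```python
import json

# Hypothetical user-provided schema string -- names and types invented.
user_input_schema = json.loads('{"id": "integer", "location": "string", "created_at": "string"}')

# The parsed form is what satisfies the new non-empty-dict check above.
assert user_input_schema and isinstance(user_input_schema, dict)
```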

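And since the note pins inference to the oldest file written to the matching prefix, here is a rough illustration with boto3 of finding that file. This is not the connector's implementation; the bucket and prefix are placeholders, and pagination is ignored for brevity:

```python
import boto3

# Hypothetical bucket and prefix -- substitute your own.
BUCKET = "my-bucket"
PREFIX = "data/"

client = boto3.client("s3")
response = client.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX)

contents = response.get("Contents", [])
if contents:
    # The oldest object under the prefix -- the file a v2.0.0+ connector
    # would consider first for schema inference.
    oldest = min(contents, key=lambda obj: obj["LastModified"])
    print(oldest["Key"], oldest["LastModified"])
```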