Merge pull request #88 from bxparks/develop
merge 1.5.1 into master
Showing 11 changed files with 251 additions and 41 deletions.
@@ -1 +1 @@
-__version__ = '1.5'
+__version__ = '1.5.1'
@@ -0,0 +1,5 @@
+name,surname,age
+John,Smith,23
+Michael,Johnson,27
+Maria,Smith,30
+Joanna,Anders,21
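
To make concrete what "deducing a schema" from these sample rows could involve, here is a minimal standard-library sketch. It is not the algorithm used by `bigquery_schema_generator`: the helper names `guess_type` and `sketch_schema` are hypothetical, the type rules are deliberately simplified, and the REQUIRED/NULLABLE rule is only an assumption about what mode inference means (a column is REQUIRED when every row has a non-empty cell).

```python
import csv
import io

# The sample rows from the diff above, inlined so the sketch is self-contained.
SAMPLE_CSV = """name,surname,age
John,Smith,23
Michael,Johnson,27
Maria,Smith,30
Joanna,Anders,21
"""


def guess_type(value):
    """Return a BigQuery-style type name for one CSV cell (simplified)."""
    try:
        int(value)
        return "INTEGER"
    except ValueError:
        pass
    try:
        float(value)
        return "FLOAT"
    except ValueError:
        return "STRING"


def sketch_schema(text):
    """Hypothetical helper: one type per column, widened when cells disagree."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    schema = []
    for column in reader.fieldnames or []:
        # Short rows yield None for missing cells; treat those as empty.
        cells = [(row[column] or "") for row in rows]
        types = {guess_type(cell) for cell in cells if cell != ""}
        if types == {"INTEGER"}:
            column_type = "INTEGER"
        elif types and types <= {"INTEGER", "FLOAT"}:
            column_type = "FLOAT"
        else:
            column_type = "STRING"
        # Assumed rule: REQUIRED only if every row has a non-empty cell.
        mode = "REQUIRED" if all(cell != "" for cell in cells) else "NULLABLE"
        schema.append({"name": column, "type": column_type, "mode": mode})
    return schema
```

Under these simplified rules, `name` and `surname` come out as STRING and `age` as INTEGER, all REQUIRED because no cell in the sample is empty. The real library may well decide differently in edge cases.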
@@ -0,0 +1,37 @@
+#!/usr/bin/env python3
+#
+# Example of using SchemaGenerator as a library instead of a command line
+# script. Read the CSV file named 'csvfile.csv' in the current directory, deduce
+# its schema, and print it out on the stdout.
+#
+# This is the equivalent of:
+# $ generate-schema
+# --input_format=csv
+# --infer_mode
+# --quoted_values_are_strings
+# --sanitize_names
+# < csvfile.csv
+
+import json
+import logging
+import sys
+from bigquery_schema_generator.generate_schema import SchemaGenerator
+
+FILENAME = "csvfile.csv"
+
+generator = SchemaGenerator(
+    input_format='csv',
+    infer_mode=True,
+    quoted_values_are_strings=True,
+    sanitize_names=True,
+)
+
+with open(FILENAME) as file:
+    schema_map, errors = generator.deduce_schema(file)
+
+for error in errors:
+    logging.info("Problem on line %s: %s", error['line_number'], error['msg'])
+
+schema = generator.flatten_schema(schema_map)
+json.dump(schema, sys.stdout, indent=2)
+print()
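
One detail of the example above is worth a sketch: it reports problems with `logging.info`, and Python's root logger defaults to the WARNING level, so those messages are dropped unless logging is configured first. The snippet below shows one way to make them visible. The helper `report_errors` is hypothetical, and the error entry is made up for illustration; only the `line_number` and `msg` keys are taken from the example.

```python
import logging


def report_errors(errors, logger=None):
    """Log each error entry at INFO level and return how many were reported.

    `errors` is assumed to be a list of dicts with 'line_number' and 'msg'
    keys, the shape the example above reads from.
    """
    logger = logger or logging.getLogger(__name__)
    for error in errors:
        logger.info("Problem on line %s: %s", error["line_number"], error["msg"])
    return len(errors)


# Without this call the root logger stays at WARNING and INFO records are dropped.
logging.basicConfig(level=logging.INFO)

# Made-up entry for illustration; not real output from the library.
sample_errors = [{"line_number": 3, "msg": "hypothetical mismatch"}]
report_errors(sample_errors)
```

Calling `logging.basicConfig(level=logging.INFO)` once near the top of a script is usually enough. Switching the call to `logging.warning` would be another option if the messages should always appear.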
@@ -0,0 +1,25 @@
+#!/usr/bin/env python3
+#
+# Example of using SchemaGenerator programmatically instead of a command line
+# script. This example consumes a JSON data set that has *already* been read
+# into memory as a Python array of dict.
+
+import json
+import sys
+from bigquery_schema_generator.generate_schema import SchemaGenerator
+
+generator = SchemaGenerator(input_format='dict')
+input_data = [
+    {
+        's': 'string',
+        'b': True,
+    },
+    {
+        'd': '2021-08-18',
+        'x': 3.1
+    },
+]
+schema_map, error_logs = generator.deduce_schema(input_data)
+schema = generator.flatten_schema(schema_map)
+json.dump(schema, sys.stdout, indent=2)
+print()
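
The records in the example above do not share any keys, so a schema for them has to be the union of the fields seen across all records. The following standard-library sketch illustrates that idea only. It does not reproduce the library's output or its rules: `guess_type` and `sketch_schema` are hypothetical names, the date check is a bare `YYYY-MM-DD` pattern, every field is marked NULLABLE, and the first type seen for a key wins.

```python
import re

# Assumed, simplified date check: exactly YYYY-MM-DD.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def guess_type(value):
    """Map one Python value to a BigQuery-style type name (simplified)."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, int):
        return "INTEGER"
    if isinstance(value, float):
        return "FLOAT"
    if isinstance(value, str) and DATE_PATTERN.match(value):
        return "DATE"
    return "STRING"


def sketch_schema(records):
    """Hypothetical helper: union of keys across records, sorted by name."""
    types = {}
    for record in records:
        for key, value in record.items():
            # Keep the first type seen for a key; later values do not widen it.
            types.setdefault(key, guess_type(value))
    return [
        {"name": key, "type": types[key], "mode": "NULLABLE"}
        for key in sorted(types)
    ]


input_data = [
    {"s": "string", "b": True},
    {"d": "2021-08-18", "x": 3.1},
]
schema = sketch_schema(input_data)
```

A real implementation would also need to reconcile conflicting types for the same key and handle nested records, which this sketch leaves out.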