Bug in benchmark evaluate_load_file.py when config.json passes folder path #462

sunank200 · 2022-06-16T13:43:00Z

Describe the bug
When the benchmark script runs with a config.json whose dataset path is a folder rather than a single file, the DAG fails to import. The error says the file type cannot be determined because the path has no file extension.

Sample config.json:
{
  "databases": [
    {
      "name": "bigquery",
      "params": {
        "metadata": {
          "database": "astronomer-dag-authoring",
          "schema": "tmp_astro"
        },
        "conn_id": "bigquery"
      }
    }
  ],
  "datasets": [
    {
      "name": "five_gb",
      "size": "5G",
      "path": "gs://astro-sdk/benchmark/trimmed/pypi/",
      "rows": 385817,
      "conn_id": "bigquery"
    }
  ]
}

Error log:
ValueError: Missing file extension, cannot automatically determine filetype from path 'gs://astro-sdk/benchmark/trimmed/pypi/'. Please pass the 'filetype' param with the explicit filetype (e.g. csv, ndjson, etc.).
[2022-06-16 10:27:49,417] {dagbag.py:334} ERROR - Failed to import: ./dags/evaluate_load_file.py

Expected behaviour
The benchmark should load every file inside the folder given in the dataset path.
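
To illustrate why a folder path triggers the error above, here is a minimal sketch of extension-based file type detection with an explicit override. It is not the SDK's actual implementation: the function name `resolve_filetype`, the `SUPPORTED_FILETYPES` set, and the `ndjson` value used in the usage note are all assumptions made for illustration.

```python
import posixpath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical set of supported types, for illustration only.
SUPPORTED_FILETYPES = {"csv", "json", "ndjson", "parquet"}


def resolve_filetype(path: str, filetype: Optional[str] = None) -> str:
    """Return the file type for a path, preferring an explicit value.

    A folder path such as 'gs://bucket/folder/' has no extension, so
    detection can only succeed when the caller passes `filetype`.
    """
    if filetype is not None:
        explicit = filetype.lower()
        if explicit not in SUPPORTED_FILETYPES:
            raise ValueError(f"Unsupported filetype '{filetype}'.")
        return explicit

    # Take the extension from the path component of the URI.
    extension = posixpath.splitext(urlparse(path).path)[1].lstrip(".").lower()
    if not extension:
        raise ValueError(
            "Missing file extension, cannot automatically determine "
            f"filetype from path '{path}'. Please pass the 'filetype' param."
        )
    if extension not in SUPPORTED_FILETYPES:
        raise ValueError(f"Unsupported file extension '{extension}'.")
    return extension
```

With this sketch, `resolve_filetype("gs://astro-sdk/benchmark/trimmed/pypi/")` raises `ValueError`, while passing an explicit type such as `filetype="ndjson"` lets the same folder path through.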

sunank200 self-assigned this Jun 16, 2022

tatiana added the bug and priority/high labels Jun 16, 2022

sunank200 mentioned this issue Jun 16, 2022

Add filetype for parse in config.json to handle folders in path for dataset #463

Merged

sunank200 closed this as completed in #463 Jun 16, 2022
