Source File: Support JsonL format #2118

Merged: sherifnada merged 9 commits into master from sherif/support-ndjson on Feb 18, 2021

@sherifnada sherifnada commented Feb 18, 2021

Closes #2102 by adding support for newline-delimited JSON (JSONL/NDJSON).

This is technically a new feature, but I did it anyway because:

  1. It is the standard data export format from BigQuery (BQ), so it is fairly ubiquitous.
  2. The lack of support is blocking a user from using Airbyte.
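
For readers unfamiliar with the format, here is a minimal sketch of what reading JSONL involves: each non-empty line is parsed as an independent JSON document. This is not the connector's actual implementation; `read_jsonl` and the sample records are hypothetical names made up for illustration, and only Python's standard `json` module is used.

```python
import io
import json
from typing import Any, Dict, Iterator, TextIO


def read_jsonl(stream: TextIO) -> Iterator[Dict[str, Any]]:
    """Yield one parsed record per non-empty line of a JSONL stream.

    Hypothetical helper for illustration only, not the source-file connector code.
    """
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            # Skip blank lines, such as a trailing newline at the end of the file.
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"Invalid JSON on line {line_number}: {exc}") from exc


# Made-up sample data: two nested records, one per line.
sample = io.StringIO(
    '{"id": 1, "meta": {"tag": "a"}}\n'
    '{"id": 2, "meta": {"tag": "b"}}\n'
)
records = list(read_jsonl(sample))
```

Because every line stands alone, a reader like this can process a large export record by record instead of loading one big JSON array into memory.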

sherifnada commented Feb 18, 2021

/test connector=airbyte/source-file

❌ airbyte/source-file https://github.com/airbytehq/airbyte/actions/runs/579253981

("parquet", "parquet", 9, 3),
("csv", "csv", 8, 5000, "demo"),
("json", "json", 2, 1, "demo"),
("ndjson", "ndjson", 2, 10, "ndjson_nested"),

In destination-local-json we use the .jsonl extension. Is ndjson the standard way of referring to this format? Should we harmonize the name across sources and destinations?

@michel-tricot

I am more familiar with the name JSON Lines. There is also a website under that name: https://jsonlines.org/

sherifnada commented Feb 18, 2021

/test connector=source-file

🕑 source-file https://github.com/airbytehq/airbyte/actions/runs/579288538
✅ source-file https://github.com/airbytehq/airbyte/actions/runs/579288538

@sherifnada

@ChristopheDuong @michel-tricot the two look basically the same: https://jsonlines.org/ and http://ndjson.org/ (though jsonl explicitly supports UTF-8)

happy to call it jsonl if it's more widespread

@sherifnada

@michel-tricot @ChristopheDuong renamed to jsonl, any other feedback?

sherifnada commented Feb 18, 2021

/test connector=source-file

🕑 source-file https://github.com/airbytehq/airbyte/actions/runs/579425451
✅ source-file https://github.com/airbytehq/airbyte/actions/runs/579425451

@sherifnada sherifnada changed the title from "Source File: Support NDJson format" to "Source File: Support JsonL format" on Feb 18, 2021

sherifnada commented Feb 18, 2021

/publish connector=connectors/source-file

🕑 connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/579519342
✅ connectors/source-file https://github.com/airbytehq/airbyte/actions/runs/579519342

@sherifnada sherifnada merged commit d426781 into master Feb 18, 2021
@sherifnada sherifnada deleted the sherif/support-ndjson branch February 18, 2021 21:54
@sherifnada sherifnada self-assigned this Feb 23, 2021
Successfully merging this pull request may close these issues.

Source File: Discovering a JSON file schema fails
4 participants