
reader option skip_blank_lines: False in File connector doesn't work as expected. #6061

Closed
sheshan-doye-konvergeai opened this issue Sep 14, 2021 · 1 comment

Comments


sheshan-doye-konvergeai commented Sep 14, 2021

Environment

  • Airbyte version: 0.29.12-alpha
  • OS Version / Instance: Ubuntu 20.04
  • Deployment: Docker
  • Source Connector and version: File (HTTPS) 0.2.5
  • Destination Connector and version: Custom/Local CSV
  • Severity: Medium / High
  • Step where error happened: Output file is not as expected.

Current Behavior

Trying to pull this data: https://groups.csail.mit.edu/sls/downloads/movie/engtest.bio
Issue: the Local CSV destination drops the empty lines from the source, even after providing "skip_blank_lines: False" in the reader options.
Please find the details below.
Source data:
B-YEAR right
I-YEAR now

O show
O me
O a

Local CSV destination data:
root@05051086d91a:/weav-data/bio# head -20 _airbyte_raw_Bio_data.csv
"_airbyte_ab_id",_airbyte_emitted_at,_airbyte_data
4100b1ba-b22a-4754-80e4-6d5ab5371efb,1631085587000,"{""col2"":""right"",""col1"":""B-YEAR""}"
6c3972b3-0d0c-4a26-83e8-111040f76318,1631085587000,"{""col2"":""now"",""col1"":""I-YEAR""}"
60e4fa74-139d-4464-b15e-b5089553ecb9,1631085587000,"{""col2"":""show"",""col1"":""O""}"
902547ec-385a-46ee-a527-ecc24179d757,1631085587000,"{""col2"":""me"",""col1"":""O""}"
164c22f4-54e8-4cfa-a9cc-d3c9a31f4eba,1631085587000,"{""col2"":""a"",""col1"":""O""}"

Expected Behavior

Local CSV destination data:
root@05051086d91a:/weav-data/bio# head -20 _airbyte_raw_Bio_data.csv
"_airbyte_ab_id",_airbyte_emitted_at,_airbyte_data
4100b1ba-b22a-4754-80e4-6d5ab5371efb,1631085587000,"{""col2"":""right"",""col1"":""B-YEAR""}"
6c3972b3-0d0c-4a26-83e8-111040f76318,1631085587000,"{""col2"":""now"",""col1"":""I-YEAR""}"
Exxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx,Exxxxxxxxxxxxx,"{""col2"":NaN,""col1"":NaN}"
60e4fa74-139d-4464-b15e-b5089553ecb9,1631085587000,"{""col2"":""show"",""col1"":""O""}"
902547ec-385a-46ee-a527-ecc24179d757,1631085587000,"{""col2"":""me"",""col1"":""O""}"
164c22f4-54e8-4cfa-a9cc-d3c9a31f4eba,1631085587000,"{""col2"":""a"",""col1"":""O""}"

Steps to Reproduce

  1. Pull https://groups.csail.mit.edu/sls/downloads/movie/engtest.bio with the File (HTTPS) source connector, using Local CSV as the destination.
  2. Set the reader options to {"sep" : "\t", "names": ["Tag","String"], "quoting": 3, "skip_blank_lines": "False"}
  3. The output still skips the blank lines.
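
For comparison, here is a minimal sketch of what the expected behavior looks like when pandas is called directly. It assumes the File connector forwards the reader options to `pandas.read_csv`; the `SAMPLE` string and the `read_sample` helper are illustrative stand-ins for the first lines of the source file, not connector code.

```python
import io

import pandas as pd

# Tab-separated stand-in for the first lines of the source file,
# including the blank line between "now" and "show".
SAMPLE = "B-YEAR\tright\nI-YEAR\tnow\n\nO\tshow\nO\tme\nO\ta\n"


def read_sample(skip_blank_lines):
    """Parse SAMPLE with the reader options from the report (hypothetical helper)."""
    return pd.read_csv(
        io.StringIO(SAMPLE),
        sep="\t",
        names=["Tag", "String"],
        quoting=3,  # csv.QUOTE_NONE
        skip_blank_lines=skip_blank_lines,
    )


skipped = read_sample(True)   # blank line dropped
kept = read_sample(False)     # blank line kept as a row of NaN values

print(len(skipped), len(kept))
print(kept.iloc[2])
```

With a real Boolean `False`, the blank line should come through as an all-NaN row, which matches the placeholder row in the expected output above.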
sheshan-doye-konvergeai added the type/bug (Something isn't working) label Sep 14, 2021
sheshan-doye-konvergeai (Author) commented:

This works if we give "skip_blank_lines": false (lowercase and unquoted), as spec.json recognises Boolean values only in lower case.
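
A short sketch of why the spelling matters, assuming the reader options string is decoded as JSON before being passed on (the variable names here are illustrative only). In JSON, `"False"` in quotes is a string, and a non-empty string is truthy in Python, while the bare lowercase `false` decodes to the Boolean `False`. A bare capitalised `False` is not valid JSON at all.

```python
import json

# Quoted value: decodes to a Python str, which is truthy.
quoted = json.loads('{"skip_blank_lines": "False"}')["skip_blank_lines"]

# Lowercase JSON literal: decodes to the Python bool False.
literal = json.loads('{"skip_blank_lines": false}')["skip_blank_lines"]

print(type(quoted).__name__, bool(quoted))    # str True
print(type(literal).__name__, bool(literal))  # bool False

# A bare capitalised False is rejected by the JSON parser.
try:
    json.loads('{"skip_blank_lines": False}')
    capital_is_valid = True
except json.JSONDecodeError:
    capital_is_valid = False

print(capital_is_valid)  # False
```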
