Logstash not able to read and process .csv file #40
Can you set the codec for the input to 'line'? The default is JSON, which requires the plugin to learn the start and stop bytes, but for CSV that's not relevant. This plugin doesn't take into account that the first line is a header; I'm planning a new release soon to fix some other issues and can add some line-skipping logic then, but for now you can probably filter out the header in the filter stage.
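One way to filter out the header in the filter stage is to drop any event whose line matches the header row. This is a sketch, not part of the plugin; it assumes the header starts with "Unique Key," as in the sample file further down:

```
filter {
  # Drop the CSV header row; in this file the header line starts with "Unique Key,"
  if [message] =~ /^Unique Key,/ {
    drop { }
  }
}
```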
How can I set the codec for the input to 'line'?
The codec is a standard Logstash parameter on the input; it defines how the content of an event is decoded. By default I set it to JSON, but you can set it to line or csv: https://www.elastic.co/guide/en/logstash/current/plugins-codecs-csv.html
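For example, a minimal input block that sets the codec to line (a sketch; the storageaccount, access_key, and container values are placeholders you must replace):

```
input {
  azure_blob_storage {
    storageaccount => "mystorageaccount"   # placeholder
    access_key => "REPLACE_WITH_KEY"       # placeholder
    container => "mycontainer"             # placeholder
    codec => line                          # one event per line instead of the default JSON
  }
}
```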
I added the plugin below and modified the pipeline, but it is still not working for me.
I tried both line and csv as the codec. The pipeline is stuck after the lines below.

```
input csv { mutate { remove_field => "@timestamp" }}
```
Make sure to change the passwords and access_keys. You can set registry_create_policy => "start_over" or delete data/registry.dat, because previously seen files will not be processed again; only new files or grown files are processed by default. To get more output when starting the plugin, you can set debug_until => 1000 and the plugin will show more details for the first 1000 events/lines processed.
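Putting those two options together in the input block looks like this (a sketch with placeholder account details):

```
input {
  azure_blob_storage {
    storageaccount => "mystorageaccount"      # placeholder
    access_key => "REPLACE_WITH_KEY"          # placeholder
    container => "mycontainer"                # placeholder
    codec => line
    registry_create_policy => "start_over"    # reprocess files already recorded in the registry
    debug_until => 1000                       # log extra detail for the first 1000 events/lines
  }
}
```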
The pipeline is not able to read the headers in the CSV file and adds the headers as values (see screenshot). Do you have a sample pipeline I can refer to for reading data from a CSV file and adding it to Elasticsearch?

```
input csv { mutate { remove_field => "@timestamp" }}
```
I am reading .csv file from the Azure storage blob and adding that data in Elasticsearch.
But Logstash gets stuck after the lines below.
```
[2023-06-16T18:30:28,479][INFO ][logstash.inputs.azureblobstorage][main][181f6119611506bdf4dacc3ea01b35a1dee26ed795d3b801c7e8f3d7080a0e8c] learn json one of the attempts failed
[2023-06-16T18:30:28,479][INFO ][logstash.inputs.azureblobstorage][main][181f6119611506bdf4dacc3ea01b35a1dee26ed795d3b801c7e8f3d7080a0e8c] head will be: {"records":[ and tail is set to ]}
```
Below is the pipeline code:
The CSV file contains this data:

```
Unique Key,Vendor Id,Vendor Number # VN,Vendor Name,Reference,Document Date,Purchase order number,Currency,Amount,Due Date,Document Status,Invoice blocked,Payment Reference Number,Payment/Clearing Date,Payment Method,Source System,Document Type,Company Code,Company Name,Company code country,Purchase order item,Delivery note number,FiscalYear,SAP Invoice Number,Invoice Pending with (email id)
AMP_000013530327,50148526,50148526,CARTUS RELOCATION CANADA LTD,2000073927CA,10/21/2019,,CAD,2041.85,11/21/2019,Pending Payment,,,,,AMP,Invoice,1552,Imperial Oil-DS Br,CA,,,2019,2019,
AMP_000013562803,783053,783053,CPS COMUNICACIONES SA,A001800009476,11/1/2019,,ARS,1103.52,12/1/2019,Pending Payment,,,,,AMP,Invoice,2399,ExxonMobil B.S.C Arg. SRL,AR,,,2019,2019,
AMP_000013562789,50115024,50115026,FARMERS ALLOY FABRICATING INC,7667,11/5/2019,4410760848,USD,-38940.48,12/5/2019,In Progress,,,,,AMP,Credit Note,944,EM Ref&Mktg (Div),US,,,0,0,wyman.w.hardison@exxonmobil.com
```
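A minimal end-to-end pipeline sketch for this file, assuming the plugin option names shown earlier (storageaccount, access_key, container, registry_create_policy, debug_until) and the standard Logstash csv filter; all credentials, hosts, and index names are placeholders:

```
input {
  azure_blob_storage {
    storageaccount => "mystorageaccount"      # placeholder
    access_key => "REPLACE_WITH_KEY"          # placeholder
    container => "mycontainer"                # placeholder
    codec => line                             # one event per CSV line
    registry_create_policy => "start_over"    # reprocess files already seen
    debug_until => 1000                       # verbose logging for first 1000 events
  }
}
filter {
  csv {
    separator => ","
    autodetect_column_names => true   # take column names from the first line
  }
  mutate { remove_field => [ "message" ] }
}
output {
  elasticsearch {
    hosts => ["https://localhost:9200"]       # placeholder
    index => "invoices"                       # placeholder
    user => "elastic"                         # placeholder
    password => "REPLACE_WITH_PASSWORD"       # placeholder
  }
}
```

Note that autodetect_column_names relies on the header row being the first event the filter sees, so it is only reliable with a single pipeline worker.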