Write errors when running vs HDP 2.3 (==Hadoop 2.7) #15

Closed
Alex-Ikanow opened this issue Feb 15, 2016 · 21 comments

@Alex-Ikanow

I'm running a very boring configuration in Logstash 2.1.2 that reads from a local CSV file and writes to webhdfs (connecting to an 8-node HDFS cluster created by a vanilla HDP install):

input { file { ... } }
filter { csv { ... } }
output {
  if [sourceKey] == "<<hdfs_output_key>>" {
        webhdfs {
            path => '/app/aleph2/data/aleph2_testing/56bc4f59e4b026e56f0a3cc5/ttm/test/bucket1/managed_bucket/import/ready/ls_input_%{+yyyy.MM.dd.hh}.json'
            host => '<<webhdfs.server.name>>'
            port => '50075'
            user => 'tomcat'
            codec => 'json'
            flush_size => 33554432
            idle_flush_time => 10
        }
   }
}

On a ~10MB file with ~200K lines, I get a non-stop stream of the following errors:

{:timestamp=>"2016-02-11T19:51:31.812000-0500", :message=>"webhdfs write caused an exception: Failed to connect to host <<webhdfs.server.name>>:50075, Connection reset by peer - Connection reset by peer. Maybe you should increase retry_interval or reduce number of workers. Retrying...", :level=>:warn}
...
{:timestamp=>"2016-02-11T19:51:33.808000-0500", :message=>"Max write retries reached. Events will be discarded. Exception: Failed to connect to host <<webhdfs.server.name>>:50075, Connection reset by peer - Connection reset by peer", :level=>:error}

A random fraction of the events is discarded on each run, roughly 25-50%.

Looking at the server logs, they're filled with entries like the following:

2016-02-11 08:54:24,863 WARN  hdfs.StateChange (FSNamesystem.java:appendFileInternal(2690)) - DIR* NameSystem.append: Failed to APPEND_FILE /app/aleph2/data/aleph2_testing/56bc4f59e4b026e56f0a3cc5/ttm/test/bucket1/managed_bucket/import/ready/ls_input_2016.02.11.01.json for DFSClient_NONMAPREDUCE_154673148_72 on 10.12.16.101 because this file lease is currently owned by DFSClient_NONMAPREDUCE_1522645774_64 on 10.12.16.101

This appears to be caused by multiple threads accessing the same file, according to Google. And there's a related but cryptic comment in the source code here (sic):

  def write_data(path, data)
    # Retry max_retry times. This can solve problems like leases being hold by another process. Sadly this is no
    # KNOWN_ERROR in rubys webhdfs client.

So:

  • Is this actually not an issue, i.e. am I just doing something stupid?
  • If it is a legit issue, is there a workaround? The error says to reduce the number of workers, but that presumably means the global number of worker threads, which would cripple the performance of the other Logstash configs.

Thanks in advance for any insight/help!

@dstore-dbap
Collaborator

Hi Alex,
thanks for your bug report. Locked file leases are a known issue when appending with multiple threads to the same file via webhdfs. This might be caused by your quite high flush_size configuration value.
The retry logic should help to alleviate this issue.
If you are willing to try another approach, I created a patch that allows you to use the thread_id as part of the path configuration value. This ensures that only one thread appends to a given file while still running with multiple pipeline workers.
The patch would need to be applied to ./vendor/bundle/jruby/1.9/gems/logstash-output-webhdfs-2.0.2/lib/logstash/outputs/webhdfs.rb.
Your current configuration would need to be adjusted:

input { file { ... } }
filter { csv { ... } }
output {
  if [sourceKey] == "<<hdfs_output_key>>" {
        webhdfs {
            path => '/app/aleph2/data/aleph2_testing/56bc4f59e4b026e56f0a3cc5/ttm/test/bucket1/managed_bucket/import/ready/ls_input_%{+yyyy.MM.dd.hh}_%{thread_id}.json'
            single_file_per_thread => true
            host => '<<webhdfs.server.name>>'
            ....

Let me know if you want to try this patch and, if so, whether it solves your issue.

Take care and all the best,
Björn

patch_thread_id_in_path.txt
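For readers without the attachment, here is a minimal sketch of the idea (not the actual patch): before the path is interpolated, the plugin exposes the current worker thread's id on the event, so that a %{thread_id} placeholder in path resolves to a per-thread value. The single_file_per_thread option and the thread_id field come from the discussion above; the method name and the surrounding plugin internals are assumptions.

  # Hypothetical sketch, not the attached patch. Assumes the legacy Logstash
  # event API (event["field"] = value) used by Logstash 2.x output plugins.
  def build_path_for(event)
    if @single_file_per_thread
      # Thread.current.object_id is stable for the lifetime of a worker thread,
      # so every event handled by that thread is routed to the same file.
      event["thread_id"] = Thread.current.object_id.to_s
    end
    # Resolve %{...} placeholders in the configured path, including %{thread_id}.
    event.sprintf(@path)
  end

With a path ending in _%{thread_id}.json, each pipeline worker thread then appends to its own file, so no two threads compete for the same HDFS lease.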

@Alex-Ikanow
Author

Björn

Hi! Thanks for the reply and the effort in crafting the patch (+ nice simple workaround!). I'll have a play as soon as I get a moment and report back.

Alex

@Alex-Ikanow
Author

I just tested that and it worked perfectly, thanks.

I wonder if you should put the thread id in @metadata instead of in the event itself?

@dstore-dbap
Collaborator

Glad to hear that ;) Putting the thread id in @metadata is a good idea; I'll do this and then push it to master. Thanks again for your effort and help in solving this issue!

dstore-dbap referenced this issue Feb 18, 2016
…thread is set to true.

Appending with multiple threads to the same file via webhdfs may cause issues with locked file leases.
Using the thread_id as part of the path configuration value will take care that only one thread will append to a given file while still running with multiple pipeline workers.
@dstore-dbap
Collaborator

Hi Alex,

I merged the bugfix into master with #16.
As you suggested, the thread id will now be added to the @metadata field.
So, your configuration would need to reflect this change:

input { file { ... } }
filter { csv { ... } }
output {
  if [sourceKey] == "<<hdfs_output_key>>" {
        webhdfs {
            path => '/app/aleph2/data/aleph2_testing/56bc4f59e4b026e56f0a3cc5/ttm/test/bucket1/managed_bucket/import/ready/ls_input_%{+yyyy.MM.dd.hh}_%{[@metadata][thread_id]}.json'
            single_file_per_thread => true
            host => '<<webhdfs.server.name>>'
            ....

I will close this issue if this fixes your problem.

Take care and all the best,
Björn

@ebuildy
Contributor

ebuildy commented Feb 18, 2016

Hello @dstore-dbap, sorry to ask this here, but where did you find information about webhdfs file locks? I plan to scale Logstash horizontally to load-balance the high CPU load, so this looks like a bad idea now :/

Is there any configuration to change on the Hadoop side, maybe?

Many thanks,

@dstore-dbap
Collaborator

Hi ebuildy,
there is a nice slide deck here: http://de.slideshare.net/dataera/inside-hdfs-append
As webhdfs cannot share a file resource between different threads, things can go awry when trying to append to the same file from multiple threads. One thread is still appending via the REST API and thus holds the unique file lease; when another append call from a different thread reaches the API, it tries to acquire the lease but fails. How long it takes for the lock to be released depends on the amount of data written, the number of replicas, etc.
Still, if the sort order of the lines written is not of high importance to you, this patch should solve the lock issue. In effect, every thread will only write to its own file, so the threads will not compete for one lease/lock.
Does this help in any way?

@ebuildy
Contributor

ebuildy commented Feb 18, 2016

Thanks for the fantastic explanation. So I cannot set up 3 Logstash instances reading events from a queue and writing them to webhdfs (the same file)?

@dstore-dbap
Collaborator

It depends on how frequently you are going to append. Though judging from the fact that you want to use multiple Logstash instances, it might be quite frequent ;)
I would suggest one file per logstash instance if at all possible.

@Alex-Ikanow
Author

@ebuildy what's the end client? Most Hadoop-y applications work on directories of files anyway.

@ebuildy
Contributor

ebuildy commented Feb 18, 2016

That's true, it shouldn't be a problem actually.

I plan to use Hive, Apache Drill and Apache Spark.

@Alex-Ikanow
Author

I've used a similar stack with directories; you should be fine. You can just put a wildcard in the RDD builder, and it gets handled pretty efficiently.

@ebuildy
Contributor

ebuildy commented Feb 18, 2016

Nice to hear, thanks.

@dstore-dbap
Collaborator

Yes, that's exactly what we do with our Hive queries. Works fine ;)

@dstore-dbap
Collaborator

Closing this issue now.

@redbaron4

redbaron4 commented Aug 3, 2018

Hi @dstore-dbap

I know this bug has been closed, but I am investigating issues on the WebHDFS side whereby the plugin throws exceptions like:

"message":"failed to create file \/data\/2018\/08\/03\/netflow_dir\/2014.part for DFSClient_NONMAPREDUCE_1432502582_132459 for client XXXXXX because current leaseholder is trying to recreate file.","exception":"RemoteException","javaClassName":"org.apache.hadoop.ipc.RemoteException"

I have already set single_file_per_thread => true. Despite this, I am still getting this error. Is it possible that the plugin tries to write fresh data before the previous write has succeeded? Is this a problem on the plugin side, or are my datanodes performing poorly?

My config is

webhdfs {
    host => "foo"
    port => 14000
    single_file_per_thread => true
    path => "/data/%{+YYYY}/%{+MM}/%{+dd}/%{[@metadata][hdfs_filename]}_dir/%{[@metadata][thread_id]}.part"  # (required)
    user => "foo"   # This has no effect since we now use kerberos
    codec => json
    use_httpfs => true
    use_kerberos_auth => true
}

and the pipeline configuration is

- pipeline.id: 1_hadoop
  path.config: "/etc/logstash/1_hadoop/lconf_*/*"
  pipeline.workers: 2
  pipeline.batch.size: 500
  queue.type: persisted
  queue.max_bytes: 24gb

Interestingly, even if I keep a single worker, 2 distinct thread ID files are created.

@jvosu2

jvosu2 commented Aug 14, 2018

I get the error as well.
I tried increasing the max retries, to no avail.
A small percentage of the messages is dropped because of this error.
Using Logstash 6.x.

@redbaron4

@jvosu2 In our experience it isn't a problem with the plugin.
HDFS is not well suited for repeated appends to the same file, especially under heavy load. If you have a small number of datanodes, checksum errors may trigger replica/pipeline recovery repeatedly.
On the client side this manifests as an abnormally long time taken to write the data. This plugin has a read_timeout setting which defaults to 30 seconds, meaning that if an append operation does not complete within 30 seconds, the plugin will retry.
In our case, increasing this timeout to 90 seconds helped.
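For reference, a minimal sketch of that change in the output block (the values are illustrative, not recommendations; read_timeout and retry_interval are the settings referred to in this thread):

webhdfs {
    # ... existing host / port / path / codec options unchanged ...
    read_timeout => 90     # default is 30 seconds; give slow appends time to finish before the plugin retries
    retry_interval => 1    # optional: space out retries a little, as the warning message quoted above suggests
}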

@tulingfeng

@jvosu2 Hello, I want to know what config you finally used to fix this problem. Did you set pipeline.workers: 1?

@jvosu2

jvosu2 commented Jun 6, 2019

No, it still happens even then. The default behavior of the webhdfs plugin is to retry 5 times and then drop the message. Instead of dropping, I write to a file: I ended up writing my own plugin by combining the file output plugin with the webhdfs output plugin, so that if a write still fails after the number of retries, I write to a local file. I later use Filebeat to upload the files in the error queue. In my setup we have SSL and httpfs (not webhdfs), so the firewall forces me to send all messages to only one node, which is a bottleneck. Eventually I will need to write my own HDFS client instead.
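For anyone wanting to build something similar, here is a rough sketch of the "spill to local disk on final failure" idea described above. This is not jvosu2's actual plugin; the write_to_webhdfs helper, the method name, and the retry count are illustrative assumptions.

  require "fileutils"

  # Try to append to HDFS a few times; if it still fails, keep the data on local
  # disk so a shipper such as Filebeat can pick it up later.
  def flush_with_local_fallback(data, hdfs_path, error_dir)
    attempts = 0
    begin
      attempts += 1
      write_to_webhdfs(hdfs_path, data)   # assumed helper wrapping the webhdfs append call
    rescue StandardError
      retry if attempts < 5               # mirrors the plugin's default of 5 retries
      FileUtils.mkdir_p(error_dir)
      # Append the failed batch to a local file named after the intended HDFS file.
      File.open(File.join(error_dir, File.basename(hdfs_path)), "a") { |f| f.write(data) }
    end
  end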

@tulingfeng

@jvosu2 Sorry to bother you. After many attempts to fix this problem, I have to agree with your opinion. I want to write a plugin similar to yours, but I am not good at Ruby.
If I may ask, could you share your code or give me some constructive advice? Thanks again.
