
write hdfs failed after the first time successfully written #35

Open
yaauie opened this issue Jun 18, 2018 · 2 comments

yaauie (Contributor) commented Jun 18, 2018

In elastic/logstash#9712, user @hero1122 reports an issue with the WebHDFS Output Plugin, indicating a problem with HDFS append-file support:

[2018-06-07T10:26:37,363][ERROR][logstash.outputs.webhdfs ] Max write retries reached. Events will be discarded. Exception: {"RemoteException":{"message":"Failed to APPEND_FILE \/logstash\/dt=2018-06-07\/logstash-02.log for DFSClient_NONMAPREDUCE_-692957599_23 on 192.168.0.3 because this file lease is currently owned by DFSClient_NONMAPREDUCE_382768740_23 on 192.168.0.3","exception":"RemoteException","javaClassName":"org.apache.hadoop.ipc.RemoteException"}}
[2018-06-07T10:26:40,009][DEBUG][logstash.pipeline        ] Pushing flush onto pipeline {:pipeline_id=>"main", :thread=>"#<Thread:0x415fbb96 sleep>"}

For all general issues, please provide the following details for fast resolution:

  • Version: 6.2.4
  • Operating System: CentOS 6.5
  • Config File (if you have sensitive info, please remove it; an expanded sketch follows this list):
    port => "14000"
    use_httpfs => "true"
    path => "/logstash/dt=%{+YYYY-MM-dd}/logstash-%{+HH}-%{+mm}-%{+ss}.log"
    user => "hadoop"
  • Sample Data:
  • Steps to Reproduce:
    The first write succeeds, but subsequent writes fail when appending to HDFS.
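
For context, the reported options would sit in an output block shaped roughly like the sketch below. The host value is a placeholder (it is not in the report), and the last two settings are illustrative values for the plugin's documented retry_times and single_file_per_thread options, the second of which exists specifically to keep multiple worker threads from appending to, and contending for the lease on, the same file:

    output {
      webhdfs {
        host => "namenode.example.org"   # placeholder, not part of the original report
        port => "14000"
        use_httpfs => "true"
        path => "/logstash/dt=%{+YYYY-MM-dd}/logstash-%{+HH}-%{+mm}-%{+ss}.log"
        user => "hadoop"
        retry_times => 5                 # attempts before "Max write retries reached" is logged
        single_file_per_thread => true   # one file per worker thread, reducing lease contention
      }
    }

Note that the path interpolates hour, minute, and second, so a fresh file is opened whenever the timestamp changes; appends (and therefore lease conflicts) only arise when two flushes land in the same second or when several workers share one file.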

And HDFS does not support appending to the file; we must call the file's close method after writing data to the HDFS file!

@hero1122 can you please provide additional context to help us reproduce?

jakelandis (Contributor) commented:

@hero1122 - can you help us understand what you would like to do in this scenario?

When a client wants to write an HDFS file, it must obtain a lease, which is essentially a lock, to ensure single-writer semantics. If a lease is not explicitly renewed or the client holding it dies, then it will expire. When this happens, HDFS will close the file and release the lease on behalf of the client.

The lease manager maintains a soft limit (1 minute) and a hard limit (1 hour) for the expiration time. If you wait, the lease will be released and the append will work.

--- nmaillard via: https://community.hortonworks.com/questions/58195/appendtofile-failed-to-append-filehdfslocationabcc.html
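
If a stale lease is blocking the append and waiting out the soft limit is not practical, Hadoop (2.7 and later) ships a debug command that forces lease recovery on a path. A sketch using the file from the log above; the retry count is illustrative:

    hdfs debug recoverLease -path /logstash/dt=2018-06-07/logstash-02.log -retries 3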

akshjain83 commented Jul 30, 2018

I encountered a similar issue, and this might help some people. Through research and trial and error, it turned out that if the HDFS replication factor is greater than the number of datanodes available, appends fail because HDFS cannot place all the required replicas. Try setting a lower replication factor for the log file. See Logstash and HDFS
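
As a sketch of the check and fix described above, assuming shell access to the cluster (the path comes from the config in this issue; the target factor of 1 is illustrative):

    # compare the configured replication factor with the number of live datanodes
    hdfs getconf -confKey dfs.replication
    hdfs dfsadmin -report | grep "Live datanodes"

    # lower the replication factor for files already written under the log path
    hdfs dfs -setrep -R 1 /logstash

Newly written files still follow the dfs.replication value the writing client is configured with, so lowering it in hdfs-site.xml (or the client's config) is what affects future log files.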
