
httpFS - Do not create file if it does not exist #46

Open
hurtauda opened this issue Jan 12, 2017 · 5 comments

@hurtauda

Hello,

We are running a MapR cluster, and webHDFS is not supported by MapR, so we are trying to populate Hadoop using httpFS.

Our webhdfs output config:

  @type webhdfs
  host mapr-mapr-master-0
  port 14000
  path "/uhalogs/docker/docker-%M.log"
  time_slice_format %M
  flush_interval 5s
  username mapr
  httpfs true

However, when using the fluentd plugin, logs are appended correctly to an existing file. But if the file does not exist (we use a timestamp-based filename), we get a WebHDFS::ServerError instead of the WebHDFS::FileNotFoundError that, I guess, would trigger creating the file.

Error 500 returned by MapR:

{
  "RemoteException": {
    "message": "Append failed for file: /uhalogs/docker/testfile.log, error: No such file or directory (2)",
    "exception": "IOException",
    "javaClassName": "java.io.IOException"
  }
}

Logs from fluent-plugin-webhdfs:

2017-01-12 13:59:09 +0000 [warn]: failed to communicate hdfs cluster, path: /uhalogs/docker/docker-58.log
2017-01-12 13:59:09 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2017-01-12 14:00:13 +0000 error_class="WebHDFS::ServerError" error="{\"RemoteException\":{\"message\":\"Append failed for file: \\/uhalogs\\/docker\\/docker-58.log, error: No such file or directory (2)\",\"exception\":\"IOException\",\"javaClassName\":\"java.io.IOException\"}}" plugin_id="object:3fe5f920c960"
2017-01-12 13:59:09 +0000 [warn]: suppressed same stacktrace

Related code:
https://github.com/fluent/fluent-plugin-webhdfs/blob/master/lib/fluent/plugin/out_webhdfs.rb#L262
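For context, the append path there rescues WebHDFS::FileNotFoundError and falls back to creating the file, so an HttpFS-style 500 never reaches that branch. A minimal sketch of the pattern, plus a hypothetical extension that would also catch the 500 (method and variable names are simplified from the plugin, and the error-message check is an assumption, not current behavior):

  # Simplified from out_webhdfs.rb: append, and create when the file is missing.
  def send_data(path, data)
    @client.append(path, data)
  rescue WebHDFS::FileNotFoundError
    # WebHDFS signals a missing file with 404, which the gem maps to this class
    @client.create(path, data)
  rescue WebHDFS::ServerError => e
    # Hypothetical extension: HttpFS wraps the same condition in a 500
    # IOException, so inspect the message before falling back to create
    raise unless e.message.include?('No such file or directory')
    @client.create(path, data)
  end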

What I am not sure about, and I cannot find proper specifications for HttpFS on the web, is:

  • Is this a bad implementation of httpFS on the MapR side, or should the fluentd plugin handle this exception as well?

Thank You
Alban

@enarciso

I'm also experiencing this problem, but on the Cloudera platform. We cannot use WebHDFS because, compared to HttpFS, it lacks HA capabilities.

@repeatedly
Member

Sorry for missing this issue.
I'm not familiar with HttpFS, but if WebHDFS and HttpFS are incompatible in several operations, we should handle it.

Is it a bad implementation of httpFS on MapR side

From enarciso's comment, it seems the HttpFS behaviour is the same across several distributions, so I'm not sure whether this is a bug in HttpFS or not.
I think the append operation should create a new file when the file doesn't exist.

@enarciso

My unfortunate workaround at the moment is to constantly monitor the httpfs logs, watch for strings like the one above, and run a touchz to create the file (see the sketch below). Thank you for looking into this @repeatedly
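For the record, the same touchz-style workaround can be scripted with the webhdfs gem the plugin already depends on. A sketch, with host, port, and user taken from the config above; the minute-based filename and the stat-then-create flow are assumptions for illustration:

  require 'webhdfs'

  client = WebHDFS::Client.new('mapr-mapr-master-0', 14000, 'mapr')
  client.httpfs_mode = true

  # Pre-create the next time-sliced file so append never sees a missing path
  path = format('/uhalogs/docker/docker-%02d.log', Time.now.min)
  begin
    client.stat(path)         # raises WebHDFS::FileNotFoundError if absent
  rescue WebHDFS::FileNotFoundError
    client.create(path, '')   # the equivalent of `hadoop fs -touchz`
  end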

@tagomoris
Member

WebHDFS::ServerError means that the client (fluentd) received HTTP response code 500 from the HttpFS server; a WebHDFS server returns 404 in such cases.
IMO it's a bug in the HttpFS implementation, because it is a behavior incompatibility between WebHDFS and HttpFS.

And the documentation states that HttpFS is interoperable with the webhdfs REST HTTP API:
https://hadoop.apache.org/docs/r2.8.0/hadoop-hdfs-httpfs/index.html
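For reference, the 404-vs-500 distinction is exactly what the webhdfs gem keys on when choosing an exception class, which is why the plugin's create fallback never fires here. Roughly (a paraphrase of the gem's response handling, not its exact source):

  # Paraphrased status-code mapping in the webhdfs gem
  case response.code.to_i
  when 404 then raise WebHDFS::FileNotFoundError, response.body  # WebHDFS: missing file
  when 500 then raise WebHDFS::ServerError, response.body        # HttpFS: same condition, different code
  end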

@enarciso

enarciso commented Oct 5, 2017

Thank you @tagomoris, I've opened a case with Cloudera.
