HDFS Uploader: Cannot run program "hadoop": error=2, No such file or directory #1023

Closed
mhajibaba opened this Issue Jul 4, 2016 · 8 comments

mhajibaba commented Jul 4, 2016

When I tried to run the ExclamationTopology example on a cluster, I found that I need to use HDFS as the uploader rather than the local file system.
I followed the instructions in Setting Up HDFS Uploader, but when I submit the topology I get the following error:
Cannot run program "hadoop": error=2, No such file or directory

[2016-07-04 13:17:57 +0430] org.apache.curator.framework.state.ConnectionStateManager INFO:  State change: CONNECTED  
[2016-07-04 13:17:57 +0430] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager INFO:  Topologies directory: /heron/topologies  
[2016-07-04 13:17:57 +0430] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager INFO:  Tmaster location directory: /heron/tmasters  
[2016-07-04 13:17:57 +0430] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager INFO:  Physical plan directory: /heron/pplans  
[2016-07-04 13:17:57 +0430] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager INFO:  Execution state directory: /heron/executionstate  
[2016-07-04 13:17:57 +0430] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager INFO:  Scheduler location directory: /heron/schedulers  
[2016-07-04 13:17:58 +0430] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager INFO:  Directory tree initialized.  
[2016-07-04 13:17:58 +0430] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager INFO:  Checking existence of path: /heron/topologies/ExclamationTopology  
[2016-07-04 13:17:58 +0430] com.twitter.heron.spi.common.ShellUtils SEVERE:  Failed to check status of packer java.io.IOException: Cannot run program "hadoop": error=2, No such file or directory  
[2016-07-04 13:17:58 +0430] com.twitter.heron.uploader.hdfs.HdfsUploader INFO:  The destination directory does not exist; creating it.  
[2016-07-04 13:17:58 +0430] com.twitter.heron.spi.common.ShellUtils SEVERE:  Failed to check status of packer java.io.IOException: Cannot run program "hadoop": error=2, No such file or directory  
[2016-07-04 13:17:58 +0430] com.twitter.heron.uploader.hdfs.HdfsUploader SEVERE:  Failed to create directory: hdfs://heron/topologies/aurora/ExclamationTopology  
[2016-07-04 13:17:58 +0430] com.twitter.heron.scheduler.SubmitterMain SEVERE:  Failed to upload package.  
[2016-07-04 13:17:58 +0430] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager INFO:  Closing the CuratorClient to: 192.168.11.231:2181,192.168.11.232:2181,192.168.11.233:2181  
[2016-07-04 13:17:58 +0430] org.apache.zookeeper.ZooKeeper INFO:  Session: 0x355aabf8eb20041 closed  
[2016-07-04 13:17:58 +0430] org.apache.zookeeper.ClientCnxn INFO:  EventThread shut down  
[2016-07-04 13:17:58 +0430] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager INFO:  Closing the tunnel processes  
Exception in thread "main" java.lang.RuntimeException: Failed to submit topology ExclamationTopology
    at com.twitter.heron.scheduler.SubmitterMain.main(SubmitterMain.java:319)
ERROR: Failed to launch topology 'ExclamationTopology' because User main failed with status 1. Bailing out...

mhajibaba Jul 4, 2016

I found the problem!
First, hadoop needs to be added to $PATH:
export PATH=$PATH:$HADOOP_HOME/bin

Then the HDFS uploader configuration file needs a well-formed URI, such as:
heron.uploader.hdfs.topologies.directory.uri: hdfs://192.168.11.xx:9000/heron/topologies/${CLUSTER}/${TOPOLOGY}
or, with three slashes:
heron.uploader.hdfs.topologies.directory.uri: hdfs:///heron/topologies/${CLUSTER}/${TOPOLOGY}

But now I get a new error:
curl: (1) Protocol hdfs not supported or disabled in libcurl
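
The first fix in this comment (putting hadoop on $PATH) can be checked before submitting a topology. The following is only a minimal sketch, assuming the submitter resolves `hadoop` through the PATH of the process that launches it; the helper names `find_hadoop` and `path_with_hadoop_bin` are made up for illustration and are not part of Heron.

```python
import os
import shutil


def find_hadoop(path=None):
    """Return the full path of the `hadoop` executable, or None if it cannot be found.

    `path` is a PATH-style string; when omitted, the current process PATH is used.
    """
    return shutil.which("hadoop", path=path)


def path_with_hadoop_bin(hadoop_home, path):
    """Append <hadoop_home>/bin to a PATH-style string, like the export line above."""
    return path + os.pathsep + os.path.join(hadoop_home, "bin")


if __name__ == "__main__":
    location = find_hadoop()
    if location is None:
        # This is the situation that produces error=2 (No such file or directory).
        print("hadoop is not on PATH")
    else:
        print("hadoop found at", location)
```

If `find_hadoop()` returns None in the shell used to submit the topology, the "Cannot run program" error from the original report is the expected outcome.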

caofangkun Jul 5, 2016

Contributor

The error curl: (1) Protocol hdfs not supported or disabled in libcurl most likely comes from ShellUtils/curlPackage.

curl cannot operate on HDFS directly. However, Apache Hadoop provides the WebHDFS REST API, which makes files on HDFS accessible to curl.

@mhajibaba Could you give it a try? Don't forget to modify hdfs-site.xml and add the following config:

<property>
 <name>dfs.webhdfs.enabled</name>
 <value>true</value>
</property>
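
With WebHDFS enabled, a file on HDFS is readable over HTTP at a URL of the form http://HOST:PORT/webhdfs/v1/PATH?op=OPEN. A rough sketch of that mapping follows. The function name `hdfs_to_webhdfs` is hypothetical, and the port 50070 is an assumption (it is the default NameNode HTTP port in Hadoop 2.x; the RPC port in an hdfs:// URI, such as 9000, is a different port and is dropped here).

```python
from urllib.parse import quote, urlsplit

# Assumption: default NameNode HTTP port in Hadoop 2.x; adjust for your cluster.
DEFAULT_NAMENODE_HTTP_PORT = 50070


def hdfs_to_webhdfs(hdfs_uri, namenode_host=None, http_port=DEFAULT_NAMENODE_HTTP_PORT):
    """Map an hdfs:// URI to a WebHDFS URL that reads the file (op=OPEN).

    The host is taken from the URI when present; for hdfs:///path URIs,
    which carry no host, `namenode_host` must be supplied.
    """
    parts = urlsplit(hdfs_uri)
    if parts.scheme != "hdfs":
        raise ValueError("expected an hdfs:// URI, got: " + hdfs_uri)
    host = parts.hostname or namenode_host
    if not host:
        raise ValueError("hdfs:/// URIs carry no host; pass namenode_host")
    # The HDFS RPC port in the URI is ignored; WebHDFS uses the HTTP port.
    return "http://%s:%d/webhdfs/v1%s?op=OPEN" % (host, http_port, quote(parts.path))
```

For example, `hdfs_to_webhdfs("hdfs:///heron/topologies/aurora/ExclamationTopology", namenode_host="namenode.example")` yields a URL that curl can fetch, where `namenode.example` is a placeholder host name.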

kramasamy Jul 5, 2016

Contributor

@mhajibaba - This is because when you use the HDFS uploader, your Aurora file heron.aurora needs to be changed to use the hdfs command line. Essentially, you need to change fetch_heron_system and fetch_heron_package to use the hdfs command line in this file:

https://github.com/twitter/heron/blob/master/heron/config/src/yaml/conf/aurora/heron.aurora

mhajibaba Jul 9, 2016

@caofangkun Thanks a lot! I've configured HDFS, and Heron can now use it.
But I use a static command in the Aurora file and need to change the hdfs:/// URI to an http:///webhdfs/v1/ URI.
How can I do that?

@kramasamy Thanks. I had already changed the heron.aurora file as follows:

fetch_user_package = Process(
  name = 'fetch_user_package',
  cmdline = '/opt/hadoop-2.7.0/bin/hadoop fs -get  %s  ./%s  && tar zxf %s' % (heron_topology_jar_uri, topology_package_file, topology_package_file)
)

But this requires installing the Hadoop client on all worker machines, which is not good, so I need to use the curl command instead.
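
A curl-based replacement for the hadoop fs -get command line above might be built roughly like this. It is only a sketch: `curl_fetch_cmdline` is a hypothetical helper, and it assumes the URL passed in is already a WebHDFS URL ending in ?op=OPEN rather than an hdfs:// URI. The resulting string could be used as the cmdline of a Process such as fetch_user_package.

```python
import shlex


def curl_fetch_cmdline(webhdfs_url, package_file):
    """Build a shell command that downloads a package over WebHDFS and unpacks it."""
    target = shlex.quote("./" + package_file)
    # -f: exit non-zero on HTTP errors, so the tar step is skipped on failure.
    # -L: follow the redirect that the NameNode issues to a DataNode for op=OPEN.
    # -o: write the response body to the target file.
    return "curl -f -L -o %s %s && tar zxf %s" % (
        target,
        shlex.quote(webhdfs_url),
        shlex.quote(package_file),
    )
```

The URL is shell-quoted because the ?op=OPEN query string contains a character the shell would otherwise try to interpret.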

maosongfu Jul 9, 2016

Contributor

@mhajibaba You may need to customize the HDFSUploader to return an http:///webhdfs/v1/ URI.

mhajibaba Jul 9, 2016

@maosongfu What is the proper configuration? It cannot be heron.uploader.hdfs.topologies.directory.uri.
How do I tell the uploader to return my desired URI?

The main reason I want to use HDFS is to run Heron in a truly distributed mode and to upload my program (jar file) to a single location rather than to all machines. Is HDFS the solution? If so, where can I find the proper configuration?

maosongfu Jul 9, 2016

Contributor

@mhajibaba
For the heron-core package, you can configure the URI directly in the config: https://github.com/twitter/heron/blob/master/heron/config/src/yaml/conf/aurora/client.yaml#L2
For the topology package, IUploader.uploadPackage() (https://github.com/twitter/heron/blob/master/heron/spi/src/java/com/twitter/heron/spi/uploader/IUploader.java#L48) returns the topology URI, so you may need to change the HDFSUploader a little. Alternatively, a pull request to make HDFSUploader more generic is welcome.

kramasamy Jul 24, 2016

Contributor

@mhajibaba - we fixed one more issue with respect to YARN - especially when heron-shell is started - #1140

@kramasamy kramasamy closed this Oct 13, 2016
