LOAD DATA unable to locate files #9
Comments
I'm not sure if I understand correctly what you are trying to do but if you use
Hi @gmouchakis, that is right. What I am trying to do is push data into HDFS through the HDFS REST API and then use it from Hive, much as I do in production with my company's cluster. I also discovered that the /user/hive/warehouse path created by Hive does not appear on the HDFS filesystem defined by this docker-compose environment, which means the Hive container is somehow pointing to a different HDFS, but I still can't find the root of the problem. If I am right, Hive uses the namenode to find which datanode holds the data in HDFS, but that is not happening in this containerized environment. Everything should happen through HTTP interactions between the Docker containers, shouldn't it?
Hi @enanablancaynumeros! It was a stupid mistake on my side: Hadoop inside the Hive container was not set up to work with a remote Hadoop. It is fixed now, and I've also updated Hadoop to 2.8. If you still have this problem, feel free to reopen the issue.
I broke the JDBC connector just now, give me a moment.
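One way to check which HDFS a container is configured to use is to read the `fs.defaultFS` property from its Hadoop `core-site.xml`. The sketch below is not taken from the repository: the helper name `read_hadoop_property` and the sample value `hdfs://namenode:8020` are assumptions for illustration, and the actual location of `core-site.xml` inside the containers is not stated in this thread.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name):
    """Return the value of property `name` from a Hadoop-style *-site.xml, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Hypothetical file contents; the host name and port are assumptions,
# not values confirmed anywhere in this issue.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:8020</value>
  </property>
</configuration>
"""

print(read_hadoop_property(SAMPLE_CORE_SITE, "fs.defaultFS"))
```

If the value read in the Hive container differs from the one in the namenode container (or is missing, so that Hadoop falls back to the local filesystem), the two are not looking at the same directory tree.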
First start up the system as described in README.md. Connect to Hive:
Connect to the namenode and check that Hive is writing to the right HDFS:
Now if you do the same thing from the hive-server container, you will see the same result:
Hi @earthquakesan, thanks for the answer! I got it working if I only use the new entrypoint script, but it doesn't work in other use cases that you haven't described in your steps above, when I try to use the new Hadoop and JDBC versions. For some reason I haven't had time to identify, it is possible to run beeline inside the Docker container, and it now points to the right HDFS, but if you try to reach port 10000 from outside or from other Docker containers, the connection is refused, which affects plain curl calls and external JDBC drivers. Let me know if you cannot reproduce the problem. I ran docker-compose build --no-cache a couple of times after your changes and still got that error.
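A refused connection on port 10000 can be reproduced without a JDBC driver by attempting a plain TCP connection. The following is a minimal sketch using only the Python standard library; the function name `port_is_open` is hypothetical, and `localhost` assumes the compose file publishes port 10000 to the host.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


if __name__ == "__main__":
    # Assumption: HiveServer2 is expected on port 10000 of the local host.
    print(port_is_open("localhost", 10000))
```

Running the same check from the host and from another container on the same Docker network helps tell apart a port that is not published from a server that is not listening on an external interface.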
@enanablancaynumeros I need a description of how you run the application, because I have no problems running the example app from here: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC
What I did to run it:
It will fail to read a.txt, because the file is created on your host and not on the remote host. Therefore you first have to copy the file there:
Otherwise, I cannot see any problems with this sample application. Here is the output:
Hello,
I am trying to load data from files into Hive using this docker-compose environment, but it is unable to locate the files. I am PUTting the data files through the REST API of the namenode on port 50070 (if I remember correctly) without problems. I can see the files through the file browser and by running hdfs commands inside the namenode container, but the hive-server container doesn't seem to see the same directory tree as the namenode, so when I use the Hive LOAD DATA statement with the path I uploaded to, it fails. Inside the hive-server container, hdfs doesn't show that file either. It is only able to load files if I create them locally in the hive-server container (using a volume, because I didn't find an editor :p).
I haven't changed anything in the compose file, apart from using version 3, creating a common network, and defining dependencies instead of links.
Would you be able to point me in the right direction? How can I change the Hive configuration, or check that I am writing to the same namenode?
I am new to Hadoop and Hive, so apologies for any architectural misconceptions.
Thanks,
Yeray
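For reference, an upload through the namenode REST API (WebHDFS) is a two-step exchange: a PUT with `op=CREATE` to the namenode is answered with a 307 redirect to a datanode, and the file body is then sent in a second PUT to that redirect location. The sketch below only builds the first-step URL; the function name `webhdfs_create_url`, the host name `namenode`, and the example path are assumptions for illustration, not values confirmed in this issue.

```python
from urllib.parse import quote, urlencode


def webhdfs_create_url(namenode_host, hdfs_path, port=50070, user=None, overwrite=False):
    """Build the namenode URL for the first step of a WebHDFS upload (op=CREATE)."""
    params = {"op": "CREATE", "overwrite": "true" if overwrite else "false"}
    if user is not None:
        # Optional WebHDFS parameter naming the user the request runs as.
        params["user.name"] = user
    return "http://{}:{}/webhdfs/v1{}?{}".format(
        namenode_host, port, quote(hdfs_path), urlencode(params)
    )


# Hypothetical host and path, used only to show the shape of the URL.
print(webhdfs_create_url("namenode", "/tmp/a.txt"))
```

In a Docker setup, the redirect in the second step usually points at the datanode's container host name, so the client has to be able to resolve and reach that name as well as the namenode.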