Support for Hadoop 3? #657
Comments
@theyaa, no, Dr. Elephant currently doesn't support Hadoop 3 with ATS v2. But you can use Dr. Elephant with Hadoop 3 in prod, provided that your Yarn REST APIs and history servers are in sync with what Dr. Elephant is expecting.
Hi @ShubhamGupta29, in HDP3 (Hadoop 3), all Hive queries run using the Tez engine, and Tez is built to send query updates/progress to Yarn ATSv2. Using the Yarn timeline server v1 REST API, we cannot get Tez query progress information anymore. We have to use Yarn ATSv2, or read from the query_data and dag_data tables in Hive's sys db.
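For context on what the v1 route looks like, here is a minimal sketch of listing Tez DAG entities through the Yarn timeline server v1 REST API. The host name and port in the usage note are placeholders, the helper names are made up for illustration, and this is not how Dr. Elephant itself fetches data. On a cluster where Tez logs only to protobuf, the `entities` list would be expected to come back empty, which matches the behaviour described above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def timeline_v1_url(base_url, entity_type="TEZ_DAG_ID", limit=10):
    """Build an ATS v1 URL that lists entities of one type.

    base_url is the timeline server web address (assumed, site specific).
    """
    query = urlencode({"limit": limit})
    return f"{base_url.rstrip('/')}/ws/v1/timeline/{entity_type}?{query}"


def entity_ids(response_text):
    """Return the entity ids found in an ATS v1 JSON response body."""
    payload = json.loads(response_text)
    return [e.get("entity") for e in payload.get("entities", [])]


def fetch_entity_ids(base_url, entity_type="TEZ_DAG_ID", limit=10):
    """Call the timeline server and return the listed entity ids."""
    with urlopen(timeline_v1_url(base_url, entity_type, limit)) as resp:
        return entity_ids(resp.read().decode("utf-8"))
```

Usage would be something like `fetch_entity_ids("http://timeline-host.example.com:8188")`, with the address replaced by the real timeline server web address.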
@theyaa, got the need for ATSv2. I will have a look at all the needs and changes for this requirement and prioritize accordingly.
@ShubhamGupta29 thank you very much. Please let me know when you have a working version so I can download and try it out.
@theyaa Is the Tez UI working in your HDP 3 install?
Hi @shkhrgpt the value is: org.apache.tez.dag.history.logging.proto.ProtoHistoryLoggingService
@theyaa That may be why the timeline server is not returning data for Tez. Maybe if you change the value of https://tez.apache.org/tez-ui.html I haven't tested it yet, so I don't know if it causes any problems. But maybe you can try?
Hi @shkhrgpt This will cause issues with Yarn and Hive logging, since Yarn with Hadoop 3 and HDP3 logs to Yarn ATSv2, and the latter uses Protobuf and writes to HBase. If I switch to the old class for Tez, I will lose that logging and cause issues in Yarn. That is why I was asking if there is a way to modify Dr. Elephant to be able to read from Yarn ATSv2.
Okay @theyaa.
@theyaa https://github.com/shkhrgpt/tez-logging The goal is that Dr. Elephant should be able to get data from the ATSv1 REST API, and the data should also be written to protobuf so nothing else is affected.
Hi @shkhrgpt Tez+Hive in Hive3 do log all query/dag events to a Hive database called sys. Under the sys db there are two main tables, query_data and dag_data. If you can get Dr. Elephant to read from those two tables, then it will be able to process Hive queries the same way as before. Cloudera has a tool called "Data Analytics Studio". It does exactly this and presents the query in a web user interface. I believe if Dr. Elephant can parse those two tables from Hive's sys db, it will be able to perform the same exact way.
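A rough sketch of what reading those two tables could look like from Python, assuming some DB-API style cursor connected to HiveServer2 is already available (how that connection is made is left out and depends on the client library used). The function name is hypothetical, and since the column layout of sys.query_data and sys.dag_data is not given in this thread, the sketch selects all columns and takes the names from the cursor instead of guessing them.

```python
ALLOWED_TABLES = ("query_data", "dag_data")


def fetch_sys_rows(cursor, table, limit=10):
    """Read up to `limit` rows from sys.<table> as a list of dicts.

    `cursor` is any DB-API style cursor (assumed to be connected to Hive).
    Only the two tables named in this thread are accepted.
    """
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    # int() keeps the limit numeric before it is placed in the statement.
    cursor.execute(f"SELECT * FROM sys.{table} LIMIT {int(limit)}")
    # Column names come from the cursor metadata, not from assumptions.
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

A tool like Dr. Elephant would then need to map whatever columns come back onto its own heuristics, which is the real work and is not covered here.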
Does Dr. Elephant provide support for Hadoop 3 with Yarn ATS V2 please?