KYLIN-5023 Support cluster deployMode for Standalone #1677

Merged
hit-lacus merged 1 commit into apache:kylin-on-parquet-v2 from hit-lacus:KYLIN-5023
Jul 12, 2021

Conversation

hit-lacus commented Jul 1, 2021

Member

Proposed changes

Describe the big picture of your changes here to communicate to the maintainers why we should accept this pull request. If it fixes a bug or resolves a feature request, be sure to link to that issue.

Design doc

Types of changes

What types of changes does your code introduce to Kylin?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have created an issue on Kylin's jira, and have described the bug/feature there in detail
  • Commit messages in my PR start with the related jira ID, like "KYLIN-0000 Make Kylin project open-source"
  • Compiling and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • If this change needs a document change, I will prepare another PR against the document branch
  • Any dependent changes have been merged

Further comments

If this is a relatively large or complex change, kick off the discussion at user@kylin or dev@kylin by explaining why you chose the solution you did and what alternatives you considered, etc...

hit-lacus commented Jul 3, 2021

Member Author

Design Doc

root cause and new design Chinese


Basic Test Case

Configuration

Enable standalone cluster mode with the following configuration.

kylin.engine.spark.standalone.master.httpUrl=http://cdh-master:10031
kylin.engine.spark-conf.spark.submit.deployMode=cluster
kylin.engine.spark-conf.spark.shuffle.service.enabled=false
kylin.engine.spark-conf.spark.master=spark://cdh-master:10030
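
As a rough illustration of how the properties above relate to Spark settings, the sketch below strips the `kylin.engine.spark-conf.` prefix and prints the remaining keys as `--conf` pairs. It assumes that Kylin forwards prefixed properties to Spark in this way; `SparkConfSketch` and `extractSparkConf` are hypothetical names, not part of this PR.

```scala
object SparkConfSketch {
  // Prefix copied from the properties shown above.
  val SparkConfPrefix = "kylin.engine.spark-conf."

  /** Keeps only the prefixed entries and strips the prefix from their keys. */
  def extractSparkConf(kylinProps: Map[String, String]): Map[String, String] =
    kylinProps.collect {
      case (key, value) if key.startsWith(SparkConfPrefix) =>
        key.stripPrefix(SparkConfPrefix) -> value
    }

  def main(args: Array[String]): Unit = {
    // Same four properties as in the configuration above.
    val props = Map(
      "kylin.engine.spark.standalone.master.httpUrl" -> "http://cdh-master:10031",
      "kylin.engine.spark-conf.spark.submit.deployMode" -> "cluster",
      "kylin.engine.spark-conf.spark.shuffle.service.enabled" -> "false",
      "kylin.engine.spark-conf.spark.master" -> "spark://cdh-master:10030"
    )
    // Assumption: each remaining entry would be handed to Spark as a --conf pair.
    extractSparkConf(props).toSeq.sorted.foreach { case (k, v) =>
      println(s"--conf $k=$v")
    }
  }
}
```

Note that `kylin.engine.spark.standalone.master.httpUrl` does not carry the prefix, so in this sketch it is treated as a Kylin-side setting rather than a Spark one.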

Log Output

2021-07-03 17:33:19,757 DEBUG [BadQueryDetector] service.BadQueryDetector:148 : Detect bad query.
2021-07-03 17:33:21,327 INFO  [Scheduler 959451607 Job d5225f31-3a64-4541-95be-47e3f6916f3a-248] deploy.SparkApplicationClient:54 : d5225f31-3a64-4541-95be-47e3f6916f3a state is SUBMITTED .
2021-07-03 17:33:22,703 INFO  [FetcherRunner 1179123013-26] threadpool.DefaultFetcherRunner:117 : Job Fetcher: 1 should running, 1 actual running, 0 stopped, 0 ready, 11 already succeed, 2 error, 7 discarded, 0 others
2021-07-03 17:33:31,350 INFO  [Scheduler 959451607 Job d5225f31-3a64-4541-95be-47e3f6916f3a-248] deploy.SparkApplicationClient:54 : d5225f31-3a64-4541-95be-47e3f6916f3a state is SUBMITTED .
2021-07-03 17:33:37,138 DEBUG [http-bio-7070-exec-7] badquery.BadQueryHistoryManager:65 : Loaded 0 Bad Query(s)
2021-07-03 17:33:41,372 INFO  [Scheduler 959451607 Job d5225f31-3a64-4541-95be-47e3f6916f3a-248] deploy.SparkApplicationClient:54 : d5225f31-3a64-4541-95be-47e3f6916f3a state is SUBMITTED .
2021-07-03 17:33:51,396 INFO  [Scheduler 959451607 Job d5225f31-3a64-4541-95be-47e3f6916f3a-248] deploy.SparkApplicationClient:54 : d5225f31-3a64-4541-95be-47e3f6916f3a state is SUBMITTED .
2021-07-03 17:33:52,703 INFO  [FetcherRunner 1179123013-26] threadpool.DefaultFetcherRunner:117 : Job Fetcher: 1 should running, 1 actual running, 0 stopped, 0 ready, 11 already succeed, 2 error, 7 discarded, 0 others
2021-07-03 17:34:01,448 INFO  [Scheduler 959451607 Job d5225f31-3a64-4541-95be-47e3f6916f3a-248] deploy.SparkApplicationClient:54 : d5225f31-3a64-4541-95be-47e3f6916f3a state is SUBMITTED .
2021-07-03 17:34:11,470 INFO  [Scheduler 959451607 Job d5225f31-3a64-4541-95be-47e3f6916f3a-248] deploy.SparkApplicationClient:54 : d5225f31-3a64-4541-95be-47e3f6916f3a state is FINISHED .
2021-07-03 17:34:11,471 INFO  [Scheduler 959451607 Job d5225f31-3a64-4541-95be-47e3f6916f3a-248] utils.MetaDumpUtil:118 : Ready to load KylinConfig from uri: xiaoxiangyu@hdfs,path=hdfs://cdh-master:8020/LacusDir/xiaoxiangyu/xiaoxiangyu/LACUS_PRJ/job_tmp/d5225f31-3a64-4541-95be-47e3f6916f3a-01/meta
2021-07-03 17:34:11,512 INFO  [Scheduler 959451607 Job d5225f31-3a64-4541-95be-47e3f6916f3a-248] common.KylinConfigBase:263 : Kylin Config was updated with kylin.metadata.url.identifier : xiaoxiangyu
2021-07-03 17:34:11,512 INFO  [Scheduler 959451607 Job d5225f31-3a64-4541-95be-47e3f6916f3a-248] common.KylinConfigBase:263 : Kylin Config was updated with kylin.log.spark-executor-properties-file : /root/xiaoxiang.yu/Kylin4/apache-kylin-4.0.0-SNAPSHOT-bin/conf/spark-executor-log4j.properties
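
The log above shows the application state being checked roughly every 10 seconds until it changes from SUBMITTED to FINISHED. A minimal sketch of such a polling loop follows. It is not the code from this PR: `DriverStatePolling`, `awaitTerminalState`, and the `fetchState` callback are hypothetical, and the set of terminal states is an assumption (only SUBMITTED and FINISHED appear in the log).

```scala
import scala.annotation.tailrec

object DriverStatePolling {
  // Assumed terminal states; only FINISHED is confirmed by the log above.
  val TerminalStates: Set[String] = Set("FINISHED", "FAILED", "KILLED", "ERROR")

  /**
   * Calls fetchState until it returns a terminal state or maxPolls is used up.
   * Returns Some(state) on a terminal state, None if polling gave up.
   */
  @tailrec
  def awaitTerminalState(fetchState: () => String,
                         intervalMs: Long,
                         maxPolls: Int): Option[String] = {
    if (maxPolls <= 0) None
    else {
      val state = fetchState()
      if (TerminalStates.contains(state)) Some(state)
      else {
        // Not finished yet: wait before asking again.
        Thread.sleep(intervalMs)
        awaitTerminalState(fetchState, intervalMs, maxPolls - 1)
      }
    }
  }
}
```

Passing the state lookup as a function keeps the loop independent of how the state is obtained, so it can be exercised with a stub instead of a running Spark master.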

Kylin UI

image

image

@hit-lacus hit-lacus marked this pull request as ready for review July 3, 2021 09:37
@hit-lacus

Member Author

Test Case

Kill the Driver while the spark application is running

The screenshot shows that after the Driver is killed, the Driver status is not the same as the status of the Spark application.
image

Comment on lines +70 to +73
val doNothing: PartialFunction[(String, String, Long), (String, String, Long)] = {
case x => x
}
val res: Iterable[(String, String, Long)] = cachedKylinJobMap.values.filter(app => app._1.contains(stepId)).collect(doNothing)

Member

Hi, could this code be simplified to:

val res = cachedKylinJobMap.values.filter(_._1.contains(stepId))
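
To illustrate why the simplification is safe: a `collect` whose partial function matches every element and returns it unchanged adds nothing after `filter`. The sketch below compares both forms on sample data. The map type `Map[String, (String, String, Long)]` and the sample entries are assumptions made for illustration, since the PR excerpt does not show how `cachedKylinJobMap` is declared.

```scala
object FilterEquivalence {
  // Assumed shape of one cached entry, matching the tuple type in the snippet above.
  type JobEntry = (String, String, Long)

  /** The form from the diff: filter, then collect with an identity partial function. */
  def verbose(cache: Map[String, JobEntry], stepId: String): Iterable[JobEntry] = {
    val doNothing: PartialFunction[JobEntry, JobEntry] = { case x => x }
    cache.values.filter(app => app._1.contains(stepId)).collect(doNothing)
  }

  /** The suggested form: filter only. */
  def simplified(cache: Map[String, JobEntry], stepId: String): Iterable[JobEntry] =
    cache.values.filter(_._1.contains(stepId))

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data; field meanings are not given in the PR.
    val cache = Map(
      "a" -> (("job-step-01", "driver-1", 1L)),
      "b" -> (("job-step-02", "driver-2", 2L))
    )
    println(verbose(cache, "step-01").toList == simplified(cache, "step-01").toList)
  }
}
```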

Comment on lines +88 to +92
var respJson = Map.empty[String, Any]
val tree = parseFull(responseStr)
respJson = tree match {
case Some(map: Map[String, Any]) => map
}

Member

maybe simplify to:

val respJson = parseFull(responseStr).get.asInstanceOf[Map[String, Any]]
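
One caveat applies to both versions: they throw when the response is not a JSON object. The original `match` has no case for `None` or for a non-map value and raises a `MatchError`, while `.get` raises a `NoSuchElementException` on `None`. A more defensive variant, sketched below under the assumption that an empty map is an acceptable fallback, takes the `Option[Any]` returned by `parseFull` and handles every case. `JsonResponse` and `toJsonMap` are hypothetical names.

```scala
object JsonResponse {
  /**
   * Converts the Option[Any] returned by parseFull into a map.
   * Falls back to an empty map when parsing failed (None) or when
   * the top-level JSON value is not an object.
   */
  def toJsonMap(parsed: Option[Any]): Map[String, Any] = parsed match {
    // Type arguments are erased at runtime, so match on Map[_, _] and cast;
    // the cast itself does not verify that the keys are strings.
    case Some(m: Map[_, _]) => m.asInstanceOf[Map[String, Any]]
    case _                  => Map.empty[String, Any]
  }
}
```

With this helper, the call site would read `val respJson = JsonResponse.toJsonMap(parseFull(responseStr))`. Whether an empty map or an explicit error is the better reaction to a malformed response depends on how the caller uses `respJson`.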

@xiacongling


Hi, @hit-lacus. Similar to #1682, compilation failed for Spark3 compatibility. To resolve the problem, L495 of KylinExpressions can be changed as follows:

https://github.com/apache/kylin/pull/1680/files#diff-eabd8d352aa09399e10ce9d87d3142f314006e5f43980911d98bc61a42dd50eaL495-R495

@hit-lacus

Member Author

Hi, @hit-lacus. Similar with #1682, compilation failed for Spark3 compatibility. To resolve the problem, L495 of KylinExpressions may be changed like the following:

https://github.com/apache/kylin/pull/1680/files#diff-eabd8d352aa09399e10ce9d87d3142f314006e5f43980911d98bc61a42dd50eaL495-R495

Thanks for letting me know.

@hit-lacus hit-lacus merged commit b2016f9 into apache:kylin-on-parquet-v2 Jul 12, 2021
3 participants