
[CARBONDATA-2767][CarbonStore] Fix task locality issue #2528

Closed
wants to merge 4 commits

Conversation

QiangCai
Contributor

QiangCai commented Jul 19, 2018

If the Spark cluster and the Hadoop cluster are two different machine clusters, the Spark tasks will run in RACK_LOCAL mode.

In that case there is no need to provide the preferred locations to the tasks.
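
A minimal sketch of the idea, reusing names from this PR's diff (isTaskLocality, CarbonSparkPartition); the exact method body is an illustration, not the merged code:

import org.apache.spark.Partition

// Inside CarbonScanRDD: when task locality is disabled, report no preferred
// locations, so Spark schedules tasks on any free executor (ANY) instead of
// leaving them waiting for RACK_LOCAL slots on an unreachable HDFS cluster.
override def getPreferredLocations(split: Partition): Seq[String] = {
  if (!isTaskLocality) {
    Nil
  } else {
    split.asInstanceOf[CarbonSparkPartition]
      .split.value.getLocations
      .filter(_ != "localhost") // "localhost" names no particular node
      .toSeq
  }
}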

  • Any interfaces changed?

  • Any backward compatibility impacted?

  • Document update required?

  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7315/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6079/

QiangCai changed the title [WIP][CarbonStore] Fix task locality issue → [CARBONDATA-2767][CarbonStore] Fix task locality issue Jul 23, 2018
@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7388/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6149/

@ravipesala
Contributor

SDV Build Fail, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/5962/

@@ -1882,6 +1882,13 @@

public static final String CARBON_MERGE_INDEX_IN_SEGMENT_DEFAULT = "true";

/**
* config carbon scan task locality
Contributor

Please provide more detail, like what scheduling behavior will be used for true and false
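
For readers following along, a sketch of what the two values presumably control; the property key and default here are assumptions, while CarbonProperties.getInstance/getProperty are existing CarbonData APIs:

import org.apache.carbondata.core.util.CarbonProperties

// Assumed key and default, for illustration only. true: CarbonScanRDD reports
// HDFS block locations as preferred locations (NODE_LOCAL/RACK_LOCAL
// scheduling where possible). false: it reports none, so Spark schedules
// tasks on any free executor (ANY).
val localityEnabled: Boolean = CarbonProperties.getInstance()
  .getProperty("carbon.scan.task.locality", "true")
  .toBoolean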

@@ -87,6 +87,8 @@ class CarbonScanRDD[T: ClassTag](
}
private var vectorReader = false

private val isTaskLocality = CarbonProperties.isTaskLocality
Contributor

it can be transient
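
A minimal sketch of the suggested change; since getPreferredLocations runs on the driver, the flag never needs to travel inside the serialized RDD:

// @transient keeps the flag out of the serialized closure; it is only
// consulted on the driver when computing preferred locations.
@transient private val isTaskLocality = CarbonProperties.isTaskLocality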

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7444/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6199/

<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-s3</artifactId>
<version> 1.10.6</version>
Contributor

remove tab

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7472/

.split
.value
.getLocations
.filter(_ != "localhost")
Contributor

What will happen if I configure TaskLocality and run the job on a local machine or in local pseudo-distributed mode?
Besides, if you really want to exclude the local machine, why is only "localhost" excluded and not the host name of the local machine as well?

Contributor Author

It is not meant to exclude the local machine.
Since every machine may have "127.0.0.1 localhost" configured, a "localhost" entry identifies no particular node and is useless as a locality hint.
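
A tiny illustration of that point (the host names are made up):

// Every node resolves 127.0.0.1 to "localhost", so that entry identifies no
// particular machine; only real host names are kept as locality hints.
val locations = Array("localhost", "datanode-1", "datanode-2")
val useful = locations.filter(_ != "localhost") // keeps datanode-1, datanode-2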

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6229/

pom.xml Outdated
@@ -110,7 +110,7 @@
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<snappy.version>1.1.2.6</snappy.version>
<hadoop.version>2.7.2</hadoop.version>
<hadoop.version>2.8.3</hadoop.version>
Contributor

jackylk commented Jul 25, 2018

Please add this in the Horizon profile only.

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/7492/

@jackylk
Contributor

jackylk commented Jul 25, 2018

LGTM

asfgit pushed a commit that referenced this pull request Jul 25, 2018
If the Spark cluster and the Hadoop cluster are two different machine clusters, the Spark tasks will run in RACK_LOCAL mode.

This closes #2528
@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/6253/

QiangCai closed this Jul 26, 2018
xubo245 pushed a commit to xubo245/carbondata that referenced this pull request Aug 30, 2018
If the Spark cluster and the Hadoop cluster are two different machine clusters, the Spark tasks will run in RACK_LOCAL mode.

This closes apache#2528
xubo245 pushed further commits with the same message to xubo245/carbondata that referenced this pull request on Aug 30 and Sep 3, 2018.