
[CARBONDATA-3210] Merge common method into CarbonSparkUtil #3032

Closed

wants to merge 15 commits into from

Conversation

xiaohui0318
Contributor

@xiaohui0318 xiaohui0318 commented Dec 28, 2018

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed?

  • Any backward compatibility impacted?

  • Document update required?

  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

org.apache.carbondata.examples.S3Example$#getKeyOnPrefix
org.apache.carbondata.spark.thriftserver.CarbonThriftServer#getKeyOnPrefix
Merge the getKeyOnPrefix method from these three classes into spark2/src/main/scala/org/apache/carbondata/spark/util/CarbonSparkUtil.scala
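After the merge, the shared helper in CarbonSparkUtil might look roughly like the sketch below. This is only an illustration: the s3a key names are mirrored from hadoop-aws constants so the snippet is self-contained, and the object name CarbonSparkUtilSketch is hypothetical.

```scala
object CarbonSparkUtilSketch {
  // Key names mirrored from org.apache.hadoop.fs.s3a.Constants so this
  // sketch compiles on its own; the real code imports ACCESS_KEY,
  // SECRET_KEY and ENDPOINT from hadoop-aws.
  private val AccessKey = "fs.s3a.access.key"
  private val SecretKey = "fs.s3a.secret.key"
  private val Endpoint = "fs.s3a.endpoint"
  private val S3aPrefix = "s3a://"

  // Maps an s3a table path to the three Spark config keys that carry the
  // credentials and endpoint.
  def getKeyOnPrefix(path: String): (String, String, String) = {
    val sparkPrefix = "spark.hadoop."
    if (path.startsWith(S3aPrefix)) {
      (sparkPrefix + AccessKey, sparkPrefix + SecretKey, sparkPrefix + Endpoint)
    } else {
      throw new IllegalArgumentException("Unsupported path: " + path)
    }
  }
}
```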
@CarbonDataQA

Can one of the admins verify this patch?

@qiuchenjian
Contributor

Please describe the change of this PR

README.md Outdated
@@ -84,3 +85,6 @@ To get involved in CarbonData:
## About
Apache CarbonData is an open source project of The Apache Software Foundation (ASF).


## 2018-12-28开始
Contributor

What is the purpose of this description, and why is it in Chinese?

def getKeyOnPrefix(path: String): (String, String, String) = {
val endPoint = "spark.hadoop." + ENDPOINT
if (path.startsWith(CarbonCommonConstants.S3A_PREFIX)) {
("spark.hadoop." + ACCESS_KEY, "spark.hadoop." + SECRET_KEY, endPoint)
Contributor

Duplicated "spark.hadoop." literals make refactoring error-prone, since you must be sure to update every occurrence. I think you can define a single variable for it.
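The suggestion can be sketched as hoisting the repeated literal into one value. Note that keysFor is a hypothetical helper for illustration, not code from the PR:

```scala
object SparkHadoopKeys {
  // Single definition of the repeated "spark.hadoop." literal, so a future
  // rename only touches one line.
  val SparkHadoopPrefix = "spark.hadoop."

  // Hypothetical helper: prefixes each Hadoop config key for Spark.
  def keysFor(accessKey: String, secretKey: String, endpoint: String)
      : (String, String, String) =
    (SparkHadoopPrefix + accessKey,
      SparkHadoopPrefix + secretKey,
      SparkHadoopPrefix + endpoint)
}
```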

@xubo245
Contributor

xubo245 commented Dec 29, 2018

@xiaohui0318 Please optimize the title, for example change "merge" to "Merge".

@@ -117,4 +116,29 @@ object CarbonSparkUtil {
case _ =>
delimiter
}
def getKeyOnPrefix(path: String): (String, String, String) = {
Contributor

Please add empty line before this line

@xiaohui0318 xiaohui0318 changed the title [CARBONDATA-3210] merge getKeyOnPrefix into CarbonSparkUtil [CARBONDATA-3210] Merge common method into CarbonSparkUtil Dec 29, 2018
}

def getS3EndPoint(args: Array[String]): String = {
if (args.length >= 4 && args(3).contains(".com")) args(3)
Contributor

Can you optimize it? For example, use named values such as length and endPoint instead of args(3), because the endpoint may be the 3rd or 4th parameter. It will break if the argument order changes.
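One way to address this concern is to parse the positional arguments once into named fields. The sketch below is hypothetical: S3Args, its field names, and the "local" default are illustrative, not part of the PR.

```scala
// Hypothetical container so callers use named fields instead of magic
// indices like args(3) and args(5).
case class S3Args(
    accessKey: String,
    secretKey: String,
    tablePath: String,
    endpoint: Option[String],
    sparkMaster: String)

object S3Args {
  // Expected order, following the S3 examples' usage string:
  // access-key secret-key table-path [s3-endpoint] [spark-master]
  def parse(args: Array[String]): S3Args = {
    require(args.length >= 3, "need access-key, secret-key and table-path")
    S3Args(
      accessKey = args(0),
      secretKey = args(1),
      tablePath = args(2),
      // Keep the original heuristic: an endpoint argument contains ".com".
      endpoint = args.lift(3).filter(_.contains(".com")),
      sparkMaster = args.lift(5).getOrElse("local"))
  }
}
```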

}

def getSparkMaster(args: Array[String]): String = {
if (args.length == 6) args(5)
Contributor

Can you optimize it, as in the previous comments?

@@ -62,8 +62,7 @@ object CarbonSparkUtil {
/**
* return's the formatted column comment if column comment is present else empty("")
*
* @param carbonColumn
* @return
* @return comment
Contributor

Why do you delete @param carbonColumn?

@xubo245
Contributor

xubo245 commented Dec 29, 2018

Have you validated it with S3?

.appName("S3UsingSDKExample")
.config("spark.driver.host", "localhost")
.config(accessKey, args(0))
.config(secretKey, args(1))
.config(endpoint, getS3EndPoint(args))
.config(endpoint,CarbonSparkUtil.getS3EndPoint(args))
Contributor

Add a blank space before CarbonSparkUtil.

@@ -21,8 +21,8 @@ import java.io.File
import org.apache.hadoop.fs.s3a.Constants.{ACCESS_KEY, ENDPOINT, SECRET_KEY}
import org.apache.spark.sql.{Row, SparkSession}
import org.slf4j.{Logger, LoggerFactory}

Contributor

don't remove this blank line

@@ -20,10 +20,10 @@ import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.s3a.Constants.{ACCESS_KEY, ENDPOINT, SECRET_KEY}
import org.apache.spark.sql.SparkSession
import org.slf4j.{Logger, LoggerFactory}

Contributor

don't remove this line

@@ -24,10 +24,10 @@ import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.slf4j.{Logger, LoggerFactory}

Contributor

don't remove

@xubo245
Contributor

xubo245 commented Dec 29, 2018

add to whitelist

@CarbonDataQA

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10336/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2082/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2287/

@xiaohui0318
Contributor Author

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed?
    NO
  • Any backward compatibility impacted?
    NO
  • Document update required?
    NO
  • Testing done
NO, only fixes a style error
  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
    NO

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2088/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2090/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2295/

@CarbonDataQA

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10344/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2182/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2398/

@CarbonDataQA

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10438/

*
* @param args require three parameters "Access-key" "Secret-key"
* "table-path on s3" "s3-endpoint" "spark-master"
*/
Contributor

@xiaohui0318 please check the indentation of the comments; one blank space needs to be removed.
Check the other comments as well.

Contributor Author

Checked and fixed.

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2185/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2187/

@CarbonDataQA

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10441/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2403/

@CarbonDataQA

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10443/

@zzcclp
Contributor

zzcclp commented Jan 8, 2019

LGTM


/**
* Generate data and write data to S3
* User can generate different numbers of data by specifying the number-of-rows in parameters
*/
object S3UsingSDKExample {
object S3UsingSdkExample {
Contributor

Please test it with Huawei OBS

Contributor Author

done

* CarbonThriftServer support different modes:
* 1. read/write data from/to HDFS or local,it only needs configurate storePath
* 2. read/write data from/to S3, it needs provide access-key, secret-key, s3-endpoint
*/
object CarbonThriftServer {
Contributor

Please test it with Huawei OBS

Contributor Author

done

import org.apache.spark.sql.{Row, SparkSession}
import org.slf4j.{Logger, LoggerFactory}

import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.spark.util.CarbonSparkUtil

object S3Example {
Contributor

Please test it with Huawei OBS

Contributor Author

done

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2223/

@CarbonDataQA

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2442/

@CarbonDataQA

Build Failed with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10480/

@CarbonDataQA

Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2225/

@CarbonDataQA

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/2226/

@CarbonDataQA

Build Success with Spark 2.3.2, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/10483/

@CarbonDataQA

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/2445/

@xubo245
Contributor

xubo245 commented Jan 9, 2019

LGTM! Thanks for your contribution!

@asfgit asfgit closed this in 3a41ee5 Jan 9, 2019
asfgit pushed a commit that referenced this pull request Jan 21, 2019
…ample error

1.merge public methods to spark2/src/main/scala/org/apache/carbondata/spark/util/CarbonSparkUtil.scala
	org.apache.carbondata.examples.S3UsingSDKExample#getKeyOnPrefix
	org.apache.carbondata.examples.S3Example$#getKeyOnPrefix
	org.apache.carbondata.spark.thriftserver.CarbonThriftServer#getKeyOnPrefix

2. fix the error of S3UsingSDKExample

This closes #3032
qiuchenjian pushed a commit to qiuchenjian/carbondata that referenced this pull request Jun 14, 2019
…ample error

1.merge public methods to spark2/src/main/scala/org/apache/carbondata/spark/util/CarbonSparkUtil.scala
	org.apache.carbondata.examples.S3UsingSDKExample#getKeyOnPrefix
	org.apache.carbondata.examples.S3Example$#getKeyOnPrefix
	org.apache.carbondata.spark.thriftserver.CarbonThriftServer#getKeyOnPrefix

2. fix the error of S3UsingSDKExample

This closes apache#3032

6 participants