[SPARK-6638] [SQL] Improve performance of StringType in SQL #5303
|
Test build #29512 has started for PR 5303 at commit |
|
Test build #29512 has finished for PR 5303 at commit
|
|
Test FAILed. |
We should create an object Literal with an apply function that converts a String into a UTF8String, rather than making this a var.
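A minimal sketch of what this suggestion could look like. The names here are simplified assumptions for illustration: Utf8Text is a hypothetical stand-in for the PR's UTF8String, and this Literal omits the data type and everything else the real Catalyst expression carries. The point is only that the String conversion happens once in the companion's apply, so the value can stay immutable.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical stand-in for the PR's UTF8String (not Spark's actual class).
final class Utf8Text(val bytes: Array[Byte]) {
  override def toString: String = new String(bytes, StandardCharsets.UTF_8)
}

object Utf8Text {
  def fromString(s: String): Utf8Text =
    new Utf8Text(s.getBytes(StandardCharsets.UTF_8))
}

// Simplified, hypothetical Literal: the value is a val, and the constructor is
// private so every instance goes through the converting factory below.
final class Literal private (val value: Any)

object Literal {
  // Convert java.lang.String to the internal UTF-8 form at construction time;
  // all other values are stored unchanged.
  def apply(v: Any): Literal = v match {
    case s: String => new Literal(Utf8Text.fromString(s))
    case other     => new Literal(other)
  }
}
```

With this shape, Literal("abc") holds a Utf8Text while Literal(42) still holds the Int, and no caller needs to mutate the value after construction.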
|
Test build #29522 has started for PR 5303 at commit |
|
Test build #29522 has finished for PR 5303 at commit
|
|
Test FAILed. |
|
Test build #29526 has started for PR 5303 at commit |
|
Test build #29526 has finished for PR 5303 at commit
|
|
Test FAILed. |
|
Test build #29527 has started for PR 5303 at commit |
|
Test build #29527 has finished for PR 5303 at commit
|
|
Test FAILed. |
|
Test build #29552 has started for PR 5303 at commit |
|
Test build #29552 has finished for PR 5303 at commit
|
|
Test FAILed. |
|
Test build #29557 has started for PR 5303 at commit |
|
Test build #29632 has started for PR 5303 at commit |
|
Test build #29633 has started for PR 5303 at commit |
Conflicts:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala
sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
sql/core/src/main/scala/org/apache/spark/sql/parquet/newParquet.scala
sql/core/src/main/scala/org/apache/spark/sql/sources/commands.scala
|
Test build #29634 has started for PR 5303 at commit |
|
Test build #29637 has started for PR 5303 at commit |
|
Can we open a new PR and close this one to get rid of all the Jenkins messages? |
|
Test build #29632 has finished for PR 5303 at commit
|
|
Test FAILed. |
Did this function exist somewhere before?
No, but we have a similar one: convertToCatalyst(a: Any, dt: DataType)
|
Test build #29633 has finished for PR 5303 at commit
|
|
Test FAILed. |
|
Test build #29634 has finished for PR 5303 at commit
|
|
Test FAILed. |
|
Test build #29637 has finished for PR 5303 at commit
|
|
Test FAILed. |
|
@rxin I will create a new PR after fixing the rest of the tests. |
|
Test build #635 has started for PR 5303 at commit |
|
Test build #29658 has started for PR 5303 at commit |
|
Test build #29658 has finished for PR 5303 at commit
|
|
Test FAILed. |
|
Test build #635 has finished for PR 5303 at commit
|
|
Closing this one to get rid of all the Jenkins comments. |
This PR changes the internal representation of StringType from java.lang.String to UTF8String, which is implemented using an Array[Byte] (encoded in UTF-8).
This PR should not break any public API; Row.getString() will still return java.lang.String.
This is the first step toward improving the performance of strings in SQL.
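To illustrate the description above, here is a rough sketch under stated assumptions, not the code in this PR. SketchUTF8String and SketchRow are hypothetical names: the first wraps UTF-8 bytes and compares them directly, the second stores that internal form but decodes back to java.lang.String at the public getString boundary.

```scala
import java.nio.charset.StandardCharsets
import java.util.Arrays

// Hypothetical byte-backed string; the PR's real UTF8String may differ.
final class SketchUTF8String(private val bytes: Array[Byte]) {
  // Number of bytes in the UTF-8 encoding.
  def numBytes: Int = bytes.length

  // Code points = bytes that are not UTF-8 continuation bytes (10xxxxxx).
  def numCodePoints: Int = bytes.count(b => (b & 0xC0) != 0x80)

  // Equality and hashing work on the raw bytes, without decoding.
  override def equals(other: Any): Boolean = other match {
    case that: SketchUTF8String => Arrays.equals(bytes, that.bytes)
    case _                      => false
  }
  override def hashCode: Int = Arrays.hashCode(bytes)

  // Decoding to java.lang.String happens only on demand.
  override def toString: String = new String(bytes, StandardCharsets.UTF_8)
}

object SketchUTF8String {
  def fromString(s: String): SketchUTF8String =
    new SketchUTF8String(s.getBytes(StandardCharsets.UTF_8))
}

// Hypothetical row: stores the internal form, returns java.lang.String
// from getString so the public API is unchanged.
final class SketchRow(values: Array[Any]) {
  def getString(i: Int): String = values(i) match {
    case null => null
    case v    => v.toString
  }
}
```

In this sketch, comparisons and hashing never allocate a java.lang.String; the conversion cost is paid only when a caller asks for one through getString.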
cc @rxin