-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-5123][SQL] Expose only one version of the data type APIs (i.e. remove Java-specific APIs) #3925
Conversation
…ve the Java-specific API).
Test build #25141 has started for PR 3925 at commit
|
Test build #25141 has finished for PR 3925 at commit
|
Test FAILed. |
Is this something where we need to make sure to include upgrade details in the release notes? |
Yes - is there a way to label these things in JIRA? |
Test build #25142 has started for PR 3925 at commit
|
cc @marmbrus, @yhuai for SQL changes, and @mengxr, @jkbradley for MLlib changes ... |
Test build #25142 has finished for PR 3925 at commit
|
Test FAILed. |
Test build #25145 has started for PR 3925 at commit
|
Test build #25145 has finished for PR 3925 at commit
|
Test FAILed. |
} | ||
} else { | ||
// ISO8601 with GMT insert | ||
val ISO8601GMT: SimpleDateFormat = new SimpleDateFormat( "yyyy-MM-dd'T'HH:mm:ss.SSSz" ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make ISO8601GMT
this as thread local? or leave a TODO for future improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean? I don't think I changed this. Simply copied it from a file that was deleted.
That's a very cool idea to make a unified |
/** | ||
* :: DeveloperApi :: | ||
* | ||
* The data type representing `NULL` values. Please use the singleton [[DataTypes.NullType]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are scala users expected to use DataTypes
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure why not ....
SQL changes look good to me. |
… into sql.types package.
Test build #25169 has started for PR 3925 at commit
|
Test build #25170 has started for PR 3925 at commit
|
Test build #25169 has finished for PR 3925 at commit
|
Test FAILed. |
Test build #25170 has finished for PR 3925 at commit
|
Test FAILed. |
Test build #25173 has started for PR 3925 at commit
|
Test build #25173 has finished for PR 3925 at commit
|
Test FAILed. |
* // nonExisting: StructField = null | ||
* | ||
* // Extract multiple StructFields. Field names are provided in a set. | ||
* // A StructType object will be returned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the "Preserve the original order of fields." comment in the apply() method be moved to the doc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Test build #25184 has started for PR 3925 at commit
|
@rxin MLlib changes look fine to me, and FWIW the other parts did too. My only remaining comments are ordering imports, but I'll leave those out for now. LGTM |
Test build #25184 has finished for PR 3925 at commit
|
Test PASSed. |
Test build #25193 has started for PR 3925 at commit
|
Test build #25193 has finished for PR 3925 at commit
|
Test PASSed. |
@@ -93,7 +92,6 @@ class HiveInspectorSuite extends FunSuite with HiveInspectors { | |||
val row = data.map(_.eval(null)) | |||
val dataTypes = data.map(_.dataType) | |||
|
|||
import scala.collection.JavaConversions._ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the code in Spark SQL uses JavaConversions... I don't think performance is an issue here or anything.
Test build #25260 has started for PR 3925 at commit
|
Test build #25260 has finished for PR 3925 at commit
|
Test PASSed. |
See #3958 |
Having two versions of the data type APIs (one for Java, one for Scala) requires downstream libraries to also have two versions of the APIs if the library wants to support both Java and Scala. I took a look at the Scala version of the data type APIs - it can actually work out pretty well for Java out of the box. As part of the PR, I created a sql.types package and moved all type definitions there. I then removed the Java specific data type API along with a lot of the conversion code. This subsumes #3925 Author: Reynold Xin <rxin@databricks.com> Closes #3958 from rxin/SPARK-5123-datatype-2 and squashes the following commits: 66505cc [Reynold Xin] [SPARK-5123] Expose only one version of the data type APIs (i.e. remove the Java-specific API).
@@ -15,77 +15,74 @@ | |||
* limitations under the License. | |||
*/ | |||
|
|||
package org.apache.spark.sql.api.java; | |||
package org.apache.spark.sql.types; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a java source file, should we move it to sql/catalyst/src/main/java
?
Having two versions of the data type APIs (one for Java, one for Scala) requires downstream libraries to also have two versions of the APIs if the library wants to support both Java and Scala. I took a look at the Scala version of the data type APIs - it can actually work out pretty well for Java out of the box.
As part of the PR, I created a sql.types package and moved all type definitions there. I then removed the Java specific data type API along with a lot of the conversion code.