New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-510 ARROW-582 ARROW-663 ARROW-729: [Java] Added units for Time and Date types, and integration tests #475
Conversation
8a85ab9
to
d10713e
Compare
This is how I would do it. |
@julienledem @wesm where are the roundtrip integration tests so I can add these to them and how do I run them? |
@leifwalsh , you first want to look at the java integration test: Then there is https://github.com/apache/arrow/blob/master/integration/integration_test.py#L659 which invokes the Java integration thing. |
It seems to work but I think the test data doesn't have any timestamps, dates, or times in it. Were the json files constructed manually or programmatically? |
The integration tests for date/times are in the patch for ARROW-510. There's a line commented out that generates the test cases |
@wesm awesome. I created a branch off this and merged yours into that, uncommented the test, and got a failure:
I'm not sure if I'm reading this right but it looks like the schema in the json file is specifying bit widths for |
Yeah, that looks like the case: https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/metadata.cc#L389 Let me fix that in #458 right now and add a check to validate that the bit width is what we expect for each unit |
done 06c4eed |
Okay, I implemented these types in
Funny thing is, those are identical according to |
Interesting. Can you pull my commits from #458 into this patch and I'll take a look to see what's up? |
Why do you not check if Is the expectation that |
Sure. |
Pushed |
looking |
Found the issue, the bit width on Date32Type in arrow/type.h was incorrect. fixed at wesm@2e5570c. integration tests pass now... awesome! |
Wonderful. Can you merge your branch with the tests commented out, I'll
rebase onto that and clean up my commits? Would you like me to review your
PR?
|
I closed my PR -- do you mind doing all the code reviews in this patch? Feel free to review the integration test / C++ stuff (the changes were pretty minimal) |
Also, if you can add "closes #366" into your PR description then that will close that PR when this is merged |
Sure
|
Wow, awesome! Looks like you guys are pretty much all set. Let me know if I can help with anything. |
@icexelloss would appreciate review, especially of what I'm doing with micros and nanos to create joda objects here #475 (comment) |
2e5570c
to
0bfe92e
Compare
Reorganized commits a little |
@leifwalsh looks like you pulled in a few unintentional commits, can you rebase on apache/master? |
|
||
<#elseif minor.class == "TimeStampMilli"> | ||
switch (dateUnit) { | ||
case DAY: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this works for both day and millis:
long millisSinceEpoch = timeUnit.toMillis((long) get(index))
DateTime date = new DateTime(millisSinceEpoch, DateTimeZone.getDefault())
long millis; | ||
switch (timeUnit) { | ||
case SECOND: | ||
millis = java.util.concurrent.TimeUnit.SECONDS.toMillis(get(index)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use the timeUnit in the class
27c6853
to
49392f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this looks good to me.
@jacques-n ?
@leifwalsh there is a conflict (hopefully not too horrible to resolve) after merging #409 -- the idea is that this will make it easier to add more types that have additional metadata |
@julienledem, I think this patch further propagates the issues we've with using DateTime from Joda instead of LocalDateTime. I think those should probably be addressed separately but wanted to confirm your thoughts? |
@jacques-n: right. This should probably be fixed in a separate change. I'd suggest the new vectors should just use Integer and Long in getObject(). |
I just opened https://issues.apache.org/jira/browse/ARROW-768 and made it a blocker for 0.3 |
@julienledem do you want me to change |
Resolved merge conflicts by rebasing on master, passes java unit and c++/java integration tests. |
…r consistency with FixedSizeList As discussed on JIRA Author: Wes McKinney <wes.mckinney@twosigma.com> Closes apache#473 from wesm/ARROW-733 and squashes the following commits: 0e30af3 [Wes McKinney] Rename FixedWidthBinary to FixedSizeBinary for consistency with FixedSizeList type
Change-Id: I79f5d942f64c275c87568703f9edf6a7e89467ac
49392f2
to
47f83a8
Compare
@leifwalsh: Yes, please make getObject() return Integer and Long (as appropriate) for new vectors. |
@julienledem should I also do that to the existing |
Also, |
@leifwalsh I think for now it's better to do it only for new vectors. Changing the behavior of existing vectors requires more work on the code using them. Which is why I'd rather have those as separate changes. |
Okay then I think this is good to go. |
Cool. +1 -- thanks @leifwalsh for your contribution! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'd better not change the behavior of dateVector.getAccessor().getObject() just yet.
Otherwise, this looks good to me.
{ class: "TimeStampMilli", javaType: "long", boxedType: "Long", friendlyType: "DateTime" } | ||
{ class: "TimeStampMicro", javaType: "long", boxedType: "Long", friendlyType: "DateTime" } | ||
{ class: "TimeStampNano", javaType: "long", boxedType: "Long", friendlyType: "DateTime" } | ||
{ class: "DateMilli" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is changing an existing vector, right?
@@ -482,12 +483,15 @@ public long getTwoAsLong(int index) { | |||
|
|||
</#if> | |||
|
|||
<#if minor.class == "Date"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we're changing the behavior of Date.
DateVector is changing anyway, to a new name at least. It either becomes
inconsistent with the other date/time types or changes in this way. Given
that code which deals with it already has to change, I opted for
consistency. Do you prefer the other direction?
On Thu, Apr 6, 2017 at 04:37 Julien Le Dem ***@***.***> wrote:
***@***.**** commented on this pull request.
I think we'd better not change the behavior of
dateVector.getAccessor().getObject() just yet.
Otherwise, this looks good to me.
------------------------------
In java/vector/src/main/codegen/data/ValueVectorTypes.tdd
<#475 (comment)>:
> @@ -70,11 +72,13 @@
{ class: "BigInt"},
{ class: "UInt8" },
{ class: "Float8", javaType: "double" , boxedType: "Double", fields: [{name: "value", type: "double"}], },
- { class: "Date", javaType: "long", friendlyType: "DateTime" },
- { class: "TimeStampSec", javaType: "long", boxedType: "Long", friendlyType: "DateTime" }
- { class: "TimeStampMilli", javaType: "long", boxedType: "Long", friendlyType: "DateTime" }
- { class: "TimeStampMicro", javaType: "long", boxedType: "Long", friendlyType: "DateTime" }
- { class: "TimeStampNano", javaType: "long", boxedType: "Long", friendlyType: "DateTime" }
+ { class: "DateMilli" },
this is changing an existing vector, right?
------------------------------
In java/vector/src/main/codegen/templates/FixedValueVectors.java
<#475 (comment)>:
> @@ -482,12 +483,15 @@ public long getTwoAsLong(int index) {
</#if>
- <#if minor.class == "Date">
Here we're changing the behavior of Date.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#475 (review)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAC1Rs-MwqIn0tIaBnXBS-5w7VvtRs84ks5rtKQxgaJpZM4MwRrM>
.
--
--
Cheers,
Leif
|
@julienledem sorry for merging early, should have waited for your final review. Let's revert what needs to be reverted for 0.3? |
I opened https://issues.apache.org/jira/browse/ARROW-777, will leave it to you all to resolve ahead of 0.3 |
closes #366