Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-420: Align DATE type with Java implementation #238

Closed
wants to merge 4 commits into from

Conversation

xhochy
Copy link
Member

@xhochy xhochy commented Dec 13, 2016

No description provided.

Change-Id: I2dc9e848f5bed9d717477792459dde6ea839b341
@xhochy
Copy link
Member Author

xhochy commented Dec 13, 2016

Things to do:

  • Add (pylist) conversion from datetime.date to DateArray
  • Add (numpy) conversion from datetime.date to DateArray

@wesm
Copy link
Member

wesm commented Dec 13, 2016

Why would we need more than int32_t for date?

@xhochy
Copy link
Member Author

xhochy commented Dec 13, 2016

See the ticket https://issues.apache.org/jira/browse/ARROW-413

The way DATE is implemented in Java is: "Here's a millisecond timestamp but please only look at the days".

Change-Id: I6627e7be236b669600301ff6f073d79cef0f3ed0
Change-Id: Id38076d044006da7bfcf939832f82b345c7ca03b
Change-Id: Ice6cc2a8bf082edf20ab7950fae4bd10114f002e
@@ -369,6 +392,10 @@ inline Status ArrowSerializer<NPY_OBJECT>::Convert(std::shared_ptr<Array>* out)

// TODO: mask not supported here
const PyObject** objects = reinterpret_cast<const PyObject**>(PyArray_DATA(arr_));
{
PyAcquireGIL lock;
PyDateTime_IMPORT;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two things that bother me:

  1. I'd like to make this import only once for libpyarrow but I'm unsure how as it requires the GIL.
  2. We don't seem to hold the GIL here but the functions called below suggest that we should.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we should acquire the GIL below before using any of the Py* methods. I'll create a JIRA about going through this module completely and making sure we're releasing the GIL on the Cython side and re-acquiring it in the appropriate places here

@xhochy
Copy link
Member Author

xhochy commented Dec 16, 2016

Ready for review. As parquet-cpp builts without changes, I'd first like to get this here merged and would then add in another iteration support for DATE in parquet-cpp.

Copy link
Member

@wesm wesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, creating follow up JIRA about GIL questions

@@ -369,6 +392,10 @@ inline Status ArrowSerializer<NPY_OBJECT>::Convert(std::shared_ptr<Array>* out)

// TODO: mask not supported here
const PyObject** objects = reinterpret_cast<const PyObject**>(PyArray_DATA(arr_));
{
PyAcquireGIL lock;
PyDateTime_IMPORT;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we should acquire the GIL below before using any of the Py* methods. I'll create a JIRA about going through this module completely and making sure we're releasing the GIL on the Cython side and re-acquiring it in the appropriate places here

@asfgit asfgit closed this in d7845fc Dec 19, 2016
wesm pushed a commit to wesm/arrow that referenced this pull request Sep 2, 2018
Author: Korn, Uwe <Uwe.Korn@blue-yonder.com>

Closes apache#238 from xhochy/PARQUET-867 and squashes the following commits:

e9f79d3 [Korn, Uwe] Remove alt-space from code
5fe849d [Korn, Uwe] Address review comments
fa7a1e0 [Korn, Uwe] Use Slice instead of offset,length
d68e9db [Korn, Uwe] Update Arrow hash
61e3ac0 [Korn, Uwe] Use references instead of pointers
d3c4ec6 [Korn, Uwe] Remove offset,length from public interface
ec0577f [Korn, Uwe] PARQUET-867: Support writing sliced Arrow arrays
wesm pushed a commit to wesm/arrow that referenced this pull request Sep 4, 2018
Author: Korn, Uwe <Uwe.Korn@blue-yonder.com>

Closes apache#238 from xhochy/PARQUET-867 and squashes the following commits:

e9f79d3 [Korn, Uwe] Remove alt-space from code
5fe849d [Korn, Uwe] Address review comments
fa7a1e0 [Korn, Uwe] Use Slice instead of offset,length
d68e9db [Korn, Uwe] Update Arrow hash
61e3ac0 [Korn, Uwe] Use references instead of pointers
d3c4ec6 [Korn, Uwe] Remove offset,length from public interface
ec0577f [Korn, Uwe] PARQUET-867: Support writing sliced Arrow arrays

Change-Id: Ie00a05d627200e6f4919b2c2aaa3dae176a0b7c2
wesm pushed a commit to wesm/arrow that referenced this pull request Sep 6, 2018
Author: Korn, Uwe <Uwe.Korn@blue-yonder.com>

Closes apache#238 from xhochy/PARQUET-867 and squashes the following commits:

e9f79d3 [Korn, Uwe] Remove alt-space from code
5fe849d [Korn, Uwe] Address review comments
fa7a1e0 [Korn, Uwe] Use Slice instead of offset,length
d68e9db [Korn, Uwe] Update Arrow hash
61e3ac0 [Korn, Uwe] Use references instead of pointers
d3c4ec6 [Korn, Uwe] Remove offset,length from public interface
ec0577f [Korn, Uwe] PARQUET-867: Support writing sliced Arrow arrays

Change-Id: Ie00a05d627200e6f4919b2c2aaa3dae176a0b7c2
wesm pushed a commit to wesm/arrow that referenced this pull request Sep 7, 2018
Author: Korn, Uwe <Uwe.Korn@blue-yonder.com>

Closes apache#238 from xhochy/PARQUET-867 and squashes the following commits:

e9f79d3 [Korn, Uwe] Remove alt-space from code
5fe849d [Korn, Uwe] Address review comments
fa7a1e0 [Korn, Uwe] Use Slice instead of offset,length
d68e9db [Korn, Uwe] Update Arrow hash
61e3ac0 [Korn, Uwe] Use references instead of pointers
d3c4ec6 [Korn, Uwe] Remove offset,length from public interface
ec0577f [Korn, Uwe] PARQUET-867: Support writing sliced Arrow arrays

Change-Id: Ie00a05d627200e6f4919b2c2aaa3dae176a0b7c2
wesm pushed a commit to wesm/arrow that referenced this pull request Sep 8, 2018
Author: Korn, Uwe <Uwe.Korn@blue-yonder.com>

Closes apache#238 from xhochy/PARQUET-867 and squashes the following commits:

e9f79d3 [Korn, Uwe] Remove alt-space from code
5fe849d [Korn, Uwe] Address review comments
fa7a1e0 [Korn, Uwe] Use Slice instead of offset,length
d68e9db [Korn, Uwe] Update Arrow hash
61e3ac0 [Korn, Uwe] Use references instead of pointers
d3c4ec6 [Korn, Uwe] Remove offset,length from public interface
ec0577f [Korn, Uwe] PARQUET-867: Support writing sliced Arrow arrays

Change-Id: Ie00a05d627200e6f4919b2c2aaa3dae176a0b7c2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants