-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
strange bug that appears for our mock catalogs when upgraded to corrfunc v2 #140
Comments
At first glance, the input does appear to all be in the expected range, so this does look like a Corrfunc bug of some kind. I'll try to look at this some more in the next few days. |
This is weird. Here are the flags associated with C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False If the data are I can solve the issue by round-tripping x = np.array(x.tolist(), dtype=np.float32) The only difference I see with the initial I do not have a detailed enough understanding of |
(Also, the requirements I specified spelled 'COW') |
This is very weird. C_CONTIGUOUS should be all that is needed to treat
the data like a C array. It's possible the flag is set incorrectly.
Should be testable by iterating through x[i] via cython and passing x[0]
as a pointer to the cython function. This has to be some kind of bug in
h5py or (less likely) numpy...? I don't see how else
x=np.array(x.tolist()) would fix the problem.
…On 10/31/17 1:41 AM, Manodeep Sinha wrote:
This is weird. Here are the flags associated with |x/y/z|
C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : False
WRITEABLE : True
ALIGNED : True
UPDATEIFCOPY : False
If the data are |C_CONTIGUOUS|, then any issue out of different
|strides| should not occur. And I can see that the arrays values are
all in |[0.0, 720.0)|.
I can solve the issue by round-tripping
x = np.array(x.tolist(), dtype=np.float32)
The only difference I see with the initial |x.flags| and this new
array->list->array |x.flags| is the *OWNDATA* flag. However, neither
|x = np.ascontiguousarray(x)| or by |x = np.require(x,
requirements=['C', 'O', 'W'])| work.
I do not have a detailed enough understanding of |numpy| :(
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#140 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABswn_I0S1hIrQTT0njFzEJGxUfC945Cks5sxrMYgaJpZM4QL-RQ>.
|
Think I figured it out. The position arrays are big-endian:
but Corrfunc is probably interpreting them as native-endian, which is little-endian in most cases. Converting the arrays to little-endian with HDF5 is supposed to take care of some of these endianness issues, but h5py may simply be preserving the endianness of Numpy array inputs. In that case, another quick fix is to make sure the arrays are little-endian before they go into the HDF5 files. |
Wow, that's amazing. I have no idea how I managed to create big-endian
arrays! Thanks, Lehman. This is very helpful.
…On 10/31/17 12:44 PM, Lehman Garrison wrote:
Think I figured it out. The position arrays are big-endian:
|>>> x.dtype() dtype('>f4') |
but Corrfunc is probably interpreting them as native-endian, which is
little-endian in most cases. Converting the arrays to little-endian
with |x.byteswap()| does the trick, but the correct solution is
probably to teach Corrfunc to detect the endianness. This is easy to
do in the Python wrappers, which is probably sufficient, since the
only way we can ever detect endianness is if the arrays come from
Python anyway (i.e. the C interfaces know nothing about endianness).
HDF5 is supposed to take care of some of these endianness issues, but
h5py may simply be preserving the endianness of Numpy array inputs. In
that case, another quick fix is to make sure the arrays are
little-endian before they go into the HDF5 files.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#140 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABswn2A7bhaFYmNkQ6Sql3Z8BbLy01ruks5sx05rgaJpZM4QL-RQ>.
|
It's not that Corrfunc is expecting little-endian, Corrfunc is expecting everything to be in machine-endian. And the array from the hdf5 file is big-endian while the computer itself is little-endian. If the computer were big-endian, then there would not be any problems. The fix would be to ensure (in the python wrappers) that all arrays in machine-endian order. |
(The biggest reason I don't worry about this particular endian-ness mis-match bug is that Corrfunc will crash. Fixing the endian-ness handling is part of #101 ) @BenWibking Do you mind if we close this particular issue? |
Fixing it and saving future graduate students from tearing their hair
out should take just 10 additional lines of code in the python wrapper:
def convert_to_native_endian(array):
import sys
system_is_little_endian = (sys.byteorder == 'little')
array_is_little_endian = (array.dtype.byteorder == '<')
is_native_endian = (system_is_little_endian and
array_is_little_endian) or (not system_is_little_endian and not
array_is_little_endian) or (array.dtype.byteorder == '=')
if not is_native_endian:
return array.byteswap().newbyteorder()
else:
return array
x,y,z,weights = [convert_to_native_endian(arr) for arr in [x,y,z,weights]]
…On 10/31/17 4:37 PM, Manodeep Sinha wrote:
(The biggest reason I don't worry about this particular endian-ness
mis-match bug is that Corrfunc will crash. Fixing the endian-ness
handling is part of #101
<#101> )
@BenWibking <https://github.com/benwibking> Do you mind if we close
this particular issue?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#140 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABswn2fMJIlyRCN4O3J7IcYuFkRFis03ks5sx4T4gaJpZM4QL-RQ>.
|
@BenWibking You know what I am going to say right? ....PR... :) I think you can make the def convert_to_native_endian(array):
'''
Returns the supplied array in native endian byte-order
Parameters
------------
array: A numpy array
Returns
--------
array: A numpy array in native-endian byte-order
Example
----------
>>> import numpy as np
>>> create a big-endian array
>>> and then covert to little-endian
>>> and make sure that the output dtype is little-endian
>>> This serves as a doctest
>>> create a native-endian array
>>> run through function
>>> ensure output is the same
'''
import sys
system_is_little_endian = (sys.byteorder == 'little')
array_is_little_endian = (array.dtype.byteorder == '<')
if (array_is_little_endian != system_is_little_endian) or (array.dtype.byteorder == '='):
return array.byteswap().newbyteorder()
else:
return array The doctests are left for you :) |
Plus, some tests to make sure that the array is indeed a numpy array or returning back |
I can work on this. |
When I run it on my mocks, the code fails and says to report a bug:
A minimal working example is contained in this tarball: http://www.astronomy.ohio-state.edu/~wibking.1/corrfunc/mwe.tar
The conditions needed to reproduce it are: compute xi(s) from any mock produced by my HOD code with Corrfunc v2.
The text was updated successfully, but these errors were encountered: