Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strange bug that appears for our mock catalogs when upgraded to corrfunc v2 #140

Closed
BenWibking opened this issue Oct 30, 2017 · 13 comments
Closed
Assignees

Comments

@BenWibking
Copy link

BenWibking commented Oct 30, 2017

When I run it on my mocks, the code fails and says to report a bug:

mwe_corrfunc_screenshot

A minimal working example is contained in this tarball: http://www.astronomy.ohio-state.edu/~wibking.1/corrfunc/mwe.tar

The conditions needed to reproduce it are: compute xi(s) from any mock produced by my HOD code with Corrfunc v2.

@BenWibking
Copy link
Author

@ANSalcedo

@lgarrison
Copy link
Collaborator

At first glance, the input does appear to all be in the expected range, so this does look like a Corrfunc bug of some kind. I'll try to look at this some more in the next few days.

@manodeep
Copy link
Owner

This is weird. Here are the flags associated with x/y/z

    C_CONTIGUOUS : True
    F_CONTIGUOUS : True
    OWNDATA : False
    WRITEABLE : True
    ALIGNED : True
    UPDATEIFCOPY : False

If the data are C_CONTIGUOUS, then any issue out of different strides should not occur. And I can see that the arrays values are all in [0.0, 720.0).

I can solve the issue by round-tripping

   x = np.array(x.tolist(), dtype=np.float32)

The only difference I see with the initial x.flags and this new array->list->array x.flags is the OWNDATA flag. However, neither x = np.ascontiguousarray(x) or by x = np.require(x, requirements=['C', 'O', 'W']) work.

I do not have a detailed enough understanding of numpy :(

@manodeep
Copy link
Owner

(Also, the requirements I specified spelled 'COW')

@BenWibking
Copy link
Author

BenWibking commented Oct 31, 2017 via email

@lgarrison
Copy link
Collaborator

lgarrison commented Oct 31, 2017

Think I figured it out. The position arrays are big-endian:

>>> x.dtype
dtype('>f4')

but Corrfunc is probably interpreting them as native-endian, which is little-endian in most cases. Converting the arrays to little-endian with x.byteswap() does the trick, but the correct solution is probably to teach Corrfunc to detect the endianness. This is easy to do in the Python wrappers, which is probably sufficient, since the only way we can ever detect endianness is if the arrays come from Python anyway (i.e. the C interfaces know nothing about endianness).

HDF5 is supposed to take care of some of these endianness issues, but h5py may simply be preserving the endianness of Numpy array inputs. In that case, another quick fix is to make sure the arrays are little-endian before they go into the HDF5 files.

@BenWibking
Copy link
Author

BenWibking commented Oct 31, 2017 via email

@manodeep
Copy link
Owner

It's not that Corrfunc is expecting little-endian, Corrfunc is expecting everything to be in machine-endian. And the array from the hdf5 file is big-endian while the computer itself is little-endian. If the computer were big-endian, then there would not be any problems.

The fix would be to ensure (in the python wrappers) that all arrays in machine-endian order.

@manodeep
Copy link
Owner

(The biggest reason I don't worry about this particular endian-ness mis-match bug is that Corrfunc will crash. Fixing the endian-ness handling is part of #101 )

@BenWibking Do you mind if we close this particular issue?

@BenWibking
Copy link
Author

BenWibking commented Oct 31, 2017 via email

@manodeep
Copy link
Owner

manodeep commented Oct 31, 2017

@BenWibking You know what I am going to say right? ....PR... :)

I think you can make the if condition slightly concise:

def convert_to_native_endian(array):
    '''
    Returns the supplied array in native endian byte-order

    Parameters
    ------------

    array: A numpy array

    Returns
    --------

    array: A numpy array in native-endian byte-order

    Example
    ----------

    >>> import numpy as np
    >>> create a big-endian array
    >>> and then covert to little-endian
    >>> and make sure that the output dtype is little-endian
    >>> This serves as a doctest
    >>> create a native-endian array
    >>> run through function
    >>> ensure output is the same
    '''

    import sys
    system_is_little_endian = (sys.byteorder == 'little')   
    array_is_little_endian = (array.dtype.byteorder == '<')
    if (array_is_little_endian != system_is_little_endian) or (array.dtype.byteorder == '='):
        return array.byteswap().newbyteorder()
    else:
        return array

The doctests are left for you :)

@manodeep
Copy link
Owner

Plus, some tests to make sure that the array is indeed a numpy array or returning back None if the input is None

@lgarrison
Copy link
Collaborator

I can work on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants