Support byteswap #5689

Open
Cadair opened this issue Dec 8, 2019 · 3 comments
@Cadair

Cadair commented Dec 8, 2019

I am trying to add Dask support to astropy.io.fits. Because it's such a lovely file format, FITS is big-endian on disk, so I need to byteswap the array before I can .store() the data to the file.

What I have currently done, which seems to work, is:

        @delayed
        def _byteswap(arr):
            return np.array(arr).byteswap(True).newbyteorder("S")

        return dask.array.from_delayed(_byteswap(array), shape=array.shape,
                                       dtype=array.dtype.newbyteorder("S"))

It appears to work, but it is somewhat weird.

I was wondering what would be involved in adding .byteswap to Array. This seems to suggest it wouldn't be that hard, but I wouldn't know where to start.
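For reference, here is a minimal NumPy-only sketch of what the swap above is meant to achieve: the bytes are reversed and the dtype's byte order is flipped, so the values stay the same while the memory layout changes. The helper name `to_swapped` is hypothetical. It uses `arr.view(arr.dtype.newbyteorder("S"))` rather than `ndarray.newbyteorder`, because the array method was removed in NumPy 2.0 while the dtype method remains available.

```python
import numpy as np


def to_swapped(arr):
    """Return a copy of ``arr`` with swapped bytes and a flipped dtype byte order.

    The values compare equal to the input; only the in-memory layout changes.
    ``to_swapped`` is a hypothetical helper name used for illustration.
    """
    arr = np.asarray(arr)
    swapped = arr.byteswap()  # returns a copy; the input is left untouched
    # Reinterpret the swapped bytes with the opposite byte order.
    return swapped.view(arr.dtype.newbyteorder("S"))


native = np.arange(4, dtype="<i4")  # little-endian input
big = to_swapped(native)            # big-endian layout, same values
```

A plain `native.astype(">i4")` should give the same big-endian result when the target byte order is known in advance.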

@mrocklin
Member

mrocklin commented Dec 8, 2019

@Cadair it looks like this is an embarrassingly parallel operation, in which case your current workaround might be faster as ...

out = x.map_blocks(np.ndarray.byteswap, True).map_blocks(np.ndarray.newbyteorder, "S")

Or, if you wanted to keep things NumPy-agnostic, maybe swap out np.ndarray for M as follows:

from dask.utils import M

out = x.map_blocks(M.byteswap, True).map_blocks(M.newbyteorder, "S")

See https://docs.dask.org/en/latest/best-practices.html#learn-techniques-for-customization

If it is this simple, then this would probably make a fine contribution to the dask array module. You might search for similarly simple methods, like clip, to see examples of what needs to be touched.
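A variation on the same idea, offered as a sketch and not as how dask implements or would implement `byteswap`: wrap the swap in a single block-wise function and pass the output dtype to `map_blocks` explicitly. The helper name `_swap_block` is hypothetical. The sketch avoids the in-place `byteswap(True)` and the `ndarray.newbyteorder` method (removed in NumPy 2.0) by copying and viewing with the flipped dtype.

```python
import dask.array as da
import numpy as np


def _swap_block(block):
    # Hypothetical helper: swap the bytes of one chunk (as a copy) and
    # reinterpret them with the opposite byte order, so values are unchanged.
    return block.byteswap().view(block.dtype.newbyteorder("S"))


x = da.from_array(np.arange(8, dtype="<i4"), chunks=4)

# Tell dask the output dtype up front so the array metadata matches the blocks.
out = x.map_blocks(_swap_block, dtype=x.dtype.newbyteorder("S"))

result = out.compute()
```

Here `result` should hold the same values as `x` with a big-endian dtype, which is the layout FITS expects on disk.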

@Cadair
Author

Cadair commented Dec 9, 2019

Thanks @mrocklin

I swapped out my implementation with what you suggested, and it didn't work right away. I haven't looked into it any more though, will try and debug later.

@TomAugspurger
Member

How are things going here @Cadair? Anything we can help with?
