Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CI for s390x and fix big-endian arch #98

Merged
merged 3 commits into from
Dec 17, 2022
Merged

Add CI for s390x and fix big-endian arch #98

merged 3 commits into from
Dec 17, 2022

Conversation

wismill
Copy link
Collaborator

@wismill wismill commented Dec 1, 2022

Fix for 32bits words on big-endian arch.

Fixes #97

Copy link
Member

@harendra-kumar harendra-kumar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to fix the following files as well for the same issue:

  • unicode-data-scripts/test/Unicode/Char/General/ScriptsSpec.hs
  • unicode-data-scripts/lib/Unicode/Char/General/Scripts.hs

We may have problems in future if we forget to include MachDeps.h , this is a nasty thing that is not caught by any tool and can be easily missed. We can do two things to mitigate this:

  • define a CPP macro for reading Word32 with endianness handling and then use that macro everywhere. That way we need to include MachDeps only in the file defining that macro. Both byteswap and narrowing can be included in that macro so we do not miss it anywhere.
  • Add a CI for s390x. We can copy it from unicode-transforms repo.

@@ -51,9 +51,9 @@ name (C# c#) = getName 0# 149185#
indexInt32OffAddr'# k# =
#ifdef WORDS_BIGENDIAN
#if MIN_VERSION_base(4,16,0)
word2Int# (byteSwap32# (word32ToWord# (indexWord32OffAddr# offsets# k#)))
word2Int# (narrow32Word# (byteSwap32# (word32ToWord# (indexWord32OffAddr# offsets# k#))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume byteSwap32# changes the byte ordering. The doc (https://hackage.haskell.org/package/base-4.17.0.0/docs/GHC-Exts.html#v:byteSwap32-35-) says:

Swap bytes in the lower 32 bits of a word. The higher bytes are undefined. 

I do not understand what "swap bytes" means for 4 bytes, the term "swap" is applicable only to 2 bytes.

Is there a documentation somewhere about what narrow32Word# does? Does it zero the higher 32 bits? The doc of byteSwap32# above says the "higher bytes are undefined" so perhaps that is what the bug was?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not find much information. I found some hints in cryptonite package.

My understanding is that byte swapping adapt the endianness and is not limited to 2 bytes. Here byteSwap32# is for 4 bytes, and byteSwap16# would be for 2, etc.

@wismill wismill force-pushed the wip/fix_be branch 2 times, most recently from 56e57e1 to 1a75a0b Compare December 4, 2022 14:01
@wismill
Copy link
Collaborator Author

wismill commented Dec 4, 2022

We need to fix the following files as well for the same issue:

* unicode-data-scripts/test/Unicode/Char/General/ScriptsSpec.hs

* unicode-data-scripts/lib/Unicode/Char/General/Scripts.hs

These are already fixed.

We may have problems in future if we forget to include MachDeps.h , this is a nasty thing that is not caught by any tool and can be easily missed. We can do two things to mitigate this:

* define a CPP macro for reading Word32 with endianness handling and then use that macro everywhere. That way we need to include MachDeps only in the file defining that macro. Both byteswap and narrowing can be included in that macro so we do not miss it anywhere.

* Add a CI for s390x. We can copy it from unicode-transforms repo.

I think the latter is better. I am giving a try.

@wismill
Copy link
Collaborator Author

wismill commented Dec 4, 2022

@harendra-kumar there is an issue with the CI for GHC 8.0.2 and 8.2.2. It seems they are not supported anymore.

@wismill wismill changed the title Draft: Fix for big-endian arch Draft: Add CI for s390x and fix big-endian arch Dec 4, 2022
@harendra-kumar
Copy link
Member

We need to fix the following files as well for the same issue:

* unicode-data-scripts/test/Unicode/Char/General/ScriptsSpec.hs

* unicode-data-scripts/lib/Unicode/Char/General/Scripts.hs

These are already fixed.

It see this in unicode-data-scripts/lib/Unicode/Char/General/Scripts.hs:

#ifdef WORDS_BIGENDIAN
#if MIN_VERSION_base(4,16,0)
            byteSwap32# (word32ToWord# (indexWord32OffAddr# addr# k#));
#else
            byteSwap32# (indexWord32OffAddr# addr# k#);
#endif

Don't we need a call to narrowWord here?

@harendra-kumar
Copy link
Member

@harendra-kumar there is an issue with the CI for GHC 8.0.2 and 8.2.2. It seems they are not supported anymore.

The installation seems to be failing, not sure what's wrong. We can investigate if someone asks for it, otherwise let's ignore it for now.

- Add missing include "MachDeps.h"
- Use narrow32Word# to ensure correct result after byteSwap32# on Big-Endian arch
Adapted from unicode-transforms
Try to spare memory for s390x CI.
@wismill
Copy link
Collaborator Author

wismill commented Dec 4, 2022

@harendra-kumar After the experiment in #99, I think this is ready to merge if the relevant tests pass (let’s discard GHC ≤ 8.2). I reorganized and squashed some commits.

Keep in mind that we can fully test only against GHC 9.6. And GHC support for Unicode < 12.1 is not reliable, because there are some discrepancies (I checked manually some errors).

@wismill wismill changed the title Draft: Add CI for s390x and fix big-endian arch Add CI for s390x and fix big-endian arch Dec 4, 2022
@harendra-kumar
Copy link
Member

It seems you missed this #98 (comment) . Can you please respond to this.

@wismill
Copy link
Collaborator Author

wismill commented Dec 5, 2022

It seems you missed this #98 (comment) . Can you please respond to this.

I think I did fix it. Are you using the latest version?

@wismill
Copy link
Collaborator Author

wismill commented Dec 16, 2022

@harendra-kumar kind reminder

@harendra-kumar
Copy link
Member

I think I did fix it. Are you using the latest version?

Looks like I missed it, maybe I did not pull it again or ignored the pull error.

@wismill wismill merged commit 3ba60c1 into master Dec 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Testsuite fails on s390x with GHC 9.0.2
2 participants