Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect simple-array access with (declaim (optimize (speed 3))) #446

Closed
MattKaufmann opened this issue May 27, 2023 · 13 comments
Closed
Assignees

Comments

@MattKaufmann
Copy link

The example below ends with a form that gives an incorrect array value on my MacBook Pro. I extracted it from an example supplied by Warren A. Hunt, Jr. and Yahya Sohail that runs in ACL2, but the example below runs in pure CCL. After seeing the problem with CCL build from git version 01cf6cd, I rebuilt CCL from the latest version, ff51228, and got the same (unfortunate) result.

(declaim (optimize (speed 3)))

(defun arr-fieldi (i ar)
  (aref (the (simple-array (signed-byte 64) (10))
             ar)
        i))

(defun update-arr-fieldi (i v ar)
  (progn (setf (aref ar i) v)
         ar))

(defvar my-ar (make-array 10
                          :element-type '(signed-byte 64)
                          :initial-element '0))

; True:
(typep my-ar 'simple-array)

(update-arr-fieldi 0 #x-7FFFFFFE47AFEF96 my-ar)

; The following returns 537, not the expected value, #x-7FFFFFFE47AFEF96.
(arr-fieldi 0 my-ar)
@yaso9
Copy link

yaso9 commented May 27, 2023

The issue seems to only occur with the 64-bit version of CCL. Using the 32-bit version yields the correct result. I tried using git bisect to find the commit that introduced the bug, but I had trouble compiling many of the older versions of CCL (it kept getting stuck on compiling compiler/optimizers.lisp).

@MattKaufmann
Copy link
Author

I've discovered that CCL 16365, apparently from 2015, doesn't have the bug. I seem to have found versions from 2017 that do have the bug.

@svspire
Copy link
Contributor

svspire commented Jun 9, 2023

Changing first line to
(declaim (optimize (speed 3) (safety 3)))
fixes it for me. That might be a workaround until we can figure out the real solution.

@MattKaufmann
Copy link
Author

@svspire Thank you for the reply! Unfortunately, that seems to slow things down by two orders of magnitude; see attached example. That example uses the default optimization settings that ACL2 uses. I tried some other fixes that didn't work, but one that did: changing the accessor from (the (simple-array (signed-byte 64) (10)) ar) to (the (simple-array * (10)) ar). That one adds 37% to the time in my test, so still not great; but I guess it will have to do until CCL is fixed.
(Sorry about the ".txt" extension on the attached file; I couldn't seem to attach it as a ".lsp" file.)
ccl-array-bug.txt

@MattKaufmann
Copy link
Author

I have attached an example showing that the problem occurs at the fixnum boundary. I wonder if CCL is mistakenly assuming here that a (signed-byte 64) is a fixnum.
ccl-array-bug-2.txt

@MattKaufmann
Copy link
Author

One more clue (and then I may stop): The bug seems to require that the integer type contains a negative number; see attached.
ccl-array-bug-3.txt

MattKaufmann pushed a commit to acl2/acl2 that referenced this issue Jun 10, 2023
…ard simplification.

Quoting :DOC note-8-6:

  (CCL only) [Stobj] array code now has a workaround for a CCL bug
  found by Yahya Sohail, in the case of reading a stobj array of
  integers where the element type includes at least one negative
  number and one non-fixnum.  This fix may slow down such stobj array
  reads a bit in CCL, in such cases; one measurement showed about 37%
  more time for such a read.  Thanks to Yahya for the bug report and
  to Warren Hunt for encouraging a workaround.

Thanks to Alessandro Coglio and Eric Smith for a discussion leading to
improvements in :DOC xargs and :DOC guard-simplification, regarding
simplification of guard proof obligations and legal values for
controlling that in various utilities.

The rest of this message is about the CCL bug workaround.

The CCL bug page is Clozure/ccl#446.

A pair of regression runs on the same machine, one without the fix and
one with the fix, showed no significant slowdown:

;;; without fix
156021.842u 3586.534s 5:53:12.65 753.1% 0+0k 2980480+9345808io 199pf+0w

;;; with fix
156317.165u 3357.513s 5:52:27.84 755.0% 0+0k 2284760+9073896io 4pf+0w

However there could be some slowdown with heavy use of stobj reads in
the case shown above.  (Maybe not; even if they slow down 37%, how
much of a computation is typically from such reads?)

Here is the example from Yahya Sohail.

(defstobj test-stobj
  (arr-field :type (array (signed-byte 64) (10))
             :initially 0))
(update-arr-fieldi 0 #x-7FFFFFFE47AFEF96 test-stobj)
(arr-fieldi 0 test-stobj)

Before the fix, the last form evaluates to the odd result of 537.
After the fix the result is as expected (the value that was just
written at index 0).

This CCL bug was apparently introduced sometime between 2015 and 2017.
A CCL developer has commented on it.  I expect to remove this special
code not long after it's fixed (which I hope is soon).
@xrme xrme self-assigned this Jun 12, 2023
@xrme
Copy link
Member

xrme commented Jun 12, 2023

Suspiciously, the number 537 (#x219) is a two-digit bignum header. That would seem to indicate that something in the process of turning an element of a (signed-byte 64) vector into an integer is getting mixed up.

@svspire
Copy link
Contributor

svspire commented Jun 12, 2023

Aha! If you disassemble arr-fieldi compiled with (safety 0), you'll see
(movl ($ #x219) (% imm0.l))
I knew that was where the 537 came from, but I couldn't figure out what that constant was doing there.

@gmpalter
Copy link
Member

It also fails with positive values as long as they're too big to fit in a fixnum.

The generated code properly creates the 2-digit bignum to hold the result and stores it in the return value. But, the code then falls through to the code which creates a new signed-byte-64 from whatever's in %imm0, which is the 2-digit bignum header. The culprit is x862-box-s64 but I don't know how to fix it yet.

@gmpalter
Copy link
Member

Found it and just checked in the fix.

@MattKaufmann
Copy link
Author

This is great! Is there an easy way to check programmatically for whether CCL has this fix, other than to run an example that formerly gave an incorrect result?

@gmpalter
Copy link
Member

I can't think of one as it was a compiler bug. (There's a test for it now in the test suite at the end of ccl.lsp.)

@MattKaufmann
Copy link
Author

That's very helpful actually. Thank you again (and all those who helped) for the fix!

MattKaufmann pushed a commit to acl2/acl2 that referenced this issue Jun 13, 2023
…hat have the fix for that bug.

If you use a git version of CCL with that bug fix (commit 6b1f1d3a30
or later; it's dated "Mon Jun 12 16:10:42 2023 -0400"), then ACL2 will
avoid working around that bug and you'll get the original, unimpeded
performance for stobj array reads.  For older CCL versions with the
bug, the workaround will continue to take effect.  Quoting a revised
version of the relevant release note that was recently added:

  (CCL only) [Stobj] array code now has a workaround for a {CCL bug |
  Clozure/ccl#446} found by Yahya Sohail,
  in the case of reading a stobj array of integers where the element
  type includes at least one negative number and one non-fixnum.
  That bug has been around since at least as far back as 2017, and
  was fixed on June 12, 2023.  For those using a CCL version with the
  bug, this fix may slow down such stobj array reads a bit in the
  case described above; one measurement showed about 37% more time
  for such a read.  Thanks to Yahya for the bug report, to Warren
  Hunt for encouraging a workaround, and to the CCL developers (in
  particular Gary Palter) for fixing the CCL bug.

Note: I successfully ran the usual fresh "make regression-everything"
twice: once using an older CCL with CCL bug #446, and once using the
latest CCL iwth the fix for that bug.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants