0.8.7.5:
	Implement modular (unsigned-byte 32) multiplication on x86
csrhodes committed Jan 2, 2004
1 parent f73c1f3 commit 3a13d77
Showing 3 changed files with 32 additions and 15 deletions.
2 changes: 2 additions & 0 deletions NEWS
@@ -2237,6 +2237,8 @@ changes in sbcl-0.8.8 relative to sbcl-0.8.7:
   * bug fix: DECODE-UNIVERSAL-TIME now accepts timezone arguments with
     second-resolution: integer multiples of 1/3600 between -24 and 24.
     (thanks to Vincent Arkesteijn)
+  * optimization: implemented multiplication as a modular
+    (UNSIGNED-BYTE 32) operation on the x86 backend.
 
 planned incompatible changes in 0.8.x:
   * (not done yet, but planned:) When the profiling interface settles
43 changes: 29 additions & 14 deletions src/compiler/x86/arith.lisp
@@ -1154,6 +1154,11 @@
 (define-vop (fast---mod32-c/unsigned=>unsigned fast---c/unsigned=>unsigned)
   (:translate --mod32))
 
+(define-modular-fun *-mod32 (x y) * 32)
+(define-vop (fast-*-mod32/unsigned=>unsigned fast-*/unsigned=>unsigned)
+  (:translate *-mod32))
+;;; (no -C variant as x86 MUL instruction doesn't take an immediate)
+
 (define-vop (fast-ash-left-mod32-c/unsigned=>unsigned
              fast-ash-c/unsigned=>unsigned)
   (:translate ash-left-mod32))
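
As a rough illustration of what a modular (UNSIGNED-BYTE 32) multiplication computes, the sketch below is portable Common Lisp rather than SBCL backend code. The name `mul-mod32` is hypothetical and not part of this commit; it only models the arithmetic result one would expect from `*-mod32`, namely the low 32 bits of the full product.

```lisp
;;; Hypothetical helper, for illustration only; not part of SBCL.
(defun mul-mod32 (x y)
  "Return the low 32 bits of the product of X and Y."
  (declare (type (unsigned-byte 32) x y))
  ;; The full product may need up to 64 bits; LDB keeps bits 0..31.
  (ldb (byte 32 0) (* x y)))
```

Since the high half of the product is discarded, a single 32-bit machine multiply is enough to produce this value, which is presumably why the new VOP can simply inherit from the existing unsigned multiply VOP.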
@@ -1656,25 +1661,35 @@
             (t (incf count)))))
     (decompose-multiplication arg x n-bits condensed)))
 
+(defun *-transformer (y)
+  (cond
+    ((= y (ash 1 (integer-length y)))
+     ;; there's a generic transform for y = 2^k
+     (give-up-ir1-transform))
+    ((member y '(3 5 9))
+     ;; we can do these multiplications directly using LEA
+     `(%lea x x ,(1- y) 0))
+    ((member :pentium4 *backend-subfeatures*)
+     ;; the pentium4's multiply unit is reportedly very good
+     (give-up-ir1-transform))
+    ;; FIXME: should make this more fine-grained. If nothing else,
+    ;; there should probably be a cutoff of about 9 instructions on
+    ;; pentium-class machines.
+    (t (optimize-multiply 'x y))))
+
 (deftransform * ((x y)
                  ((unsigned-byte 32) (constant-arg (unsigned-byte 32)))
                  (unsigned-byte 32))
   "recode as leas, shifts and adds"
   (let ((y (lvar-value y)))
-    (cond
-      ((= y (ash 1 (integer-length y)))
-       ;; there's a generic transform for y = 2^k
-       (give-up-ir1-transform))
-      ((member y '(3 5 9))
-       ;; we can do these multiplications directly using LEA
-       `(%lea x x ,(1- y) 0))
-      ((member :pentium4 *backend-subfeatures*)
-       ;; the pentium4's multiply unit is reportedly very good
-       (give-up-ir1-transform))
-      ;; FIXME: should make this more fine-grained. If nothing else,
-      ;; there should probably be a cutoff of about 9 instructions on
-      ;; pentium-class machines.
-      (t (optimize-multiply 'x y)))))
+    (*-transformer y)))
+
+(deftransform sb!vm::*-mod32
+    ((x y) ((unsigned-byte 32) (constant-arg (unsigned-byte 32)))
+     (unsigned-byte 32))
+  "recode as leas, shifts and adds"
+  (let ((y (lvar-value y)))
+    (*-transformer y)))
 
 ;;; FIXME: we should also be able to write an optimizer or two to
 ;;; convert (+ (* x 2) 17), (- (* x 9) 5) to a %LEA.
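
The LEA branch of `*-transformer` relies on the identity x*y = x + x*(y-1), where y-1 is 2, 4 or 8 for y in (3 5 9), which are scale factors an x86 address computation accepts. A minimal sketch of that identity in portable Common Lisp follows. It assumes `%lea` takes a base, index, scale and displacement and computes base + index*scale + disp; `lea-model` and `mul-by-lea` are hypothetical names used only for illustration.

```lisp
;;; Hypothetical model of the address arithmetic: base + index*scale + disp.
(defun lea-model (base index scale disp)
  (+ base (* index scale) disp))

;;; Hypothetical helper: multiply X by 3, 5 or 9 as x + x*(y-1),
;;; truncated to 32 bits as the modular variant would be.
(defun mul-by-lea (x y)
  (declare (type (unsigned-byte 32) x)
           (type (member 3 5 9) y))
  (ldb (byte 32 0) (lea-model x x (1- y) 0)))
```

For instance, `(mul-by-lea 7 9)` goes through `(lea-model 7 7 8 0)`, that is 7 + 7*8.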
2 changes: 1 addition & 1 deletion version.lisp-expr
@@ -17,4 +17,4 @@
 ;;; checkins which aren't released. (And occasionally for internal
 ;;; versions, especially for internal versions off the main CVS
 ;;; branch, it gets hairier, e.g. "0.pre7.14.flaky4.13".)
-"0.8.7.4"
+"0.8.7.5"
