Skip to content

Commit

Permalink
Lists: proper error messages for linear update procedures
Browse files Browse the repository at this point in the history
  • Loading branch information
gambiteer committed Jan 3, 2024
1 parent 8f2e71c commit 8e27b63
Showing 1 changed file with 22 additions and 14 deletions.
36 changes: 22 additions & 14 deletions lib/gambit/list/list.scm
Original file line number Diff line number Diff line change
Expand Up @@ -680,35 +680,43 @@

(define-procedure (filter! (pred procedure)
(list proper-list))
(filter pred list)) ;; no optimization due to continuations
(macro-filter (filter! pred list) list x (macro-auto-force (pred x)))) ;; no optimization due to continuations

(define-prim&proc (remove (pred procedure)
(list proper-list))
(macro-filter (remove pred list) list x (not (macro-auto-force (pred x)))))

(define-procedure (remove! (pred procedure)
(list proper-list))
(remove pred list)) ;; no optimization due to continuations
(macro-filter (remove! pred list) list x (not (macro-auto-force (pred x))))) ;; no optimization due to continuations

(define-prim&proc (remq (x object)
(list proper-list))
(macro-filter (remq x list) list elem (not (eq? x elem))))

(begin

(define-macro (macro-partition)
'(macro-force-vars (list)
(let loop ((probe list) (rev-in '()) (rev-out '()))
(if (pair? probe)
(let ((x (car probe)))
(if (macro-auto-force (pred x))
(loop (macro-auto-force (cdr probe)) (cons x rev-in) rev-out)
(loop (macro-auto-force (cdr probe)) rev-in (cons x rev-out))))
(macro-check-proper-list-null* probe list '(2 . list) ((%procedure%) pred list)
(values (reverse rev-in) ;; reverse! unusable due to continuations
(reverse rev-out)))))))


(define-prim&proc (partition (pred procedure)
(list proper-list))
(macro-force-vars (list)
(let loop ((probe list) (rev-in '()) (rev-out '()))
(if (pair? probe)
(let ((x (car probe)))
(if (macro-auto-force (pred x))
(loop (macro-auto-force (cdr probe)) (cons x rev-in) rev-out)
(loop (macro-auto-force (cdr probe)) rev-in (cons x rev-out))))
(macro-check-proper-list-null* probe list '(2 . list) ((%procedure%) pred list)
(values (reverse rev-in) ;; reverse! unusable due to continuations
(reverse rev-out)))))))
(macro-partition))

(define-prim&proc (partition! (pred procedure) (list proper-list))
(partition pred list)) ;; no optimization due to continuations
(macro-partition)) ;; no optimization due to continuations

)

(define-prim&proc (delete (x object)
(list proper-list)
Expand All @@ -720,7 +728,7 @@
(list proper-list)
(= procedure (primitive equal?)))
;; TODO: don't put third parameter in error messages when it is absent
(delete x list =)) ;; no optimization due to continuations
(macro-filter (delete! x list =) list elem (not (macro-auto-force (= x elem))))) ;; no optimization due to continuations

;;;----------------------------------------------------------------------------

Expand Down

7 comments on commit 8e27b63

@gambiteer
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a good way to do it, there should be one "private" procedure for each pair, e.g., delete|delete!, and the public procedures should call that one procedure and pass the name of the public procedure for the error message.
I'll think about it some more.

@feeley
Copy link
Member

@feeley feeley commented on 8e27b63 Jan 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is not the best way to do it because it creates a lot of code bloat for procedures that are superfluous (because delete and delete! do exactly the same thing, so a programmer knowing this will likely just use the shorter name). Creating this much bloat just to get the right name in the error messages is not a good tradeoff.

What you suggest, i.e. passing the procedure as a prim parameter to a common auxiliary procedure, is probably the best solution. It has been used in other places in the Gambit runtime library (see for example open-tcp-client). It will reduce bloat but unfortunately it will probably impact the execution speed because the prim parameter may cause stack spilling. Still, probably the right tradeoff here.

@gambiteer
Copy link
Collaborator Author

@gambiteer gambiteer commented on 8e27b63 Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marc: I now think that capturing and replaying the continuation of one of these comparison procedures is a violation of the "linear update" assumption for the ! procedures, and that the Gambit implementation of these procedures should correspond more with the SRFI 1 sample implementation and not take care that continuations might screw things up. I sent a too-long comment to the SRFI 1 list about this yesterday. Brad
Edit: There's this patch to chibi, which I find interesting and might be relevant: ashinn/chibi-scheme@967b888 that

@gambiteer
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marc: I've been thinking about an implementation of filter! along the lines of SRFI 1's sample implementation with proper error checking.
It appears that one wants to check that the list argument is indeed a proper list before set-cdr!ing any of the pairs in list.
That implies at least two passes through the list argument, one to check that it's a proper list, the next to actually call pred on the car of each pair to see whether that pair should be part of the result.
That may be faster than just calling filter, which allocates new pairs, but it's not clear it's worth the trouble.

@gambiteer
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried checking that the argument is a proper list, and then using Olin's filter! implementation:

(declare (standard-bindings)
  (extended-bindings)
  (block)
  (not safe))

(define (filter-new! pred lis)
  (if (procedure? pred)
      (let loop1 ((probe lis))
        (if (pair? probe)      ;; check that lis is a proper list
            (loop1 (cdr probe))
            (if (null? probe)  ;; yes
                (let loop2 ((ans lis))
                  (if (null? ans)
                      ans
                      (if (pred (car ans))
                          (letrec ((scan-in (lambda (prev lis)
				              (if (pair? lis)
					          (if (pred (car lis))
					              (scan-in lis (cdr lis))
					              (scan-out prev (cdr lis))))))
			           (scan-out (lambda (prev lis)
				               (let lp ((lis lis))
				                 (if (pair? lis)
					             (if (pred (car lis))
					                 (begin (set-cdr! prev lis)
						                (scan-in lis (cdr lis)))
					                 (lp (cdr lis)))
					             (set-cdr! prev lis))))))
		            (scan-in ans (cdr ans))
		            ans)
                          (loop2 (cdr ans)))))
                (error "not a proper list" pred lis))))
      (error "not a procedure" pred lis)))

and got, with a very big heap,

> (define long (iota 1000000))            
> (define a (time (filter-new! ##even? long)))
(time (filter-new! ##even? long))
    0.017120 secs real time
    0.017104 secs cpu time (0.016737 user, 0.000367 system)
    no collections
    64 bytes allocated
    no minor faults
    no major faults
    61443009 cpu cycles
> (define long (iota 1000000))                
> (define a (time (filter! ##even? long)))    
(time (filter! ##even? long))
    0.028590 secs real time
    0.028568 secs cpu time (0.021533 user, 0.007035 system)
    no collections
    1062064 bytes allocated
    8328 minor faults
    no major faults
    102591730 cpu cycles
> (length long)
1000000
> (define a (time (filter-new! ##odd? long))) 
(time (filter-new! ##odd? long))
    0.018048 secs real time
    0.018038 secs cpu time (0.018037 user, 0.000001 system)
    no collections
    64 bytes allocated
    no minor faults
    no major faults
    64787112 cpu cycles

So this is noticeably faster than Gambit's implementation of filter!=filter.
call/cc in pred can screw it up, but I believe "linear update" implies "Don't use call/cc in pred."
Should I pursue this?

@feeley
Copy link
Member

@feeley feeley commented on 8e27b63 Jan 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the implementation you propose is correct and it would be preferable to filter! = filter.

I want to point out that in general it is not possible to check a property of the list (such as being a proper list) and then acting upon that when walking the list because it is not thread safe. This is because there are separate moments when the list is checked for being a proper list and when the list is walked "knowing" that it is a proper list. There could be a separate thread that mutates the list (into an improper list) after the proper list check and before walking the list. So be mindful of this trap.

A typical example is computing the length of a list to make sure it is of length 2, and then doing an unsafe (##cadr list) to get the second element. This is not safe because a separate thread could (set-cdr! list '()) just before the (##cadr list).

I think your code doesn't suffer from that because even if a separate thread mutates the list it will give some coherent result and not crash the program.

Note that your code could be improved by interleaving the proper list check and the walking of the list. The only downside is that if an improper list is detected, some of the list will already have been mutated so there isn't a clean separation between checking that the parameters are OK and doing the requested operation. Note that other Gambit procedures mix type checking and performing the operation:

> (map pp '(1 2 3 . 4))
1
2
3
*** ERROR IN (stdin)@1.1-1.22 -- (Argument 2, list1) PROPER LIST expected
(map '#<procedure #2 pp> '(1 2 3 . 4))

@gambiteer
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm beginning to think that mutation with the goal of "speeding things up" makes "defensive programming" (i.e., correct programming) a pain in the ass,

I've only sorta understood how continuations and mutation interact for about 18 months now, learning how threads and mutation interact is hurting my brain too much.

I decided that this "downside" you mentioned:

The only downside is that if an improper list is detected, some of the list will already have been mutated so there isn't a clean separation between checking that the parameters are OK and doing the requested operation.

is enough reason to keep filter=filter!, as any error message would be referring to an already modified argument and would be impossible to debug.

And I'm OK with the code duplication, too, unless we want to refactor to have both filter and filter! be one-liners that call a procedure ##filter that does all the work and accepts an extra argument for proper error messages.

Please sign in to comment.