Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make dash a sequence API instead of a list API #43

Closed
bbatsov opened this issue Aug 16, 2013 · 29 comments
Closed

Make dash a sequence API instead of a list API #43

bbatsov opened this issue Aug 16, 2013 · 29 comments

Comments

@bbatsov
Copy link
Contributor

bbatsov commented Aug 16, 2013

While lists are the most popular portion of Emacs Lisp's sequential API we still have to consider other sequential structures like vectors and strings. Without support for them dash cannot replace the sequential API provided by cl-lib in all possible scenarios and that's a shame, since dash's API is so much better.

We've discussed this already at Twitter and I know that at least a few people like @lunaryorn and @joelmccracken support my suggestion. Ideally when operating on a string/vector the functions would return a string or vector, but even a list would do (I assume this would be much easier to implement).

@swsnr
Copy link
Contributor

swsnr commented Aug 16, 2013

While I don't really care about vectors (can't remember having them ever used), it'd be really nice to use dash functions with strings.

So 👍 from me!

@magnars
Copy link
Owner

magnars commented Aug 16, 2013

Could the functions be wrapped somehow to convert strings/vectors to lists and back again, without every single function having to handle it? We're heading to the land of monads and endofunctors here, so think hard. ;)

@bbatsov
Copy link
Contributor Author

bbatsov commented Aug 16, 2013

Yep, I guess we can have some around advice that does the conversion and remember the type, does the list manipulation and converts back the result to the remembered type . Another idea would be see how cl-lib does it as the sequence functions in it handle all sequential types.

@magnars
Copy link
Owner

magnars commented Aug 16, 2013

I guess it should also support seq'ing over hashmaps then, with cons-cells (key . value).

@bbatsov
Copy link
Contributor Author

bbatsov commented Aug 16, 2013

Yep, that would be awesome.

@Fuco1
Copy link
Collaborator

Fuco1 commented Aug 16, 2013

Hm, the advices sound like an easy fix, but if possible I'd rather avoid them. Though right now I can't think of any solution other than e.g. (-vectorify '-take) which isn't very convenient api.


Btw, speaking about monads, a list applicative would be nice. One pretty sweet usage is "get a function that is partial application of OP to list of values". Example

(+) <$> [1,2]

gives a function that takes a list and produces a list with [x + 1, x + 2] concatenated for each x

(+) <$> [1,2] <*> [3]
(+) <$> [1,2] <*> pure 3 -- this is the same as above, `pure` inserts the value in default context.
= [4,5]

(btw <$> is just infix version of fmap and <*> is ap from monads)

This can be interpreted as non-deterministic calculation. For example

(,,,) <$> [0,1] <*> [0,1] <*> [0,1] <*> [0,1] -- or same as
fmap (,,,) <*> [0,1] <*> [0,1] <*> [0,1] <*> [0,1] -- or same as
[(,,,)] <*> [0,1] <*> [0,1] <*> [0,1] <*> [0,1]

generates all 4-tuples of 0/1

Now, the cool thing about applicative application <*> is that it takes a "functor of functions" and lift it on the "functors of values", so we can do (here "functor" being []):

[(+),()] <> [2] <_> [3] -> [5,6] -- that is [2+3, 2_3]

which I'm pretty sure is what -juxt does :P

Edit: oh and by the way, we already have monads in dash, namely -mapcat is the same as >>= for list instance.

Edit: Since I've hijacked this thread already, there's some nonsense I've thrown together :P

(defun -bind (m k)
  (-reduce-r-from (lambda (item ac) (-concat (funcall k item) ac)) nil m))

;; shorter definition
(defun -bind (m k)
  (-mapcat k m))

(defun -lift2M (f x y)
  (-bind x (lambda (x) (-bind y (lambda (y) (list (funcall f x y)))))))

(defun -id2 (x y)
  (-partial x y))

(defun -ap (x y)
  (-lift2M '-id2 x y))

(defun -fmap (x y)
  (-ap (list x) y))

;; unfortunately `-ap' returns closures that aren't evaluated
;; automatically, so they have to be funcall'd with no arguments
;; instead.  I'm sure there's some solution though.
(-lift2M 'plus '(1 2) '(2 3)) => (3 4 4 5)

(-map 'funcall (-ap (-ap '(+) '(1 2)) '(2 3))) => (3 4 4 5) 

(-lift2M 'list '(0 1) '(0 1)) => ((0 0) (0 1) (1 0) (1 1))

(-map 'funcall (-ap (-ap '(list) '(0 1)) '(0 1))) => ((0 0) (0 1) (1 0) (1 1))
(-map 'funcall (-ap (-fmap 'list '(0 1)) '(0 1))) => ((0 0) (0 1) (1 0) (1 1))

;; (-lift2M '(+ -) '(1 1) '(1 2)) we can't do this! `-ap' to the rescue

(-map 'funcall (-ap (-ap '(+ -) '(1 1)) '(1 2))) => (2 3 2 3 0 -1 0 -1)

(-map 'funcall (-ap '(1+ 1-) '(1 2))) => (2 3 0 1)

;; get all the active windows in all frames
(-bind (frame-list) 'window-list)
(-mapcat 'window-list (frame-list))

@johnmastro
Copy link

Generic functions would be one way to do this. It's a shame that EIEIO doesn't currently support dispatching on the built-in types relevant to this. (And while I haven't had a chance to investigate this I imagine adding support would be non-trivial or someone would have done it).

But maybe generics as a general approach would provide some useful ideas.

@Fuco1
Copy link
Collaborator

Fuco1 commented Aug 18, 2013

I was just thinking up some examlpes for -scan and family and came up with this

(apply 'string (cdr (--scan-from (if (eq acc 32) (upcase it) it) 32 (string-to-list "foo bar baz bpp"))))
=> "Foo Bar Baz Bpp"

that would be really elegant if it weren't for the string<->list<->string conversions. So there's definitely a case to somehow make this working! I'm gonna try the advice hack.

Edit: not that bad, plus could probably be auto-generated

(substring (--scan-from (if (eq acc 32) (upcase it) it) 32 "foo bar baz bpp") 1) => "Foo Bar Baz Bpp"

(defadvice -scan-from (around sequence-support activate)
  (let ((lst (ad-get-arg 2)))
    (cond
     ((listp lst) ad-do-it)
     ((stringp lst) 
      (ad-set-arg 2 (string-to-list lst))
      (setq ad-return-value (apply 'string ad-do-it))))))

Edit: this reminds me, there isn't a -tail yet! I had to substring, blerg.

@magnars
Copy link
Owner

magnars commented Aug 19, 2013

What's -scan?

this reminds me, there isn't a -tail yet! I had to substring, blerg.

There's -drop

@Fuco1
Copy link
Collaborator

Fuco1 commented Aug 19, 2013

-scan is like -reduce but keeps all the intermediate results. It can also be viewed as map with context (the previous value).

-tail is -drop 1 or basically cdr, only in dash it would be "polymorphic", so it'd work on strings, lists, vectors alike. There's already -first-item which is just car (which I would've called -head, maybe alias it?)

@bbatsov
Copy link
Contributor Author

bbatsov commented Sep 2, 2013

As far as generic functions go, this might be useful https://github.com/kurisuwhyte/emacs-multi

@Fuco1
Copy link
Collaborator

Fuco1 commented Sep 2, 2013

It looks quite interesting, but requires E24 for closures. And dash wants to support E23 as well. I suppose hacking a less-general approach inspired by this just for the built-in sequences would be possible.

@Fuco1
Copy link
Collaborator

Fuco1 commented Sep 7, 2013

Okey. I have this. One disadvantage is that when you define the function you need to "type" it. But then it generates the coercion code automatically and the existing code need not to be changed at all (provided the function is typed properly).

Automatic generation of anaphoric versions can also be added automatically.

One other requirement is that all functions should be really written as functions and not as macro expansions. I personally consider that a bad style anyway and my suspicion is also that it causes the misterious "void variable it" bugs. All the -- functions should be (imo) only very thin wrappers basically turning the form into a lambda and that's it.

Right now, all types except :seq has no meaning, but they can be used for the aformentioned generation of anaphora etc, so then some very simple specification will be needed.

(edit: code updated to reflect comment below)

(defmacro defdash (name args return-type docstring &rest body)
  (declare (doc-string 4)
           (debug (&define name (&rest arg) keywordp stringp def-body)))
  (let* ((args (apply 'format-spec-make args))
         type)
    `(defun ,name ,(mapcar 'car args) ,docstring
       (let* (,@(delq nil (mapcar
                           (lambda (x)
                             (when (eq (cdr x) :seq)
                               (list (car x)
                                     `(cond
                                       ((listp ,(car x)) (setq type :list) ,(car x))
                                       ((stringp ,(car x)) (setq type :string) (string-to-list ,(car x)))
                                       ((vectorp ,(car x)) (setq type :vector) (append ,(car x) nil))))))
                           args))
              (result (progn ,@body)))
         ,(cond
           ((eq return-type :seq)
            '(cond
              ((eq type :list) result)
              ((eq type :string) (if (listp result) (apply 'string result) (string result)))
              ((eq type :vector) (vconcat result))))
           (t 'result))))))

(defdash head (list :seq) :elem
  "Return the first element of sequence LIST"
  (car list))

(defdash d-filter (pred :pred list :seq) :seq
  "Return elements of LIST where PRED is non-nil"
  (let (r)
    (while list
      (when (funcall pred (car list)) (push (car list) r))
      (!cdr list))
    (nreverse r)))

(defdash d-index (elem :elem list :seq) :num
  "Return the index of first element of LIST that is `eq' to LIST"
  (let ((i 0))
    (while (not (eq elem (car list))) (incf i) (pop list))
    i))

(head "abc")
(head [1 2 3])
(head '(1 2 3))

(d-filter (lambda (x) (equal x ?c)) "abc")
(d-filter 'odd? [1 2 3])
(d-filter 'odd? '(1 2 3))

(d-index ?b "abc")
(d-index 2 [1 2 3])
(d-index 3 '(1 2 3))

(defun odd? (x) (= 1 (mod x 2)))

@magnars
Copy link
Owner

magnars commented Sep 7, 2013

Very interesting. Thanks, I'll have to dig into this code. But yes, that was pretty much what I had in mind.

As for functions vs macros being default, I remember there being a reason for doing it like that. Just wish I could remember. Guess I should learn to write such decisions down.

I guess, since we have a pretty good test suite, this change might actually be possible to pull off.

@magnars
Copy link
Owner

magnars commented Sep 7, 2013

Isn't this

(mapcar (lambda (x) (car x)) args)

just

(mapcar 'car args)

this?

@Fuco1
Copy link
Collaborator

Fuco1 commented Sep 7, 2013

mapcar example

Yes :P Silly me.

I remember there being a reason

Possibly to save a funcall on the form, but that's a silly optimization anyway.


Plus I've always disliked the fact that --each and co. expand to quite large blobs of code, when using macrostep or something do debug code it can turn out pretty ugly quite fast. But that's minor detail, since it's not that common.

Lot of macros in dash are of form

(let (make symbols & copy arguments here)
  `(stuff))

where "stuff" can basically be extracted into function in every case I've looked at now (I haven't checked all).

@magnars
Copy link
Owner

magnars commented Sep 7, 2013

Possibly to save a funcall on the form, but that's a silly optimization anyway.

Yeah, that might be it.

As for optimizations, from what I can tell, when you're using this with just lists like before, there's isn't much overhead. So that's pretty nice.

The macro needs to be hygienic tho, making type and args into uninterned symbols. Other than that, this looks very nice.

I guess the anaphoric macros could be generated using a type designating a lambda + type of args. Like

(defdash -filter (pred :lambda-it list :seq))
(defdash -reduce (pred :lambda-acc-it list :seq))
(defdash -min-by (pred :lambda-it-other list :seq))

Thoughts?

@Fuco1
Copy link
Collaborator

Fuco1 commented Sep 7, 2013

Yep, that was my idea as well. Basically, there are three types of lambdas: predicates, comparators and functions, so my original names for types were something like :pred, :comp and :fun, possibly :fun-argn. However, this approach is more generic as we can auto-generate any artity function and also order the arguments. I like it.

The return type is needed because the backward transformation into original type isn't always required (or even wanted), e.g. for functions that return booleans or indices. I can't see any other meaningful value than :seq and anything else (where anything else signifies no conversion). But since we have to add the type we can as well come up with something that has at least a bit of semantics attached, a list of :num, :list, :bool, :seq should be enough.

Edit: oh, and I also made it work with hashtables, using @nicferrier 's kv library. However, with large hash tables it would be rather slow converting it to alist and then back to hashtable. For anything under 100 elements however, I guess the overhead wouldn't be noticable at all.

@Fuco1
Copy link
Collaborator

Fuco1 commented Sep 8, 2013

Here's an updated version to also handle the anaphora macros. It still uses not-uninterned symbols as that makes it even more complicated, so I'll fix that when the code is "finalized". The code that generates the defmacro is pretty ugly since it has to escape the backquotes in the generated code (so basically it's two levels of quotation)

(defmacro defdash (name args return-type docstring &rest body)
  (declare (doc-string 4)
           (debug (&define name (&rest arg) keywordp stringp def-body)))
  (let* ((args (apply 'format-spec-make args))
         (macro-name (intern (concat "-" (symbol-name name)))))
    (list
     'progn
     `(defun ,name ,(mapcar 'car args) ,docstring
        (let* (type
               ,@(delq nil (mapcar
                            (lambda (x)
                              (when (eq (cdr x) :seq)
                                (list (car x)
                                      `(cond
                                        ((listp ,(car x)) (setq type :list) ,(car x))
                                        ((stringp ,(car x)) (setq type :string) (string-to-list ,(car x)))
                                        ((vectorp ,(car x)) (setq type :vector) (append ,(car x) nil))
                                        ((hash-table-p ,(car x)) (setq type :hash) (kvhash->alist ,(car x)))))))
                            args))
               (result (progn ,@body)))
          ,(cond
            ((eq return-type :seq)
             '(cond
               ((eq type :list) result)
               ((eq type :string) (apply 'string result))
               ((eq type :vector) (vconcat result))
               ((eq type :hash) (kvalist->hash result))))
            (t 'result))))
     `(defmacro ,macro-name ,(mapcar 'car args)
        ,(concat "Anaphoric version of `" (symbol-name name)  "'.")
        ,(list '\` `(,name ,@(mapcar
                              (lambda (x)
                                (cond
                                 ((string-prefix-p ":lambda" (symbol-name (cdr x)))
                                  (let ((lambda-arg-list (mapcar 'intern (cdr (split-string (symbol-name (cdr x)) "-")))))
                                    (list 'lambda lambda-arg-list (list '\, (car x)))))
                                 (t (list '\, (car x)))))
                              args)))))))

Generated code for (example) filter defdash d-filter (pred :lambda-it list :seq) :seq

(progn
  (defun d-filter (pred list)
    "Return elements of LIST where PRED is non-nil"
    (let* (type
           (list
            (cond
             ((listp list) (setq type :list) list)
             ((stringp list) (setq type :string) (string-to-list list))
             ((vectorp list) (setq type :vector) (append list nil))
             ((hash-table-p list) (setq type :hash) (kvhash->alist list))))
           (result
            (progn
              (let (r)
                (while list
                  (when (funcall pred (car list))
                    (push (car list) r))
                  (!cdr list))
                (nreverse r)))))
      (cond
       ((eq type :list) result)
       ((eq type :string) (apply 'string result))
       ((eq type :vector) (vconcat result))
       ((eq type :hash) (kvalist->hash result)))))

  (defmacro -d-filter (pred list)
    "Anaphoric version of `d-filter'."
    `(d-filter (lambda (it) ,pred) ,list)))

@magnars
Copy link
Owner

magnars commented Sep 8, 2013

Very nice. I see you also added the hash-table support.

I'll spend some time playing around with this. And as for the readability of the macro, I don't find it overwhelming. It's good to know in advance about the double quoting tho.

Thanks again. :)

@Fuco1
Copy link
Collaborator

Fuco1 commented Sep 8, 2013

Yea but the hash table stuff depends on kv, I just used that to save time. However those two functions could probably be written more effectively for just this single use case (they are a bit more general)

Edit: oh, the macro still needs an additional check to not generate anaphora if there is no function present.

@magnars
Copy link
Owner

magnars commented Sep 9, 2013

Issues:

  • what happens if I map over a string, but don't return char values?
  • what happens if I map over a hash-map, but don't return cons cells?
  • what happens if I pass a string and a list of chars to a function that takes two lists?

They're all examples of undefined behavior with the transformation back to the original type.

I'm thinking that it shouldn't. That it always returns a list. However, we could offer some nice functions to help out:

(->str list-of-chars)
(->vec list)
(->map list-of-cons)

or maybe

(-into "" list-of-chars)
(-into [] list)
(-into {} list-of-cons)

Maybe the map transformation could also take a list, and chunk it into pairs before transforming.

@Fuco1
Copy link
Collaborator

Fuco1 commented Sep 9, 2013

Well, if someone wants to input a string and output a list (or other type) they would simply need to convert the string into list first (then there is no output conversion). I think that without string input returning strings, this loses a major part of the utility. So

(-map 'upcase "foobar") => "FOOBAR"
(-map 'upcase (->list "foobar")) => ("F" "O" "O" "B" "A" "R")

Where ->list would be a "casting" function from various things into list.

As always in elisp (and dynamic languages), the burden of giving properly typed input/transformation is on the programmer. We shouldn't try to safeguard the programmer when the entire point of these auto-coercions is to make the language more expressive. So for example

(-map (lambda (x) "a") "foobar")

will simply error, and it should since it doesn't make sense.

The string and list-of-chars issue is simply unsolvable. Elisp allows lists of mixed types, and so we would need to scan the entire list, confirm it really is a list of chars, then proceed. This would be too much of a waste. We're never going to get true, type-safe polymorphism if we don't add type checks, and that's just not going to happen in elisp. We can simply document the ways it can be used, and all the rest is error.


There are still functions that return e.g. "list of lists" so these would need to be handled too, or simply left out as non-polymorphic. Though I think that especially with strings, the partition functions would be of the highest importance, as the rest is less applicable,

Dash is I think the most downloaded library on melpa now, but only 5 people are in this thread, which indicates the interest maybe isn't so big and this is not worth musing about. So, maybe another solution would simply be adding those conversion functions you've mentioned and just "compose" them with the "operation" (e.g. -map).

(->> (->list "foo")
   (-map 'dostuff)
   (->str))

However, I would still like it if all the macros were transformed into functions and anaphora macros were turned into simple wrappers. I also feel it would simplify the code a lot.

@bbatsov
Copy link
Contributor Author

bbatsov commented Sep 9, 2013

👍 for @Fuco1's comment.

@magnars
Copy link
Owner

magnars commented Sep 9, 2013

Good points, Fuco.

@bbatsov
Copy link
Contributor Author

bbatsov commented Sep 25, 2013

Any progress with this?

@magnars
Copy link
Owner

magnars commented Sep 25, 2013

No. I'm basically not making progress anywhere at the moment. Too much work.

@CestDiego
Copy link

any progress here? this is so needed by me now that I can't use --group-by on vectors that are returned when I do a json-read from a json received after an http-request :( so changing it to lists is not really convenient

@bbatsov
Copy link
Contributor Author

bbatsov commented Nov 4, 2015

I've now moved to seq.el, so I no longer care about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants