-
-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: type annotations docs #96
Changes from all commits
ac36ccf
53c6839
df018d3
e216937
00fdb5f
1e5416d
606e1ea
3b59bbb
479ec19
c989b4c
5eadcc9
b9218d2
8b4aaab
ae1673a
713fd69
151124e
f9aaac4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,238 @@ | ||
* Type annotations | ||
|
||
In Elisp users are not required to provide type annotations to their | ||
code. While at many places the types can be inferred there are | ||
places, especially in user-defined functions, where we can not infer | ||
the correct type. | ||
|
||
Therefore Elsa provides users with the option to annotate their | ||
function definitions. The annotations are placed in the body of the | ||
function inside a =declare= form: | ||
|
||
#+BEGIN_SRC elisp | ||
(defun add-one (x) | ||
(declare (elsa (int) int)) | ||
(1+ x)) | ||
#+END_SRC | ||
|
||
The =declare= form starts with =elsa= followed by one or two values. If | ||
two are provided, first must be a list which is the list of input | ||
types and the second is the return type. | ||
|
||
If only one is provided it is taken to be the return type and the | ||
function must have no input arguments. | ||
|
||
Here are general guidelines on how the types are constructed. | ||
|
||
** Built-in types | ||
|
||
For built-in types with test predicates, drop the =p= or =-p= suffix to | ||
get the type: | ||
|
||
- =stringp= → =string= | ||
- =integerp= → =integer= (=int= is also accepted) | ||
- =markerp= → =marker= | ||
- =hash-table-p= → =hash-table= | ||
|
||
Some additional special types: | ||
|
||
- =t= stands for =t= and is always true | ||
- =nil= stands for =nil= and is always false | ||
- =bool= is a combination of explicitly =t= or =nil= | ||
|
||
** Nullable types | ||
|
||
By default types do not accept =nil= values. This helps preventing | ||
errors where you would pass =nil= to a function expecting a value. To | ||
mark a type nullable you can combine it with =nil= type using the [[id:5a21a68a-4df1-4d44-a854-1d9700858a1a][or]] | ||
combinator: =(or string nil)= is a type which takes any string but also | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what is the rationale for dropping the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yea I like it and I would not oppose having it remain. It is a lot nicer to read I don't want to overdo it with the suffixes though. There will also be the generic There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think you can combine There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If you know Haskell I can think of this as an analogy of
We can also put some things to the front, (e)lisp has a tradition with the Or we can go entirely different way and for example use strings or quoted symbols for generic names, so that it would be |
||
=nil=. | ||
|
||
** Most general type | ||
|
||
A type for everything is called =mixed=. It accepts anything and is | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems that "accepts anything" is a bit wrong here: apparently, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It should so that's a bug. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The issue in #112 is the other way around though. We want an array |
||
always nullable (that means it accepts =nil=). This is the default type | ||
for when we lack type information. | ||
|
||
Because a =mixed= type can be anything it itself is only accepted by | ||
=mixed=, so that the following would not type-check: | ||
|
||
#+BEGIN_SRC elisp | ||
(defun a-goes-in (a) ;; a is mixed | ||
(declare (elsa (mixed) int)) | ||
;; a *might* be an int, but we don't know for sure | ||
(1+ a)) | ||
#+END_SRC | ||
|
||
** Composite (higher order) types | ||
|
||
Composite or higher-order types are types which take other types as | ||
arguments. | ||
|
||
Composite types usually correspond to data constructors such as =cons=, | ||
=list=, =vector=... | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. how do you specify a plist where odd items are symbols and even items are strings? E.g.,:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For plists I'm planning to have a special type There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Actually it will be a bit more involved since you might also want to express what keys exactly are allowed. For that I'm pondering a
Allows any of Then
Is a type of a plist with string values and keys Now this still does not allow us to specify what value type we want for what key type exactly. In javascript (flow/typescript) they do it simply by enumerating: {
name: string,
age: int
} so maybe we can also have such a syntax:
I'm not decided yet here what is the best way. Ideally somehow combine it all for maximal flexibility. |
||
|
||
- =(cons a b)= where =a= is the type of =car= and =b= is the type of =cdr=. If | ||
the =car= and =cdr= can be anything write =(cons mixed mixed)= or simply | ||
=cons= for short. | ||
- =(list a)= where =a= is the type of items in the list. If the list can | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. With this syntax, you can't distinguish between:
I suggest There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm. I never actually had in mind to have lists of finite/known length with the type constructor Soy idea was I think I did not write it down anywhere, so I should. Does this make sense to you? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I had another thought at
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's also a bit related to structures like plists or alists, where alists are lists of lists where the In the future I also want to support "constant" types, so you can explicitly require specific values, for example, using your notation
could mean "this is a list of strings which should be interpreted as regular expressions if the The constants will probably have a constructor There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We wanted to reuse the defcustom stuff so I dived into it and I realized that I really like the constructor-as-type principle. There's also this cool pattern-matching library which uses a similar concept: https://github.com/VincentToups/shadchen-el There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So actually maybe we can lift the The same can work in parallel with This might be it! :D There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about this Lists are made by chaining cons cells together and leaving =nil= in the Elisp provides a data constructor =list= to make creating such lists In general =(list a b c ...)= is a list type of known length where =a=, =b=, Sometimes we need to express lists of unknown length which contain The type =(list a b c . rest-type)= is a type similar to =list= except the If the list can hold any number of anything, write =(list . mixed)= or There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I like it very much, good job. I'm not so sure about this phrasing though:
If the reader doesn't pay attention, she might understand that you are talking about
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good point, I wasn't so sure myself but couldn't phrase it better. I'm also thinking about maybe introducing some other shorter notation without the dot as that is confusing and we maybe might want to use it for dotted lists (albeit those are really rare so I wouldn't care about those very much). Something like |
||
hold anything, write =(list mixed)= or simply =list= for short. | ||
- =(vector a)= where =a= is the type of items in the vector. If the | ||
vector can hold anything, write =(vector mixed)= or simply =vector= for | ||
short. | ||
- =(hash-table k v)= where =k= is the key type and =v= is the value type. | ||
If the hash table can hold anything, write =(hash-table mixed mixed)= | ||
or simply =hash-table= for short. | ||
|
||
** Constant types | ||
|
||
A constant type always holds a specific value. Functions often take | ||
flags which can be symbols such as ='append= or ='prepend= or constant | ||
strings. | ||
|
||
To specify a constant type wrap the value in a =(const)= constructor, so | ||
that: | ||
|
||
- =(const a)= is the symbol =a= (when used in a lisp program you would | ||
pass it around as ='a=), | ||
- =(const 1)= is the integer =1=, | ||
- =(const "foo")= is the string ="foo"=. | ||
|
||
** Function types | ||
|
||
Function types are types of functions. They have input argument types | ||
and a return type. | ||
|
||
The function =add-one= from the introduction has a function type =(function | ||
(int) int)= which means it takes in one integer and returns an integer. | ||
|
||
A =lambda= form =(lambda (x) (number-to-string x))= has function type | ||
=(function (number) string)=, it takes in a number and returns a string. | ||
|
||
A function can have a function type as one of its input types. An | ||
Fuco1 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
example of such a function is =mapcar= which takes a function and a list | ||
and applies the function to every item of the list. | ||
|
||
#+BEGIN_SRC elisp | ||
(defun app (fn) | ||
"Apply FN to the list (1 2 3 4)" | ||
(declare (elsa ((function (number) number)) (list number))) | ||
(mapcar fn (list 1 2 3 4))) | ||
|
||
(app (lambda (x) (* x x))) | ||
#+END_SRC | ||
|
||
The =app= function requires that we pass in a function which processes a | ||
Fuco1 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
number into a number and returns a list of numbers. | ||
|
||
** TODO Generic types | ||
|
||
Generic types are types where some of the type arguments are variable. | ||
Both basic and composite types can be turned into generic types. | ||
|
||
*** Motivation | ||
|
||
An example of a generic function is =identity=. This function takes | ||
anything in and anything out. We could therefore give it a type | ||
annotation =(elsa (mixed) mixed)=. | ||
|
||
However, we can do better! We know that whatever was passed in will | ||
be returned and so the type actually must be the same. The =(elsa | ||
(mixed) mixed)= signature allows us to pass in an =int= and it can return | ||
back a =string= no problem and so it would not catch a huge number of | ||
possible errors. | ||
|
||
What we want to express here is "X comes in, X comes out". | ||
Fuco1 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
*** Syntax | ||
|
||
The syntax for generic types is "generic type name" + =*= suffix. Any | ||
string can be used for the generic type name, but customarily | ||
single-letter names are used. | ||
|
||
For the above mentioned identity function we therefore write the type | ||
as =(elsa (a*) a*)= where =a*= stands for a generic type =a=. | ||
|
||
A function such as =car= can be typed as follows: | ||
|
||
#+BEGIN_SRC elisp | ||
(elsa ((cons a* b*)) a*) | ||
#+END_SRC | ||
|
||
It takes a cons with =a= in the =car= and =b= in the =cdr= and return the =car= | ||
which is of type =a= , whatever that happens to be. | ||
|
||
** Optional types | ||
|
||
If a function can take optional arguments we need to convert them into | ||
a nullable type =(or type nil)=. | ||
|
||
#+BEGIN_SRC elisp | ||
(defun drop-items (list &optional n) | ||
"Drop first item of LIST or N items if N is provided." | ||
(declare (elsa ((list a*) (or int nil)) (list a*))) | ||
(setq n (or n 1)) | ||
(dotimes (_ n list) | ||
(setq list (cdr list)))) | ||
#+END_SRC | ||
|
||
** Variadic types | ||
|
||
If a function can take arbitrary number of arguments we preceed the | ||
last variadic argument with =&rest= marker just as we do in the argument | ||
list. | ||
|
||
#+BEGIN_SRC elisp | ||
(defun join (separator &rest strings) | ||
"Join STRINGS with SEPARATOR." | ||
(declare (elsa (string &rest string) string)) | ||
(mapconcat 'identity strings separator)) | ||
#+END_SRC | ||
|
||
** Type combinators | ||
*** Sum types | ||
:PROPERTIES: | ||
:ID: 5a21a68a-4df1-4d44-a854-1d9700858a1a | ||
:END: | ||
|
||
Sum types can be specified as a list form starting with =or=, so =(or | ||
string int)= is a type accepting strings or integers. | ||
|
||
A sum type is useful if the function internally checks the passed | ||
value and decides what processing to do: | ||
|
||
#+BEGIN_SRC elisp | ||
(defun to-number (x) | ||
(declare (elsa ((or int string)) int)) | ||
(cond | ||
((numberp x) x) | ||
((stringp x) (string-to-number x)))) | ||
#+END_SRC | ||
|
||
*** Intersection types | ||
|
||
Intersection types can be specified as list form starting with =and=, so | ||
=(and string float)= is a type which is at the same time string and | ||
float (such a type has empty domain, nothing can be string and float | ||
at the same time). Intersection types are used to track impossible | ||
assignments. | ||
|
||
#+BEGIN_SRC elisp | ||
;; Such a condition can never evaluate to true | ||
(if (and (stringp x) (integerp x)) | ||
"X is both string and int which is impossible, this branch never executes" | ||
"This branch always executes") | ||
#+END_SRC | ||
|
||
*** Difference types | ||
|
||
Difference types can be specified as list form starting with =diff= so =(diff | ||
mixed string)= is a type which can be anything except a string. | ||
|
||
Difference types are useful in narrowing the possible values of variables after conditional checks. | ||
|
||
#+BEGIN_SRC elisp | ||
(if (stringp x) | ||
"X is definitely string here" | ||
"X is anything but string here") | ||
#+END_SRC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to drop the parens around arguments and just have a sequence of types where the last one is automatically return type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, that would be awkward in case you want a void function or rather, ignore the return value. Requiring parentheses around arguments would model this naturally (nothing would follow them) whereas without them you'd always need to specify the last thing. Also, what if you have an empty argument list, but a meaningful return value (like,
emacs-version
)? How would you tell both apart without parentheses around the arguments? If you need a better separator of arguments and return value, I like:
in the Wirth family, an arrow would be another option.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can always inspect the argument list, so incase of
(point)
you know theint
is the return value since there is no argument.I have never considered void functions, such a thing does not really exist in Elisp does it? At the very least you always return
nil
or some such.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course there are void functions, everything you call for its side effect, like
save-buffer
(returnsnil
) orrevert-buffer
(returnst
). What I'm pointing out is that we could avoid ambiguity with sexp-syntax.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A void function returns nothing. Every function in elisp returns something.
Doing
(string-to-number (save-buffer))
would error. We don't like errors, therefore we must always consider the return type... the above code is silly but might be a result of a failed refactoring or a mistake and we want to emit a warning there.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree and my conclusion is that we should consider the
void
return type. The result of a function returningvoid
should not be passed as argument to another function nor stored into a variable. When calling avoid
function is the last statement of a function, then that function itself returnsvoid
.I agree with @Fuco1 that in ELisp, every function returns something. But not all functions are advertised as returning something: those functions should have the
void
return type.Consider
save-buffer
for example and consider that novoid
return type exists. A static analysis could detect that this function returns abool
for example. Then, Elsa should accept code like this(if (save-buffer) ...)
. My opinion is that this is wrong because the implementation ofsave-buffer
can change tomorrow to return anint
instead. This will be an implementation detail untilsave-buffer
docstring describes what the return value is.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Void is nasty because it doesn't play well with the type algebra.
What would this function return?
A type like
void | int
makes no sense because void basically poisons everything. If you would want to use such a function you would have to test the return type anyway (did I get an int back?). So then the type ofsave-buffer
doesn't really matter.It's a bit silly example, but the problem is that void is not a type in a sense every other type is a type (i.e. a set). Void is not even an empty set type, it's something outside the hierarchy.
This function
also returns void? Or is that something else.
I have this issue #68 to check for unused return values so I would rather go with a "this function should be only used for side-effect" warning but not on the type level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are much more knowledgeable than me in this area. Do what you think is best!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some languages (Rust, OCaml) have unit types for this purpose
()
. In this example,(or void int)
tells you that you may get a useful value (int
) or you may get a useless one (void
). You have to test the return type in order to use theint
, because you cannot do anything withvoid
. But that's the point of the type analysis: to make sure your code is correct in all situations.But it does matter. Consider the alternative:
(or bool int)
, in which case you may freely use thebool
return value even though it's undocumented, and may change in the future as argued by @DamienCassou. If it changes toint
, then if you previously made use of thebool
value your code won't typecheck anymore, but there is value in an annotation to indicate "this value is not relevant and subject to change".Something else: it's the empty type (or bottom type). It indicates that this function never returns, which is different than
void
as argued above, which means returns something you cannot use.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To complete: the empty type may be more tricky to deal with, but in practice may not be that useful. The
void
unit type however, though by not mean indispensable. is useful.